blob: 6a1bd2a9fc14ae2962a6388cbc3e49af701f6420 [file] [log] [blame]
<html lang="en">
<head>
<title>Regexp Subexpressions - The GNU C Library</title>
<meta http-equiv="Content-Type" content="text/html">
<meta name="description" content="The GNU C Library">
<meta name="generator" content="makeinfo 4.13">
<link title="Top" rel="start" href="index.html#Top">
<link rel="up" href="Regular-Expressions.html#Regular-Expressions" title="Regular Expressions">
<link rel="prev" href="Matching-POSIX-Regexps.html#Matching-POSIX-Regexps" title="Matching POSIX Regexps">
<link rel="next" href="Subexpression-Complications.html#Subexpression-Complications" title="Subexpression Complications">
<link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage">
<!--
This file documents the GNU C library.
This is Edition 0.12, last updated 2007-10-27,
of `The GNU C Library Reference Manual', for version
2.8 (Sourcery G++ Lite 2011.03-41).
Copyright (C) 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2001, 2002,
2003, 2007, 2008, 2010 Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being ``Free Software Needs Free Documentation''
and ``GNU Lesser General Public License'', the Front-Cover texts being
``A GNU Manual'', and with the Back-Cover Texts as in (a) below. A
copy of the license is included in the section entitled "GNU Free
Documentation License".
(a) The FSF's Back-Cover Text is: ``You have the freedom to
copy and modify this GNU manual. Buying copies from the FSF
supports it in developing GNU and promoting software freedom.''-->
<meta http-equiv="Content-Style-Type" content="text/css">
<style type="text/css"><!--
pre.display { font-family:inherit }
pre.format { font-family:inherit }
pre.smalldisplay { font-family:inherit; font-size:smaller }
pre.smallformat { font-family:inherit; font-size:smaller }
pre.smallexample { font-size:smaller }
pre.smalllisp { font-size:smaller }
span.sc { font-variant:small-caps }
span.roman { font-family:serif; font-weight:normal; }
span.sansserif { font-family:sans-serif; font-weight:normal; }
--></style>
<link rel="stylesheet" type="text/css" href="../cs.css">
</head>
<body>
<div class="node">
<a name="Regexp-Subexpressions"></a>
<p>
Next:&nbsp;<a rel="next" accesskey="n" href="Subexpression-Complications.html#Subexpression-Complications">Subexpression Complications</a>,
Previous:&nbsp;<a rel="previous" accesskey="p" href="Matching-POSIX-Regexps.html#Matching-POSIX-Regexps">Matching POSIX Regexps</a>,
Up:&nbsp;<a rel="up" accesskey="u" href="Regular-Expressions.html#Regular-Expressions">Regular Expressions</a>
<hr>
</div>
<h4 class="subsection">10.3.4 Match Results with Subexpressions</h4>
<p>When <code>regexec</code> matches parenthetical subexpressions of
<var>pattern</var>, it records which parts of <var>string</var> they match. It
returns that information by storing the offsets into an array whose
elements are structures of type <code>regmatch_t</code>. The first element of
the array (index <code>0</code>) records the part of the string that matched
the entire regular expression. Each other element of the array records
the beginning and end of the part that matched a single parenthetical
subexpression.
<!-- regex.h -->
<!-- POSIX.2 -->
<div class="defun">
&mdash; Data Type: <b>regmatch_t</b><var><a name="index-regmatch_005ft-880"></a></var><br>
<blockquote><p>This is the data type of the <var>matcharray</var> array that you pass to
<code>regexec</code>. It contains two structure fields, as follows:
<dl>
<dt><code>rm_so</code><dd>The offset in <var>string</var> of the beginning of a substring. Add this
value to <var>string</var> to get the address of that part.
<br><dt><code>rm_eo</code><dd>The offset in <var>string</var> of the end of the substring.
</dl>
</p></blockquote></div>
<!-- regex.h -->
<!-- POSIX.2 -->
<div class="defun">
&mdash; Data Type: <b>regoff_t</b><var><a name="index-regoff_005ft-881"></a></var><br>
<blockquote><p><code>regoff_t</code> is an alias for another signed integer type.
The fields of <code>regmatch_t</code> have type <code>regoff_t</code>.
</p></blockquote></div>
<p>The <code>regmatch_t</code> elements correspond to subexpressions
positionally; the first element (index <code>1</code>) records where the first
subexpression matched, the second element records the second
subexpression, and so on. The order of the subexpressions is the order
in which they begin.
<p>When you call <code>regexec</code>, you specify how long the <var>matchptr</var>
array is, with the <var>nmatch</var> argument. This tells <code>regexec</code> how
many elements to store. If the actual regular expression has more than
<var>nmatch</var> subexpressions, then you won't get offset information about
the rest of them. But this doesn't alter whether the pattern matches a
particular string or not.
<p>If you don't want <code>regexec</code> to return any information about where
the subexpressions matched, you can either supply <code>0</code> for
<var>nmatch</var>, or use the flag <code>REG_NOSUB</code> when you compile the
pattern with <code>regcomp</code>.
</body></html>