| <html lang="en"> | 
 | <head> | 
 | <title>Subexpression Complications - The GNU C Library</title> | 
 | <meta http-equiv="Content-Type" content="text/html"> | 
 | <meta name="description" content="The GNU C Library"> | 
 | <meta name="generator" content="makeinfo 4.13"> | 
 | <link title="Top" rel="start" href="index.html#Top"> | 
 | <link rel="up" href="Regular-Expressions.html#Regular-Expressions" title="Regular Expressions"> | 
 | <link rel="prev" href="Regexp-Subexpressions.html#Regexp-Subexpressions" title="Regexp Subexpressions"> | 
 | <link rel="next" href="Regexp-Cleanup.html#Regexp-Cleanup" title="Regexp Cleanup"> | 
 | <link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage"> | 
 | <!-- | 
 | This file documents the GNU C library. | 
 |  | 
 | This is Edition 0.12, last updated 2007-10-27, | 
 | of `The GNU C Library Reference Manual', for version | 
 | 2.8 (Sourcery G++ Lite 2011.03-41). | 
 |  | 
 | Copyright (C) 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2001, 2002, | 
 | 2003, 2007, 2008, 2010 Free Software Foundation, Inc. | 
 |  | 
 | Permission is granted to copy, distribute and/or modify this document | 
 | under the terms of the GNU Free Documentation License, Version 1.3 or | 
 | any later version published by the Free Software Foundation; with the | 
 | Invariant Sections being ``Free Software Needs Free Documentation'' | 
 | and ``GNU Lesser General Public License'', the Front-Cover texts being | 
 | ``A GNU Manual'', and with the Back-Cover Texts as in (a) below.  A | 
 | copy of the license is included in the section entitled "GNU Free | 
 | Documentation License". | 
 |  | 
 | (a) The FSF's Back-Cover Text is: ``You have the freedom to | 
 | copy and modify this GNU manual.  Buying copies from the FSF | 
 | supports it in developing GNU and promoting software freedom.''--> | 
 | <meta http-equiv="Content-Style-Type" content="text/css"> | 
 | <style type="text/css"><!-- | 
 |   pre.display { font-family:inherit } | 
 |   pre.format  { font-family:inherit } | 
 |   pre.smalldisplay { font-family:inherit; font-size:smaller } | 
 |   pre.smallformat  { font-family:inherit; font-size:smaller } | 
 |   pre.smallexample { font-size:smaller } | 
 |   pre.smalllisp    { font-size:smaller } | 
 |   span.sc    { font-variant:small-caps } | 
 |   span.roman { font-family:serif; font-weight:normal; }  | 
 |   span.sansserif { font-family:sans-serif; font-weight:normal; }  | 
 | --></style> | 
 | <link rel="stylesheet" type="text/css" href="../cs.css"> | 
 | </head> | 
 | <body> | 
 | <div class="node"> | 
 | <a name="Subexpression-Complications"></a> | 
 | </div> | 
 |  | 
 | <h4 class="subsection">10.3.5 Complications in Subexpression Matching</h4> | 
 |  | 
 | <p>Sometimes a subexpression matches a substring of no characters.  This | 
 | happens when ‘<samp><span class="samp">f\(o*\)</span></samp>’ matches the string ‘<samp><span class="samp">fum</span></samp>’: the whole | 
 | expression matches just the ‘<samp><span class="samp">f</span></samp>’, and the subexpression matches the empty | 
 | string that follows it.  In this case, both of the offsets identify | 
 | the point in the string where the null substring was found.  In this | 
 | example, the offsets are both <code>1</code>. | 
 |  | 
 |    <p>Sometimes the entire regular expression can match without using some of | 
 | its subexpressions at all—for example, when ‘<samp><span class="samp">ba\(na\)*</span></samp>’ matches the | 
 | string ‘<samp><span class="samp">ba</span></samp>’, the parenthetical subexpression is not used.  When | 
 | this happens, <code>regexec</code> stores <code>-1</code> in both fields of the | 
 | element for that subexpression. | 
 |  | 
 |    <p>Sometimes matching the entire regular expression can match a particular | 
 | subexpression more than once—for example, when ‘<samp><span class="samp">ba\(na\)*</span></samp>’ | 
 | matches the string ‘<samp><span class="samp">bananana</span></samp>’, the parenthetical subexpression | 
 | matches three times.  When this happens, <code>regexec</code> usually stores | 
 | the offsets of the last part of the string that matched the | 
 | subexpression.  In the case of ‘<samp><span class="samp">bananana</span></samp>’, these offsets are | 
 | <code>6</code> and <code>8</code>. | 
 |  | 
 |    <p>But the last match is not always the one that is chosen.  It's more | 
 | accurate to say that the last <em>opportunity</em> to match is the one | 
 | that takes precedence.  What this means is that when one subexpression | 
 | appears within another, then the results reported for the inner | 
 | subexpression reflect whatever happened on the last match of the outer | 
 | subexpression.  For an example, consider ‘<samp><span class="samp">\(ba\(na\)*s \)*</span></samp>’ matching | 
 | the string ‘<samp><span class="samp">bananas bas </span></samp>’.  The last time the inner expression | 
 | actually matches is near the end of the first word.  But it is | 
 | <em>considered</em> again in the second word, and fails to match there.  | 
 | <code>regexec</code> therefore reports nonuse of the “na” subexpression, storing <code>-1</code> in both of its offsets. | 
 |  | 
 |    <p>Another place where this rule applies is when the regular expression | 
 | <pre class="smallexample">     \(ba\(na\)*s \|nefer\(ti\)* \)* | 
 | </pre> | 
 | <p class="noindent">matches ‘<samp><span class="samp">bananas nefertiti </span></samp>’ (note the trailing space, which both alternatives require).  The “na” subexpression does match | 
 | in the first word, but it doesn't match in the second word because the | 
 | other alternative is used there.  Once again, the second repetition of | 
 | the outer subexpression overrides the first, and within that second | 
 | repetition, the “na” subexpression is not used.  So <code>regexec</code> | 
 | reports nonuse of the “na” subexpression. | 
 |  | 
 |    </body></html> | 
 |  |