<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
<title>User's Guide</title>
<link rel="stylesheet" href="../../../doc/src/boostbook.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.75.2">
<link rel="home" href="../index.html" title="The Boost C++ Libraries BoostBook Documentation Subset">
<link rel="up" href="../xpressive.html" title="Chapter&#160;29.&#160;Boost.Xpressive">
<link rel="prev" href="../xpressive.html" title="Chapter&#160;29.&#160;Boost.Xpressive">
<link rel="next" href="reference.html" title="Reference">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table cellpadding="2" width="100%"><tr>
<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../boost.png"></td>
<td align="center"><a href="../../../index.html">Home</a></td>
<td align="center"><a href="../../../libs/libraries.htm">Libraries</a></td>
<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
<td align="center"><a href="../../../more/index.htm">More</a></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="../xpressive.html"><img src="../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../xpressive.html"><img src="../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="reference.html"><img src="../../../doc/src/images/next.png" alt="Next"></a>
</div>
<div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="xpressive.user_s_guide"></a><a class="link" href="user_s_guide.html" title="User's Guide">User's Guide</a>
</h2></div></div></div>
<div class="toc"><dl>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.introduction">Introduction</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive">Installing
xpressive</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.quick_start">Quick Start</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object">Creating
a Regex Object</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.matching_and_searching">Matching
and Searching</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results">Accessing
Results</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions">String
Substitutions</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization">String
Splitting and Tokenization</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.named_captures">Named Captures</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches">Grammars
and Nested Matches</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions">Semantic
Actions and User-Defined Assertions</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.symbol_tables_and_attributes">Symbol
Tables and Attributes</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits">Localization
and Regex Traits</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks">Tips 'N Tricks</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.concepts">Concepts</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.examples">Examples</a></span></dt>
</dl></div>
<p>
This section describes how to use xpressive to accomplish text manipulation
and parsing tasks. If you are looking for detailed information regarding specific
components in xpressive, check the <a class="link" href="reference.html" title="Reference">Reference</a>
section.
</p>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.introduction"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.introduction" title="Introduction">Introduction</a>
</h3></div></div></div>
<a name="boost_xpressive.user_s_guide.introduction.what_is_xpressive_"></a><h3>
<a name="id3091401"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.introduction.what_is_xpressive_">What
is xpressive?</a>
</h3>
<p>
xpressive is a regular expression template library. Regular expressions (regexes)
can be written as strings that are parsed dynamically at runtime (dynamic
regexes), or as <span class="emphasis"><em>expression templates</em></span><sup>[<a name="id3091425" href="#ftn.id3091425" class="footnote">4</a>]</sup> that are parsed at compile-time (static regexes). Dynamic regexes
have the advantage that they can be accepted from the user as input at runtime
or read from an initialization file. Static regexes have several advantages.
Since they are C++ expressions instead of strings, they can be syntax-checked
at compile-time. Also, they can naturally refer to code and data elsewhere
in your program, giving you the ability to call back into your code from
within a regex match. Finally, since they are statically bound, the compiler
can generate faster code for static regexes.
</p>
<p>
xpressive's dual nature is unique and powerful. Static xpressive is a bit
like the <a href="http://spirit.sourceforge.net" target="_top">Spirit Parser Framework</a>.
As with <a href="http://spirit.sourceforge.net" target="_top">Spirit</a>, you can build
grammars with static regexes using expression templates. (Unlike <a href="http://spirit.sourceforge.net" target="_top">Spirit</a>,
xpressive does exhaustive backtracking, trying every possibility to find
a match for your pattern.) Dynamic xpressive is a bit like <a href="../../../libs/regex" target="_top">Boost.Regex</a>.
In fact, xpressive's interface should be familiar to anyone who has used
<a href="../../../libs/regex" target="_top">Boost.Regex</a>. xpressive's innovation
comes from allowing you to mix and match static and dynamic regexes in the
same program, and even in the same expression! You can embed a dynamic regex
in a static regex, or <span class="emphasis"><em>vice versa</em></span>, and the embedded regex
will participate fully in the search, back-tracking as needed to make the
match succeed.
</p>
<a name="boost_xpressive.user_s_guide.introduction.hello__world_"></a><h3>
<a name="id3091497"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.introduction.hello__world_">Hello,
world!</a>
</h3>
<p>
Enough theory. Let's have a look at <span class="emphasis"><em>Hello World</em></span>, xpressive
style:
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">hello</span><span class="special">(</span> <span class="string">"hello world!"</span> <span class="special">);</span>
    <span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(\\w+) (\\w+)!"</span> <span class="special">);</span>
    <span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
    <span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="identifier">hello</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rex</span> <span class="special">)</span> <span class="special">)</span>
    <span class="special">{</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// whole match
</span>        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="number">1</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// first capture
</span>        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="number">2</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// second capture
</span>    <span class="special">}</span>
    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
This program outputs the following:
</p>
<pre class="programlisting">hello world!
hello
world
</pre>
<p>
The first thing you'll notice about the code is that all the types in xpressive
live in the <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span></code> namespace.
</p>
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
Most of the rest of the examples in this document will leave off the <code class="computeroutput"><span class="keyword">using</span> <span class="keyword">namespace</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span></code>
directive. Just pretend it's there.
</p></td></tr>
</table></div>
<p>
Next, you'll notice the type of the regular expression object is <code class="computeroutput"><span class="identifier">sregex</span></code>. If you are familiar with <a href="../../../libs/regex" target="_top">Boost.Regex</a>, this is different from what you
are used to. The "<code class="computeroutput"><span class="identifier">s</span></code>"
in "<code class="computeroutput"><span class="identifier">sregex</span></code>" stands
for "<code class="computeroutput"><span class="identifier">string</span></code>", indicating
that this regex can be used to find patterns in <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>
objects. I'll discuss this difference and its implications in detail later.
</p>
<p>
Notice how the regex object is initialized:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(\\w+) (\\w+)!"</span> <span class="special">);</span>
</pre>
<p>
To create a regular expression object from a string, you must call a factory
method such as <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html#id1532404-bb">basic_regex&lt;&gt;::compile()</a></code></code>.
This is another area in which xpressive differs from other object-oriented
regular expression libraries. Other libraries encourage you to think of a
regular expression as a kind of string on steroids. In xpressive, regular
expressions are not strings; they are little programs in a domain-specific
language. Strings are only one <span class="emphasis"><em>representation</em></span> of that
language. Another representation is an expression template. For example,
the above line of code is equivalent to the following:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">' '</span> <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">s2</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">'!'</span><span class="special">;</span>
</pre>
<p>
This describes the same regular expression, except it uses the domain-specific
embedded language defined by static xpressive.
</p>
<p>
As you can see, static regexes have a syntax that is noticeably different
from standard Perl syntax. That is because we are constrained by C++'s syntax.
The biggest difference is the use of <code class="computeroutput"><span class="special">&gt;&gt;</span></code>
to mean "followed by". For instance, in Perl you can just put sub-expressions
next to each other:
</p>
<pre class="programlisting"><span class="identifier">abc</span>
</pre>
<p>
But in C++, there must be an operator separating sub-expressions:
</p>
<pre class="programlisting"><span class="identifier">a</span> <span class="special">&gt;&gt;</span> <span class="identifier">b</span> <span class="special">&gt;&gt;</span> <span class="identifier">c</span>
</pre>
<p>
In Perl, parentheses <code class="computeroutput"><span class="special">()</span></code> have
special meaning. They group, but as a side-effect they also create back-references
like <code class="literal">$1</code> and <code class="literal">$2</code>. In C++, there is no
way to overload parentheses to give them side-effects. To get the same effect,
we use the special <code class="computeroutput"><span class="identifier">s1</span></code>, <code class="computeroutput"><span class="identifier">s2</span></code>, etc. tokens. Assign to one to create
a back-reference (known as a sub-match in xpressive).
</p>
<p>
You'll also notice that the one-or-more repetition operator <code class="computeroutput"><span class="special">+</span></code> has moved from postfix to prefix position.
That's because C++ doesn't have a postfix <code class="computeroutput"><span class="special">+</span></code>
operator. So:
</p>
<pre class="programlisting"><span class="string">"\\w+"</span>
</pre>
<p>
is the same as:
</p>
<pre class="programlisting"><span class="special">+</span><span class="identifier">_w</span>
</pre>
<p>
We'll cover all the other differences <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes" title="Static Regexes">later</a>.
</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.installing_xpressive"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive" title="Installing xpressive">Installing
xpressive</a>
</h3></div></div></div>
<a name="boost_xpressive.user_s_guide.installing_xpressive.getting_xpressive"></a><h3>
<a name="id3092554"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive.getting_xpressive">Getting
xpressive</a>
</h3>
<p>
There are two ways to get xpressive. The first and simplest is to download
the latest version of Boost. Just go to <a href="http://sf.net/projects/boost" target="_top">http://sf.net/projects/boost</a>
and follow the <span class="quote">&#8220;<span class="quote">Download</span>&#8221;</span> link.
</p>
<p>
The second way is by directly accessing the Boost Subversion repository.
Just go to <a href="http://svn.boost.org/trac/boost/" target="_top">http://svn.boost.org/trac/boost/</a>
and follow the instructions there for anonymous Subversion access. The version
in Boost Subversion is unstable.
</p>
<a name="boost_xpressive.user_s_guide.installing_xpressive.building_with_xpressive"></a><h3>
<a name="id3092607"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive.building_with_xpressive">Building
with xpressive</a>
</h3>
<p>
Xpressive is a header-only template library, which means you don't need to
alter your build scripts or link to any separate lib file to use it. All
you need to do is <code class="computeroutput"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span></code>.
If you are only using static regexes, you can improve compile times by only
including <code class="computeroutput"><span class="identifier">xpressive_static</span><span class="special">.</span><span class="identifier">hpp</span></code>. Likewise,
you can include <code class="computeroutput"><span class="identifier">xpressive_dynamic</span><span class="special">.</span><span class="identifier">hpp</span></code> if
you only plan on using dynamic regexes.
</p>
<p>
If you would also like to use semantic actions or custom assertions with
your static regexes, you will need to additionally include <code class="computeroutput"><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span></code>.
</p>
<a name="boost_xpressive.user_s_guide.installing_xpressive.requirements"></a><h3>
<a name="id3092614"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive.requirements">Requirements</a>
</h3>
<p>
Xpressive requires Boost version 1.34.1 or higher.
</p>
<a name="boost_xpressive.user_s_guide.installing_xpressive.supported_compilers"></a><h3>
<a name="id3092775"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive.supported_compilers">Supported
Compilers</a>
</h3>
<p>
Currently, Boost.Xpressive is known to work on the following compilers:
</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem">
Visual C++ 7.1 and higher
</li>
<li class="listitem">
GNU C++ 3.4 and higher
</li>
<li class="listitem">
Intel for Linux 8.1 and higher
</li>
<li class="listitem">
Intel for Windows 10 and higher
</li>
<li class="listitem">
tru64cxx 71 and higher
</li>
<li class="listitem">
MinGW 3.4 and higher
</li>
<li class="listitem">
HP C/aC++ A.06.14
</li>
</ul></div>
<p>
Check the latest test results at Boost's <a href="http://beta.boost.org/development/tests/trunk/developer/xpressive.html" target="_top">Regression
Results Page</a>.
</p>
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
Please send any questions, comments and bug reports to eric &lt;at&gt;
boost-consulting &lt;dot&gt; com.
</p></td></tr>
</table></div>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.quick_start"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.quick_start" title="Quick Start">Quick Start</a>
</h3></div></div></div>
<p>
You don't need to know much to start being productive with xpressive. Let's
begin with the nickel tour of the types and algorithms xpressive provides.
</p>
<div class="table">
<a name="id3092894"></a><p class="title"><b>Table&#160;29.1.&#160;xpressive's Tool-Box</b></p>
<div class="table-contents"><table class="table" summary="xpressive's Tool-Box">
<colgroup>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
Tool
</p>
</th>
<th>
<p>
Description
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<p>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>
</p>
</td>
<td>
<p>
Contains a compiled regular expression. <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>
is the most important type in xpressive. Everything you do with
xpressive will begin with creating an object of type <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>,
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match&lt;&gt;</a></code></code>
</p>
</td>
<td>
<p>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
contains the results of a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
operation. It acts like a vector of <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match&lt;&gt;</a></code></code>
objects. A <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match&lt;&gt;</a></code></code>
object contains a marked sub-expression (also known as a back-reference
in Perl). It is basically just a pair of iterators representing
the begin and end of the marked sub-expression.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
</p>
</td>
<td>
<p>
Checks to see if a string matches a regex. For <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
to succeed, the <span class="emphasis"><em>whole string</em></span> must match the
regex, from beginning to end. If you give <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>,
it will write into it any marked sub-expressions it finds.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
</p>
</td>
<td>
<p>
Searches a string to find a sub-string that matches the regex.
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
will try to find a match at every position in the string, starting
at the beginning, and stopping when it finds a match or when the
string is exhausted. As with <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>,
if you give <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>,
it will write into it any marked sub-expressions it finds.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
</p>
</td>
<td>
<p>
Given an input string, a regex, and a substitution string, <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
builds a new string by replacing those parts of the input string
that match the regex with the substitution string. The substitution
string can contain references to marked sub-expressions.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator&lt;&gt;</a></code></code>
</p>
</td>
<td>
<p>
An STL-compatible iterator that makes it easy to find all the places
in a string that match a regex. Dereferencing a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator&lt;&gt;</a></code></code>
returns a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>.
Incrementing a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator&lt;&gt;</a></code></code>
finds the next match.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>
</p>
</td>
<td>
<p>
Like <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator&lt;&gt;</a></code></code>,
except dereferencing a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>
returns a string. By default, it will return the whole sub-string
that the regex matched, but it can be configured to return any
or all of the marked sub-expressions one at a time, or even the
parts of the string that <span class="emphasis"><em>didn't</em></span> match the
regex.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler&lt;&gt;</a></code></code>
</p>
</td>
<td>
<p>
A factory for <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>
objects. It "compiles" a string into a regular expression.
You will not usually have to deal directly with <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler&lt;&gt;</a></code></code>
because the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>
class has a factory method that uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler&lt;&gt;</a></code></code>
internally. But if you need to do anything fancy like create a
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>
object with a different <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>,
you will need to use a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler&lt;&gt;</a></code></code>
explicitly.
</p>
</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><p>
Now that you know a bit about the tools xpressive provides, you can pick
the right tool for you by answering the following two questions:
</p>
<div class="orderedlist"><ol class="orderedlist" type="1">
<li class="listitem">
What <span class="emphasis"><em>iterator</em></span> type will you use to traverse your
data?
</li>
<li class="listitem">
What do you want to <span class="emphasis"><em>do</em></span> to your data?
</li>
</ol></div>
<a name="boost_xpressive.user_s_guide.quick_start.know_your_iterator_type"></a><h3>
<a name="id3093949"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.quick_start.know_your_iterator_type">Know
Your Iterator Type</a>
</h3>
<p>
Most of the classes in xpressive are templates that are parameterized on
the iterator type. xpressive defines some common typedefs to make the job
of choosing the right types easier. You can use the table below to find the
right types based on the type of your iterator.
</p>
<div class="table">
<a name="id3093973"></a><p class="title"><b>Table&#160;29.2.&#160;xpressive Typedefs vs. Iterator Types</b></p>
<div class="table-contents"><table class="table" summary="xpressive Typedefs vs. Iterator Types">
<colgroup>
<col>
<col>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th>
</th>
<th>
<p>
std::string::const_iterator
</p>
</th>
<th>
<p>
char const *
</p>
</th>
<th>
<p>
std::wstring::const_iterator
</p>
</th>
<th>
<p>
wchar_t const *
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<p>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">sregex</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">cregex</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">wsregex</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">wcregex</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">smatch</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">cmatch</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">wsmatch</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">wcmatch</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler&lt;&gt;</a></code></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">sregex_compiler</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">cregex_compiler</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">wsregex_compiler</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">wcregex_compiler</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator&lt;&gt;</a></code></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">sregex_iterator</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">cregex_iterator</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">wsregex_iterator</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">wcregex_iterator</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">sregex_token_iterator</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">cregex_token_iterator</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">wsregex_token_iterator</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">wcregex_token_iterator</span></code>
</p>
</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><p>
Notice the systematic naming convention. Many of these types are
used together, so the naming convention helps you use them consistently.
For instance, if you have an <code class="computeroutput"><span class="identifier">sregex</span></code>,
you should also be using an <code class="computeroutput"><span class="identifier">smatch</span></code>.
</p>
<p>
If you are not using one of those four iterator types, then you can use the
templates directly and specify your iterator type.
</p>
<a name="boost_xpressive.user_s_guide.quick_start.know_your_task"></a><h3>
<a name="id3094491"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.quick_start.know_your_task">Know
Your Task</a>
</h3>
<p>
Do you want to find a pattern once? Many times? Search and replace? xpressive
has tools for all that and more. Below is a quick reference:
</p>
<div class="table">
<a name="id3094512"></a><p class="title"><b>Table&#160;29.3.&#160;Tasks and Tools</b></p>
<div class="table-contents"><table class="table" summary="Tasks and Tools">
<colgroup>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
To do this ...
</p>
</th>
<th>
<p>
Use this ...
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<p>
<span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_whole_string_matches_a_regex">See
if a whole string matches a regex</a>
</p>
</td>
<td>
<p>
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
algorithm
</p>
</td>
</tr>
<tr>
<td>
<p>
<span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_string_contains_a_sub_string_that_matches_a_regex">See
if a string contains a sub-string that matches a regex</a>
</p>
</td>
<td>
<p>
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
algorithm
</p>
</td>
</tr>
<tr>
<td>
<p>
<span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.replace_all_sub_strings_that_match_a_regex">Replace
all sub-strings that match a regex</a>
</p>
</td>
<td>
<p>
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
algorithm
</p>
</td>
</tr>
<tr>
<td>
<p>
<span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.find_all_the_sub_strings_that_match_a_regex_and_step_through_them_one_at_a_time">Find
all the sub-strings that match a regex and step through them one
at a time</a>
</p>
</td>
<td>
<p>
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator&lt;&gt;</a></code></code>
class
</p>
</td>
</tr>
<tr>
<td>
<p>
<span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.split_a_string_into_tokens_that_each_match_a_regex">Split
a string into tokens that each match a regex</a>
</p>
</td>
<td>
<p>
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>
class
</p>
</td>
</tr>
<tr>
<td>
<p>
<span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.split_a_string_using_a_regex_as_a_delimiter">Split
a string using a regex as a delimiter</a>
</p>
</td>
<td>
<p>
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>
class
</p>
</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><p>
These algorithms and classes are described in excruciating detail in the
Reference section.
</p>
<div class="tip"><table border="0" summary="Tip">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../doc/src/images/tip.png"></td>
<th align="left">Tip</th>
</tr>
<tr><td align="left" valign="top"><p>
Try clicking on a task in the table above to see a complete example program
that uses xpressive to solve that particular task.
</p></td></tr>
</table></div>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="xpressive.user_s_guide.creating_a_regex_object"></a><a class="link" href="user_s_guide.html#xpressive.user_s_guide.creating_a_regex_object" title="Creating a Regex Object">Creating
a Regex Object</a>
</h3></div></div></div>
<div class="toc"><dl>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes">Static
Regexes</a></span></dt>
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes">Dynamic
Regexes</a></span></dt>
</dl></div>
<p>
When using xpressive, the first thing you'll do is create a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>
object. This section goes over the nuts and bolts of building a regular expression
in the two dialects xpressive supports: static and dynamic.
</p>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes" title="Static Regexes">Static
Regexes</a>
</h4></div></div></div>
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.overview"></a><h3>
<a name="id3094974"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.overview">Overview</a>
</h3>
<p>
The feature that really sets xpressive apart from other C/C++ regular expression
libraries is the ability to author a regular expression using C++ expressions.
xpressive achieves this through operator overloading, using a technique
called <span class="emphasis"><em>expression templates</em></span> to embed a mini-language
dedicated to pattern matching within C++. These "static regexes"
have many advantages over their string-based brethren. In particular, static
regexes:
</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem">
are syntax-checked at compile-time; they will never fail at run-time
due to a syntax error.
</li>
<li class="listitem">
can naturally refer to other C++ data and code, including other regexes,
making it simple to build grammars out of regular expressions and bind
user-defined actions that execute when parts of your regex match.
</li>
<li class="listitem">
are statically bound for better inlining and optimization. Static regexes
require no state tables, virtual functions, byte-code or calls through
function pointers that cannot be resolved at compile time.
</li>
<li class="listitem">
are not limited to searching for patterns in strings. You can declare
a static regex that finds patterns in an array of integers, for instance.
</li>
</ul></div>
<p>
Since we compose static regexes using C++ expressions, we are constrained
by the rules for legal C++ expressions. Unfortunately, that means that
"classic" regular expression syntax cannot always be mapped cleanly
into C++. Rather, we map the regex <span class="emphasis"><em>constructs</em></span>, picking
new syntax that is legal C++.
</p>
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.construction_and_assignment"></a><h3>
<a name="id3095069"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.construction_and_assignment">Construction
and Assignment</a>
</h3>
<p>
You create a static regex by assigning one to an object of type <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>.
For instance, the following defines a regex that can be used to find patterns
in objects of type <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="char">'$'</span> <span class="special">&gt;&gt;</span> <span class="special">+</span><span class="identifier">_d</span> <span class="special">&gt;&gt;</span> <span class="char">'.'</span> <span class="special">&gt;&gt;</span> <span class="identifier">_d</span> <span class="special">&gt;&gt;</span> <span class="identifier">_d</span><span class="special">;</span>
</pre>
<p>
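Assignment works similarly: a static regex can also be assigned to an existing
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code> object.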
</p>
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.character_and_string_literals"></a><h3>
<a name="id3095216"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.character_and_string_literals">Character
and String Literals</a>
</h3>
<p>
In static regexes, character and string literals match themselves. For
instance, in the regex above, <code class="computeroutput"><span class="char">'$'</span></code>
and <code class="computeroutput"><span class="char">'.'</span></code> match the characters
<code class="computeroutput"><span class="char">'$'</span></code> and <code class="computeroutput"><span class="char">'.'</span></code>
respectively. Don't be confused by the fact that <code class="literal">$</code> and
<code class="literal">.</code> are meta-characters in Perl. In xpressive, literals
always represent themselves.
</p>
<p>
When using literals in static regexes, you must take care that at least
one operand is not a literal. For instance, the following are <span class="emphasis"><em>not</em></span>
valid regexes:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">re1</span> <span class="special">=</span> <span class="char">'a'</span> <span class="special">&gt;&gt;</span> <span class="char">'b'</span><span class="special">;</span> <span class="comment">// ERROR!
</span><span class="identifier">sregex</span> <span class="identifier">re2</span> <span class="special">=</span> <span class="special">+</span><span class="char">'a'</span><span class="special">;</span> <span class="comment">// ERROR!
</span></pre>
<p>
The two operands to the binary <code class="computeroutput"><span class="special">&gt;&gt;</span></code>
operator are both literals, and the operand of the unary <code class="computeroutput"><span class="special">+</span></code> operator is also a literal, so these statements
will call the native C++ binary right-shift and unary plus operators, respectively.
That's not what we want. To get operator overloading to kick in, at least
one operand must be a user-defined type. We can use xpressive's <code class="computeroutput"><span class="identifier">as_xpr</span><span class="special">()</span></code>
helper function to "taint" an expression with regex-ness, forcing
operator overloading to find the correct operators. The two regexes above
should be written as:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">re1</span> <span class="special">=</span> <span class="identifier">as_xpr</span><span class="special">(</span><span class="char">'a'</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">'b'</span><span class="special">;</span> <span class="comment">// OK
</span><span class="identifier">sregex</span> <span class="identifier">re2</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">as_xpr</span><span class="special">(</span><span class="char">'a'</span><span class="special">);</span> <span class="comment">// OK
</span></pre>
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.sequencing_and_alternation"></a><h3>
<a name="id3095533"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.sequencing_and_alternation">Sequencing
and Alternation</a>
</h3>
<p>
As you've probably already noticed, sub-expressions in static regexes must
be separated by the sequencing operator, <code class="computeroutput"><span class="special">&gt;&gt;</span></code>.
You can read this operator as "followed by".
</p>
<pre class="programlisting"><span class="comment">// Match an 'a' followed by a digit
</span><span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="char">'a'</span> <span class="special">&gt;&gt;</span> <span class="identifier">_d</span><span class="special">;</span>
</pre>
<p>
Alternation works just as it does in Perl with the <code class="computeroutput"><span class="special">|</span></code>
operator. You can read this operator as "or". For example:
</p>
<pre class="programlisting"><span class="comment">// match a digit character or a word character one or more times
</span><span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">|</span> <span class="identifier">_w</span> <span class="special">);</span>
</pre>
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.grouping_and_captures"></a><h3>
<a name="id3095686"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.grouping_and_captures">Grouping
and Captures</a>
</h3>
<p>
In Perl, parentheses <code class="computeroutput"><span class="special">()</span></code> have
special meaning. They group, but as a side-effect they also create back-references
like <code class="literal">$1</code> and <code class="literal">$2</code>. In C++, parentheses
only group -- there is no way to give them side-effects. To get the same
effect, we use the special <code class="computeroutput"><span class="identifier">s1</span></code>,
<code class="computeroutput"><span class="identifier">s2</span></code>, etc. tokens. Assigning
to one creates a back-reference. You can then use the back-reference later
in your expression, like using <code class="literal">\1</code> and <code class="literal">\2</code>
in Perl. For example, consider the following regex, which finds matching
HTML tags:
</p>
<pre class="programlisting"><span class="string">"&lt;(\\w+)&gt;.*?&lt;/\\1&gt;"</span>
</pre>
<p>
In static xpressive, this would be:
</p>
<pre class="programlisting"><span class="char">'&lt;'</span> <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">'&gt;'</span> <span class="special">&gt;&gt;</span> <span class="special">-*</span><span class="identifier">_</span> <span class="special">&gt;&gt;</span> <span class="string">"&lt;/"</span> <span class="special">&gt;&gt;</span> <span class="identifier">s1</span> <span class="special">&gt;&gt;</span> <span class="char">'&gt;'</span>
</pre>
<p>
Notice how you capture a back-reference by assigning to <code class="computeroutput"><span class="identifier">s1</span></code>,
and then you use <code class="computeroutput"><span class="identifier">s1</span></code> later
in the pattern to find the matching end tag.
</p>
<div class="tip"><table border="0" summary="Tip">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../doc/src/images/tip.png"></td>
<th align="left">Tip</th>
</tr>
<tr><td align="left" valign="top"><p>
<span class="bold"><strong>Grouping without capturing a back-reference</strong></span>
<br> <br> In xpressive, if you just want grouping without capturing
a back-reference, you can just use <code class="computeroutput"><span class="special">()</span></code>
without <code class="computeroutput"><span class="identifier">s1</span></code>. That is the
equivalent of Perl's <code class="literal">(?:)</code> non-capturing grouping construct.
</p></td></tr>
</table></div>
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.case_insensitivity_and_internationalization"></a><h3>
<a name="id3095958"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.case_insensitivity_and_internationalization">Case-Insensitivity
and Internationalization</a>
</h3>
<p>
Perl lets you make part of your regular expression case-insensitive by
using the <code class="literal">(?i:)</code> pattern modifier. xpressive also has
a case-insensitivity pattern modifier, called <code class="computeroutput"><span class="identifier">icase</span></code>.
You can use it as follows:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="string">"this"</span> <span class="special">&gt;&gt;</span> <span class="identifier">icase</span><span class="special">(</span> <span class="string">"that"</span> <span class="special">);</span>
</pre>
<p>
In this regular expression, <code class="computeroutput"><span class="string">"this"</span></code>
will be matched exactly, but <code class="computeroutput"><span class="string">"that"</span></code>
will be matched irrespective of case.
</p>
<p>
Case-insensitive regular expressions raise the issue of internationalization:
how should case-insensitive character comparisons be evaluated? Also, many
character classes are locale-specific. Which characters are matched by
<code class="computeroutput"><span class="identifier">digit</span></code> and which are matched
by <code class="computeroutput"><span class="identifier">alpha</span></code>? The answer depends
on the <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code> object the regular expression
object is using. By default, all regular expression objects use the global
locale. You can override the default by using the <code class="computeroutput"><span class="identifier">imbue</span><span class="special">()</span></code> pattern modifier, as follows:
</p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span> <span class="identifier">my_locale</span> <span class="special">=</span> <span class="comment">/* initialize a std::locale object */</span><span class="special">;</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">imbue</span><span class="special">(</span> <span class="identifier">my_locale</span> <span class="special">)(</span> <span class="special">+</span><span class="identifier">alpha</span> <span class="special">&gt;&gt;</span> <span class="special">+</span><span class="identifier">digit</span> <span class="special">);</span>
</pre>
<p>
This regular expression will evaluate <code class="computeroutput"><span class="identifier">alpha</span></code>
and <code class="computeroutput"><span class="identifier">digit</span></code> according to
<code class="computeroutput"><span class="identifier">my_locale</span></code>. See the section
on <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits" title="Localization and Regex Traits">Localization
and Regex Traits</a> for more information about how to customize the
behavior of your regexes.
</p>
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.static_xpressive_syntax_cheat_sheet"></a><h3>
<a name="id3096293"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.static_xpressive_syntax_cheat_sheet">Static
xpressive Syntax Cheat Sheet</a>
</h3>
<p>
The table below lists the familiar regex constructs and their equivalents
in static xpressive.
</p>
<div class="table">
<a name="id3096315"></a><p class="title"><b>Table&#160;29.4.&#160;Perl syntax vs. Static xpressive syntax</b></p>
<div class="table-contents"><table class="table" summary="Perl syntax vs. Static xpressive syntax">
<colgroup>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
Perl
</p>
</th>
<th>
<p>
Static xpressive
</p>
</th>
<th>
<p>
Meaning
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<p>
<code class="literal">.</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/_.html" title="Global _">_</a></code>
</p>
</td>
<td>
<p>
any character (assuming Perl's /s modifier).
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">ab</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">a</span> <span class="special">&gt;&gt;</span>
<span class="identifier">b</span></code>
</p>
</td>
<td>
<p>
sequencing of <code class="literal">a</code> and <code class="literal">b</code> sub-expressions.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">a|b</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">a</span> <span class="special">|</span>
<span class="identifier">b</span></code>
</p>
</td>
<td>
<p>
alternation of <code class="literal">a</code> and <code class="literal">b</code>
sub-expressions.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">(a)</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">(</span><a class="link" href="../boost/xpressive/s1.html" title="Global s1">s1</a><span class="special">=</span> <span class="identifier">a</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
group and capture a back-reference.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">(?:a)</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">(</span><span class="identifier">a</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
group and do not capture a back-reference.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\1</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/s1.html" title="Global s1">s1</a></code>
</p>
</td>
<td>
<p>
a previously captured back-reference.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">a*</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">*</span><span class="identifier">a</span></code>
</p>
</td>
<td>
<p>
zero or more times, greedy.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">a+</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">+</span><span class="identifier">a</span></code>
</p>
</td>
<td>
<p>
one or more times, greedy.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">a?</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">!</span><span class="identifier">a</span></code>
</p>
</td>
<td>
<p>
zero or one time, greedy.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">a{n,m}</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/repeat.html" title="Function repeat">repeat</a><span class="special">&lt;</span><span class="identifier">n</span><span class="special">,</span><span class="identifier">m</span><span class="special">&gt;(</span><span class="identifier">a</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
between <code class="literal">n</code> and <code class="literal">m</code> times,
greedy.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">a*?</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">-*</span><span class="identifier">a</span></code>
</p>
</td>
<td>
<p>
zero or more times, non-greedy.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">a+?</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">-+</span><span class="identifier">a</span></code>
</p>
</td>
<td>
<p>
one or more times, non-greedy.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">a??</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">-!</span><span class="identifier">a</span></code>
</p>
</td>
<td>
<p>
zero or one time, non-greedy.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">a{n,m}?</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">-</span><a class="link" href="../boost/xpressive/repeat.html" title="Function repeat">repeat</a><span class="special">&lt;</span><span class="identifier">n</span><span class="special">,</span><span class="identifier">m</span><span class="special">&gt;(</span><span class="identifier">a</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
between <code class="literal">n</code> and <code class="literal">m</code> times,
non-greedy.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">^</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/bos.html" title="Global bos">bos</a></code>
</p>
</td>
<td>
<p>
beginning of sequence assertion.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">$</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/eos.html" title="Global eos">eos</a></code>
</p>
</td>
<td>
<p>
end of sequence assertion.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\b</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/_b.html" title="Global _b">_b</a></code>
</p>
</td>
<td>
<p>
word boundary assertion.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\B</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_b.html" title="Global _b">_b</a></code>
</p>
</td>
<td>
<p>
not word boundary assertion.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\n</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/_n.html" title="Global _n">_n</a></code>
</p>
</td>
<td>
<p>
literal newline.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">.</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_n.html" title="Global _n">_n</a></code>
</p>
</td>
<td>
<p>
any character except a literal newline (without Perl's /s modifier).
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\r?\n|\r</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/_ln.html" title="Global _ln">_ln</a></code>
</p>
</td>
<td>
<p>
logical newline.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">[^\r\n]</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_ln.html" title="Global _ln">_ln</a></code>
</p>
</td>
<td>
<p>
any single character not a logical newline.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\w</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/_w.html" title="Global _w">_w</a></code>
</p>
</td>
<td>
<p>
a word character, equivalent to <code class="computeroutput">set[alnum | '_']</code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\W</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_w.html" title="Global _w">_w</a></code>
</p>
</td>
<td>
<p>
not a word character, equivalent to <code class="computeroutput">~set[alnum | '_']</code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\d</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/_d.html" title="Global _d">_d</a></code>
</p>
</td>
<td>
<p>
a digit character.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\D</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_d.html" title="Global _d">_d</a></code>
</p>
</td>
<td>
<p>
not a digit character.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\s</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/_s.html" title="Global _s">_s</a></code>
</p>
</td>
<td>
<p>
a space character.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\S</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_s.html" title="Global _s">_s</a></code>
</p>
</td>
<td>
<p>
not a space character.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">[:alnum:]</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/alnum.html" title="Global alnum">alnum</a></code>
</p>
</td>
<td>
<p>
an alpha-numeric character.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">[:alpha:]</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/alpha.html" title="Global alpha">alpha</a></code>
</p>
</td>
<td>
<p>
an alphabetic character.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">[:blank:]</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/blank.html" title="Global blank">blank</a></code>
</p>
</td>
<td>
<p>
a horizontal white-space character.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">[:cntrl:]</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/cntrl.html" title="Global cntrl">cntrl</a></code>
</p>
</td>
<td>
<p>
a control character.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">[:digit:]</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/digit.html" title="Global digit">digit</a></code>
</p>
</td>
<td>
<p>
a digit character.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">[:graph:]</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/graph.html" title="Global graph">graph</a></code>
</p>
</td>
<td>
<p>
a graphic character (printable, but not a space).
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">[:lower:]</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/lower.html" title="Global lower">lower</a></code>
</p>
</td>
<td>
<p>
a lower-case character.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">[:print:]</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/print.html" title="Global print">print</a></code>
</p>
</td>
<td>
<p>
a printable character.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">[:punct:]</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/punct.html" title="Global punct">punct</a></code>
</p>
</td>
<td>
<p>
a punctuation character.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">[:space:]</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/space.html" title="Global space">space</a></code>
</p>
</td>
<td>
<p>
a white-space character.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">[:upper:]</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/upper.html" title="Global upper">upper</a></code>
</p>
</td>
<td>
<p>
an upper-case character.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">[:xdigit:]</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/xdigit.html" title="Global xdigit">xdigit</a></code>
</p>
</td>
<td>
<p>
a hexadecimal digit character.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">[0-9]</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/range.html" title="Function template range">range</a><span class="special">(</span><span class="char">'0'</span><span class="special">,</span><span class="char">'9'</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
characters in range <code class="computeroutput"><span class="char">'0'</span></code>
through <code class="computeroutput"><span class="char">'9'</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">[abc]</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">as_xpr</span><span class="special">(</span><span class="char">'a'</span><span class="special">)</span> <span class="special">|</span> <span class="char">'b'</span> <span class="special">|</span> <span class="char">'c'</span></code>
</p>
</td>
<td>
<p>
characters <code class="computeroutput"><span class="char">'a'</span></code>, <code class="computeroutput"><span class="char">'b'</span></code>, or <code class="computeroutput"><span class="char">'c'</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">[abc]</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">(</span><a class="link" href="../boost/xpressive/set.html" title="Global set">set</a><span class="special">=</span> <span class="char">'a'</span><span class="special">,</span><span class="char">'b'</span><span class="special">,</span><span class="char">'c'</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
<span class="emphasis"><em>same as above</em></span>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">[0-9abc]</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/set.html" title="Global set">set</a><span class="special">[</span> <a class="link" href="../boost/xpressive/range.html" title="Function template range">range</a><span class="special">(</span><span class="char">'0'</span><span class="special">,</span><span class="char">'9'</span><span class="special">)</span> <span class="special">|</span>
<span class="char">'a'</span> <span class="special">|</span>
<span class="char">'b'</span> <span class="special">|</span>
<span class="char">'c'</span> <span class="special">]</span></code>
</p>
</td>
<td>
<p>
characters <code class="computeroutput"><span class="char">'a'</span></code>, <code class="computeroutput"><span class="char">'b'</span></code>, <code class="computeroutput"><span class="char">'c'</span></code>
or in range <code class="computeroutput"><span class="char">'0'</span></code> through
<code class="computeroutput"><span class="char">'9'</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">[0-9abc]</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/set.html" title="Global set">set</a><span class="special">[</span> <a class="link" href="../boost/xpressive/range.html" title="Function template range">range</a><span class="special">(</span><span class="char">'0'</span><span class="special">,</span><span class="char">'9'</span><span class="special">)</span> <span class="special">|</span>
<span class="special">(</span><a class="link" href="../boost/xpressive/set.html" title="Global set">set</a><span class="special">=</span> <span class="char">'a'</span><span class="special">,</span><span class="char">'b'</span><span class="special">,</span><span class="char">'c'</span><span class="special">)</span> <span class="special">]</span></code>
</p>
</td>
<td>
<p>
<span class="emphasis"><em>same as above</em></span>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">[^abc]</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">~(</span><a class="link" href="../boost/xpressive/set.html" title="Global set">set</a><span class="special">=</span> <span class="char">'a'</span><span class="special">,</span><span class="char">'b'</span><span class="special">,</span><span class="char">'c'</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
not characters <code class="computeroutput"><span class="char">'a'</span></code>,
<code class="computeroutput"><span class="char">'b'</span></code>, or <code class="computeroutput"><span class="char">'c'</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">(?i:<span class="emphasis"><em>stuff</em></span>)</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/icase.html" title="Function template icase">icase</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code>
</p>
</td>
<td>
<p>
match <span class="emphasis"><em>stuff</em></span> disregarding case.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">(?&gt;<span class="emphasis"><em>stuff</em></span>)</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/keep.html" title="Function template keep">keep</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code>
</p>
</td>
<td>
<p>
independent sub-expression: match <span class="emphasis"><em>stuff</em></span>,
then turn off backtracking into it.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">(?=<span class="emphasis"><em>stuff</em></span>)</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/before.html" title="Function template before">before</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code>
</p>
</td>
<td>
<p>
positive look-ahead assertion, match if before <span class="emphasis"><em>stuff</em></span>
but don't include <span class="emphasis"><em>stuff</em></span> in the match.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">(?!<span class="emphasis"><em>stuff</em></span>)</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/before.html" title="Function template before">before</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code>
</p>
</td>
<td>
<p>
negative look-ahead assertion, match if not before <span class="emphasis"><em>stuff</em></span>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">(?&lt;=<span class="emphasis"><em>stuff</em></span>)</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><a class="link" href="../boost/xpressive/after.html" title="Function template after">after</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code>
</p>
</td>
<td>
<p>
positive look-behind assertion, match if after <span class="emphasis"><em>stuff</em></span>
but don't include <span class="emphasis"><em>stuff</em></span> in the match. (<span class="emphasis"><em>stuff</em></span>
must be constant-width.)
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">(?&lt;!<span class="emphasis"><em>stuff</em></span>)</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/after.html" title="Function template after">after</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code>
</p>
</td>
<td>
<p>
negative look-behind assertion, match if not after <span class="emphasis"><em>stuff</em></span>.
(<span class="emphasis"><em>stuff</em></span> must be constant-width.)
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">(?P&lt;<span class="emphasis"><em>name</em></span>&gt;<span class="emphasis"><em>stuff</em></span>)</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><code class="literal"><a class="link" href="../boost/xpressive/mark_tag.html" title="Struct mark_tag">mark_tag</a></code>
</code><code class="literal"><span class="emphasis"><em>name</em></span></code><code class="computeroutput"><span class="special">(</span></code><span class="emphasis"><em>n</em></span><code class="computeroutput"><span class="special">);</span></code><br> ...<br> <code class="computeroutput"><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>name</em></span></code><code class="computeroutput"><span class="special">=</span> </code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code>
</p>
</td>
<td>
<p>
Create a named capture.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">(?P=<span class="emphasis"><em>name</em></span>)</code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><code class="literal"><a class="link" href="../boost/xpressive/mark_tag.html" title="Struct mark_tag">mark_tag</a></code>
</code><code class="literal"><span class="emphasis"><em>name</em></span></code><code class="computeroutput"><span class="special">(</span></code><span class="emphasis"><em>n</em></span><code class="computeroutput"><span class="special">);</span></code><br> ...<br> <code class="literal"><span class="emphasis"><em>name</em></span></code>
</p>
</td>
<td>
<p>
Refer back to a previously created named capture.
</p>
</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><p>
<br>
</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes" title="Dynamic Regexes">Dynamic
Regexes</a>
</h4></div></div></div>
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.overview"></a><h3>
<a name="id3099458"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.overview">Overview</a>
</h3>
<p>
Static regexes are dandy, but sometimes you need something a bit more ...
dynamic. Imagine you are developing a text editor with a regex search/replace
feature. You need to accept a regular expression from the end user as input
at run-time. There should be a way to parse a string into a regular expression.
That's what xpressive's dynamic regexes are for. They are built from the
same core components as their static counterparts, but they are late-bound
so you can specify them at run-time.
</p>
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.construction_and_assignment"></a><h3>
<a name="id3099486"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.construction_and_assignment">Construction
and Assignment</a>
</h3>
<p>
There are two ways to create a dynamic regex: with the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html#id1532404-bb">basic_regex&lt;&gt;::compile()</a></code></code>
function or with the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler&lt;&gt;</a></code></code>
class template. Use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html#id1532404-bb">basic_regex&lt;&gt;::compile()</a></code></code>
if you want the default locale. Use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler&lt;&gt;</a></code></code>
if you need to specify a different locale. In the section on <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches" title="Grammars and Nested Matches">regex
grammars</a>, we'll see another use for <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler&lt;&gt;</a></code></code>.
</p>
<p>
Here is an example of using <code class="computeroutput"><span class="identifier">basic_regex</span><span class="special">&lt;&gt;::</span><span class="identifier">compile</span><span class="special">()</span></code>:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"this|that"</span><span class="special">,</span> <span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">icase</span> <span class="special">);</span>
</pre>
<p>
Here is the same example using <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler&lt;&gt;</a></code></code>:
</p>
<pre class="programlisting"><span class="identifier">sregex_compiler</span> <span class="identifier">compiler</span><span class="special">;</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"this|that"</span><span class="special">,</span> <span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">icase</span> <span class="special">);</span>
</pre>
<p>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html#id1532404-bb">basic_regex&lt;&gt;::compile()</a></code></code>
is implemented in terms of <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler&lt;&gt;</a></code></code>.
</p>
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.dynamic_xpressive_syntax"></a><h3>
<a name="id3099827"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.dynamic_xpressive_syntax">Dynamic
xpressive Syntax</a>
</h3>
<p>
Since the dynamic syntax is not constrained by the rules for valid C++
expressions, we are free to use familiar syntax for dynamic regexes. For
this reason, the syntax used by xpressive for dynamic regexes follows the
lead set by John Maddock's <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1429.htm" target="_top">proposal</a>
to add regular expressions to the Standard Library. It is essentially the
syntax standardized by <a href="http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf" target="_top">ECMAScript</a>,
with minor changes in support of internationalization.
</p>
<p>
Since the syntax is documented exhaustively elsewhere, I will simply refer
you to the existing standards, rather than duplicate the specification
here.
</p>
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.internationalization"></a><h3>
<a name="id3099882"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.internationalization">Internationalization</a>
</h3>
<p>
As with static regexes, dynamic regexes support internationalization by
allowing you to specify a different <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>.
To do this, you must use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler&lt;&gt;</a></code></code>.
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler&lt;&gt;</a></code></code>
class has an <code class="computeroutput"><span class="identifier">imbue</span><span class="special">()</span></code>
function. After you have imbued a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler&lt;&gt;</a></code></code>
object with a custom <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>,
all regex objects compiled by that <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler&lt;&gt;</a></code></code>
will use that locale. For example:
</p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span> <span class="identifier">my_locale</span> <span class="special">=</span> <span class="comment">/* initialize your locale object here */</span><span class="special">;</span>
<span class="identifier">sregex_compiler</span> <span class="identifier">compiler</span><span class="special">;</span>
<span class="identifier">compiler</span><span class="special">.</span><span class="identifier">imbue</span><span class="special">(</span> <span class="identifier">my_locale</span> <span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"\\w+|\\d+"</span> <span class="special">);</span>
</pre>
<p>
This regex will use <code class="computeroutput"><span class="identifier">my_locale</span></code>
when evaluating the intrinsic character sets <code class="computeroutput"><span class="string">"\\w"</span></code>
and <code class="computeroutput"><span class="string">"\\d"</span></code>.
</p>
</div>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.matching_and_searching"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.matching_and_searching" title="Matching and Searching">Matching
and Searching</a>
</h3></div></div></div>
<a name="boost_xpressive.user_s_guide.matching_and_searching.overview"></a><h3>
<a name="id3100200"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.matching_and_searching.overview">Overview</a>
</h3>
<p>
Once you have created a regex object, you can use the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
and <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
algorithms to find patterns in strings. This section covers the basics of regex
matching and searching. If you are familiar with how <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
and <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
work in the <a href="../../../libs/regex" target="_top">Boost.Regex</a> library, xpressive's
versions work the same way.
</p>
<a name="boost_xpressive.user_s_guide.matching_and_searching.seeing_if_a_string_matches_a_regex"></a><h3>
<a name="id3100294"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.matching_and_searching.seeing_if_a_string_matches_a_regex">Seeing
if a String Matches a Regex</a>
</h3>
<p>
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
algorithm checks to see if a regex matches a given input.
</p>
<div class="warning"><table border="0" summary="Warning">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Warning]" src="../../../doc/src/images/warning.png"></td>
<th align="left">Warning</th>
</tr>
<tr><td align="left" valign="top"><p>
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
algorithm will only report success if the regex matches the <span class="emphasis"><em>whole
input</em></span>, from beginning to end. If the regex matches only a part
of the input, <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
will return false. If you want to search through the string looking for
sub-strings that the regex matches, use the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
algorithm.
</p></td></tr>
</table></div>
<p>
The input can be a bidirectional range such as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>,
a C-style null-terminated string or a pair of iterators. In all cases, the
type of the iterator used to traverse the input sequence must match the iterator
type used to declare the regex object. (You can use the table in the <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.quick_start.know_your_iterator_type">Quick
Start</a> to find the correct regex type for your iterator.)
</p>
<pre class="programlisting"><span class="identifier">cregex</span> <span class="identifier">cre</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">;</span> <span class="comment">// this regex can match C-style strings
</span><span class="identifier">sregex</span> <span class="identifier">sre</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">;</span> <span class="comment">// this regex can match std::strings
</span>
<span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="string">"hello"</span><span class="special">,</span> <span class="identifier">cre</span> <span class="special">)</span> <span class="special">)</span> <span class="comment">// OK
</span> <span class="special">{</span> <span class="comment">/*...*/</span> <span class="special">}</span>
<span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">(</span><span class="string">"hello"</span><span class="special">),</span> <span class="identifier">sre</span> <span class="special">)</span> <span class="special">)</span> <span class="comment">// OK
</span> <span class="special">{</span> <span class="comment">/*...*/</span> <span class="special">}</span>
<span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="string">"hello"</span><span class="special">,</span> <span class="identifier">sre</span> <span class="special">)</span> <span class="special">)</span> <span class="comment">// ERROR! iterator mis-match!
</span> <span class="special">{</span> <span class="comment">/*...*/</span> <span class="special">}</span>
</pre>
<p>
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
algorithm optionally accepts a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
struct as an out parameter. If given, the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
algorithm fills in the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
struct with information about which parts of the regex matched which parts
of the input.
</p>
<pre class="programlisting"><span class="identifier">cmatch</span> <span class="identifier">what</span><span class="special">;</span>
<span class="identifier">cregex</span> <span class="identifier">cre</span> <span class="special">=</span> <span class="special">+(</span><span class="identifier">s1</span><span class="special">=</span> <span class="identifier">_w</span><span class="special">);</span>
<span class="comment">// store the results of the regex_match in "what"
</span><span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="string">"hello"</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">cre</span> <span class="special">)</span> <span class="special">)</span>
<span class="special">{</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="number">1</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// prints "o"
</span><span class="special">}</span>
</pre>
<p>
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
algorithm also optionally accepts a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_constants/match_flag_type.html" title="Type match_flag_type">match_flag_type</a></code></code>
bitmask. With <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_constants/match_flag_type.html" title="Type match_flag_type">match_flag_type</a></code></code>,
you can control certain aspects of how the match is evaluated. See the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_constants/match_flag_type.html" title="Type match_flag_type">match_flag_type</a></code></code>
reference for a complete list of the flags and their meanings.
</p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"hello"</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">sre</span> <span class="special">=</span> <span class="identifier">bol</span> <span class="special">&gt;&gt;</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">;</span>
<span class="comment">// match_not_bol means that "bol" should not match at [begin,begin)
</span><span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">sre</span><span class="special">,</span> <span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">match_not_bol</span> <span class="special">)</span> <span class="special">)</span>
<span class="special">{</span>
<span class="comment">// should never get here!!!
</span><span class="special">}</span>
</pre>
<p>
Click <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_whole_string_matches_a_regex">here</a>
to see a complete example program that shows how to use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>.
And check the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
reference to see a complete list of the available overloads.
</p>
<a name="boost_xpressive.user_s_guide.matching_and_searching.searching_for_matching_sub_strings"></a><h3>
<a name="id3101277"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.matching_and_searching.searching_for_matching_sub_strings">Searching
for Matching Sub-Strings</a>
</h3>
<p>
Use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
when you want to know if an input sequence contains a sub-sequence that a
regex matches. <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
will try to match the regex at the beginning of the input sequence and scan
forward in the sequence until it either finds a match or exhausts the sequence.
</p>
<p>
In all other regards, <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
behaves like <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
<span class="emphasis"><em>(see above)</em></span>. In particular, it can operate on a bidirectional
range such as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>, on a C-style null-terminated string,
or on an iterator range. The same care must be taken to ensure that the iterator
type of your regex matches the iterator type of your input sequence. As with
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>,
you can optionally provide a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
struct to receive the results of the search, and a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_constants/match_flag_type.html" title="Type match_flag_type">match_flag_type</a></code></code>
bitmask to control how the match is evaluated.
</p>
<p>
Click <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_string_contains_a_sub_string_that_matches_a_regex">here</a>
to see a complete example program that shows how to use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>.
And check the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
reference to see a complete list of the available overloads.
</p>
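<p>
As a minimal sketch of the difference from a whole-string match, the hypothetical
helper <code class="computeroutput">first_number()</code> below (the name is ours, not part
of xpressive) uses <code class="computeroutput">regex_search()</code> to pull the first
run of digits out of a longer string. It assumes the umbrella header
<code class="computeroutput">boost/xpressive/xpressive.hpp</code> is available.
</p>
<pre class="programlisting">#include &lt;string&gt;
#include &lt;boost/xpressive/xpressive.hpp&gt;

// Hypothetical helper: returns the first run of digits in "text",
// or an empty string if "text" contains no digits.
std::string first_number( std::string const &amp;text )
{
    using namespace boost::xpressive;

    sregex digits = +_d;   // one or more digits
    smatch what;

    // regex_search() scans forward until the regex matches somewhere
    if( regex_search( text, what, digits ) )
    {
        return what[0].str();   // what[0] is the full match
    }
    return std::string();
}
</pre>
<p>
With the same regex, <code class="computeroutput">regex_match()</code> would fail on an
input such as <code class="computeroutput">"order 66 shipped"</code>, because the digits
do not span the whole sequence.
</p>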
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.accessing_results"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results" title="Accessing Results">Accessing
Results</a>
</h3></div></div></div>
<a name="boost_xpressive.user_s_guide.accessing_results.overview"></a><h3>
<a name="id3101498"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results.overview">Overview</a>
</h3>
<p>
Sometimes it is not enough simply to know whether a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
succeeded. If you pass an object of type <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
to <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>,
then after the algorithm has completed successfully the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
will contain extra information about which parts of the regex matched which
parts of the sequence. In Perl, these sub-sequences are called <span class="emphasis"><em>back-references</em></span>,
and they are stored in the variables <code class="literal">$1</code>, <code class="literal">$2</code>,
etc. In xpressive, they are objects of type <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match&lt;&gt;</a></code></code>,
and they are stored in the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
structure, which acts as a vector of <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match&lt;&gt;</a></code></code>
objects.
</p>
<a name="boost_xpressive.user_s_guide.accessing_results.match_results"></a><h3>
<a name="id3101672"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results.match_results">match_results</a>
</h3>
<p>
So, you've passed a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
object to a regex algorithm, and the algorithm has succeeded. Now you want
to examine the results. Most of what you'll be doing with the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
object is indexing into it to access its internally stored <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match&lt;&gt;</a></code></code>
objects, but there are a few other things you can do with a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
object besides.
</p>
<p>
The table below shows how to access the information stored in a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
object named <code class="computeroutput"><span class="identifier">what</span></code>.
</p>
<div class="table">
<a name="id3101780"></a><p class="title"><b>Table&#160;29.5.&#160;match_results&lt;&gt; Accessors</b></p>
<div class="table-contents"><table class="table" summary="match_results&lt;&gt; Accessors">
<colgroup>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
Accessor
</p>
</th>
<th>
<p>
Effects
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">size</span><span class="special">()</span></code>
</p>
</td>
<td>
<p>
Returns the number of sub-matches, which is always greater than
zero after a successful match because the full match is stored
in the zero-th sub-match.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">what</span><span class="special">[</span><span class="identifier">n</span><span class="special">]</span></code>
</p>
</td>
<td>
<p>
Returns the <span class="emphasis"><em>n</em></span>-th sub-match.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">length</span><span class="special">(</span><span class="identifier">n</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
Returns the length of the <span class="emphasis"><em>n</em></span>-th sub-match.
Same as <code class="computeroutput"><span class="identifier">what</span><span class="special">[</span><span class="identifier">n</span><span class="special">].</span><span class="identifier">length</span><span class="special">()</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">position</span><span class="special">(</span><span class="identifier">n</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
Returns the offset into the input sequence at which the <span class="emphasis"><em>n</em></span>-th
sub-match begins.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">str</span><span class="special">(</span><span class="identifier">n</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
Returns a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special">&lt;&gt;</span></code>
constructed from the <span class="emphasis"><em>n</em></span>-th sub-match. Same
as <code class="computeroutput"><span class="identifier">what</span><span class="special">[</span><span class="identifier">n</span><span class="special">].</span><span class="identifier">str</span><span class="special">()</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">prefix</span><span class="special">()</span></code>
</p>
</td>
<td>
<p>
Returns a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match&lt;&gt;</a></code></code>
object which represents the sub-sequence from the beginning of
the input sequence to the start of the full match.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">suffix</span><span class="special">()</span></code>
</p>
</td>
<td>
<p>
Returns a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match&lt;&gt;</a></code></code>
object which represents the sub-sequence from the end of the full
match to the end of the input sequence.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">regex_id</span><span class="special">()</span></code>
</p>
</td>
<td>
<p>
Returns the <code class="computeroutput"><span class="identifier">regex_id</span></code>
of the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>
object that was last used with this <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
object.
</p>
</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><p>
There is more you can do with the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
object, but that will be covered when we talk about <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches" title="Grammars and Nested Matches">Grammars
and Nested Matches</a>.
</p>
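<p>
The accessors in the table can be exercised together in a short sketch. The
<code class="computeroutput">match_summary</code> struct and the
<code class="computeroutput">summarize()</code> function are hypothetical names
introduced only for this illustration; the regex is an arbitrary choice.
</p>
<pre class="programlisting">#include &lt;cstddef&gt;
#include &lt;string&gt;
#include &lt;boost/xpressive/xpressive.hpp&gt;

// Hypothetical summary type, used only for this illustration.
struct match_summary
{
    std::size_t    count;    // what.size()
    std::ptrdiff_t offset;   // what.position(0)
    std::ptrdiff_t length;   // what.length(0)
    std::string    full;     // what.str(0)
    std::string    before;   // what.prefix()
    std::string    after;    // what.suffix()
};

// Searches "text" for a run of digits and records what the accessors report.
match_summary summarize( std::string const &amp;text )
{
    using namespace boost::xpressive;

    sregex re = (s1= +_d);   // one capturing group
    smatch what;
    match_summary s = { 0, 0, 0, "", "", "" };

    if( regex_search( text, what, re ) )
    {
        s.count  = what.size();        // full match plus one group
        s.offset = what.position(0);   // where the full match begins
        s.length = what.length(0);     // how long the full match is
        s.full   = what.str(0);
        s.before = what.prefix().str();
        s.after  = what.suffix().str();
    }
    return s;
}
</pre>
<p>
For the input <code class="computeroutput">"order 66 shipped"</code>, the full match is
<code class="computeroutput">"66"</code> at offset 6, the prefix is
<code class="computeroutput">"order "</code> and the suffix is
<code class="computeroutput">" shipped"</code>.
</p>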
<a name="boost_xpressive.user_s_guide.accessing_results.sub_match"></a><h3>
<a name="id3102363"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results.sub_match">sub_match</a>
</h3>
<p>
When you index into a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
object, you get back a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match&lt;&gt;</a></code></code>
object. A <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match&lt;&gt;</a></code></code>
is basically a pair of iterators. It is defined like this:
</p>
<pre class="programlisting"><span class="keyword">template</span><span class="special">&lt;</span> <span class="keyword">class</span> <span class="identifier">BidirectionalIterator</span> <span class="special">&gt;</span>
<span class="keyword">struct</span> <span class="identifier">sub_match</span>
<span class="special">:</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">pair</span><span class="special">&lt;</span> <span class="identifier">BidirectionalIterator</span><span class="special">,</span> <span class="identifier">BidirectionalIterator</span> <span class="special">&gt;</span>
<span class="special">{</span>
<span class="keyword">bool</span> <span class="identifier">matched</span><span class="special">;</span>
<span class="comment">// ...
</span><span class="special">};</span>
</pre>
<p>
Since it inherits publicly from <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">pair</span><span class="special">&lt;&gt;</span></code>, <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match&lt;&gt;</a></code></code>
has <code class="computeroutput"><span class="identifier">first</span></code> and <code class="computeroutput"><span class="identifier">second</span></code> data members of type <code class="computeroutput"><span class="identifier">BidirectionalIterator</span></code>. These are the beginning
and end of the sub-sequence this <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match&lt;&gt;</a></code></code>
represents. <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match&lt;&gt;</a></code></code>
also has a Boolean <code class="computeroutput"><span class="identifier">matched</span></code>
data member, which is true if this <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match&lt;&gt;</a></code></code>
participated in the full match.
</p>
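<p>
The <code class="computeroutput">matched</code> flag matters most when a regex contains
alternatives, since only one branch takes part in a given match. The following
sketch assumes a hypothetical helper named <code class="computeroutput">classify()</code>
and reads <code class="computeroutput">first</code>, <code class="computeroutput">second</code>
and <code class="computeroutput">matched</code> directly.
</p>
<pre class="programlisting">#include &lt;string&gt;
#include &lt;boost/xpressive/xpressive.hpp&gt;

// Hypothetical helper: reports whether "text" is all digits, all letters,
// or neither.
std::string classify( std::string const &amp;text )
{
    using namespace boost::xpressive;

    // Only one of the two groups can participate in any given match.
    sregex re = (s1= +_d) | (s2= +alpha);
    smatch what;

    if( !regex_match( text, what, re ) )
    {
        return "neither";
    }

    if( what[1].matched )
    {
        // first and second delimit the matched sub-sequence
        return "digits:" + std::string( what[1].first, what[1].second );
    }
    return "letters:" + std::string( what[2].first, what[2].second );
}
</pre>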
<p>
The following table shows how you might access the information stored in
a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match&lt;&gt;</a></code></code>
object called <code class="computeroutput"><span class="identifier">sub</span></code>.
</p>
<div class="table">
<a name="id3102694"></a><p class="title"><b>Table&#160;29.6.&#160;sub_match&lt;&gt; Accessors</b></p>
<div class="table-contents"><table class="table" summary="sub_match&lt;&gt; Accessors">
<colgroup>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
Accessor
</p>
</th>
<th>
<p>
Effects
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">sub</span><span class="special">.</span><span class="identifier">length</span><span class="special">()</span></code>
</p>
</td>
<td>
<p>
Returns the length of the sub-match. Same as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">distance</span><span class="special">(</span><span class="identifier">sub</span><span class="special">.</span><span class="identifier">first</span><span class="special">,</span><span class="identifier">sub</span><span class="special">.</span><span class="identifier">second</span><span class="special">)</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">sub</span><span class="special">.</span><span class="identifier">str</span><span class="special">()</span></code>
</p>
</td>
<td>
<p>
Returns a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special">&lt;&gt;</span></code>
constructed from the sub-match. Same as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special">&lt;</span><span class="identifier">char_type</span><span class="special">&gt;(</span><span class="identifier">sub</span><span class="special">.</span><span class="identifier">first</span><span class="special">,</span><span class="identifier">sub</span><span class="special">.</span><span class="identifier">second</span><span class="special">)</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">sub</span><span class="special">.</span><span class="identifier">compare</span><span class="special">(</span><span class="identifier">str</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
Performs a string comparison between the sub-match and <code class="computeroutput"><span class="identifier">str</span></code>, where <code class="computeroutput"><span class="identifier">str</span></code>
can be a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special">&lt;&gt;</span></code>,
C-style null-terminated string, or another sub-match. Same as
<code class="computeroutput"><span class="identifier">sub</span><span class="special">.</span><span class="identifier">str</span><span class="special">().</span><span class="identifier">compare</span><span class="special">(</span><span class="identifier">str</span><span class="special">)</span></code>.
</p>
</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><a name="boost_xpressive.user_s_guide.accessing_results._inlinemediaobject__imageobject__imagedata_fileref__images_caution_png____imagedata___imageobject__textobject__phrase_caution__phrase___textobject___inlinemediaobject__results_invalidation__inlinemediaobject__imageobject__imagedata_fileref__images_caution_png____imagedata___imageobject__textobject__phrase_caution__phrase___textobject___inlinemediaobject_"></a><h3>
<a name="id3103093"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results._inlinemediaobject__imageobject__imagedata_fileref__images_caution_png____imagedata___imageobject__textobject__phrase_caution__phrase___textobject___inlinemediaobject__results_invalidation__inlinemediaobject__imageobject__imagedata_fileref__images_caution_png____imagedata___imageobject__textobject__phrase_caution__phrase___textobject___inlinemediaobject_"><span class="inlinemediaobject"><img src="../images/caution.png" alt="caution"></span> Results Invalidation <span class="inlinemediaobject"><img src="../images/caution.png" alt="caution"></span></a>
</h3>
<p>
Results are stored as iterators into the input sequence. Anything that invalidates
those iterators also invalidates the match results. For instance, if you
match a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code> object, the results are only valid
until your next call to a non-const member function of that <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>
object. After that, the results held by the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
object are invalid. Don't use them!
</p>
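<p>
One way to stay safe, sketched below under the assumption that a copy is affordable,
is to copy whatever you need out of the results before touching the input string.
The function name <code class="computeroutput">extract_then_modify()</code> is hypothetical.
</p>
<pre class="programlisting">#include &lt;string&gt;
#include &lt;boost/xpressive/xpressive.hpp&gt;

// Hypothetical helper: copies the first run of digits out of "text"
// before "text" is modified.
std::string extract_then_modify( std::string text )
{
    using namespace boost::xpressive;

    sregex re = +_d;
    smatch what;
    std::string digits;

    if( regex_search( text, what, re ) )
    {
        // Copy while the iterators held by "what" are still valid.
        digits = what[0].str();
    }

    // A non-const member call: the iterators stored in "what" may now dangle.
    text += " (processed)";

    // From here on, use only the copy, never "what".
    return digits;
}
</pre>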
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.string_substitutions"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions" title="String Substitutions">String
Substitutions</a>
</h3></div></div></div>
<p>
Regular expressions are not only good for searching text; they're also good at
<span class="emphasis"><em>manipulating</em></span> it. And one of the most common text manipulation
tasks is search-and-replace. xpressive provides the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
algorithm for searching and replacing.
</p>
<a name="boost_xpressive.user_s_guide.string_substitutions.regex_replace__"></a><h3>
<a name="id3103267"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.regex_replace__">regex_replace()</a>
</h3>
<p>
Performing search-and-replace using <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
is simple. All you need is an input sequence, a regex object, and a format
string or a formatter object. There are several versions of the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
algorithm. Some accept the input sequence as a bidirectional container such
as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code> and return the result in a new
container of the same type. Others accept the input as a null-terminated
string and return a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>. Still others accept the input sequence
as a pair of iterators and write the result into an output iterator. The
substitution may be specified as a string with format sequences or as a formatter
object. Below are some simple examples of using string-based substitutions.
</p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"This is his face"</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">as_xpr</span><span class="special">(</span><span class="string">"his"</span><span class="special">);</span> <span class="comment">// find all occurrences of "his" ...
</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">format</span><span class="special">(</span><span class="string">"her"</span><span class="special">);</span> <span class="comment">// ... and replace them with "her"
</span>
<span class="comment">// use the version of regex_replace() that operates on strings
</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">output</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span> <span class="identifier">input</span><span class="special">,</span> <span class="identifier">re</span><span class="special">,</span> <span class="identifier">format</span> <span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">output</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span>
<span class="comment">// use the version of regex_replace() that operates on iterators
</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special">&lt;</span> <span class="keyword">char</span> <span class="special">&gt;</span> <span class="identifier">out_iter</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">);</span>
<span class="identifier">regex_replace</span><span class="special">(</span> <span class="identifier">out_iter</span><span class="special">,</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">re</span><span class="special">,</span> <span class="identifier">format</span> <span class="special">);</span>
</pre>
<p>
The above program prints out the following:
</p>
<pre class="programlisting">Ther is her face
Ther is her face
</pre>
<p>
Notice that <span class="emphasis"><em>all</em></span> the occurrences of <code class="computeroutput"><span class="string">"his"</span></code>
have been replaced with <code class="computeroutput"><span class="string">"her"</span></code>.
</p>
<p>
Click <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.replace_all_sub_strings_that_match_a_regex">here</a>
to see a complete example program that shows how to use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>.
And check the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
reference to see a complete list of the available overloads.
</p>
<a name="boost_xpressive.user_s_guide.string_substitutions.replace_options"></a><h3>
<a name="id3103818"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.replace_options">Replace
Options</a>
</h3>
<p>
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
algorithm takes an optional bitmask parameter to control the formatting.
The possible values of the bitmask are:
</p>
<div class="table">
<a name="id3103853"></a><p class="title"><b>Table&#160;29.7.&#160;Format Flags</b></p>
<div class="table-contents"><table class="table" summary="Format Flags">
<colgroup>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
Flag
</p>
</th>
<th>
<p>
Meaning
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">format_default</span></code>
</p>
</td>
<td>
<p>
Recognize the ECMA-262 format sequences (see below).
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">format_first_only</span></code>
</p>
</td>
<td>
<p>
Only replace the first match, not all of them.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">format_no_copy</span></code>
</p>
</td>
<td>
<p>
Don't copy the parts of the input sequence that didn't match the
regex to the output sequence.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">format_literal</span></code>
</p>
</td>
<td>
<p>
Treat the format string as a literal; that is, don't recognize
any escape sequences.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">format_perl</span></code>
</p>
</td>
<td>
<p>
Recognize the Perl format sequences (see below).
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">format_sed</span></code>
</p>
</td>
<td>
<p>
Recognize the sed format sequences (see below).
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">format_all</span></code>
</p>
</td>
<td>
<p>
In addition to the Perl format sequences, recognize some Boost-specific
format sequences.
</p>
</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><p>
These flags live in the <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">regex_constants</span></code>
namespace. If the substitution parameter is a function object instead of
a string, the flags <code class="computeroutput"><span class="identifier">format_literal</span></code>,
<code class="computeroutput"><span class="identifier">format_perl</span></code>, <code class="computeroutput"><span class="identifier">format_sed</span></code>, and <code class="computeroutput"><span class="identifier">format_all</span></code>
are ignored.
</p>
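<p>
As a rough illustration of two of these flags, the sketch below reuses the
<code class="computeroutput">"his"</code> to <code class="computeroutput">"her"</code>
substitution from earlier. The helper names
<code class="computeroutput">replace_first()</code> and
<code class="computeroutput">replace_no_copy()</code> are hypothetical.
</p>
<pre class="programlisting">#include &lt;string&gt;
#include &lt;boost/xpressive/xpressive.hpp&gt;

// Hypothetical helper: replaces only the first occurrence of "his".
std::string replace_first( std::string const &amp;input )
{
    using namespace boost::xpressive;

    sregex re = as_xpr("his");
    std::string format("her");
    return regex_replace( input, re, format,
                          regex_constants::format_first_only );
}

// Hypothetical helper: replaces every occurrence of "his" and drops
// the text that did not match.
std::string replace_no_copy( std::string const &amp;input )
{
    using namespace boost::xpressive;

    sregex re = as_xpr("his");
    std::string format("her");
    return regex_replace( input, re, format,
                          regex_constants::format_no_copy );
}
</pre>
<p>
Given <code class="computeroutput">"This is his face"</code>, the first helper yields
<code class="computeroutput">"Ther is his face"</code> and the second yields
<code class="computeroutput">"herher"</code>.
</p>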
<a name="boost_xpressive.user_s_guide.string_substitutions.the_ecma_262_format_sequences"></a><h3>
<a name="id3104147"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.the_ecma_262_format_sequences">The
ECMA-262 Format Sequences</a>
</h3>
<p>
When you haven't specified a substitution string dialect with one of the
format flags above, you get the dialect defined by ECMA-262, the standard
for ECMAScript. The table below shows the escape sequences recognized in
ECMA-262 mode.
</p>
<div class="table">
<a name="id3104170"></a><p class="title"><b>Table&#160;29.8.&#160;Format Escape Sequences</b></p>
<div class="table-contents"><table class="table" summary="Format Escape Sequences">
<colgroup>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
Escape Sequence
</p>
</th>
<th>
<p>
Meaning
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<p>
<code class="literal">$1</code>, <code class="literal">$2</code>, etc.
</p>
</td>
<td>
<p>
the corresponding sub-match
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">$&amp;</code>
</p>
</td>
<td>
<p>
the full match
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">$`</code>
</p>
</td>
<td>
<p>
the match prefix
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">$'</code>
</p>
</td>
<td>
<p>
the match suffix
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">$$</code>
</p>
</td>
<td>
<p>
a literal <code class="computeroutput"><span class="char">'$'</span></code> character
</p>
</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><p>
Any other sequence beginning with <code class="computeroutput"><span class="char">'$'</span></code>
simply represents itself. For example, if the format string were <code class="computeroutput"><span class="string">"$a"</span></code> then <code class="computeroutput"><span class="string">"$a"</span></code>
would be inserted into the output sequence.
</p>
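<p>
The sequences in the table can be tried out with a small sketch like the one
below. The helper <code class="computeroutput"><span class="identifier">rewrite_pairs</span></code> and the
<code class="literal">key=value</code> input shape are assumptions made for illustration only.
</p>
<pre class="programlisting">#include &lt;string&gt;
#include &lt;boost/xpressive/xpressive.hpp&gt;

// Hypothetical helper: rewrite every "key=value" pair in the input
// according to a format string in the default (ECMA-262) dialect.
std::string rewrite_pairs(std::string input, std::string const &amp;fmt)
{
    using namespace boost::xpressive;
    // s1 captures the key, s2 captures the value
    sregex pair = (s1 = +_w) &gt;&gt; '=' &gt;&gt; (s2 = +_w);
    // no format flag is given, so the ECMA-262 dialect applies
    return regex_replace(input, pair, fmt);
}
</pre>
<p>
For the input <code class="literal">a=1 b=2</code>, the format string <code class="literal">$2=$1</code>
should give <code class="literal">1=a 2=b</code>, and <code class="literal">[$&amp;]</code> should give
<code class="literal">[a=1] [b=2]</code>.
</p>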
<a name="boost_xpressive.user_s_guide.string_substitutions.the_sed_format_sequences"></a><h3>
<a name="id3104377"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.the_sed_format_sequences">The
Sed Format Sequences</a>
</h3>
<p>
When specifying the <code class="computeroutput"><span class="identifier">format_sed</span></code>
flag to <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>,
the following escape sequences are recognized:
</p>
<div class="table">
<a name="id3104420"></a><p class="title"><b>Table&#160;29.9.&#160;Sed Format Escape Sequences</b></p>
<div class="table-contents"><table class="table" summary="Sed Format Escape Sequences">
<colgroup>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
Escape Sequence
</p>
</th>
<th>
<p>
Meaning
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<p>
<code class="literal">\1</code>, <code class="literal">\2</code>, etc.
</p>
</td>
<td>
<p>
The corresponding sub-match
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">&amp;</code>
</p>
</td>
<td>
<p>
The full match
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\a</code>
</p>
</td>
<td>
<p>
A literal <code class="computeroutput"><span class="char">'\a'</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\e</code>
</p>
</td>
<td>
<p>
A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">27</span><span class="special">)</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\f</code>
</p>
</td>
<td>
<p>
A literal <code class="computeroutput"><span class="char">'\f'</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\n</code>
</p>
</td>
<td>
<p>
A literal <code class="computeroutput"><span class="char">'\n'</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\r</code>
</p>
</td>
<td>
<p>
A literal <code class="computeroutput"><span class="char">'\r'</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\t</code>
</p>
</td>
<td>
<p>
A literal <code class="computeroutput"><span class="char">'\t'</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\v</code>
</p>
</td>
<td>
<p>
A literal <code class="computeroutput"><span class="char">'\v'</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\xFF</code>
</p>
</td>
<td>
<p>
A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">0xFF</span><span class="special">)</span></code>, where <code class="literal"><span class="emphasis"><em>F</em></span></code>
is any hex digit
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\x{FFFF}</code>
</p>
</td>
<td>
<p>
A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">0xFFFF</span><span class="special">)</span></code>, where <code class="literal"><span class="emphasis"><em>F</em></span></code>
is any hex digit
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\cX</code>
</p>
</td>
<td>
<p>
The control character <code class="literal"><span class="emphasis"><em>X</em></span></code>
</p>
</td>
</tr>
</tbody>
</table></div>
</div>
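<p>
A minimal sketch of the sed dialect follows. The helper name <code class="computeroutput"><span class="identifier">to_dmy</span></code>
and the date layout are assumptions chosen for the example. Note that each backslash
in the format string has to be doubled inside a C++ string literal.
</p>
<pre class="programlisting">#include &lt;string&gt;
#include &lt;boost/xpressive/xpressive.hpp&gt;

// Hypothetical helper: turn "year-month-day" into "day/month/year"
// using sed-style back-references.
std::string to_dmy(std::string input)
{
    using namespace boost::xpressive;
    sregex date = (s1 = +_d) &gt;&gt; '-' &gt;&gt; (s2 = +_d) &gt;&gt; '-' &gt;&gt; (s3 = +_d);
    // the format string seen by xpressive is: \3/\2/\1
    std::string fmt("\\3/\\2/\\1");
    return regex_replace(input, date, fmt, regex_constants::format_sed);
}
</pre>
<p>
With this sketch, <code class="literal">to_dmy("2024-05-17")</code> should return
<code class="literal">17/05/2024</code>.
</p>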
<br class="table-break"><a name="boost_xpressive.user_s_guide.string_substitutions.the_perl_format_sequences"></a><h3>
<a name="id3104880"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.the_perl_format_sequences">The
Perl Format Sequences</a>
</h3>
<p>
When specifying the <code class="computeroutput"><span class="identifier">format_perl</span></code>
flag to <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>,
the following escape sequences are recognized:
</p>
<div class="table">
<a name="id3104922"></a><p class="title"><b>Table&#160;29.10.&#160;Perl Format Escape Sequences</b></p>
<div class="table-contents"><table class="table" summary="Perl Format Escape Sequences">
<colgroup>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
Escape Sequence
</p>
</th>
<th>
<p>
Meaning
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<p>
<code class="literal">$1</code>, <code class="literal">$2</code>, etc.
</p>
</td>
<td>
<p>
the corresponding sub-match
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">$&amp;</code>
</p>
</td>
<td>
<p>
the full match
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">$`</code>
</p>
</td>
<td>
<p>
the match prefix
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">$'</code>
</p>
</td>
<td>
<p>
the match suffix
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">$$</code>
</p>
</td>
<td>
<p>
a literal <code class="computeroutput"><span class="char">'$'</span></code> character
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\a</code>
</p>
</td>
<td>
<p>
A literal <code class="computeroutput"><span class="char">'\a'</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\e</code>
</p>
</td>
<td>
<p>
A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">27</span><span class="special">)</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\f</code>
</p>
</td>
<td>
<p>
A literal <code class="computeroutput"><span class="char">'\f'</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\n</code>
</p>
</td>
<td>
<p>
A literal <code class="computeroutput"><span class="char">'\n'</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\r</code>
</p>
</td>
<td>
<p>
A literal <code class="computeroutput"><span class="char">'\r'</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\t</code>
</p>
</td>
<td>
<p>
A literal <code class="computeroutput"><span class="char">'\t'</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\v</code>
</p>
</td>
<td>
<p>
A literal <code class="computeroutput"><span class="char">'\v'</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\xFF</code>
</p>
</td>
<td>
<p>
A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">0xFF</span><span class="special">)</span></code>, where <code class="literal"><span class="emphasis"><em>F</em></span></code>
is any hex digit
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\x{FFFF}</code>
</p>
</td>
<td>
<p>
A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">0xFFFF</span><span class="special">)</span></code>, where <code class="literal"><span class="emphasis"><em>F</em></span></code>
is any hex digit
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\cX</code>
</p>
</td>
<td>
<p>
The control character <code class="literal"><span class="emphasis"><em>X</em></span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\l</code>
</p>
</td>
<td>
<p>
Make the next character lowercase
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\L</code>
</p>
</td>
<td>
<p>
Make the rest of the substitution lowercase until the next <code class="literal">\E</code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\u</code>
</p>
</td>
<td>
<p>
Make the next character uppercase
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\U</code>
</p>
</td>
<td>
<p>
Make the rest of the substitution uppercase until the next <code class="literal">\E</code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\E</code>
</p>
</td>
<td>
<p>
Terminate <code class="literal">\L</code> or <code class="literal">\U</code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\1</code>, <code class="literal">\2</code>, etc.
</p>
</td>
<td>
<p>
The corresponding sub-match
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="literal">\g&lt;name&gt;</code>
</p>
</td>
<td>
<p>
The named backref <span class="emphasis"><em>name</em></span>
</p>
</td>
</tr>
</tbody>
</table></div>
</div>
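<p>
The case-conversion escapes are perhaps easiest to understand from a small
sketch. The helper <code class="computeroutput"><span class="identifier">describe_address</span></code> and the
<code class="literal">user@host</code> input shape below are illustrative assumptions, not library features.
</p>
<pre class="programlisting">#include &lt;string&gt;
#include &lt;boost/xpressive/xpressive.hpp&gt;

// Hypothetical helper: upper-case the part before '@' and capitalize
// the first letter of the part after it.
std::string describe_address(std::string input)
{
    using namespace boost::xpressive;
    sregex addr = (s1 = +_w) &gt;&gt; '@' &gt;&gt; (s2 = +_w);
    // the format string seen by xpressive is: \U$1\E at \u$2
    std::string fmt("\\U$1\\E at \\u$2");
    return regex_replace(input, addr, fmt, regex_constants::format_perl);
}
</pre>
<p>
Here <code class="literal">\U</code> ... <code class="literal">\E</code> upper-cases all of the first
sub-match, while <code class="literal">\u</code> affects only the first character of the second, so
<code class="literal">describe_address("bob@example")</code> should return <code class="literal">BOB at Example</code>.
</p>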
<br class="table-break"><a name="boost_xpressive.user_s_guide.string_substitutions.the_boost_specific_format_sequences"></a><h3>
<a name="id3105648"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.the_boost_specific_format_sequences">The
Boost-Specific Format Sequences</a>
</h3>
<p>
When specifying the <code class="computeroutput"><span class="identifier">format_all</span></code>
flag to <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>,
the escape sequences recognized are the same as those above for <code class="computeroutput"><span class="identifier">format_perl</span></code>. In addition, conditional expressions
of the following form are recognized:
</p>
<pre class="programlisting">?Ntrue-expression:false-expression
</pre>
<p>
where <span class="emphasis"><em>N</em></span> is a decimal digit representing a sub-match.
If the corresponding sub-match participated in the full match, then the substitution
is <span class="emphasis"><em>true-expression</em></span>. Otherwise, it is <span class="emphasis"><em>false-expression</em></span>.
In this mode, you can use parentheses <code class="literal">()</code> for grouping. If you
want a literal parenthesis, you must escape it as <code class="literal">\(</code>.
</p>
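<p>
As a hedged sketch of a conditional expression, the code below labels each
token according to which alternative of the regex matched. The helper name
<code class="computeroutput"><span class="identifier">classify</span></code> and the replacement words are
assumptions made for the example.
</p>
<pre class="programlisting">#include &lt;string&gt;
#include &lt;boost/xpressive/xpressive.hpp&gt;

// Hypothetical helper: replace runs of digits with "number" and runs
// of letters with "word".
std::string classify(std::string input)
{
    using namespace boost::xpressive;
    // sub-match 1 participates only when the token is made of digits
    sregex token = (s1 = +_d) | (s2 = +alpha);
    // if sub-match 1 matched, substitute "number", otherwise "word";
    // the parentheses delimit the conditional
    std::string fmt("(?1number:word)");
    return regex_replace(input, token, fmt, regex_constants::format_all);
}
</pre>
<p>
With this sketch, <code class="literal">classify("12 ab")</code> is expected to return
<code class="literal">number word</code>.
</p>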
<a name="boost_xpressive.user_s_guide.string_substitutions.formatter_objects"></a><h3>
<a name="id3105747"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.formatter_objects">Formatter
Objects</a>
</h3>
<p>
Format strings are not always expressive enough for all your text substitution
needs. Consider the simple example of wanting to map input strings to output
strings, as you may want to do with environment variables. Rather than a
format <span class="emphasis"><em>string</em></span>, for this you would use a formatter <span class="emphasis"><em>object</em></span>.
Consider the following code, which finds embedded environment variables of
the form <code class="computeroutput"><span class="string">"$(XYZ)"</span></code> and
computes the substitution string by looking up the environment variable in
a map.
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">map</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">string</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">xpressive</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">&gt;</span> <span class="identifier">env</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="keyword">const</span> <span class="special">&amp;</span><span class="identifier">format_fun</span><span class="special">(</span><span class="identifier">smatch</span> <span class="keyword">const</span> <span class="special">&amp;</span><span class="identifier">what</span><span class="special">)</span>
<span class="special">{</span>
<span class="keyword">return</span> <span class="identifier">env</span><span class="special">[</span><span class="identifier">what</span><span class="special">[</span><span class="number">1</span><span class="special">].</span><span class="identifier">str</span><span class="special">()];</span>
<span class="special">}</span>
<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
<span class="identifier">env</span><span class="special">[</span><span class="string">"X"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"this"</span><span class="special">;</span>
<span class="identifier">env</span><span class="special">[</span><span class="string">"Y"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"that"</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"\"$(X)\" has the value \"$(Y)\""</span><span class="special">);</span>
<span class="comment">// replace strings like "$(XYZ)" with the result of env["XYZ"]
</span> <span class="identifier">sregex</span> <span class="identifier">envar</span> <span class="special">=</span> <span class="string">"$("</span> <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">s1</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">')'</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">output</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span><span class="identifier">input</span><span class="special">,</span> <span class="identifier">envar</span><span class="special">,</span> <span class="identifier">format_fun</span><span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">output</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
In this case, we use a function, <code class="computeroutput"><span class="identifier">format_fun</span><span class="special">()</span></code>, to compute the substitution string on the
fly. It accepts a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
object which contains the results of the current match. <code class="computeroutput"><span class="identifier">format_fun</span><span class="special">()</span></code> uses the first submatch as a key into the
global <code class="computeroutput"><span class="identifier">env</span></code> map. The above
code displays:
</p>
<pre class="programlisting">"this" has the value "that"
</pre>
<p>
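The formatter need not be an ordinary function; it may be an object of class
type. And rather than return a string, it may accept an output iterator into
which it writes the substitution. The following example produces the same
output as the one above. Because it looks names up with <code class="computeroutput"><span class="identifier">find</span><span class="special">()</span></code>
rather than <code class="computeroutput"><span class="keyword">operator</span><span class="special">[]</span></code>, an unknown variable name is replaced with
nothing and does not add an entry to the map.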
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">map</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">string</span><span class="special">&gt;</span>
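<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">algorithm</span><span class="special">&gt;</span> <span class="comment">// std::copy</span>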
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">xpressive</span><span class="special">;</span>
<span class="keyword">struct</span> <span class="identifier">formatter</span>
<span class="special">{</span>
<span class="keyword">typedef</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">&gt;</span> <span class="identifier">env_map</span><span class="special">;</span>
<span class="identifier">env_map</span> <span class="identifier">env</span><span class="special">;</span>
<span class="keyword">template</span><span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Out</span><span class="special">&gt;</span>
<span class="identifier">Out</span> <span class="keyword">operator</span><span class="special">()(</span><span class="identifier">smatch</span> <span class="keyword">const</span> <span class="special">&amp;</span><span class="identifier">what</span><span class="special">,</span> <span class="identifier">Out</span> <span class="identifier">out</span><span class="special">)</span> <span class="keyword">const</span>
<span class="special">{</span>
<span class="identifier">env_map</span><span class="special">::</span><span class="identifier">const_iterator</span> <span class="identifier">where</span> <span class="special">=</span> <span class="identifier">env</span><span class="special">.</span><span class="identifier">find</span><span class="special">(</span><span class="identifier">what</span><span class="special">[</span><span class="number">1</span><span class="special">]);</span>
<span class="keyword">if</span><span class="special">(</span><span class="identifier">where</span> <span class="special">!=</span> <span class="identifier">env</span><span class="special">.</span><span class="identifier">end</span><span class="special">())</span>
<span class="special">{</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="keyword">const</span> <span class="special">&amp;</span><span class="identifier">sub</span> <span class="special">=</span> <span class="identifier">where</span><span class="special">-&gt;</span><span class="identifier">second</span><span class="special">;</span>
<span class="identifier">out</span> <span class="special">=</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span><span class="identifier">sub</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">sub</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">out</span><span class="special">);</span>
<span class="special">}</span>
<span class="keyword">return</span> <span class="identifier">out</span><span class="special">;</span>
<span class="special">}</span>
<span class="special">};</span>
<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
<span class="identifier">formatter</span> <span class="identifier">fmt</span><span class="special">;</span>
<span class="identifier">fmt</span><span class="special">.</span><span class="identifier">env</span><span class="special">[</span><span class="string">"X"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"this"</span><span class="special">;</span>
<span class="identifier">fmt</span><span class="special">.</span><span class="identifier">env</span><span class="special">[</span><span class="string">"Y"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"that"</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"\"$(X)\" has the value \"$(Y)\""</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">envar</span> <span class="special">=</span> <span class="string">"$("</span> <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">s1</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">')'</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">output</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span><span class="identifier">input</span><span class="special">,</span> <span class="identifier">envar</span><span class="special">,</span> <span class="identifier">fmt</span><span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">output</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
The formatter must be a callable object (a function or a function object)
that has one of three possible signatures, detailed in the table below.
For the table, <code class="computeroutput"><span class="identifier">fmt</span></code> is a function
pointer or function object, <code class="computeroutput"><span class="identifier">what</span></code>
is a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
object, <code class="computeroutput"><span class="identifier">out</span></code> is an OutputIterator,
and <code class="computeroutput"><span class="identifier">flags</span></code> is a value of
<code class="computeroutput"><span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">match_flag_type</span></code>:
</p>
<div class="table">
<a name="id3107534"></a><p class="title"><b>Table&#160;29.11.&#160;Formatter Signatures</b></p>
<div class="table-contents"><table class="table" summary="Formatter Signatures">
<colgroup>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
Formatter Invocation
</p>
</th>
<th>
<p>
Return Type
</p>
</th>
<th>
<p>
Semantics
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">fmt</span><span class="special">(</span><span class="identifier">what</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
Range of characters (e.g. <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>)
or null-terminated string
</p>
</td>
<td>
<p>
The string matched by the regex is replaced with the string returned
by the formatter.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">fmt</span><span class="special">(</span><span class="identifier">what</span><span class="special">,</span>
<span class="identifier">out</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
OutputIterator
</p>
</td>
<td>
<p>
The formatter writes the replacement string into <code class="computeroutput"><span class="identifier">out</span></code> and returns the updated <code class="computeroutput"><span class="identifier">out</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">fmt</span><span class="special">(</span><span class="identifier">what</span><span class="special">,</span>
<span class="identifier">out</span><span class="special">,</span>
<span class="identifier">flags</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
OutputIterator
</p>
</td>
<td>
<p>
The formatter writes the replacement string into <code class="computeroutput"><span class="identifier">out</span></code> and returns the updated <code class="computeroutput"><span class="identifier">out</span></code>. The <code class="computeroutput"><span class="identifier">flags</span></code>
parameter is the value of the match flags passed to the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
algorithm.
</p>
</td>
</tr>
</tbody>
</table></div>
</div>
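<p>
As a sketch of the first signature in the table, the formatter below returns
its replacement as a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>
by value. The names <code class="computeroutput"><span class="identifier">double_number</span></code> and
<code class="computeroutput"><span class="identifier">double_all</span></code> are hypothetical, and the example assumes a
C++11 compiler for <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">stoi</span></code>
and <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">to_string</span></code>.
</p>
<pre class="programlisting">#include &lt;string&gt;
#include &lt;boost/xpressive/xpressive.hpp&gt;

// Formatter of the form fmt(what): the returned string replaces the match.
std::string double_number(boost::xpressive::smatch const &amp;what)
{
    int n = std::stoi(what[0].str());   // what[0] is the full match
    return std::to_string(2 * n);
}

// Hypothetical helper: double every non-negative integer found in the input.
std::string double_all(std::string input)
{
    using namespace boost::xpressive;
    sregex number = +_d;
    return regex_replace(input, number, double_number);
}
</pre>
<p>
With this sketch, <code class="literal">double_all("3 apples and 10 pears")</code> should return
<code class="literal">6 apples and 20 pears</code>. A computation like this cannot be expressed
with a format string alone, which is the case formatter objects are meant for.
</p>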
<br class="table-break"><a name="boost_xpressive.user_s_guide.string_substitutions.formatter_expressions"></a><h3>
<a name="id3107836"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.formatter_expressions">Formatter
Expressions</a>
</h3>
<p>
In addition to format <span class="emphasis"><em>strings</em></span> and formatter <span class="emphasis"><em>objects</em></span>,
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
also accepts formatter <span class="emphasis"><em>expressions</em></span>. A formatter expression
is a lambda expression that generates a string. It uses the same syntax as
that for <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions" title="Semantic Actions and User-Defined Assertions">Semantic
Actions</a>, which are covered later. The above example, which uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
to substitute strings for environment variables, is repeated here using a
formatter expression.
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">map</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">string</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">&gt;</span> <span class="identifier">env</span><span class="special">;</span>
<span class="identifier">env</span><span class="special">[</span><span class="string">"X"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"this"</span><span class="special">;</span>
<span class="identifier">env</span><span class="special">[</span><span class="string">"Y"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"that"</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"\"$(X)\" has the value \"$(Y)\""</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">envar</span> <span class="special">=</span> <span class="string">"$("</span> <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">s1</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">')'</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">output</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span><span class="identifier">input</span><span class="special">,</span> <span class="identifier">envar</span><span class="special">,</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">env</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">output</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
In the above, the formatter expression is <code class="computeroutput"><span class="identifier">ref</span><span class="special">(</span><span class="identifier">env</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]</span></code>. This
means to use the value of the first submatch, <code class="computeroutput"><span class="identifier">s1</span></code>,
as a key into the <code class="computeroutput"><span class="identifier">env</span></code> map.
The purpose of <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">ref</span><span class="special">()</span></code>
here is to make the reference to the <code class="computeroutput"><span class="identifier">env</span></code>
local variable <span class="emphasis"><em>lazy</em></span> so that the index operation is deferred
until we know what to replace <code class="computeroutput"><span class="identifier">s1</span></code>
with.
</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization" title="String Splitting and Tokenization">String
Splitting and Tokenization</a>
</h3></div></div></div>
<p>
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>
is the Ginsu knife of the text manipulation world. It slices! It dices! This
section describes how to use the highly-configurable <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>
to chop up input sequences.
</p>
<a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.overview"></a><h3>
<a name="id3108650"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization.overview">Overview</a>
</h3>
<p>
You initialize a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>
with an input sequence, a regex, and some optional configuration parameters.
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>
will use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
to find the first place in the sequence that the regex matches. When dereferenced,
the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>
returns a <span class="emphasis"><em>token</em></span> in the form of a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special">&lt;&gt;</span></code>. Which string it returns depends
on the configuration parameters. By default it returns a string corresponding
to the full match, but it could also return a string corresponding to a particular
marked sub-expression, or even the part of the sequence that <span class="emphasis"><em>didn't</em></span>
match. When you increment the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>,
it will move to the next token. Which token is next depends on the configuration
parameters. It could simply be a different marked sub-expression in the current
match, or it could be part or all of the next match. Or it could be the part
that <span class="emphasis"><em>didn't</em></span> match.
</p>
<p>
As you can see, <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>
can do a lot. That makes it hard to describe, but some examples should make
it clear.
</p>
<a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_1__simple_tokenization"></a><h3>
<a name="id3108818"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_1__simple_tokenization">Example
1: Simple Tokenization</a>
</h3>
<p>
This example uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>
to chop a sequence into a series of tokens consisting of words.
</p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"This is his face"</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">;</span> <span class="comment">// find a word
</span>
<span class="comment">// iterate over all the words in the input
</span><span class="identifier">sregex_token_iterator</span> <span class="identifier">begin</span><span class="special">(</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">re</span> <span class="special">),</span> <span class="identifier">end</span><span class="special">;</span>
<span class="comment">// write all the words to std::cout
</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special">&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="special">&gt;</span> <span class="identifier">out_iter</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span><span class="special">,</span> <span class="string">"\n"</span> <span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">out_iter</span> <span class="special">);</span>
</pre>
<p>
This program displays the following:
</p>
<pre class="programlisting">This
is
his
face
</pre>
<a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_2__simple_tokenization__reloaded"></a><h3>
<a name="id3109156"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_2__simple_tokenization__reloaded">Example
2: Simple Tokenization, Reloaded</a>
</h3>
<p>
This example also uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>
to chop a sequence into a series of tokens consisting of words, but it uses
the regex as a delimiter. When we pass a <code class="computeroutput"><span class="special">-</span><span class="number">1</span></code> as the last parameter to the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>
constructor, it instructs the token iterator to consider as tokens those
parts of the input that <span class="emphasis"><em>didn't</em></span> match the regex.
</p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"This is his face"</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_s</span><span class="special">;</span> <span class="comment">// find white space
</span>
<span class="comment">// iterate over all non-white space in the input. Note the -1 below:
</span><span class="identifier">sregex_token_iterator</span> <span class="identifier">begin</span><span class="special">(</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">re</span><span class="special">,</span> <span class="special">-</span><span class="number">1</span> <span class="special">),</span> <span class="identifier">end</span><span class="special">;</span>
<span class="comment">// write all the words to std::cout
</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special">&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="special">&gt;</span> <span class="identifier">out_iter</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span><span class="special">,</span> <span class="string">"\n"</span> <span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">out_iter</span> <span class="special">);</span>
</pre>
<p>
This program displays the following:
</p>
<pre class="programlisting">This
is
his
face
</pre>
<a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_3__simple_tokenization__revolutions"></a><h3>
<a name="id3109545"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_3__simple_tokenization__revolutions">Example
3: Simple Tokenization, Revolutions</a>
</h3>
<p>
This example also uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>
to chop a sequence containing a bunch of dates into a series of tokens consisting
of just the years. When we pass a positive integer <code class="literal"><span class="emphasis"><em>N</em></span></code>
as the last parameter to the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>
constructor, it instructs the token iterator to consider as tokens only the
<code class="literal"><span class="emphasis"><em>N</em></span></code>-th marked sub-expression of each
match.
</p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"01/02/2003 blahblah 04/23/1999 blahblah 11/13/1981"</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(\\d{2})/(\\d{2})/(\\d{4})"</span><span class="special">);</span> <span class="comment">// find a date
</span>
<span class="comment">// iterate over all the years in the input. Note the 3 below, corresponding to the 3rd sub-expression:
</span><span class="identifier">sregex_token_iterator</span> <span class="identifier">begin</span><span class="special">(</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">re</span><span class="special">,</span> <span class="number">3</span> <span class="special">),</span> <span class="identifier">end</span><span class="special">;</span>
<span class="comment">// write all the years to std::cout
</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special">&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="special">&gt;</span> <span class="identifier">out_iter</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span><span class="special">,</span> <span class="string">"\n"</span> <span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">out_iter</span> <span class="special">);</span>
</pre>
<p>
This program displays the following:
</p>
<pre class="programlisting">2003
1999
1981
</pre>
<a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_4__not_so_simple_tokenization"></a><h3>
<a name="id3109940"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_4__not_so_simple_tokenization">Example
4: Not-So-Simple Tokenization</a>
</h3>
<p>
This example is like the previous one, except that instead of tokenizing
just the years, this program turns the days, months and years into tokens.
When we pass an array of integers <code class="literal"><span class="emphasis"><em>{I,J,...}</em></span></code>
as the last parameter to the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>
constructor, it instructs the token iterator to consider as tokens the <code class="literal"><span class="emphasis"><em>I</em></span></code>-th,
<code class="literal"><span class="emphasis"><em>J</em></span></code>-th, etc. marked sub-expression
of each match.
</p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"01/02/2003 blahblah 04/23/1999 blahblah 11/13/1981"</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(\\d{2})/(\\d{2})/(\\d{4})"</span><span class="special">);</span> <span class="comment">// find a date
</span>
<span class="comment">// iterate over the days, months and years in the input
</span><span class="keyword">int</span> <span class="keyword">const</span> <span class="identifier">sub_matches</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span> <span class="number">2</span><span class="special">,</span> <span class="number">1</span><span class="special">,</span> <span class="number">3</span> <span class="special">};</span> <span class="comment">// day, month, year
</span><span class="identifier">sregex_token_iterator</span> <span class="identifier">begin</span><span class="special">(</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">re</span><span class="special">,</span> <span class="identifier">sub_matches</span> <span class="special">),</span> <span class="identifier">end</span><span class="special">;</span>
<span class="comment">// write all the date fields to std::cout
</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special">&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="special">&gt;</span> <span class="identifier">out_iter</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span><span class="special">,</span> <span class="string">"\n"</span> <span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">out_iter</span> <span class="special">);</span>
</pre>
<p>
This program displays the following:
</p>
<pre class="programlisting">02
01
2003
23
04
1999
13
11
1981
</pre>
<p>
The <code class="computeroutput"><span class="identifier">sub_matches</span></code> array instructs
the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>
to first take the value of the 2nd sub-match, then the 1st sub-match, and
finally the 3rd. Incrementing the iterator again instructs it to use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
again to find the next match. At that point, the process repeats -- the token
iterator takes the value of the 2nd sub-match, then the 1st, et cetera.
</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.named_captures"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.named_captures" title="Named Captures">Named Captures</a>
</h3></div></div></div>
<a name="boost_xpressive.user_s_guide.named_captures.overview"></a><h3>
<a name="id3110456"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.named_captures.overview">Overview</a>
</h3>
<p>
For complicated regular expressions, dealing with numbered captures can be
a pain. Counting left parentheses to figure out which capture to reference
is no fun. Less fun is the fact that merely editing a regular expression
could cause a capture to be assigned a new number, invalidating code that refers
back to it by the old number.
</p>
<p>
Other regular expression engines solve this problem with a feature called
<span class="emphasis"><em>named captures</em></span>. This feature allows you to assign a
name to a capture, and to refer back to the capture by name rather than by number.
Xpressive also supports named captures, both in dynamic and in static regexes.
</p>
<a name="boost_xpressive.user_s_guide.named_captures.dynamic_named_captures"></a><h3>
<a name="id3110502"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.named_captures.dynamic_named_captures">Dynamic
Named Captures</a>
</h3>
<p>
For dynamic regular expressions, xpressive follows the lead of other popular
regex engines with the syntax of named captures. You can create a named capture
with <code class="computeroutput"><span class="string">"(?P&lt;xxx&gt;...)"</span></code>
and refer back to that capture with <code class="computeroutput"><span class="string">"(?P=xxx)"</span></code>.
Here, for instance, is a regular expression that creates a named capture
and refers back to it:
</p>
<pre class="programlisting"><span class="comment">// Create a named capture called "char" that matches a single
</span><span class="comment">// character and refer back to that capture by name.
</span><span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(?P&lt;char&gt;.)(?P=char)"</span><span class="special">);</span>
</pre>
<p>
The effect of the above regular expression is to find the first doubled character.
</p>
<p>
Once you have executed a match or search operation using a regex with named
captures, you can access the named capture through the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
object using the capture's name.
</p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"tweet"</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(?P&lt;char&gt;.)(?P=char)"</span><span class="special">);</span>
<span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
<span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">))</span>
<span class="special">{</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="string">"char = "</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="string">"char"</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
The above code displays:
</p>
<pre class="programlisting">char = e
</pre>
<p>
You can also refer back to a named capture from within a substitution string.
The syntax for that is <code class="computeroutput"><span class="string">"\\g&lt;xxx&gt;"</span></code>.
Below is some code that demonstrates how to use named captures when doing
string substitution.
</p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"tweet"</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(?P&lt;char&gt;.)(?P=char)"</span><span class="special">);</span>
<span class="identifier">str</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">,</span> <span class="string">"**\\g&lt;char&gt;**"</span><span class="special">,</span> <span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">format_perl</span><span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">str</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
</pre>
<p>
Notice that you have to specify <code class="computeroutput"><span class="identifier">format_perl</span></code>
when referring to named captures in a substitution string, because only the Perl format
syntax recognizes <code class="computeroutput"><span class="string">"\\g&lt;xxx&gt;"</span></code>. The above
code displays:
</p>
<pre class="programlisting">tw**e**t
</pre>
<a name="boost_xpressive.user_s_guide.named_captures.static_named_captures"></a><h3>
<a name="id3111388"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.named_captures.static_named_captures">Static
Named Captures</a>
</h3>
<p>
If you're using static regular expressions, creating and using named captures
is even easier. You can use the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/mark_tag.html" title="Struct mark_tag">mark_tag</a></code></code>
type to create a variable that you can use like <code class="computeroutput"><a class="link" href="../boost/xpressive/s1.html" title="Global s1">s1</a></code>, <code class="computeroutput"><a class="link" href="../boost/xpressive/s1.html" title="Global s1">s2</a></code> and friends, but with a
name that is more meaningful. Below is how the above example would look using
static regexes:
</p>
<pre class="programlisting"><span class="identifier">mark_tag</span> <span class="identifier">char_</span><span class="special">(</span><span class="number">1</span><span class="special">);</span> <span class="comment">// char_ is now a synonym for s1
</span><span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">char_</span><span class="special">=</span> <span class="identifier">_</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="identifier">char_</span><span class="special">;</span>
</pre>
<p>
After a match operation, you can use the <code class="computeroutput"><span class="identifier">mark_tag</span></code>
to index into the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
to access the named capture:
</p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"tweet"</span><span class="special">);</span>
<span class="identifier">mark_tag</span> <span class="identifier">char_</span><span class="special">(</span><span class="number">1</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">char_</span><span class="special">=</span> <span class="identifier">_</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="identifier">char_</span><span class="special">;</span>
<span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
<span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">))</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="string">"char = "</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="identifier">char_</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
The above code displays:
</p>
<pre class="programlisting">char = e
</pre>
<p>
When doing string substitutions with <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>,
you can use named captures to create <span class="emphasis"><em>format expressions</em></span>
as below:
</p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"tweet"</span><span class="special">);</span>
<span class="identifier">mark_tag</span> <span class="identifier">char_</span><span class="special">(</span><span class="number">1</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">char_</span><span class="special">=</span> <span class="identifier">_</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="identifier">char_</span><span class="special">;</span>
<span class="identifier">str</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">,</span> <span class="string">"**"</span> <span class="special">+</span> <span class="identifier">char_</span> <span class="special">+</span> <span class="string">"**"</span><span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">str</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
</pre>
<p>
The above code displays:
</p>
<pre class="programlisting">tw**e**t
</pre>
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
You need to include <code class="literal">&lt;boost/xpressive/regex_actions.hpp&gt;</code>
to use format expressions.
</p></td></tr>
</table></div>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches" title="Grammars and Nested Matches">Grammars
and Nested Matches</a>
</h3></div></div></div>
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.overview"></a><h3>
<a name="id3112133"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.overview">Overview</a>
</h3>
<p>
One of the key benefits of representing regexes as C++ expressions is the
ability to easily refer to other C++ code and data from within the regex.
This enables programming idioms that are not possible with other regular
expression libraries. Of particular note is the ability for one regex to
refer to another regex, allowing you to build grammars out of regular expressions.
This section describes how to embed one regex in another by value and by
reference, how regex objects behave when they refer to other regexes, and
how to access the tree of results after a successful parse.
</p>
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.embedding_a_regex_by_value"></a><h3>
<a name="id3112160"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.embedding_a_regex_by_value">Embedding
a Regex by Value</a>
</h3>
<p>
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>
object has value semantics. When a regex object appears on the right-hand
side in the definition of another regex, it is as if the regex were embedded
by value; that is, a copy of the nested regex is stored by the enclosing
regex. The inner regex is invoked by the outer regex during pattern matching.
The inner regex participates fully in the match, back-tracking as needed
to make the match succeed.
</p>
<p>
Consider a text editor that has a regex-find feature with a whole-word option.
You can implement this with xpressive as follows:
</p>
<pre class="programlisting"><span class="identifier">find_dialog</span> <span class="identifier">dlg</span><span class="special">;</span>
<span class="keyword">if</span><span class="special">(</span> <span class="identifier">dialog_ok</span> <span class="special">==</span> <span class="identifier">dlg</span><span class="special">.</span><span class="identifier">do_modal</span><span class="special">()</span> <span class="special">)</span>
<span class="special">{</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">pattern</span> <span class="special">=</span> <span class="identifier">dlg</span><span class="special">.</span><span class="identifier">get_text</span><span class="special">();</span> <span class="comment">// the pattern the user entered
</span> <span class="keyword">bool</span> <span class="identifier">whole_word</span> <span class="special">=</span> <span class="identifier">dlg</span><span class="special">.</span><span class="identifier">whole_word</span><span class="special">.</span><span class="identifier">is_checked</span><span class="special">();</span> <span class="comment">// did the user select the whole-word option?
</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="identifier">pattern</span> <span class="special">);</span> <span class="comment">// try to compile the pattern
</span>
<span class="keyword">if</span><span class="special">(</span> <span class="identifier">whole_word</span> <span class="special">)</span>
<span class="special">{</span>
<span class="comment">// wrap the regex in begin-word / end-word assertions
</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">bow</span> <span class="special">&gt;&gt;</span> <span class="identifier">re</span> <span class="special">&gt;&gt;</span> <span class="identifier">eow</span><span class="special">;</span>
<span class="special">}</span>
<span class="comment">// ... use re ...
</span><span class="special">}</span>
</pre>
<p>
Look closely at this line:
</p>
<pre class="programlisting"><span class="comment">// wrap the regex in begin-word / end-word assertions
</span><span class="identifier">re</span> <span class="special">=</span> <span class="identifier">bow</span> <span class="special">&gt;&gt;</span> <span class="identifier">re</span> <span class="special">&gt;&gt;</span> <span class="identifier">eow</span><span class="special">;</span>
</pre>
<p>
This line creates a new regex that embeds the old regex by value. Then, the
new regex is assigned back to the original regex. Since a copy of the old
regex was made on the right-hand side, this works as you might expect: the
new regex has the behavior of the old regex wrapped in begin- and end-word
assertions.
</p>
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
Note that <code class="computeroutput"><span class="identifier">re</span> <span class="special">=</span>
<span class="identifier">bow</span> <span class="special">&gt;&gt;</span>
<span class="identifier">re</span> <span class="special">&gt;&gt;</span>
<span class="identifier">eow</span></code> does <span class="emphasis"><em>not</em></span>
define a recursive regular expression, since regex objects embed by value
by default. The next section shows how to define a recursive regular expression
by embedding a regex by reference.
</p></td></tr>
</table></div>
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.embedding_a_regex_by_reference"></a><h3>
<a name="id3112648"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.embedding_a_regex_by_reference">Embedding
a Regex by Reference</a>
</h3>
<p>
If you want to be able to build recursive regular expressions and context-free
grammars, embedding a regex by value is not enough. You need to be able to
make your regular expressions self-referential. Most regular expression engines
don't give you that power, but xpressive does.
</p>
<div class="tip"><table border="0" summary="Tip">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../doc/src/images/tip.png"></td>
<th align="left">Tip</th>
</tr>
<tr><td align="left" valign="top"><p>
The theoretical computer scientists out there will correctly point out
that a self-referential regular expression is not "regular",
so in the strict sense, xpressive isn't really a <span class="emphasis"><em>regular</em></span>
expression engine at all. But as Larry Wall once said, "the term [regular expression] has
grown with the capabilities of our pattern matching engines, so I'm not
going to try to fight linguistic necessity here."
</p></td></tr>
</table></div>
<p>
Consider the following code, which uses the <code class="computeroutput"><span class="identifier">by_ref</span><span class="special">()</span></code> helper to define a recursive regular expression
that matches balanced, nested parentheses:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">parentheses</span><span class="special">;</span>
<span class="identifier">parentheses</span> <span class="comment">// A balanced set of parentheses ...
</span> <span class="special">=</span> <span class="char">'('</span> <span class="comment">// is an opening parenthesis ...
</span> <span class="special">&gt;&gt;</span> <span class="comment">// followed by ...
</span> <span class="special">*(</span> <span class="comment">// zero or more ...
</span> <span class="identifier">keep</span><span class="special">(</span> <span class="special">+~(</span><span class="identifier">set</span><span class="special">=</span><span class="char">'('</span><span class="special">,</span><span class="char">')'</span><span class="special">)</span> <span class="special">)</span> <span class="comment">// of a bunch of things that are not parentheses ...
</span> <span class="special">|</span> <span class="comment">// or ...
</span> <span class="identifier">by_ref</span><span class="special">(</span><span class="identifier">parentheses</span><span class="special">)</span> <span class="comment">// a balanced set of parentheses
</span> <span class="special">)</span> <span class="comment">// (ooh, recursion!) ...
</span> <span class="special">&gt;&gt;</span> <span class="comment">// followed by ...
</span> <span class="char">')'</span> <span class="comment">// a closing parenthesis
</span> <span class="special">;</span>
</pre>
<p>
Matching balanced, nested tags is an important text processing task, and
it is one that "classic" regular expressions cannot do. The <code class="computeroutput"><span class="identifier">by_ref</span><span class="special">()</span></code>
helper makes it possible. It allows one regex object to be embedded in another
<span class="emphasis"><em>by reference</em></span>. Since the right-hand side holds <code class="computeroutput"><span class="identifier">parentheses</span></code> by reference, assigning the
right-hand side back to <code class="computeroutput"><span class="identifier">parentheses</span></code>
creates a cycle, which will execute recursively.
</p>
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.building_a_grammar"></a><h3>
<a name="id3112962"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.building_a_grammar">Building
a Grammar</a>
</h3>
<p>
Once we allow self-reference in our regular expressions, the genie is out
of the bottle and all manner of fun things are possible. In particular, we
can now build grammars out of regular expressions. Let's have a look at the
text-book grammar example: the humble calculator.
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">group</span><span class="special">,</span> <span class="identifier">factor</span><span class="special">,</span> <span class="identifier">term</span><span class="special">,</span> <span class="identifier">expression</span><span class="special">;</span>
<span class="identifier">group</span> <span class="special">=</span> <span class="char">'('</span> <span class="special">&gt;&gt;</span> <span class="identifier">by_ref</span><span class="special">(</span><span class="identifier">expression</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">')'</span><span class="special">;</span>
<span class="identifier">factor</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span> <span class="special">|</span> <span class="identifier">group</span><span class="special">;</span>
<span class="identifier">term</span> <span class="special">=</span> <span class="identifier">factor</span> <span class="special">&gt;&gt;</span> <span class="special">*((</span><span class="char">'*'</span> <span class="special">&gt;&gt;</span> <span class="identifier">factor</span><span class="special">)</span> <span class="special">|</span> <span class="special">(</span><span class="char">'/'</span> <span class="special">&gt;&gt;</span> <span class="identifier">factor</span><span class="special">));</span>
<span class="identifier">expression</span> <span class="special">=</span> <span class="identifier">term</span> <span class="special">&gt;&gt;</span> <span class="special">*((</span><span class="char">'+'</span> <span class="special">&gt;&gt;</span> <span class="identifier">term</span><span class="special">)</span> <span class="special">|</span> <span class="special">(</span><span class="char">'-'</span> <span class="special">&gt;&gt;</span> <span class="identifier">term</span><span class="special">));</span>
</pre>
<p>
The regex <code class="computeroutput"><span class="identifier">expression</span></code> defined
above does something rather remarkable for a regular expression: it matches
mathematical expressions. For example, if the input string were <code class="computeroutput"><span class="string">"foo 9*(10+3) bar"</span></code>, this pattern
would match <code class="computeroutput"><span class="string">"9*(10+3)"</span></code>.
It only matches well-formed mathematical expressions, where the parentheses
are balanced and the infix operators have two arguments each. Don't try this
with just any regular expression engine!
</p>
<p>
Let's take a closer look at this regular expression grammar. Notice that
it is cyclic: <code class="computeroutput"><span class="identifier">expression</span></code>
is implemented in terms of <code class="computeroutput"><span class="identifier">term</span></code>,
which is implemented in terms of <code class="computeroutput"><span class="identifier">factor</span></code>,
which is implemented in terms of <code class="computeroutput"><span class="identifier">group</span></code>,
which is implemented in terms of <code class="computeroutput"><span class="identifier">expression</span></code>,
closing the loop. In general, the way to define a cyclic grammar is to forward-declare
the regex objects and embed by reference those regular expressions that have
not yet been initialized. In the above grammar, there is only one place where
we need to reference a regex object that has not yet been initialized: the
definition of <code class="computeroutput"><span class="identifier">group</span></code>. In that
place, we use <code class="computeroutput"><span class="identifier">by_ref</span><span class="special">()</span></code>
to embed <code class="computeroutput"><span class="identifier">expression</span></code> by reference.
In all other places, it is sufficient to embed the other regex objects by
value, since they have already been initialized and their values will not
change.
</p>
<div class="tip"><table border="0" summary="Tip">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../doc/src/images/tip.png"></td>
<th align="left">Tip</th>
</tr>
<tr><td align="left" valign="top"><p>
<span class="bold"><strong>Embed by value if possible</strong></span> <br> <br>
In general, prefer embedding regular expressions by value rather than by
reference. It involves one less indirection, making your patterns match
a little faster. Besides, value semantics are simpler and will make your
grammars easier to reason about. Don't worry about the expense of "copying"
a regex. Each regex object shares its implementation with all of its copies.
</p></td></tr>
</table></div>
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.dynamic_regex_grammars"></a><h3>
<a name="id3113448"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.dynamic_regex_grammars">Dynamic
Regex Grammars</a>
</h3>
<p>
Using <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler&lt;&gt;</a></code></code>,
you can also build grammars out of dynamic regular expressions. You do that
by creating named regexes, and referring to other regexes by name. Each
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler&lt;&gt;</a></code></code>
instance keeps a mapping from names to regexes that have been created with
it.
</p>
<p>
You can create a named dynamic regex by prefacing your regex with <code class="computeroutput"><span class="string">"(?$name=)"</span></code>, where <span class="emphasis"><em>name</em></span>
is the name of the regex. You can refer to a named regex from another regex
with <code class="computeroutput"><span class="string">"(?$name)"</span></code>. The
named regex does not need to exist yet at the time it is referenced in another
regex, but it must exist by the time you use the regex.
</p>
<p>
Below is a code fragment that uses dynamic regex grammars to implement the
calculator example from above.
</p>
<pre class="programlisting"><span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">regex_constants</span><span class="special">;</span>
<span class="identifier">sregex</span> <span class="identifier">expr</span><span class="special">;</span>
<span class="special">{</span>
<span class="identifier">sregex_compiler</span> <span class="identifier">compiler</span><span class="special">;</span>
<span class="identifier">syntax_option_type</span> <span class="identifier">x</span> <span class="special">=</span> <span class="identifier">ignore_white_space</span><span class="special">;</span>
<span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(? $group = ) \\( (? $expr ) \\) "</span><span class="special">,</span> <span class="identifier">x</span><span class="special">);</span>
<span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(? $factor = ) \\d+ | (? $group ) "</span><span class="special">,</span> <span class="identifier">x</span><span class="special">);</span>
<span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(? $term = ) (? $factor )"</span>
<span class="string">" ( \\* (? $factor ) | / (? $factor ) )* "</span><span class="special">,</span> <span class="identifier">x</span><span class="special">);</span>
<span class="identifier">expr</span> <span class="special">=</span> <span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(? $expr = ) (? $term )"</span>
<span class="string">" ( \\+ (? $term ) | - (? $term ) )* "</span><span class="special">,</span> <span class="identifier">x</span><span class="special">);</span>
<span class="special">}</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"foo 9*(10+3) bar"</span><span class="special">);</span>
<span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
<span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">expr</span><span class="special">))</span>
<span class="special">{</span>
<span class="comment">// This prints "9*(10+3)":
</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
As with static regex grammars, nested regex invocations create nested match
results (see <span class="emphasis"><em>Nested Results</em></span> below). The result is a
complete parse tree for the string that matched. Unlike static regexes, dynamic
regexes are always embedded by reference, not by value.
</p>
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.cyclic_patterns__copying_and_memory_management__oh_my_"></a><h3>
<a name="id3114027"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.cyclic_patterns__copying_and_memory_management__oh_my_">Cyclic
Patterns, Copying and Memory Management, Oh My!</a>
</h3>
<p>
The calculator examples above raise a number of very complicated memory-management
issues. Each of the four regex objects refers to the others, some directly
and some indirectly, some by value and some by reference. What if we were
to return one of them from a function and let the others go out of scope?
What becomes of the references? The answer is that the regex objects are
internally reference counted, such that they keep their referenced regex
objects alive as long as they need them. So passing a regex object by value
is never a problem, even if it refers to other regex objects that have gone
out of scope.
</p>
<p>
Those of you who have dealt with reference counting are probably familiar
with its Achilles' heel: cyclic references. If regex objects are reference
counted, what happens to cycles like the one created in the calculator examples?
Are they leaked? The answer is no, they are not leaked. The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>
object has some tricky reference tracking code that ensures that even cyclic
regex grammars are cleaned up when the last external reference goes away.
So don't worry about it. Create cyclic grammars, pass your regex objects
around and copy them all you want. It is fast and efficient and guaranteed
not to leak or result in dangling references.
</p>
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.nested_regexes_and_sub_match_scoping"></a><h3>
<a name="id3114084"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.nested_regexes_and_sub_match_scoping">Nested
Regexes and Sub-Match Scoping</a>
</h3>
<p>
Nested regular expressions raise the issue of sub-match scoping. If both
the inner and outer regex wrote to and read from the same sub-match vector,
chaos would ensue. The inner regex would stomp on the sub-matches written
by the outer regex. For example, what does this do?
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">inner</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(.)\\1"</span> <span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">outer</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="identifier">_</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="identifier">inner</span> <span class="special">&gt;&gt;</span> <span class="identifier">s1</span><span class="special">;</span>
</pre>
<p>
The author probably didn't intend for the inner regex to overwrite the sub-match
written by the outer regex. The problem is particularly acute when the inner
regex is accepted from the user as input. The author has no way of knowing
whether the inner regex will stomp the sub-match vector or not. This is clearly
not acceptable.
</p>
<p>
Instead, what actually happens is that each invocation of a nested regex
gets its own scope. Sub-matches belong to that scope. That is, each nested
regex invocation gets its own copy of the sub-match vector to play with,
so there is no way for an inner regex to stomp on the sub-matches of an outer
regex. So, for example, the regex <code class="computeroutput"><span class="identifier">outer</span></code>
defined above would match <code class="computeroutput"><span class="string">"ABBA"</span></code>,
as it should.
</p>
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.nested_results"></a><h3>
<a name="id3114265"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.nested_results">Nested
Results</a>
</h3>
<p>
If nested regexes have their own sub-matches, there should be a way to access
them after a successful match. In fact, there is. After a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>,
the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
struct behaves like the head of a tree of nested results. The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
struct provides a <code class="computeroutput"><span class="identifier">nested_results</span><span class="special">()</span></code> member function that returns an ordered
sequence of <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
structures, representing the results of the nested regexes. The order of
the nested results is the same as the order in which the nested regex objects
matched.
</p>
<p>
Take as an example the regex for balanced, nested parentheses we saw earlier:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">parentheses</span><span class="special">;</span>
<span class="identifier">parentheses</span> <span class="special">=</span> <span class="char">'('</span> <span class="special">&gt;&gt;</span> <span class="special">*(</span> <span class="identifier">keep</span><span class="special">(</span> <span class="special">+~(</span><span class="identifier">set</span><span class="special">=</span><span class="char">'('</span><span class="special">,</span><span class="char">')'</span><span class="special">)</span> <span class="special">)</span> <span class="special">|</span> <span class="identifier">by_ref</span><span class="special">(</span><span class="identifier">parentheses</span><span class="special">)</span> <span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">')'</span><span class="special">;</span>
<span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span> <span class="string">"blah blah( a(b)c (c(e)f (g)h )i (j)6 )blah"</span> <span class="special">);</span>
<span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_search</span><span class="special">(</span> <span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">parentheses</span> <span class="special">)</span> <span class="special">)</span>
<span class="special">{</span>
<span class="comment">// display the whole match
</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span>
<span class="comment">// display the nested results
</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">for_each</span><span class="special">(</span>
<span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">begin</span><span class="special">(),</span>
<span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">end</span><span class="special">(),</span>
<span class="identifier">output_nested_results</span><span class="special">()</span> <span class="special">);</span>
<span class="special">}</span>
</pre>
<p>
This program displays the following:
</p>
<pre class="programlisting">( a(b)c (c(e)f (g)h )i (j)6 )
(b)
(c(e)f (g)h )
(e)
(g)
(j)
</pre>
<p>
Here you can see how the results are nested and that they are stored in the
order in which they are found.
</p>
<div class="tip"><table border="0" summary="Tip">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../doc/src/images/tip.png"></td>
<th align="left">Tip</th>
</tr>
<tr><td align="left" valign="top"><p>
See the definition of <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.display_a_tree_of_nested_results">output_nested_results</a>
in the <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">Examples</a>
section.
</p></td></tr>
</table></div>
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.filtering_nested_results"></a><h3>
<a name="id3114840"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.filtering_nested_results">Filtering
Nested Results</a>
</h3>
<p>
Sometimes a regex will have several nested regex objects, and you want to
know which result corresponds to which regex object. That's where <code class="computeroutput"><span class="identifier">basic_regex</span><span class="special">&lt;&gt;::</span><span class="identifier">regex_id</span><span class="special">()</span></code>
and <code class="computeroutput"><span class="identifier">match_results</span><span class="special">&lt;&gt;::</span><span class="identifier">regex_id</span><span class="special">()</span></code>
come in handy. When iterating over the nested results, you can compare the
regex id from the results to the id of the regex object you're interested
in.
</p>
<p>
To make this easier, xpressive provides a predicate for iterating over
just the results that correspond to a certain nested regex.
It is called <code class="computeroutput"><span class="identifier">regex_id_filter_predicate</span></code>,
and it is intended to be used with <a href="../../../libs/iterator/doc/index.html" target="_top">Boost.Iterator</a>.
You can use it as follows:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">name</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">alpha</span><span class="special">;</span>
<span class="identifier">sregex</span> <span class="identifier">integer</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span><span class="special">;</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="special">*(</span> <span class="special">*</span><span class="identifier">_s</span> <span class="special">&gt;&gt;</span> <span class="special">(</span> <span class="identifier">name</span> <span class="special">|</span> <span class="identifier">integer</span> <span class="special">)</span> <span class="special">);</span>
<span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span> <span class="string">"marsha 123 jan 456 cindy 789"</span> <span class="special">);</span>
<span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">re</span> <span class="special">)</span> <span class="special">)</span>
<span class="special">{</span>
<span class="identifier">smatch</span><span class="special">::</span><span class="identifier">nested_results_type</span><span class="special">::</span><span class="identifier">const_iterator</span> <span class="identifier">begin</span> <span class="special">=</span> <span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">begin</span><span class="special">();</span>
<span class="identifier">smatch</span><span class="special">::</span><span class="identifier">nested_results_type</span><span class="special">::</span><span class="identifier">const_iterator</span> <span class="identifier">end</span> <span class="special">=</span> <span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">end</span><span class="special">();</span>
<span class="comment">// declare filter predicates to select just the names or the integers
</span> <span class="identifier">sregex_id_filter_predicate</span> <span class="identifier">name_id</span><span class="special">(</span> <span class="identifier">name</span><span class="special">.</span><span class="identifier">regex_id</span><span class="special">()</span> <span class="special">);</span>
<span class="identifier">sregex_id_filter_predicate</span> <span class="identifier">integer_id</span><span class="special">(</span> <span class="identifier">integer</span><span class="special">.</span><span class="identifier">regex_id</span><span class="special">()</span> <span class="special">);</span>
<span class="comment">// iterate over only the results from the name regex
</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">for_each</span><span class="special">(</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">make_filter_iterator</span><span class="special">(</span> <span class="identifier">name_id</span><span class="special">,</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span> <span class="special">),</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">make_filter_iterator</span><span class="special">(</span> <span class="identifier">name_id</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">end</span> <span class="special">),</span>
<span class="identifier">output_result</span>
<span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span>
<span class="comment">// iterate over only the results from the integer regex
</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">for_each</span><span class="special">(</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">make_filter_iterator</span><span class="special">(</span> <span class="identifier">integer_id</span><span class="special">,</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span> <span class="special">),</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">make_filter_iterator</span><span class="special">(</span> <span class="identifier">integer_id</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">end</span> <span class="special">),</span>
<span class="identifier">output_result</span>
<span class="special">);</span>
<span class="special">}</span>
</pre>
<p>
where <code class="computeroutput"><span class="identifier">output_result</span></code> is a
simple function that takes a <code class="computeroutput"><span class="identifier">smatch</span></code>
and displays the full match. Notice how we use the <code class="computeroutput"><span class="identifier">regex_id_filter_predicate</span></code>
together with <code class="computeroutput"><span class="identifier">basic_regex</span><span class="special">&lt;&gt;::</span><span class="identifier">regex_id</span><span class="special">()</span></code> and <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">make_filter_iterator</span><span class="special">()</span></code> from the <a href="../../../libs/iterator/doc/index.html" target="_top">Boost.Iterator</a>
library to select only those results corresponding to a particular nested regex.
This program displays the following:
</p>
<pre class="programlisting">marsha
jan
cindy
123
456
789
</pre>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions" title="Semantic Actions and User-Defined Assertions">Semantic
Actions and User-Defined Assertions</a>
</h3></div></div></div>
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.overview"></a><h3>
<a name="id3115810"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.overview">Overview</a>
</h3>
<p>
Imagine you want to parse an input string and build a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special">&lt;&gt;</span></code>
from it. For something like that, matching a regular expression isn't enough.
You want to <span class="emphasis"><em>do something</em></span> when parts of your regular
expression match. Xpressive lets you attach semantic actions to parts of
your static regular expressions. This section shows you how.
</p>
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.semantic_actions"></a><h3>
<a name="id3115869"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.semantic_actions">Semantic
Actions</a>
</h3>
<p>
Consider the following code, which uses xpressive's semantic actions to parse
a string of word/integer pairs and stuffs them into a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special">&lt;&gt;</span></code>.
It is described below.
</p>
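<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">string</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">map</span><span class="special">&gt;</span>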
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">result</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"aaa=&gt;1 bbb=&gt;23 ccc=&gt;456"</span><span class="special">);</span>
<span class="comment">// Match a word and an integer, separated by =&gt;,
</span> <span class="comment">// and then stuff the result into a std::map&lt;&gt;
</span> <span class="identifier">sregex</span> <span class="identifier">pair</span> <span class="special">=</span> <span class="special">(</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="string">"=&gt;"</span> <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">s2</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span><span class="special">)</span> <span class="special">)</span>
<span class="special">[</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]</span> <span class="special">=</span> <span class="identifier">as</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;(</span><span class="identifier">s2</span><span class="special">)</span> <span class="special">];</span>
<span class="comment">// Match one or more word/integer pairs, separated
</span> <span class="comment">// by whitespace.
</span> <span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="identifier">pair</span> <span class="special">&gt;&gt;</span> <span class="special">*(+</span><span class="identifier">_s</span> <span class="special">&gt;&gt;</span> <span class="identifier">pair</span><span class="special">);</span>
<span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_match</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">))</span>
<span class="special">{</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"aaa"</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"bbb"</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"ccc"</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span>
<span class="special">}</span>
<span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
This program prints the following:
</p>
<pre class="programlisting">1
23
456
</pre>
<p>
The regular expression <code class="computeroutput"><span class="identifier">pair</span></code>
has two parts: the pattern and the action. The pattern says to match a word,
capturing it in sub-match 1, and an integer, capturing it in sub-match 2,
separated by <code class="computeroutput"><span class="string">"=&gt;"</span></code>.
The action is the part in square brackets: <code class="computeroutput"><span class="special">[</span>
<span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]</span> <span class="special">=</span>
<span class="identifier">as</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;(</span><span class="identifier">s2</span><span class="special">)</span> <span class="special">]</span></code>. It says
to take sub-match 1 and use it to index into the <code class="computeroutput"><span class="identifier">result</span></code>
map, and assign to it the result of converting sub-match 2 to an integer.
</p>
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
To use semantic actions with your static regexes, you must <code class="computeroutput"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span></code>
</p></td></tr>
</table></div>
<p>
How does this work? Like the rest of the static regular expression, the
part between brackets is an expression template. It encodes the action and
executes it later. The expression <code class="computeroutput"><span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)</span></code> creates a lazy reference to the <code class="computeroutput"><span class="identifier">result</span></code> object. The larger expression <code class="computeroutput"><span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]</span></code>
is a lazy map index operation. Later, when this action is executed,
<code class="computeroutput"><span class="identifier">s1</span></code> gets replaced with the
first <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match&lt;&gt;</a></code></code>.
Likewise, when <code class="computeroutput"><span class="identifier">as</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;(</span><span class="identifier">s2</span><span class="special">)</span></code> gets executed, <code class="computeroutput"><span class="identifier">s2</span></code>
is replaced with the second <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match&lt;&gt;</a></code></code>.
The <code class="computeroutput"><span class="identifier">as</span><span class="special">&lt;&gt;</span></code>
action converts its argument to the requested type using Boost.Lexical_cast.
The effect of the whole action is to insert a new word/integer pair into
the map.
</p>
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
There is an important difference between the function <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">ref</span><span class="special">()</span></code> in <code class="computeroutput"><span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">ref</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span></code>
and <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">ref</span><span class="special">()</span></code>
in <code class="computeroutput"><span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span></code>. The first returns a plain <code class="computeroutput"><span class="identifier">reference_wrapper</span><span class="special">&lt;&gt;</span></code>
which behaves in many respects like an ordinary reference. By contrast,
<code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">ref</span><span class="special">()</span></code>
returns a <span class="emphasis"><em>lazy</em></span> reference that you can use in expressions
that are executed lazily. That is why we can say <code class="computeroutput"><span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]</span></code>, even though <code class="computeroutput"><span class="identifier">result</span></code>
doesn't have an <code class="computeroutput"><span class="keyword">operator</span><span class="special">[]</span></code>
that would accept <code class="computeroutput"><span class="identifier">s1</span></code>.
</p></td></tr>
</table></div>
<p>
In addition to the sub-match placeholders <code class="computeroutput"><span class="identifier">s1</span></code>,
<code class="computeroutput"><span class="identifier">s2</span></code>, etc., you can also use
the placeholder <code class="computeroutput"><span class="identifier">_</span></code> within
an action to refer back to the string matched by the sub-expression to which
the action is attached. For instance, you can use the following regex to
match a bunch of digits, interpret them as an integer and assign the result
to a local variable:
</p>
<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
<span class="comment">// Here, _ refers back to all the
</span><span class="comment">// characters matched by (+_d)
</span><span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">(+</span><span class="identifier">_d</span><span class="special">)[</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">=</span> <span class="identifier">as</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;(</span><span class="identifier">_</span><span class="special">)</span> <span class="special">];</span>
</pre>
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.lazy_action_execution"></a><h4>
<a name="id3117436"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.lazy_action_execution">Lazy
Action Execution</a>
</h4>
<p>
What does it mean, exactly, to attach an action to part of a regular expression
and perform a match? When does the action execute? If the action is part
of a repeated sub-expression, does the action execute once or many times?
And if the sub-expression initially matches, but ultimately fails because
the rest of the regular expression fails to match, is the action executed
at all?
</p>
<p>
The answer is that by default, actions are executed <span class="emphasis"><em>lazily</em></span>.
When a sub-expression matches a string, its action is placed on a queue,
along with the current values of any sub-matches to which the action refers.
If the match algorithm must backtrack, actions are popped off the queue as
necessary. Only after the entire regex has matched successfully are the actions
actually executed. They are executed all at once, in the order in which they
were added to the queue, as the last step before <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
returns.
</p>
<p>
For example, consider the following regex that increments a counter whenever
it finds a digit.
</p>
<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"1!2!3?"</span><span class="special">);</span>
<span class="comment">// count the exciting digits, but not the
</span><span class="comment">// questionable ones.
</span><span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">]</span> <span class="special">&gt;&gt;</span> <span class="char">'!'</span> <span class="special">);</span>
<span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rex</span><span class="special">);</span>
<span class="identifier">assert</span><span class="special">(</span> <span class="identifier">i</span> <span class="special">==</span> <span class="number">2</span> <span class="special">);</span>
</pre>
<p>
The action <code class="computeroutput"><span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span></code>
is queued three times: once for each found digit. But it is only <span class="emphasis"><em>executed</em></span>
twice: once for each digit that precedes a <code class="computeroutput"><span class="char">'!'</span></code>
character. When the <code class="computeroutput"><span class="char">'?'</span></code> character
is encountered, the match algorithm backtracks, removing the final action
from the queue.
</p>
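<p>
The queue-and-backtrack behavior can be modeled in a few lines of plain C++.
The sketch below is a conceptual toy, not xpressive's implementation: it
hard-codes the pattern <code class="computeroutput">+( _d [ ++ref(i) ] &gt;&gt; '!' )</code>,
anchors it at the start of the input, and uses a hypothetical function name,
<code class="computeroutput">run_deferred_actions</code>.
</p>

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy model of +( _d [ ++ref(i) ] >> '!' ), anchored at the start of
// `input`. Returns the number of times the action actually ran.
int run_deferred_actions(std::string const &input)
{
    int counter = 0;
    std::vector<std::function<void()> > pending; // the action queue
    std::size_t pos = 0;

    while(pos < input.size()
          && std::isdigit(static_cast<unsigned char>(input[pos])))
    {
        // A digit matched: queue the action, but do not run it yet.
        pending.push_back([&counter]() { ++counter; });

        if(pos + 1 < input.size() && input[pos + 1] == '!')
        {
            pos += 2; // this repetition succeeded, keep its action
        }
        else
        {
            pending.pop_back(); // backtrack: discard the queued action
            break;
        }
    }

    // No complete repetition means no match, so nothing executes.
    if(pending.empty())
        return 0;

    // The match succeeded: run the surviving actions in queue order.
    for(std::size_t n = 0; n < pending.size(); ++n)
        pending[n]();

    return counter;
}
```

<p>
For the input <code class="computeroutput">"1!2!3?"</code> the toy queues three actions,
discards the last one when <code class="computeroutput">'?'</code> is seen, and runs the
remaining two.
</p>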
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.immediate_action_execution"></a><h4>
<a name="id3117770"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.immediate_action_execution">Immediate
Action Execution</a>
</h4>
<p>
When you want semantic actions to execute immediately, you can wrap the sub-expression
containing the action in a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/keep.html" title="Function template keep">keep()</a></code></code>.
<code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code>
turns off back-tracking for its sub-expression, but it also causes any actions
queued by the sub-expression to execute at the end of the <code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code>. It is as if the sub-expression in the
<code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code>
were compiled into an independent regex object, and matching the <code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code>
is like a separate invocation of <code class="computeroutput"><span class="identifier">regex_search</span><span class="special">()</span></code>. It matches characters and executes actions
but never backtracks or unwinds. For example, imagine the above example had
been written as follows:
</p>
<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"1!2!3?"</span><span class="special">);</span>
<span class="comment">// count all the digits.
</span><span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">keep</span><span class="special">(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">]</span> <span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">'!'</span> <span class="special">);</span>
<span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rex</span><span class="special">);</span>
<span class="identifier">assert</span><span class="special">(</span> <span class="identifier">i</span> <span class="special">==</span> <span class="number">3</span> <span class="special">);</span>
</pre>
<p>
We have wrapped the sub-expression <code class="computeroutput"><span class="identifier">_d</span>
<span class="special">[</span> <span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">]</span></code> in <code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code>.
Now, whenever this regex matches a digit, the action will be queued and then
immediately executed before we try to match a <code class="computeroutput"><span class="char">'!'</span></code>
character. In this case, the action executes three times.
</p>
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
Like <code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code>,
actions within <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/before.html" title="Function template before">before()</a></code></code>
and <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/after.html" title="Function template after">after()</a></code></code>
are also executed early when their sub-expressions have matched.
</p></td></tr>
</table></div>
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.lazy_functions"></a><h4>
<a name="id3118228"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.lazy_functions">Lazy
Functions</a>
</h4>
<p>
So far, we've seen how to write semantic actions consisting of variables
and operators. But what if you want to be able to call a function from a
semantic action? Xpressive provides a mechanism to do this.
</p>
<p>
The first step is to define a function object type. Here, for instance, is
a function object type that calls <code class="computeroutput"><span class="identifier">push</span><span class="special">()</span></code> on its argument:
</p>
<pre class="programlisting"><span class="keyword">struct</span> <span class="identifier">push_impl</span>
<span class="special">{</span>
    <span class="comment">// Result type, needed for tr1::result_of
</span>    <span class="keyword">typedef</span> <span class="keyword">void</span> <span class="identifier">result_type</span><span class="special">;</span>
    <span class="keyword">template</span><span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Sequence</span><span class="special">,</span> <span class="keyword">typename</span> <span class="identifier">Value</span><span class="special">&gt;</span>
    <span class="keyword">void</span> <span class="keyword">operator</span><span class="special">()(</span><span class="identifier">Sequence</span> <span class="special">&amp;</span><span class="identifier">seq</span><span class="special">,</span> <span class="identifier">Value</span> <span class="keyword">const</span> <span class="special">&amp;</span><span class="identifier">val</span><span class="special">)</span> <span class="keyword">const</span>
    <span class="special">{</span>
        <span class="identifier">seq</span><span class="special">.</span><span class="identifier">push</span><span class="special">(</span><span class="identifier">val</span><span class="special">);</span>
    <span class="special">}</span>
<span class="special">};</span>
</pre>
<p>
The next step is to use xpressive's <code class="computeroutput"><span class="identifier">function</span><span class="special">&lt;&gt;</span></code> template to define a function object
named <code class="computeroutput"><span class="identifier">push</span></code>:
</p>
<pre class="programlisting"><span class="comment">// Global "push" function object.
</span><span class="identifier">function</span><span class="special">&lt;</span><span class="identifier">push_impl</span><span class="special">&gt;::</span><span class="identifier">type</span> <span class="keyword">const</span> <span class="identifier">push</span> <span class="special">=</span> <span class="special">{{}};</span>
</pre>
<p>
The initialization looks a bit odd, but this is because <code class="computeroutput"><span class="identifier">push</span></code>
is being statically initialized. That means it doesn't need to be constructed
at runtime. We can use <code class="computeroutput"><span class="identifier">push</span></code>
in semantic actions as follows:
</p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">stack</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">ints</span><span class="special">;</span>
<span class="comment">// Match digits, cast them to an int
</span><span class="comment">// and push it on the stack.
</span><span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">(+</span><span class="identifier">_d</span><span class="special">)[</span><span class="identifier">push</span><span class="special">(</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">ints</span><span class="special">),</span> <span class="identifier">as</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;(</span><span class="identifier">_</span><span class="special">))];</span>
</pre>
<p>
You'll notice that doing it this way causes member function invocations to
look like ordinary function invocations. You can choose to write your semantic
action in a different way that makes it look a bit more like a member function
call:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">(+</span><span class="identifier">_d</span><span class="special">)[</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">ints</span><span class="special">)-&gt;*</span><span class="identifier">push</span><span class="special">(</span><span class="identifier">as</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;(</span><span class="identifier">_</span><span class="special">))];</span>
</pre>
<p>
Xpressive recognizes the use of <code class="computeroutput"><span class="special">-&gt;*</span></code>
and treats this expression exactly the same as the one above.
</p>
<p>
When your function object must return a type that depends on its arguments,
you can use a <code class="computeroutput"><span class="identifier">result</span><span class="special">&lt;&gt;</span></code>
member template instead of the <code class="computeroutput"><span class="identifier">result_type</span></code>
typedef. Here, for example, is a <code class="computeroutput"><span class="identifier">first</span></code>
function object that returns the <code class="computeroutput"><span class="identifier">first</span></code>
member of a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">pair</span><span class="special">&lt;&gt;</span></code>
or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match&lt;&gt;</a></code></code>:
</p>
<pre class="programlisting"><span class="comment">// Function object that returns the
</span><span class="comment">// first element of a pair.
</span><span class="keyword">struct</span> <span class="identifier">first_impl</span>
<span class="special">{</span>
    <span class="keyword">template</span><span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Sig</span><span class="special">&gt;</span> <span class="keyword">struct</span> <span class="identifier">result</span> <span class="special">{};</span>
    <span class="keyword">template</span><span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">This</span><span class="special">,</span> <span class="keyword">typename</span> <span class="identifier">Pair</span><span class="special">&gt;</span>
    <span class="keyword">struct</span> <span class="identifier">result</span><span class="special">&lt;</span><span class="identifier">This</span><span class="special">(</span><span class="identifier">Pair</span><span class="special">)&gt;</span>
    <span class="special">{</span>
        <span class="keyword">typedef</span> <span class="keyword">typename</span> <span class="identifier">remove_reference</span><span class="special">&lt;</span><span class="identifier">Pair</span><span class="special">&gt;</span>
            <span class="special">::</span><span class="identifier">type</span><span class="special">::</span><span class="identifier">first_type</span> <span class="identifier">type</span><span class="special">;</span>
    <span class="special">};</span>
    <span class="keyword">template</span><span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Pair</span><span class="special">&gt;</span>
    <span class="keyword">typename</span> <span class="identifier">Pair</span><span class="special">::</span><span class="identifier">first_type</span>
    <span class="keyword">operator</span><span class="special">()(</span><span class="identifier">Pair</span> <span class="keyword">const</span> <span class="special">&amp;</span><span class="identifier">p</span><span class="special">)</span> <span class="keyword">const</span>
    <span class="special">{</span>
        <span class="keyword">return</span> <span class="identifier">p</span><span class="special">.</span><span class="identifier">first</span><span class="special">;</span>
    <span class="special">}</span>
<span class="special">};</span>
<span class="comment">// OK, use as first(s1) to get the begin iterator
</span><span class="comment">// of the sub-match referred to by s1.
</span><span class="identifier">function</span><span class="special">&lt;</span><span class="identifier">first_impl</span><span class="special">&gt;::</span><span class="identifier">type</span> <span class="keyword">const</span> <span class="identifier">first</span> <span class="special">=</span> <span class="special">{{}};</span>
</pre>
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.referring_to_local_variables"></a><h4>
<a name="id3119309"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.referring_to_local_variables">Referring
to Local Variables</a>
</h4>
<p>
As we've seen in the examples above, we can refer to local variables within
an action using <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">ref</span><span class="special">()</span></code>.
Any such variables are held by reference by the regular expression, and care
should be taken to avoid letting those references dangle. For instance, in
the following code, the reference to <code class="computeroutput"><span class="identifier">i</span></code>
is left to dangle when <code class="computeroutput"><span class="identifier">bad_voodoo</span><span class="special">()</span></code> returns:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">bad_voodoo</span><span class="special">()</span>
<span class="special">{</span>
    <span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
    <span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">]</span> <span class="special">&gt;&gt;</span> <span class="char">'!'</span> <span class="special">);</span>
    <span class="comment">// ERROR! rex refers by reference to a local
</span>    <span class="comment">// variable, which will dangle after bad_voodoo()
</span>    <span class="comment">// returns.
</span>    <span class="keyword">return</span> <span class="identifier">rex</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
When writing semantic actions, it is your responsibility to make sure that
none of the references dangle. One way to do that is to make the
variables shared pointers that are held by the regex by value.
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">good_voodoo</span><span class="special">(</span><span class="identifier">boost</span><span class="special">::</span><span class="identifier">shared_ptr</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">pi</span><span class="special">)</span>
<span class="special">{</span>
    <span class="comment">// Use val() to hold the shared_ptr by value:
</span>    <span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++*</span><span class="identifier">val</span><span class="special">(</span><span class="identifier">pi</span><span class="special">)</span> <span class="special">]</span> <span class="special">&gt;&gt;</span> <span class="char">'!'</span> <span class="special">);</span>
    <span class="comment">// OK, rex holds a reference count to the integer.
</span>    <span class="keyword">return</span> <span class="identifier">rex</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
In the above code, we use <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">val</span><span class="special">()</span></code>
to hold the shared pointer by value. That's not normally necessary, because
local variables appearing in actions are held by value by default, but here
it is required. Had we written the action as <code class="computeroutput"><span class="special">++*</span><span class="identifier">pi</span></code>, it would have executed immediately.
That's because <code class="computeroutput"><span class="special">++*</span><span class="identifier">pi</span></code>
is not an expression template, but <code class="computeroutput"><span class="special">++*</span><span class="identifier">val</span><span class="special">(</span><span class="identifier">pi</span><span class="special">)</span></code> is.
</p>
<p>
It can be tedious to wrap all your variables in <code class="computeroutput"><span class="identifier">ref</span><span class="special">()</span></code> and <code class="computeroutput"><span class="identifier">val</span><span class="special">()</span></code> in your semantic actions. Xpressive provides
the <code class="computeroutput"><span class="identifier">reference</span><span class="special">&lt;&gt;</span></code>
and <code class="computeroutput"><span class="identifier">value</span><span class="special">&lt;&gt;</span></code>
templates to make things easier. The following table shows the equivalencies:
</p>
<div class="table">
<a name="id3119864"></a><p class="title"><b>Table&#160;29.12.&#160;reference&lt;&gt; and value&lt;&gt;</b></p>
<div class="table-contents"><table class="table" summary="reference&lt;&gt; and value&lt;&gt;">
<colgroup>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
This ...
</p>
</th>
<th>
<p>
... is equivalent to this ...
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<p>
</p>
<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">]</span> <span class="special">&gt;&gt;</span> <span class="char">'!'</span> <span class="special">);</span></pre>
<p>
</p>
</td>
<td>
<p>
</p>
<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
<span class="identifier">reference</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">ri</span><span class="special">(</span><span class="identifier">i</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ri</span> <span class="special">]</span> <span class="special">&gt;&gt;</span> <span class="char">'!'</span> <span class="special">);</span></pre>
<p>
</p>
</td>
</tr>
<tr>
<td>
<p>
</p>
<pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">shared_ptr</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">pi</span><span class="special">(</span><span class="keyword">new</span> <span class="keyword">int</span><span class="special">(</span><span class="number">0</span><span class="special">));</span>
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++*</span><span class="identifier">val</span><span class="special">(</span><span class="identifier">pi</span><span class="special">)</span> <span class="special">]</span> <span class="special">&gt;&gt;</span> <span class="char">'!'</span> <span class="special">);</span></pre>
<p>
</p>
</td>
<td>
<p>
</p>
<pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">shared_ptr</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">pi</span><span class="special">(</span><span class="keyword">new</span> <span class="keyword">int</span><span class="special">(</span><span class="number">0</span><span class="special">));</span>
<span class="identifier">value</span><span class="special">&lt;</span><span class="identifier">boost</span><span class="special">::</span><span class="identifier">shared_ptr</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span> <span class="special">&gt;</span> <span class="identifier">vpi</span><span class="special">(</span><span class="identifier">pi</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++*</span><span class="identifier">vpi</span> <span class="special">]</span> <span class="special">&gt;&gt;</span> <span class="char">'!'</span> <span class="special">);</span></pre>
<p>
</p>
</td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><p>
As you can see, when using <code class="computeroutput"><span class="identifier">reference</span><span class="special">&lt;&gt;</span></code>, you need to first declare a local
variable and then declare a <code class="computeroutput"><span class="identifier">reference</span><span class="special">&lt;&gt;</span></code> to it. These two steps can be combined
into one using <code class="computeroutput"><span class="identifier">local</span><span class="special">&lt;&gt;</span></code>.
</p>
<div class="table">
<a name="id3120542"></a><p class="title"><b>Table&#160;29.13.&#160;local&lt;&gt; vs. reference&lt;&gt;</b></p>
<div class="table-contents"><table class="table" summary="local&lt;&gt; vs. reference&lt;&gt;">
<colgroup>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
This ...
</p>
</th>
<th>
<p>
... is equivalent to this ...
</p>
</th>
</tr></thead>
<tbody><tr>
<td>
<p>
</p>
<pre class="programlisting"><span class="identifier">local</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">i</span><span class="special">(</span><span class="number">0</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">i</span> <span class="special">]</span> <span class="special">&gt;&gt;</span> <span class="char">'!'</span> <span class="special">);</span></pre>
<p>
</p>
</td>
<td>
<p>
</p>
<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
<span class="identifier">reference</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">ri</span><span class="special">(</span><span class="identifier">i</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ri</span> <span class="special">]</span> <span class="special">&gt;&gt;</span> <span class="char">'!'</span> <span class="special">);</span></pre>
<p>
</p>
</td>
</tr></tbody>
</table></div>
</div>
<br class="table-break"><p>
We can use <code class="computeroutput"><span class="identifier">local</span><span class="special">&lt;&gt;</span></code>
to rewrite the above example as follows:
</p>
<pre class="programlisting"><span class="identifier">local</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">i</span><span class="special">(</span><span class="number">0</span><span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"1!2!3?"</span><span class="special">);</span>
<span class="comment">// count the exciting digits, but not the
</span><span class="comment">// questionable ones.
</span><span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">i</span> <span class="special">]</span> <span class="special">&gt;&gt;</span> <span class="char">'!'</span> <span class="special">);</span>
<span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rex</span><span class="special">);</span>
<span class="identifier">assert</span><span class="special">(</span> <span class="identifier">i</span><span class="special">.</span><span class="identifier">get</span><span class="special">()</span> <span class="special">==</span> <span class="number">2</span> <span class="special">);</span>
</pre>
<p>
Notice that we use <code class="computeroutput"><span class="identifier">local</span><span class="special">&lt;&gt;::</span><span class="identifier">get</span><span class="special">()</span></code> to access the value of the local variable.
Also, beware that <code class="computeroutput"><span class="identifier">local</span><span class="special">&lt;&gt;</span></code>
can be used to create a dangling reference, just as <code class="computeroutput"><span class="identifier">reference</span><span class="special">&lt;&gt;</span></code> can.
</p>
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.referring_to_non_local_variables"></a><h4>
<a name="id3121134"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.referring_to_non_local_variables">Referring
to Non-Local Variables</a>
</h4>
<p>
At the beginning of this section, we used a regex with a semantic action
to parse a string of word/integer pairs and stuff them into a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special">&lt;&gt;</span></code>. That required that the map and the
regex be defined together and used before either could go out of scope. What
if we wanted to define the regex once and use it to fill lots of different
maps? We would rather pass the map into the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
algorithm than embed a reference to it directly in the regex object.
What we can do instead is define a placeholder and use that in the semantic
action instead of the map itself. Later, when we call one of the regex algorithms,
we can bind the reference to an actual map object. The following code shows
how.
</p>
<pre class="programlisting"><span class="comment">// Define a placeholder for a map object:
</span><span class="identifier">placeholder</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">&gt;</span> <span class="special">&gt;</span> <span class="identifier">_map</span><span class="special">;</span>
<span class="comment">// Match a word and an integer, separated by =&gt;,
</span><span class="comment">// and then stuff the result into a std::map&lt;&gt;
</span><span class="identifier">sregex</span> <span class="identifier">pair</span> <span class="special">=</span> <span class="special">(</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="string">"=&gt;"</span> <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">s2</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span><span class="special">)</span> <span class="special">)</span>
    <span class="special">[</span> <span class="identifier">_map</span><span class="special">[</span><span class="identifier">s1</span><span class="special">]</span> <span class="special">=</span> <span class="identifier">as</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;(</span><span class="identifier">s2</span><span class="special">)</span> <span class="special">];</span>
<span class="comment">// Match one or more word/integer pairs, separated
</span><span class="comment">// by whitespace.
</span><span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="identifier">pair</span> <span class="special">&gt;&gt;</span> <span class="special">*(+</span><span class="identifier">_s</span> <span class="special">&gt;&gt;</span> <span class="identifier">pair</span><span class="special">);</span>
<span class="comment">// The string to parse
</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"aaa=&gt;1 bbb=&gt;23 ccc=&gt;456"</span><span class="special">);</span>
<span class="comment">// Here is the actual map to fill in:
</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">result</span><span class="special">;</span>
<span class="comment">// Bind the _map placeholder to the actual map
</span><span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
<span class="identifier">what</span><span class="special">.</span><span class="identifier">let</span><span class="special">(</span> <span class="identifier">_map</span> <span class="special">=</span> <span class="identifier">result</span> <span class="special">);</span>
<span class="comment">// Execute the match and fill in result map
</span><span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_match</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">))</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"aaa"</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"bbb"</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"ccc"</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
This program displays:
</p>
<pre class="programlisting">1
23
456
</pre>
<p>
We use <code class="computeroutput"><span class="identifier">placeholder</span><span class="special">&lt;&gt;</span></code>
here to define <code class="computeroutput"><span class="identifier">_map</span></code>, which
stands in for a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special">&lt;&gt;</span></code>
variable. We can use the placeholder in the semantic action as if it were
a map. Then, we define a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
struct and bind an actual map to the placeholder with "<code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">let</span><span class="special">(</span> <span class="identifier">_map</span> <span class="special">=</span> <span class="identifier">result</span> <span class="special">);</span></code>". The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
call behaves as if the placeholder in the semantic action had been replaced
with a reference to <code class="computeroutput"><span class="identifier">result</span></code>.
</p>
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
Placeholders in semantic actions are not <span class="emphasis"><em>actually</em></span>
replaced at runtime with references to variables. The regex object is never
mutated in any way during any of the regex algorithms, so regex objects are safe
to use in multiple threads.
</p></td></tr>
</table></div>
<p>
The syntax for late-bound action arguments is a little different if you are
using <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator&lt;&gt;</a></code></code>
or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>.
The regex iterators accept an extra constructor parameter for specifying
the argument bindings. There is a <code class="computeroutput"><span class="identifier">let</span><span class="special">()</span></code> function that you can use to bind variables
to their placeholders. The following code demonstrates how.
</p>
<pre class="programlisting"><span class="comment">// Define a placeholder for a map object:
</span><span class="identifier">placeholder</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">&gt;</span> <span class="special">&gt;</span> <span class="identifier">_map</span><span class="special">;</span>
<span class="comment">// Match a word and an integer, separated by =&gt;,
</span><span class="comment">// and then stuff the result into a std::map&lt;&gt;
</span><span class="identifier">sregex</span> <span class="identifier">pair</span> <span class="special">=</span> <span class="special">(</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="string">"=&gt;"</span> <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">s2</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span><span class="special">)</span> <span class="special">)</span>
<span class="special">[</span> <span class="identifier">_map</span><span class="special">[</span><span class="identifier">s1</span><span class="special">]</span> <span class="special">=</span> <span class="identifier">as</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;(</span><span class="identifier">s2</span><span class="special">)</span> <span class="special">];</span>
<span class="comment">// The string to parse
</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"aaa=&gt;1 bbb=&gt;23 ccc=&gt;456"</span><span class="special">);</span>
<span class="comment">// Here is the actual map to fill in:
</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">result</span><span class="special">;</span>
<span class="comment">// Create a regex_iterator to find all the matches
</span><span class="identifier">sregex_iterator</span> <span class="identifier">it</span><span class="special">(</span><span class="identifier">str</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">pair</span><span class="special">,</span> <span class="identifier">let</span><span class="special">(</span><span class="identifier">_map</span><span class="special">=</span><span class="identifier">result</span><span class="special">));</span>
<span class="identifier">sregex_iterator</span> <span class="identifier">end</span><span class="special">;</span>
<span class="comment">// step through all the matches, and fill in
</span><span class="comment">// the result map
</span><span class="keyword">while</span><span class="special">(</span><span class="identifier">it</span> <span class="special">!=</span> <span class="identifier">end</span><span class="special">)</span>
<span class="special">++</span><span class="identifier">it</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"aaa"</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"bbb"</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"ccc"</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span>
</pre>
<p>
This program displays:
</p>
<pre class="programlisting">1
23
456
</pre>
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.user_defined_assertions"></a><h3>
<a name="id3122803"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.user_defined_assertions">User-Defined
Assertions</a>
</h3>
<p>
You are probably already familiar with regular expression <span class="emphasis"><em>assertions</em></span>.
In Perl, some examples are the <code class="literal">^</code> and <code class="literal">$</code>
assertions, which you can use to match the beginning and end of a string,
respectively. Xpressive lets you define your own assertions. A custom assertion
is a condition which must be true at a point in the match in order for the
match to succeed. You can check a custom assertion with xpressive's <code class="literal"><code class="computeroutput">check()</code></code> function.
</p>
<p>
There are a couple of ways to define a custom assertion. The simplest is
to use a function object. Let's say that you want to ensure that a sub-expression
matches a sub-string that is either 3 or 6 characters long. The following
struct defines such a predicate:
</p>
<pre class="programlisting"><span class="comment">// A predicate that is true IFF a sub-match is
</span><span class="comment">// either 3 or 6 characters long.
</span><span class="keyword">struct</span> <span class="identifier">three_or_six</span>
<span class="special">{</span>
<span class="keyword">bool</span> <span class="keyword">operator</span><span class="special">()(</span><span class="identifier">ssub_match</span> <span class="keyword">const</span> <span class="special">&amp;</span><span class="identifier">sub</span><span class="special">)</span> <span class="keyword">const</span>
<span class="special">{</span>
<span class="keyword">return</span> <span class="identifier">sub</span><span class="special">.</span><span class="identifier">length</span><span class="special">()</span> <span class="special">==</span> <span class="number">3</span> <span class="special">||</span> <span class="identifier">sub</span><span class="special">.</span><span class="identifier">length</span><span class="special">()</span> <span class="special">==</span> <span class="number">6</span><span class="special">;</span>
<span class="special">}</span>
<span class="special">};</span>
</pre>
<p>
You can use this predicate within a regular expression as follows:
</p>
<pre class="programlisting"><span class="comment">// match words of 3 characters or 6 characters.
</span><span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">bow</span> <span class="special">&gt;&gt;</span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">&gt;&gt;</span> <span class="identifier">eow</span><span class="special">)[</span> <span class="identifier">check</span><span class="special">(</span><span class="identifier">three_or_six</span><span class="special">())</span> <span class="special">]</span> <span class="special">;</span>
</pre>
<p>
The above regular expression will find whole words that are either 3 or 6
characters long. The <code class="computeroutput"><span class="identifier">three_or_six</span></code>
predicate accepts a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match&lt;&gt;</a></code></code>
that refers back to the part of the string matched by the sub-expression
to which the custom assertion is attached.
</p>
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
The custom assertion participates in determining whether the match succeeds
or fails. Unlike actions, which execute lazily, custom assertions execute
immediately while the regex engine is searching for a match.
</p></td></tr>
</table></div>
<p>
Custom assertions can also be defined inline using the same syntax as for
semantic actions. Below is the same custom assertion written inline:
</p>
<pre class="programlisting"><span class="comment">// match words of 3 characters or 6 characters.
</span><span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">bow</span> <span class="special">&gt;&gt;</span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">&gt;&gt;</span> <span class="identifier">eow</span><span class="special">)[</span> <span class="identifier">check</span><span class="special">(</span><span class="identifier">length</span><span class="special">(</span><span class="identifier">_</span><span class="special">)==</span><span class="number">3</span> <span class="special">||</span> <span class="identifier">length</span><span class="special">(</span><span class="identifier">_</span><span class="special">)==</span><span class="number">6</span><span class="special">)</span> <span class="special">]</span> <span class="special">;</span>
</pre>
<p>
In the above, <code class="computeroutput"><span class="identifier">length</span><span class="special">()</span></code>
is a lazy function that calls the <code class="computeroutput"><span class="identifier">length</span><span class="special">()</span></code> member function of its argument, and <code class="computeroutput"><span class="identifier">_</span></code> is a placeholder that receives the <code class="computeroutput"><span class="identifier">sub_match</span></code>.
</p>
<p>
Once you get the hang of writing custom assertions inline, they can be very
powerful. For example, you can write a regular expression that only matches
valid dates (for some suitably liberal definition of the term <span class="quote">&#8220;<span class="quote">valid</span>&#8221;</span>).
</p>
<pre class="programlisting"><span class="keyword">int</span> <span class="keyword">const</span> <span class="identifier">days_per_month</span><span class="special">[]</span> <span class="special">=</span>
<span class="special">{</span><span class="number">31</span><span class="special">,</span> <span class="number">29</span><span class="special">,</span> <span class="number">31</span><span class="special">,</span> <span class="number">30</span><span class="special">,</span> <span class="number">31</span><span class="special">,</span> <span class="number">30</span><span class="special">,</span> <span class="number">31</span><span class="special">,</span> <span class="number">31</span><span class="special">,</span> <span class="number">30</span><span class="special">,</span> <span class="number">31</span><span class="special">,</span> <span class="number">30</span><span class="special">,</span> <span class="number">31</span><span class="special">};</span>
<span class="identifier">mark_tag</span> <span class="identifier">month</span><span class="special">(</span><span class="number">1</span><span class="special">),</span> <span class="identifier">day</span><span class="special">(</span><span class="number">2</span><span class="special">);</span>
<span class="comment">// find a valid date of the form month/day/year.
</span><span class="identifier">sregex</span> <span class="identifier">date</span> <span class="special">=</span>
<span class="special">(</span>
<span class="comment">// Month must be between 1 and 12 inclusive
</span> <span class="special">(</span><span class="identifier">month</span><span class="special">=</span> <span class="identifier">_d</span> <span class="special">&gt;&gt;</span> <span class="special">!</span><span class="identifier">_d</span><span class="special">)</span> <span class="special">[</span> <span class="identifier">check</span><span class="special">(</span><span class="identifier">as</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;(</span><span class="identifier">_</span><span class="special">)</span> <span class="special">&gt;=</span> <span class="number">1</span>
<span class="special">&amp;&amp;</span> <span class="identifier">as</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;(</span><span class="identifier">_</span><span class="special">)</span> <span class="special">&lt;=</span> <span class="number">12</span><span class="special">)</span> <span class="special">]</span>
<span class="special">&gt;&gt;</span> <span class="char">'/'</span>
<span class="comment">// Day must be between 1 and 31 inclusive
</span> <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">day</span><span class="special">=</span> <span class="identifier">_d</span> <span class="special">&gt;&gt;</span> <span class="special">!</span><span class="identifier">_d</span><span class="special">)</span> <span class="special">[</span> <span class="identifier">check</span><span class="special">(</span><span class="identifier">as</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;(</span><span class="identifier">_</span><span class="special">)</span> <span class="special">&gt;=</span> <span class="number">1</span>
<span class="special">&amp;&amp;</span> <span class="identifier">as</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;(</span><span class="identifier">_</span><span class="special">)</span> <span class="special">&lt;=</span> <span class="number">31</span><span class="special">)</span> <span class="special">]</span>
<span class="special">&gt;&gt;</span> <span class="char">'/'</span>
<span class="comment">// Only consider years between 1970 and 2038
</span> <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">_d</span> <span class="special">&gt;&gt;</span> <span class="identifier">_d</span> <span class="special">&gt;&gt;</span> <span class="identifier">_d</span> <span class="special">&gt;&gt;</span> <span class="identifier">_d</span><span class="special">)</span> <span class="special">[</span> <span class="identifier">check</span><span class="special">(</span><span class="identifier">as</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;(</span><span class="identifier">_</span><span class="special">)</span> <span class="special">&gt;=</span> <span class="number">1970</span>
<span class="special">&amp;&amp;</span> <span class="identifier">as</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;(</span><span class="identifier">_</span><span class="special">)</span> <span class="special">&lt;=</span> <span class="number">2038</span><span class="special">)</span> <span class="special">]</span>
<span class="special">)</span>
<span class="comment">// Ensure the month actually has that many days!
</span> <span class="special">[</span> <span class="identifier">check</span><span class="special">(</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">days_per_month</span><span class="special">)[</span><span class="identifier">as</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;(</span><span class="identifier">month</span><span class="special">)-</span><span class="number">1</span><span class="special">]</span> <span class="special">&gt;=</span> <span class="identifier">as</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;(</span><span class="identifier">day</span><span class="special">)</span> <span class="special">)</span> <span class="special">]</span>
<span class="special">;</span>
<span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"99/99/9999 2/30/2006 2/28/2006"</span><span class="special">);</span>
<span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">date</span><span class="special">))</span>
<span class="special">{</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
The above program prints out the following:
</p>
<pre class="programlisting">2/28/2006
</pre>
<p>
Notice how the inline custom assertions are used to range-check the values
for the month, day and year. The regular expression doesn't match <code class="computeroutput"><span class="string">"99/99/9999"</span></code> or <code class="computeroutput"><span class="string">"2/30/2006"</span></code>
because they are not valid dates. (There is no 99th month, and February doesn't
have 30 days.)
</p>
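<p>
For comparison, the conditions that the inline assertions encode can be written
as an ordinary function. This is only a sketch of the logic, not something
xpressive provides: the name <code class="computeroutput">is_plausible_date</code> is
hypothetical, and the table assumes 29 days for February and 30 for November.
</p>

```cpp
// Days in each month, January first. February is given 29 days,
// so leap years are not checked (the "liberal" definition of valid).
int const days_in_month[] =
    {31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};

// Hypothetical helper: applies the same range checks as the
// custom assertions in the date regex.
bool is_plausible_date(int month, int day, int year)
{
    // Month must be between 1 and 12 inclusive
    if(month < 1 || month > 12)
        return false;
    // Day must be between 1 and 31 inclusive
    if(day < 1 || day > 31)
        return false;
    // Only consider years between 1970 and 2038
    if(year < 1970 || year > 2038)
        return false;
    // The month must actually have that many days
    return days_in_month[month - 1] >= day;
}
```

<p>
The regex version performs these checks while it searches, so an invalid
candidate such as <code class="computeroutput">"2/30/2006"</code> is skipped and the
search continues with the next position in the string.
</p>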
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.symbol_tables_and_attributes" title="Symbol Tables and Attributes">Symbol
Tables and Attributes</a>
</h3></div></div></div>
<a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.overview"></a><h3>
<a name="id3124450"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.symbol_tables_and_attributes.overview">Overview</a>
</h3>
<p>
Symbol tables can be built into xpressive regular expressions with just a
<code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special">&lt;&gt;</span></code>.
The map keys are the strings to be matched and the map values are the data
to be returned to your semantic action. Xpressive attributes, named <code class="computeroutput"><span class="identifier">a1</span></code>, <code class="computeroutput"><span class="identifier">a2</span></code>,
through <code class="computeroutput"><span class="identifier">a9</span></code>, hold the value
corresponding to a matching key so that it can be used in a semantic action.
A default value can be specified for an attribute if a symbol is not found.
</p>
<a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.symbol_tables"></a><h3>
<a name="id3124533"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.symbol_tables_and_attributes.symbol_tables">Symbol
Tables</a>
</h3>
<p>
An xpressive symbol table is just a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special">&lt;&gt;</span></code>,
where the key is a string type and the value can be anything. For example,
the following regular expression matches a key from map1 and assigns the
corresponding value to the attribute <code class="computeroutput"><span class="identifier">a1</span></code>.
Then, in the semantic action, it assigns the value stored in attribute <code class="computeroutput"><span class="identifier">a1</span></code> to an integer result.
</p>
<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">result</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">map1</span><span class="special">;</span>
<span class="comment">// ... (fill the map)
</span><span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span> <span class="identifier">a1</span> <span class="special">=</span> <span class="identifier">map1</span> <span class="special">)</span> <span class="special">[</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)</span> <span class="special">=</span> <span class="identifier">a1</span> <span class="special">];</span>
</pre>
<p>
Consider the following example code, which translates number names into integers.
It is described below.
</p>
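<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">map</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">string</span><span class="special">&gt;</span>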
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">number_map</span><span class="special">;</span>
<span class="identifier">number_map</span><span class="special">[</span><span class="string">"one"</span><span class="special">]</span> <span class="special">=</span> <span class="number">1</span><span class="special">;</span>
<span class="identifier">number_map</span><span class="special">[</span><span class="string">"two"</span><span class="special">]</span> <span class="special">=</span> <span class="number">2</span><span class="special">;</span>
<span class="identifier">number_map</span><span class="special">[</span><span class="string">"three"</span><span class="special">]</span> <span class="special">=</span> <span class="number">3</span><span class="special">;</span>
<span class="comment">// Match a string from number_map
</span> <span class="comment">// and store the integer value in 'result'
</span> <span class="comment">// if not found, store -1 in 'result'
</span> <span class="keyword">int</span> <span class="identifier">result</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
<span class="identifier">cregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">((</span><span class="identifier">a1</span> <span class="special">=</span> <span class="identifier">number_map</span> <span class="special">)</span> <span class="special">|</span> <span class="special">*</span><span class="identifier">_</span><span class="special">)</span>
<span class="special">[</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">a1</span> <span class="special">|</span> <span class="special">-</span><span class="number">1</span><span class="special">)];</span>
<span class="identifier">regex_match</span><span class="special">(</span><span class="string">"three"</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">result</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span>
<span class="identifier">regex_match</span><span class="special">(</span><span class="string">"two"</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">result</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span>
<span class="identifier">regex_match</span><span class="special">(</span><span class="string">"stuff"</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">result</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span>
<span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
This program prints the following:
</p>
<pre class="programlisting">3
2
-1
</pre>
<p>
First the program builds a number map, with number names as string keys and
the corresponding integers as values. Then it constructs a static regular
expression using an attribute <code class="computeroutput"><span class="identifier">a1</span></code>
to represent the result of the symbol table lookup. In the semantic action,
the attribute is assigned to an integer variable <code class="computeroutput"><span class="identifier">result</span></code>.
If the symbol was not found, a default value of <code class="computeroutput"><span class="special">-</span><span class="number">1</span></code> is assigned to <code class="computeroutput"><span class="identifier">result</span></code>.
A wildcard, <code class="computeroutput"><span class="special">*</span><span class="identifier">_</span></code>,
makes sure the regex matches even if the symbol is not found.
</p>
<p>
A more complete version of this example can be found in <code class="literal">libs/xpressive/example/numbers.cpp</code><sup>[<a name="id3125582" href="#ftn.id3125582" class="footnote">5</a>]</sup>. It translates number names up to "nine hundred ninety nine
million nine hundred ninety nine thousand nine hundred ninety nine"
along with some special number names like "dozen".
</p>
<p>
Symbol table matches are case-sensitive by default, but they can be made
case-insensitive by enclosing the expression in <code class="computeroutput"><span class="identifier">icase</span><span class="special">()</span></code>.
</p>
<a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.attributes"></a><h3>
<a name="id3125623"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.symbol_tables_and_attributes.attributes">Attributes</a>
</h3>
<p>
Up to nine attributes can be used in a regular expression. They are named
<code class="computeroutput"><span class="identifier">a1</span></code>, <code class="computeroutput"><span class="identifier">a2</span></code>,
..., <code class="computeroutput"><span class="identifier">a9</span></code> in the <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span></code> namespace. The attribute type
is the same as the second component of the map that is assigned to it. A
default value for an attribute can be specified in a semantic action with
the syntax <code class="computeroutput"><span class="special">(</span><span class="identifier">a1</span>
<span class="special">|</span> <em class="replaceable"><code>default-value</code></em><span class="special">)</span></code>.
</p>
<p>
Attributes are properly scoped, so you can do crazy things like: <code class="computeroutput"><span class="special">(</span> <span class="special">(</span><span class="identifier">a1</span><span class="special">=</span><span class="identifier">sym1</span><span class="special">)</span>
<span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">a1</span><span class="special">=</span><span class="identifier">sym2</span><span class="special">)[</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">x</span><span class="special">)=</span><span class="identifier">a1</span><span class="special">]</span> <span class="special">)[</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">y</span><span class="special">)=</span><span class="identifier">a1</span><span class="special">]</span></code>. The
inner semantic action sees the inner <code class="computeroutput"><span class="identifier">a1</span></code>,
and the outer semantic action sees the outer one. They can even have different
types.
</p>
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
Xpressive builds a hidden ternary search trie from the map so it can search
quickly. If BOOST_DISABLE_THREADS is defined, the hidden ternary search
trie "self adjusts", so after each search it restructures itself
to improve the efficiency of future searches based on the frequency of
previous searches.
</p></td></tr>
</table></div>
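<p>
To give a feel for the data structure mentioned in the note, here is a minimal
sketch of a plain (not self-adjusting) ternary search trie built from a
<code class="computeroutput">std::map&lt;&gt;</code>. It is an illustration under stated
assumptions, not xpressive's internal implementation: the names
<code class="computeroutput">ternary_trie</code>, <code class="computeroutput">build_trie</code>
and <code class="computeroutput">longest_match</code> are hypothetical.
</p>

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical illustration of a ternary search trie that maps
// string keys to int values. Each node holds one character and
// three children: lower, equal (next character) and higher.
class ternary_trie
{
    struct node
    {
        char ch;
        bool has_value;
        int value;
        std::unique_ptr<node> lo, eq, hi;
        explicit node(char c) : ch(c), has_value(false), value(0) {}
    };

    std::unique_ptr<node> root_;

public:
    // Insert a non-empty key; empty keys are ignored.
    void insert(std::string const &key, int value)
    {
        if(key.empty())
            return;
        std::unique_ptr<node> *p = &root_;
        std::size_t i = 0;
        for(;;)
        {
            if(!*p)
                p->reset(new node(key[i]));
            node &n = **p;
            if(key[i] < n.ch)
                p = &n.lo;
            else if(key[i] > n.ch)
                p = &n.hi;
            else if(i + 1 == key.size())
            {
                n.has_value = true;
                n.value = value;
                return;
            }
            else
            {
                ++i;
                p = &n.eq;
            }
        }
    }

    // Returns the length of the longest key that is a prefix of
    // 'text' and stores its value in 'value'. Returns 0 and leaves
    // 'value' untouched if no key matches.
    std::size_t longest_match(std::string const &text, int &value) const
    {
        std::size_t best = 0, i = 0;
        node const *n = root_.get();
        while(n && i < text.size())
        {
            if(text[i] < n->ch)
                n = n->lo.get();
            else if(text[i] > n->ch)
                n = n->hi.get();
            else
            {
                ++i;
                if(n->has_value)
                {
                    best = i;
                    value = n->value;
                }
                n = n->eq.get();
            }
        }
        return best;
    }
};

// Build a trie from every key/value pair in the map.
ternary_trie build_trie(std::map<std::string, int> const &symbols)
{
    ternary_trie trie;
    for(std::map<std::string, int>::const_iterator it = symbols.begin();
        it != symbols.end(); ++it)
    {
        trie.insert(it->first, it->second);
    }
    return trie;
}
```

<p>
A lookup visits one node per comparison and never compares whole strings.
This sketch does not rebalance itself after searches.
</p>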
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.localization_and_regex_traits"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits" title="Localization and Regex Traits">Localization
and Regex Traits</a>
</h3></div></div></div>
<a name="boost_xpressive.user_s_guide.localization_and_regex_traits.overview"></a><h3>
<a name="id3125887"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits.overview">Overview</a>
</h3>
<p>
Matching a regular expression against a string often requires locale-dependent
information. For example, how are case-insensitive comparisons performed?
The locale-sensitive behavior is captured in a traits class. Xpressive provides
three traits class templates: <code class="computeroutput"><span class="identifier">cpp_regex_traits</span><span class="special">&lt;&gt;</span></code>, <code class="computeroutput"><span class="identifier">c_regex_traits</span><span class="special">&lt;&gt;</span></code> and <code class="computeroutput"><span class="identifier">null_regex_traits</span><span class="special">&lt;&gt;</span></code>. The first wraps a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>,
the second wraps the global C locale, and the third is a stub traits type
for use when searching non-character data. All traits templates conform to
the <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.traits_requirements">Regex
Traits Concept</a>.
</p>
<a name="boost_xpressive.user_s_guide.localization_and_regex_traits.setting_the_default_regex_trait"></a><h3>
<a name="id3125990"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits.setting_the_default_regex_trait">Setting
the Default Regex Trait</a>
</h3>
<p>
By default, xpressive uses <code class="computeroutput"><span class="identifier">cpp_regex_traits</span><span class="special">&lt;&gt;</span></code> for all patterns. This causes all
regex objects to use the global <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>.
If you compile with <code class="computeroutput"><span class="identifier">BOOST_XPRESSIVE_USE_C_TRAITS</span></code>
defined, then xpressive will use <code class="computeroutput"><span class="identifier">c_regex_traits</span><span class="special">&lt;&gt;</span></code> by default.
</p>
<a name="boost_xpressive.user_s_guide.localization_and_regex_traits.using_custom_traits_with_dynamic_regexes"></a><h3>
<a name="id3126077"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits.using_custom_traits_with_dynamic_regexes">Using
Custom Traits with Dynamic Regexes</a>
</h3>
<p>
To create a dynamic regex that uses a custom traits object, you must use
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler&lt;&gt;</a></code></code>.
The basic steps are shown in the following example:
</p>
<pre class="programlisting"><span class="comment">// Declare a regex_compiler that uses the global C locale
</span><span class="identifier">regex_compiler</span><span class="special">&lt;</span><span class="keyword">char</span> <span class="keyword">const</span> <span class="special">*,</span> <span class="identifier">c_regex_traits</span><span class="special">&lt;</span><span class="keyword">char</span><span class="special">&gt;</span> <span class="special">&gt;</span> <span class="identifier">crxcomp</span><span class="special">;</span>
<span class="identifier">cregex</span> <span class="identifier">crx</span> <span class="special">=</span> <span class="identifier">crxcomp</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"\\w+"</span> <span class="special">);</span>
<span class="comment">// Declare a regex_compiler that uses a custom std::locale
</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span> <span class="identifier">loc</span> <span class="special">=</span> <span class="comment">/* ... create a locale here ... */</span><span class="special">;</span>
<span class="identifier">regex_compiler</span><span class="special">&lt;</span><span class="keyword">char</span> <span class="keyword">const</span> <span class="special">*,</span> <span class="identifier">cpp_regex_traits</span><span class="special">&lt;</span><span class="keyword">char</span><span class="special">&gt;</span> <span class="special">&gt;</span> <span class="identifier">cpprxcomp</span><span class="special">(</span><span class="identifier">loc</span><span class="special">);</span>
<span class="identifier">cregex</span> <span class="identifier">cpprx</span> <span class="special">=</span> <span class="identifier">cpprxcomp</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"\\w+"</span> <span class="special">);</span>
</pre>
<p>
The <code class="computeroutput"><span class="identifier">regex_compiler</span></code> objects
act as regex factories. Once they have been imbued with a locale, every regex
object they create will use that locale.
</p>
<a name="boost_xpressive.user_s_guide.localization_and_regex_traits.using_custom_traits_with_static_regexes"></a><h3>
<a name="id3126409"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits.using_custom_traits_with_static_regexes">Using
Custom Traits with Static Regexes</a>
</h3>
<p>
If you want a particular static regex to use a different set of traits, you
can use the special <code class="computeroutput"><span class="identifier">imbue</span><span class="special">()</span></code> pattern modifier. For instance:
</p>
<pre class="programlisting"><span class="comment">// Define a regex that uses the global C locale
</span><span class="identifier">c_regex_traits</span><span class="special">&lt;</span><span class="keyword">char</span><span class="special">&gt;</span> <span class="identifier">ctraits</span><span class="special">;</span>
<span class="identifier">sregex</span> <span class="identifier">crx</span> <span class="special">=</span> <span class="identifier">imbue</span><span class="special">(</span><span class="identifier">ctraits</span><span class="special">)(</span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">);</span>
<span class="comment">// Define a regex that uses a customized std::locale
</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span> <span class="identifier">loc</span> <span class="special">=</span> <span class="comment">/* ... create a locale here ... */</span><span class="special">;</span>
<span class="identifier">cpp_regex_traits</span><span class="special">&lt;</span><span class="keyword">char</span><span class="special">&gt;</span> <span class="identifier">cpptraits</span><span class="special">(</span><span class="identifier">loc</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">cpprx1</span> <span class="special">=</span> <span class="identifier">imbue</span><span class="special">(</span><span class="identifier">cpptraits</span><span class="special">)(</span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">);</span>
<span class="comment">// A shorthand for above
</span><span class="identifier">sregex</span> <span class="identifier">cpprx2</span> <span class="special">=</span> <span class="identifier">imbue</span><span class="special">(</span><span class="identifier">loc</span><span class="special">)(</span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">);</span>
</pre>
<p>
The <code class="computeroutput"><span class="identifier">imbue</span><span class="special">()</span></code>
pattern modifier must wrap the entire pattern. It is an error to <code class="computeroutput"><span class="identifier">imbue</span></code> only part of a static regex. For
example:
</p>
<pre class="programlisting"><span class="comment">// ERROR! Cannot imbue() only part of a regex
</span><span class="identifier">sregex</span> <span class="identifier">error</span> <span class="special">=</span> <span class="identifier">_w</span> <span class="special">&gt;&gt;</span> <span class="identifier">imbue</span><span class="special">(</span><span class="identifier">loc</span><span class="special">)(</span> <span class="identifier">_w</span> <span class="special">);</span>
</pre>
<a name="boost_xpressive.user_s_guide.localization_and_regex_traits.searching_non_character_data_with__literal_null_regex_traits__literal_"></a><h3>
<a name="id3126821"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits.searching_non_character_data_with__literal_null_regex_traits__literal_">Searching
Non-Character Data With <code class="literal">null_regex_traits</code></a>
</h3>
<p>
With xpressive static regexes, you are not limited to searching for patterns
in character sequences. You can search for patterns in raw bytes, integers,
or anything that conforms to the <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.chart_requirements">Char
Concept</a>. The <code class="computeroutput"><span class="identifier">null_regex_traits</span><span class="special">&lt;&gt;</span></code> makes it simple. It is a stub implementation
of the <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.traits_requirements">Regex
Traits Concept</a>. It recognizes no character classes and performs no case
conversions.
</p>
<p>
For example, with <code class="computeroutput"><span class="identifier">null_regex_traits</span><span class="special">&lt;&gt;</span></code>, you can write a static regex to
find a pattern in a sequence of integers as follows:
</p>
<pre class="programlisting"><span class="comment">// some integral data to search
</span><span class="keyword">int</span> <span class="keyword">const</span> <span class="identifier">data</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span><span class="number">0</span><span class="special">,</span> <span class="number">1</span><span class="special">,</span> <span class="number">2</span><span class="special">,</span> <span class="number">3</span><span class="special">,</span> <span class="number">4</span><span class="special">,</span> <span class="number">5</span><span class="special">,</span> <span class="number">6</span><span class="special">};</span>
<span class="comment">// create a null_regex_traits&lt;&gt; object for searching integers ...
</span><span class="identifier">null_regex_traits</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">nul</span><span class="special">;</span>
<span class="comment">// imbue a regex object with the null_regex_traits ...
</span><span class="identifier">basic_regex</span><span class="special">&lt;</span><span class="keyword">int</span> <span class="keyword">const</span> <span class="special">*&gt;</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="identifier">imbue</span><span class="special">(</span><span class="identifier">nul</span><span class="special">)(</span><span class="number">1</span> <span class="special">&gt;&gt;</span> <span class="special">+((</span><span class="identifier">set</span><span class="special">=</span> <span class="number">2</span><span class="special">,</span><span class="number">3</span><span class="special">)</span> <span class="special">|</span> <span class="number">4</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="number">5</span><span class="special">);</span>
<span class="identifier">match_results</span><span class="special">&lt;</span><span class="keyword">int</span> <span class="keyword">const</span> <span class="special">*&gt;</span> <span class="identifier">what</span><span class="special">;</span>
<span class="comment">// search for the pattern in the array of integers ...
</span><span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">data</span><span class="special">,</span> <span class="identifier">data</span> <span class="special">+</span> <span class="number">7</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rex</span><span class="special">);</span>
<span class="identifier">assert</span><span class="special">(</span><span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">].</span><span class="identifier">matched</span><span class="special">);</span>
<span class="identifier">assert</span><span class="special">(*</span><span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">].</span><span class="identifier">first</span> <span class="special">==</span> <span class="number">1</span><span class="special">);</span>
<span class="identifier">assert</span><span class="special">(*</span><span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">].</span><span class="identifier">second</span> <span class="special">==</span> <span class="number">6</span><span class="special">);</span>
</pre>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.tips_n_tricks"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks" title="Tips 'N Tricks">Tips 'N Tricks</a>
</h3></div></div></div>
<p>
Squeeze the most performance out of xpressive with these tips and tricks.
</p>
<a name="boost_xpressive.user_s_guide.tips_n_tricks.compile_patterns_once_and_reuse_them"></a><h3>
<a name="id3127440"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.compile_patterns_once_and_reuse_them">Compile
Patterns Once And Reuse Them</a>
</h3>
<p>
Compiling a regex (dynamic or static) is <span class="emphasis"><em>far</em></span> more expensive
than executing a match or search. If you have the option, prefer to compile
a pattern into a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>
object once and reuse it rather than recreating it over and over.
</p>
<p>
Since <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>
objects are not mutated by any of the regex algorithms, they are completely
thread-safe once their initialization (and that of any grammars of which
they are members) completes. The easiest way to reuse your patterns is to
simply make your <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>
objects "static const".
</p>
<a name="boost_xpressive.user_s_guide.tips_n_tricks.reuse__literal__classname_alt__boost__xpressive__match_results__match_results_lt__gt___classname___literal__objects"></a><h3>
<a name="id3127530"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.reuse__literal__classname_alt__boost__xpressive__match_results__match_results_lt__gt___classname___literal__objects">Reuse
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
Objects</a>
</h3>
<p>
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
object caches dynamically allocated memory. For this reason, it is far better
to reuse the same <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
object if you have to do many regex searches.
</p>
<p>
Caveat: <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
objects are not thread-safe, so do not share one across threads; give each thread its own.
</p>
<a name="boost_xpressive.user_s_guide.tips_n_tricks.prefer_algorithms_that_take_a__literal__classname_alt__boost__xpressive__match_results__match_results_lt__gt___classname___literal__object"></a><h3>
<a name="id3127628"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.prefer_algorithms_that_take_a__literal__classname_alt__boost__xpressive__match_results__match_results_lt__gt___classname___literal__object">Prefer
Algorithms That Take A <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
Object</a>
</h3>
<p>
This is a corollary to the previous tip. If you are doing multiple searches,
you should prefer the regex algorithms that accept a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
object over the ones that don't, and you should reuse the same <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
object each time. If you don't provide a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>
object, a temporary one will be created for you and discarded when the algorithm
returns. Any memory cached in the object will be deallocated and will have
to be reallocated the next time.
</p>
<a name="boost_xpressive.user_s_guide.tips_n_tricks.prefer_algorithms_that_accept_iterator_ranges_over_null_terminated_strings"></a><h3>
<a name="id3127724"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.prefer_algorithms_that_accept_iterator_ranges_over_null_terminated_strings">Prefer
Algorithms That Accept Iterator Ranges Over Null-Terminated Strings</a>
</h3>
<p>
xpressive provides overloads of the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
and <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
algorithms that operate on C-style null-terminated strings. You should prefer
the overloads that take iterator ranges. When you pass a null-terminated
string to a regex algorithm, the end iterator is calculated immediately by
calling <code class="computeroutput"><span class="identifier">strlen</span></code>. If you already
know the length of the string, you can avoid this overhead by calling the
regex algorithms with a <code class="computeroutput"><span class="special">[</span><span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span><span class="special">)</span></code>
pair.
</p>
<a name="boost_xpressive.user_s_guide.tips_n_tricks.use_static_regexes"></a><h3>
<a name="id3127821"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.use_static_regexes">Use
Static Regexes</a>
</h3>
<p>
On average, static regexes execute about 10 to 15% faster than their dynamic
counterparts. It's worth familiarizing yourself with the static regex dialect.
</p>
<a name="boost_xpressive.user_s_guide.tips_n_tricks.understand__literal_syntax_option_type__optimize__literal_"></a><h3>
<a name="id3127854"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.understand__literal_syntax_option_type__optimize__literal_">Understand
<code class="literal">syntax_option_type::optimize</code></a>
</h3>
<p>
The <code class="computeroutput"><span class="identifier">optimize</span></code> flag tells the
regex compiler to spend some extra time analyzing the pattern. It can cause
some patterns to execute faster, but it increases the time to compile the
pattern, and often increases the amount of memory consumed by the pattern.
If you plan to reuse your pattern, <code class="computeroutput"><span class="identifier">optimize</span></code>
is usually a win. If you will only use the pattern once, don't use <code class="computeroutput"><span class="identifier">optimize</span></code>.
</p>
<a name="boost_xpressive.user_s_guide.tips_n_tricks.common_pitfalls"></a><h2>
<a name="id3127861"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.common_pitfalls">Common
Pitfalls</a>
</h2>
<p>
Keep the following tips in mind to avoid stepping in potholes with xpressive.
</p>
<a name="boost_xpressive.user_s_guide.tips_n_tricks.create_grammars_on_a_single_thread"></a><h3>
<a name="id3127946"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.create_grammars_on_a_single_thread">Create
Grammars On A Single Thread</a>
</h3>
<p>
With static regexes, you can create grammars by nesting regexes inside one
another. When compiling the outer regex, both the outer and inner regex objects,
and all the regex objects to which they refer either directly or indirectly,
are modified. For this reason, it's dangerous for global regex objects to
participate in grammars. It's best to build regex grammars from a single
thread. Once built, the resulting regex grammar can be executed from multiple
threads without problems.
</p>
<a name="boost_xpressive.user_s_guide.tips_n_tricks.beware_nested_quantifiers"></a><h3>
<a name="id3127972"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.beware_nested_quantifiers">Beware
Nested Quantifiers</a>
</h3>
<p>
This is a pitfall common to many regular expression engines. Some patterns
can cause exponentially bad performance. Often these patterns involve one
quantified term nested within another quantifier, such as <code class="computeroutput"><span class="string">"(a*)*"</span></code>, although in many cases,
the problem is harder to spot. Beware of patterns that have nested quantifiers.
</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.concepts"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts" title="Concepts">Concepts</a>
</h3></div></div></div>
<a name="boost_xpressive.user_s_guide.concepts.chart_requirements"></a><h3>
<a name="id3128300"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.chart_requirements">CharT
requirements</a>
</h3>
<p>
If type <code class="computeroutput"><span class="identifier">BidiIterT</span></code> is used
as a template argument to <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>,
then <code class="computeroutput"><span class="identifier">CharT</span></code> is <code class="computeroutput"><span class="identifier">iterator_traits</span><span class="special">&lt;</span><span class="identifier">BidiIterT</span><span class="special">&gt;::</span><span class="identifier">value_type</span></code>. Type <code class="computeroutput"><span class="identifier">CharT</span></code>
must have a trivial default constructor, copy constructor, assignment operator,
and destructor. In addition, the following requirements must be met for objects
<code class="computeroutput"><span class="identifier">c</span></code> of type <code class="computeroutput"><span class="identifier">CharT</span></code>,
<code class="computeroutput"><span class="identifier">c1</span></code> and <code class="computeroutput"><span class="identifier">c2</span></code>
of type <code class="computeroutput"><span class="identifier">CharT</span> <span class="keyword">const</span></code>,
and <code class="computeroutput"><span class="identifier">i</span></code> of type <code class="computeroutput"><span class="keyword">int</span></code>:
</p>
<div class="table">
<a name="id3128457"></a><p class="title"><b>Table&#160;29.14.&#160;CharT Requirements</b></p>
<div class="table-contents"><table class="table" summary="CharT Requirements">
<colgroup>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
<span class="bold"><strong>Expression</strong></span>
</p>
</th>
<th>
<p>
<span class="bold"><strong>Return type</strong></span>
</p>
</th>
<th>
<p>
<span class="bold"><strong>Assertion / Note / Pre- / Post-condition</strong></span>
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">CharT</span> <span class="identifier">c</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">CharT</span></code>
</p>
</td>
<td>
<p>
Default constructor (must be trivial).
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">CharT</span> <span class="identifier">c</span><span class="special">(</span><span class="identifier">c1</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">CharT</span></code>
</p>
</td>
<td>
<p>
Copy constructor (must be trivial).
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">c</span> <span class="special">=</span>
<span class="identifier">c1</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">CharT</span></code>
</p>
</td>
<td>
<p>
Assignment operator (must be trivial).
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">c1</span> <span class="special">==</span>
<span class="identifier">c2</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="keyword">bool</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="keyword">true</span></code> if <code class="computeroutput"><span class="identifier">c1</span></code> has the same value as <code class="computeroutput"><span class="identifier">c2</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">c1</span> <span class="special">!=</span>
<span class="identifier">c2</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="keyword">bool</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="keyword">true</span></code> if <code class="computeroutput"><span class="identifier">c1</span></code> and <code class="computeroutput"><span class="identifier">c2</span></code>
are not equal.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">c1</span> <span class="special">&lt;</span>
<span class="identifier">c2</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="keyword">bool</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="keyword">true</span></code> if the value
of <code class="computeroutput"><span class="identifier">c1</span></code> is less than
<code class="computeroutput"><span class="identifier">c2</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">c1</span> <span class="special">&gt;</span>
<span class="identifier">c2</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="keyword">bool</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="keyword">true</span></code> if the value
of <code class="computeroutput"><span class="identifier">c1</span></code> is greater
than <code class="computeroutput"><span class="identifier">c2</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">c1</span> <span class="special">&lt;=</span>
<span class="identifier">c2</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="keyword">bool</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="keyword">true</span></code> if <code class="computeroutput"><span class="identifier">c1</span></code> is less than or equal to
<code class="computeroutput"><span class="identifier">c2</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">c1</span> <span class="special">&gt;=</span>
<span class="identifier">c2</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="keyword">bool</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="keyword">true</span></code> if <code class="computeroutput"><span class="identifier">c1</span></code> is greater than or equal to
<code class="computeroutput"><span class="identifier">c2</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">intmax_t</span> <span class="identifier">i</span>
<span class="special">=</span> <span class="identifier">c1</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="keyword">int</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">CharT</span></code> must be convertible
to an integral type.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">CharT</span> <span class="identifier">c</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">CharT</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">CharT</span></code> must be constructible
from an integral type.
</p>
</td>
</tr>
</tbody>
</table></div>
</div>
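<p>
The requirements in the table above can be checked mechanically for a candidate type.
The sketch below assumes a C++11 compiler and a hypothetical element type named
<code class="computeroutput">Sample</code>. It verifies only the expressions listed in the
table; it does not demonstrate that any particular xpressive version has been tested
with this type.
</p>

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical element type used only to illustrate the table above.
struct Sample
{
    int value;

    Sample() = default;                                 // CharT c (trivial)
    explicit Sample(int i) : value(i) {}                // CharT c(i)
    operator std::intmax_t() const { return value; }    // intmax_t i = c1

    friend bool operator==(Sample a, Sample b) { return a.value == b.value; }
    friend bool operator!=(Sample a, Sample b) { return a.value != b.value; }
    friend bool operator< (Sample a, Sample b) { return a.value <  b.value; }
    friend bool operator> (Sample a, Sample b) { return a.value >  b.value; }
    friend bool operator<=(Sample a, Sample b) { return a.value <= b.value; }
    friend bool operator>=(Sample a, Sample b) { return a.value >= b.value; }
};

// The special member functions must all be trivial.
static_assert(std::is_trivially_default_constructible<Sample>::value,
              "default constructor must be trivial");
static_assert(std::is_trivially_copy_constructible<Sample>::value,
              "copy constructor must be trivial");
static_assert(std::is_trivially_copy_assignable<Sample>::value,
              "assignment operator must be trivial");
static_assert(std::is_trivially_destructible<Sample>::value,
              "destructor must be trivial");
```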
<br class="table-break"><a name="boost_xpressive.user_s_guide.concepts.traits_requirements"></a><h3>
<a name="id3129286"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.traits_requirements">Traits
Requirements</a>
</h3>
<p>
In the following table <code class="computeroutput"><span class="identifier">X</span></code>
denotes a traits class defining types and functions for the character container
type <code class="computeroutput"><span class="identifier">CharT</span></code>; <code class="computeroutput"><span class="identifier">u</span></code> is an object of type <code class="computeroutput"><span class="identifier">X</span></code>;
<code class="computeroutput"><span class="identifier">v</span></code> is an object of type <code class="computeroutput"><span class="keyword">const</span> <span class="identifier">X</span></code>;
<code class="computeroutput"><span class="identifier">p</span></code> is a value of type <code class="computeroutput"><span class="keyword">const</span> <span class="identifier">CharT</span><span class="special">*</span></code>; <code class="computeroutput"><span class="identifier">I1</span></code>
and <code class="computeroutput"><span class="identifier">I2</span></code> are <code class="computeroutput"><span class="identifier">Input</span> <span class="identifier">Iterators</span></code>;
<code class="computeroutput"><span class="identifier">c</span></code> is a value of type <code class="computeroutput"><span class="keyword">const</span> <span class="identifier">CharT</span></code>;
<code class="computeroutput"><span class="identifier">s</span></code> is an object of type <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>;
<code class="computeroutput"><span class="identifier">cs</span></code> is an object of type
<code class="computeroutput"><span class="keyword">const</span> <span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>;
<code class="computeroutput"><span class="identifier">b</span></code> is a value of type <code class="computeroutput"><span class="keyword">bool</span></code>; <code class="computeroutput"><span class="identifier">i</span></code>
is a value of type <code class="computeroutput"><span class="keyword">int</span></code>; <code class="computeroutput"><span class="identifier">F1</span></code> and <code class="computeroutput"><span class="identifier">F2</span></code>
are values of type <code class="computeroutput"><span class="keyword">const</span> <span class="identifier">CharT</span><span class="special">*</span></code>; <code class="computeroutput"><span class="identifier">loc</span></code>
is an object of type <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">locale_type</span></code>; and <code class="computeroutput"><span class="identifier">ch</span></code>
is an object of type <code class="computeroutput"><span class="keyword">const</span> <span class="keyword">char</span></code>.
</p>
<div class="table">
<a name="id3129631"></a><p class="title"><b>Table&#160;29.15.&#160;Traits Requirements</b></p>
<div class="table-contents"><table class="table" summary="Traits Requirements">
<colgroup>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
<span class="bold"><strong>Expression</strong></span>
</p>
</th>
<th>
<p>
<span class="bold"><strong>Return type</strong></span>
</p>
</th>
<th>
<p>
<span class="bold"><strong>Assertion / Note<br> Pre / Post condition</strong></span>
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">char_type</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">CharT</span></code>
</p>
</td>
<td>
<p>
The character container type used in the implementation of class
template <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special">&lt;</span><span class="identifier">CharT</span><span class="special">&gt;</span></code>
or <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="identifier">CharT</span><span class="special">&gt;</span></code>
</p>
</td>
<td>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">locale_type</span></code>
</p>
</td>
<td>
<p>
<span class="emphasis"><em>Implementation defined</em></span>
</p>
</td>
<td>
<p>
A copy constructible type that represents the locale used by the
traits class.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">char_class_type</span></code>
</p>
</td>
<td>
<p>
<span class="emphasis"><em>Implementation defined</em></span>
</p>
</td>
<td>
<p>
A bitmask type representing a particular character classification.
Multiple values of this type can be bitwise-or'ed together to obtain
a new valid value.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">hash</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="keyword">unsigned</span> <span class="keyword">char</span></code>
</p>
</td>
<td>
<p>
Yields a value between <code class="computeroutput"><span class="number">0</span></code>
and <code class="computeroutput"><span class="identifier">UCHAR_MAX</span></code> inclusive.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">widen</span><span class="special">(</span><span class="identifier">ch</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">CharT</span></code>
</p>
</td>
<td>
<p>
Widens the specified <code class="computeroutput"><span class="keyword">char</span></code>
and returns the resulting <code class="computeroutput"><span class="identifier">CharT</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">in_range</span><span class="special">(</span><span class="identifier">r1</span><span class="special">,</span>
<span class="identifier">r2</span><span class="special">,</span>
<span class="identifier">c</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="keyword">bool</span></code>
</p>
</td>
<td>
<p>
For any characters <code class="computeroutput"><span class="identifier">r1</span></code>
and <code class="computeroutput"><span class="identifier">r2</span></code>, returns
<code class="computeroutput"><span class="keyword">true</span></code> if <code class="computeroutput"><span class="identifier">r1</span> <span class="special">&lt;=</span>
<span class="identifier">c</span> <span class="special">&amp;&amp;</span>
<span class="identifier">c</span> <span class="special">&lt;=</span>
<span class="identifier">r2</span></code>. Requires that <code class="computeroutput"><span class="identifier">r1</span> <span class="special">&lt;=</span>
<span class="identifier">r2</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">in_range_nocase</span><span class="special">(</span><span class="identifier">r1</span><span class="special">,</span>
<span class="identifier">r2</span><span class="special">,</span>
<span class="identifier">c</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="keyword">bool</span></code>
</p>
</td>
<td>
<p>
For characters <code class="computeroutput"><span class="identifier">r1</span></code>
and <code class="computeroutput"><span class="identifier">r2</span></code>, returns
<code class="computeroutput"><span class="keyword">true</span></code> if there is some
character <code class="computeroutput"><span class="identifier">d</span></code> for
which <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">translate_nocase</span><span class="special">(</span><span class="identifier">d</span><span class="special">)</span>
<span class="special">==</span> <span class="identifier">v</span><span class="special">.</span><span class="identifier">translate_nocase</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span></code> and <code class="computeroutput"><span class="identifier">r1</span>
<span class="special">&lt;=</span> <span class="identifier">d</span>
<span class="special">&amp;&amp;</span> <span class="identifier">d</span>
<span class="special">&lt;=</span> <span class="identifier">r2</span></code>.
Requires that <code class="computeroutput"><span class="identifier">r1</span> <span class="special">&lt;=</span> <span class="identifier">r2</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">translate</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">char_type</span></code>
</p>
</td>
<td>
<p>
Returns a character such that, for any character <code class="computeroutput"><span class="identifier">d</span></code>
that is to be considered equivalent to <code class="computeroutput"><span class="identifier">c</span></code>,
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">translate</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span>
<span class="special">==</span> <span class="identifier">v</span><span class="special">.</span><span class="identifier">translate</span><span class="special">(</span><span class="identifier">d</span><span class="special">)</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">translate_nocase</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">char_type</span></code>
</p>
</td>
<td>
<p>
Returns a character such that, for every character <code class="computeroutput"><span class="identifier">C</span></code>
that is to be considered equivalent to <code class="computeroutput"><span class="identifier">c</span></code>
when comparisons are performed without regard to case,
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">translate_nocase</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span>
<span class="special">==</span> <span class="identifier">v</span><span class="special">.</span><span class="identifier">translate_nocase</span><span class="special">(</span><span class="identifier">C</span><span class="special">)</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">transform</span><span class="special">(</span><span class="identifier">F1</span><span class="special">,</span>
<span class="identifier">F2</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>
</p>
</td>
<td>
<p>
Returns a sort key for the character sequence designated by the
iterator range <code class="computeroutput"><span class="special">[</span><span class="identifier">F1</span><span class="special">,</span> <span class="identifier">F2</span><span class="special">)</span></code> such that if the character sequence
<code class="computeroutput"><span class="special">[</span><span class="identifier">G1</span><span class="special">,</span> <span class="identifier">G2</span><span class="special">)</span></code> sorts before the character sequence
<code class="computeroutput"><span class="special">[</span><span class="identifier">H1</span><span class="special">,</span> <span class="identifier">H2</span><span class="special">)</span></code> then <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">transform</span><span class="special">(</span><span class="identifier">G1</span><span class="special">,</span> <span class="identifier">G2</span><span class="special">)</span> <span class="special">&lt;</span>
<span class="identifier">v</span><span class="special">.</span><span class="identifier">transform</span><span class="special">(</span><span class="identifier">H1</span><span class="special">,</span>
<span class="identifier">H2</span><span class="special">)</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">transform_primary</span><span class="special">(</span><span class="identifier">F1</span><span class="special">,</span>
<span class="identifier">F2</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>
</p>
</td>
<td>
<p>
Returns a sort key for the character sequence designated by the
iterator range <code class="computeroutput"><span class="special">[</span><span class="identifier">F1</span><span class="special">,</span> <span class="identifier">F2</span><span class="special">)</span></code> such that if the character sequence
<code class="computeroutput"><span class="special">[</span><span class="identifier">G1</span><span class="special">,</span> <span class="identifier">G2</span><span class="special">)</span></code> sorts before the character sequence
<code class="computeroutput"><span class="special">[</span><span class="identifier">H1</span><span class="special">,</span> <span class="identifier">H2</span><span class="special">)</span></code> when character case is not considered
then <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">transform_primary</span><span class="special">(</span><span class="identifier">G1</span><span class="special">,</span>
<span class="identifier">G2</span><span class="special">)</span>
<span class="special">&lt;</span> <span class="identifier">v</span><span class="special">.</span><span class="identifier">transform_primary</span><span class="special">(</span><span class="identifier">H1</span><span class="special">,</span> <span class="identifier">H2</span><span class="special">)</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">lookup_classname</span><span class="special">(</span><span class="identifier">F1</span><span class="special">,</span>
<span class="identifier">F2</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">char_class_type</span></code>
</p>
</td>
<td>
<p>
Converts the character sequence designated by the iterator range
<code class="computeroutput"><span class="special">[</span><span class="identifier">F1</span><span class="special">,</span><span class="identifier">F2</span><span class="special">)</span></code> into a bitmask value that can subsequently
be passed to <code class="computeroutput"><span class="identifier">isctype</span></code>.
Values returned from <code class="computeroutput"><span class="identifier">lookup_classname</span></code>
can safely be bitwise-or'ed together. Returns <code class="computeroutput"><span class="number">0</span></code>
if the character sequence is not the name of a character class
recognized by <code class="computeroutput"><span class="identifier">X</span></code>.
The value returned shall be independent of the case of the characters
in the sequence.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">lookup_collatename</span><span class="special">(</span><span class="identifier">F1</span><span class="special">,</span>
<span class="identifier">F2</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>
</p>
</td>
<td>
<p>
Returns a sequence of characters that represents the collating
element consisting of the character sequence designated by the
iterator range <code class="computeroutput"><span class="special">[</span><span class="identifier">F1</span><span class="special">,</span> <span class="identifier">F2</span><span class="special">)</span></code>. Returns an empty string if the
character sequence is not a valid collating element.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">isctype</span><span class="special">(</span><span class="identifier">c</span><span class="special">,</span>
<span class="identifier">v</span><span class="special">.</span><span class="identifier">lookup_classname</span><span class="special">(</span><span class="identifier">F1</span><span class="special">,</span>
<span class="identifier">F2</span><span class="special">))</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="keyword">bool</span></code>
</p>
</td>
<td>
<p>
Returns <code class="computeroutput"><span class="keyword">true</span></code> if character
<code class="computeroutput"><span class="identifier">c</span></code> is a member of
the character class designated by the iterator range <code class="computeroutput"><span class="special">[</span><span class="identifier">F1</span><span class="special">,</span> <span class="identifier">F2</span><span class="special">)</span></code>, <code class="computeroutput"><span class="keyword">false</span></code>
otherwise.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">value</span><span class="special">(</span><span class="identifier">c</span><span class="special">,</span>
<span class="identifier">i</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="keyword">int</span></code>
</p>
</td>
<td>
<p>
Returns the value represented by the digit <code class="computeroutput"><span class="identifier">c</span></code>
in base <code class="computeroutput"><span class="identifier">i</span></code> if the
character <code class="computeroutput"><span class="identifier">c</span></code> is
a valid digit in base <code class="computeroutput"><span class="identifier">i</span></code>;
otherwise returns <code class="computeroutput"><span class="special">-</span><span class="number">1</span></code>.<br> [Note: the value of <code class="computeroutput"><span class="identifier">i</span></code> will only be <code class="computeroutput"><span class="number">8</span></code>, <code class="computeroutput"><span class="number">10</span></code>,
or <code class="computeroutput"><span class="number">16</span></code>. -end note]
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">u</span><span class="special">.</span><span class="identifier">imbue</span><span class="special">(</span><span class="identifier">loc</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">locale_type</span></code>
</p>
</td>
<td>
<p>
Imbues <code class="computeroutput"><span class="identifier">u</span></code> with the
locale <code class="computeroutput"><span class="identifier">loc</span></code> and returns
the previous locale used by <code class="computeroutput"><span class="identifier">u</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">getloc</span><span class="special">()</span></code>
</p>
</td>
<td>
<p>
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">locale_type</span></code>
</p>
</td>
<td>
<p>
Returns the current locale used by <code class="computeroutput"><span class="identifier">v</span></code>.
</p>
</td>
</tr>
</tbody>
</table></div>
</div>
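<p>
As a rough illustration of a few rows of the table above, the following sketch
implements <code class="computeroutput">in_range</code>, <code class="computeroutput">in_range_nocase</code>,
<code class="computeroutput">value</code>, <code class="computeroutput">lookup_classname</code> and
<code class="computeroutput">isctype</code> for plain <code class="computeroutput">char</code>.
The class name <code class="computeroutput">toy_char_traits</code>, its two class bits, and the
restriction to the class names "digit" and "alpha" are assumptions made for this example only.
The sketch assumes an ASCII-compatible character set and the default "C" locale, and it is not
the traits class that xpressive itself provides.
</p>

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical, illustration-only traits-like class for plain char.
// It mirrors a few rows of the Traits Requirements table; it is not
// a complete or conforming regex traits class.
struct toy_char_traits
{
    typedef char char_type;
    typedef std::string string_type;
    typedef unsigned char_class_type;           // bitmask type

    // Hypothetical class bits; they can be bitwise-or'ed together.
    enum { digit_class = 1, alpha_class = 2 };

    // Map every case variant of c to one representative (its lowercase form).
    char_type translate_nocase(char_type c) const
    {
        return static_cast<char_type>(std::tolower(static_cast<unsigned char>(c)));
    }

    // True if r1 <= c && c <= r2. Requires r1 <= r2.
    bool in_range(char_type r1, char_type r2, char_type c) const
    {
        assert(r1 <= r2);
        return r1 <= c && c <= r2;
    }

    // True if some character d that is case-equivalent to c lies in [r1, r2].
    // In the "C" locale the candidates are c, tolower(c) and toupper(c).
    bool in_range_nocase(char_type r1, char_type r2, char_type c) const
    {
        unsigned char uc = static_cast<unsigned char>(c);
        char_type lower = static_cast<char_type>(std::tolower(uc));
        char_type upper = static_cast<char_type>(std::toupper(uc));
        return in_range(r1, r2, c) || in_range(r1, r2, lower) || in_range(r1, r2, upper);
    }

    // Value of digit c in the given base (8, 10 or 16), or -1 if c is not
    // a valid digit in that base.
    int value(char_type c, int base) const
    {
        int v = -1;
        if (c >= '0' && c <= '9') v = c - '0';
        else if (c >= 'a' && c <= 'f') v = 10 + (c - 'a');
        else if (c >= 'A' && c <= 'F') v = 10 + (c - 'A');
        return (v >= 0 && v < base) ? v : -1;
    }

    // Case-insensitive class-name lookup; returns 0 for unknown names.
    char_class_type lookup_classname(const char* first, const char* last) const
    {
        string_type name;
        for (; first != last; ++first)
            name += translate_nocase(*first);
        if (name == "digit") return digit_class;
        if (name == "alpha") return alpha_class;
        return 0;
    }

    // True if c belongs to at least one class selected in mask.
    bool isctype(char_type c, char_class_type mask) const
    {
        unsigned char uc = static_cast<unsigned char>(c);
        return ((mask & digit_class) && std::isdigit(uc))
            || ((mask & alpha_class) && std::isalpha(uc));
    }
};
```

<p>
With an object <code class="computeroutput">t</code> of this type,
<code class="computeroutput">t.in_range('a', 'f', 'C')</code> is false while
<code class="computeroutput">t.in_range_nocase('a', 'f', 'C')</code> is true, and
<code class="computeroutput">t.value('8', 8)</code> is -1 because 8 is not an octal digit.
</p>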
<br class="table-break"><a name="boost_xpressive.user_s_guide.concepts.acknowledgements"></a><h3>
<a name="id3132089"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.acknowledgements">Acknowledgements</a>
</h3>
<p>
This section is adapted from the equivalent page in the <a href="../../../libs/regex" target="_top">Boost.Regex</a>
documentation and from the <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1429.htm" target="_top">proposal</a>
to add regular expressions to the Standard Library.
</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.examples"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">Examples</a>
</h3></div></div></div>
<p>
Below you can find six complete sample programs. <br>
</p>
<p></p>
<a name="boost_xpressive.user_s_guide.examples.see_if_a_whole_string_matches_a_regex"></a><h5>
<a name="id3132157"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_whole_string_matches_a_regex">See
if a whole string matches a regex</a>
</h5>
<p>
This is the example from the Introduction. It is reproduced here for your
convenience.
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">hello</span><span class="special">(</span> <span class="string">"hello world!"</span> <span class="special">);</span>
    <span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(\\w+) (\\w+)!"</span> <span class="special">);</span>
    <span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
    <span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="identifier">hello</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rex</span> <span class="special">)</span> <span class="special">)</span>
    <span class="special">{</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// whole match
</span>        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="number">1</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// first capture
</span>        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="number">2</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// second capture
</span>    <span class="special">}</span>
    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
This program outputs the following:
</p>
<pre class="programlisting">hello world!
hello
world
</pre>
<p>
<br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
</p>
<p></p>
<a name="boost_xpressive.user_s_guide.examples.see_if_a_string_contains_a_sub_string_that_matches_a_regex"></a><h5>
<a name="id3132694"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_string_contains_a_sub_string_that_matches_a_regex">See
if a string contains a sub-string that matches a regex</a>
</h5>
<p>
Notice in this example how we use custom <code class="computeroutput"><span class="identifier">mark_tag</span></code>s
to make the pattern more readable. We can use the <code class="computeroutput"><span class="identifier">mark_tag</span></code>s
later to index into the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>.
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="keyword">char</span> <span class="keyword">const</span> <span class="special">*</span><span class="identifier">str</span> <span class="special">=</span> <span class="string">"I was born on 5/30/1973 at 7am."</span><span class="special">;</span>
    <span class="comment">// define some custom mark_tags with names more meaningful than s1, s2, etc.
</span>    <span class="identifier">mark_tag</span> <span class="identifier">day</span><span class="special">(</span><span class="number">1</span><span class="special">),</span> <span class="identifier">month</span><span class="special">(</span><span class="number">2</span><span class="special">),</span> <span class="identifier">year</span><span class="special">(</span><span class="number">3</span><span class="special">),</span> <span class="identifier">delim</span><span class="special">(</span><span class="number">4</span><span class="special">);</span>
    <span class="comment">// this regex finds a date
</span>    <span class="identifier">cregex</span> <span class="identifier">date</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">month</span><span class="special">=</span> <span class="identifier">repeat</span><span class="special">&lt;</span><span class="number">1</span><span class="special">,</span><span class="number">2</span><span class="special">&gt;(</span><span class="identifier">_d</span><span class="special">))</span> <span class="comment">// find the month ...
</span>               <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">delim</span><span class="special">=</span> <span class="special">(</span><span class="identifier">set</span><span class="special">=</span> <span class="char">'/'</span><span class="special">,</span><span class="char">'-'</span><span class="special">))</span> <span class="comment">// followed by a delimiter ...
</span>               <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">day</span><span class="special">=</span> <span class="identifier">repeat</span><span class="special">&lt;</span><span class="number">1</span><span class="special">,</span><span class="number">2</span><span class="special">&gt;(</span><span class="identifier">_d</span><span class="special">))</span> <span class="special">&gt;&gt;</span> <span class="identifier">delim</span> <span class="comment">// and a day followed by the same delimiter ...
</span>               <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">year</span><span class="special">=</span> <span class="identifier">repeat</span><span class="special">&lt;</span><span class="number">1</span><span class="special">,</span><span class="number">2</span><span class="special">&gt;(</span><span class="identifier">_d</span> <span class="special">&gt;&gt;</span> <span class="identifier">_d</span><span class="special">));</span> <span class="comment">// and the year.
</span>
    <span class="identifier">cmatch</span> <span class="identifier">what</span><span class="special">;</span>
    <span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_search</span><span class="special">(</span> <span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">date</span> <span class="special">)</span> <span class="special">)</span>
    <span class="special">{</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// whole match
</span>        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="identifier">day</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// the day
</span>        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="identifier">month</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// the month
</span>        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="identifier">year</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// the year
</span>        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="identifier">delim</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// the delimiter
</span>    <span class="special">}</span>
    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
This program outputs the following:
</p>
<pre class="programlisting">5/30/1973
30
5
1973
/
</pre>
<p>
<br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
</p>
<p></p>
<a name="boost_xpressive.user_s_guide.examples.replace_all_sub_strings_that_match_a_regex"></a><h5>
<a name="id3133704"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.replace_all_sub_strings_that_match_a_regex">Replace
all sub-strings that match a regex</a>
</h5>
<p>
The following program finds dates in a string and marks them up with pseudo-HTML.
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span> <span class="string">"I was born on 5/30/1973 at 7am."</span> <span class="special">);</span>
<span class="comment">// essentially the same regex as in the previous example, but using a dynamic regex
</span> <span class="identifier">sregex</span> <span class="identifier">date</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(\\d{1,2})([/-])(\\d{1,2})\\2((?:\\d{2}){1,2})"</span> <span class="special">);</span>
<span class="comment">// As in Perl, $&amp; is a reference to the sub-string that matched the regex
</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">format</span><span class="special">(</span> <span class="string">"&lt;date&gt;$&amp;&lt;/date&gt;"</span> <span class="special">);</span>
<span class="identifier">str</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span> <span class="identifier">str</span><span class="special">,</span> <span class="identifier">date</span><span class="special">,</span> <span class="identifier">format</span> <span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">str</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span>
<span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
This program outputs the following:
</p>
<pre class="programlisting">I was born on &lt;date&gt;5/30/1973&lt;/date&gt; at 7am.
</pre>
<p>
<br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
</p>
<p></p>
<a name="boost_xpressive.user_s_guide.examples.find_all_the_sub_strings_that_match_a_regex_and_step_through_them_one_at_a_time"></a><h5>
<a name="id3134127"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.find_all_the_sub_strings_that_match_a_regex_and_step_through_them_one_at_a_time">Find
all the sub-strings that match a regex and step through them one at a time</a>
</h5>
<p>
The following program finds the words in a wide-character string. It uses
<code class="computeroutput"><span class="identifier">wsregex_iterator</span></code>. Notice
that dereferencing a <code class="computeroutput"><span class="identifier">wsregex_iterator</span></code>
yields a <code class="computeroutput"><span class="identifier">wsmatch</span></code> object.
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">wstring</span> <span class="identifier">str</span><span class="special">(</span> <span class="identifier">L</span><span class="string">"This is his face."</span> <span class="special">);</span>
<span class="comment">// find a whole word
</span> <span class="identifier">wsregex</span> <span class="identifier">token</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">alnum</span><span class="special">;</span>
<span class="identifier">wsregex_iterator</span> <span class="identifier">cur</span><span class="special">(</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">token</span> <span class="special">);</span>
<span class="identifier">wsregex_iterator</span> <span class="identifier">end</span><span class="special">;</span>
<span class="keyword">for</span><span class="special">(</span> <span class="special">;</span> <span class="identifier">cur</span> <span class="special">!=</span> <span class="identifier">end</span><span class="special">;</span> <span class="special">++</span><span class="identifier">cur</span> <span class="special">)</span>
<span class="special">{</span>
<span class="identifier">wsmatch</span> <span class="keyword">const</span> <span class="special">&amp;</span><span class="identifier">what</span> <span class="special">=</span> <span class="special">*</span><span class="identifier">cur</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">wcout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="identifier">L</span><span class="char">'\n'</span><span class="special">;</span>
<span class="special">}</span>
<span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
This program outputs the following:
</p>
<pre class="programlisting">This
is
his
face
</pre>
<p>
<br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
</p>
<p></p>
<a name="boost_xpressive.user_s_guide.examples.split_a_string_into_tokens_that_each_match_a_regex"></a><h5>
<a name="id3134666"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.split_a_string_into_tokens_that_each_match_a_regex">Split
a string into tokens that each match a regex</a>
</h5>
<p>
The following program finds race times in a string and displays first the
minutes and then the seconds. It uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>.
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span> <span class="string">"Eric: 4:40, Karl: 3:35, Francesca: 2:32"</span> <span class="special">);</span>
<span class="comment">// find a race time
</span> <span class="identifier">sregex</span> <span class="identifier">time</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(\\d):(\\d\\d)"</span> <span class="special">);</span>
<span class="comment">// for each match, the token iterator should first take the value of
</span> <span class="comment">// the first marked sub-expression followed by the value of the second
</span> <span class="comment">// marked sub-expression
</span> <span class="keyword">int</span> <span class="keyword">const</span> <span class="identifier">subs</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span> <span class="number">1</span><span class="special">,</span> <span class="number">2</span> <span class="special">};</span>
<span class="identifier">sregex_token_iterator</span> <span class="identifier">cur</span><span class="special">(</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">time</span><span class="special">,</span> <span class="identifier">subs</span> <span class="special">);</span>
<span class="identifier">sregex_token_iterator</span> <span class="identifier">end</span><span class="special">;</span>
<span class="keyword">for</span><span class="special">(</span> <span class="special">;</span> <span class="identifier">cur</span> <span class="special">!=</span> <span class="identifier">end</span><span class="special">;</span> <span class="special">++</span><span class="identifier">cur</span> <span class="special">)</span>
<span class="special">{</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="special">*</span><span class="identifier">cur</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span>
<span class="special">}</span>
<span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
This program outputs the following:
</p>
<pre class="programlisting">4
40
3
35
2
32
</pre>
<p>
<br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
</p>
<p></p>
<a name="boost_xpressive.user_s_guide.examples.split_a_string_using_a_regex_as_a_delimiter"></a><h5>
<a name="id3135229"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.split_a_string_using_a_regex_as_a_delimiter">Split
a string using a regex as a delimiter</a>
</h5>
<p>
The following program takes some text that has been marked up with HTML and
strips out the mark-up. It uses a regex that matches an HTML tag and a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator&lt;&gt;</a></code></code>
that returns the parts of the string that do <span class="emphasis"><em>not</em></span> match
the regex.
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span> <span class="string">"Now &lt;bold&gt;is the time &lt;i&gt;for all good men&lt;/i&gt; to come to the aid of their&lt;/bold&gt; country."</span> <span class="special">);</span>
<span class="comment">// find an HTML tag
</span> <span class="identifier">sregex</span> <span class="identifier">html</span> <span class="special">=</span> <span class="char">'&lt;'</span> <span class="special">&gt;&gt;</span> <span class="identifier">optional</span><span class="special">(</span><span class="char">'/'</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">&gt;&gt;</span> <span class="char">'&gt;'</span><span class="special">;</span>
<span class="comment">// the -1 below directs the token iterator to display the parts of
</span> <span class="comment">// the string that did NOT match the regular expression.
</span> <span class="identifier">sregex_token_iterator</span> <span class="identifier">cur</span><span class="special">(</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">html</span><span class="special">,</span> <span class="special">-</span><span class="number">1</span> <span class="special">);</span>
<span class="identifier">sregex_token_iterator</span> <span class="identifier">end</span><span class="special">;</span>
<span class="keyword">for</span><span class="special">(</span> <span class="special">;</span> <span class="identifier">cur</span> <span class="special">!=</span> <span class="identifier">end</span><span class="special">;</span> <span class="special">++</span><span class="identifier">cur</span> <span class="special">)</span>
<span class="special">{</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="char">'{'</span> <span class="special">&lt;&lt;</span> <span class="special">*</span><span class="identifier">cur</span> <span class="special">&lt;&lt;</span> <span class="char">'}'</span><span class="special">;</span>
<span class="special">}</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span>
<span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
This program outputs the following:
</p>
<pre class="programlisting">{Now }{is the time }{for all good men}{ to come to the aid of their}{ country.}
</pre>
<p>
<br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
</p>
<p></p>
<a name="boost_xpressive.user_s_guide.examples.display_a_tree_of_nested_results"></a><h5>
<a name="id3135817"></a>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.display_a_tree_of_nested_results">Display
a tree of nested results</a>
</h5>
<p>
Here is a helper class to demonstrate how you might display a tree of nested
results:
</p>
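<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">algorithm</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iterator</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
<span class="comment">// Displays nested results to std::cout with indenting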
</span><span class="keyword">struct</span> <span class="identifier">output_nested_results</span>
<span class="special">{</span>
<span class="keyword">int</span> <span class="identifier">tabs_</span><span class="special">;</span>
<span class="identifier">output_nested_results</span><span class="special">(</span> <span class="keyword">int</span> <span class="identifier">tabs</span> <span class="special">=</span> <span class="number">0</span> <span class="special">)</span>
<span class="special">:</span> <span class="identifier">tabs_</span><span class="special">(</span> <span class="identifier">tabs</span> <span class="special">)</span>
<span class="special">{</span>
<span class="special">}</span>
<span class="keyword">template</span><span class="special">&lt;</span> <span class="keyword">typename</span> <span class="identifier">BidiIterT</span> <span class="special">&gt;</span>
<span class="keyword">void</span> <span class="keyword">operator</span> <span class="special">()(</span> <span class="identifier">match_results</span><span class="special">&lt;</span> <span class="identifier">BidiIterT</span> <span class="special">&gt;</span> <span class="keyword">const</span> <span class="special">&amp;</span><span class="identifier">what</span> <span class="special">)</span> <span class="keyword">const</span>
<span class="special">{</span>
<span class="comment">// first, do some indenting
</span> <span class="keyword">typedef</span> <span class="keyword">typename</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">iterator_traits</span><span class="special">&lt;</span> <span class="identifier">BidiIterT</span> <span class="special">&gt;::</span><span class="identifier">value_type</span> <span class="identifier">char_type</span><span class="special">;</span>
<span class="identifier">char_type</span> <span class="identifier">space_ch</span> <span class="special">=</span> <span class="identifier">char_type</span><span class="special">(</span><span class="char">' '</span><span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">fill_n</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special">&lt;</span><span class="identifier">char_type</span><span class="special">&gt;(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">),</span> <span class="identifier">tabs_</span> <span class="special">*</span> <span class="number">4</span><span class="special">,</span> <span class="identifier">space_ch</span> <span class="special">);</span>
<span class="comment">// output the match
</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span>
<span class="comment">// output any nested matches
</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">for_each</span><span class="special">(</span>
<span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">begin</span><span class="special">(),</span>
<span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">end</span><span class="special">(),</span>
<span class="identifier">output_nested_results</span><span class="special">(</span> <span class="identifier">tabs_</span> <span class="special">+</span> <span class="number">1</span> <span class="special">)</span> <span class="special">);</span>
<span class="special">}</span>
<span class="special">};</span>
</pre>
<p>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
</p>
</div>
<div class="footnotes">
<br><hr width="100" align="left">
<div class="footnote"><p><sup>[<a name="ftn.id3091425" href="#id3091425" class="para">4</a>] </sup>
See <a href="http://www.osl.iu.edu/~tveldhui/papers/Expression-Templates/exprtmpl.html" target="_top">Expression
Templates</a>
</p></div>
<div class="footnote"><p><sup>[<a name="ftn.id3125582" href="#id3125582" class="para">5</a>] </sup>
Many thanks to David Jenkins, who contributed this example.
</p></div>
</div>
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright &#169; 2007 Eric Niebler<p>
Distributed under the Boost Software License, Version 1.0. (See accompanying
file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
</p>
</div></td>
</tr></table>
</body>
</html>