<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
<title>Supported Regular Expressions</title>
<link rel="stylesheet" href="../../../../../../../doc/src/boostbook.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.75.0">
<link rel="home" href="../../../index.html" title="Spirit 2.4.1">
<link rel="up" href="../quick_reference.html" title="Quick Reference">
<link rel="prev" href="phoenix.html" title="Phoenix">
<link rel="next" href="../reference.html" title="Reference">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table cellpadding="2" width="100%"><tr>
<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../../boost.png"></td>
<td align="center"><a href="../../../../../../../index.html">Home</a></td>
<td align="center"><a href="../../../../../../../libs/libraries.htm">Libraries</a></td>
<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
<td align="center"><a href="../../../../../../../more/index.htm">More</a></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="phoenix.html"><img src="../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../quick_reference.html"><img src="../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../index.html"><img src="../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="../reference.html"><img src="../../../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="spirit.lex.quick_reference.lexer"></a><a class="link" href="lexer.html" title="Supported Regular Expressions">Supported Regular
Expressions</a>
</h4></div></div></div>
<div class="table">
<a name="id1143740"></a><p class="title"><b>Table&#160;11.&#160;Regular expressions support</b></p>
<div class="table-contents"><table class="table" summary="Regular expressions support">
<colgroup>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
Expression
</p>
</th>
<th>
<p>
Meaning
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">x</span></code>
</p>
</td>
<td>
<p>
Match the literal character <code class="computeroutput"><span class="identifier">x</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">.</span></code>
</p>
</td>
<td>
<p>
Match any character except newline (or optionally <span class="bold"><strong>any</strong></span>
character)
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="string">"..."</span></code>
</p>
</td>
<td>
<p>
All characters taken as literals between double quotes, except
escape sequences
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">[</span><span class="identifier">xyz</span><span class="special">]</span></code>
</p>
</td>
<td>
<p>
A character class; in this case matches <code class="computeroutput"><span class="identifier">x</span></code>,
<code class="computeroutput"><span class="identifier">y</span></code> or <code class="computeroutput"><span class="identifier">z</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">[</span><span class="identifier">abj</span><span class="special">-</span><span class="identifier">oZ</span><span class="special">]</span></code>
</p>
</td>
<td>
<p>
A character class with a range in it; matches <code class="computeroutput"><span class="identifier">a</span></code>,
<code class="computeroutput"><span class="identifier">b</span></code>, any letter
from <code class="computeroutput"><span class="identifier">j</span></code> through
<code class="computeroutput"><span class="identifier">o</span></code> or a <code class="computeroutput"><span class="identifier">Z</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">[^</span><span class="identifier">A</span><span class="special">-</span><span class="identifier">Z</span><span class="special">]</span></code>
</p>
</td>
<td>
<p>
A negated character class i.e. any character but those in the
class. In this case, any character except an uppercase letter
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">r</span><span class="special">*</span></code>
</p>
</td>
<td>
<p>
Zero or more r's (greedy), where r is any regular expression
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">r</span><span class="special">*?</span></code>
</p>
</td>
<td>
<p>
Zero or more r's (abstemious, i.e. non-greedy), where r is any regular expression
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">r</span><span class="special">+</span></code>
</p>
</td>
<td>
<p>
One or more r's (greedy)
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">r</span><span class="special">+?</span></code>
</p>
</td>
<td>
<p>
One or more r's (abstemious)
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">r</span><span class="special">?</span></code>
</p>
</td>
<td>
<p>
Zero or one r's (greedy), i.e. optional
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">r</span><span class="special">??</span></code>
</p>
</td>
<td>
<p>
Zero or one r's (abstemious), i.e. optional
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">r</span><span class="special">{</span><span class="number">2</span><span class="special">,</span><span class="number">5</span><span class="special">}</span></code>
</p>
</td>
<td>
<p>
Anywhere between two and five r's (greedy)
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">r</span><span class="special">{</span><span class="number">2</span><span class="special">,</span><span class="number">5</span><span class="special">}?</span></code>
</p>
</td>
<td>
<p>
Anywhere between two and five r's (abstemious)
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">r</span><span class="special">{</span><span class="number">2</span><span class="special">,}</span></code>
</p>
</td>
<td>
<p>
Two or more r's (greedy)
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">r</span><span class="special">{</span><span class="number">2</span><span class="special">,}?</span></code>
</p>
</td>
<td>
<p>
Two or more r's (abstemious)
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">r</span><span class="special">{</span><span class="number">4</span><span class="special">}</span></code>
</p>
</td>
<td>
<p>
Exactly four r's
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">{</span><span class="identifier">NAME</span><span class="special">}</span></code>
</p>
</td>
<td>
<p>
The macro <code class="computeroutput"><span class="identifier">NAME</span></code>
(see below)
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="string">"[xyz]\"foo"</span></code>
</p>
</td>
<td>
<p>
The literal string <code class="computeroutput"><span class="special">[</span><span class="identifier">xyz</span><span class="special">]</span><span class="special">"</span><span class="identifier">foo</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">X</span></code>
</p>
</td>
<td>
<p>
If X is <code class="computeroutput"><span class="identifier">a</span></code>, <code class="computeroutput"><span class="identifier">b</span></code>, <code class="computeroutput"><span class="identifier">e</span></code>,
<code class="computeroutput"><span class="identifier">n</span></code>, <code class="computeroutput"><span class="identifier">r</span></code>, <code class="computeroutput"><span class="identifier">f</span></code>,
<code class="computeroutput"><span class="identifier">t</span></code>, <code class="computeroutput"><span class="identifier">v</span></code> then the ANSI-C interpretation
of <code class="computeroutput"><span class="special">\</span><span class="identifier">X</span></code>.
Otherwise a literal <code class="computeroutput"><span class="identifier">X</span></code>
(used to escape operators such as <code class="computeroutput"><span class="special">*</span></code>)
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="number">0</span></code>
</p>
</td>
<td>
<p>
A NUL character (ASCII code 0)
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="number">123</span></code>
</p>
</td>
<td>
<p>
The character with octal value 123
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">x2a</span></code>
</p>
</td>
<td>
<p>
The character with hexadecimal value 2a
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">cX</span></code>
</p>
</td>
<td>
<p>
A named control character <code class="computeroutput"><span class="identifier">X</span></code>.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">a</span></code>
</p>
</td>
<td>
<p>
A shortcut for Alert (bell).
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">b</span></code>
</p>
</td>
<td>
<p>
A shortcut for Backspace
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">e</span></code>
</p>
</td>
<td>
<p>
A shortcut for ESC (escape character <code class="computeroutput"><span class="number">0x1b</span></code>)
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">n</span></code>
</p>
</td>
<td>
<p>
A shortcut for newline
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">r</span></code>
</p>
</td>
<td>
<p>
A shortcut for carriage return
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">f</span></code>
</p>
</td>
<td>
<p>
A shortcut for form feed <code class="computeroutput"><span class="number">0x0c</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">t</span></code>
</p>
</td>
<td>
<p>
A shortcut for horizontal tab <code class="computeroutput"><span class="number">0x09</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">v</span></code>
</p>
</td>
<td>
<p>
A shortcut for vertical tab <code class="computeroutput"><span class="number">0x0b</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">d</span></code>
</p>
</td>
<td>
<p>
A shortcut for <code class="computeroutput"><span class="special">[</span><span class="number">0</span><span class="special">-</span><span class="number">9</span><span class="special">]</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">D</span></code>
</p>
</td>
<td>
<p>
A shortcut for <code class="computeroutput"><span class="special">[^</span><span class="number">0</span><span class="special">-</span><span class="number">9</span><span class="special">]</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">s</span></code>
</p>
</td>
<td>
<p>
A shortcut for <code class="computeroutput"><span class="special">[\</span><span class="identifier">x20</span><span class="special">\</span><span class="identifier">t</span><span class="special">\</span><span class="identifier">n</span><span class="special">\</span><span class="identifier">r</span><span class="special">\</span><span class="identifier">f</span><span class="special">\</span><span class="identifier">v</span><span class="special">]</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">S</span></code>
</p>
</td>
<td>
<p>
A shortcut for <code class="computeroutput"><span class="special">[^\</span><span class="identifier">x20</span><span class="special">\</span><span class="identifier">t</span><span class="special">\</span><span class="identifier">n</span><span class="special">\</span><span class="identifier">r</span><span class="special">\</span><span class="identifier">f</span><span class="special">\</span><span class="identifier">v</span><span class="special">]</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">w</span></code>
</p>
</td>
<td>
<p>
A shortcut for <code class="computeroutput"><span class="special">[</span><span class="identifier">a</span><span class="special">-</span><span class="identifier">zA</span><span class="special">-</span><span class="identifier">Z0</span><span class="special">-</span><span class="number">9</span><span class="identifier">_</span><span class="special">]</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">W</span></code>
</p>
</td>
<td>
<p>
A shortcut for <code class="computeroutput"><span class="special">[^</span><span class="identifier">a</span><span class="special">-</span><span class="identifier">zA</span><span class="special">-</span><span class="identifier">Z0</span><span class="special">-</span><span class="number">9</span><span class="identifier">_</span><span class="special">]</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">(</span><span class="identifier">r</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
Match an <code class="computeroutput"><span class="identifier">r</span></code>; parentheses
are used to override precedence (see below)
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">(?</span><span class="identifier">r</span><span class="special">-</span><span class="identifier">s</span><span class="special">:</span><span class="identifier">pattern</span><span class="special">)</span></code>
</p>
</td>
<td>
<p>
Apply the options given in 'r' and omit the options given in 's' while
interpreting pattern. Each of 'r' and 's' stands for zero or more of the
option characters 'i' or 's'. 'i' means case-insensitive. '-i' means
case-sensitive. 's' alters the meaning of the '.' syntax to match any
single character whatsoever. '-s' alters the meaning of '.' to match any
character except '<code class="computeroutput"><span class="special">\</span><span class="identifier">n</span></code>'.
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">rs</span></code>
</p>
</td>
<td>
<p>
The regular expression <code class="computeroutput"><span class="identifier">r</span></code>
followed by the regular expression <code class="computeroutput"><span class="identifier">s</span></code>
(a sequence)
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">r</span><span class="special">|</span><span class="identifier">s</span></code>
</p>
</td>
<td>
<p>
Either an <code class="computeroutput"><span class="identifier">r</span></code> or
an <code class="computeroutput"><span class="identifier">s</span></code>
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">^</span><span class="identifier">r</span></code>
</p>
</td>
<td>
<p>
An <code class="computeroutput"><span class="identifier">r</span></code> but only
at the beginning of a line (i.e. when just starting to scan,
or right after a newline has been scanned)
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="identifier">r</span><span class="special">$</span></code>
</p>
</td>
<td>
<p>
An <code class="computeroutput"><span class="identifier">r</span></code> but only
at the end of a line (i.e. just before a newline)
</p>
</td>
</tr>
</tbody>
</table></div>
</div>
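<p>
The difference between the greedy and abstemious (non-greedy) forms in the
table above can be sketched as follows. This is an illustration under stated
assumptions, not Spirit.Lex code: it uses <code class="computeroutput">std::regex</code>
from the C++ standard library, assuming that both engines treat
<code class="computeroutput">+</code> and <code class="computeroutput">+?</code>
alike, and the helper name <code class="computeroutput">first_match</code> is
hypothetical.
</p>
<pre class="programlisting">#include &lt;regex&gt;
#include &lt;string&gt;

// Hypothetical helper: return the first substring of `input` matched by
// `pattern`, or an empty string if nothing matches.
// Uses std::regex (ECMAScript grammar), not the Spirit.Lex engine.
std::string first_match(std::string const&amp; pattern, std::string const&amp; input)
{
    std::regex const re(pattern);
    std::smatch m;
    return std::regex_search(input, m, re) ? m.str() : std::string();
}
</pre>
<p>
With this helper, the greedy pattern <code class="computeroutput">&lt;.+&gt;</code>
applied to <code class="computeroutput">&lt;a&gt;&lt;b&gt;</code> consumes the whole
input, while the abstemious pattern <code class="computeroutput">&lt;.+?&gt;</code>
stops at the first <code class="computeroutput">&gt;</code> and yields
<code class="computeroutput">&lt;a&gt;</code>.
</p>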
<br class="table-break"><div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
POSIX character classes are not currently supported, due to performance
issues when creating them in wide character mode.
</p></td></tr>
</table></div>
<div class="tip"><table border="0" summary="Tip">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../images/tip.png"></td>
<th align="left">Tip</th>
</tr>
<tr><td align="left" valign="top">
<p>
If you want to build tokens for syntaxes that recognize items like quotes
(<code class="computeroutput"><span class="string">"'"</span></code>, <code class="computeroutput"><span class="char">'"'</span></code>) and backslash (<code class="computeroutput"><span class="special">\</span></code>),
the following patterns may help to get you started. Keep in mind that both
C++ string literals and regular expressions require escaping
with <code class="computeroutput"><span class="special">\</span></code> for some constructs,
so the escapes can cascade.
</p>
<pre class="programlisting"><span class="identifier">quote1</span> <span class="special">=</span> <span class="string">"'"</span><span class="special">;</span> <span class="comment">// match single "'"
</span><span class="identifier">quote2</span> <span class="special">=</span> <span class="string">"\\\""</span><span class="special">;</span> <span class="comment">// match single '"'
</span><span class="identifier">literal_quote1</span> <span class="special">=</span> <span class="string">"\\\\'"</span><span class="special">;</span> <span class="comment">// match backslash followed by single "'"
</span><span class="identifier">literal_quote2</span> <span class="special">=</span> <span class="string">"\\\\\\\""</span><span class="special">;</span> <span class="comment">// match backslash followed by single '"'
</span><span class="identifier">literal_backslash</span> <span class="special">=</span> <span class="string">"\\\\\\\\"</span><span class="special">;</span> <span class="comment">// match two backslashes
</span></pre>
</td></tr>
</table></div>
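<p>
The cascading of escapes described in the tip can be made visible by writing
the same patterns twice: once as ordinary C++ string literals and once as raw
string literals. The sketch below assumes a C++11 compiler; the
<code class="computeroutput">_raw</code> variable names are hypothetical. A raw
literal removes the C++ layer of escaping, so its text is exactly what the
regular expression engine receives.
</p>
<pre class="programlisting">#include &lt;string&gt;

// Ordinary string literals: C++ escaping stacked on top of regex escaping.
std::string const quote2            = "\\\"";      // regex text: \"
std::string const literal_quote2    = "\\\\\\\"";  // regex text: \\\"
std::string const literal_backslash = "\\\\\\\\";  // regex text: \\\\

// Raw string literals (C++11): only the regex-level escaping remains.
std::string const quote2_raw            = R"(\")";
std::string const literal_quote2_raw    = R"(\\\")";
std::string const literal_backslash_raw = R"(\\\\)";
</pre>
<p>
Each ordinary literal compares equal to its raw counterpart; for example,
<code class="computeroutput">literal_backslash</code> holds four backslash
characters, which the regular expression reads as two escaped backslashes.
</p>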
<a name="spirit.lex.quick_reference.lexer.regular_expression_precedence"></a><h6>
<a name="id1145875"></a>
<a class="link" href="lexer.html#spirit.lex.quick_reference.lexer.regular_expression_precedence">Regular
Expression Precedence</a>
</h6>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem">
<code class="computeroutput"><span class="identifier">r</span><span class="special">*</span></code>
has the highest precedence (<code class="computeroutput"><span class="special">+</span></code>,
<code class="computeroutput"><span class="special">?</span></code>, <code class="computeroutput"><span class="special">{</span><span class="identifier">n</span><span class="special">,</span><span class="identifier">m</span><span class="special">}</span></code>
have the same precedence as <code class="computeroutput"><span class="special">*</span></code>)
</li>
<li class="listitem">
<code class="computeroutput"><span class="identifier">rs</span></code> has the next highest precedence
</li>
<li class="listitem">
<code class="computeroutput"><span class="identifier">r</span><span class="special">|</span><span class="identifier">s</span></code> has the lowest precedence
</li>
</ul></div>
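<p>
Because <code class="computeroutput">r|s</code> has the lowest precedence, an
alternation splits the whole expression unless parentheses say otherwise. The
following sketch illustrates this with <code class="computeroutput">std::regex</code>
from the C++ standard library, on the assumption that it groups alternation and
sequences the same way as Spirit.Lex; the helper name
<code class="computeroutput">matches</code> is hypothetical.
</p>
<pre class="programlisting">#include &lt;regex&gt;
#include &lt;string&gt;

// Hypothetical helper: true if `pattern` matches the whole of `input`.
// Uses std::regex (ECMAScript grammar), not the Spirit.Lex engine.
bool matches(std::string const&amp; pattern, std::string const&amp; input)
{
    return std::regex_match(input, std::regex(pattern));
}
</pre>
<p>
Here <code class="computeroutput">a|bc</code> is read as
<code class="computeroutput">a|(bc)</code>: it matches
<code class="computeroutput">a</code> and <code class="computeroutput">bc</code>
but not <code class="computeroutput">ac</code>. Writing
<code class="computeroutput">(a|b)c</code> overrides the precedence, so that
<code class="computeroutput">ac</code> and <code class="computeroutput">bc</code>
match while <code class="computeroutput">a</code> alone does not.
</p>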
<a name="spirit.lex.quick_reference.lexer.macros"></a><h6>
<a name="id1145991"></a>
<a class="link" href="lexer.html#spirit.lex.quick_reference.lexer.macros">Macros</a>
</h6>
<p>
Regular expressions can be given a name and referred to in rules using
the syntax <code class="computeroutput"><span class="special">{</span><span class="identifier">NAME</span><span class="special">}</span></code> where <code class="computeroutput"><span class="identifier">NAME</span></code>
is the name you have given to the macro. A macro name can be at most 30
characters long and must start with a <code class="computeroutput"><span class="identifier">_</span></code>
or a letter. Subsequent characters can be <code class="computeroutput"><span class="identifier">_</span></code>,
<code class="computeroutput"><span class="special">-</span></code>, a letter or a decimal digit.
</p>
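<p>
The naming rules above can be expressed as a small check. This is a minimal
sketch, not part of Spirit.Lex: the function name
<code class="computeroutput">is_valid_macro_name</code> is hypothetical, and it
assumes that "letter" means an ASCII letter as classified by
<code class="computeroutput">std::isalpha</code> in the default "C" locale and
that an empty name is invalid.
</p>
<pre class="programlisting">#include &lt;cctype&gt;
#include &lt;string&gt;

// Hypothetical helper: check a candidate macro name against the stated rules:
// at most 30 characters, first character '_' or a letter, every following
// character '_', '-', a letter or a decimal digit.
// Assumption: an empty name is rejected.
bool is_valid_macro_name(std::string const&amp; name)
{
    if (name.empty() || name.size() &gt; 30)
        return false;

    unsigned char const first = static_cast&lt;unsigned char&gt;(name[0]);
    if (first != '_' &amp;&amp; !std::isalpha(first))
        return false;

    for (std::string::size_type i = 1; i &lt; name.size(); ++i)
    {
        unsigned char const c = static_cast&lt;unsigned char&gt;(name[i]);
        if (c != '_' &amp;&amp; c != '-' &amp;&amp; !std::isalnum(c))
            return false;
    }
    return true;
}
</pre>
<p>
Under these assumptions, names such as <code class="computeroutput">NAME</code>
and <code class="computeroutput">_my-macro2</code> are accepted, while
<code class="computeroutput">2fast</code>, <code class="computeroutput">-x</code>
and any name longer than 30 characters are rejected.
</p>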
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright &#169; 2001-2010 Joel de Guzman, Hartmut Kaiser<p>
Distributed under the Boost Software License, Version 1.0. (See accompanying
file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
</p>
</div></td>
</tr></table>
</body>
</html>