<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
<title>Parsers in Depth</title>
<link rel="stylesheet" href="../../../../../../../doc/src/boostbook.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.75.0">
<link rel="home" href="../../../index.html" title="Spirit 2.4.1">
<link rel="up" href="../indepth.html" title="In Depth">
<link rel="prev" href="../indepth.html" title="In Depth">
<link rel="next" href="../customize.html" title="Customization of Spirit's Attribute Handling">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table cellpadding="2" width="100%"><tr>
<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../../boost.png"></td>
<td align="center"><a href="../../../../../../../index.html">Home</a></td>
<td align="center"><a href="../../../../../../../libs/libraries.htm">Libraries</a></td>
<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
<td align="center"><a href="../../../../../../../more/index.htm">More</a></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="../indepth.html"><img src="../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../indepth.html"><img src="../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../index.html"><img src="../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="../customize.html"><img src="../../../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="spirit.advanced.indepth.parsers_indepth"></a><a class="link" href="parsers_indepth.html" title="Parsers in Depth">Parsers in
Depth</a>
</h4></div></div></div>
<p>
This section is not for the faint of heart. It distills the
inner workings of <span class="emphasis"><em>Spirit.Qi</em></span> parsers, using real code
from the <a href="http://boost-spirit.com" target="_top">Spirit</a> library as
examples. There is no reason to fear reading on, though:
we have tried to explain things step by step while highlighting the important
insights.
</p>
<p>
The <code class="computeroutput"><a class="link" href="../../qi/reference/parser_concepts/parser.html" title="Parser"><code class="computeroutput"><span class="identifier">Parser</span></code></a></code> class is the base
class for all parsers.
</p>
<p>
</p>
<pre class="programlisting"><span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Derived</span><span class="special">&gt;</span>
<span class="keyword">struct</span> <span class="identifier">parser</span>
<span class="special">{</span>
<span class="keyword">struct</span> <span class="identifier">parser_id</span><span class="special">;</span>
<span class="keyword">typedef</span> <span class="identifier">Derived</span> <span class="identifier">derived_type</span><span class="special">;</span>
<span class="keyword">typedef</span> <span class="identifier">qi</span><span class="special">::</span><span class="identifier">domain</span> <span class="identifier">domain</span><span class="special">;</span>
<span class="comment">// Requirement: p.parse(f, l, context, skip, attr) -&gt; bool
</span> <span class="comment">//
</span> <span class="comment">// p: a parser
</span> <span class="comment">// f, l: first/last iterator pair
</span> <span class="comment">// context: enclosing rule context (can be unused_type)
</span> <span class="comment">// skip: skipper (can be unused_type)
</span> <span class="comment">// attr: attribute (can be unused_type)
</span>
<span class="comment">// Requirement: p.what(context) -&gt; info
</span> <span class="comment">//
</span> <span class="comment">// p: a parser
</span> <span class="comment">// context: enclosing rule context (can be unused_type)
</span>
<span class="comment">// Requirement: P::template attribute&lt;Ctx, Iter&gt;::type
</span> <span class="comment">//
</span> <span class="comment">// P: a parser type
</span> <span class="comment">// Ctx: A context type (can be unused_type)
</span> <span class="comment">// Iter: An iterator type (can be unused_type)
</span>
<span class="identifier">Derived</span> <span class="keyword">const</span><span class="special">&amp;</span> <span class="identifier">derived</span><span class="special">()</span> <span class="keyword">const</span>
<span class="special">{</span>
<span class="keyword">return</span> <span class="special">*</span><span class="keyword">static_cast</span><span class="special">&lt;</span><span class="identifier">Derived</span> <span class="keyword">const</span><span class="special">*&gt;(</span><span class="keyword">this</span><span class="special">);</span>
<span class="special">}</span>
<span class="special">};</span>
</pre>
<p>
</p>
<p>
The <code class="computeroutput"><a class="link" href="../../qi/reference/parser_concepts/parser.html" title="Parser"><code class="computeroutput"><span class="identifier">Parser</span></code></a></code> class does not really
know how to parse anything but instead relies on the template parameter
<code class="computeroutput"><span class="identifier">Derived</span></code> to do the actual
parsing. This technique is known as the "Curiously Recurring Template
Pattern" in template meta-programming circles. This inheritance strategy
gives us the power of polymorphism without the virtual function overhead.
In essence, this is a way to implement compile-time polymorphism.
</p>
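<p>
The pattern can be sketched in isolation as follows. This is a minimal
illustration, not the Spirit source: <code class="computeroutput">parser_base</code>, <code class="computeroutput">char_match</code>,
<code class="computeroutput">run</code> and <code class="computeroutput">crtp_demo</code> are hypothetical names, and the
<code class="computeroutput">parse</code> signature is reduced to an iterator pair.
</p>

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for qi::parser<Derived>; not the Spirit source.
template <typename Derived>
struct parser_base
{
    // Downcast to the concrete parser type, resolved at compile time.
    Derived const& derived() const
    {
        return *static_cast<Derived const*>(this);
    }
};

// A toy parser that matches exactly one fixed character.
struct char_match : parser_base<char_match>
{
    explicit char_match(char c) : ch(c) {}

    bool parse(char const*& first, char const* last) const
    {
        if (first != last && *first == ch)
        {
            ++first;
            return true;
        }
        return false;
    }

    char ch;
};

// Generic code sees only parser_base<Derived>, yet the call below is
// bound statically to Derived::parse; no virtual table is involved.
template <typename Derived>
bool run(parser_base<Derived> const& p, char const*& first, char const* last)
{
    return p.derived().parse(first, last);
}

// Usage: returns how many characters 'a' consumes from "ab".
inline int crtp_demo()
{
    std::string const input = "ab";
    char const* first = input.data();
    char const* last = first + input.size();
    bool const ok = run(char_match('a'), first, last);
    assert(ok);
    return static_cast<int>(first - input.data());
}
```

<p>
Because <code class="computeroutput">run</code> is a template over <code class="computeroutput">Derived</code>, the compiler
can inline the call to <code class="computeroutput">parse</code>, which is where the absence of
virtual function overhead comes from.
</p>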
<p>
The Derived parsers, <code class="computeroutput"><a class="link" href="../../qi/reference/parser_concepts/primitiveparser.html" title="PrimitiveParser"><code class="computeroutput"><span class="identifier">PrimitiveParser</span></code></a></code>, <code class="computeroutput"><a class="link" href="../../qi/reference/parser_concepts/unaryparser.html" title="UnaryParser"><code class="computeroutput"><span class="identifier">UnaryParser</span></code></a></code>, <code class="computeroutput"><a class="link" href="../../qi/reference/parser_concepts/binaryparser.html" title="BinaryParser"><code class="computeroutput"><span class="identifier">BinaryParser</span></code></a></code> and <code class="computeroutput"><a class="link" href="../../qi/reference/parser_concepts/naryparser.html" title="NaryParser"><code class="computeroutput"><span class="identifier">NaryParser</span></code></a></code> provide the
necessary facilities for parser detection, introspection, transformation
and visitation.
</p>
<p>
Derived parsers must support the following:
</p>
<div class="variablelist">
<p class="title"><b>bool parse(f, l, context, skip, attr)</b></p>
<dl>
<dt><span class="term"><code class="computeroutput"><span class="identifier">f</span></code>, <code class="computeroutput"><span class="identifier">l</span></code></span></dt>
<dd><p>
first/last iterator pair
</p></dd>
<dt><span class="term"><code class="computeroutput"><span class="identifier">context</span></code></span></dt>
<dd><p>
enclosing rule context (can be unused_type)
</p></dd>
<dt><span class="term"><code class="computeroutput"><span class="identifier">skip</span></code></span></dt>
<dd><p>
skipper (can be unused_type)
</p></dd>
<dt><span class="term"><code class="computeroutput"><span class="identifier">attr</span></code></span></dt>
<dd><p>
attribute (can be unused_type)
</p></dd>
</dl>
</div>
<p>
The <span class="emphasis"><em>parse</em></span> function is the main parser entry point. <span class="emphasis"><em>skipper</em></span>
can be an <code class="computeroutput"><span class="identifier">unused_type</span></code>.
It's a type used everywhere in <a href="http://boost-spirit.com" target="_top">Spirit</a>
to signify "don't-care". There is an overload of <code class="computeroutput"><span class="identifier">skip_over</span></code>
for <code class="computeroutput"><span class="identifier">unused_type</span></code> that is
simply a no-op. That way, we do not have to write multiple parse functions
for phrase and character level parsing.
</p>
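<p>
A minimal sketch of that overload trick, assuming a skipper is simply a
predicate on characters (Spirit's real skippers are parsers). All names here
(<code class="computeroutput">space_skipper</code>, <code class="computeroutput">parse_x</code>, <code class="computeroutput">skip_demo</code>) and this
simplified <code class="computeroutput">unused_type</code> and <code class="computeroutput">skip_over</code> are hypothetical
stand-ins, not the library definitions.
</p>

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in; Spirit's own unused_type is richer than this.
struct unused_type {};

// A toy skipper: reports whether a character should be skipped.
struct space_skipper
{
    bool operator()(char c) const
    {
        return std::isspace(static_cast<unsigned char>(c)) != 0;
    }
};

// Phrase level: advance first past everything the skipper accepts.
template <typename Iterator, typename Skipper>
void skip_over(Iterator& first, Iterator const& last, Skipper const& skipper)
{
    while (first != last && skipper(*first))
        ++first;
}

// Character level: with unused_type there is nothing to skip.
template <typename Iterator>
void skip_over(Iterator&, Iterator const&, unused_type const&)
{
}

// One parse function serves both levels; overload resolution selects
// the matching skip_over at compile time.
template <typename Skipper>
bool parse_x(std::string::const_iterator& first,
             std::string::const_iterator const& last,
             Skipper const& skipper)
{
    skip_over(first, last, skipper);
    if (first != last && *first == 'x')
    {
        ++first;
        return true;
    }
    return false;
}

// Usage: the same input fails at character level and succeeds at phrase level.
inline bool skip_demo()
{
    std::string const input = "  x";
    std::string::const_iterator first = input.begin();
    std::string::const_iterator const last = input.end();

    bool const char_level = parse_x(first, last, unused_type());
    assert(!char_level && first == input.begin());

    bool const phrase_level = parse_x(first, last, space_skipper());
    assert(phrase_level && first == last);

    return !char_level && phrase_level;
}
```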
<p>
Here are the basic rules for parsing:
</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem">
The parser returns <code class="computeroutput"><span class="keyword">true</span></code>
if successful, <code class="computeroutput"><span class="keyword">false</span></code> otherwise.
</li>
<li class="listitem">
If successful, <code class="computeroutput"><span class="identifier">first</span></code>
is incremented N times, where N is the number of characters
parsed. N can be zero, which is an empty (epsilon) match.
</li>
<li class="listitem">
If successful, the parsed attribute is assigned to <span class="emphasis"><em>attr</em></span>.
</li>
<li class="listitem">
If unsuccessful, <code class="computeroutput"><span class="identifier">first</span></code>
is reset to its position before entering the parser function. <span class="emphasis"><em>attr</em></span>
is untouched.
</li>
</ul></div>
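<p>
The four rules above can be sketched with a toy parser. The
<code class="computeroutput">two_digits</code> type and <code class="computeroutput">rules_demo</code> function are hypothetical
and not part of Spirit; the context and skipper arguments are left out to keep
the example short.
</p>

```cpp
#include <cassert>
#include <string>

// Hypothetical toy parser (not part of Spirit): matches exactly two
// decimal digits and yields their value.
struct two_digits
{
    template <typename Iterator>
    bool parse(Iterator& first, Iterator const& last, int& attr) const
    {
        Iterator it = first;   // work on a copy so failure leaves first alone
        int value = 0;
        for (int i = 0; i < 2; ++i)
        {
            if (it == last || *it < '0' || *it > '9')
                return false;  // first and attr are both untouched
            value = value * 10 + (*it - '0');
            ++it;
        }
        attr = value;          // assign the attribute only on success
        first = it;            // commit: first has advanced by N == 2
        return true;
    }
};

// Usage: a failed parse that had already seen one digit still leaves
// the iterator and the attribute exactly as they were.
inline bool rules_demo()
{
    std::string const bad = "4x";
    std::string::const_iterator first = bad.begin();
    int attr = -1;
    bool const ok = two_digits().parse(first, bad.end(), attr);
    assert(!ok);
    return first == bad.begin() && attr == -1;
}
```

<p>
Working on a local copy of the iterator is one simple way to honor the
"reset on failure" rule; it is a design choice of this sketch, not a claim
about how any particular Spirit parser is written.
</p>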
<div class="variablelist">
<p class="title"><b>info what(context)</b></p>
<dl>
<dt><span class="term"><code class="computeroutput"><span class="identifier">context</span></code></span></dt>
<dd><p>
enclosing rule context (can be <code class="computeroutput"><span class="identifier">unused_type</span></code>)
</p></dd>
</dl>
</div>
<p>
The <span class="emphasis"><em>what</em></span> function should be obvious. It provides some
information about <span class="quote">&#8220;<span class="quote">what</span>&#8221;</span> the parser is. It is used as a debugging
aid, for example.
</p>
<div class="variablelist">
<p class="title"><b>P::template attribute&lt;context, iterator&gt;::type</b></p>
<dl>
<dt><span class="term"><code class="computeroutput"><span class="identifier">P</span></code></span></dt>
<dd><p>
a parser type
</p></dd>
<dt><span class="term"><code class="computeroutput"><span class="identifier">context</span></code></span></dt>
<dd><p>
A context type (can be unused_type)
</p></dd>
<dt><span class="term"><code class="computeroutput"><span class="identifier">iterator</span></code></span></dt>
<dd><p>
An iterator type (can be unused_type)
</p></dd>
</dl>
</div>
<p>
The <span class="emphasis"><em>attribute</em></span> metafunction returns the expected attribute
type of the parser. In some cases, this is context dependent.
</p>
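<p>
As a rough sketch of how such a nested metafunction is queried, consider the
following. The parser types <code class="computeroutput">int_like_parser</code> and
<code class="computeroutput">repeat_like_parser</code> are hypothetical, and this <code class="computeroutput">attribute_of</code>
is a simplified stand-in for the trait of the same name in Spirit, written
here with C++11 <code class="computeroutput">static_assert</code> for checking.
</p>

```cpp
#include <type_traits>
#include <vector>

struct unused_type {};  // hypothetical stand-in for Spirit's unused_type

// A primitive whose attribute does not depend on context or iterator.
struct int_like_parser
{
    template <typename Context, typename Iterator>
    struct attribute
    {
        typedef int type;
    };
};

// A composite whose attribute is computed from its subject's attribute.
template <typename Subject>
struct repeat_like_parser
{
    template <typename Context, typename Iterator>
    struct attribute
    {
        typedef std::vector<
            typename Subject::template attribute<Context, Iterator>::type
        > type;
    };
};

// Simplified helper: asks a parser type P for its attribute type.
template <typename P
  , typename Context = unused_type, typename Iterator = unused_type>
struct attribute_of
{
    typedef typename P::template attribute<Context, Iterator>::type type;
};

static_assert(
    std::is_same<attribute_of<int_like_parser>::type, int>::value,
    "primitive attribute is int");

static_assert(
    std::is_same<
        attribute_of<repeat_like_parser<int_like_parser> >::type,
        std::vector<int>
    >::value,
    "composite attribute is a vector of the subject's attribute");
```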
<p>
In this section, we will dissect two parser types:
</p>
<div class="variablelist">
<p class="title"><b>Parsers</b></p>
<dl>
<dt><span class="term"><code class="computeroutput"><a class="link" href="../../qi/reference/parser_concepts/primitiveparser.html" title="PrimitiveParser"><code class="computeroutput"><span class="identifier">PrimitiveParser</span></code></a></code></span></dt>
<dd><p>
A parser for primitive data (e.g. integer parsing).
</p></dd>
<dt><span class="term"><code class="computeroutput"><a class="link" href="../../qi/reference/parser_concepts/unaryparser.html" title="UnaryParser"><code class="computeroutput"><span class="identifier">UnaryParser</span></code></a></code></span></dt>
<dd><p>
A parser that has a single subject (e.g. Kleene star).
</p></dd>
</dl>
</div>
<a name="spirit.advanced.indepth.parsers_indepth.primitive_parsers"></a><h6>
<a name="id1154680"></a>
<a class="link" href="parsers_indepth.html#spirit.advanced.indepth.parsers_indepth.primitive_parsers">Primitive
Parsers</a>
</h6>
<p>
For our dissection study, we will use a <a href="http://boost-spirit.com" target="_top">Spirit</a>
primitive, the <code class="computeroutput"><span class="identifier">int_parser</span></code>
in the boost::spirit::qi namespace.
</p>
<p>
</p>
<pre class="programlisting"><span class="keyword">template</span> <span class="special">&lt;</span>
<span class="keyword">typename</span> <span class="identifier">T</span>
<span class="special">,</span> <span class="keyword">unsigned</span> <span class="identifier">Radix</span> <span class="special">=</span> <span class="number">10</span>
<span class="special">,</span> <span class="keyword">unsigned</span> <span class="identifier">MinDigits</span> <span class="special">=</span> <span class="number">1</span>
<span class="special">,</span> <span class="keyword">int</span> <span class="identifier">MaxDigits</span> <span class="special">=</span> <span class="special">-</span><span class="number">1</span><span class="special">&gt;</span>
<span class="keyword">struct</span> <span class="identifier">int_parser_impl</span>
<span class="special">:</span> <span class="identifier">primitive_parser</span><span class="special">&lt;</span><span class="identifier">int_parser_impl</span><span class="special">&lt;</span><span class="identifier">T</span><span class="special">,</span> <span class="identifier">Radix</span><span class="special">,</span> <span class="identifier">MinDigits</span><span class="special">,</span> <span class="identifier">MaxDigits</span><span class="special">&gt;</span> <span class="special">&gt;</span>
<span class="special">{</span>
<span class="comment">// check template parameter 'Radix' for validity
</span> <span class="identifier">BOOST_SPIRIT_ASSERT_MSG</span><span class="special">(</span>
<span class="identifier">Radix</span> <span class="special">==</span> <span class="number">2</span> <span class="special">||</span> <span class="identifier">Radix</span> <span class="special">==</span> <span class="number">8</span> <span class="special">||</span> <span class="identifier">Radix</span> <span class="special">==</span> <span class="number">10</span> <span class="special">||</span> <span class="identifier">Radix</span> <span class="special">==</span> <span class="number">16</span><span class="special">,</span>
<span class="identifier">not_supported_radix</span><span class="special">,</span> <span class="special">());</span>
<span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Context</span><span class="special">,</span> <span class="keyword">typename</span> <span class="identifier">Iterator</span><span class="special">&gt;</span>
<span class="keyword">struct</span> <span class="identifier">attribute</span>
<span class="special">{</span>
<span class="keyword">typedef</span> <span class="identifier">T</span> <span class="identifier">type</span><span class="special">;</span>
<span class="special">};</span>
<span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Iterator</span><span class="special">,</span> <span class="keyword">typename</span> <span class="identifier">Context</span>
<span class="special">,</span> <span class="keyword">typename</span> <span class="identifier">Skipper</span><span class="special">,</span> <span class="keyword">typename</span> <span class="identifier">Attribute</span><span class="special">&gt;</span>
<span class="keyword">bool</span> <span class="identifier">parse</span><span class="special">(</span><span class="identifier">Iterator</span><span class="special">&amp;</span> <span class="identifier">first</span><span class="special">,</span> <span class="identifier">Iterator</span> <span class="keyword">const</span><span class="special">&amp;</span> <span class="identifier">last</span>
<span class="special">,</span> <span class="identifier">Context</span><span class="special">&amp;</span> <span class="comment">/*context*/</span><span class="special">,</span> <span class="identifier">Skipper</span> <span class="keyword">const</span><span class="special">&amp;</span> <span class="identifier">skipper</span>
<span class="special">,</span> <span class="identifier">Attribute</span><span class="special">&amp;</span> <span class="identifier">attr</span><span class="special">)</span> <span class="keyword">const</span>
<span class="special">{</span>
<span class="identifier">qi</span><span class="special">::</span><span class="identifier">skip_over</span><span class="special">(</span><span class="identifier">first</span><span class="special">,</span> <span class="identifier">last</span><span class="special">,</span> <span class="identifier">skipper</span><span class="special">);</span>
<span class="keyword">return</span> <span class="identifier">extract_int</span><span class="special">&lt;</span><span class="identifier">T</span><span class="special">,</span> <span class="identifier">Radix</span><span class="special">,</span> <span class="identifier">MinDigits</span><span class="special">,</span> <span class="identifier">MaxDigits</span><span class="special">&gt;</span>
<span class="special">::</span><span class="identifier">call</span><span class="special">(</span><span class="identifier">first</span><span class="special">,</span> <span class="identifier">last</span><span class="special">,</span> <span class="identifier">attr</span><span class="special">);</span>
<span class="special">}</span>
<span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Context</span><span class="special">&gt;</span>
<span class="identifier">info</span> <span class="identifier">what</span><span class="special">(</span><span class="identifier">Context</span><span class="special">&amp;</span> <span class="comment">/*context*/</span><span class="special">)</span> <span class="keyword">const</span>
<span class="special">{</span>
<span class="keyword">return</span> <span class="identifier">info</span><span class="special">(</span><span class="string">"integer"</span><span class="special">);</span>
<span class="special">}</span>
<span class="special">};</span>
</pre>
<p>
</p>
<p>
The <code class="computeroutput"><span class="identifier">int_parser</span></code> is derived
from a <code class="computeroutput"><a class="link" href="../../qi/reference/parser_concepts/primitiveparser.html" title="PrimitiveParser"><code class="computeroutput"><span class="identifier">PrimitiveParser</span></code></a><span class="special">&lt;</span><span class="identifier">Derived</span><span class="special">&gt;</span></code>,
which in turn derives from <code class="computeroutput"><span class="identifier">parser</span><span class="special">&lt;</span><span class="identifier">Derived</span><span class="special">&gt;</span></code>. Therefore, it supports the following
requirements:
</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem">
The <code class="computeroutput"><span class="identifier">parse</span></code> member function
</li>
<li class="listitem">
The <code class="computeroutput"><span class="identifier">what</span></code> member function
</li>
<li class="listitem">
The nested <code class="computeroutput"><span class="identifier">attribute</span></code>
metafunction
</li>
</ul></div>
<p>
<span class="emphasis"><em>parse</em></span> is the main entry point. For primitive parsers,
the first thing to do is call:
</p>
<p>
</p>
<pre class="programlisting"><span class="identifier">qi</span><span class="special">::</span><span class="identifier">skip_over</span><span class="special">(</span><span class="identifier">first</span><span class="special">,</span> <span class="identifier">last</span><span class="special">,</span> <span class="identifier">skipper</span><span class="special">);</span>
</pre>
<p>
</p>
<p>
to do a pre-skip. After pre-skipping, the parser proceeds to do its thing.
The actual parsing code is placed in <code class="computeroutput"><span class="identifier">extract_int</span><span class="special">&lt;</span><span class="identifier">T</span><span class="special">,</span> <span class="identifier">Radix</span><span class="special">,</span> <span class="identifier">MinDigits</span><span class="special">,</span> <span class="identifier">MaxDigits</span><span class="special">&gt;::</span><span class="identifier">call</span><span class="special">(</span><span class="identifier">first</span><span class="special">,</span> <span class="identifier">last</span><span class="special">,</span> <span class="identifier">attr</span><span class="special">);</span></code>
</p>
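<p>
To give a feel for what an extraction policy with a static <code class="computeroutput">call</code>
function might look like, here is a heavily simplified sketch. The name
<code class="computeroutput">extract_digits</code> is hypothetical; unlike the real
<code class="computeroutput">extract_int</code>, this version assumes unsigned input only, supports
radices from 2 to 10, and does no overflow checking.
</p>

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-in for an extraction policy:
// no sign handling, radix limited to 2..10, no overflow checking.
template <typename T
  , unsigned Radix = 10, unsigned MinDigits = 1, int MaxDigits = -1>
struct extract_digits
{
    static_assert(Radix >= 2 && Radix <= 10, "sketch supports radix 2..10");

    template <typename Iterator>
    static bool call(Iterator& first, Iterator const& last, T& attr)
    {
        Iterator it = first;
        T value = T();
        unsigned count = 0;

        // Accumulate digits until a non-digit, the end of input, or
        // MaxDigits (if not -1) is reached.
        while (it != last
            && (MaxDigits < 0 || count < static_cast<unsigned>(MaxDigits))
            && *it >= '0' && static_cast<unsigned>(*it - '0') < Radix)
        {
            value = static_cast<T>(
                value * Radix + static_cast<unsigned>(*it - '0'));
            ++it;
            ++count;
        }

        if (count < MinDigits)
            return false;      // too few digits: first and attr unchanged

        attr = value;
        first = it;
        return true;
    }
};

// Usage: binary extraction stops at the first character that is not 0 or 1.
inline int extract_demo()
{
    std::string const input = "1012";
    std::string::const_iterator first = input.begin();
    int value = 0;
    bool const ok = extract_digits<int, 2>::call(first, input.end(), value);
    assert(ok && first - input.begin() == 3);
    return value;
}
```

<p>
Keeping the digit loop in a separate policy leaves the parser's
<code class="computeroutput">parse</code> member with only two jobs: pre-skip, then delegate.
</p>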
<p>
This simple no-frills protocol is one of the reasons why <a href="http://boost-spirit.com" target="_top">Spirit</a>
is fast. If you know the internals of <a href="../../../../../../../libs/spirit/classic/index.html" target="_top"><span class="emphasis"><em>Spirit.Classic</em></span></a>
and perhaps even wrote some parsers with it, this simple <a href="http://boost-spirit.com" target="_top">Spirit</a>
mechanism is a joy to work with. There are no scanners and all that crap.
</p>
<p>
The <span class="emphasis"><em>what</em></span> function just tells us that it is an integer
parser. Simple.
</p>
<p>
The <span class="emphasis"><em>attribute</em></span> metafunction returns the T template
parameter. We associate the <code class="computeroutput"><span class="identifier">int_parser</span></code>
to some placeholders for <code class="computeroutput"><span class="identifier">short_</span></code>,
<code class="computeroutput"><span class="identifier">int_</span></code>, <code class="computeroutput"><span class="identifier">long_</span></code>
and <code class="computeroutput"><span class="identifier">long_long</span></code> types. But,
first, we enable these placeholders in namespace boost::spirit:
</p>
<p>
</p>
<pre class="programlisting"><span class="keyword">template</span> <span class="special">&lt;&gt;</span>
<span class="keyword">struct</span> <span class="identifier">use_terminal</span><span class="special">&lt;</span><span class="identifier">qi</span><span class="special">::</span><span class="identifier">domain</span><span class="special">,</span> <span class="identifier">tag</span><span class="special">::</span><span class="identifier">short_</span><span class="special">&gt;</span> <span class="comment">// enables short_
</span> <span class="special">:</span> <span class="identifier">mpl</span><span class="special">::</span><span class="identifier">true_</span> <span class="special">{};</span>
</pre>
<p>
</p>
<p>
</p>
<pre class="programlisting"><span class="keyword">template</span> <span class="special">&lt;&gt;</span>
<span class="keyword">struct</span> <span class="identifier">use_terminal</span><span class="special">&lt;</span><span class="identifier">qi</span><span class="special">::</span><span class="identifier">domain</span><span class="special">,</span> <span class="identifier">tag</span><span class="special">::</span><span class="identifier">int_</span><span class="special">&gt;</span> <span class="comment">// enables int_
</span> <span class="special">:</span> <span class="identifier">mpl</span><span class="special">::</span><span class="identifier">true_</span> <span class="special">{};</span>
</pre>
<p>
</p>
<p>
</p>
<pre class="programlisting"><span class="keyword">template</span> <span class="special">&lt;&gt;</span>
<span class="keyword">struct</span> <span class="identifier">use_terminal</span><span class="special">&lt;</span><span class="identifier">qi</span><span class="special">::</span><span class="identifier">domain</span><span class="special">,</span> <span class="identifier">tag</span><span class="special">::</span><span class="identifier">long_</span><span class="special">&gt;</span> <span class="comment">// enables long_
</span> <span class="special">:</span> <span class="identifier">mpl</span><span class="special">::</span><span class="identifier">true_</span> <span class="special">{};</span>
</pre>
<p>
</p>
<p>
</p>
<pre class="programlisting"><span class="keyword">template</span> <span class="special">&lt;&gt;</span>
<span class="keyword">struct</span> <span class="identifier">use_terminal</span><span class="special">&lt;</span><span class="identifier">qi</span><span class="special">::</span><span class="identifier">domain</span><span class="special">,</span> <span class="identifier">tag</span><span class="special">::</span><span class="identifier">long_long</span><span class="special">&gt;</span> <span class="comment">// enables long_long
</span> <span class="special">:</span> <span class="identifier">mpl</span><span class="special">::</span><span class="identifier">true_</span> <span class="special">{};</span>
</pre>
<p>
</p>
<p>
Notice that <code class="computeroutput"><span class="identifier">int_parser</span></code>
is placed in the namespace boost::spirit::qi while these <span class="emphasis"><em>enablers</em></span>
are in namespace boost::spirit. The reason is that these placeholders are
shared by other <a href="http://boost-spirit.com" target="_top">Spirit</a> <span class="emphasis"><em>domains</em></span>.
<span class="emphasis"><em>Spirit.Qi</em></span>, the parser, is one domain. <span class="emphasis"><em>Spirit.Karma</em></span>,
the generator, is another domain. Other parser technologies may be developed
and placed in yet another domain. Yet, all these can potentially share
the same placeholders for interoperability. The interpretation of these
placeholders is domain-specific.
</p>
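<p>
The idea of a shared placeholder that is enabled separately per domain can be
sketched as a trait keyed on a (domain, tag) pair. Everything below lives in a
hypothetical <code class="computeroutput">sketch</code> namespace, and <code class="computeroutput">std::true_type</code> /
<code class="computeroutput">std::false_type</code> are used in place of the <code class="computeroutput">mpl</code> types
as an assumption to keep the example free of dependencies.
</p>

```cpp
#include <type_traits>

namespace sketch
{
    // Placeholder tags are shared by every domain.
    namespace tag
    {
        struct int_ {};
        struct double_ {};
    }

    // Each technology has its own domain type.
    namespace qi    { struct domain {}; }
    namespace karma { struct domain {}; }

    // Primary template: a placeholder is disabled unless a domain opts in.
    template <typename Domain, typename Tag>
    struct use_terminal : std::false_type {};

    // Both domains enable int_; what it means is left to each domain.
    template <>
    struct use_terminal<qi::domain, tag::int_> : std::true_type {};

    template <>
    struct use_terminal<karma::domain, tag::int_> : std::true_type {};
}

static_assert(
    sketch::use_terminal<sketch::qi::domain, sketch::tag::int_>::value,
    "int_ is enabled for the parser domain");

static_assert(
    !sketch::use_terminal<sketch::qi::domain, sketch::tag::double_>::value,
    "double_ was never enabled in this sketch");
```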
<p>
Now that we have enabled the placeholders, we have to write generators for them.
These are the make_xxx function objects (in namespace boost::spirit::qi):
</p>
<p>
</p>
<pre class="programlisting"><span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">T</span><span class="special">&gt;</span>
<span class="keyword">struct</span> <span class="identifier">make_int</span>
<span class="special">{</span>
<span class="keyword">typedef</span> <span class="identifier">int_parser_impl</span><span class="special">&lt;</span><span class="identifier">T</span><span class="special">&gt;</span> <span class="identifier">result_type</span><span class="special">;</span>
<span class="identifier">result_type</span> <span class="keyword">operator</span><span class="special">()(</span><span class="identifier">unused_type</span><span class="special">,</span> <span class="identifier">unused_type</span><span class="special">)</span> <span class="keyword">const</span>
<span class="special">{</span>
<span class="keyword">return</span> <span class="identifier">result_type</span><span class="special">();</span>
<span class="special">}</span>
<span class="special">};</span>
</pre>
<p>
</p>
<p>
This one above is our main generator. It's a simple function object with
2 (unused) arguments. These arguments are
</p>
<div class="orderedlist"><ol class="orderedlist" type="1">
<li class="listitem">
The actual terminal value obtained by proto. In this case, either a
short_, int_, long_ or long_long.
We don't care about this.
</li>
<li class="listitem">
Modifiers. We also don't care about this. This allows directives such
as <code class="computeroutput"><span class="identifier">no_case</span><span class="special">[</span><span class="identifier">p</span><span class="special">]</span></code>
to pass information to inner parser nodes. We'll see how that works
later.
</li>
</ol></div>
<p>
Now:
</p>
<p>
</p>
<pre class="programlisting"><span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Modifiers</span><span class="special">&gt;</span>
<span class="keyword">struct</span> <span class="identifier">make_primitive</span><span class="special">&lt;</span><span class="identifier">tag</span><span class="special">::</span><span class="identifier">short_</span><span class="special">,</span> <span class="identifier">Modifiers</span><span class="special">&gt;</span> <span class="special">:</span> <span class="identifier">make_int</span><span class="special">&lt;</span><span class="keyword">short</span><span class="special">&gt;</span> <span class="special">{};</span>
</pre>
<p>
</p>
<p>
</p>
<pre class="programlisting"><span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Modifiers</span><span class="special">&gt;</span>
<span class="keyword">struct</span> <span class="identifier">make_primitive</span><span class="special">&lt;</span><span class="identifier">tag</span><span class="special">::</span><span class="identifier">int_</span><span class="special">,</span> <span class="identifier">Modifiers</span><span class="special">&gt;</span> <span class="special">:</span> <span class="identifier">make_int</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span> <span class="special">{};</span>
</pre>
<p>
</p>
<p>
</p>
<pre class="programlisting"><span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Modifiers</span><span class="special">&gt;</span>
<span class="keyword">struct</span> <span class="identifier">make_primitive</span><span class="special">&lt;</span><span class="identifier">tag</span><span class="special">::</span><span class="identifier">long_</span><span class="special">,</span> <span class="identifier">Modifiers</span><span class="special">&gt;</span> <span class="special">:</span> <span class="identifier">make_int</span><span class="special">&lt;</span><span class="keyword">long</span><span class="special">&gt;</span> <span class="special">{};</span>
</pre>
<p>
</p>
<p>
</p>
<pre class="programlisting"><span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Modifiers</span><span class="special">&gt;</span>
<span class="keyword">struct</span> <span class="identifier">make_primitive</span><span class="special">&lt;</span><span class="identifier">tag</span><span class="special">::</span><span class="identifier">long_long</span><span class="special">,</span> <span class="identifier">Modifiers</span><span class="special">&gt;</span>
<span class="special">:</span> <span class="identifier">make_int</span><span class="special">&lt;</span><span class="identifier">boost</span><span class="special">::</span><span class="identifier">long_long_type</span><span class="special">&gt;</span> <span class="special">{};</span>
</pre>
<p>
</p>
<p>
These specialize <code class="computeroutput"><span class="identifier">qi</span><span class="special">::</span><span class="identifier">make_primitive</span></code> for specific tags. They
all inherit from <code class="computeroutput"><span class="identifier">make_int</span></code>
which does the actual work.
</p>
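<p>
The dispatch from tag to parser type can be sketched end to end as follows.
The <code class="computeroutput">sketch</code> namespace, the empty <code class="computeroutput">int_parser_impl</code> shell and
the <code class="computeroutput">compile</code> helper are hypothetical; in particular,
<code class="computeroutput">compile</code> only stands in for the far more general step that turns a
placeholder into a parser object.
</p>

```cpp
#include <type_traits>

namespace sketch
{
    struct unused_type {};

    namespace tag
    {
        struct short_ {};
        struct int_ {};
    }

    // Empty shell standing in for the real parser implementation.
    template <typename T>
    struct int_parser_impl
    {
        typedef T attribute_type;
    };

    // The generator: ignores the terminal value and the modifiers.
    template <typename T>
    struct make_int
    {
        typedef int_parser_impl<T> result_type;
        result_type operator()(unused_type, unused_type) const
        {
            return result_type();
        }
    };

    // Primary template left undefined, so an unknown tag fails to compile.
    template <typename Tag, typename Modifiers>
    struct make_primitive;

    template <typename Modifiers>
    struct make_primitive<tag::short_, Modifiers> : make_int<short> {};

    template <typename Modifiers>
    struct make_primitive<tag::int_, Modifiers> : make_int<int> {};

    // Hypothetical helper: maps a placeholder tag to a parser object.
    template <typename Tag>
    typename make_primitive<Tag, unused_type>::result_type compile(Tag)
    {
        return make_primitive<Tag, unused_type>()(unused_type(), unused_type());
    }
}

static_assert(
    std::is_same<
        decltype(sketch::compile(sketch::tag::int_())),
        sketch::int_parser_impl<int>
    >::value,
    "tag::int_ maps to int_parser_impl<int>");
```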
<a name="spirit.advanced.indepth.parsers_indepth.composite_parsers"></a><h6>
<a name="id1156785"></a>
<a class="link" href="parsers_indepth.html#spirit.advanced.indepth.parsers_indepth.composite_parsers">Composite
Parsers</a>
</h6>
<p>
Let me present the Kleene star (also in namespace boost::spirit::qi):
</p>
<p>
</p>
<pre class="programlisting"><span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Subject</span><span class="special">&gt;</span>
<span class="keyword">struct</span> <span class="identifier">kleene</span> <span class="special">:</span> <span class="identifier">unary_parser</span><span class="special">&lt;</span><span class="identifier">kleene</span><span class="special">&lt;</span><span class="identifier">Subject</span><span class="special">&gt;</span> <span class="special">&gt;</span>
<span class="special">{</span>
<span class="keyword">typedef</span> <span class="identifier">Subject</span> <span class="identifier">subject_type</span><span class="special">;</span>
<span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Context</span><span class="special">,</span> <span class="keyword">typename</span> <span class="identifier">Iterator</span><span class="special">&gt;</span>
<span class="keyword">struct</span> <span class="identifier">attribute</span>
<span class="special">{</span>
<span class="comment">// Build a std::vector from the subject's attribute. Note
</span> <span class="comment">// that build_std_vector may return unused_type if the
</span> <span class="comment">// subject's attribute is an unused_type.
</span> <span class="keyword">typedef</span> <span class="keyword">typename</span>
<span class="identifier">traits</span><span class="special">::</span><span class="identifier">build_std_vector</span><span class="special">&lt;</span>
<span class="keyword">typename</span> <span class="identifier">traits</span><span class="special">::</span>
<span class="identifier">attribute_of</span><span class="special">&lt;</span><span class="identifier">Subject</span><span class="special">,</span> <span class="identifier">Context</span><span class="special">,</span> <span class="identifier">Iterator</span><span class="special">&gt;::</span><span class="identifier">type</span>
<span class="special">&gt;::</span><span class="identifier">type</span>
<span class="identifier">type</span><span class="special">;</span>
<span class="special">};</span>
<span class="identifier">kleene</span><span class="special">(</span><span class="identifier">Subject</span> <span class="keyword">const</span><span class="special">&amp;</span> <span class="identifier">subject</span><span class="special">)</span>
<span class="special">:</span> <span class="identifier">subject</span><span class="special">(</span><span class="identifier">subject</span><span class="special">)</span> <span class="special">{}</span>
<span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Iterator</span><span class="special">,</span> <span class="keyword">typename</span> <span class="identifier">Context</span>
<span class="special">,</span> <span class="keyword">typename</span> <span class="identifier">Skipper</span><span class="special">,</span> <span class="keyword">typename</span> <span class="identifier">Attribute</span><span class="special">&gt;</span>
<span class="keyword">bool</span> <span class="identifier">parse</span><span class="special">(</span><span class="identifier">Iterator</span><span class="special">&amp;</span> <span class="identifier">first</span><span class="special">,</span> <span class="identifier">Iterator</span> <span class="keyword">const</span><span class="special">&amp;</span> <span class="identifier">last</span>
<span class="special">,</span> <span class="identifier">Context</span><span class="special">&amp;</span> <span class="identifier">context</span><span class="special">,</span> <span class="identifier">Skipper</span> <span class="keyword">const</span><span class="special">&amp;</span> <span class="identifier">skipper</span>
<span class="special">,</span> <span class="identifier">Attribute</span><span class="special">&amp;</span> <span class="identifier">attr</span><span class="special">)</span> <span class="keyword">const</span>
<span class="special">{</span>
<span class="comment">// create a local value if Attribute is not unused_type
</span> <span class="keyword">typedef</span> <span class="keyword">typename</span> <span class="identifier">traits</span><span class="special">::</span><span class="identifier">container_value</span><span class="special">&lt;</span><span class="identifier">Attribute</span><span class="special">&gt;::</span><span class="identifier">type</span>
<span class="identifier">value_type</span><span class="special">;</span>
<span class="identifier">value_type</span> <span class="identifier">val</span> <span class="special">=</span> <span class="identifier">value_type</span><span class="special">();</span>
<span class="comment">// Repeat while subject parses ok
</span> <span class="identifier">Iterator</span> <span class="identifier">save</span> <span class="special">=</span> <span class="identifier">first</span><span class="special">;</span>
<span class="keyword">while</span> <span class="special">(</span><span class="identifier">subject</span><span class="special">.</span><span class="identifier">parse</span><span class="special">(</span><span class="identifier">save</span><span class="special">,</span> <span class="identifier">last</span><span class="special">,</span> <span class="identifier">context</span><span class="special">,</span> <span class="identifier">skipper</span><span class="special">,</span> <span class="identifier">val</span><span class="special">)</span> <span class="special">&amp;&amp;</span>
<span class="identifier">traits</span><span class="special">::</span><span class="identifier">push_back</span><span class="special">(</span><span class="identifier">attr</span><span class="special">,</span> <span class="identifier">val</span><span class="special">))</span> <span class="comment">// push the parsed value into our attribute
</span> <span class="special">{</span>
<span class="identifier">first</span> <span class="special">=</span> <span class="identifier">save</span><span class="special">;</span>
<span class="identifier">traits</span><span class="special">::</span><span class="identifier">clear</span><span class="special">(</span><span class="identifier">val</span><span class="special">);</span>
<span class="special">}</span>
<span class="keyword">return</span> <span class="keyword">true</span><span class="special">;</span>
<span class="special">}</span>
<span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Context</span><span class="special">&gt;</span>
<span class="identifier">info</span> <span class="identifier">what</span><span class="special">(</span><span class="identifier">Context</span><span class="special">&amp;</span> <span class="identifier">context</span><span class="special">)</span> <span class="keyword">const</span>
<span class="special">{</span>
<span class="keyword">return</span> <span class="identifier">info</span><span class="special">(</span><span class="string">"kleene"</span><span class="special">,</span> <span class="identifier">subject</span><span class="special">.</span><span class="identifier">what</span><span class="special">(</span><span class="identifier">context</span><span class="special">));</span>
<span class="special">}</span>
<span class="identifier">Subject</span> <span class="identifier">subject</span><span class="special">;</span>
<span class="special">};</span>
</pre>
<p>
</p>
<p>
It looks similar in form to its primitive cousin, the <code class="computeroutput"><span class="identifier">int_parser</span></code>.
Again, it has the same basic ingredients required of <code class="computeroutput"><span class="identifier">Derived</span></code>:
</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem">
The nested attribute metafunction
</li>
<li class="listitem">
The parse member function
</li>
<li class="listitem">
The what member function
</li>
</ul></div>
<p>
kleene is a composite parser. It is a parser that composes another parser,
its <span class="quote">&#8220;<span class="quote">subject</span>&#8221;</span>. It is a <code class="computeroutput"><a class="link" href="../../qi/reference/parser_concepts/unaryparser.html" title="UnaryParser"><code class="computeroutput"><span class="identifier">UnaryParser</span></code></a></code> and subclasses
from it. Like <code class="computeroutput"><a class="link" href="../../qi/reference/parser_concepts/primitiveparser.html" title="PrimitiveParser"><code class="computeroutput"><span class="identifier">PrimitiveParser</span></code></a></code>, <code class="computeroutput"><a class="link" href="../../qi/reference/parser_concepts/unaryparser.html" title="UnaryParser"><code class="computeroutput"><span class="identifier">UnaryParser</span></code></a><span class="special">&lt;</span><span class="identifier">Derived</span><span class="special">&gt;</span></code>
derives from <code class="computeroutput"><span class="identifier">parser</span><span class="special">&lt;</span><span class="identifier">Derived</span><span class="special">&gt;</span></code>.
</p>
<p>
unary_parser&lt;Derived&gt; has these expression requirements on Derived:
</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem">
p.subject -&gt; subject parser ( <span class="emphasis"><em>p</em></span> is a <a class="link" href="../../qi/reference/parser_concepts/unaryparser.html" title="UnaryParser"><code class="computeroutput"><span class="identifier">UnaryParser</span></code></a> parser.)
</li>
<li class="listitem">
P::subject_type -&gt; subject parser type ( <span class="emphasis"><em>P</em></span>
is a <a class="link" href="../../qi/reference/parser_concepts/unaryparser.html" title="UnaryParser"><code class="computeroutput"><span class="identifier">UnaryParser</span></code></a> type.)
</li>
</ul></div>
<p>
<span class="emphasis"><em>parse</em></span> is the main parser entry point. Since this is
not a primitive parser, we do not need to call <code class="computeroutput"><span class="identifier">qi</span><span class="special">::</span><span class="identifier">skip</span><span class="special">(</span><span class="identifier">first</span><span class="special">,</span> <span class="identifier">last</span><span class="special">,</span> <span class="identifier">skipper</span><span class="special">)</span></code>. The <span class="emphasis"><em>subject</em></span>, if
it is a primitive, will do the pre-skip. If it is another composite
parser, it will eventually call a primitive parser somewhere down the line
which will do the pre-skip. This makes it a lot more efficient than <a href="../../../../../../../libs/spirit/classic/index.html" target="_top"><span class="emphasis"><em>Spirit.Classic</em></span></a>.
<a href="../../../../../../../libs/spirit/classic/index.html" target="_top"><span class="emphasis"><em>Spirit.Classic</em></span></a>
puts the skipping business into the so-called "scanner" which
blindly attempts a pre-skip every time we increment the iterator.
</p>
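<p>
As a rough illustration of this division of labor, here is a small self-contained
sketch that does not use Spirit. The names <code class="computeroutput">pre_skip</code>,
<code class="computeroutput">digit_parser</code>, <code class="computeroutput">toy_kleene</code> and
<code class="computeroutput">parse_digits</code> are hypothetical, and the skipper is hard-coded
to whitespace as a simplifying assumption. Only the primitive skips; the composite simply
delegates.
</p>

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-in for the pre-skip step: advance past whitespace.
template <typename Iterator>
void pre_skip(Iterator& first, Iterator const& last)
{
    while (first != last && std::isspace(static_cast<unsigned char>(*first)))
        ++first;
}

// A toy primitive: pre-skips, then matches exactly one decimal digit.
struct digit_parser
{
    template <typename Iterator>
    bool parse(Iterator& first, Iterator const& last, int& attr) const
    {
        pre_skip(first, last);              // only primitives skip
        if (first == last || !std::isdigit(static_cast<unsigned char>(*first)))
            return false;
        attr = *first - '0';
        ++first;
        return true;
    }
};

// A toy composite: it never touches the skipper itself.
template <typename Subject>
struct toy_kleene
{
    Subject subject;

    template <typename Iterator>
    bool parse(Iterator& first, Iterator const& last, std::vector<int>& attr) const
    {
        int val = 0;
        while (subject.parse(first, last, val))
            attr.push_back(val);
        return true;                        // zero matches is still a success
    }
};

// Collects every whitespace-separated digit at the start of the input.
inline std::vector<int> parse_digits(std::string const& input)
{
    std::vector<int> result;
    std::string::const_iterator first = input.begin();
    toy_kleene<digit_parser> p = { digit_parser() };
    p.parse(first, input.end(), result);
    return result;
}
```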
<p>
What is the <span class="emphasis"><em>attribute</em></span> of the kleene? In general, it
is a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="identifier">T</span><span class="special">&gt;</span></code>
where <code class="computeroutput"><span class="identifier">T</span></code> is the attribute
of the subject. There is a special case though. If <code class="computeroutput"><span class="identifier">T</span></code>
is an <code class="computeroutput"><span class="identifier">unused_type</span></code>, then
the attribute of kleene is also <code class="computeroutput"><span class="identifier">unused_type</span></code>.
<code class="computeroutput"><span class="identifier">traits</span><span class="special">::</span><span class="identifier">build_std_vector</span></code> takes care of that minor
detail.
</p>
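<p>
The rule just described can be sketched as a metafunction with one specialization.
This is an illustration only: <code class="computeroutput">unused_type</code> below is a local
stand-in, and <code class="computeroutput">build_std_vector_sketch</code> is a hypothetical name,
not the actual definition of <code class="computeroutput">traits::build_std_vector</code>.
</p>

```cpp
#include <type_traits>
#include <vector>

// Local stand-in for spirit::unused_type (an assumption for this sketch).
struct unused_type {};

// General case: the attribute of *p is std::vector<T>.
template <typename T>
struct build_std_vector_sketch
{
    typedef std::vector<T> type;
};

// Special case: an unused subject attribute stays unused.
template <>
struct build_std_vector_sketch<unused_type>
{
    typedef unused_type type;
};

// Compile-time checks of both cases.
static_assert(
    std::is_same<build_std_vector_sketch<int>::type, std::vector<int> >::value,
    "a subject attribute T yields std::vector<T>");
static_assert(
    std::is_same<build_std_vector_sketch<unused_type>::type, unused_type>::value,
    "an unused subject attribute yields unused_type");
```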
<p>
So, let's parse. First, we need to provide a local attribute for the
subject:
</p>
<p>
</p>
<pre class="programlisting"><span class="keyword">typename</span> <span class="identifier">traits</span><span class="special">::</span><span class="identifier">attribute_of</span><span class="special">&lt;</span><span class="identifier">Subject</span><span class="special">,</span> <span class="identifier">Context</span><span class="special">&gt;::</span><span class="identifier">type</span> <span class="identifier">val</span><span class="special">;</span>
</pre>
<p>
</p>
<p>
<code class="computeroutput"><span class="identifier">traits</span><span class="special">::</span><span class="identifier">attribute_of</span><span class="special">&lt;</span><span class="identifier">Subject</span><span class="special">,</span> <span class="identifier">Context</span><span class="special">&gt;</span></code>
simply calls the subject's <code class="computeroutput"><span class="keyword">struct</span>
<span class="identifier">attribute</span><span class="special">&lt;</span><span class="identifier">Context</span><span class="special">&gt;</span></code>
nested metafunction.
</p>
<p>
<span class="emphasis"><em>val</em></span> starts out default initialized. This val is the
one we'll pass to the subject's parse function.
</p>
<p>
The kleene repeats indefinitely while the subject parser is successful.
On each successful parse, we <code class="computeroutput"><span class="identifier">push_back</span></code>
the parsed attribute to the kleene's attribute, which is expected to be,
at the very least, compatible with a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span></code>.
In other words, although we say that we want our attribute to be a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span></code>, we try to be more lenient than
that. The caller of kleene's parse may pass a different attribute type.
As long as it is also a conforming STL container with <code class="computeroutput"><span class="identifier">push_back</span></code>, we are ok. Here is the kleene
loop:
</p>
<p>
</p>
<pre class="programlisting"><span class="keyword">while</span> <span class="special">(</span><span class="identifier">subject</span><span class="special">.</span><span class="identifier">parse</span><span class="special">(</span><span class="identifier">first</span><span class="special">,</span> <span class="identifier">last</span><span class="special">,</span> <span class="identifier">context</span><span class="special">,</span> <span class="identifier">skipper</span><span class="special">,</span> <span class="identifier">val</span><span class="special">))</span>
<span class="special">{</span>
<span class="comment">// push the parsed value into our attribute
</span> <span class="identifier">traits</span><span class="special">::</span><span class="identifier">push_back</span><span class="special">(</span><span class="identifier">attr</span><span class="special">,</span> <span class="identifier">val</span><span class="special">);</span>
<span class="identifier">traits</span><span class="special">::</span><span class="identifier">clear</span><span class="special">(</span><span class="identifier">val</span><span class="special">);</span>
<span class="special">}</span>
<span class="keyword">return</span> <span class="keyword">true</span><span class="special">;</span>
</pre>
<p>
Take note that we didn't call attr.push_back(val). Instead, we called a
Spirit provided function:
</p>
<p>
</p>
<pre class="programlisting"><span class="identifier">traits</span><span class="special">::</span><span class="identifier">push_back</span><span class="special">(</span><span class="identifier">attr</span><span class="special">,</span> <span class="identifier">val</span><span class="special">);</span>
</pre>
<p>
</p>
<p>
This is a recurring pattern. The reason why we do it this way is because
attr <span class="bold"><strong>can</strong></span> be <code class="computeroutput"><span class="identifier">unused_type</span></code>.
<code class="computeroutput"><span class="identifier">traits</span><span class="special">::</span><span class="identifier">push_back</span></code> takes care of that detail.
The overload for unused_type is a no-op. Now, you can imagine why <a href="http://boost-spirit.com" target="_top">Spirit</a> is fast! The parsers are so
simple and the generated code is as efficient as a hand-rolled loop. All
these parser compositions and recursive parse invocations are extensively
inlined by a modern C++ compiler. In the end, you get a tight loop when
you use the kleene. No more excess baggage. If the attribute is unused,
then there is no code generated for that. That's how <a href="http://boost-spirit.com" target="_top">Spirit</a>
is designed.
</p>
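<p>
A minimal sketch of this overloading pattern follows. It is not Spirit's implementation:
<code class="computeroutput">unused_type</code> is a local stand-in, and
<code class="computeroutput">push_back_sketch</code> and <code class="computeroutput">store_three</code>
are hypothetical names. It shows one overload forwarding to any container with
<code class="computeroutput">push_back</code> and a second overload that does nothing when the
attribute is unused.
</p>

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Local stand-in for spirit::unused_type (an assumption for this sketch).
struct unused_type {};

// Generic case: forward to the container's own push_back.
template <typename Container, typename T>
bool push_back_sketch(Container& c, T const& val)
{
    c.push_back(val);
    return true;
}

// Unused attribute: there is nothing to store, so this is a no-op.
// Partial ordering selects this overload when the first argument is unused_type.
template <typename T>
bool push_back_sketch(unused_type&, T const&)
{
    return true;
}

// Calls push_back_sketch three times and reports how many calls succeeded.
// The same code compiles for a vector, a list, or an unused attribute.
template <typename Attribute>
std::size_t store_three(Attribute& attr)
{
    std::size_t calls = 0;
    for (int i = 1; i <= 3; ++i)
        if (push_back_sketch(attr, i))
            ++calls;
    return calls;
}
```

<p>
The caller never branches on the attribute type; overload resolution picks the right
behavior at compile time, which is what lets the compiler drop the storage code
entirely in the unused case.
</p>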
<p>
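The <span class="emphasis"><em>what</em></span> function simply wraps the output of the subject's
<span class="emphasis"><em>what</em></span> in an <code class="computeroutput"><span class="identifier">info</span></code> labeled <code class="computeroutput"><span class="string">"kleene"</span></code>.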
</p>
<p>
Ok, now, like the <code class="computeroutput"><span class="identifier">int_parser</span></code>,
we have to hook our parser to the <span class="underline">qi</span>
engine. Here's how we do it:
</p>
<p>
First, we enable the prefix star operator. In proto, it's called the "dereference":
</p>
<p>
</p>
<pre class="programlisting"><span class="keyword">template</span> <span class="special">&lt;&gt;</span>
<span class="keyword">struct</span> <span class="identifier">use_operator</span><span class="special">&lt;</span><span class="identifier">qi</span><span class="special">::</span><span class="identifier">domain</span><span class="special">,</span> <span class="identifier">proto</span><span class="special">::</span><span class="identifier">tag</span><span class="special">::</span><span class="identifier">dereference</span><span class="special">&gt;</span> <span class="comment">// enables *p
</span> <span class="special">:</span> <span class="identifier">mpl</span><span class="special">::</span><span class="identifier">true_</span> <span class="special">{};</span>
</pre>
<p>
</p>
<p>
This is done in namespace <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">spirit</span></code>
like its friend, the <code class="computeroutput"><span class="identifier">use_terminal</span></code>
specialization for our <code class="computeroutput"><span class="identifier">int_parser</span></code>.
Obviously, we use <span class="emphasis"><em>use_operator</em></span> to enable the dereference
for the qi::domain.
</p>
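<p>
The opt-in mechanism can be sketched with standard type traits. Everything here is a
hypothetical stand-in: <code class="computeroutput">qi_domain_sketch</code>,
<code class="computeroutput">dereference_tag</code>, <code class="computeroutput">negate_tag</code> and
<code class="computeroutput">use_operator_sketch</code> are illustrative names, and the sketch
assumes that an operator is disabled unless a specialization says otherwise.
</p>

```cpp
#include <type_traits>

struct qi_domain_sketch {};   // stand-in for qi::domain
struct dereference_tag {};    // stand-in for proto::tag::dereference
struct negate_tag {};         // an operator this sketch leaves disabled

// Primary template: every operator is disabled by default (assumed).
template <typename Domain, typename Tag>
struct use_operator_sketch : std::false_type {};

// Opt in: enable *p for our domain.
template <>
struct use_operator_sketch<qi_domain_sketch, dereference_tag>
    : std::true_type {};

static_assert(use_operator_sketch<qi_domain_sketch, dereference_tag>::value,
    "*p is enabled for the sketch domain");
static_assert(!use_operator_sketch<qi_domain_sketch, negate_tag>::value,
    "operators without a specialization stay disabled");
```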
<p>
Then, we need to write our generator (in namespace qi):
</p>
<p>
</p>
<pre class="programlisting"><span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Elements</span><span class="special">,</span> <span class="keyword">typename</span> <span class="identifier">Modifiers</span><span class="special">&gt;</span>
<span class="keyword">struct</span> <span class="identifier">make_composite</span><span class="special">&lt;</span><span class="identifier">proto</span><span class="special">::</span><span class="identifier">tag</span><span class="special">::</span><span class="identifier">dereference</span><span class="special">,</span> <span class="identifier">Elements</span><span class="special">,</span> <span class="identifier">Modifiers</span><span class="special">&gt;</span>
<span class="special">:</span> <span class="identifier">make_unary_composite</span><span class="special">&lt;</span><span class="identifier">Elements</span><span class="special">,</span> <span class="identifier">kleene</span><span class="special">&gt;</span>
<span class="special">{};</span>
</pre>
<p>
</p>
<p>
This essentially says: for all expressions of the form <code class="computeroutput"><span class="special">*</span><span class="identifier">p</span></code>, build a kleene parser. Elements
is a <a href="../../../../../../../libs/fusion/doc/html/index.html" target="_top">Boost.Fusion</a>
sequence. For the kleene, which is a unary operator, expect only one element
in the sequence. That element is the subject of the kleene.
</p>
<p>
We still don't care about the Modifiers. We'll see what the modifiers are
all about when we get to directives in depth.
</p>
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright &#169; 2001-2010 Joel de Guzman, Hartmut Kaiser<p>
Distributed under the Boost Software License, Version 1.0. (See accompanying
file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
</p>
</div></td>
</tr></table>
</body>
</html>