| <html> |
| <head> |
| <title>Subrules</title> |
| <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> |
| <link rel="stylesheet" href="theme/style.css" type="text/css"> |
| </head> |
| |
| <body> |
| <table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2"> |
| <tr> |
| <td width="10"> |
| </td> |
| <td width="85%"> |
| <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>Subrules</b></font> |
| </td> |
| <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td> |
| </tr> |
| </table> |
| <br> |
| <table border="0"> |
| <tr> |
| <td width="10"></td> |
| <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td> |
| <td width="30"><a href="grammar.html"><img src="theme/l_arr.gif" border="0"></a></td> |
| <td width="30"><a href="semantic_actions.html"><img src="theme/r_arr.gif" border="0"></a></td> |
| </tr> |
| </table> |
| <p>Spirit is implemented using expression templates. This is a very powerful technique. |
| Along with its power comes some complications. We almost take for granted that |
| when we write <tt>i | j >> k</tt> where <tt>i</tt>, <tt>j</tt> and <tt>k</tt> |
| are all integers the result is still an integer. Yet, with expression templates, |
| the same expression <tt>i | j >> k</tt> where <tt>i</tt>, <tt>j</tt> and |
| <tt>k</tt> are of type <tt>T</tt>, the result is a complex composite type [see |
| <a href="basic_concepts.html">Basic Concepts</a>]. Spirit expressions, which |
| are combinations of primitives and composites yield an infinite set of new types. |
| One problem is that C++ offers no easy facility to deduce the type of an arbitrarily |
| complex expression that yields a complex type. Thus, while it is easy to write:</p> |
| <pre><code><font color="#000000"><span class=identifier> </span><span class=keyword>int </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>i </span><span class=special>| </span><span class=identifier>j </span><span class=special>>> </span><span class=identifier>k</span><span class=special>; </span><span class=comment>// where i, j, and k are ints</span></font></code></pre> |
| <p>Expression templates yield an endless supply of types. Without the <a href="rule.html">rule</a>, |
| there is no easy way to do this in C++ if <tt>i</tt>, <tt>j</tt> and <tt>k</tt> |
| are Spirit parsers:</p> |
| <pre><code><font color="#000000"><span class=comment> </span><span class=special><</span><span class=identifier>what_type???</span><span class=special>> </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>i </span><span class=special>| </span><span class=identifier>j </span><span class=special>>> </span><span class=identifier>k</span><span class=special>; </span><span class=comment>// where i, j, and k are Spirit parsers</span></font></code></pre> |
| <p>If <tt>i</tt>, <tt>j</tt> and <tt>k</tt> are all <tt>chlit<></tt> objects, |
| the type that we want is:</p> |
| <pre><code><font color="#000000"><span class=comment> </span><span class=keyword>typedef |
| </span><span class=identifier>alternative</span><span class=special>< |
| </span><span class=identifier>chlit</span><span class=special><></span><span class=comment> // i |
| </span><span class=special>,</span> <span class=identifier>sequence</span><span class=special>< |
| </span><span class=identifier>chlit</span><span class=special><> </span><span class=comment>// j |
| </span><span class=special> ,</span><span class=comment> </span><span class=identifier>chlit</span><span class=special><> </span><span class=comment>// k |
| </span><span class=special>> |
| > |
| </span><span class=identifier>rule_t</span><span class=special>; |
| |
| </span><span class=identifier>rule_t r </span><span class=special>= </span><span class=identifier>i </span><span class=special>| </span><span class=identifier>j </span><span class=special>>> </span><span class=identifier>k</span><span class=special>; </span><span class=comment>// where i, j, and k are chlit<> objects</span></font></code></pre> |
| <p>We deliberately formatted the type declaration nicely to make it understandable. |
| Try that with a more complex expression. While it can be done, explicitly spelling |
| out the type of a Spirit expression template is tedious and error prone. The |
| right hand side (rhs) has to mirror the type of the left hand side (lhs). (<img src="theme/lens.gif" width="15" height="16"> |
| Yet, if you still wish to do it, see this <a href="techniques.html#no_rules">link</a> |
| for a technique). </p> |
| <table width="80%" border="0" align="center"> |
| <tr> |
| <td class="note_box"><p><img src="theme/lens.gif" width="15" height="16"><b> |
| typeof and auto</b> <br> |
| <br> |
| Some compilers already support the <tt>typeof</tt> keyword. This can be |
| used to free us from having to explicitly type the type (pun intentional). |
| Using the <tt>typeof</tt>, we can rewrite the Spirit expression above |
| as:<br> |
| <br> |
| <span class="keyword"><code>typeof</code><code></code></span><code><span class=special>(</span><span class=identifier>i |
| </span><span class=special>| </span><span class=identifier>j </span><span class=special>>> |
| </span><span class=identifier>k</span><span class=special>) </span><span class=identifier>r |
| </span><span class=special>= </span><span class=identifier>i </span><span class=special>| |
| </span><span class=identifier>j </span><span class=special>>> </span><span class=identifier>k</span><span class=special>;</span></code><br> |
| <br> |
| While this is better than having to explicitly declare a complex type, |
| it is redundant, error prone and still an eye sore. The expression is |
| typed twice. The only way to simplify this is to introduce a macro (See |
| this <a href="techniques.html#typeof">link</a> for more information).<br> |
| <br> |
| <a href="http://www.boost-consulting.com">David Abrahams</a> proposed |
| in comp.std.c++ to reuse the <tt>auto</tt> keyword for type deduced variables. |
| This has been extensibly discussed in <a href="http://www.boost.org">boost.org</a>. Example: |
| <br> |
| <br> |
| <span class=keyword><code>auto </code></span><code><span class=identifier>r |
| </span><span class=special>= </span><span class=identifier>i </span><span class=special>| |
| </span><span class=identifier>j </span><span class=special>>> </span><span class=identifier>k</span><span class=special>;</span></code><br> |
| <br> |
| Once such a C++ extension is accepted into the standard, this would be |
| a neat solution and a nice fit for our purpose. It's not a complete solution |
| though since there are still situations where we do not know the rhs beforehand; |
| for instance when pre-declaring cyclic dependent rules.</p> |
| </td> |
| </tr> |
| </table> |
| <p>Fortunately, rules come to the rescue. Rules can capture the type of the expression |
| assigned to it. Thus:</p> |
| <pre><code><font color="#000000"> <span class=identifier>rule</span><span class=special><> </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>i </span><span class=special>| </span><span class=identifier>j </span><span class=special>>> </span><span class=identifier>k</span><span class=special>; </span><span class=comment>// where i, j, and k are chlit<> objects</span></font></code></pre> |
| <p>It might not be apparent but behind the scenes, plain rules are actually implemented |
| using a pointer to a runtime polymorphic abstract class that holds the dynamic |
| type of the parser assigned to it. When a Spirit expression is assigned to a |
| rule, its type is encapsulated in a concrete subclass of the abstract class. |
| A virtual parse function delegates the parsing to the encapsulated object.</p> |
| <p>Rules have drawbacks though:</p> |
| <p><img src="theme/bullet.gif" width="12" height="12"> It is coupled to a specific |
| scanner type. The rule is tied to a specific scanner [see <a href="faq.html#scanner_business">The |
| Scanner Business</a>].<br> |
| <img src="theme/bullet.gif" width="12" height="12"> The rule's parse member |
| function has a virtual function call overhead that cannot be inlined.</p> |
| <h2>Static rules: subrules</h2> |
| <p>The subrule is a fully static version of the rule. The subrule does not have |
| the drawbacks listed above. </p> |
| <p><img src="theme/bullet.gif" width="12" height="12"> The subrule is not tied |
| to a specific scanner so just about any scanner type may be used<br> |
| <img src="theme/bullet.gif" width="12" height="12"> The subrule also allows |
| aggressive inlining since there are no virtual function calls</p> |
| <pre><code><font color="#000000"><span class=identifier> </span><span class=keyword>template</span><span class=special><</span><span class=keyword>int </span></font><span class="identifier">ID</span><font color="#000000"><span class=special>, </span><span class=keyword>typename </span><span class=identifier>ContextT </span><span class=special>= </span><span class=identifier>parser_context</span><span class=special><></span> <span class=special>> |
| </span><span class=keyword>class </span><span class=identifier>subrule</span><span class=special>;</span></font></code></pre> |
| <p>The first template parameter gives the subrule an identification tag. Like |
| the <a href="rule.html">rule</a>, there is a ContextT template parameter that |
| defaults to <code><tt>parser_context</tt></code>. You need not be concerned |
| at all with the <tt>ContextT</tt> template parameter unless you wish to tweak |
| the low level behavior of the subrule. Detailed information on the <tt>ContextT</tt> |
| template parameter is provided <a href="indepth_the_parser_context.html">elsewhere</a>. |
| </p> |
| <p>Presented above is the public API. There may actually be more template parameters |
| after <tt>ContextT</tt>. Everything after the <tt>ContextT</tt> parameter should |
| not be of concern to the client and are strictly for internal use only.</p> |
| <p>Apart from a few minor differences, the subrule follows the usage and syntax |
| of the rule closely. Here's the calculator grammar using subrules:</p> |
| <pre><code><font color="#000000"><span class=comment> </span><span class=keyword>struct </span><span class=identifier>calculator </span><span class=special>: </span><span class=keyword>public </span><span class=identifier>grammar</span><span class=special><</span><span class=identifier>calculator</span><span class=special>> |
| </span><span class=special>{ |
| </span><span class=keyword>template </span><span class=special><</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>> |
| </span><span class=keyword>struct </span><span class=identifier>definition |
| </span><span class=special>{ |
| </span><span class=identifier>definition</span><span class=special>(</span><span class=identifier>calculator </span><span class=keyword>const</span><span class=special>& </span><span class=identifier>self</span><span class=special>) |
| </span><span class=special>{ |
| </span><span class=identifier>first </span><span class=special>= |
| </span><span class=special>( |
| </span><span class=identifier>expression </span><span class=special>= </span><span class=identifier>term </span><span class=special>>> </span><span class=special>*((</span><span class=literal>'+' </span><span class=special>>> </span><span class=identifier>term</span><span class=special>) </span><span class=special>| </span><span class=special>(</span><span class=literal>'-' </span><span class=special>>> </span><span class=identifier>term</span><span class=special>)), |
| </span><span class=identifier>term </span><span class=special>= </span><span class=identifier>factor </span><span class=special>>> </span><span class=special>*((</span><span class=literal>'*' </span><span class=special>>> </span><span class=identifier>factor</span><span class=special>) </span><span class=special>| </span><span class=special>(</span><span class=literal>'/' </span><span class=special>>> </span><span class=identifier>factor</span><span class=special>)), |
| </span><span class=identifier>factor </span><span class=special>= </span><span class=identifier>integer </span><span class=special>| </span><span class=identifier>group</span><span class=special>, |
| </span><span class=identifier>group </span><span class=special>= </span><span class=literal>'(' </span><span class=special>>> </span><span class=identifier>expression </span><span class=special>>> </span><span class=literal>')' |
| </span><span class=special>); |
| </span><span class=special>} |
| |
| </span><span class=identifier>subrule</span><span class=special><</span><span class=number>0</span><span class=special>> </span><span class=identifier>expression</span><span class=special>; |
| </span><span class=identifier>subrule</span><span class=special><</span><span class=number>1</span><span class=special>> </span><span class=identifier>term</span><span class=special>; |
| </span><span class=identifier>subrule</span><span class=special><</span><span class=number>2</span><span class=special>> </span><span class=identifier>factor</span><span class=special>; |
| </span><span class=identifier>subrule</span><span class=special><</span><span class=number>3</span><span class=special>> </span><span class=identifier>group</span><span class=special>; |
| |
| </span><span class=identifier>rule</span><span class=special><</span><span class=identifier>ScannerT</span><span class=special>> </span><span class=identifier>first</span><span class=special>; |
| </span><span class=identifier>rule</span><span class=special><</span><span class=identifier>ScannerT</span><span class=special>> </span><span class=keyword>const</span><span class=special>& |
| </span><span class=identifier>start</span><span class=special>() </span><span class=keyword>const </span><span class=special>{ </span><span class=keyword>return </span><span class=identifier>first</span><span class=special>; </span><span class=special>} |
| </span><span class=special>}; |
| </span><span class=special>};</span></font></code></pre> |
| <p><img src="theme/lens.gif" width="15" height="16"> A fully working example with |
| <a href="semantic_actions.html">semantic actions</a> can be <a href="../example/fundamental/subrule_calc.cpp">viewed |
| here</a>. This is part of the Spirit distribution. </p> |
| <table border="0" align="left"> |
| <tr> |
| <td width="199"><img src="theme/subrule1.png" width="234" height="224"></td> |
| <td width="2"></td> |
| </tr> |
| </table> |
| <p>The subrule as an efficient version of the rule. Compiler optimizations such |
| as aggressive inlining help reduce the code size and increase performance significantly. |
| </p> |
| <p>The subrule is not a panacea however. Subrules push the C++ compiler hard to |
| its knees. For example, current compilers have a limit on recursion depth that |
| may not be exceeded. Don't even think about writing a full pascal grammar using |
| subrules alone. A grammar using subrules is a single C++ expression. Current |
| C++ compilers cannot handle very complex expressions very well. Finally, a plain |
| rule is still needed to act as place holder for subrules.</p> |
| <p>The code above is a good example of the recommended way to use subrules. Notice |
| the hierarchy. We have a grammar that encapsulates the whole calculator. The |
| start rule is a plain rule that holds the set of subrules. The subrules in turn |
| defines the actual details of the grammar.</p> |
| <table width="80%" border="0" align="center"> |
| <tr> |
| <td class="note_box"><img src="theme/lens.gif" width="15" height="16"><b> |
| Template instantiation depth</b> <br> <br> |
| Spirit pushes the C++ compiler hard. Current C++ compilers cannot handle |
| very complex heavily nested expressions very well. One restricting factor |
| is the typical compiler's limit on template recursion depth. Some, but not |
| all, compilers allow this limit to be configured.<br> |
| <br> |
| g++'s maximum can be set using a compiler flag: -ftemplate-depth. Set this |
| appropriately if you have a relatively complex grammar.<br> |
| <br> |
| Microsoft Visual C++ can take greater than 1000 for both template class |
| and function instantiation depths. However, the linker chokes with deep |
| template function instantiation unless inline recursion depth is set using |
| these pragmas:<br> |
| <br> |
| <span class="preprocessor">#pragma</span> inline_depth<span class="special">(</span>255<span class="special">)</span><br> |
| <span class="preprocessor">#pragma</span> inline_recursion<span class="special">(</span>on<span class="special">)<br> |
| <br> |
| </span>Perhaps this limitations no longer applies to more current versions |
| of these compilers. Be sure to check your compiler documentation.</td> |
| </tr> |
| </table> |
| <p>This setup gives a good balance. The subrules do all the work. Each grammar |
| will have only one rule: <tt>first</tt>. The rule is used just to hold the subrules |
| and make them visible to the grammar. </p> |
| <h3>The subrule definition</h3> |
| <p>Like the rule, the expression after assignment operator <tt>=</tt> defines |
| the subrule:</p> |
| <pre> <span class=identifier>identifier </span><span class=special>= </span><span class=identifier>expression</span></pre> |
| <p>Unlike rules, subrules may be defined only once. Redefining a subrule is illegal |
| and will result to a compile time assertion.</p> |
| <h3>Separators [ , ]</h3> |
| <p>While rules are terminated by the semicollon <tt>';'</tt>. Subrules are not |
| terminated but are separated by the comma: <tt>','</tt>. Like Pascal statements, |
| the last subrule in a group may not have a trailing comma.</p> |
| <pre><span class=identifier> </span><span class=identifier>a </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'a'</span><span class=special>), |
| </span><span class=identifier>b </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'b'</span><span class=special>), |
| </span><span class=identifier>c </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'c'</span><span class=special>), </span><span class=comment>// BAD, trailing comma</span><code><font color="#000000"><font color="#800000"><i></i></font></font></code><code><font color="#000000"><font color="#800000"><i></i></font></font><i></i></code></pre> |
| <p> |
| <pre><code><span class=comment> </span><span class=identifier>a </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'a'</span><span class=special>), |
| </span><span class=identifier>b </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'b'</span><span class=special>), |
| </span><span class=identifier>c </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'c'</span><span class=special>) </span><span class=comment>// OK</span></code></pre> |
| <h3> The start subrule</h3> |
| <p>Unlike rules, parsing proceeds from the start subrule. The first (topmost) |
| subrule in a group of subrules is called the <b>start subrule</b>. In our example |
| above, <tt>expression</tt> is the start subrule. When a group of subrules is |
| called forth, the start subrule <tt>expression</tt> is called first.</p> |
| <h3>IDs</h3> |
| <p>Each subrule has a corresponding ID; an integral constant that uniquely specifies |
| the subrule. Our example above has four subrules. They are declared as:</p> |
| <pre><code><span class=comment> </span><span class=identifier>subrule</span><span class=special><</span><span class=number>0</span><span class=special>> </span><span class=identifier>expression</span><span class=special>; |
| </span><span class=identifier>subrule</span><span class=special><</span><span class=number>1</span><span class=special>> </span><span class=identifier>term</span><span class=special>; |
| </span><span class=identifier>subrule</span><span class=special><</span><span class=number>2</span><span class=special>> </span><span class=identifier>factor</span><span class=special>; |
| </span><span class=identifier>subrule</span><span class=special><</span><span class=number>3</span><span class=special>> </span><span class=identifier>group</span><span class=special>;</span></code></pre> |
| <h3> Aliases</h3> |
| <p>It is possible to have subrules with similar IDs. A subrule with a similar |
| ID to will be an alias of the other. Both subrules may be used interchangeably.</p> |
| <pre><code><span class=special> </span><span class=identifier>subrule</span><span class=special><</span><span class=number>0</span><span class=special>> </span><span class=identifier>a</span><span class=special>; |
| </span><span class=identifier>subrule</span><span class=special><</span><span class=number>0</span><span class=special>> </span><span class=identifier>alias</span><span class=special>; </span><span class=comment>// alias of a</span></code></pre> |
| <h3>Groups: scope and nesting</h3> |
| <p>The scope of a subrule and its definition is the enclosing group, typically |
| (and by convention) enclosed inside the parentheses. IDs outside a scope are |
| not directly visible. Inner subrule groups can be nested by enclosing each sub-group |
| inside another set of parentheses. Each group is unique and acts independently. |
| Consequently, while it may not be advisable to do so, a subrule in a group may |
| share the same ID as a subrule in another group since both groups are independent |
| of each other.</p> |
| <pre><code><span class=comment> </span><span class=identifier>subrule</span><span class=special><</span><span class=number>0</span><span class=special>> </span><span class=identifier>a</span><span class=special>; |
| </span><span class=identifier>subrule</span><span class=special><</span><span class=number>1</span><span class=special>> </span><span class=identifier>b</span><span class=special>; |
| </span><span class=identifier>subrule</span><span class=special><</span><span class=number>0</span><span class=special>> </span><span class=identifier>c</span><span class=special>; |
| </span><span class=identifier>subrule</span><span class=special><</span><span class=number>1</span><span class=special>> </span><span class=identifier>d</span><span class=special>; |
| |
| </span><span class=special>( </span><span class=comment>// outer subrule group, scope of a and b |
| </span><span class=identifier>a </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'a'</span><span class=special>), |
| </span><span class=identifier>b </span><span class=special>= |
| </span><span class=special>( </span><span class=comment>// inner subrule group, scope of b and c |
| </span><span class=identifier>c </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'c'</span><span class=special>), |
| </span><span class=identifier>d </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'d'</span><span class=special>) |
| </span><span class=special>) |
| </span><span class=special>)</span></code></pre> |
| <p>Subrule IDs need to be unique only within a group. A grammar is an implicit |
| group. Furthermore, even subrules in a grammar may have the same IDs without |
| clashing if they are inside a group. Subrules may be explicitly grouped using |
| the parentheses. Parenthesized groups have unique scopes. In the code above, |
| the outer subrule group defines the subrules <tt>a</tt> and <tt>b</tt> while |
| the inner subrule group defines the subrules <tt>c</tt> and <tt>d</tt>. Notice |
| that the definition of <tt>b</tt> is the inner subrule.</p> |
| <table border="0"> |
| <tr> |
| <td width="10"></td> |
| <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td> |
| <td width="30"><a href="grammar.html"><img src="theme/l_arr.gif" border="0"></a></td> |
| <td width="30"><a href="semantic_actions.html"><img src="theme/r_arr.gif" border="0"></a></td> |
| </tr> |
| </table> |
| <br> |
| <hr size="1"> |
| <p class="copyright">Copyright © 1998-2003 Joel de Guzman<br> |
| <br> |
| <font size="2">Use, modification and distribution is subject to the Boost Software |
| License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at |
| http://www.boost.org/LICENSE_1_0.txt)</font></p> |
| <p> </p> |
| <p><code><font color="#000000"><font color="#0000ff"></font></font></code></p> |
| </body> |
| </html> |