| <html> |
| <head> |
| <title>The Grammar</title> |
| <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> |
| <link rel="stylesheet" href="theme/style.css" type="text/css"> |
| </head> |
| |
| <body> |
| <table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2"> |
| <tr> |
| <td width="10"> |
| </td> |
| <td width="85%"> |
| <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>The Grammar</b></font> |
| </td> |
| <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td> |
| </tr> |
| </table> |
| <br> |
| <table border="0"> |
| <tr> |
| <td width="10"></td> |
| <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td> |
| <td width="30"><a href="scanner.html"><img src="theme/l_arr.gif" border="0"></a></td> |
| <td width="30"><a href="subrules.html"><img src="theme/r_arr.gif" border="0"></a></td> |
| </tr> |
| </table> |
| <p>The <b>grammar</b> encapsulates a set of rules. The <tt>grammar</tt> class |
| is a protocol base class. It is essentially an interface contract. The <tt>grammar</tt> |
| is a template class that is parameterized by its derived class, <tt>DerivedT</tt>, |
| and its context, <tt>ContextT</tt>. The template parameter ContextT defaults |
| to <tt>parser_context</tt>, a predefined context. </p> |
| <p>You need not be concerned at all with the ContextT template parameter unless |
| you wish to tweak the low level behavior of the grammar. Detailed information |
| on the ContextT template parameter is provided <a href="indepth_the_parser_context.html">elsewhere</a>. |
| The <tt>grammar</tt> relies on the template parameter <tt>DerivedT</tt>, a grammar subclass, |
| to define the actual rules.</p> |
| <p>Presented below is the public API. There may actually be more template parameters |
| after <tt>ContextT</tt>. Everything after the <tt>ContextT</tt> parameter is |
| of no concern to the client and is strictly for internal use.</p> |
| <pre><code><font color="#000000"><span class=identifier> </span><span class=keyword>template</span><span class=special>< |
| </span><span class=keyword>typename </span><span class=identifier>DerivedT</span><span class=special>, |
| </span><span class=keyword>typename </span><span class=identifier>ContextT </span><span class=special>= </span><span class=identifier>parser_context</span><span class=special><</span><span class=special>> > |
| </span><span class=keyword>struct </span><span class=identifier>grammar</span><span class=special>;</span></font></code></pre> |
| <h2>Grammar definition</h2> |
| <p>A concrete sub-class inheriting from <tt>grammar</tt> is expected to have a |
| nested template class (or struct) named <tt>definition</tt>:</p> |
| <blockquote> |
| <p><img src="theme/bullet.gif" width="13" height="13"> It is a nested template |
| class with a typename <tt>ScannerT</tt> parameter.<br> |
| <img src="theme/bullet.gif" width="13" height="13"> Its constructor defines |
| the grammar rules.<br> |
| <img src="theme/bullet.gif" width="13" height="13"> Its constructor is passed |
| in a reference to the actual grammar <tt>self</tt>.<br> |
| <img src="theme/bullet.gif" width="13" height="13"> It has a member function |
| named <tt>start</tt> that returns a reference to the start <tt>rule</tt>.</p> |
| </blockquote> |
| <h2>Grammar skeleton</h2> |
| <pre><code><font color="#000000"><span class=special> </span><span class=keyword>struct </span><span class=identifier>my_grammar </span><span class=special>: </span><span class=keyword>public </span><span class=identifier>grammar</span><span class=special><</span><span class=identifier>my_grammar</span><span class=special>> |
| </span><span class=special>{ |
| </span><span class=keyword>template </span><span class=special><</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>> |
| </span><span class=keyword>struct </span><span class=identifier>definition |
| </span><span class=special>{ |
| </span><span class=identifier>rule</span><span class=special><</span><span class=identifier>ScannerT</span><span class=special>> </span><span class=identifier>r</span><span class=special>; |
| </span><span class=identifier>definition</span><span class=special>(</span><span class=identifier>my_grammar </span><span class=keyword>const</span><span class=special>& </span><span class=identifier>self</span><span class=special>) </span><span class=special>{ </span><span class=identifier>r </span><span class=special>= </span><span class=comment>/*..define here..*/</span><span class=special>; </span><span class=special>} |
| </span><span class=identifier>rule</span><span class=special><</span><span class=identifier>ScannerT</span><span class=special>> </span><span class=keyword>const</span><span class=special>& </span><span class=identifier>start</span><span class=special>() </span><span class=keyword>const </span><span class=special>{ </span><span class=keyword>return </span><span class=identifier>r</span><span class=special>; </span><span class=special>} |
| </span><span class=special>}; |
| </span><span class=special>};</span></font></code></pre> |
| <p>Decoupling the scanner type from the rules that form a grammar allows the grammar |
| to be used in different contexts possibly using different scanners. We do not |
| care what scanner we are dealing with. The user-defined <tt>my_grammar</tt> |
| can be used with <b>any</b> type of scanner. Unlike the rule, the grammar is |
| not tied to a specific scanner type. See <a href="faq.html#scanner_business">"Scanner |
| Business"</a> for why this matters and for more on the |
| scanner-rule coupling problem.</p> |
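<p>The decoupling described above can be pictured with a small, library-free model. The sketch below is not Spirit's implementation: <tt>toy_grammar</tt>, <tt>toy_rule</tt>, <tt>plain_scanner</tt>, <tt>skipping_scanner</tt> and <tt>digits</tt> are hypothetical names invented for illustration. It only shows how a base class parameterized by its derived class can instantiate the nested <tt>definition</tt> template for whichever scanner type it is handed.</p>

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Toy stand-in for a rule: any callable that consumes input from a scanner.
template <typename ScannerT>
using toy_rule = std::function<bool(ScannerT&)>;

// Toy stand-in for grammar<DerivedT>. This is NOT Spirit's implementation.
template <typename DerivedT>
struct toy_grammar
{
    // The scanner type is only known here, at the point of use, so the
    // nested definition is instantiated per scanner type.
    template <typename ScannerT>
    bool parse(ScannerT& scan) const
    {
        typename DerivedT::template definition<ScannerT>
            def(static_cast<DerivedT const&>(*this));
        return def.start()(scan);
    }
};

// Hypothetical scanner that reads every character as-is.
struct plain_scanner
{
    explicit plain_scanner(std::string t) : text(std::move(t)), pos(0) {}
    bool at_end() { return pos >= text.size(); }
    char peek() { return text[pos]; }
    void advance() { ++pos; }
    std::string text;
    std::size_t pos;
};

// Hypothetical scanner that silently skips spaces.
struct skipping_scanner
{
    explicit skipping_scanner(std::string t) : text(std::move(t)), pos(0) {}
    void skip() { while (pos < text.size() && text[pos] == ' ') ++pos; }
    bool at_end() { skip(); return pos >= text.size(); }
    char peek() { skip(); return text[pos]; }
    void advance() { skip(); ++pos; }
    std::string text;
    std::size_t pos;
};

// A grammar written once, usable with either scanner.
struct digits : toy_grammar<digits>
{
    template <typename ScannerT>
    struct definition
    {
        toy_rule<ScannerT> r;

        definition(digits const& /*self*/)
        {
            // Matches one or more decimal digits up to the end of input.
            r = [](ScannerT& scan) {
                std::size_t n = 0;
                while (!scan.at_end() && scan.peek() >= '0' && scan.peek() <= '9') {
                    scan.advance();
                    ++n;
                }
                return n > 0 && scan.at_end();
            };
        }

        toy_rule<ScannerT> const& start() const { return r; }
    };
};
```

<p>In this model, the same <tt>digits</tt> grammar accepts "123" through a <tt>plain_scanner</tt> and "1 2 3" through a <tt>skipping_scanner</tt>, without naming a scanner type anywhere in the grammar itself.</p>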
| <h2>Instantiating and using my_grammar</h2> |
| <p>Our grammar above may be instantiated and put into action:</p> |
| <pre><code><font color="#000000"><span class=special> </span><span class=identifier>my_grammar </span><span class=identifier>g</span><span class=special>; |
| |
| </span><span class=keyword>if </span><span class=special>(</span><span class=identifier>parse</span><span class=special>(</span><span class=identifier>first</span><span class=special>, </span><span class=identifier>last</span><span class=special>, </span><span class=identifier>g</span><span class=special>, </span><span class=identifier>space_p</span><span class=special>).</span><span class=identifier>full</span><span class=special>) |
| </span><span class=identifier>cout </span><span class=special><< </span><span class=string>"parsing succeeded\n"</span><span class=special>; |
| </span><span class=keyword>else |
| </span><span class=identifier>cout </span><span class=special><< </span><span class=string>"parsing failed\n"</span><span class=special>;</span></font></code></pre> |
| <p><tt>my_grammar</tt> <b>IS-A </b>parser and can be used anywhere a parser is |
| expected, even referenced by another rule:</p> |
| <pre><code><font color="#000000"><span class=special> </span><span class=identifier>rule</span><span class=special><> </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>g </span><span class=special>>> </span><span class=identifier>str_p</span><span class=special>(</span><span class=string>"cool huh?"</span><span class=special>);</span></font></code></pre> |
| <table width="80%" border="0" align="center"> |
| <tr> |
| <td class="note_box"><img src="theme/alert.gif" width="16" height="16"> <b>Referencing |
| grammars<br> |
| </b><br> |
| Like the rule, the grammar is held by reference when it is placed on |
| the right-hand side of an EBNF expression. It is the responsibility of the |
| client to ensure that the referenced grammar stays in scope and is not |
| destroyed while it is being referenced. </td> |
| </tr> |
| </table> |
| <h2><a name="full_grammar"></a>Full Grammar Example</h2> |
| <p>Here is our original calculator example, rewritten using a |
| grammar:</p> |
| <pre><code><font color="#000000"><span class=special> </span><span class=keyword>struct </span><span class=identifier>calculator </span><span class=special>: </span><span class=keyword>public </span><span class=identifier>grammar</span><span class=special><</span><span class=identifier>calculator</span><span class=special>> |
| </span><span class=special>{ |
| </span><span class=keyword>template </span><span class=special><</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>> |
| </span><span class=keyword>struct </span><span class=identifier>definition |
| </span><span class=special>{ |
| </span><span class=identifier>definition</span><span class=special>(</span><span class=identifier>calculator </span><span class=keyword>const</span><span class=special>& </span><span class=identifier>self</span><span class=special>) |
| </span><span class=special>{ |
| </span><span class=identifier>group </span><span class=special>= </span><span class=literal>'(' </span><span class=special>>> </span><span class=identifier>expression </span><span class=special>>> </span><span class=literal>')'</span><span class=special>; |
| </span><span class=identifier>factor </span><span class=special>= </span><span class=identifier>integer </span><span class=special>| </span><span class=identifier>group</span><span class=special>; |
| </span><span class=identifier>term </span><span class=special>= </span><span class=identifier>factor </span><span class=special>>> </span><span class=special>*((</span><span class=literal>'*' </span><span class=special>>> </span><span class=identifier>factor</span><span class=special>) </span><span class=special>| </span><span class=special>(</span><span class=literal>'/' </span><span class=special>>> </span><span class=identifier>factor</span><span class=special>)); |
| </span><span class=identifier>expression </span><span class=special>= </span><span class=identifier>term </span><span class=special>>> </span><span class=special>*((</span><span class=literal>'+' </span><span class=special>>> </span><span class=identifier>term</span><span class=special>) </span><span class=special>| </span><span class=special>(</span><span class=literal>'-' </span><span class=special>>> </span><span class=identifier>term</span><span class=special>)); |
| </span><span class=special>} |
| |
| </span><span class=identifier>rule</span><span class=special><</span><span class=identifier>ScannerT</span><span class=special>> </span><span class=identifier>expression</span><span class=special>, </span><span class=identifier>term</span><span class=special>, </span><span class=identifier>factor</span><span class=special>, </span><span class=identifier>group</span><span class=special>; |
| |
| </span><span class=identifier>rule</span><span class=special><</span><span class=identifier>ScannerT</span><span class=special>> </span><span class=keyword>const</span><span class=special>& |
| </span><span class=identifier>start</span><span class=special>() </span><span class=keyword>const </span><span class=special>{ </span><span class=keyword>return </span><span class=identifier>expression</span><span class=special>; </span><span class=special>} |
| </span><span class=special>}; |
| </span><span class=special>};</span></font></code></pre> |
| <p><img src="theme/lens.gif" width="15" height="16"> A fully working example with |
| <a href="semantic_actions.html">semantic actions</a> can be <a href="../example/fundamental/calc_plain.cpp">viewed |
| here</a>. This is part of the Spirit distribution. </p> |
| <table width="80%" border="0" align="center"> |
| <tr> |
| <td class="note_box"><img src="theme/lens.gif" width="15" height="16"> <b>self</b><br> |
| <br> |
| The definition of the grammar has a constructor that |
| accepts a const reference to the outer grammar. In the example above, |
| <tt>calculator::definition</tt> takes a <tt>calculator const& |
| self</tt>. Although it is unused there, this argument is often |
| very useful: it is the definition's window to the outside |
| world. For example, the calculator class might hold a reference to some |
| state information that the definition can update, through |
| <a href="semantic_actions.html">semantic actions</a>, while parsing proceeds. </td> |
| </tr> |
| </table> |
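<p>The role of <tt>self</tt> can be sketched without Spirit. In the model below, every name (<tt>word_counter</tt>, <tt>count</tt>, <tt>count_words</tt>) is hypothetical, and the lambda merely stands in for a rule with a semantic action attached. What it shares with the pattern described in the note is that the outer class holds a reference to caller-owned state, and the nested <tt>definition</tt> reaches that state through the const reference passed to its constructor.</p>

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical grammar-like class; only the self-passing pattern mirrors the text.
struct word_counter
{
    explicit word_counter(std::size_t& count_) : count(count_) {}

    template <typename ScannerT>
    struct definition
    {
        // self is const, but the state it refers to is not: a reference
        // member can still be used to modify the object it refers to.
        explicit definition(word_counter const& self)
        {
            r = [&self](ScannerT const& input) {
                bool in_word = false;
                for (char c : input) {
                    if (c == ' ') {
                        in_word = false;
                    } else if (!in_word) {
                        in_word = true;
                        ++self.count;   // plays the part of a semantic action
                    }
                }
            };
        }

        std::function<void(ScannerT const&)> const& start() const { return r; }

        std::function<void(ScannerT const&)> r;
    };

    std::size_t& count;   // state owned by the caller, not by the grammar
};

// Builds the grammar and its definition, runs it, and returns the state.
inline std::size_t count_words(std::string const& text)
{
    std::size_t total = 0;
    word_counter g(total);
    word_counter::definition<std::string> def(g);
    def.start()(text);
    return total;
}
```

<p>Keeping the state outside the grammar and holding only a reference to it is what allows a definition that receives a const <tt>self</tt> to record results.</p>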
| <h2>Grammar Capsules</h2> |
| <p>As a grammar becomes complicated, it is a good idea to group parts into logical |
| modules. For instance, when writing a parser for a programming language, it might be wise to put expressions |
| and statements into separate grammar capsules. The grammar takes advantage of |
| the encapsulation properties of C++ classes. The declarative nature of classes |
| makes them a perfect fit for the definition of grammars. Since the grammar is |
| nothing more than a class declaration, we can conveniently publish it in header |
| files. The idea is that once written and fully tested, a grammar can be reused |
| in many contexts. We now have the notion of grammar libraries.</p> |
| <h2><a name="multithreading"></a>Reentrancy and multithreading</h2> |
| <p>An instance of a grammar may be used in different places multiple times without |
| any problem. The implementation is tuned to allow this at the expense of some |
| overhead. However, we can save considerable cycles and bytes if we are certain |
| that a grammar will only have a single instance. If this is desired, simply |
| define <tt>BOOST_SPIRIT_SINGLE_GRAMMAR_INSTANCE</tt> before including any Spirit |
| header files.</p> |
| <pre><font face="Courier New, Courier, mono"><code><span class="preprocessor"> #define</span></code></font><span class="preprocessor"><code><font face="Courier New, Courier, mono"> </font><tt>BOOST_SPIRIT_SINGLE_GRAMMAR_INSTANCE</tt></code></span></pre> |
| <p> On the other hand, if a grammar is intended to be used in multithreaded code, |
| we should then define <tt>BOOST_SPIRIT_THREADSAFE</tt> before including any |
| Spirit header files. In this case, it is also necessary to link against <a href="http://www.boost.org/libs/thread/doc/index.html">Boost.Threads</a>.</p> |
| <pre><font face="Courier New, Courier, mono"><span class="preprocessor"> #define</span></font> <span class="preprocessor"><tt>BOOST_SPIRIT_THREADSAFE</tt></span></pre> |
| <h2>Using more than one grammar start rule </h2> |
| <p>Sometimes it is desirable to have more than one visible entry point to a grammar |
| (apart from the start rule). To allow additional start points, Spirit provides |
| a helper template <tt>grammar_def</tt>, which may be used as a base class for |
| the nested <tt>definition</tt> class of your <tt>grammar</tt>. Here's an example:</p> |
| <pre><code> <span class="comment">// this header has to be explicitly included</span> |
| <span class="preprocessor">#include</span> <span class="string"><boost/spirit/utility/grammar_def.hpp></span> |
| |
| <span class=keyword>struct </span><span class=identifier>calculator2 </span><span class=special>: </span><span class=keyword>public </span><span class=identifier>grammar</span><span class=special><</span><span class=identifier>calculator2</span><span class=special>> |
| { |
| </span> <span class="keyword">enum</span> |
| { |
| expression = 0, |
| term = 1, |
| factor = 2, |
| }; |
| |
| <span class=special> </span><span class=keyword>template </span><span class=special><</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>> |
| </span><span class=keyword>struct </span><span class=identifier>definition |
| </span><span class="special">:</span> <span class="keyword">public</span><span class=identifier> grammar_def</span><span class="special"><</span><span class=identifier>rule</span><span class=special><</span><span class=identifier>ScannerT</span><span class=special>>,</span> same<span class="special">,</span> same<span class="special">></span> |
| <span class=special>{</span> |
| <span class=identifier>definition</span><span class=special>(</span><span class=identifier>calculator2 </span><span class=keyword>const</span><span class=special>& </span><span class=identifier>self</span><span class=special>) |
| { |
| </span><span class=identifier>group </span><span class=special>= </span><span class=literal>'(' </span><span class=special>>> </span><span class=identifier>expression </span><span class=special>>> </span><span class=literal>')'</span><span class=special>; |
| </span><span class=identifier>factor </span><span class=special>= </span><span class=identifier>integer </span><span class=special>| </span><span class=identifier>group</span><span class=special>; |
| </span><span class=identifier>term </span><span class=special>= </span><span class=identifier>factor </span><span class=special>>> *((</span><span class=literal>'*' </span><span class=special>>> </span><span class=identifier>factor</span><span class=special>) | (</span><span class=literal>'/' </span><span class=special>>> </span><span class=identifier>factor</span><span class=special>)); |
| </span><span class=identifier>expression </span><span class=special>= </span><span class=identifier>term </span><span class=special>>> *((</span><span class=literal>'+' </span><span class=special>>> </span><span class=identifier>term</span><span class=special>) | (</span><span class=literal>'-' </span><span class=special>>> </span><span class=identifier>term</span><span class=special>));</span> |
| |
| <span class="keyword">this</span><span class="special">-></span>start_parsers<span class="special">(</span>expression<span class="special">,</span> term<span class="special">,</span> factor<span class="special">);</span> |
| <span class="special">}</span> |
| |
| <span class=identifier>rule</span><span class=special><</span><span class=identifier>ScannerT</span><span class=special>> </span><span class=identifier>expression</span><span class=special>, </span><span class=identifier>term</span><span class=special>, </span><span class=identifier>factor, group</span><span class=special>; |
| </span><span class=special> }; |
| };</span></code></pre> |
| <p>The <tt>grammar_def</tt> template has to be instantiated with the types of |
| all the rules you wish to make visible from outside the <tt>grammar</tt>:</p> |
| <pre><code><span class=identifier> </span><span class=identifier>grammar_def</span><span class="special"><</span><span class=identifier>rule</span><span class=special><</span><span class=identifier>ScannerT</span><span class=special>>,</span> same<span class="special">,</span> same<span class="special">></span></code> </pre> |
| <p>The shorthand notation <tt>same</tt> indicates that the type is the same as the one |
| specified by the previous template parameter (e.g. <code><tt>rule<ScannerT></tt></code>). |
| Consequently, <tt>same</tt> may not be used as the first template parameter. </p> |
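<p>One way to picture how a placeholder such as <tt>same</tt> could be resolved is the following sketch. It is an assumption made for illustration, not a description of how <tt>grammar_def</tt> is written: <tt>same_tag</tt>, <tt>resolve_same</tt> and <tt>start_types</tt> are hypothetical names, and the sketch assumes that each placeholder is resolved against the already resolved parameter before it.</p>

```cpp
#include <type_traits>

// Hypothetical placeholder type playing the role of "same".
struct same_tag {};

// Yields T, unless T is the placeholder, in which case it yields Prev.
template <typename Prev, typename T>
struct resolve_same { typedef T type; };

template <typename Prev>
struct resolve_same<Prev, same_tag> { typedef Prev type; };

// Resolves a list of three start types from left to right. The first
// parameter has no predecessor, so a placeholder there could not be resolved.
template <typename T0, typename T1, typename T2>
struct start_types
{
    typedef T0 first;
    typedef typename resolve_same<first, T1>::type second;
    typedef typename resolve_same<second, T2>::type third;
};

// With <int, same_tag, same_tag>, all three resolve to int.
static_assert(
    std::is_same<start_types<int, same_tag, same_tag>::third, int>::value,
    "placeholders repeat the first type");
```

<p>Under this model, a list such as <tt>int, char, same_tag</tt> resolves its third entry to <tt>char</tt>, because the placeholder repeats whatever immediately precedes it.</p>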
| <table width="80%" border="0" align="center"> |
| <tr> |
| <td class="note_box"> <img src="theme/bulb.gif" width="13" height="18"> <strong>grammar_def |
| start types</strong><br> |
| <br> |
| Although it may not be obvious, any parser type may be specified here, not only rule<>s |
| (e.g. chlit<>, strlit<>, int_parser<>, |
| etc.).</td> |
| </tr> |
| </table> |
| <p>With the <tt>grammar_def</tt> class, there is no longer any need to provide a <tt>start()</tt> member |
| function. Instead, call <tt>this->start_parsers()</tt> |
| (a member function of the <tt>grammar_def</tt> template) to define |
| the start symbols for your <tt>grammar</tt>. <img src="theme/note.gif" width="16" height="16"> |
| Note that the number and order of the rules passed to |
| the <tt>start_parsers()</tt> function must match the types specified in the |
| <tt>grammar_def</tt> template:</p> |
| <pre><code> <span class="keyword">this</span><span class="special">-></span>start_parsers<span class="special">(</span>expression<span class="special">,</span> term<span class="special">,</span> factor<span class="special">);</span></code></pre> |
| <p> The grammar entry point may be specified using the following syntax:</p> |
| <pre><code><font color="#000000"><span class=identifier> g</span><span class="special">.</span><span class=identifier>use_parser</span><span class="special"><</span><span class=identifier>N</span><span class=special>>() </span><span class="comment">// Where g is your grammar and N is the Nth entry.</span></font></code></pre> |
| <p>This sample shows how to use the <tt>term</tt> rule from the <tt>calculator2</tt> |
| grammar above:</p> |
| <pre><code><font color="#000000"><span class=identifier> calculator2 g</span><span class=special>; |
| |
| </span><span class=keyword>if </span><span class=special>(</span><span class=identifier>parse</span><span class=special>(</span><span class=identifier> |
| first</span><span class=special>, </span><span class=identifier>last</span><span class=special>, |
| </span><span class=identifier>g</span><span class="special">.</span><span class=identifier>use_parser</span><span class="special"><</span><span class=identifier>calculator2::term</span><span class=special>>(),</span><span class=identifier> |
| space_p</span><span class=special> |
| ).</span><span class=identifier>full</span><span class=special>) |
| { |
| </span><span class=identifier>cout </span><span class=special><< </span><span class=string>"parsing succeeded\n"</span><span class=special>; |
| } |
| </span><span class=keyword>else</span> <span class="special">{</span> |
| <span class=identifier>cout </span><span class=special><< </span><span class=string>"parsing failed\n"</span><span class=special>; |
| }</span></font></code></pre> |
| <p>The template argument to <tt>use_parser<></tt> must |
| be the zero-based index into the list of rules passed to the <tt>start_parsers()</tt> |
| function call. </p> |
| <table width="80%" border="0" align="center"> |
| <tr> |
| <td class="note_box"><img src="theme/note.gif" width="16" height="16"> <tt><strong>use_parser<0></strong></tt><br> |
| <br> |
| Note that using <span class="literal">0</span> (zero) as the template argument |
| to <tt>use_parser</tt> is equivalent to using the start rule exported by |
| conventional means through the <tt>start()</tt> function, as shown in the |
| first <tt><a href="grammar.html#full_grammar">calculator</a></tt> sample |
| above. This notation may therefore be used even for grammars that export only one rule |
| through the <tt>start()</tt> function. Conversely, using a |
| <tt>grammar</tt> without the <tt>use_parser</tt> notation executes the |
| rule passed as the first argument to the <tt>start_parsers()</tt> function. |
| </td> |
| </tr> |
| </table> |
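<p>The zero-based indexing can be illustrated by analogy with <tt>std::tuple</tt>. The sketch below is only an analogy and does not show how <tt>grammar_def</tt> stores its parsers: <tt>toy_calculator</tt> is a hypothetical class, and plain strings stand in for the registered start parsers. It shows why an enum declared in the same order as the registered entry points gives readable names for the indices.</p>

```cpp
#include <string>
#include <tuple>

// Hypothetical stand-in for a grammar with three entry points.
struct toy_calculator
{
    // Declared in the same order as the entries stored below.
    enum { expression = 0, term = 1, factor = 2 };

    // Strings stand in for the start parsers, in registration order.
    std::tuple<std::string, std::string, std::string> starts{
        "expression", "term", "factor"};

    // Selects the Nth entry point, counting from zero.
    template <int N>
    std::string const& use_parser() const { return std::get<N>(starts); }

    // Using the grammar without an index means entry 0 in this model.
    std::string const& start() const { return use_parser<0>(); }
};
```

<p>Here <tt>use_parser</tt> with <tt>toy_calculator::term</tt> selects the second entry, and index 0 selects the same entry as <tt>start()</tt>. If the enum and the registration order ever disagree, the names silently select the wrong entry, which is a reason to keep the two side by side.</p>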
| <p>The maximum number of usable start rules is limited by the preprocessor constant:</p> |
| <pre> <span class="identifier">BOOST_SPIRIT_GRAMMAR_STARTRULE_TYPE_LIMIT</span> <span class="comment">// defaults to 3</span></pre> |
| <table border="0"> |
| <tr> |
| <td width="10"></td> |
| <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td> |
| <td width="30"><a href="scanner.html"><img src="theme/l_arr.gif" border="0"></a></td> |
| <td width="30"><a href="subrules.html"><img src="theme/r_arr.gif" border="0"></a></td> |
| </tr> |
| </table> |
| <br> |
| <hr size="1"> |
| <p class="copyright">Copyright © 1998-2003 Joel de Guzman<br> |
| Copyright © 2003-2004 Hartmut Kaiser <br> |
| <br> |
| <font size="2">Use, modification and distribution is subject to the Boost Software |
| License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at |
| http://www.boost.org/LICENSE_1_0.txt) </font> </p> |
| <p> </p> |
| </body> |
| </html> |