| <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> |
| <html> |
| <head> |
| <title>List Parsers</title> |
| <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> |
| <link href="theme/style.css" rel="stylesheet" type="text/css"> |
| </head> |
| |
| <body> |
| <table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2"> |
| <tr> |
| <td width="10"> <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b> </b></font></td> |
| <td width="85%"> <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>List Parsers</b></font></td> |
| <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td> |
| </tr> |
| </table> |
| <br> |
| <table border="0"> |
| <tr> |
| <td width="10"></td> |
| <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td> |
| <td width="30"><a href="confix.html"><img src="theme/l_arr.gif" border="0"></a></td> |
| <td width="30"><a href="functor_parser.html"><img src="theme/r_arr.gif" border="0"></a></td> |
| </tr> |
| </table> |
| <p>List Parsers are generated by the special predefined parser generator object |
| <tt>list_p</tt>, which generates parsers recognizing list structures |
| of the type </p> |
| <pre><span class=identifier> item </span><span class=special>>> </span><span class=special>*(</span><span class=identifier>delimiter </span><span class=special>>> </span><span class=identifier>item</span><span class=special>) </span><span class=special>>> </span><span class=special>!</span><span class=identifier>end</span></pre> |
| <p>where <tt>item</tt> is an expression, delimiter is a delimiter and end is an |
| optional closing expression. As you can see, the <tt>list_p</tt> generated parser |
| does not recognize empty lists, i.e. the parser must find at least one item |
| in the input stream to return a successful match. If you wish to also match |
| an empty list, you can make your list_p optional with operator! An example where |
| this utility parser is helpful is parsing comma separated C/C++ strings, which |
| can be easily formulated as:</p> |
| <pre><span class=special> </span><span class=identifier>rule</span><span class=special><> </span><span class=identifier>list_of_c_strings_rule |
| </span><span class=special>= </span><span class=identifier>list_p</span><span class=special>(</span><span class=identifier>confix_p</span><span class=special>(</span><span class=literal>'\"'</span><span class=special>, </span><span class=special>*</span><span class=identifier>c_escape_char_p</span><span class=special>, </span><span class=literal>'\"'</span><span class=special>), </span><span class=literal>','</span><span class=special>) |
| </span><span class=special>;</span></pre> |
| <p>The <tt>confix_p</tt> and <tt>c_escape_char_p</tt> parser generators |
| are described <a href="confix.html">here</a> and <a href="escape_char_parser.html">here</a>.</p> |
| <p>The <tt>list_p</tt> parser generator object can be used to generate the following |
| different types of List Parsers:</p> |
| <table width="90%" border="0" align="center"> |
| <tr> |
| <td colspan="2" class="table_title"><b>List Parsers</b></td> |
| </tr> |
| <tr> |
| <td width="29%" height="27" class="table_cells"><b>list_p</b></td> |
| <td width="71%" class="table_cells"><p><tt>list_p</tt> used by itself parses |
| comma separated lists without special item formatting, i.e. everything |
| in between two commas is matched as an <tt>item</tt>, no <tt>end</tt> |
| of list token is matched</p></td> |
| </tr> |
| <tr> |
| <td height="27" class="table_cells"><strong>list_p(delimiter)</strong></td> |
| <td class="table_cells"><p>generates a list parser, which recognizes lists |
| with the given <tt>delimiter</tt> and matches everything in between them |
| as an <tt>item</tt>, no <tt>end</tt> of list token is matched</p></td> |
| </tr> |
| <tr> |
| <td height="27" class="table_cells"><strong>list_p(item, delimiter)</strong></td> |
| <td class="table_cells"><p>generates a list parser, which recognizes lists |
| with the given <tt>delimiter</tt> and matches items based on the given |
| item parser, no <tt>end</tt> of list token is matched</p></td> |
| </tr> |
| <tr> |
| <td height="27" class="table_cells"><strong>list_p(item, delimiter, end)</strong></td> |
| <td class="table_cells"><p>generates a list parser, which recognizes lists |
| with the given <tt>delimiter</tt> and matches items based on the given |
| <tt>item</tt> parser and additionally recognizes an optional <tt>end</tt> |
| expression</p></td> |
| </tr> |
| </table> |
| <p>All of the parameters to list_p can be single characters, strings |
| or, if more complex parsing logic is required, auxiliary parsers, each of which |
| is automatically converted to the corresponding parser type needed for successful |
| parsing.</p> |
| <p>If the <tt>item</tt> parser is an <tt>action_parser_category</tt> type (parser |
| with an attached semantic action) we have to do something special. This happens, |
| if the user wrote something like:</p> |
| <pre><span class=special> </span><span class=identifier>list_p</span><span class=special>(</span><span class=identifier>item</span><span class=special>[</span><span class=identifier>func</span><span class=special>], </span><span class=identifier>delim</span><span class=special>)</span></pre> |
| <p> where <tt>item</tt> is the parser matching one item of the list sequence and |
| <tt>func</tt> is a functor to be called after matching one item. If we would |
| do nothing, the resulting code would parse the sequence as follows:</p> |
| <pre><span class=special> </span><span class=special>(</span><span class=identifier>item</span><span class=special>[</span><span class=identifier>func</span><span class=special>] </span><span class=special>- </span><span class=identifier>delim</span><span class=special>) </span><span class=special>>> </span><span class=special>*(</span><span class=identifier>delim </span><span class=special>>> </span><span class=special>(</span><span class=identifier>item</span><span class=special>[</span><span class=identifier>func</span><span class=special>] </span><span class=special>- </span><span class=identifier>delim</span><span class=special>))</span></pre> |
| <p> what in most cases is not what the user expects. (If this <u>is</u> what you've |
| expected, then please use one of the <tt>list_p</tt> generator |
| functions <tt>direct()</tt>, which will inhibit refactoring of the <tt>item</tt> |
| parser). To make the list parser behave as expected:</p> |
| <pre><span class=special> </span><span class=special>(</span><span class=identifier>item </span><span class=special>- </span><span class=identifier>delim</span><span class=special>)[</span><span class=identifier>func</span><span class=special>] </span><span class=special>>> </span><span class=special>*(</span><span class=identifier>delim </span><span class=special>>> </span><span class=special>(</span><span class=identifier>item </span><span class=special>- </span><span class=identifier>delim</span><span class=special>)[</span><span class=identifier>func</span><span class=special>])</span></pre> |
| <p> the actor attached to the item parser has to be re-attached to the <tt>(item |
| - delim)</tt> parser construct, which will make the resulting list parser 'do |
| the right thing'. This refactoring is done by the help of the <a href="refactoring.html">Refactoring |
| Parsers</a>. Additionally special care must be taken, if the item parser is |
| a <tt>unary_parser_category</tt> type parser as for instance:</p> |
| <pre><span class=special> </span><span class=identifier>list_p</span><span class=special>(*</span><span class=identifier>anychar_p</span><span class=special>, </span><span class=literal>','</span><span class=special>)</span></pre> |
| <p> which without any refactoring would result in </p> |
| <pre><span class=special> </span><span class=special>(*</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>','</span><span class=special>)) |
| </span><span class=special>>> </span><span class=special>*( </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>','</span><span class=special>) </span><span class=special>>> </span><span class=special>(*</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>','</span><span class=special>)) </span><span class=special>)</span></pre> |
| <p> and will not give the expected result (the first <tt>*anychar_p</tt> will |
| eat up all the input up to the end of the input stream). So we have to refactor |
| this into:</p> |
| <pre><span class=special> </span><span class=special>*(</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>','</span><span class=special>)) |
| </span><span class=special>>> </span><span class=special>*( </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>','</span><span class=special>) </span><span class=special>>> </span><span class=special>*(</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>','</span><span class=special>)) </span><span class=special>)</span></pre> |
| <p> what will give the correct result.</p> |
| <p> The case, where the item parser is a combination of the two mentioned problems |
| (i.e. the item parser is a unary parser with an attached action), is handled |
| accordingly too:</p> |
| <pre><span class=special> </span><span class=identifier>list_p</span><span class=special>((*</span><span class=identifier>anychar_p</span><span class=special>)[</span><span class=identifier>func</span><span class=special>], </span><span class=literal>','</span><span class=special>)</span></pre> |
| <p> will be parsed as expected:</p> |
| <pre><span class=special> </span><span class=special>(*(</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>','</span><span class=special>)))[</span><span class=identifier>func</span><span class=special>] |
| </span><span class=special>>> </span><span class=special>*( </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>','</span><span class=special>) </span><span class=special>>> </span><span class=special>(*(</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>','</span><span class=special>)))[</span><span class=identifier>func</span><span class=special>] </span><span class=special>)</span></pre> |
| <p>The required refactoring is implemented with the help of the <a href="refactoring.html">Refactoring |
| Parsers</a>.</p> |
| <table width="90%" border="0" align="center"> |
| <tr> |
| <td colspan="2" class="table_title"><b>Summary of List Parser refactorings</b></td> |
| </tr> |
| <tr class="table_title"> |
| <td width="34%"><b>You write it as:</b></td> |
| <td width="66%"><code><font face="Verdana, Arial, Helvetica, sans-serif">It |
| is refactored to:</font></code></td> |
| </tr> |
| <tr> |
| <td width="34%" class="table_cells"><code><span class=identifier>list_p</span><span class=special>(</span><span class=identifier>item</span><span class=special>, |
| </span><span class=identifier>delimiter</span><span class=special>)</span></code></td> |
| <td width="66%" class="table_cells"> <code><span class=special> (</span><span class=identifier>item |
| </span><span class=special>- </span><span class=identifier>delimiter</span><span class=special>) |
| <br> |
| >> *(</span><span class=identifier>delimiter </span><span class=special> |
| >> (</span><span class=identifier>item </span><span class=special>- |
| </span><span class=identifier>delimiter</span><span class=special>))</span></code></td> |
| </tr> |
| <tr> |
| <td width="34%" class="table_cells"><code><span class=identifier>list_p</span><span class=special>(</span><span class=identifier>item</span><span class=special>[</span><span class=identifier>func</span><span class=special>], |
| </span><span class=identifier>delimiter</span><span class=special>)</span></code></td> |
| <td width="66%" class="table_cells"> <code><span class=special> (</span><span class=identifier>item |
| </span><span class=special> - </span><span class=identifier>delimiter</span><span class=special>)[</span><span class=identifier>func</span><span class=special>] |
| <br> |
| >> *(</span><span class=identifier>delimiter </span><span class=special>>> |
| (</span><span class=identifier>item </span><span class=special>- </span><span class=identifier>delimiter</span><span class=special>)[</span><span class=identifier>func</span><span class=special>])</span></code></td> |
| </tr> |
| <tr> |
| <td width="34%" class="table_cells"><code><span class=identifier>list_p</span><span class=special>(*</span><span class=identifier>item</span><span class=special>, |
| </span><span class=identifier>delimiter</span><span class=special>)</span></code></td> |
| <td width="66%" class="table_cells"> <code><span class=special>*(</span><span class=identifier>item |
| </span><span class=special>- </span><span class=identifier>delimiter</span><span class=special>) |
| <br> |
| >> *(</span><span class=identifier>delimiter </span><span class=special>>> |
| *(</span><span class=identifier>item </span><span class=special>- </span><span class=identifier>delimiter</span><span class=special>))</span></code></td> |
| </tr> |
| <tr> |
| <td width="34%" class="table_cells"><code><span class=identifier>list_p</span><span class=special>((*</span><span class=identifier>item</span><span class=special>)[</span><span class=identifier>func</span><span class=special>], |
| </span><span class=identifier>delimiter</span><span class=special>)</span></code></td> |
| <td width="66%" class="table_cells"> <code><span class=special>(*(</span><span class=identifier>item |
| </span><span class=special>- </span><span class=identifier>delimiter</span><span class=special>))[</span><span class=identifier>func</span><span class=special>] |
| <br> |
| >> *(</span><span class=identifier>delimiter </span><span class=special>>> |
| (*(</span><span class=identifier>item </span><span class=special>- </span><span class=identifier>delimiter</span><span class=special>))[</span><span class=identifier>func</span><span class=special>])</span></code></td> |
| </tr> |
| </table> |
| <p> <img height="16" width="15" src="theme/lens.gif"> <a href="../example/fundamental/list_parser.cpp">list_parser.cpp </a> sample shows the usage of the list_p utility parser:</p> |
| <ol> |
| <li>parsing a simple ',' delimited list w/o item formatting</li> |
| <li> parsing a CSV list (comma separated values - strings, integers or reals)</li> |
| <li>parsing a token list (token separated values - strings, integers or reals) <br> |
| with an action parser directly attached to the item part of the list_p generated parser</li> |
| </ol> |
| <p>This is part of the Spirit distribution.</p> |
| <table border="0"> |
| <tr> |
| <td width="10"></td> |
| <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td> |
| <td width="30"><a href="confix.html"><img src="theme/l_arr.gif" border="0"></a></td> |
| <td width="30"><a href="functor_parser.html"><img src="theme/r_arr.gif" border="0"></a></td> |
| </tr> |
| </table> |
| <br> |
| <hr size="1"> |
| <p class="copyright">Copyright © 2001-2003 Hartmut Kaiser<br> |
| <br> |
| <font size="2">Use, modification and distribution is subject to the Boost Software |
| License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at |
| http://www.boost.org/LICENSE_1_0.txt) </font> </p> |
| </body> |
| </html> |