boost_1_45_0/libs/spirit/doc/html/spirit/lex/lexer_introduction.html - nest-learning-thermostat/5.0.1/boost - Git at Google

 <html>
 <head>
 <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
 <title>Introduction to Spirit.Lex</title>
 <link rel="stylesheet" href="../../../../../../doc/src/boostbook.css" type="text/css">
 <meta name="generator" content="DocBook XSL Stylesheets V1.75.0">
 <link rel="home" href="../../index.html" title="Spirit 2.4.1">
 <link rel="up" href="../lex.html" title="Lex - Writing Lexical Analyzers">
 <link rel="prev" href="../lex.html" title="Lex - Writing Lexical Analyzers">
 <link rel="next" href="tutorials.html" title="Spirit.Lex Tutorials">
 </head>
 <body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
 <table cellpadding="2" width="100%"><tr>
 <td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../boost.png"></td>
 <td align="center"><a href="../../../../../../index.html">Home</a></td>
 <td align="center"><a href="../../../../../../libs/libraries.htm">Libraries</a></td>
 <td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
 <td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
 <td align="center"><a href="../../../../../../more/index.htm">More</a></td>
 </tr></table>
 <hr>
 <div class="spirit-nav">
 <a accesskey="p" href="../lex.html"><img src="../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../lex.html"><img src="../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="tutorials.html"><img src="../../../../../../doc/src/images/next.png" alt="Next"></a>
 </div>
 <div class="section">
 <div class="titlepage"><div><div><h3 class="title">
 <a name="spirit.lex.lexer_introduction"></a><a class="link" href="lexer_introduction.html" title="Introduction to Spirit.Lex">Introduction to <span class="emphasis"><em>Spirit.Lex</em></span></a>
 </h3></div></div></div>
 <p>
         Lexical scanning is the process of analyzing the stream of input characters
         and separating it into strings called tokens, separated by whitespace. Most
         compiler texts start here, and devote several chapters to discussing various
         ways to build scanners. <span class="emphasis"><em>Spirit.Lex</em></span> is a library built
         to take care of the complexities of creating a lexer for your grammar (in
         this documentation we will use the terms 'lexical analyzer', 'lexer' and
         'scanner' interchangeably). All that is needed to create a lexer is to know
         the set of patterns describing the different tokens you want to recognize
         in the input. To make this a bit more formal, here are some definitions:
       </p>
 <div class="itemizedlist"><ul class="itemizedlist" type="disc">
 <li class="listitem">
             A token is a sequence of consecutive characters having a collective meaning.
             Tokens may have attributes specific to the token type, carrying additional
             information about the matched character sequence.
           </li>
 <li class="listitem">
             A pattern is a rule expressed as a regular expression and describing
             how a particular token can be formed. For example, <code class="literal">[A-Za-z][A-Za-z_0-9]*</code>
             is a pattern for a rule matching C++ identifiers.
           </li>
 <li class="listitem">
             Characters between tokens are called whitespace; these include spaces,
             tabs, newlines, and formfeeds. Many people also count comments as whitespace,
             though since some tools such as lint look at comments, this method is
             not perfect.
           </li>
 </ul></div>
 <a name="spirit.lex.lexer_introduction.why_use_a_separate_lexer_"></a><h5>
 <a name="id1115090"></a>
         <a class="link" href="lexer_introduction.html#spirit.lex.lexer_introduction.why_use_a_separate_lexer_">Why
         Use a Separate Lexer?</a>
       </h5>
 <p>
         Typically, lexical scanning is done in a separate module from the parser,
         feeding the parser with a stream of input tokens only. Theoretically it is
         not necessary implement this separation as in the end there is only one set
         of syntactical rules defining the language, so in theory we could write the
         whole parser in one module. In fact, <span class="emphasis"><em>Spirit.Qi</em></span> allows
         you to write parsers without using a lexer, parsing the input character stream
         directly, and for the most part this is the way <a href="http://boost-spirit.com" target="_top">Spirit</a>
         has been used since its invention.
       </p>
 <p>
         However, this separation has both practical and theoretical basis, and proves
         to be very useful in practical applications. In 1956, Noam Chomsky defined
         the "Chomsky Hierarchy" of grammars:
       </p>
 <div class="itemizedlist"><ul class="itemizedlist" type="disc">
 <li class="listitem">
             Type 0: Unrestricted grammars (e.g., natural languages)
           </li>
 <li class="listitem">
             Type 1: Context-Sensitive grammars
           </li>
 <li class="listitem">
             Type 2: Context-Free grammars
           </li>
 <li class="listitem">
             Type 3: Regular grammars
           </li>
 </ul></div>
 <p>
         The complexity of these grammars increases from regular grammars being the
         simplest to unrestricted grammars being the most complex. Similarly, the
         complexity of pattern recognition for these grammars increases. Although,
         a few features of some programming languages (such as C++) are Type 1, fortunately
         for the most part programming languages can be described using only the Types
         2 and 3. The neat part about these two types is that they are well known
         and the ways to parse them are well understood. It has been shown that any
         regular grammar can be parsed using a state machine (finite automaton). Similarly,
         context-free grammars can always be parsed using a push-down automaton (essentially
         a state machine augmented by a stack).
       </p>
 <p>
         In real programming languages and practical grammars, the parts that can
         be handled as regular expressions tend to be the lower-level pieces, such
         as the definition of an identifier or of an integer value:
       </p>
 <pre class="programlisting"><span class="identifier">letter</span>     <span class="special">:=</span> <span class="special">[</span><span class="identifier">a</span><span class="special">-</span><span class="identifier">zA</span><span class="special">-</span><span class="identifier">Z</span><span class="special">]</span>
 <span class="identifier">digit</span>      <span class="special">:=</span> <span class="special">[</span><span class="number">0</span><span class="special">-</span><span class="number">9</span><span class="special">]</span>

 <span class="identifier">identifier</span> <span class="special">:=</span> <span class="identifier">letter</span> <span class="special">[</span> <span class="identifier">letter</span> <span class="special">|</span> <span class="identifier">digit</span> <span class="special">]*</span>
 <span class="identifier">integer</span>    <span class="special">:=</span> <span class="identifier">digit</span><span class="special">+</span>
 </pre>
 <p>
         Higher level parts of practical grammars tend to be more complex and can't
         be implemented using plain regular expressions. We need to store information
         on the built-in hardware stack while recursing the grammar hierarchy, and
         that is the preferred approach used for top-down parsing. Since it takes
         a different kind of abstract machine to parse the two types of grammars,
         it proved to be efficient to separate the lexical scanner into a separate
         module which is built around the idea of a state machine. The goal here is
         to use the simplest parsing technique needed for the job.
       </p>
 <p>
         Another, more practical, reason for separating the scanner from the parser
         is the need for backtracking during parsing. The input data is a stream of
         characters, which is often thought to be processed left to right without
         any backtracking. Unfortunately, in practice most of the time that isn't
         possible. Almost every language has certain keywords such as IF, FOR, and
         WHILE. The decision if a certain character sequence actually comprises a
         keyword or just an identifier often can be made only after seeing the first
         delimiter <span class="emphasis"><em>after</em></span> it. In fact, this makes the process
         backtracking, since we need to store the string long enough to be able to
         make the decision. The same is true for more coarse grained language features
         such as nested IF/ELSE statements, where the decision about to which IF belongs
         the last ELSE statement can be made only after seeing the whole construct.
       </p>
 <p>
         So the structure of a conventional compiler often involves splitting up the
         functions of the lower-level and higher-level parsing. The lexical scanner
         deals with things at the character level, collecting characters into strings,
         converting character sequence into different representations as integers,
         etc., and passing them along to the parser proper as indivisible tokens.
         It's also considered normal to let the scanner do additional jobs, such as
         identifying keywords, storing identifiers in tables, etc.
       </p>
 <p>
         Now, <a href="http://boost-spirit.com" target="_top">Spirit</a> follows this structure,
         where <span class="emphasis"><em>Spirit.Lex</em></span> can be used to implement state machine
         based recognizers, while <span class="emphasis"><em>Spirit.Qi</em></span> can be used to build
         recognizers for context free grammars. Since both modules are seamlessly
         integrated with each other and with the C++ target language it is even possible
         to use the provided functionality to build more complex grammar recognizers.
       </p>
 <a name="spirit.lex.lexer_introduction.advantages_of_using__emphasis_spirit_lex__emphasis_"></a><h5>
 <a name="id1115314"></a>
         <a class="link" href="lexer_introduction.html#spirit.lex.lexer_introduction.advantages_of_using__emphasis_spirit_lex__emphasis_">Advantages
         of using <span class="emphasis"><em>Spirit.Lex</em></span></a>
       </h5>
 <p>
         The advantage of using <span class="emphasis"><em>Spirit.Lex</em></span> to create the lexical
         analyzer over using more traditional tools such as <a href="http://flex.sourceforge.net/" target="_top">Flex</a>
         is its carefully crafted integration with the <a href="http://boost-spirit.com" target="_top">Spirit</a>
         library and the C++ host language. You don't need any external tools to generate
         the code, your lexer will be perfectly integrated with the rest of your program,
         making it possible to freely access any context information and data structure.
         Since the C++ compiler sees all the code, it will generate optimal code no
         matter what configuration options have been chosen by the user. <span class="emphasis"><em>Spirit.Lex</em></span>
         gives you the vast majority of features you could get from a similar <a href="http://flex.sourceforge.net/" target="_top">Flex</a> program without the need
         to leave C++ as a host language:
       </p>
 <div class="itemizedlist"><ul class="itemizedlist" type="disc">
 <li class="listitem">
             The definition of tokens is done using regular expressions (patterns)
           </li>
 <li class="listitem">
             The token definitions can refer to special substitution strings (pattern
             macros) simplifying pattern definitions
           </li>
 <li class="listitem">
             The generated lexical scanner may have multiple start states
           </li>
 <li class="listitem">
             It is possible to attach code to any of the token definitions; this code
             gets executed whenever the corresponding token pattern has been matched
           </li>
 </ul></div>
 <p>
         Even if it is possible to use <span class="emphasis"><em>Spirit.Lex</em></span> to generate
         C++ code representing the lexical analyzer (we will refer to that as the
         <span class="emphasis"><em>static</em></span> model, described in more detail in the section
         <a class="link" href="abstracts/lexer_static_model.html" title="The Static Lexer Model">The <span class="emphasis"><em>Static</em></span>
         Model</a>) - a model very similar to the way <a href="http://flex.sourceforge.net/" target="_top">Flex</a>
         operates - we will mainly focus on the opposite, the <span class="emphasis"><em>dynamic</em></span>
         model. You can directly integrate the token definitions into your C++ program,
         building the lexical analyzer dynamically at runtime. The dynamic model is
         something not supported by <a href="http://flex.sourceforge.net/" target="_top">Flex</a>
         or other lexical scanner generators (such as <a href="http://re2c.sourceforge.net/" target="_top">re2c</a>,
         <a href="http://www.cs.queensu.ca/~thurston/ragel/" target="_top">Ragel</a>, etc.).
         This dynamic flexibility allows you to speed up the development of your application.
       </p>
 <a name="spirit.lex.lexer_introduction.the_library_structure_of__emphasis_spirit_lex__emphasis_"></a><h5>
 <a name="id1115432"></a>
         <a class="link" href="lexer_introduction.html#spirit.lex.lexer_introduction.the_library_structure_of__emphasis_spirit_lex__emphasis_">The
         Library Structure of <span class="emphasis"><em>Spirit.Lex</em></span></a>
       </h5>
 <p>
         The <a class="link" href="lexer_introduction.html#spirit.lexerflow" title="Figure&#160;6.&#160;The Library structure and Common Flow of Information while using Spirit.Lex in an application">figure</a> below shows a high level
         overview of how the <span class="emphasis"><em>Spirit.Lex</em></span> library might be used
         in an application. <span class="emphasis"><em>Spirit.Lex</em></span> allows to create lexical
         analyzers based on patterns. These patterns are regular expression based
         rules used to define the different tokens to be recognized in the character
         input sequence. The input sequence is expected to be provided to the lexical
         analyzer as an arbitrary standard forward iterator. The lexical analyzer
         itself exposes a standard forward iterator as well. The difference here is
         that the exposed iterator provides access to the token sequence instead of
         to the character sequence. The tokens in this sequence are constructed on
         the fly by analyzing the underlying character sequence and matching this
         to the patterns as defined by the application.
       </p>
 <p>
         </p>
 <div class="figure">
 <a name="spirit.lexerflow"></a><p class="title"><b>Figure&#160;6.&#160;The Library structure and Common Flow of Information while
         using <span class="emphasis"><em>Spirit.Lex</em></span> in an application</b></p>
 <div class="figure-contents"><span class="inlinemediaobject"><img src="../.././images/lexerflow.png" alt="The Library structure and Common Flow of Information while using Spirit.Lex in an application"></span></div>
 </div>
 <p><br class="figure-break">
       </p>
 </div>
 <table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
 <td align="left"></td>
 <td align="right"><div class="copyright-footer">Copyright &#169; 2001-2010 Joel de Guzman, Hartmut Kaiser<p>
         Distributed under the Boost Software License, Version 1.0. (See accompanying
         file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
       </p>
 </div></td>
 </tr></table>
 <hr>
 <div class="spirit-nav">
 <a accesskey="p" href="../lex.html"><img src="../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../lex.html"><img src="../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="tutorials.html"><img src="../../../../../../doc/src/images/next.png" alt="Next"></a>
 </div>
 </body>
 </html>
	<html>
	<head>
	<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
	<title>Introduction to Spirit.Lex</title>
	<link rel="stylesheet" href="../../../../../../doc/src/boostbook.css" type="text/css">
	<meta name="generator" content="DocBook XSL Stylesheets V1.75.0">
	<link rel="home" href="../../index.html" title="Spirit 2.4.1">
	<link rel="up" href="../lex.html" title="Lex - Writing Lexical Analyzers">
	<link rel="prev" href="../lex.html" title="Lex - Writing Lexical Analyzers">
	<link rel="next" href="tutorials.html" title="Spirit.Lex Tutorials">
	</head>
	<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
	<table cellpadding="2" width="100%"><tr>
	<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../boost.png"></td>
	<td align="center"><a href="../../../../../../index.html">Home</a></td>
	<td align="center"><a href="../../../../../../libs/libraries.htm">Libraries</a></td>
	<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
	<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
	<td align="center"><a href="../../../../../../more/index.htm">More</a></td>
	</tr></table>
	<hr>
	<div class="spirit-nav">
	<a accesskey="p" href="../lex.html"><img src="../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../lex.html"><img src="../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="tutorials.html"><img src="../../../../../../doc/src/images/next.png" alt="Next"></a>
	</div>
	<div class="section">
	<div class="titlepage"><div><div><h3 class="title">
	<a name="spirit.lex.lexer_introduction"></a><a class="link" href="lexer_introduction.html" title="Introduction to Spirit.Lex">Introduction to <span class="emphasis"><em>Spirit.Lex</em></span></a>
	</h3></div></div></div>
	<p>
	Lexical scanning is the process of analyzing the stream of input characters
	and separating it into strings called tokens, separated by whitespace. Most
	compiler texts start here, and devote several chapters to discussing various
	ways to build scanners. <span class="emphasis"><em>Spirit.Lex</em></span> is a library built
	to take care of the complexities of creating a lexer for your grammar (in
	this documentation we will use the terms 'lexical analyzer', 'lexer' and
	'scanner' interchangeably). All that is needed to create a lexer is to know
	the set of patterns describing the different tokens you want to recognize
	in the input. To make this a bit more formal, here are some definitions:
	</p>
	<div class="itemizedlist"><ul class="itemizedlist" type="disc">
	<li class="listitem">
	A token is a sequence of consecutive characters having a collective meaning.
	Tokens may have attributes specific to the token type, carrying additional
	information about the matched character sequence.
	</li>
	<li class="listitem">
	A pattern is a rule expressed as a regular expression and describing
	how a particular token can be formed. For example, <code class="literal">[A-Za-z][A-Za-z_0-9]*</code>
	is a pattern for a rule matching C++ identifiers.
	</li>
	<li class="listitem">
	Characters between tokens are called whitespace; these include spaces,
	tabs, newlines, and formfeeds. Many people also count comments as whitespace,
	though since some tools such as lint look at comments, this method is
	not perfect.
	</li>
	</ul></div>
	<a name="spirit.lex.lexer_introduction.why_use_a_separate_lexer_"></a><h5>
	<a name="id1115090"></a>
	<a class="link" href="lexer_introduction.html#spirit.lex.lexer_introduction.why_use_a_separate_lexer_">Why
	Use a Separate Lexer?</a>
	</h5>
	<p>
	Typically, lexical scanning is done in a separate module from the parser,
	feeding the parser with a stream of input tokens only. Theoretically it is
	not necessary implement this separation as in the end there is only one set
	of syntactical rules defining the language, so in theory we could write the
	whole parser in one module. In fact, <span class="emphasis"><em>Spirit.Qi</em></span> allows
	you to write parsers without using a lexer, parsing the input character stream
	directly, and for the most part this is the way <a href="http://boost-spirit.com" target="_top">Spirit</a>
	has been used since its invention.
	</p>
	<p>
	However, this separation has both practical and theoretical basis, and proves
	to be very useful in practical applications. In 1956, Noam Chomsky defined
	the "Chomsky Hierarchy" of grammars:
	</p>
	<div class="itemizedlist"><ul class="itemizedlist" type="disc">
	<li class="listitem">
	Type 0: Unrestricted grammars (e.g., natural languages)
	</li>
	<li class="listitem">
	Type 1: Context-Sensitive grammars
	</li>
	<li class="listitem">
	Type 2: Context-Free grammars
	</li>
	<li class="listitem">
	Type 3: Regular grammars
	</li>
	</ul></div>
	<p>
	The complexity of these grammars increases from regular grammars being the
	simplest to unrestricted grammars being the most complex. Similarly, the
	complexity of pattern recognition for these grammars increases. Although,
	a few features of some programming languages (such as C++) are Type 1, fortunately
	for the most part programming languages can be described using only the Types
	2 and 3. The neat part about these two types is that they are well known
	and the ways to parse them are well understood. It has been shown that any
	regular grammar can be parsed using a state machine (finite automaton). Similarly,
	context-free grammars can always be parsed using a push-down automaton (essentially
	a state machine augmented by a stack).
	</p>
	<p>
	In real programming languages and practical grammars, the parts that can
	be handled as regular expressions tend to be the lower-level pieces, such
	as the definition of an identifier or of an integer value:
	</p>
	<pre class="programlisting"><span class="identifier">letter</span> <span class="special">:=</span> <span class="special">[</span><span class="identifier">a</span><span class="special">-</span><span class="identifier">zA</span><span class="special">-</span><span class="identifier">Z</span><span class="special">]</span>
	<span class="identifier">digit</span> <span class="special">:=</span> <span class="special">[</span><span class="number">0</span><span class="special">-</span><span class="number">9</span><span class="special">]</span>

	<span class="identifier">identifier</span> <span class="special">:=</span> <span class="identifier">letter</span> <span class="special">[</span> <span class="identifier">letter</span> <span class="special">\|</span> <span class="identifier">digit</span> <span class="special">]*</span>
	<span class="identifier">integer</span> <span class="special">:=</span> <span class="identifier">digit</span><span class="special">+</span>
	</pre>
	<p>
	Higher level parts of practical grammars tend to be more complex and can't
	be implemented using plain regular expressions. We need to store information
	on the built-in hardware stack while recursing the grammar hierarchy, and
	that is the preferred approach used for top-down parsing. Since it takes
	a different kind of abstract machine to parse the two types of grammars,
	it proved to be efficient to separate the lexical scanner into a separate
	module which is built around the idea of a state machine. The goal here is
	to use the simplest parsing technique needed for the job.
	</p>
	<p>
	Another, more practical, reason for separating the scanner from the parser
	is the need for backtracking during parsing. The input data is a stream of
	characters, which is often thought to be processed left to right without
	any backtracking. Unfortunately, in practice most of the time that isn't
	possible. Almost every language has certain keywords such as IF, FOR, and
	WHILE. The decision if a certain character sequence actually comprises a
	keyword or just an identifier often can be made only after seeing the first
	delimiter <span class="emphasis"><em>after</em></span> it. In fact, this makes the process
	backtracking, since we need to store the string long enough to be able to
	make the decision. The same is true for more coarse grained language features
	such as nested IF/ELSE statements, where the decision about to which IF belongs
	the last ELSE statement can be made only after seeing the whole construct.
	</p>
	<p>
	So the structure of a conventional compiler often involves splitting up the
	functions of the lower-level and higher-level parsing. The lexical scanner
	deals with things at the character level, collecting characters into strings,
	converting character sequence into different representations as integers,
	etc., and passing them along to the parser proper as indivisible tokens.
	It's also considered normal to let the scanner do additional jobs, such as
	identifying keywords, storing identifiers in tables, etc.
	</p>
	<p>
	Now, <a href="http://boost-spirit.com" target="_top">Spirit</a> follows this structure,
	where <span class="emphasis"><em>Spirit.Lex</em></span> can be used to implement state machine
	based recognizers, while <span class="emphasis"><em>Spirit.Qi</em></span> can be used to build
	recognizers for context free grammars. Since both modules are seamlessly
	integrated with each other and with the C++ target language it is even possible
	to use the provided functionality to build more complex grammar recognizers.
	</p>
	<a name="spirit.lex.lexer_introduction.advantages_of_using__emphasis_spirit_lex__emphasis_"></a><h5>
	<a name="id1115314"></a>
	<a class="link" href="lexer_introduction.html#spirit.lex.lexer_introduction.advantages_of_using__emphasis_spirit_lex__emphasis_">Advantages
	of using <span class="emphasis"><em>Spirit.Lex</em></span></a>
	</h5>
	<p>
	The advantage of using <span class="emphasis"><em>Spirit.Lex</em></span> to create the lexical
	analyzer over using more traditional tools such as <a href="http://flex.sourceforge.net/" target="_top">Flex</a>
	is its carefully crafted integration with the <a href="http://boost-spirit.com" target="_top">Spirit</a>
	library and the C++ host language. You don't need any external tools to generate
	the code, your lexer will be perfectly integrated with the rest of your program,
	making it possible to freely access any context information and data structure.
	Since the C++ compiler sees all the code, it will generate optimal code no
	matter what configuration options have been chosen by the user. <span class="emphasis"><em>Spirit.Lex</em></span>
	gives you the vast majority of features you could get from a similar <a href="http://flex.sourceforge.net/" target="_top">Flex</a> program without the need
	to leave C++ as a host language:
	</p>
	<div class="itemizedlist"><ul class="itemizedlist" type="disc">
	<li class="listitem">
	The definition of tokens is done using regular expressions (patterns)
	</li>
	<li class="listitem">
	The token definitions can refer to special substitution strings (pattern
	macros) simplifying pattern definitions
	</li>
	<li class="listitem">
	The generated lexical scanner may have multiple start states
	</li>
	<li class="listitem">
	It is possible to attach code to any of the token definitions; this code
	gets executed whenever the corresponding token pattern has been matched
	</li>
	</ul></div>
	<p>
	Even if it is possible to use <span class="emphasis"><em>Spirit.Lex</em></span> to generate
	C++ code representing the lexical analyzer (we will refer to that as the
	<span class="emphasis"><em>static</em></span> model, described in more detail in the section
	<a class="link" href="abstracts/lexer_static_model.html" title="The Static Lexer Model">The <span class="emphasis"><em>Static</em></span>
	Model</a>) - a model very similar to the way <a href="http://flex.sourceforge.net/" target="_top">Flex</a>
	operates - we will mainly focus on the opposite, the <span class="emphasis"><em>dynamic</em></span>
	model. You can directly integrate the token definitions into your C++ program,
	building the lexical analyzer dynamically at runtime. The dynamic model is
	something not supported by <a href="http://flex.sourceforge.net/" target="_top">Flex</a>
	or other lexical scanner generators (such as <a href="http://re2c.sourceforge.net/" target="_top">re2c</a>,
	<a href="http://www.cs.queensu.ca/~thurston/ragel/" target="_top">Ragel</a>, etc.).
	This dynamic flexibility allows you to speed up the development of your application.
	</p>
	<a name="spirit.lex.lexer_introduction.the_library_structure_of__emphasis_spirit_lex__emphasis_"></a><h5>
	<a name="id1115432"></a>
	<a class="link" href="lexer_introduction.html#spirit.lex.lexer_introduction.the_library_structure_of__emphasis_spirit_lex__emphasis_">The
	Library Structure of <span class="emphasis"><em>Spirit.Lex</em></span></a>
	</h5>
	<p>
	The <a class="link" href="lexer_introduction.html#spirit.lexerflow" title="Figure 6. The Library structure and Common Flow of Information while using Spirit.Lex in an application">figure</a> below shows a high level
	overview of how the <span class="emphasis"><em>Spirit.Lex</em></span> library might be used
	in an application. <span class="emphasis"><em>Spirit.Lex</em></span> allows to create lexical
	analyzers based on patterns. These patterns are regular expression based
	rules used to define the different tokens to be recognized in the character
	input sequence. The input sequence is expected to be provided to the lexical
	analyzer as an arbitrary standard forward iterator. The lexical analyzer
	itself exposes a standard forward iterator as well. The difference here is
	that the exposed iterator provides access to the token sequence instead of
	to the character sequence. The tokens in this sequence are constructed on
	the fly by analyzing the underlying character sequence and matching this
	to the patterns as defined by the application.
	</p>
	<p>
	</p>
	<div class="figure">
	<a name="spirit.lexerflow"></a><p class="title"><b>Figure 6. The Library structure and Common Flow of Information while
	using <span class="emphasis"><em>Spirit.Lex</em></span> in an application</b></p>
	<div class="figure-contents"><span class="inlinemediaobject"><img src="../.././images/lexerflow.png" alt="The Library structure and Common Flow of Information while using Spirit.Lex in an application"></span></div>
	</div>
	<p><br class="figure-break">
	</p>
	</div>
	<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
	<td align="left"></td>
	<td align="right"><div class="copyright-footer">Copyright © 2001-2010 Joel de Guzman, Hartmut Kaiser<p>
	Distributed under the Boost Software License, Version 1.0. (See accompanying
	file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
	</p>
	</div></td>
	</tr></table>
	<hr>
	<div class="spirit-nav">
	<a accesskey="p" href="../lex.html"><img src="../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../lex.html"><img src="../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="tutorials.html"><img src="../../../../../../doc/src/images/next.png" alt="Next"></a>
	</div>
	</body>
	</html>