 //
 //  Copyright (c) 2009-2011 Artyom Beilis (Tonkikh)
 //
 //  Distributed under the Boost Software License, Version 1.0. (See
 //  accompanying file LICENSE_1_0.txt or copy at
 //  http://www.boost.org/LICENSE_1_0.txt)
 //

 // vim: tabstop=4 expandtab shiftwidth=4 softtabstop=4 filetype=cpp.doxygen
 /*!
 \page std_locales Introduction to C++ Standard Library localization support

 \section std_locales_basics Getting familiar with standard C++ Locales

 The C++ standard library offers a simple and powerful way to provide locale-specific information. It is done via the \c
 std::locale class, the container that holds all the required information about a specific culture, such as number formatting
 patterns, date and time formatting, currency, case conversion etc.

 All this information is provided by facets, special classes derived from the \c std::locale::facet base class. Such facets are
 packed into the \c std::locale class and allow you to provide arbitrary information about the locale. The \c std::locale class
 keeps reference counters on installed facets and can be efficiently copied.

 Each facet that was installed into the \c std::locale object can be fetched using the \c std::use_facet function. For example,
 the \c std::ctype<Char> facet provides rules for case conversion, so you can convert a character to upper-case like this:

 \code
 std::ctype<char> const &ctype_facet = std::use_facet<std::ctype<char> >(some_locale);
 char upper_a = ctype_facet.toupper('a');
 \endcode
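
As a slightly fuller illustration of the snippet above, the following sketch upper-cases a whole string through the \c std::ctype<char> facet. The helper name \c to_upper is hypothetical, and the sketch only handles single-byte characters; it falls back to the classic "C" locale so it does not depend on which named locales are installed.

```cpp
#include <locale>
#include <string>

// Upper-case every character of `text` using the std::ctype<char> facet
// installed in `loc`. `to_upper` is a hypothetical helper name.
std::string to_upper(std::string text,
                     std::locale const &loc = std::locale::classic())
{
    std::ctype<char> const &ctype_facet = std::use_facet<std::ctype<char> >(loc);
    // The range overload of toupper converts the buffer in place.
    if (!text.empty())
        ctype_facet.toupper(&text[0], &text[0] + text.size());
    return text;
}
```

Calling \c to_upper("hello") with the default argument should return "HELLO"; passing another locale object selects that locale's case-conversion rules instead.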

 A locale object can be imbued into an \c iostream so that the stream formats information according to the locale:

 \code
 std::cout.imbue(std::locale("en_US.UTF-8"));
 std::cout << 1345.45 << std::endl;
 std::cout.imbue(std::locale("ru_RU.UTF-8"));
 std::cout << 1345.45 << std::endl;
 \endcode

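 On a system where both locales are installed, this would display something like the following (the Russian locale
 groups digits with a space and uses a comma as the decimal point):

 \verbatim
     1,345.45
     1 345,45
 \endverbatim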
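
Named locales such as \c en_US.UTF-8 are not installed everywhere, so the effect of \c imbue can also be sketched without them. The example below is an illustration under assumptions: \c dot_grouping is a hypothetical facet derived from \c std::numpunct<char>, and \c format_number is a hypothetical helper. It shows that the same value is formatted differently depending on the locale imbued into the stream.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: '.' separates groups of digits, ',' is the decimal point.
class dot_grouping : public std::numpunct<char> {
protected:
    char do_decimal_point() const override { return ','; }
    char do_thousands_sep() const override { return '.'; }
    std::string do_grouping() const override { return "\3"; } // groups of three digits
};

// Format `value` with a string stream imbued with `loc`.
std::string format_number(double value, std::locale const &loc)
{
    std::ostringstream out;
    out.imbue(loc);
    out << value;
    return out.str();
}
```

With <tt>std::locale loc(std::locale::classic(), new dot_grouping);</tt>, the call \c format_number(1345.45, loc) should produce "1.345,45", while the classic locale produces "1345.45".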

 You can also create your own facets and install them into existing locale objects. For example:

 \code
     class measure : public std::locale::facet {
     public:
         // Every facet type needs a static id so that std::use_facet can find it
         static std::locale::id id;
         typedef enum { inches, ... } measure_type;
         measure(measure_type m,size_t refs=0);
         double from_metric(double value) const;
         std::string name() const;
         ...
     };
 \endcode
 And now you can simply provide this information to a locale:

 \code
     // Create the global locale from the en_US locale and add the measure facet.
     std::locale::global(std::locale(std::locale("en_US.UTF-8"),new measure(measure::inches)));
 \endcode


 Now you can print a distance according to the correct locale:

 \code
     void print_distance(std::ostream &out,double value)
     {
         // Fetch the measure facet from the stream's locale
         measure const &m = std::use_facet<measure>(out.getloc());
         out << m.from_metric(value) << " " << m.name();
     }
 \endcode

 This technique was adopted by the Boost.Locale library in order to provide powerful and correct localization. Instead of using
 the very limited C++ standard library facets, it uses ICU under the hood to create its own much more powerful ones.

 \section std_locales_common Common Critical Problems with the Standard Library

 Several issues in the standard library's localization support prevent the use of its full power:

 -   Setting the global locale has bad side effects.
     \n
     Consider the following code:
     \n
     \code
         #include <fstream>
         #include <locale>

         int main()
         {
             // Set system's default locale as global
             std::locale::global(std::locale(""));
             std::ofstream csv("test.csv");
             csv << 1.1 << ","  << 1.3 << std::endl;
         }
     \endcode
     \n
     What would be the content of \c test.csv ? It may be "1.1,1.3", as you probably expected, or it may be
     "1,1,1,3" if the system locale uses a comma as the decimal point.
     \n
     Moreover, because \c std::locale::global also sets the C library locale when the new locale has a name,
     it affects even \c printf and libraries like \c boost::lexical_cast, giving incorrect or unexpected
     formatting. In fact, many third-party libraries break in such a situation.
     \n
     Unlike the standard localization library, Boost.Locale never changes the basic number formatting,
     even when it uses \c std based localization backends, so by default numbers are always
     formatted using the C locale. Localized number formatting requires specific flags.
     \n
 -   Number formatting is broken on some locales.
     \n
     Some locales use the non-breaking space character U+00A0 as the thousands separator, so
     in the \c ru_RU.UTF-8 locale the number 1024 should be displayed as "1 024", where the space
     is the Unicode character with code point U+00A0. Unfortunately, many libraries don't handle
     this correctly. For example, GCC and SunStudio output only "\xC2", the first byte of
     the UTF-8 sequence "\xC2\xA0" that represents this code point, and
     thus generate invalid UTF-8.
     \n
 -   Locale names are not standardized. For example, under MSVC you need to provide the name
     \c en-US or \c English_USA.1252 , whereas on POSIX platforms it would be \c en_US.UTF-8
     or \c en_US.ISO-8859-1 .
     \n
     Moreover, MSVC does not support UTF-8 locales at all.
     \n
 -   Many standard libraries provide only the C and POSIX locales. GCC, for example, supports localization
     only under Linux; on all other platforms, attempting to create locales other than "C" or
     "POSIX" would fail.

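One common way to guard against the global-locale problem from the first item above, using only the standard library, is to imbue machine-readable streams with the classic "C" locale explicitly. The sketch below is an illustration under assumptions: \c comma_decimal is a hypothetical facet that stands in for a system locale with a comma as the decimal point, and \c csv_row is a hypothetical helper.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet that mimics a locale using ',' as the decimal point.
class comma_decimal : public std::numpunct<char> {
protected:
    char do_decimal_point() const override { return ','; }
};

// Write two values as one CSV row, independent of the global locale.
std::string csv_row(double a, double b)
{
    std::ostringstream csv;            // starts with a copy of the global locale
    csv.imbue(std::locale::classic()); // force "C" formatting for machine-readable output
    csv << a << "," << b;
    return csv.str();
}
```

If the global locale is set with <tt>std::locale::global(std::locale(std::locale::classic(), new comma_decimal));</tt>, a plain \c std::ostringstream created afterwards formats 1.1 and 1.3 as "1,1,1,3", while \c csv_row(1.1, 1.3) should still return "1.1,1.3". This protects only the streams that are imbued explicitly; it does not help with \c printf or with third-party code that relies on the global locale.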
 */