| <html lang="en"> |
| <head> |
| <title>The message catalog files - The GNU C Library</title> |
| <meta http-equiv="Content-Type" content="text/html"> |
| <meta name="description" content="The GNU C Library"> |
| <meta name="generator" content="makeinfo 4.13"> |
| <link title="Top" rel="start" href="index.html#Top"> |
| <link rel="up" href="Message-catalogs-a-la-X_002fOpen.html#Message-catalogs-a-la-X_002fOpen" title="Message catalogs a la X/Open"> |
| <link rel="prev" href="The-catgets-Functions.html#The-catgets-Functions" title="The catgets Functions"> |
| <link rel="next" href="The-gencat-program.html#The-gencat-program" title="The gencat program"> |
| <link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage"> |
| <!-- |
| This file documents the GNU C library. |
| |
| This is Edition 0.12, last updated 2007-10-27, |
| of `The GNU C Library Reference Manual', for version |
| 2.8 (Sourcery G++ Lite 2011.03-41). |
| |
| Copyright (C) 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2001, 2002, |
| 2003, 2007, 2008, 2010 Free Software Foundation, Inc. |
| |
| Permission is granted to copy, distribute and/or modify this document |
| under the terms of the GNU Free Documentation License, Version 1.3 or |
| any later version published by the Free Software Foundation; with the |
| Invariant Sections being ``Free Software Needs Free Documentation'' |
| and ``GNU Lesser General Public License'', the Front-Cover texts being |
| ``A GNU Manual'', and with the Back-Cover Texts as in (a) below. A |
| copy of the license is included in the section entitled "GNU Free |
| Documentation License". |
| |
| (a) The FSF's Back-Cover Text is: ``You have the freedom to |
| copy and modify this GNU manual. Buying copies from the FSF |
| supports it in developing GNU and promoting software freedom.''--> |
| <meta http-equiv="Content-Style-Type" content="text/css"> |
| <style type="text/css"><!-- |
| pre.display { font-family:inherit } |
| pre.format { font-family:inherit } |
| pre.smalldisplay { font-family:inherit; font-size:smaller } |
| pre.smallformat { font-family:inherit; font-size:smaller } |
| pre.smallexample { font-size:smaller } |
| pre.smalllisp { font-size:smaller } |
| span.sc { font-variant:small-caps } |
| span.roman { font-family:serif; font-weight:normal; } |
| span.sansserif { font-family:sans-serif; font-weight:normal; } |
| --></style> |
| <link rel="stylesheet" type="text/css" href="../cs.css"> |
| </head> |
| <body> |
| <div class="node"> |
| <a name="The-message-catalog-files"></a> |
| <p> |
| Next: <a rel="next" accesskey="n" href="The-gencat-program.html#The-gencat-program">The gencat program</a>, |
| Previous: <a rel="previous" accesskey="p" href="The-catgets-Functions.html#The-catgets-Functions">The catgets Functions</a>, |
| Up: <a rel="up" accesskey="u" href="Message-catalogs-a-la-X_002fOpen.html#Message-catalogs-a-la-X_002fOpen">Message catalogs a la X/Open</a> |
| <hr> |
| </div> |
| |
| <h4 class="subsection">8.1.2 Format of the message catalog files</h4> |
| |
| <p>The only reasonable way the translate all the messages of a function and |
| store the result in a message catalog file which can be read by the |
| <code>catopen</code> function is to write all the message text to the |
| translator and let her/him translate them all. I.e., we must have a |
| file with entries which associate the set/message tuple with a specific |
| translation. This file format is specified in the X/Open standard and |
| is as follows: |
| |
| <ul> |
| <li>Lines containing only whitespace characters or empty lines are ignored. |
| |
| <li>Lines which contain as the first non-whitespace character a <code>$</code> |
| followed by a whitespace character are comment and are also ignored. |
| |
| <li>If a line contains as the first non-whitespace characters the sequence |
| <code>$set</code> followed by a whitespace character an additional argument |
| is required to follow. This argument can either be: |
| |
| <ul> |
| <li>a number. In this case the value of this number determines the set |
| to which the following messages are added. |
| |
| <li>an identifier consisting of alphanumeric characters plus the underscore |
| character. In this case the set get automatically a number assigned. |
| This value is one added to the largest set number which so far appeared. |
| |
| <p>How to use the symbolic names is explained in section <a href="Common-Usage.html#Common-Usage">Common Usage</a>. |
| |
| <p>It is an error if a symbol name appears more than once. All following |
| messages are placed in a set with this number. |
| </ul> |
| |
| <li>If a line contains as the first non-whitespace characters the sequence |
| <code>$delset</code> followed by a whitespace character an additional argument |
| is required to follow. This argument can either be: |
| |
| <ul> |
| <li>a number. In this case the value of this number determines the set |
| which will be deleted. |
| |
| <li>an identifier consisting of alphanumeric characters plus the underscore |
| character. This symbolic identifier must match a name for a set which |
| previously was defined. It is an error if the name is unknown. |
| </ul> |
| |
| <p>In both cases all messages in the specified set will be removed. They |
| will not appear in the output. But if this set is later again selected |
| with a <code>$set</code> command again messages could be added and these |
| messages will appear in the output. |
| |
| <li>If a line contains after leading whitespaces the sequence |
| <code>$quote</code>, the quoting character used for this input file is |
| changed to the first non-whitespace character following the |
| <code>$quote</code>. If no non-whitespace character is present before the |
| line ends quoting is disable. |
| |
| <p>By default no quoting character is used. In this mode strings are |
| terminated with the first unescaped line break. If there is a |
| <code>$quote</code> sequence present newline need not be escaped. Instead a |
| string is terminated with the first unescaped appearance of the quote |
| character. |
| |
| <p>A common usage of this feature would be to set the quote character to |
| <code>"</code>. Then any appearance of the <code>"</code> in the strings must |
| be escaped using the backslash (i.e., <code>\"</code> must be written). |
| |
| <li>Any other line must start with a number or an alphanumeric identifier |
| (with the underscore character included). The following characters |
| (starting after the first whitespace character) will form the string |
| which gets associated with the currently selected set and the message |
| number represented by the number and identifier respectively. |
| |
| <p>If the start of the line is a number the message number is obvious. It |
| is an error if the same message number already appeared for this set. |
| |
| <p>If the leading token was an identifier the message number gets |
| automatically assigned. The value is the current maximum messages |
| number for this set plus one. It is an error if the identifier was |
| already used for a message in this set. It is OK to reuse the |
| identifier for a message in another thread. How to use the symbolic |
| identifiers will be explained below (see <a href="Common-Usage.html#Common-Usage">Common Usage</a>). There is |
| one limitation with the identifier: it must not be <code>Set</code>. The |
| reason will be explained below. |
| |
| <p>The text of the messages can contain escape characters. The usual bunch |
| of characters known from the ISO C<!-- /@w --> language are recognized |
| (<code>\n</code>, <code>\t</code>, <code>\v</code>, <code>\b</code>, <code>\r</code>, <code>\f</code>, |
| <code>\\</code>, and <code>\</code><var>nnn</var>, where <var>nnn</var> is the octal coding of |
| a character code). |
| </ul> |
| |
| <p><strong>Important:</strong> The handling of identifiers instead of numbers for |
| the set and messages is a GNU extension. Systems strictly following the |
| X/Open specification do not have this feature. An example for a message |
| catalog file is this: |
| |
| <pre class="smallexample"> $ This is a leading comment. |
| $quote " |
| |
| $set SetOne |
| 1 Message with ID 1. |
| two " Message with ID \"two\", which gets the value 2 assigned" |
| |
| $set SetTwo |
| $ Since the last set got the number 1 assigned this set has number 2. |
| 4000 "The numbers can be arbitrary, they need not start at one." |
| </pre> |
| <p>This small example shows various aspects: |
| <ul> |
| <li>Lines 1 and 9 are comments since they start with <code>$</code> followed by |
| a whitespace. |
| <li>The quoting character is set to <code>"</code>. Otherwise the quotes in the |
| message definition would have to be left away and in this case the |
| message with the identifier <code>two</code> would loose its leading whitespace. |
| <li>Mixing numbered messages with message having symbolic names is no |
| problem and the numbering happens automatically. |
| </ul> |
| |
| <p>While this file format is pretty easy it is not the best possible for |
| use in a running program. The <code>catopen</code> function would have to |
| parser the file and handle syntactic errors gracefully. This is not so |
| easy and the whole process is pretty slow. Therefore the <code>catgets</code> |
| functions expect the data in another more compact and ready-to-use file |
| format. There is a special program <code>gencat</code> which is explained in |
| detail in the next section. |
| |
| <p>Files in this other format are not human readable. To be easy to use by |
| programs it is a binary file. But the format is byte order independent |
| so translation files can be shared by systems of arbitrary architecture |
| (as long as they use the GNU C Library). |
| |
| <p>Details about the binary file format are not important to know since |
| these files are always created by the <code>gencat</code> program. The |
| sources of the GNU C Library also provide the sources for the |
| <code>gencat</code> program and so the interested reader can look through |
| these source files to learn about the file format. |
| |
| </body></html> |
| |