| <html lang="en"> |
| <head> |
| <title>Low-Level Time String Parsing - The GNU C Library</title> |
| <meta http-equiv="Content-Type" content="text/html"> |
| <meta name="description" content="The GNU C Library"> |
| <meta name="generator" content="makeinfo 4.13"> |
| <link title="Top" rel="start" href="index.html#Top"> |
| <link rel="up" href="Parsing-Date-and-Time.html#Parsing-Date-and-Time" title="Parsing Date and Time"> |
| <link rel="next" href="General-Time-String-Parsing.html#General-Time-String-Parsing" title="General Time String Parsing"> |
| <link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage"> |
| <!-- |
| This file documents the GNU C library. |
| |
| This is Edition 0.12, last updated 2007-10-27, |
| of `The GNU C Library Reference Manual', for version |
| 2.8 (Sourcery G++ Lite 2011.03-41). |
| |
| Copyright (C) 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2001, 2002, |
| 2003, 2007, 2008, 2010 Free Software Foundation, Inc. |
| |
| Permission is granted to copy, distribute and/or modify this document |
| under the terms of the GNU Free Documentation License, Version 1.3 or |
| any later version published by the Free Software Foundation; with the |
| Invariant Sections being ``Free Software Needs Free Documentation'' |
| and ``GNU Lesser General Public License'', the Front-Cover texts being |
| ``A GNU Manual'', and with the Back-Cover Texts as in (a) below. A |
| copy of the license is included in the section entitled "GNU Free |
| Documentation License". |
| |
| (a) The FSF's Back-Cover Text is: ``You have the freedom to |
| copy and modify this GNU manual. Buying copies from the FSF |
| supports it in developing GNU and promoting software freedom.''--> |
| <meta http-equiv="Content-Style-Type" content="text/css"> |
| <style type="text/css"><!-- |
| pre.display { font-family:inherit } |
| pre.format { font-family:inherit } |
| pre.smalldisplay { font-family:inherit; font-size:smaller } |
| pre.smallformat { font-family:inherit; font-size:smaller } |
| pre.smallexample { font-size:smaller } |
| pre.smalllisp { font-size:smaller } |
| span.sc { font-variant:small-caps } |
| span.roman { font-family:serif; font-weight:normal; } |
| span.sansserif { font-family:sans-serif; font-weight:normal; } |
| --></style> |
| <link rel="stylesheet" type="text/css" href="../cs.css"> |
| </head> |
| <body> |
| <div class="node"> |
| <a name="Low-Level-Time-String-Parsing"></a> |
| <a name="Low_002dLevel-Time-String-Parsing"></a> |
| </div> |
| |
| <h5 class="subsubsection">21.4.6.1 Interpret string according to given format</h5> |
| |
| <p>The first function is rather low-level. It is nevertheless frequently |
| used in software since it is better known. Its interface and |
| implementation are heavily influenced by the <code>getdate</code> function, |
| which is defined and implemented in terms of calls to <code>strptime</code>. |
| |
| <!-- time.h --> |
| <!-- XPG4 --> |
| <div class="defun"> |
| — Function: char * <b>strptime</b> (<var>const char *s, const char *fmt, struct tm *tp</var>)<var><a name="index-strptime-2663"></a></var><br> |
| <blockquote><p>The <code>strptime</code> function parses the input string <var>s</var> according |
| to the format string <var>fmt</var> and stores its results in the |
| structure <var>tp</var>. |
| |
| <p>The input string could be generated by a <code>strftime</code> call or |
| obtained any other way. It does not need to be in a human-recognizable |
| format; e.g. a date passed as <code>"02:1999:9"</code> is acceptable, even |
| though it is ambiguous without context. As long as the format string |
| <var>fmt</var> matches the input string the function will succeed. |
| |
| <p>The user has to make sure, though, that the input can be parsed in an |
| unambiguous way. The string <code>"1999112"</code> can be parsed using the |
| format <code>"%Y%m%d"</code> as 1999-1-12, 1999-11-2, or even 19991-1-2. It |
| is necessary to add appropriate separators to reliably get results. |
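| <p>As a minimal sketch of the separator advice above, the following helper parses a |
| date whose fields are divided by hyphens, so each field has exactly one possible |
| reading. The name <code>parse_separated_date</code> and its return convention are |
| illustrative assumptions, not part of the library; the feature test macro |
| <code>_XOPEN_SOURCE</code> is defined so that <code>time.h</code> declares <code>strptime</code>. |

```c
#define _XOPEN_SOURCE 700   /* Ask time.h to declare strptime.  */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: parse a date written as YEAR-MONTH-DAY.
   Returns 0 if the whole string matched, -1 otherwise.  */
int
parse_separated_date (const char *s, struct tm *tp)
{
  assert (s != NULL && tp != NULL);

  /* Start from a cleared structure.  */
  memset (tp, 0, sizeof *tp);

  const char *end = strptime (s, "%Y-%m-%d", tp);

  /* Succeed only if strptime matched and consumed all of the input.  */
  return (end != NULL && *end == '\0') ? 0 : -1;
}
```

| <p>With the hyphens in place, <code>"1999-1-12"</code> and <code>"1999-11-2"</code> are |
| distinct inputs and each yields a single month and day. |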
| |
| <p>The format string consists of the same components as the format string |
| of the <code>strftime</code> function. The only difference is that the flags |
| <code>_</code>, <code>-</code>, <code>0</code>, and <code>^</code> are not allowed. |
| <!-- Is this really the intention? -drepper --> |
| Several of the distinct formats of <code>strftime</code> do the same work in |
| <code>strptime</code> since differences like case of the input do not matter. |
| For reasons of symmetry all formats are supported, though. |
| |
| <p>The modifiers <code>E</code> and <code>O</code> are also allowed everywhere the |
| <code>strftime</code> function allows them. |
| |
| <p>The formats are: |
| |
| <dl> |
| <dt><code>%a</code><dt><code>%A</code><dd>The weekday name according to the current locale, in abbreviated form or |
| the full name. |
| |
| <br><dt><code>%b</code><dt><code>%B</code><dt><code>%h</code><dd>The month name according to the current locale, in abbreviated form or |
| the full name. |
| |
| <br><dt><code>%c</code><dd>The date and time representation for the current locale. |
| |
| <br><dt><code>%Ec</code><dd>Like <code>%c</code> but the locale's alternative date and time format is used. |
| |
| <br><dt><code>%C</code><dd>The century of the year. |
| |
| <p>It makes sense to use this format only if the format string also |
| contains the <code>%y</code> format. |
| |
| <br><dt><code>%EC</code><dd>The locale's representation of the period. |
| |
| <p>Unlike <code>%C</code> it sometimes makes sense to use this format since some |
| cultures represent years relative to the beginning of eras instead of |
| using the Gregorian years. |
| |
| <br><dt><code>%d</code><br><dt><code>%e</code><dd>The day of the month as a decimal number (range <code>1</code> through <code>31</code>). |
| Leading zeroes are permitted but not required. |
| |
| <br><dt><code>%Od</code><dt><code>%Oe</code><dd>Same as <code>%d</code> but using the locale's alternative numeric symbols. |
| |
| <p>Leading zeroes are permitted but not required. |
| |
| <br><dt><code>%D</code><dd>Equivalent to <code>%m/%d/%y</code>. |
| |
| <br><dt><code>%F</code><dd>Equivalent to <code>%Y-%m-%d</code>, which is the ISO 8601<!-- /@w --> date |
| format. |
| |
| <p>This is a GNU extension following an ISO C99<!-- /@w --> extension to |
| <code>strftime</code>. |
| |
| <br><dt><code>%g</code><dd>The year corresponding to the ISO week number, but without the century |
| (range <code>00</code> through <code>99</code>). |
| |
| <p><em>Note:</em> Currently, this is not fully implemented. The format is |
| recognized, input is consumed but no field in <var>tm</var> is set. |
| |
| <p>This format is a GNU extension following a GNU extension of <code>strftime</code>. |
| |
| <br><dt><code>%G</code><dd>The year corresponding to the ISO week number. |
| |
| <p><em>Note:</em> Currently, this is not fully implemented. The format is |
| recognized, input is consumed but no field in <var>tm</var> is set. |
| |
| <p>This format is a GNU extension following a GNU extension of <code>strftime</code>. |
| |
| <br><dt><code>%H</code><dt><code>%k</code><dd>The hour as a decimal number, using a 24-hour clock (range <code>00</code> through |
| <code>23</code>). |
| |
| <p><code>%k</code> is a GNU extension following a GNU extension of <code>strftime</code>. |
| |
| <br><dt><code>%OH</code><dd>Same as <code>%H</code> but using the locale's alternative numeric symbols. |
| |
| <br><dt><code>%I</code><dt><code>%l</code><dd>The hour as a decimal number, using a 12-hour clock (range <code>01</code> through |
| <code>12</code>). |
| |
| <p><code>%l</code> is a GNU extension following a GNU extension of <code>strftime</code>. |
| |
| <br><dt><code>%OI</code><dd>Same as <code>%I</code> but using the locale's alternative numeric symbols. |
| |
| <br><dt><code>%j</code><dd>The day of the year as a decimal number (range <code>1</code> through <code>366</code>). |
| |
| <p>Leading zeroes are permitted but not required. |
| |
| <br><dt><code>%m</code><dd>The month as a decimal number (range <code>1</code> through <code>12</code>). |
| |
| <p>Leading zeroes are permitted but not required. |
| |
| <br><dt><code>%Om</code><dd>Same as <code>%m</code> but using the locale's alternative numeric symbols. |
| |
| <br><dt><code>%M</code><dd>The minute as a decimal number (range <code>0</code> through <code>59</code>). |
| |
| <p>Leading zeroes are permitted but not required. |
| |
| <br><dt><code>%OM</code><dd>Same as <code>%M</code> but using the locale's alternative numeric symbols. |
| |
| <br><dt><code>%n</code><dt><code>%t</code><dd>Matches any white space. |
| |
| <br><dt><code>%p</code><br><dt><code>%P</code><dd>The locale-dependent equivalent to ‘<samp><span class="samp">AM</span></samp>’ or ‘<samp><span class="samp">PM</span></samp>’. |
| |
| <p>This format is not useful unless <code>%I</code> or <code>%l</code> is also used. |
| Another complication is that the locale might not define these values at |
| all and therefore the conversion fails. |
| |
| <p><code>%P</code> is a GNU extension following a GNU extension to <code>strftime</code>. |
| |
| <br><dt><code>%r</code><dd>The complete time using the AM/PM format of the current locale. |
| |
| <p>A complication is that the locale might not define this format at all |
| and therefore the conversion fails. |
| |
| <br><dt><code>%R</code><dd>The hour and minute in decimal numbers using the format <code>%H:%M</code>. |
| |
| <p><code>%R</code> is a GNU extension following a GNU extension to <code>strftime</code>. |
| |
| <br><dt><code>%s</code><dd>The number of seconds since the epoch, i.e., since 1970-01-01 00:00:00 UTC. |
| Leap seconds are not counted unless leap second support is available. |
| |
| <p><code>%s</code> is a GNU extension following a GNU extension to <code>strftime</code>. |
| |
| <br><dt><code>%S</code><dd>The seconds as a decimal number (range <code>0</code> through <code>60</code>). |
| |
| <p>Leading zeroes are permitted but not required. |
| |
| <p><strong>NB:</strong> The Unix specification says the upper bound on this value |
| is <code>61</code>, a result of a decision to allow double leap seconds. You |
| will not see the value <code>61</code> because no minute has more than one |
| leap second, but the myth persists. |
| |
| <br><dt><code>%OS</code><dd>Same as <code>%S</code> but using the locale's alternative numeric symbols. |
| |
| <br><dt><code>%T</code><dd>Equivalent to the use of <code>%H:%M:%S</code> in this place. |
| |
| <br><dt><code>%u</code><dd>The day of the week as a decimal number (range <code>1</code> through |
| <code>7</code>), Monday being <code>1</code>. |
| |
| <p>Leading zeroes are permitted but not required. |
| |
| <p><em>Note:</em> Currently, this is not fully implemented. The format is |
| recognized, input is consumed but no field in <var>tm</var> is set. |
| |
| <br><dt><code>%U</code><dd>The week number of the current year as a decimal number (range <code>0</code> |
| through <code>53</code>). |
| |
| <p>Leading zeroes are permitted but not required. |
| |
| <br><dt><code>%OU</code><dd>Same as <code>%U</code> but using the locale's alternative numeric symbols. |
| |
| <br><dt><code>%V</code><dd>The ISO 8601:1988<!-- /@w --> week number as a decimal number (range <code>1</code> |
| through <code>53</code>). |
| |
| <p>Leading zeroes are permitted but not required. |
| |
| <p><em>Note:</em> Currently, this is not fully implemented. The format is |
| recognized, input is consumed but no field in <var>tm</var> is set. |
| |
| <br><dt><code>%w</code><dd>The day of the week as a decimal number (range <code>0</code> through |
| <code>6</code>), Sunday being <code>0</code>. |
| |
| <p>Leading zeroes are permitted but not required. |
| |
| <p><em>Note:</em> Currently, this is not fully implemented. The format is |
| recognized, input is consumed but no field in <var>tm</var> is set. |
| |
| <br><dt><code>%Ow</code><dd>Same as <code>%w</code> but using the locale's alternative numeric symbols. |
| |
| <br><dt><code>%W</code><dd>The week number of the current year as a decimal number (range <code>0</code> |
| through <code>53</code>). |
| |
| <p>Leading zeroes are permitted but not required. |
| |
| <p><em>Note:</em> Currently, this is not fully implemented. The format is |
| recognized, input is consumed but no field in <var>tm</var> is set. |
| |
| <br><dt><code>%OW</code><dd>Same as <code>%W</code> but using the locale's alternative numeric symbols. |
| |
| <br><dt><code>%x</code><dd>The date using the locale's date format. |
| |
| <br><dt><code>%Ex</code><dd>Like <code>%x</code> but the locale's alternative date representation is used. |
| |
| <br><dt><code>%X</code><dd>The time using the locale's time format. |
| |
| <br><dt><code>%EX</code><dd>Like <code>%X</code> but the locale's alternative time representation is used. |
| |
| <br><dt><code>%y</code><dd>The year without a century as a decimal number (range <code>0</code> through |
| <code>99</code>). |
| |
| <p>Leading zeroes are permitted but not required. |
| |
| <p>Note that it is questionable to use this format without |
| the <code>%C</code> format. The <code>strptime</code> function regards input |
| values in the range 69 to 99 as the years 1969 to |
| 1999 and the values 0 to 68 as the years |
| 2000 to 2068, but this heuristic can give the wrong century for some |
| input data. |
| |
| <p>Therefore it is best to avoid <code>%y</code> completely and use <code>%Y</code> |
| instead. |
| |
| <br><dt><code>%Ey</code><dd>The offset from <code>%EC</code> in the locale's alternative representation. |
| |
| <br><dt><code>%Oy</code><dd>The offset of the year (from <code>%C</code>) using the locale's alternative |
| numeric symbols. |
| |
| <br><dt><code>%Y</code><dd>The year as a decimal number, using the Gregorian calendar. |
| |
| <br><dt><code>%EY</code><dd>The full alternative year representation. |
| |
| <br><dt><code>%z</code><dd>The offset from GMT in ISO 8601<!-- /@w -->/RFC822 format. |
| |
| <br><dt><code>%Z</code><dd>The timezone name. |
| |
| <p><em>Note:</em> Currently, this is not fully implemented. The format is |
| recognized, input is consumed but no field in <var>tm</var> is set. |
| |
| <br><dt><code>%%</code><dd>A literal ‘<samp><span class="samp">%</span></samp>’ character. |
| </dl> |
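| <p>The century rule described for <code>%y</code> can be observed directly. The |
| sketch below is only an illustration; the helper name <code>two_digit_year</code> is |
| an assumption made for this example, and no <code>%C</code> is present in the format. |

```c
#define _XOPEN_SOURCE 700   /* Ask time.h to declare strptime.  */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: return the full calendar year that strptime
   derives from a two-digit year string, or -1 if it does not parse.  */
int
two_digit_year (const char *s)
{
  assert (s != NULL);

  struct tm tm;
  memset (&tm, 0, sizeof tm);

  if (strptime (s, "%y", &tm) == NULL)
    return -1;

  /* tm_year counts years since 1900.  */
  return tm.tm_year + 1900;
}
```

| <p>Under this rule <code>"68"</code> and <code>"69"</code> land a century apart, which is |
| why <code>%Y</code> is the safer choice when the input format is under your control. |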
| |
| <p>All other characters in the format string must have a matching character |
| in the input string. The exception is white space: a whitespace character in |
| the format string matches zero or more whitespace characters in the input string. |
| |
| <p><strong>Portability Note:</strong> The XPG standard advises applications to use |
| at least one whitespace character (as specified by <code>isspace</code>) or |
| other non-alphanumeric characters between any two conversion |
| specifications. The GNU C Library<!-- /@w --> does not have this limitation but |
| other libraries might have trouble parsing formats like |
| <code>"%d%m%Y%H%M%S"</code>. |
| |
| <p>The <code>strptime</code> function processes the input string from left to |
| right. Each of the three possible input elements (white space, literal, |
| or format) is handled one after the other. If the input cannot be |
| matched to the format string the function stops. The remainder of the |
| format and input strings is not processed. |
| |
| <p>The function returns a pointer to the first character it was unable to |
| process. If the input string contains more characters than required by |
| the format string the return value points right after the last consumed |
| input character. If the whole input string is consumed the return value |
| points to the terminating null byte at the end of the string. If an error |
| occurs, i.e., <code>strptime</code> fails to match all of the format string, |
| the function returns <code>NULL</code>. |
| </p></blockquote></div> |
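| <p>The three possible outcomes of the return value can be told apart as in the |
| following sketch. The enumeration and the function name <code>classify_parse</code> |
| are assumptions introduced for illustration only. |

```c
#define _XOPEN_SOURCE 700   /* Ask time.h to declare strptime.  */
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical result codes for this example.  */
enum parse_status
{
  PARSE_FAILED,     /* strptime returned a null pointer.  */
  PARSE_COMPLETE,   /* The whole input string was consumed.  */
  PARSE_TRAILING    /* The format matched, but input was left over.  */
};

/* Hypothetical helper: run strptime and classify its return value.  */
enum parse_status
classify_parse (const char *s, const char *fmt, struct tm *tp)
{
  assert (s != NULL && fmt != NULL && tp != NULL);

  const char *end = strptime (s, fmt, tp);

  if (end == NULL)
    return PARSE_FAILED;

  /* end points at the first unprocessed character, which is the
     terminating null byte when nothing is left over.  */
  return *end == '\0' ? PARSE_COMPLETE : PARSE_TRAILING;
}
```

| <p>Callers that must reject trailing garbage should treat the leftover-input case |
| as an error; callers that parse a prefix of a longer string can continue from the |
| returned position instead. |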
| |
| <p>The specification of the function in the XPG standard is rather vague, |
| leaving out a few important pieces of information. Most importantly, it |
| does not specify what happens to those elements of <var>tm</var> which are |
| not directly initialized by the different formats. The |
| implementations on different Unix systems vary here. |
| |
| <p>The GNU libc implementation does not touch those fields which are not |
| directly initialized. Exceptions are the <code>tm_wday</code> and |
| <code>tm_yday</code> elements, which are recomputed if any of the year, month, |
| or date elements changed. This has two implications: |
| |
| <ul> |
| <li>Before calling the <code>strptime</code> function for a new input string, you |
| should prepare the <var>tm</var> structure you pass. Normally this will mean |
| initializing all values to zero. Alternatively, you can set all |
| fields to values like <code>INT_MAX</code>, allowing you to determine which |
| elements were set by the function call. Zero does not work here since |
| it is a valid value for many of the fields. |
| |
| <p>Careful initialization is necessary if you want to find out whether a |
| certain field in <var>tm</var> was initialized by the function call. |
| |
| <li>You can construct a <code>struct tm</code> value with several consecutive |
| <code>strptime</code> calls. A useful application of this is e.g. the parsing |
| of two separate strings, one containing date information and the other |
| time information. By parsing one after the other without clearing the |
| structure in-between, you can construct a complete broken-down time. |
| </ul> |
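| <p>Both implications can be sketched in a few lines. The function names below are |
| hypothetical, the formats are chosen only for illustration, and the sentinel is |
| applied to a single field to keep the example short. The behavior shown relies on |
| <code>strptime</code> leaving untouched the fields that the format does not set. |

```c
#define _XOPEN_SOURCE 700   /* Ask time.h to declare strptime.  */
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: build one broken-down time from two strings,
   one holding the date and one holding the time of day.
   Returns 0 on success, -1 if either string does not match fully.  */
int
parse_date_then_time (const char *date, const char *time_of_day,
                      struct tm *tp)
{
  assert (date != NULL && time_of_day != NULL && tp != NULL);

  /* Clear the structure once, before the first call only.  */
  memset (tp, 0, sizeof *tp);

  const char *end = strptime (date, "%Y-%m-%d", tp);
  if (end == NULL || *end != '\0')
    return -1;

  /* The second call adds the time fields to the same structure.  */
  end = strptime (time_of_day, "%H:%M:%S", tp);
  if (end == NULL || *end != '\0')
    return -1;

  return 0;
}

/* Hypothetical helper: report whether parsing S with FMT stored
   anything in tm_hour.  Returns 1 if it did, 0 if not, -1 if the
   input did not match.  INT_MAX serves as the "not set" marker
   because zero is a valid hour.  */
int
format_sets_hour (const char *s, const char *fmt)
{
  assert (s != NULL && fmt != NULL);

  struct tm tm;
  memset (&tm, 0, sizeof tm);
  tm.tm_hour = INT_MAX;

  if (strptime (s, fmt, &tm) == NULL)
    return -1;

  return tm.tm_hour != INT_MAX;
}
```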
| |
| <p>The following example shows a function that parses a string |
| containing date information in either US style or ISO 8601<!-- /@w --> form: |
| |
| <pre class="smallexample"> #define _XOPEN_SOURCE 700 /* <span class="roman">Needed so that time.h declares strptime.</span> */ |
| #include &lt;string.h&gt; |
| #include &lt;time.h&gt; |
| |
| const char * |
| parse_date (const char *input, struct tm *tm) |
| { |
|   const char *cp; |
| |
|   /* <span class="roman">First clear the result structure.</span> */ |
|   memset (tm, '\0', sizeof (*tm)); |
| |
|   /* <span class="roman">Try the ISO format first.</span> */ |
|   cp = strptime (input, "%F", tm); |
|   if (cp == NULL) |
|     { |
|       /* <span class="roman">Does not match. Try the US form.</span> */ |
|       cp = strptime (input, "%D", tm); |
|     } |
| |
|   /* <span class="roman">A null pointer here means neither format matched.</span> */ |
|   return cp; |
| } |
| </pre> |
| </body></html> |
| |