| |
| NAME |
| parse_date - parses a date string into a timespec struct. |
| |
| SYNOPSIS |
| #include "timeutils.h" |
| |
| int parse_date(struct timespec *result, char const *p, |
| struct timespec const *now) |
| |
| LDADD libcommon.la |
| |
| DESCRIPTION |
| Parse a date/time string, storing the resulting time value into *result. |
| The string itself is pointed to by p. Return 1 if successful. |
| The string can be an incomplete or relative time specification; if so, |
| *now is used as the basis for the returned time. |
| |
| |
| This function is based upon gnulib's parse-datetime.y-dd7a871. |
| |
| Below is a plain text version of the gnulib parse-datetime.texi-dd7a871 manual |
| describing the input strings that are recognized. |
| |
| Any future modifications to the util-linux parser that affect input strings |
| should be noted below. |
| |
| |
| 1 Date input formats |
| ******************** |
| |
| First, a quote: |
| |
| Our units of temporal measurement, from seconds on up to months, |
| are so complicated, asymmetrical and disjunctive so as to make |
| coherent mental reckoning in time all but impossible. Indeed, had |
| some tyrannical god contrived to enslave our minds to time, to |
| make it all but impossible for us to escape subjection to sodden |
| routines and unpleasant surprises, he could hardly have done |
| better than handing down our present system. It is like a set of |
| trapezoidal building blocks, with no vertical or horizontal |
| surfaces, like a language in which the simplest thought demands |
| ornate constructions, useless particles and lengthy |
| circumlocutions. Unlike the more successful patterns of language |
| and science, which enable us to face experience boldly or at least |
| level-headedly, our system of temporal calculation silently and |
| persistently encourages our terror of time. |
| |
| ... It is as though architects had to measure length in feet, |
| width in meters and height in ells; as though basic instruction |
| manuals demanded a knowledge of five different languages. It is |
| no wonder then that we often look into our own immediate past or |
| future, last Tuesday or a week from Sunday, with feelings of |
| helpless confusion. ... |
| |
| --Robert Grudin, `Time and the Art of Living'. |
| |
| This section describes the textual date representations that GNU |
| programs accept. These are the strings you, as a user, can supply as |
| arguments to the various programs. The C interface (via the |
| `parse_datetime' function) is not described here. |
| |
| 1.1 General date syntax |
| ======================= |
| |
| A "date" is a string, possibly empty, containing many items separated |
| by whitespace. The whitespace may be omitted when no ambiguity arises. |
| The empty string means the beginning of today (i.e., midnight). Order |
| of the items is immaterial. A date string may contain many flavors of |
| items: |
| |
| * calendar date items |
| |
| * time of day items |
| |
| * time zone items |
| |
| * combined date and time of day items |
| |
| * day of the week items |
| |
| * relative items |
| |
| * pure numbers. |
| |
| We describe each of these item types in turn, below. |
| |
| A few ordinal numbers may be written out in words in some contexts. |
| This is most useful for specifying day of the week items or relative |
| items (see below). Among the most commonly used ordinal numbers, the |
| word `last' stands for -1, `this' stands for 0, and `first' and `next' |
| both stand for 1. Because the word `second' stands for the unit of |
| time there is no way to write the ordinal number 2, but for convenience |
| `third' stands for 3, `fourth' for 4, `fifth' for 5, `sixth' for 6, |
| `seventh' for 7, `eighth' for 8, `ninth' for 9, `tenth' for 10, |
| `eleventh' for 11 and `twelfth' for 12. |
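| |
| The mapping from ordinal words to numbers can be sketched as a small |
| lookup table. This is only an illustration in C, not the parser's own |
| code; the names `same_word' and `ordinal_value' are hypothetical. |
| |
```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: compare two words, ignoring alphabetic case. */
static int same_word(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Hypothetical helper: store the number an ordinal word stands for.
   Returns 1 on success and 0 for an unknown word.  `second' is absent
   on purpose, because it names the unit of time. */
int ordinal_value(const char *word, int *value)
{
    static const struct { const char *name; int value; } table[] = {
        { "last", -1 }, { "this", 0 }, { "first", 1 }, { "next", 1 },
        { "third", 3 }, { "fourth", 4 }, { "fifth", 5 }, { "sixth", 6 },
        { "seventh", 7 }, { "eighth", 8 }, { "ninth", 9 }, { "tenth", 10 },
        { "eleventh", 11 }, { "twelfth", 12 },
    };
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (same_word(word, table[i].name)) {
            *value = table[i].value;
            return 1;
        }
    }
    return 0;
}
```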
| |
| When a month is written this way, it is still considered to be |
| written numerically, instead of being "spelled in full"; this changes |
| the allowed strings. |
| |
| In the current implementation, only English is supported for words |
| and abbreviations like `AM', `DST', `EST', `first', `January', |
| `Sunday', `tomorrow', and `year'. |
| |
| The output of the `date' command is not always acceptable as a date |
| string, not only because of the language problem, but also because |
| there is no standard meaning for time zone items like `IST'. When using |
| `date' to generate a date string intended to be parsed later, specify a |
| date format that is independent of language and that does not use time |
| zone items other than `UTC' and `Z'. Here are some ways to do this: |
| |
| $ LC_ALL=C TZ=UTC0 date |
| Mon Mar 1 00:21:42 UTC 2004 |
| $ TZ=UTC0 date +'%Y-%m-%d %H:%M:%SZ' |
| 2004-03-01 00:21:42Z |
| $ date --rfc-3339=ns # --rfc-3339 is a GNU extension. |
| 2004-02-29 16:21:42.692722128-08:00 |
| $ date --rfc-2822 # a GNU extension |
| Sun, 29 Feb 2004 16:21:42 -0800 |
| $ date +'%Y-%m-%d %H:%M:%S %z' # %z is a GNU extension. |
| 2004-02-29 16:21:42 -0800 |
| $ date +'@%s.%N' # %s and %N are GNU extensions. |
| @1078100502.692722128 |
| |
| Alphabetic case is completely ignored in dates. Comments may be |
| introduced between round parentheses, as long as included parentheses |
| are properly nested. Hyphens not followed by a digit are currently |
| ignored. Leading zeros on numbers are ignored. |
| |
| Invalid dates like `2005-02-29' or times like `24:00' are rejected. |
| In the typical case of a host that does not support leap seconds, a |
| time like `23:59:60' is rejected even if it corresponds to a valid leap |
| second. |
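| |
| A minimal sketch of such range checks, assuming the Gregorian calendar |
| and a host without leap seconds. The function names are hypothetical |
| and are not taken from the parser. |
| |
```c
#include <stdbool.h>

/* Gregorian leap-year rule. */
bool is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Hypothetical check: true only for calendar dates that exist. */
bool is_valid_date(int year, int month, int day)
{
    static const int days_in_month[] = {
        31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
    };
    int limit;

    if (month < 1 || month > 12 || day < 1)
        return false;
    limit = days_in_month[month - 1];
    if (month == 2 && is_leap_year(year))
        limit = 29;
    return day <= limit;
}

/* Hypothetical check for a 24-hour time on a host without leap
   seconds: both 24:00 and 23:59:60 are refused. */
bool is_valid_time(int hour, int minute, int second)
{
    return hour >= 0 && hour <= 23
        && minute >= 0 && minute <= 59
        && second >= 0 && second <= 59;
}
```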
| |
| 1.2 Calendar date items |
| ======================= |
| |
| A "calendar date item" specifies a day of the year. It is specified |
| differently, depending on whether the month is specified numerically or |
| literally. All these strings specify the same calendar date: |
| |
| 1972-09-24 # ISO 8601. |
| 72-9-24 # Assume 19xx for 69 through 99, |
| # 20xx for 00 through 68. |
| 72-09-24 # Leading zeros are ignored. |
| 9/24/72 # Common U.S. writing. |
| 24 September 1972 |
| 24 Sept 72 # September has a special abbreviation. |
| 24 Sep 72 # Three-letter abbreviations always allowed. |
| Sep 24, 1972 |
| 24-sep-72 |
| 24sep72 |
| |
| The year can also be omitted. In this case, the last specified year |
| is used, or the current year if none. For example: |
| |
| 9/24 |
| sep 24 |
| |
| Here are the rules. |
| |
| For numeric months, the ISO 8601 format `YEAR-MONTH-DAY' is allowed, |
| where YEAR is any positive number, MONTH is a number between 01 and 12, |
| and DAY is a number between 01 and 31. A leading zero must be present |
| if a number is less than ten. If YEAR is 68 or smaller, then 2000 is |
| added to it; otherwise, if YEAR is less than 100, then 1900 is added to |
| it. The construct `MONTH/DAY/YEAR', popular in the United States, is |
| accepted, as is `MONTH/DAY', which omits the year. |
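| |
| The year adjustment described above can be sketched as follows. This |
| assumes a non-negative YEAR and applies the rule exactly as worded |
| here; `expand_year' is a hypothetical name. |
| |
```c
/* Hypothetical helper applying the rule as worded above:
   0..68 become 2000..2068, 69..99 become 1969..1999, and larger
   values are returned unchanged.  Assumes year >= 0. */
int expand_year(int year)
{
    if (year <= 68)
        return year + 2000;
    if (year < 100)
        return year + 1900;
    return year;
}
```
| |
| With this rule, `72-9-24' and `1972-09-24' name the same year. |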
| |
| Literal months may be spelled out in full: `January', `February', |
| `March', `April', `May', `June', `July', `August', `September', |
| `October', `November' or `December'. Literal months may be abbreviated |
| to their first three letters, possibly followed by an abbreviating dot. |
| It is also permitted to write `Sept' instead of `September'. |
| |
| When months are written literally, the calendar date may be given as |
| any of the following: |
| |
| DAY MONTH YEAR |
| DAY MONTH |
| MONTH DAY YEAR |
| DAY-MONTH-YEAR |
| |
| Or, omitting the year: |
| |
| MONTH DAY |
| |
| 1.3 Time of day items |
| ===================== |
| |
| A "time of day item" in date strings specifies the time on a given day. |
| Here are some examples, all of which represent the same time: |
| |
| 20:02:00.000000 |
| 20:02 |
| 8:02pm |
| 20:02-0500 # In EST (U.S. Eastern Standard Time). |
| |
| More generally, the time of day may be given as |
| `HOUR:MINUTE:SECOND', where HOUR is a number between 0 and 23, MINUTE |
| is a number between 0 and 59, and SECOND is a number between 0 and 59 |
| possibly followed by `.' or `,' and a fraction containing one or more |
| digits. Alternatively, `:SECOND' can be omitted, in which case it is |
| taken to be zero. On the rare hosts that support leap seconds, SECOND |
| may be 60. |
| |
| If the time is followed by `am' or `pm' (or `a.m.' or `p.m.'), HOUR |
| is restricted to run from 1 to 12, and `:MINUTE' may be omitted (taken |
| to be zero). `am' indicates the first half of the day, `pm' indicates |
| the second half of the day. In this notation, 12 is the predecessor of |
| 1: midnight is `12am' while noon is `12pm'. (This is the zero-oriented |
| interpretation of `12am' and `12pm', as opposed to the old tradition |
| derived from Latin which uses `12m' for noon and `12pm' for midnight.) |
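| |
| As a small illustration of this convention, the conversion to a |
| 24-hour value might look like the sketch below. `to_24_hour' is a |
| hypothetical helper, not part of the parser. |
| |
```c
/* Hypothetical helper: convert a 12-hour clock value to 0..23.
   is_pm is nonzero for `pm'.  Returns -1 if hour12 is not in 1..12. */
int to_24_hour(int hour12, int is_pm)
{
    int hour;

    if (hour12 < 1 || hour12 > 12)
        return -1;
    hour = hour12 % 12;          /* 12 is the predecessor of 1 */
    return is_pm ? hour + 12 : hour;
}
```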
| |
| The time may alternatively be followed by a time zone correction, |
| expressed as `SHHMM', where S is `+' or `-', HH is a number of zone |
| hours and MM is a number of zone minutes. The zone minutes term, MM, |
| may be omitted, in which case the one- or two-digit correction is |
| interpreted as a number of hours. You can also separate HH from MM |
| with a colon. When a time zone correction is given this way, it forces |
| interpretation of the time relative to Coordinated Universal Time |
| (UTC), overriding any previous specification for the time zone or the |
| local time zone. For example, `+0530' and `+05:30' both stand for the |
| time zone 5.5 hours ahead of UTC (e.g., India). This is the best way to |
| specify a time zone correction by fractional parts of an hour. The |
| maximum zone correction is 24 hours. |
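| |
| The forms above can be read with a short routine such as the sketch |
| below. It is an illustration only: `parse_zone_correction' is a |
| hypothetical name, the result is expressed as minutes east of UTC, and |
| three-digit corrections are rejected because the text above does not |
| describe them. |
| |
```c
#include <ctype.h>

/* Hypothetical parser for `SHH', `SHHMM' and `SHH:MM'.
   Returns 1 and stores minutes east of UTC on success, 0 otherwise. */
int parse_zone_correction(const char *s, int *minutes_east)
{
    int sign, value = 0, ndigits = 0;
    int hours, minutes = 0, total;

    if (*s != '+' && *s != '-')
        return 0;
    sign = (*s++ == '-') ? -1 : 1;

    /* Read at most four digits. */
    while (ndigits < 4 && isdigit((unsigned char)*s)) {
        value = value * 10 + (*s++ - '0');
        ndigits++;
    }

    if (ndigits == 1 || ndigits == 2) {
        hours = value;                 /* hours only, or HH:MM */
        if (*s == ':') {
            s++;
            if (!isdigit((unsigned char)s[0])
                || !isdigit((unsigned char)s[1]))
                return 0;
            minutes = (s[0] - '0') * 10 + (s[1] - '0');
            s += 2;
        }
    } else if (ndigits == 4) {
        hours = value / 100;           /* HHMM */
        minutes = value % 100;
    } else {
        return 0;                      /* no digits, or three digits */
    }

    if (*s != '\0' || minutes > 59)
        return 0;
    total = hours * 60 + minutes;
    if (total > 24 * 60)               /* maximum correction: 24 hours */
        return 0;
    *minutes_east = sign * total;
    return 1;
}
```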
| |
| Either `am'/`pm' or a time zone correction may be specified, but not |
| both. |
| |
| 1.4 Time zone items |
| =================== |
| |
| A "time zone item" specifies an international time zone, indicated by a |
| small set of letters, e.g., `UTC' or `Z' for Coordinated Universal |
| Time. Any included periods are ignored. Following a |
| non-daylight-saving time zone with the string `DST' as a separate word |
| (that is, separated by some white space) specifies the corresponding |
| daylight saving time zone. Alternatively, a |
| non-daylight-saving time zone can be followed by a time zone |
| correction, to add the two values. This is normally done only for |
| `UTC'; for example, `UTC+05:30' is equivalent to `+05:30'. |
| |
| Time zone items other than `UTC' and `Z' are obsolescent and are not |
| recommended, because they are ambiguous; for example, `EST' has a |
| different meaning in Australia than in the United States. Instead, |
| it's better to use unambiguous numeric time zone corrections like |
| `-0500', as described in the previous section. |
| |
| If neither a time zone item nor a time zone correction is supplied, |
| timestamps are interpreted using the rules of the default time zone |
| (*note Specifying time zone rules::). |
| |
| 1.5 Combined date and time of day items |
| ======================================= |
| |
| The ISO 8601 date and time of day extended format consists of an ISO |
| 8601 date, a `T' character separator, and an ISO 8601 time of day. |
| This format is also recognized if the `T' is replaced by a space. |
| |
| In this format, the time of day should use 24-hour notation. |
| Fractional seconds are allowed, with either comma or period preceding |
| the fraction. ISO 8601 fractional minutes and hours are not supported. |
| Typically, hosts support nanosecond timestamp resolution; excess |
| precision is silently discarded. |
| |
| Here are some examples: |
| |
| 2012-09-24T20:02:00.052-05:00 |
| 2012-12-31T23:59:59,999999999+11:00 |
| 1970-01-01 00:00Z |
| |
| 1.6 Day of week items |
| ===================== |
| |
| The explicit mention of a day of the week will forward the date (only |
| if necessary) to reach that day of the week in the future. |
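| |
| For a bare day name, the number of days to move forward can be |
| sketched with modular arithmetic. This covers only the simple case |
| described in the paragraph above; `days_until_weekday' is a |
| hypothetical helper. |
| |
```c
/* Hypothetical helper: days to advance from today's weekday to the
   target weekday, moving forward only if necessary.  Weekdays are
   numbered 0 (Sunday) through 6 (Saturday), as in tm_wday. */
int days_until_weekday(int today_wday, int target_wday)
{
    return (target_wday - today_wday + 7) % 7;
}
```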
| |
| Days of the week may be spelled out in full: `Sunday', `Monday', |
| `Tuesday', `Wednesday', `Thursday', `Friday' or `Saturday'. Days may |
| be abbreviated to their first three letters, optionally followed by a |
| period. The special abbreviations `Tues' for `Tuesday', `Wednes' for |
| `Wednesday' and `Thur' or `Thurs' for `Thursday' are also allowed. |
| |
| A number may precede a day of the week item to move forward |
| supplementary weeks. It is best used in expressions like `third |
| monday'. In this context, `last DAY' or `next DAY' is also acceptable; |
| they move one week before or after the day that DAY by itself would |
| represent. |
| |
| A comma following a day of the week item is ignored. |
| |
| 1.7 Relative items in date strings |
| ================================== |
| |
| "Relative items" adjust a date (or the current date if none) forward or |
| backward. The effects of relative items accumulate. Here are some |
| examples: |
| |
| 1 year |
| 1 year ago |
| 3 years |
| 2 days |
| |
| The unit of time displacement may be selected by the string `year' |
| or `month' for moving by whole years or months. These are fuzzy units, |
| as years and months are not all of equal duration. More precise units |
| are `fortnight' which is worth 14 days, `week' worth 7 days, `day' |
| worth 24 hours, `hour' worth 60 minutes, `minute' or `min' worth 60 |
| seconds, and `second' or `sec' worth one second. An `s' suffix on |
| these units is accepted and ignored. |
| |
| The unit of time may be preceded by a multiplier, given as an |
| optionally signed number. Unsigned numbers are taken as positively |
| signed. No number at all implies 1 for a multiplier. Following a |
| relative item by the string `ago' is equivalent to preceding the unit |
| by a multiplier with value -1. |
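| |
| For the precise units, a relative item reduces to a signed count of |
| seconds. The sketch below illustrates that arithmetic; it is not the |
| parser's code, the names are hypothetical, and the fuzzy units `year' |
| and `month' are deliberately left out because they have no fixed |
| length. |
| |
```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: does the first len characters of word equal
   the lower-case unit name, ignoring alphabetic case? */
static int same_unit(const char *word, size_t len, const char *name)
{
    size_t i;

    if (strlen(name) != len)
        return 0;
    for (i = 0; i < len; i++)
        if (tolower((unsigned char)word[i]) != name[i])
            return 0;
    return 1;
}

/* Hypothetical helper: convert `MULTIPLIER UNIT [ago]' to seconds.
   Returns 1 on success, 0 for units this sketch does not handle. */
int relative_seconds(long multiplier, const char *unit, int ago,
                     long *seconds)
{
    static const struct { const char *name; long seconds; } units[] = {
        { "fortnight", 14L * 24 * 60 * 60 },
        { "week", 7L * 24 * 60 * 60 },
        { "day", 24L * 60 * 60 },
        { "hour", 60L * 60 },
        { "minute", 60L }, { "min", 60L },
        { "second", 1L }, { "sec", 1L },
    };
    size_t len = strlen(unit);
    size_t i;

    /* An `s' suffix is accepted and ignored. */
    if (len > 1 && tolower((unsigned char)unit[len - 1]) == 's')
        len--;

    for (i = 0; i < sizeof units / sizeof units[0]; i++) {
        if (same_unit(unit, len, units[i].name)) {
            *seconds = multiplier * units[i].seconds * (ago ? -1 : 1);
            return 1;
        }
    }
    return 0;
}
```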
| |
| The string `tomorrow' is worth one day in the future (equivalent to |
| `day'), the string `yesterday' is worth one day in the past (equivalent |
| to `day ago'). |
| |
| The strings `now' or `today' are relative items corresponding to |
| zero-valued time displacement. These strings come from the fact that a |
| zero-valued time displacement represents the current time when not |
| otherwise changed by previous items. They may be used to stress other |
| items, as in `12:00 today'. The string `this' also has the meaning |
| of a zero-valued time displacement, but is preferred in date strings |
| like `this thursday'. |
| |
| When a relative item causes the resulting date to cross a boundary |
| where the clocks were adjusted, typically for daylight saving time, the |
| resulting date and time are adjusted accordingly. |
| |
| The fuzz in units can cause problems with relative items. For |
| example, `2003-07-31 -1 month' might evaluate to 2003-07-01, because |
| 2003-06-31 is an invalid date. To determine the previous month more |
| reliably, you can ask for the month before the 15th of the current |
| month. For example: |
| |
| $ date -R |
| Thu, 31 Jul 2003 13:02:39 -0700 |
| $ date --date='-1 month' +'Last month was %B?' |
| Last month was July? |
| $ date --date="$(date +%Y-%m-15) -1 month" +'Last month was %B!' |
| Last month was June! |
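| |
| The same effect can be reproduced with the standard C `mktime' |
| function, which normalizes out-of-range fields. The sketch below is an |
| illustration only; `months_before' is a hypothetical helper and is not |
| how the parser itself is written. |
| |
```c
#include <time.h>

/* Hypothetical helper: step back `months' months from a calendar date
   and let mktime() normalize the result.  Returns 1 on success. */
int months_before(int year, int month, int day, int months,
                  int *out_year, int *out_month, int *out_day)
{
    struct tm tm = {0};

    tm.tm_year = year - 1900;
    tm.tm_mon = month - 1 - months;  /* may be out of range */
    tm.tm_mday = day;
    tm.tm_hour = 12;                 /* noon, away from clock changes */
    tm.tm_isdst = -1;                /* let the library decide */

    if (mktime(&tm) == (time_t)-1)
        return 0;

    *out_year = tm.tm_year + 1900;
    *out_month = tm.tm_mon + 1;
    *out_day = tm.tm_mday;
    return 1;
}
```
| |
| Starting from 2003-07-31, one month back is normalized from the |
| nonexistent 2003-06-31 to 2003-07-01, whereas starting from the 15th |
| yields 2003-06-15. |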
| |
| Also, take care when manipulating dates around clock changes such as |
| daylight saving leaps. In a few cases these have added or subtracted |
| as much as 24 hours from the clock, so it is often wise to adopt |
| universal time by setting the `TZ' environment variable to `UTC0' |
| before embarking on calendrical calculations. |
| |
| 1.8 Pure numbers in date strings |
| ================================ |
| |
| The precise interpretation of a pure decimal number depends on the |
| context in the date string. |
| |
| If the decimal number is of the form YYYYMMDD and no other calendar |
| date item (*note Calendar date items::) appears before it in the date |
| string, then YYYY is read as the year, MM as the month number and DD as |
| the day of the month, for the specified calendar date. |
| |
| If the decimal number is of the form HHMM and no other time of day |
| item appears before it in the date string, then HH is read as the hour |
| of the day and MM as the minute of the hour, for the specified time of |
| day. MM can also be omitted. |
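| |
| Splitting such numbers into fields is simple digit arithmetic, as the |
| following sketch shows. The function names are hypothetical, the |
| context rules (what appears earlier in the string) are not modelled, |
| and the fields are not range-checked. Lengths other than eight digits |
| for a date, or one, two or four digits for a time, are rejected here |
| because the text above does not describe them. |
| |
```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: value of the first n characters of s, or -1 if
   any of them is not a decimal digit. */
static int digits_value(const char *s, size_t n)
{
    int value = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!isdigit((unsigned char)s[i]))
            return -1;
        value = value * 10 + (s[i] - '0');
    }
    return value;
}

/* Hypothetical helper: split an eight-digit YYYYMMDD number.
   Returns 1 on success, 0 if s does not have that form. */
int split_yyyymmdd(const char *s, int *year, int *month, int *day)
{
    if (strlen(s) != 8 || digits_value(s, 8) < 0)
        return 0;
    *year = digits_value(s, 4);
    *month = digits_value(s + 4, 2);
    *day = digits_value(s + 6, 2);
    return 1;
}

/* Hypothetical helper: split HHMM, or take HH alone when MM is
   omitted.  Returns 1 on success, 0 otherwise. */
int split_hhmm(const char *s, int *hour, int *minute)
{
    size_t len = strlen(s);

    if (len == 4 && digits_value(s, 4) >= 0) {
        *hour = digits_value(s, 2);
        *minute = digits_value(s + 2, 2);
        return 1;
    }
    if ((len == 1 || len == 2) && digits_value(s, len) >= 0) {
        *hour = digits_value(s, len);
        *minute = 0;
        return 1;
    }
    return 0;
}
```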
| |
| If both a calendar date and a time of day appear to the left of a |
| number in the date string, but no relative item, then the number |
| overrides the year. |
| |
| 1.9 Seconds since the Epoch |
| =========================== |
| |
| If you precede a number with `@', it represents an internal timestamp |
| as a count of seconds. The number can contain an internal decimal |
| point (either `.' or `,'); any excess precision not supported by the |
| internal representation is truncated toward minus infinity. Such a |
| number cannot be combined with any other date item, as it specifies a |
| complete timestamp. |
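| |
| Truncation toward minus infinity matters for negative values, because |
| the nanosecond part of a `struct timespec' is never negative. The |
| sketch below illustrates one way to do this; `parse_epoch' is a |
| hypothetical name, nanosecond resolution is assumed, and overflow of |
| the seconds count is not checked. |
| |
```c
#include <ctype.h>
#include <time.h>

/* Hypothetical parser for `@SECONDS[.FRACTION]'.
   Returns 1 on success, 0 on a malformed string. */
int parse_epoch(const char *s, struct timespec *ts)
{
    int negative = 0, digits = 0, excess_nonzero = 0;
    long long sec = 0;
    long ns = 0;

    if (*s++ != '@')
        return 0;
    if (*s == '-') {
        negative = 1;
        s++;
    }
    if (!isdigit((unsigned char)*s))
        return 0;
    while (isdigit((unsigned char)*s))
        sec = sec * 10 + (*s++ - '0');   /* no overflow check here */

    if (*s == '.' || *s == ',') {
        s++;
        if (!isdigit((unsigned char)*s))
            return 0;
        for (; isdigit((unsigned char)*s); s++) {
            if (digits < 9) {
                ns = ns * 10 + (*s - '0');
                digits++;
            } else if (*s != '0') {
                excess_nonzero = 1;      /* precision beyond 1 ns */
            }
        }
        while (digits++ < 9)
            ns *= 10;                    /* scale to nanoseconds */
        if (negative && excess_nonzero)
            ns++;                        /* larger magnitude = toward -inf */
    }
    if (*s != '\0')
        return 0;

    if (negative) {
        sec = -sec;
        if (ns > 0) {                    /* keep tv_nsec non-negative */
            sec -= 1;
            ns = 1000000000L - ns;
        }
    }
    ts->tv_sec = (time_t)sec;
    ts->tv_nsec = ns;
    return 1;
}
```
| |
| Under these assumptions `@-1.5' becomes -2 seconds plus 500000000 |
| nanoseconds. |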
| |
| Internally, computer times are represented as a count of seconds |
| since an epoch--a well-defined point of time. On GNU and POSIX |
| systems, the epoch is 1970-01-01 00:00:00 UTC, so `@0' represents this |
| time, `@1' represents 1970-01-01 00:00:01 UTC, and so forth. GNU and |
| most other POSIX-compliant systems support such times as an extension |
| to POSIX, using negative counts, so that `@-1' represents 1969-12-31 |
| 23:59:59 UTC. |
| |
| Traditional Unix systems count seconds with 32-bit two's-complement |
| integers and can represent times from 1901-12-13 20:45:52 through |
| 2038-01-19 03:14:07 UTC. More modern systems use 64-bit counts of |
| seconds with nanosecond subcounts, and can represent all the times in |
| the known lifetime of the universe to a resolution of 1 nanosecond. |
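| |
| The 32-bit limits quoted above can be checked with the standard C |
| `gmtime' function, as in the sketch below. It assumes a host whose |
| `time_t' is wider than 32 bits and whose `gmtime' accepts negative |
| counts and ignores leap seconds; `utc_from_epoch' is a hypothetical |
| helper. |
| |
```c
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: break a seconds-since-Epoch count down into
   UTC calendar fields.  Returns 1 on success, 0 if the host cannot
   represent the value. */
int utc_from_epoch(int64_t seconds, struct tm *out)
{
    time_t t = (time_t)seconds;
    struct tm *tm = gmtime(&t);

    if (tm == NULL)
        return 0;
    *out = *tm;
    return 1;
}
```
| |
| Passing INT32_MIN and INT32_MAX gives the two endpoints of the |
| traditional range. |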
| |
| On most hosts, these counts ignore the presence of leap seconds. |
| For example, on most hosts `@915148799' represents 1998-12-31 23:59:59 |
| UTC, `@915148800' represents 1999-01-01 00:00:00 UTC, and there is no |
| way to represent the intervening leap second 1998-12-31 23:59:60 UTC. |
| |
| 1.10 Specifying time zone rules |
| =============================== |
| |
| Normally, dates are interpreted using the rules of the current time |
| zone, which in turn are specified by the `TZ' environment variable, or |
| by a system default if `TZ' is not set. To specify a different set of |
| default time zone rules that apply just to one date, start the date |
| with a string of the form `TZ="RULE"'. The two quote characters (`"') |
| must be present in the date, and any quotes or backslashes within RULE |
| must be escaped by a backslash. |
| |
| For example, with the GNU `date' command you can answer the question |
| "What time is it in New York when a Paris clock shows 6:30am on October |
| 31, 2004?" by using a date beginning with `TZ="Europe/Paris"' as shown |
| in the following shell transcript: |
| |
| $ export TZ="America/New_York" |
| $ date --date='TZ="Europe/Paris" 2004-10-31 06:30' |
| Sun Oct 31 01:30:00 EDT 2004 |
| |
| In this example, the `--date' operand begins with its own `TZ' |
| setting, so the rest of that operand is processed according to |
| `Europe/Paris' rules, treating the string `2004-10-31 06:30' as if it |
| were in Paris. However, since the output of the `date' command is |
| processed according to the overall time zone rules, it uses New York |
| time. (Paris was normally six hours ahead of New York in 2004, but |
| this example refers to a brief Halloween period when the gap was five |
| hours.) |
| |
| A `TZ' value is a rule that typically names a location in the `tz' |
| database (http://www.twinsun.com/tz/tz-link.htm). A recent catalog of |
| location names appears in the TWiki Date and Time Gateway |
| (http://twiki.org/cgi-bin/xtra/tzdate). A few non-GNU hosts require a |
| colon before a location name in a `TZ' setting, e.g., |
| `TZ=":America/New_York"'. |
| |
| The `tz' database includes a wide variety of locations ranging from |
| `Arctic/Longyearbyen' to `Antarctica/South_Pole', but if you are at sea |
| and have your own private time zone, or if you are using a non-GNU host |
| that does not support the `tz' database, you may need to use a POSIX |
| rule instead. Simple POSIX rules like `UTC0' specify a time zone |
| without daylight saving time; other rules can specify simple daylight |
| saving regimes. *Note Specifying the Time Zone with `TZ': (libc)TZ |
| Variable. |
| |
| 1.11 Authors of `parse_datetime' |
| ================================ |
| |
| `parse_datetime' started life as `getdate', as originally implemented |
| by Steven M. Bellovin (<smb@research.att.com>) while at the University |
| of North Carolina at Chapel Hill. The code was later tweaked by a |
| couple of people on Usenet, then completely overhauled by Rich $alz |
| (<rsalz@bbn.com>) and Jim Berets (<jberets@bbn.com>) in August, 1990. |
| Various revisions for the GNU system were made by David MacKenzie, Jim |
| Meyering, Paul Eggert and others, including renaming it to `get_date' to |
| avoid a conflict with the alternative POSIX function `getdate', and a |
| later rename to `parse_datetime'. The POSIX function `getdate' can |
| parse more locale-specific dates using `strptime', but relies on an |
| environment variable and external file, and lacks the thread-safety of |
| `parse_datetime'. |
| |
| This chapter was originally produced by François Pinard |
| (<pinard@iro.umontreal.ca>) from the `parse_datetime.y' source code, |
| and then edited by K. Berry (<kb@cs.umb.edu>). |
| |