Austin Group Defect Tracker

Aardvark Mark IV


ID 0001619
Category [Issue 8 drafts] Base Definitions and Headers
Severity Comment
Type Enhancement Request
Date Submitted 2022-11-18 12:21
Last Update 2022-11-21 16:21
Reporter geoffclare
View Status public
Assigned To
Priority normal
Resolution Accepted As Marked
Status Resolved
Product Version Draft 2.1
Name Geoff Clare
Organization The Open Group
User Reference
Section 8.3
Page Number 161
Line Number 5616
Final Accepted Text See Note: 0006076
Summary 0001619: Add support for TZ=Area/Location
Description Support for TZ values that use the Area/Location format is very widespread and it should be added to the standard.
Desired Action On page 161 line 5616 section 8.3, change:
The value of TZ has one of the two forms (spaces inserted for clarity):
:characters
or:
std offset dst offset, rule
If TZ is of the first format (that is, if the first character is a <colon>), the characters following the <colon> are handled in an implementation-defined manner.

The expanded format (for all TZs whose value does not have a <colon> as the first character) is as follows:
to:
The application shall ensure that the value of TZ has one of the three forms (spaces inserted for clarity):
:characters
or:
Area/Location
or:
std offset dst offset, rule
If TZ is of the first format (that is, if the first character is a <colon>), the characters following the <colon> are handled in an implementation-defined manner.

If TZ is of the second format (that is, if the first character is not a <colon> and the value includes one or two <slash> characters and no <comma>, <less-than-sign>, or <greater-than-sign> characters), Area and Location together indicate either a geographical timezone or a special timezone from an implementation-defined timezone database. If the value contains two <slash> characters, the separator between Area and Location is the first <slash> and the second <slash> is part of Location. Examples of geographical timezones that may be supported include <tt>Africa/Cairo</tt>, <tt>America/Indiana/Indianapolis</tt>, <tt>America/New_York</tt>, <tt>Asia/Tokyo</tt>, and <tt>Europe/London</tt>. The data for each geographical timezone shall include:

  • Whether Daylight Saving Time (DST) is observed, and if so the rules used to determine when the transitions to and from DST occur.

  • The offset from Coordinated Universal Time of the timezone's standard time and, if observed, of its DST.

  • The timezone names for standard time (std) and, if observed, for DST (dst) to be used by tzset(). These shall each contain no more than {TZNAME_MAX} bytes.
If there are any historical variations, or known future variations, of the above data for a geographical timezone, these variations shall be included in the database, except that historical variations from before the Epoch need not be included.

The database should incorporate the geographical timezones from the IANA timezone database and the implementation should provide a way to update it in accordance with RFC 6557; if these recommendations are not followed, an implementation-defined way to update the database shall be provided.

Implementations shall support the special timezone <tt>Etc/UTC</tt> and may support other special timezones from the IANA timezone database or additional implementation-defined special timezones. The behavior for <tt>TZ=Etc/UTC</tt> shall be identical to <tt>TZ=UTC0</tt> (see below).

The expanded form of the third format (without the inserted spaces) is as follows:

On page 3460 line 118332 section A.8.3, change:
Implementations are encouraged to use the time zone database maintained by IANA to determine when Daylight Saving Time changes occur and to handle TZ values that start with a <colon>. See RFC 6557.
to:
Implementations are encouraged to incorporate the IANA timezone database into the timezone database used for TZ values of the form Area/Location and to provide a way to update it in accordance with RFC 6557.

The TZ format beginning with <colon> was originally introduced as a way for implementations to support geographical timezones in the form :Area/Location as an extension, but implementations started to support them without the leading <colon> (as well as with it) and their use without the <colon> became the de-facto standard. Consequently when geographical timezones were added to this standard, it was without the <colon>. However, this format is only specified as being recognized when the value includes one or two <slash> characters; portable applications still need to include the <colon> in order to specify any geographical timezone value that does not include any <slash> characters.
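
As a rough illustration of the recognition rule proposed above for the Area/Location form, here is a minimal Python sketch. The function names classify_by_slash and split_area_location are made up for this example and are not part of any standard interface. The std/offset form is only identified by elimination here, not validated.

```python
def classify_by_slash(value):
    """Classify a TZ value using the rule proposed in the Desired Action."""
    if value.startswith(":"):
        # First form: handled in an implementation-defined manner.
        return "implementation-defined"
    slashes = value.count("/")
    if slashes in (1, 2) and not any(c in value for c in ",<>"):
        # Second form: one or two slashes, no comma, '<' or '>'.
        return "area-location"
    # Third form: "std offset dst offset, rule" (syntax not checked here).
    return "std-offset"


def split_area_location(value):
    """Split at the first slash; a second slash stays part of Location."""
    area, location = value.split("/", 1)
    return area, location
```

Under this rule a value such as EST5EDT,M3.2.0/2,M11.1.0/2 falls into the third form even though it contains two slashes, because it also contains a comma.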

Tags issue8
-  Notes
(0006065)
steffen (reporter)
2022-11-18 21:44

Area/Location, yes.

This is an all-English thing with sometimes strange names that are rather specific to IANA TZ, or for direct consumers of that as a citizen.

Just a suggestion.

The ICU project maintains several mappings that may serve actual users better.

For one, there is one that includes countries. It is still all English, but translatable. The hypothetical America:Argentina:Buenos_Aires may serve users better.

There are also UN Locode mappings. UN Locodes should possibly be supported or reserved, as trade will continue to be the thing; I would suggest via a leading @ plus the five letters which make up a Locode entry.
I would presume these Locode IDs will become more and more widely known in the future.

kre@ knows both much better and for longer, but IANA TZ over and over discusses the syntax and content of the Area/Location IDs as such, and whereas I would assume backward-compatibility will not be lost, I could imagine a new maintainer (the current one is on the Austin ML and reads this, sufficient time provided) switches to a new format that serves ISO 3166 country codes much better, and ISO 3166 country codes are the base (root level) of UN Locodes.
(0006068)
kre (reporter)
2022-11-19 18:23

Re Note: 0006065

Not so much English Language (though that is what the tz database uses)
as POSIX Portable Character Set (XBD 6) - which is all that implementations
are required to accept, and hence all that the standard can actually require.

I can accept the text in the Desired Action, though I would have specified
rather less about the new format: instead of requiring a '/' and no ','
I would have simply allowed any string not starting with ':' (which remains
as it has been for ages) and which does not match
         ^[[:alpha:]]+[+-]?[[:digit:]]+
(as an ERE) (add ".*$" to make it anchored both ends if desired).
This is easy enough to specify in words, rather than as an ERE, if that
achieves a better result.

That would allow a large variety of naming schemes, rather than committing
to one, when the actual name is not all that important to anything,
what is more important is the data it provides, and the specification of
what is required for that looks OK.

For the specification, if this variation is reasonable, I'd make that
pattern be the 2nd case, and have it be one that matches the TZ value
rather than one which does not (it then represents a currently specified
TZ string, "std offset dst offset, rule"), and then have the third case be
"any other string", and be the new one.
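
A minimal sketch of how the ERE above could drive the three-way split, written in Python. Python's re module has no POSIX bracket expressions, so [[:alpha:]] and [[:digit:]] are approximated with ASCII ranges (an assumption), and classify_by_pattern is a hypothetical name. Note that the pattern as written would not recognise a quoted std name such as <+03>-3, so a real specification would presumably need to cover that form as well.

```python
import re

# The ERE ^[[:alpha:]]+[+-]?[[:digit:]]+ rewritten with ASCII ranges.
# re.match() anchors at the start of the string, so no leading ^ is needed.
STD_OFFSET_PREFIX = re.compile(r"[A-Za-z]+[+-]?[0-9]+")


def classify_by_pattern(value):
    """Classify a TZ value: colon form, std/offset form, or anything else."""
    if value.startswith(":"):
        return "implementation-defined"
    if STD_OFFSET_PREFIX.match(value):
        return "std-offset"
    # Any other string is treated as a name to look up in the database.
    return "database-name"
```

With this split, names such as Europe/London or Etc/GMT+5 fall through to the database case because the character after the leading letters is a slash rather than a sign or digit.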
(0006073)
geoffclare (manager)
2022-11-21 09:59
edited on: 2022-11-21 10:02

Here's some alternative wording that disambiguates the 2nd and 3rd formats the way kre suggests (and swaps them round).

I don't have any preference between the original wording in the desired action and this alternative, although I can see that this one might be more future-proof.


On page 161 line 5616 section 8.3, change:
The value of TZ has one of the two forms (spaces inserted for clarity):
:characters
or:
std offset dst offset, rule
If TZ is of the first format (that is, if the first character is a <colon>), the characters following the <colon> are handled in an implementation-defined manner.

The expanded format (for all TZs whose value does not have a <colon> as the first character) is as follows:
to:
The application shall ensure that the value of TZ has one of the three forms (spaces inserted for clarity):
:characters
or:
std offset dst offset, rule
or:

A format specifying a geographical timezone or a special timezone.

If TZ is of the first format (that is, if the first character is a <colon>), the characters following the <colon> are handled in an implementation-defined manner.

The expanded form of the second format (without the inserted spaces) is as follows:

After page 163 line 5700 section 8.3, add:
If TZ is of the third format (that is, if the first character is not a <colon> and the value does not match the syntax for the second format), the value indicates either a geographical timezone or a special timezone from an implementation-defined timezone database. Typically these take the form
Area/Location
as in the IANA timezone database. Examples of geographical timezones that may be supported include <tt>Africa/Cairo</tt>, <tt>America/Indiana/Indianapolis</tt>, <tt>America/New_York</tt>, <tt>Asia/Tokyo</tt>, and <tt>Europe/London</tt>. The data for each geographical timezone shall include:

  • Whether Daylight Saving Time (DST) is observed, and if so the rules used to determine when the transitions to and from DST occur.

  • The offset from Coordinated Universal Time of the timezone's standard time and, if observed, of its DST.

  • The timezone names for standard time (std) and, if observed, for DST (dst) to be used by tzset(). These shall each contain no more than {TZNAME_MAX} bytes.
If there are any historical variations, or known future variations, of the above data for a geographical timezone, these variations shall be included in the database, except that historical variations from before the Epoch need not be included.

If the database incorporates the geographical timezones from the IANA timezone database, the implementation should provide a way to update it in accordance with RFC 6557; if this recommendation is not followed, an implementation-defined way to update the database shall be provided.

Implementations shall support the special timezone <tt>Etc/UTC</tt> and may support additional implementation-defined special timezones. The behavior for <tt>TZ=Etc/UTC</tt> shall be identical to <tt>TZ=UTC0</tt> (as described in the second format above).

On page 3460 line 118332 section A.8.3, change:
Implementations are encouraged to use the time zone database maintained by IANA to determine when Daylight Saving Time changes occur and to handle TZ values that start with a <colon>. See RFC 6557.
to:
Implementations are encouraged to incorporate the IANA timezone database into the timezone database used for TZ values specifying geographical and special timezones, and to provide a way to update it in accordance with RFC 6557.

The TZ format beginning with <colon> was originally introduced as a way for implementations to support geographical timezones in the form :Area/Location as an extension, but implementations started to support them without the leading <colon> (as well as with it) and their use without the <colon> became the de-facto standard. Consequently when geographical timezones were added to this standard, it was without the <colon>.


(0006075)
kre (reporter)
2022-11-21 14:38

Re Note: 0006073

That version is, I think, better than the one in the desired action.
It contains more about the RFC specified database than is really
needed, but that's harmless (and does make the intent clear).

I would make three changes to what is there though.

First, the first requirement of the database:

   Whether Daylight Saving Time (DST) is observed, and if so the rules
   used to determine when the transitions to and from DST occur.

That is not what the tzdata database (generally) provides - what it
gives is when the transitions actually occurred and, when it is possible,
a rule that can be used to predict the future (which is generally in the
form of an old style (2nd format here) TZ value). When that isn't possible
it will often guess future transition dates (that's often done when the
expected dates in the future depend upon, or are likely to be affected by,
astronomical or religious events - for which no simple 2nd format TZ string
can cover more than one future year). So, I would replace that
requirement with something less specific (and for this, I will inhibit my
usual attempt to substitute the more rational "summer time" for the ludicrous
"daylight saving")

    If daylight saving is or has been observed, a method to discover the dates
    and times of transitions to and from daylight saving time, the new offset
    from UTC which applied or will apply during periods of daylight saving,
    and how that is represented as a label (abbreviation) in textual form,
    for times in the past, and predictions of the future.

That is, in the database there are records like:
    At time_t=X the offset became N, which is [not] daylight saving time,
    and is called EST (or EDT) (or whatever).
There is a record like that for every recorded transition. Simple zones
(like UTC) just say "Since the beginning of time, the offset is 0, not
daylight saving, called 'UTC'" and that is the entire database. From
that (if it was ever needed, and it isn't, as there is no interface provided
from which anyone can request this info) we can conclude that daylight
saving is not used in this zone, as there are no records that say "is
daylight saving". A zone which used summer time for one short period,
sometime in the past, as an experiment perhaps - presumably one which
failed - would have the transitions indicating when summer time started,
and ended (for whatever years the experiment lasted) - and then nothing
after that. Whether that is considered as "having daylight saving" is
kind of difficult to judge - some people might say yes, because it was
used once, others no, because it isn't now, and isn't expected in the
future. But it really doesn't matter.
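
The record layout described above can be sketched as follows in Python. The Transition type, its field names, and the sample values are all hypothetical and chosen only for illustration; they are not real tzdata contents, nor the on-disk tzfile format.

```python
from bisect import bisect_right
from typing import NamedTuple


class Transition(NamedTuple):
    at: int            # time_t at which this record takes effect
    utc_offset: int    # seconds east of UTC from that moment on
    is_dst: bool       # whether this period counts as daylight saving
    abbreviation: str  # label such as "EST" or "EDT"


BEGINNING_OF_TIME = -2**63  # sentinel standing in for "since the beginning of time"

# A simple zone is a single record; a zone with transitions has one per change.
UTC_ZONE = [Transition(BEGINNING_OF_TIME, 0, False, "UTC")]
SAMPLE_ZONE = [  # made-up transition times, for illustration only
    Transition(BEGINNING_OF_TIME, -18000, False, "EST"),
    Transition(1000, -14400, True, "EDT"),
    Transition(2000, -18000, False, "EST"),
]


def lookup(zone, when):
    """Return the record in effect at time_t `when` (zone sorted by `at`)."""
    starts = [t.at for t in zone]
    return zone[bisect_right(starts, when) - 1]


def ever_observed_dst(zone):
    """True if any record in the zone is marked as daylight saving."""
    return any(t.is_dst for t in zone)
```

In this model the question of whether a zone "has daylight saving" reduces to whether any record is flagged as such, which matches the observation that a short past experiment and current regular use look the same in the data.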


Second, rather than:

   If the database incorporates the geographical timezones from the
   IANA timezone database, ....

I would say something more generic, like:

   If the format references an external database - that is, the value of
   TZ is not itself the specification of the timezone - the implementation
   shall provide an implementation defined method to allow the database to
   be updated, for example that specified by RFC 6557.

That is, it isn't only if an IANA format (tzdata) database is being
used that there needs to be a way to update it, anything which requires
external data (beyond the TZ string itself) needs the ability to be updated.

And, I would delete this paragraph:

     Implementations shall support the special timezone <tt>Etc/UTC</tt>
     and may support additional implementation-defined special timezones.
     The behavior for <tt>TZ=Etc/UTC</tt> shall be identical to
     <tt>TZ=UTC0</tt> (as described in the second format above).

That (or something similar) is needed in the version of the text in the
Desired Action, as that one bases the format around the Area/... concept,
and "Etc" cannot really be considered an area. In this updated version
that's not needed.

Other than to allow Etc/* to exist, there never was a reason to require
Etc/UTC to exist though - if someone desires UTC0 as their timezone, they
can always simply do: TZ=UTC0 - we do not need to mandate alternatives
also exist. Anyone using the tzdata files will get Etc/UTC (and just
UTC as well) but I see no reason to require any of those to exist.

For the purposes of testing (conformance tests) only the 2nd form of TZ
string should ever be used - while the other formats need to be tested,
one cannot write a test to determine correctness based upon those, as
we're not attempting (not even in the desired action version) to specify
what those actually mean (except that Etc/UTC special case). That is,
while one might expect America/New_York to give the data for the eastern
timezone of the US, there's no reason (not specified here, and nor should
we attempt to) that in some implementation that doesn't give the translation
data for Jeddah or Auckland. No conformance test can rely upon anything there.

Beyond what the labels mean, the data referenced by them is subject to more
or less arbitrary changes, as corrections to what was believed to be
historically correct are made, or incorrect predictions for the future
are updated.

Lastly, do not allow the (likely) implementation of this method as a
reference to tzdata to influence any wording which may prohibit an
implementation from simply giving all of the needed data as the (possibly
encoded) value of the TZ environment variable. All we
should be demanding is that implementations allow some method by which
real world timezone information can be passed to applications - particularly
via localtime() and mktime() using tzset(). How the implementation
chooses to do that should not matter. [But obviously using tzdata is far
and away the best/easiest method - at least currently.]


Personally I'd like to see more encouragement for support of historical
data (before the Epoch) when it can be determined, than "need not be
provided" suggests, but how much it is possible to encourage that, if
at all, I will leave for others to determine. There's no question but
that the further back in time one goes the less reliable is the data,
and of course, once one goes back before the introduction of standard time
(we don't even need to go back to the beginning of the Gregorian Calendar
in most jurisdictions for this) the whole thing becomes ludicrous - there's
no rational method of translating a local time into an offset from UTC (or
vice versa) when the local time was determined by someone looking at a shadow
and proclaiming "now it is noon, set the clock".
(0006076)
geoffclare (manager)
2022-11-21 15:27

In Note: 0006075 kre makes some valid points. Here is my attempt to apply them to my previous suggested wording.

I didn't include the "how that is represented as a label" part in the first (now second) bullet point, as I believe that is covered by a combination of the third bullet point and the part after the bullet list about historical variations.

I also omitted the text that talked about the value of TZ itself being the specification of the timezone, as implementations can use a TZ value beginning with colon to provide that ability.

On page 161 line 5616 section 8.3, change:
The value of TZ has one of the two forms (spaces inserted for clarity):
:characters
or:
std offset dst offset, rule
If TZ is of the first format (that is, if the first character is a <colon>), the characters following the <colon> are handled in an implementation-defined manner.

The expanded format (for all TZs whose value does not have a <colon> as the first character) is as follows:
to:
The application shall ensure that the value of TZ has one of the three forms (spaces inserted for clarity):
:characters
or:
std offset dst offset, rule
or:

A format specifying a geographical timezone or a special timezone.

If TZ is of the first format (that is, if the first character is a <colon>), the characters following the <colon> are handled in an implementation-defined manner.

The expanded form of the second format (without the inserted spaces) is as follows:

After page 163 line 5700 section 8.3, add:
If TZ is of the third format (that is, if the first character is not a <colon> and the value does not match the syntax for the second format), the value indicates either a geographical timezone or a special timezone from an implementation-defined timezone database. Typically these take the form
Area/Location
as in the IANA timezone database. Examples of geographical timezones that may be supported include <tt>Africa/Cairo</tt>, <tt>America/Indiana/Indianapolis</tt>, <tt>America/New_York</tt>, <tt>Asia/Tokyo</tt>, and <tt>Europe/London</tt>. The data for each geographical timezone shall include:

  • The offset from Coordinated Universal Time of the timezone's standard time.

  • If Daylight Saving Time (DST) is, or has historically been, observed: a method to discover the dates and times of transitions to and from DST and the offset from Coordinated Universal Time during periods when DST was, is, or is predicted to be, in effect.

  • The timezone names for standard time (std) and, if observed, for DST (dst) to be used by tzset(). These shall each contain no more than {TZNAME_MAX} bytes.
If there are any historical variations, or known future variations, of the above data for a geographical timezone, these variations shall be included in the database, except that historical variations from before the Epoch need not be included.

If the database incorporates an external database such as the one maintained by IANA, the implementation shall provide an implementation-defined method to allow the database to be updated, for example the method specified by RFC 6557.

On page 3460 line 118332 section A.8.3, change:
Implementations are encouraged to use the time zone database maintained by IANA to determine when Daylight Saving Time changes occur and to handle TZ values that start with a <colon>. See RFC 6557.
to:
Implementations are encouraged to incorporate the IANA timezone database into the timezone database used for TZ values specifying geographical and special timezones, and to provide a method to allow it to be updated in accordance with RFC 6557.

The TZ format beginning with <colon> was originally introduced as a way for implementations to support geographical timezones in the form :Area/Location as an extension, but implementations started to support them without the leading <colon> (as well as with it) and their use without the <colon> became the de-facto standard. Consequently when geographical timezones were added to this standard, it was without the <colon>.
(0006081)
kre (reporter)
2022-11-21 16:20
edited on: 2022-11-21 16:42

Bugnote:6076 is OK.

The mention of the labels in conjunction with the transitions was
an attempt to make it clearer that there isn't necessarily just one
"standard" and "daylight" ("summer") time label (each) for a zone.
What the time is called can vary over time... But you're right,
there is probably no compelling need to make that point.

OK on presuming that a database is used for this format (in practice
it needs to be anyway, the specification for some zones needs quite
a lot of data, more than is likely to be reasonable in a TZ value
string) - further, if implementations do introduce some other method
(eg: having TZ reference some external server over the net, which then
means there's no local database that ever needs updating) then if needed
some later version of the standard can adjust things to allow that.

ps: "but implementations started to support them without the leading <colon>"
is a slight fib - implementations had been supporting them that way long
before the ':' variant of TZ was invented.


- Issue History
Date Modified Username Field Change
2022-11-18 12:21 geoffclare New Issue
2022-11-18 12:21 geoffclare Name => Geoff Clare
2022-11-18 12:21 geoffclare Organization => The Open Group
2022-11-18 12:21 geoffclare Section => 8.3
2022-11-18 12:21 geoffclare Page Number => 161
2022-11-18 12:21 geoffclare Line Number => 5616
2022-11-18 12:23 geoffclare Desired Action Updated
2022-11-18 21:44 steffen Note Added: 0006065
2022-11-19 18:23 kre Note Added: 0006068
2022-11-21 09:59 geoffclare Note Added: 0006073
2022-11-21 10:02 geoffclare Note Edited: 0006073
2022-11-21 14:38 kre Note Added: 0006075
2022-11-21 15:27 geoffclare Note Added: 0006076
2022-11-21 16:20 kre Note Added: 0006081
2022-11-21 16:21 nick Final Accepted Text => See Note: 0006076
2022-11-21 16:21 nick Status New => Resolved
2022-11-21 16:21 nick Resolution Open => Accepted As Marked
2022-11-21 16:22 nick Tag Attached: issue8
2022-11-21 16:42 kre Note Edited: 0006081

