Austin Group Defect Tracker

Aardvark Mark IV


Viewing Issue Simple Details Jump to Notes ] Issue History ] Print ]
ID Category Severity Type Date Submitted Last Update
0001386 [Issue 8 drafts] System Interfaces Objection Error 2020-08-08 19:43 2020-10-08 10:32
Reporter eggert View Status public  
Assigned To
Priority normal Resolution Accepted As Marked  
Status Applied   Product Version Draft 1
Name Paul Eggert
Organization UCLA
User Reference
Section strftime
Page Number 1964-1965
Line Number 63876-63887
Final Accepted Text Note: 0005000
Summary 0001386: strftime RETURN VALUE and ERRORS problems
Description The RETURN VALUE and ERRORS section of strftime have been revised recently, and when I changed the tzdb reference implementation to support these revisions I found some problems.

1. The [EOVERFLOW] description requires strftime to fail if the requested time does not fit into time_t. This disallows a strftime implementation that internally uses an integer wider than time_t, e.g., a strftime operating in a mixed 32- and 64-bit time_t environment that uses 64 bits internally even when called via a 32-bit API. [EOVERFLOW] should be a "may fail", not a "shall fail".

2. No errno value is specified if strftime fails due to exhausting the output buffer. This is a longstanding problem with strftime: portable code has no way to distinguish success from failure when strftime returns 0, and this is a real problem in practice. The simplest way I can see to fix this is to require that errno is unchanged when 0 is returned on success. I've just now changed the reference strftime implementation in tzdb to do this, which was an easy thing to do while adding support for the other errno changes in draft POSIX; see <https://mm.icann.org/pipermail/tz/2020-August/029187.html>. [^]

3. Unnecessary terminology switching between "null byte" and "NUL character".
Desired Action In lines 63887-63878, change "If the total number of resulting bytes including the terminating null byte is not more than _maxsize_" to "If successful".

After line 63880, add "If successful and 0 is returned, errno is unchanged."

After line 63882, add (as a "shall fail"):

  [ERANGE] The total number of resulting bytes including the terminating NUL character is more than _maxsize_.

Move lines 63883-63884 to just after line 63887, so that [EOVERFLOW] is a "may fail" rather than a "shall fail".

After line 63906 "APPLICATION USAGE" add:

A return value of 0 may indicate either success or failure if a format consists entirely of conversion specifiers like "%z" that happen to be replaced by no characters. To distinguish between success and failure when strftime() returns 0, an application can set errno to 0 before calling strftime() and test whether errno is 0 afterwards.
Tags issue8
Attached Files

- Relationships
has duplicate 0001353Closed strftime errno issues 

-  Notes
(0004904)
eggert (reporter)
2020-08-08 19:53
edited on: 2020-08-08 22:43

Also see 0001353, where I've commented on the technical differences between the desired action there and the one here.

(0004905)
shware_systems (reporter)
2020-08-08 21:39

Re: #1
It does not matter the internal implementation, it can use a 128-bit long double if it wants, if the calculated value is not in the symbolic range, as integral seconds, TIME_MIN to TIME_MAX it is a shall fail condition. Perhaps other phrasing than "fit in" that better reflects this range is warranted but a section change is not, that I see.
(0004906)
kre (reporter)
2020-08-08 21:47
edited on: 2020-08-09 16:55

Trivia, a %z conversion can never result in 0 bytes, I think you meant %Z
but a much better (more likely) example is %p.

The ERANGE error isn't needed, it is easy to handle the ambiguity, by
simply never providing a conversion that can legitimately result in 0
bytes as the result. That simply means supplying an irrelevant prefix
character (like a space) at the start of the format if there is any possibility
that a successful return might be 0 bytes, or even always), and then ignoring
it (using s+1 as the resulting string, and result-1 as the length) when a
result >= 1 is returned (with 0 meaning one of the error cases, always).

(0004907)
eggert (reporter)
2020-08-09 07:19

"it can use a 128-bit long double if it wants, if the calculated value is not in the symbolic range, as integral seconds, TIME_MIN to TIME_MAX it is a shall fail condition."

Yes I know that is what the current draft spec says. However, there is no reason for the spec to say that, and that is why I objected. If the strftime implementation can output the correct string for %s even though the represented value is outside of time_t range, there's no good reason for POSIX to require the implementation to fail.
(0004909)
eggert (reporter)
2020-08-09 07:27

"That simply means supplying an irrelevant prefix character (like a space) at the start of the format if there is any possibility that a successful return might be 0"

That is not such a simple thing in a general-purpose subroutine such as a one that accepts a strftime format, generates a date string with that format, and puts the date string in a user-specified buffer. Such a subroutine would have to allocate a buffer to store a near-copy of the format, and another buffer to store a near-copy of the returned string. This is awkward and inefficient and hardly anybody does this. It's more common to use code like this:

        cp[0] = '\1';
        result = strftime(cp, size, format, tmp);
        if (result != 0 || cp[0] == '\0')
            /* strftime worked */;

(this is code taken from tzdb date.c) which "generally" works (although it relies on undocumented strftime internals) and which we hope is good enough. It would be much better if the strftime API gave us a straightforward way to distinguish success from failure. And this is very easy for implementations to do.

"I think you meant %Z" - You're right, thanks.
(0004910)
shware_systems (reporter)
2020-08-09 12:06
edited on: 2020-08-09 21:57

Re: Note: 0004907

The type for "seconds since the epoch" is time_t, so values outside its range are not valid specifications for that platform. For 32-bit time_t's this is the 2038 rollover issue. That you need 60 bits or so to represent seconds since the Big Bang, and counting, is a known bug of many implementations in not being able to represent all times of scientific interest in a time_t. You certainly need more than 32 bits if the range of tm_year is 32 bits, and these don't even cover the age of the planets, as a similar defect of most tm struct definitions. Yes, doing the conversion as if by mktime() is somewhat arbitrary, but it is consistent with other uses of "seconds since the epoch" elsewhere in the standard.

(0004911)
kre (reporter)
2020-08-09 16:45
edited on: 2020-08-09 16:47

Re Note: 0004905 Note: 0004907 and Note: 0004910

Paul is correct, there is no reason to prohibit an implementation of strftime()
from accepting a struct tm that cannot be converted into a time_t. The spec
should allow an implementation to fail in that case, not require it to do so.
The purpose of strftime() is not tm validation (if it were so, then the shall
fail would make sense, but it isn't, so it doesn't).

(0004912)
kre (reporter)
2020-08-09 16:51

Re Note: 0004909

You're right, the workaround can be difficult to handle in a poorly
designed subroutine. Redesigining the subroutine API can allow it to
be handled simply however. In practice there are few such interfaces,
and even fewer exist in places where a 0 length valid result is
plausible, so most of the time, handling strftime() as it now exists
is not a issue worth complaining about.
(0004913)
shware_systems (reporter)
2020-08-10 11:40
edited on: 2020-08-10 14:52

Re: Note: 0004911
If a platform wants that functionality it can add a char *func(const tm *arg) interface for it. A proposal for adding this using a %Os specification might get support also, if implemented.

Apparently enough platforms that added %s as an extension were using mktime() as part of the code for the conversions so the standard reflects this as the existing practice. Someone that was part of the discussions would need to verify that. Since mktime() does validate the structure, strftime() also is required to validate it now, if %s is used.

(0004917)
kre (reporter)
2020-08-10 17:45
edited on: 2020-08-10 23:49

Re Note: 0004913

    [...] extension were using mktime() as part of the code for the
    conversions so the standard reflects this as the existing practice.

That's fine, and allowing the error is fine too.

But:

    Since mktime() does validate the structure, strftime() also is
    required to validate it now, if %s is used.

In the specification of the 's' conversion, after "as if by a call to mktime()"
we should add

    except that errors that may be generated by mktime need not be
    detected, and no actual time_t value need be generated.

It is "as if" by mktime() - nothing requires that mktime() actually be
used (even for %s) - especially as:

    determined using the LC_TIME category of the current locale and by the
    values of zero or more members of the broken-down time structure pointed
    to by timeptr, as specified in brackets in the description. If any of the
    specified values are outside the normal range, the characters stored are
    unspecified.

That is, struct tm normalisation is not required of strftime() (even though
mktime() does that) as if any values are out of range, the results are
unspecified. So an implementation could calculate the number of seconds
by simply usig a series of multiplications and additions based upon the
fields of the tm struct (with appropriate checks for month lengths, leap years
etc) with the results in any integral type it chooses.

There should be no requirement for an EOVERFLOW to be generated, that error
should be moved into the "may fail" section. Forcing implementations to
be defective, just because old ones were is not desirable or productive.

(0005000)
rhansen (manager)
2020-09-21 15:29
edited on: 2020-09-21 15:31

On page 1961 line 63753 fix the shading bug.

On page 1962 line 63784 section strftime(), change:
Replaced by the number of seconds since the Epoch as a decimal number (see XBD Section 4.17, on page 93). [all members, as if by a call to mktime()]
to:
Replaced by the number of seconds since the Epoch as a decimal number, calculated as described for mktime(). [all members]

On page 1964 lines 63877-63878, change:
If the total number of resulting bytes including the terminating null byte is not more than maxsize
to:
If successful

On page 1964 line 63879, change:
Otherwise
to:
If successful, errno shall not be changed. Otherwise

On page 1964, after line 63882, add (as a "shall fail"):
[ERANGE] The total number of resulting bytes including the terminating NUL character is more than maxsize.

On page 1965, move lines 63883-63884 to just after line 63887, so that [EOVERFLOW] is a "may fail" rather than a "shall fail".

On page 1965 after line 63906 "APPLICATION USAGE" add:
A return value of 0 may indicate either success or failure if a format is the empty string or consists of conversion specifications such as <tt>%p</tt> or <tt>%Z</tt> that are replaced by no characters in the current locale or because of the current setting of tzname[], respectively. To distinguish between success and failure when strftime() returns 0, an application can set errno to 0 before calling strftime() and test whether errno is 0 afterwards.



- Issue History
Date Modified Username Field Change
2020-08-08 19:43 eggert New Issue
2020-08-08 19:43 eggert Name => Paul Eggert
2020-08-08 19:43 eggert Organization => UCLA
2020-08-08 19:43 eggert Section => strftime
2020-08-08 19:43 eggert Page Number => 1964-1965
2020-08-08 19:43 eggert Line Number => 63876-63887
2020-08-08 19:53 eggert Note Added: 0004904
2020-08-08 21:39 shware_systems Note Added: 0004905
2020-08-08 21:47 kre Note Added: 0004906
2020-08-08 22:41 Don Cragun Relationship added related to 0001353
2020-08-08 22:43 Don Cragun Note Edited: 0004904
2020-08-09 07:19 eggert Note Added: 0004907
2020-08-09 07:19 eggert Note Added: 0004908
2020-08-09 07:19 eggert Note Deleted: 0004908
2020-08-09 07:27 eggert Note Added: 0004909
2020-08-09 12:06 shware_systems Note Added: 0004910
2020-08-09 16:45 kre Note Added: 0004911
2020-08-09 16:47 kre Note Edited: 0004911
2020-08-09 16:51 kre Note Added: 0004912
2020-08-09 16:55 kre Note Edited: 0004906
2020-08-09 21:57 Don Cragun Note Edited: 0004910
2020-08-10 11:40 shware_systems Note Added: 0004913
2020-08-10 14:52 Don Cragun Note Edited: 0004913
2020-08-10 17:45 kre Note Added: 0004917
2020-08-10 23:49 kre Note Edited: 0004917
2020-09-03 15:49 Don Cragun Relationship replaced has duplicate 0001353
2020-09-21 15:29 rhansen Note Added: 0005000
2020-09-21 15:30 rhansen Note Edited: 0005000
2020-09-21 15:30 rhansen Note Edited: 0005000
2020-09-21 15:30 rhansen Note Edited: 0005000
2020-09-21 15:31 rhansen Note Edited: 0005000
2020-09-21 15:33 rhansen Final Accepted Text => Note: 0005000
2020-09-21 15:33 rhansen Status New => Resolved
2020-09-21 15:33 rhansen Resolution Open => Accepted As Marked
2020-09-21 15:33 rhansen Tag Attached: issue8
2020-10-08 10:32 geoffclare Status Resolved => Applied


Mantis 1.1.6[^]
Copyright © 2000 - 2008 Mantis Group
Powered by Mantis Bugtracker