Austin Group Defect Tracker

ID 0001647
Category [1003.1(2016/18)/Issue7+TC2] System Interfaces
Severity Objection
Type Clarification Requested
Date Submitted 2023-03-28 16:32
Last Update 2023-08-22 14:22
Reporter eblake
View Status public
Assigned To
Priority normal
Resolution Accepted As Marked
Status Applied
Name Eric Blake
Organization Red Hat
User Reference ebb.printf %lc
Section fprintf
Page Number 913
Line Number 30957
Interp Status Approved
Final Accepted Text Note: 0006239
Summary 0001647: printf("%lc", (wint_t)0) can't output NUL byte
Description In comparing a table of wide vs. narrow print operations, coupled with the NUL byte/character, we have the following surprising table of results:
narrow with narrow: printf("%c", '\0') -> 1 NUL byte
wide with wide: wprintf("%lc", L'\0') -> 1 NUL character
wide with narrow: wprintf("%c", '\0') -> 1 NUL character
narrow with wide: printf("%lc", L'\0') -> 0 bytes

Why? Because "If an l (ell) qualifier is present, the wint_t argument shall be converted as if by an ls conversion specification with no precision and an argument that points to a two-element array of type wchar_t, the first element of which contains the wint_t argument to the ls conversion specification and the second element contains a null wide character.", and printf("%ls", L"") outputs 0 bytes.

Even though ISO C has specified this for more than 23 years, it would make a lot more sense if 0 weren't special-cased as the one wide character you can't print to a narrow stream. Most libc implementations have done the common-sense mapping, and only recently did we learn that musl differed from everyone else in actually obeying the literal requirements of C, leading to this glibc bug report: https://sourceware.org/bugzilla/show_bug.cgi?id=30257

Since these interfaces defer to the C standard unless explicitly stated otherwise, any change we make here will need to be coordinated with WG14. I recommend that the Austin Group start by filing a ballot defect report against the upcoming C23 recommending that narrow *printf %lc should behave like the other three combinations. At that point, even though Issue 8 will be tied to C17, which has the undesirable semantics, we can use <CX> shading to require POSIX to be in line with what C23 will land on. However, we should not start an interpretation request unless we know for sure how WG14 wants to proceed.
Desired Action After coordination with WG14, and after applying the change to 0001643, change page 913 line 30957 (fprintf DESCRIPTION for <tt>%c</tt>) from:
If an <tt>l</tt> (ell) qualifier is present, the wint_t argument shall be converted as if by an <tt>ls</tt> conversion specification with no precision and an argument that points to a two-element array of type wchar_t, the first element of which contains the wint_t argument to the <tt>lc</tt> conversion specification and the second element contains a null wide character.
to:
If an <tt>l</tt> (ell) qualifier is present, <CX>the wint_t argument shall be converted to a multi-byte sequence as if by a call to wcrtomb( ) with the wint_t argument converted to wchar_t and an initial shift state, and the resulting bytes written.</CX>
Tags applied_after_i8d3, issue8
Attached Files

Relationships
related to 0001643 Applied 1003.1(2016/18)/Issue7+TC2 fprintf %lc: wrong reference to the current conversion specification
related to 0001755 Applied Issue 8 drafts not deferring to C17 on specifics has knock-on effects

Notes
(0006237)
eblake (manager)
2023-03-28 17:30

Maybe also add a paragraph to RATIONALE, page 999 line 34005:
The behavior specified for <tt>%lc</tt> conversions differs slightly from the specification in the ISO C standard, in that printing the null wide character produces a NUL byte instead of 0 bytes of output as would be required by a strict reading of the C standard's direction to behave as if applying the <tt>%ls</tt> specifier to a wchar_t array whose first element is the null wide character. Requiring a multibyte output for every possible wide character, including the null character, matches historical practice, and provides consistency with <tt>%c</tt> in fprintf( ) and with both <tt>%c</tt> and <tt>%lc</tt> in fwprintf( ). It is anticipated that a future edition of the C standard will change to match the behavior specified here.
(0006239)
eblake (manager)
2023-03-30 16:33
edited on: 2023-04-03 15:28

In addition to this interpretation response, the Austin Group plans to file a ballot defect on C23 to WG14.

Interpretation response
------------------------
The standard states that printf("%lc", (wint_t)0) writes no bytes, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.

Rationale:
-------------
The requirement to write no bytes does not match historical practice. However, the requirement derives from the ISO C standard and an attempt to change the requirements in a TC for Issue 7 would introduce a conflict. Therefore this will be addressed in Issue 8 by not deferring to the ISO C standard regarding this behavior.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
Change page 909 line 30747 (fprintf DESCRIPTION) from:

    
Excluding dprintf(): The functionality described on this reference page is aligned with the ISO C standard. Any conflict between the requirements described here and the ISO C standard is unintentional. This volume of POSIX.1-2017 defers to the ISO C standard.


to:

    
Except for dprintf() and the behavior of the <tt>%lc</tt> conversion when passed a null wide character, the functionality described on this reference page is aligned with the ISO C standard. Any other conflict between the requirements described here and the ISO C standard is unintentional. This volume of POSIX.1-202x defers to the ISO C standard for all fprintf(), printf(), snprintf(), and sprintf() functionality except in relation to the <tt>%lc</tt> conversion when passed a null wide character.


    
Change page 913 line 30957 (fprintf DESCRIPTION for <tt>%c</tt>) from:
    
If an <tt>l</tt> (ell) qualifier is present, the wint_t argument shall be converted as if by an <tt>ls</tt> conversion specification with no precision and an argument that points to a two-element array of type wchar_t, the first element of which contains the wint_t argument to the <tt>ls</tt> conversion specification and the second element contains a null wide character.

to:
    
If an <tt>l</tt> (ell) qualifier is present, [CX]the wint_t argument shall be converted to a multi-byte sequence as if by a call to wcrtomb( ) with a pointer to storage of at least MB_CUR_MAX bytes, the wint_t argument converted to wchar_t, and an initial shift state, and the resulting byte(s) written.[/CX]


Add a paragraph to RATIONALE, page 920 line 31263:
    
The behavior specified for the <tt>%lc</tt> conversion differs slightly from the specification in the ISO C standard, in that printing the null wide character produces a null byte instead of 0 bytes of output as would be required by a strict reading of the ISO C standard's direction to behave as if applying the <tt>%ls</tt> specifier to a wchar_t array whose first element is the null wide character. Requiring a multibyte output for every possible wide character, including the null character, matches historical practice, and provides consistency with <tt>%c</tt> in fprintf( ) and with both <tt>%c</tt> and <tt>%lc</tt> in fwprintf( ). It is anticipated that a future edition of the ISO C standard will change to match the behavior specified here.


(0006246)
eblake (manager)
2023-04-03 15:27

On 2023-04-03, Note: 0006239 was edited in place to fix a typo and start the interpretation response period.
(0006248)
ajosey (manager)
2023-04-03 16:31

An interpretation is proposed: 3rd April 2023
(0006251)
hvd (reporter)
2023-04-03 16:46

The proposed change makes no change to the requirement for c99/c17 to provide a conforming C99/C17 environment. Does that imply that the resolution requires implementations to provide different behaviour depending on whether _POSIX_C_SOURCE is defined?
(0006346)
geoffclare (manager)
2023-06-23 16:19

Re Note: 0006251 Where does the Issue 8 draft say that the c17 utility has to provide a conforming C17 environment? I only see a requirement for it to "accept source code conforming to the ISO C standard", which is not the same thing.
Also, conforming applications are required to define _POSIX_C_SOURCE (for POSIX-conforming applications) or _XOPEN_SOURCE (for XSI-conforming applications) so I don't see why POSIX would even try to specify how an implementation behaves when applications are compiled without defining one of those macros.
(0006347)
hvd (reporter)
2023-06-23 16:45

> Where does the Issue 8 draft say that the c17 utility has to provide a conforming C17 environment? I only see a requirement for it to "accept source code conforming to the ISO C standard", which is not the same thing.

It requires more than that: it follows from the deference to the C standard's translation phases that all of the requirements of those translation phases apply to the c17 utility. But the requirement to "accept source code conforming to the ISO C standard" is by itself already sufficient. A conforming C17 program can call printf("%lc", (wint_t)0).

> Also, conforming applications are required to define _POSIX_C_SOURCE (for POSIX-conforming applications) or _XOPEN_SOURCE (for XSI-conforming applications)

This requirement is only placed on a "Strictly Conforming POSIX Application". No such requirement exists for a "Conforming POSIX Application". Even if such a requirement did exist, it would be irrelevant, because the description of the c17 utility is literally "compile standard C programs" and, as you say, it is required to "accept source code conforming to the ISO C standard", regardless of whether such code meets the requirements on Strictly Conforming POSIX Applications or on Conforming POSIX Applications.
(0006349)
geoffclare (manager)
2023-06-26 09:25

> it follows by the deference to the C standard's translation phases that all of the requirements of those translation phases apply to the c17 utility

Irrelevant, as there is nothing in the C17 description of those translation phases in 5.1.1.2 that would pull in the C17 requirements for the behaviour of printf() from 7.21.6.3.

> the requirement to "accept source code conforming to the ISO C standard" is by itself already sufficient. A conforming C17 program can call printf("%lc", (wint_t)0).

Non-sequitur. If the source code contains a call to printf("%lc", (wint_t)0), c17 is required to produce a program (or shared library) that calls a function called printf() with those two arguments. There is nothing that says the behaviour of that function has to be as described in C17; the required behaviour is described in XSH and any behavioural requirements from C17 derive only from the XSH description, so that is the only place that needs to state anything about POSIX not deferring to C17 on a specific aspect of printf()'s behaviour.

> This requirement is only placed on "Strictly Conforming POSIX Application". No such requirement exists for "Conforming POSIX Application".

The relevant text from XSH 2.2.1.1 is:
A POSIX-conforming application shall ensure that the feature test macro _POSIX_C_SOURCE is defined before inclusion of any header.
"POSIX-conforming application" is not the same as "Conforming POSIX Application"; this text does not distinguish between the different ways an application can conform to POSIX, and so applies to all three.
(0006350)
hvd (reporter)
2023-06-26 09:56

> Irrelevant, as there is nothing in the C17 description of those translation phases in 5.1.1.2 that would pull in the C17 requirements for the behaviour of printf() from 7.21.6.3.

What library components do you think translation phase 8 is referring to, if not the components of the library as defined in section 7? That is what "library" means in the C standard; there is nothing else it could be referring to. An implementation that links in a printf() that conflicts with 7.21.6.3 doesn't implement translation phase 8 in the manner specified by the C standard, and therefore doesn't implement the c17 utility in the manner proposed for the next version of POSIX.

> Non-sequitur.

See above.

> "POSIX-conforming application" is not the same as "Conforming POSIX Application";

Oh, at the start, you wrote "conforming application". This term is not actually defined anywhere, but is used elsewhere to mean "conforming POSIX application", not "POSIX-conforming application", see e.g. Rationale for Base Definitions. You're right, "POSIX-conforming application" does require that. Which doesn't change anything, as nothing in the specification of the c17 utility requires its inputs to be POSIX-conforming applications any more than it requires its inputs to be conforming POSIX applications.
(0006361)
geoffclare (manager)
2023-06-27 08:58

> What library components do you think translation phase 8 is referring to, if not the components of the library as defined in section 7?

It is a reference back to the following text in 5.1.1.1:
Previously translated translation units may be preserved individually or in libraries.
If the use of "library components" in 5.1.1.2 was intended to be referring to section 7, then section 7 would be included in the forward references at the end of 5.1.1.2, but it is not there.

> Oh, at the start, you wrote "conforming application".

That's because I was talking about both POSIX-conforming applications and XSI-conforming applications.
(0006365)
geoffclare (manager)
2023-06-27 12:29

Some historical background: the references to translation phases are not in the current standard on the c99 page. They were added to the c17 page in the Issue 8 drafts as part of the bug 0001294 additions to provide a way to create shared libraries. There was certainly no intention that they would have any effect on how standard functions are required to behave.
(0006366)
hvd (reporter)
2023-06-29 01:47
edited on: 2023-06-29 02:01

> Some historical background: the references to translation phases are not in the current standard on the c99 page. They were added to the c17 page in the Issue 8 drafts as part of the bug 0001294 additions to provide a way to create shared libraries. There was certainly no intention that they would have any effect on how standard functions are required to behave.

Agreed, but not in the same way that you intend. Even the c89 utility was specified as "compile standard C programs" and "it will accept source code conforming to the ISO C standard". It has always seemed so obvious to me that if _POSIX_C_SOURCE/_XOPEN_SOURCE is not defined, the requirements of ISO C continue to apply in full, that it did not need to be explicitly stated, and changes to the c17 utility that make it easier to infer this should not be read as a change in requirements, merely as a clarification.

As it turns out, there is one other place that does say this is what is intended: the description of CX "Extension to the ISO C standard", first added in SUSv3, but I am quoting from 202x_d3-cb2.1.pdf here.

> Extension to the ISO C standard

> The functionality described is an extension to the ISO C standard. Application developers may make use of an extension as it is supported on all POSIX.1-202x-conforming systems. With each function or header from the ISO C standard, a statement to the effect that "any conflict is unintentional" is included. That is intended to refer to a direct conflict. POSIX.1-202x acts in part as a profile of the ISO C standard, and it may choose to further constrain behaviors allowed to vary by the ISO C standard. Such limitations and other compatible differences are not considered conflicts, even if a CX mark is missing. The markings are for information only. Where additional semantics apply to a function or header, the material is identified by use of the CX margin legend.

This is why POSIX requires _POSIX_C_SOURCE in the first place: when _POSIX_C_SOURCE/_XOPEN_SOURCE is not defined, the requirements of ISO C apply and ISO C does not permit e.g. for <stdio.h> to declare getline(). When _POSIX_C_SOURCE/_XOPEN_SOURCE is defined, however, as far as ISO C is concerned, the behaviour is undefined, and another standard such as POSIX that places otherwise conflicting requirements only when such macros are defined is actually not in conflict. I understand from your comments that your view is that when _POSIX_C_SOURCE/_XOPEN_SOURCE is not defined, rather than deferring to ISO C, the behaviour is undefined (unspecified?) in POSIX. For instance, given

  $ cat >program.c <<EOF &&
  > #include <stdio.h>
  > int main(void) { puts("Hello, world!"); }
  > EOF
  > c17 -o program program.c &&
  > ./program

the output is not required by POSIX to be "Hello, world!". Is that a correct interpretation of your comments?

If it is, the whole "Extension to the ISO C standard" is silly: there cannot be even a single program that is both valid under ISO C and valid under POSIX under your interpretation that includes any standard library headers: if _POSIX_C_SOURCE/_XOPEN_SOURCE are defined, the behaviour is undefined under ISO C, if _POSIX_C_SOURCE/_XOPEN_SOURCE are not defined, the behaviour is undefined under POSIX. That does not extend ISO C in any meaningful way.

But, actually, this raises another concern: ISO C does not require the use of standard library headers (7.1.4p2). On implementations where wint_t is a typedef for int, the below is a correct (but not strictly conforming) ISO C program:

  extern int printf(const char * restrict format, ...);
  int main(void) { printf("%lc\n", 0); }

Avoiding the use of standard library headers makes this a POSIX-conforming application even without defining _POSIX_C_SOURCE, so presumably would be required to be accepted by the c17 utility even under your interpretation. This exact same program is required by ISO C17 to output nothing, and by POSIX with the proposed bug resolution to output a null byte. As such, the proposed change makes it impossible to continue to classify POSIX as "a profile of the ISO C standard".

Note that if WG14 would be willing to classify this as a defect, that avoids the entire issue. If this is classified as a defect, that gives licence to implementations to apply the change even when claiming conformance to older versions of ISO C, and avoids the need for a conflict between ISO C17 and POSIX. Would they be willing to do so?

(0006369)
geoffclare (manager)
2023-06-29 11:28

> As it turns out, there is one other place that does say this is what is intended: the description of CX "Extension to the ISO C standard"

The CX text is just giving information about statements that exist on other pages. But thanks for pointing it out, because that text no longer matches the statements it is referring to, and so needs adjusting.

> I understand from your comments that your view is that when _POSIX_C_SOURCE/_XOPEN_SOURCE is not defined, rather than deferring to ISO C, the behaviour is undefined (unspecified?) in POSIX. [...] Is that a correct interpretation of your comments?

Yes, it's undefined. Anywhere POSIX requires conforming applications to do something (such as the numerous places it says "The application shall ensure that ..."), if an application doesn't do that thing then the behaviour is undefined. I see no reason that this would not include the requirement to define _POSIX_C_SOURCE or _XOPEN_SOURCE.

> If it is, the whole "Extension to the ISO C standard" is silly: there cannot be even a single program that is both valid under ISO C and valid under POSIX under your interpretation that includes any standard library headers: if _POSIX_C_SOURCE/_XOPEN_SOURCE are defined, the behaviour is undefined under ISO C, if _POSIX_C_SOURCE/_XOPEN_SOURCE are not defined, the behaviour is undefined under POSIX. That does not extend ISO C in any meaningful way.

It just means that conforming to ISO C and conforming to POSIX are two independent choices that implementations can make. If an implementation chooses to conform to both, then there will indeed be many programs that are both valid under ISO C and valid under POSIX, and on such implementations POSIX extends C. But implementations ought to be able to choose to just support POSIX and not "bare C" if they want to.

> ISO C does not require the use of standard library headers

True, which means an implementation that wants to support both printf() behaviours would not be able to choose between them based on whether _POSIX_C_SOURCE/_XOPEN_SOURCE is defined. But there are other ways they could do it (command line options and environment variables being the most obvious).

> Note that if WG14 would be willing to classify this as a defect, that avoids the entire issue.

That is indeed the root cause of this entire issue. WG14 are not issuing DRs for C17, so we are forced to address C17 defects by saying that POSIX does not defer to C17 for certain specific things.

I expect that in practice, implementations will treat the change in C23 as being a de-facto DR for C17 and act accordingly.
(0006377)
geoffclare (manager)
2023-07-06 08:47

I have submitted bug 0001755 to address the points discussed in notes 6251 through 6369.
(0006403)
agadmin (administrator)
2023-07-27 15:28

Interpretation approved: 27 July 2023

Issue History
Date Modified Username Field Change
2023-03-28 16:32 eblake New Issue
2023-03-28 16:32 eblake Name => Eric Blake
2023-03-28 16:32 eblake Organization => Red Hat
2023-03-28 16:32 eblake User Reference => ebb.printf %lc
2023-03-28 16:32 eblake Section => fprintf
2023-03-28 16:32 eblake Page Number => 913
2023-03-28 16:32 eblake Line Number => 30957
2023-03-28 16:32 eblake Interp Status => ---
2023-03-28 16:32 eblake Relationship added child of 0001643
2023-03-28 16:33 eblake Desired Action Updated
2023-03-28 17:02 eblake Desired Action Updated
2023-03-28 17:30 eblake Note Added: 0006237
2023-03-28 17:31 eblake Description Updated
2023-03-28 17:37 eblake Desired Action Updated
2023-03-30 16:33 eblake Note Added: 0006239
2023-03-30 17:14 eblake Tag Attached: tc3-2008
2023-03-30 17:14 eblake Tag Attached: issue8
2023-04-03 15:27 eblake Note Added: 0006246
2023-04-03 15:28 eblake Note Edited: 0006239
2023-04-03 15:28 geoffclare Tag Detached: tc3-2008
2023-04-03 15:31 ajosey Interp Status --- => Pending
2023-04-03 15:31 ajosey Final Accepted Text => Note: 0006239
2023-04-03 15:31 ajosey Resolution Open => Accepted As Marked
2023-04-03 15:32 ajosey Status New => Interpretation Required
2023-04-03 16:31 ajosey Note Added: 0006248
2023-04-03 16:31 ajosey Status Interpretation Required => Resolution Proposed
2023-04-03 16:46 hvd Note Added: 0006251
2023-04-03 17:15 ajosey Interp Status Pending => Proposed
2023-04-03 17:15 ajosey Status Resolution Proposed => Interpretation Required
2023-04-20 16:22 geoffclare Relationship replaced related to 0001643
2023-06-23 16:19 geoffclare Note Added: 0006346
2023-06-23 16:45 hvd Note Added: 0006347
2023-06-26 09:25 geoffclare Note Added: 0006349
2023-06-26 09:56 hvd Note Added: 0006350
2023-06-27 08:58 geoffclare Note Added: 0006361
2023-06-27 12:29 geoffclare Note Added: 0006365
2023-06-29 01:47 hvd Note Added: 0006366
2023-06-29 02:01 hvd Note Edited: 0006366
2023-06-29 11:28 geoffclare Note Added: 0006369
2023-07-06 08:45 geoffclare Relationship added related to 0001755
2023-07-06 08:47 geoffclare Note Added: 0006377
2023-07-27 15:28 agadmin Interp Status Proposed => Approved
2023-07-27 15:28 agadmin Note Added: 0006403
2023-08-22 14:22 geoffclare Status Interpretation Required => Applied
2023-08-22 14:23 geoffclare Tag Attached: applied_after_i8d3

