| ID | Category | Severity | Type | Date Submitted | Last Update |
| 0000073 | [1003.1(2008)/Issue 7] System Interfaces | Comment | Clarification Requested | 2009-06-28 15:20 | 2009-11-05 16:20 |
| Reporter | nick | View Status | public |
| Assigned To | ajosey |
| Priority | normal | Resolution | Open |
| Status | Under Review |
| Name | Nick Stoughton |
| Organization | USENIX |
| User Reference | nms-C-wmemcmp |
| Section | wmemcmp |
| Page Number | 2254 |
| Line Number | 70784-70789 |
| Interp Status | --- |
| Final Accepted Text | |
| Summary | 0000073: wmemcmp C conflict? |
| Description |
This issue is for tracking purposes. The following question is being discussed in the C committee at present, and highlights a difference between C-1990 with AMD-1 and C99. POSIX has followed the C89+AMD1 words, and so is possibly at odds with C99.

===> from Joseph Myers

When are wide string library functions required to handle values of type wchar_t that do not represent any value in the execution character set, and when does using such values with a library function result in undefined behavior? Consider the following testcase as an example:

    #include <stdlib.h>
    #include <wchar.h>

    wchar_t w0 = WCHAR_MIN;
    wchar_t w1 = WCHAR_MAX;

    int
    main (void)
    {
      if (wmemcmp (&w0, &w1, 1) < 0)
        return 0;
      else
        abort ();
    }

Suppose that WCHAR_MIN and WCHAR_MAX do not both represent values in the execution character set. If the arguments to wmemcmp are valid, wmemcmp must return a value less than 0 because 7.24.4.4 says the comparison is done the same way as comparing integers of type wchar_t, so the program must execute successfully. With the GNU C Library, however, it aborts; wchar_t is UTF-32 but has a signed type so WCHAR_MIN is negative and does not represent a member of the execution character set.

C90 AMD1 had an explicit statement (7.16.4.6) that made clear that these inputs were valid (and so wmemcmp had to return a value less than 0 for the above example in C90 AMD1):

    These functions operate on arrays of type wchar_t whose size is specified by a separate count argument. These functions are not affected by locale and all wchar_t values are treated identically. The null wide character and wchar_t values not corresponding to valid multibyte characters are not treated specially.

I cannot however find any equivalent statement in C99. Was this a deliberate change from AMD1, or a side-effect of how the functions were rearranged when added to C99?

POSIX repeats the above requirement from C90 AMD1, but I believe this is an accident of taking the specification from there originally and is not intended to impose any requirements beyond those of C99.

Much the same issue applies to wcscmp and wcsncmp, where the comparison semantics are specified but AMD1 has no mention of wide characters not corresponding to members of the execution character set, and in principle to other wcs* and wmem* functions that have no reason to need to consider the semantics of the characters they process (but are less likely than the comparison functions to have problems with the full set of wchar_t values in practice). |
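
To make the two possible orderings in the description concrete, here is a minimal sketch in C. The names ref_wmemcmp and unsigned_wmemcmp are hypothetical and are not taken from any C library. The first compares elements as integers of type wchar_t, which is how the description reads 7.24.4.4. The second is only an assumed illustration of how an implementation could end up with the opposite answer when wchar_t is signed; it is not a claim about how the GNU C Library is actually written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical reference comparison: orders elements as integers of
   type wchar_t, so a negative value sorts before a positive one. */
int ref_wmemcmp(const wchar_t *s1, const wchar_t *s2, size_t n)
{
    assert(n == 0 || (s1 != NULL && s2 != NULL));
    for (size_t i = 0; i < n; i++) {
        if (s1[i] != s2[i])
            return s1[i] < s2[i] ? -1 : 1;
    }
    return 0;
}

/* Hypothetical variant (an assumption for illustration only): orders
   elements after converting them to uintmax_t. If wchar_t is signed,
   negative values convert to very large unsigned values and therefore
   sort after every non-negative value. */
int unsigned_wmemcmp(const wchar_t *s1, const wchar_t *s2, size_t n)
{
    assert(n == 0 || (s1 != NULL && s2 != NULL));
    for (size_t i = 0; i < n; i++) {
        uintmax_t a = (uintmax_t)s1[i];
        uintmax_t b = (uintmax_t)s2[i];
        if (a != b)
            return a < b ? -1 : 1;
    }
    return 0;
}
```

Called on the testcase's inputs (w0 = WCHAR_MIN, w1 = WCHAR_MAX, count 1), ref_wmemcmp always returns a negative value. unsigned_wmemcmp agrees only where wchar_t is an unsigned type; where wchar_t is signed it returns a positive value, which is the kind of disagreement the testcase is designed to expose.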
| Desired Action |
Await a decision from the C committee and, if necessary, make whatever change is needed to align with the emerging C standard. Issue an interpretation to describe the discrepancy. |
| Tags | c99 |