Austin Group Defect Tracker



ID 0000708
Category [1003.1(2013)/Issue7+TC1] System Interfaces
Severity Editorial
Type Enhancement Request
Date Submitted 2013-06-07 21:23
Last Update 2023-09-04 10:15
Reporter dalias
View Status public
Assigned To
Priority normal
Resolution Accepted As Marked
Status Applied
Name Rich Felker
Organization musl libc
User Reference
Section XSH 2.9.1 Thread-Safety
Page Number unknown
Line Number unknown
Interp Status ---
Final Accepted Text Note: 0006433
Summary 0000708: Make mblen, mbtowc, and wctomb thread-safe for alignment with C11
Description Per C11 7.1.4 paragraph 5,

"Unless explicitly stated otherwise in the detailed descriptions that follow, library functions shall prevent data races as follows: A library function shall not directly or indirectly access objects accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function's arguments. A library function shall not directly or indirectly modify objects accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function's non-const arguments. Implementations may share their own internal objects between threads if the objects are not visible to users and are protected against data races."

7.22.7 (Multibyte/wide character conversion functions) does not specify that these functions are not required to avoid data races with other calls. The only time they would even potentially be subject to data races is for state-dependent encodings, which are all but obsolete; for single-byte or modern multi-byte (i.e. UTF-8) encodings, these functions are pure.

Note that 7.29.6.3 (Restartable multibyte/wide character conversion functions) does make exceptions that the "r" versions of these functions are not required to avoid data races when the state argument is NULL.
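
For illustration only, here is a minimal sketch of how a caller can stay outside that exception by passing its own mbstate_t to mbrtowc() instead of a null state argument. The helper name count_chars_r is hypothetical and does not come from either standard; the sketch assumes the input is a complete, NUL-terminated multibyte string in the current locale's encoding.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: counts the multibyte characters in s.
 * The conversion state lives in a local mbstate_t, so the function's
 * internal (shared) state object is never used.
 * Returns (size_t)-1 if the input is invalid or ends mid-character. */
size_t count_chars_r(const char *s)
{
    mbstate_t st;
    size_t left = strlen(s);
    size_t count = 0;

    memset(&st, 0, sizeof st);          /* initial conversion state */

    while (left > 0) {
        size_t n = mbrtowc(NULL, s, left, &st);
        if (n == (size_t)-1 || n == (size_t)-2)
            return (size_t)-1;          /* invalid or incomplete sequence */
        if (n == 0)
            break;                      /* null character; not expected here */
        s += n;
        left -= n;
        count++;
    }
    return count;
}
```

Because each call owns its state object, two threads running this helper on different strings do not share any conversion state through the ps argument.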
Desired Action Remove mblen, mbtowc, and wctomb from the list of functions which are not required to be thread-safe.
Tags applied_after_i8d3, C11, issue8
Attached Files

- Relationships

-  Notes
(0001647)
geoffclare (manager)
2013-06-08 09:12

It seems odd that C11 would have different thread-safety requirements
for mbrlen, mbrtowc, and wcrtomb with a null state argument than for
mblen, mbtowc, and wctomb. We should query this with the C committee,
as it may well be unintentional.
(0001648)
dalias (reporter)
2013-06-08 12:14

I think there's a very good reason for the discrepancy: the restartable versions can store a partially-decoded character in the mbstate_t object, so even for state-independent encodings, there is state which would need to be protected against data races. The non-restartable versions, on the other hand, are pure except in the case of state-dependent encodings, which are mostly a relic of the past and which were never supported on most POSIX systems, since these encodings are mostly incompatible with POSIX filesystem semantics. Only implementations supporting such encodings (which might not even exist - can anyone confirm?) would incur the burden of avoiding data races. Note that these functions give applications access to information on whether the locale's encoding is state-dependent, so a portable application could use the restartable interfaces when the locale is state-dependent, and the non-restartable ones otherwise.

As to the motivation behind my request for this change, I have spent a good deal of time investigating the performance bottlenecks in character-at-a-time multibyte processing, and it turns out that there is a fundamental bottleneck in the restartable interfaces due to their interface requirements for handling the ps argument and partially-decoded characters. For applications which don't need partial-character processing capability, I believe it would make sense to encourage a transition to the non-restartable interfaces, but of course this is problematic if the non-restartable interfaces are not thread-safe. In my experiments, I found the non-restartable interfaces capable of reaching roughly a 50% performance advantage over the restartable ones; this difference would of course become even more extreme if the core decoding algorithms were further optimized.
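
As a rough sketch of the approach described in this note (not code from any particular implementation), an application could query state-dependence with mblen(NULL, 0) and use the non-restartable interface for character-at-a-time processing when the encoding is stateless. The names encoding_is_stateful and count_chars_nr are hypothetical. The sketch makes no claim about thread-safety; that depends on the requirements under discussion in this issue.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: nonzero if the current locale's multibyte
 * encoding is state-dependent. Calling mblen() with a null pointer
 * also resets its internal shift state. */
int encoding_is_stateful(void)
{
    return mblen(NULL, 0) != 0;
}

/* Hypothetical helper: counts the multibyte characters in s with the
 * non-restartable mbtowc(). Returns (size_t)-1 on an invalid sequence. */
size_t count_chars_nr(const char *s)
{
    size_t left = strlen(s);
    size_t count = 0;

    (void)mbtowc(NULL, NULL, 0);        /* reset to the initial shift state */

    while (left > 0) {
        int n = mbtowc(NULL, s, left);
        if (n < 0)
            return (size_t)-1;          /* invalid multibyte sequence */
        if (n == 0)
            break;                      /* null character; not expected here */
        s += n;
        left -= (size_t)n;
        count++;
    }
    return count;
}
```

An application following the suggestion above would call encoding_is_stateful() after setlocale() and fall back to mbrtowc() with its own mbstate_t whenever the result is nonzero.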
(0001651)
nick (manager)
2013-06-13 15:35

This will be raised as a potential defect with the C committee, and any decision on how to proceed should be made there first.
(0005933)
nick (manager)
2022-08-15 15:11

The C committee updated the defect report: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2396.htm#dr_498. The resulting document (N2281) was discussed, and the minutes show:
    

Make mblen, mbtowc, and wctomb thread-safer [N 2281]
 Rajan: Agree with the goal in principle, but not with the words. For example, mblen cannot be state independent if the state is locale dependent.
 Blaine: Perhaps say a call to setlocale for a stateful encoding may also introduce a data race.
 Jens: What that suggests is dealing with setlocale is not referred to here. There are two problems: having the state change via setlocale or with the function itself.
 Jens: No reason to make the second change as it is already covered in the section preamble.
 Fred: Is ‘other calls’ concurrent, sequential or both?
 David: It is for any other call.
 Fred: Currently it doesn’t say anything about needing to be sequential and that needs to happen. The paper needs more work.
 Issues: Dealing with setlocale, duplicated text about data races with the same function, and the data race with ‘other calls’.
 Rajan: Perhaps say "not required to avoid data races as long as the LC_CTYPE category does not change" or something similar.
 Blaine: This does not seem to do what is intended. It should be possible to clearly state that you can get data-race free with proper specification.
 Blaine: This paper needs more positive assertions of being data race free in the presence of possible changes to/from state dependent encodings. It doesn’t seem the words here achieve the goal.
(0006139)
geoffclare (manager)
2023-01-26 12:11

In the January 2023 ballot resolution meeting, WG14 agreed to change (in C23) the requirement for these functions to avoid data races. At the moment their plan is to make it implementation-defined, but it is possible someone may submit a paper for consideration at the next meeting along the lines of N2281 but with wording changes to address the points raised in the discussion quoted in Note: 0005933.

So we should leave this bug open pending a final decision by WG14, but what goes into C23 will likely only affect what we put in rationale and future directions in Issue 8, so we could start discussion at least on the normative text. I propose the following:

In all of these changes the placeholder XXXX should be replaced with the name of the function. Page and line numbers are for Issue 8 draft 2.1.

On page 1274 line 42635 section mblen(), and
page 1285 line 42996 section mbtowc(), and
page 2255 line 72488 section wctomb(), change:
The functionality described on this reference page is aligned with the ISO C standard. Any conflict between the requirements described here and the ISO C standard is unintentional. This volume of POSIX.1-202x defers to the ISO C standard.
to:
Except for requirements relating to data races, the functionality described on this reference page is aligned with the ISO C standard. Any other conflict between the requirements described here and the ISO C standard is unintentional. This volume of POSIX.1-202x defers to the ISO C standard for all XXXX() functionality except in relation to data races.

On page 1274 line 42652 section mblen(), and
page 1285 line 43015 section mbtowc(), and
page 2255 line 72505 section wctomb(), change:
[CX]The XXXX() function need not be thread-safe.[/CX]
to:
The XXXX() function [CX]need not be thread-safe; however, it[/CX] shall avoid data races with all other functions.

On page 1274 line 42669 section mblen(), and
page 1286 line 43032 section mbtowc(), and
page 2255 line 72522 section wctomb(), change RATIONALE from "None" to:
When the ISO C standard introduced threads in C11, it required XXXX() to avoid data races (with itself as well as with other functions), whereas POSIX.1-2008 did not require it to be thread-safe, and in many implementations it did not avoid data races with itself and still does not. The ISO C committee intend to change the requirements in C23, but since POSIX.1 currently refers to C17 it is necessary for it not to defer to the ISO C standard regarding data races in order to continue to allow this function not to avoid data races with itself.

On page 1274 line 42669 section mblen(), and
page 1286 line 43034 section mbtowc(), and
page 2256 line 72524 section wctomb(), change FUTURE DIRECTIONS from "None" to:
It is expected that a change in C23 will allow a future version of this standard to remove the data race exception from the statement that it defers to the ISO C standard.
(0006433)
geoffclare (manager)
2023-08-14 16:25

In all of these changes the placeholder XXXX should be replaced with the name of the function. Page and line numbers are for Issue 8 draft 2.1.

On page 1274 line 42635 section mblen(), and
page 1285 line 42996 section mbtowc(), and
page 2255 line 72488 section wctomb(), change:
The functionality described on this reference page is aligned with the ISO C standard. Any conflict between the requirements described here and the ISO C standard is unintentional. This volume of POSIX.1-202x defers to the ISO C standard.
to:
Except for requirements relating to data races, the functionality described on this reference page is aligned with the ISO C standard. Any other conflict between the requirements described here and the ISO C standard is unintentional. This volume of POSIX.1-202x defers to the ISO C standard for all XXXX() functionality except in relation to data races.

On page 1274 line 42652 section mblen(), and
page 1285 line 43015 section mbtowc(), and
page 2255 line 72505 section wctomb(), change:
[CX]The XXXX() function need not be thread-safe.[/CX]
to:
The XXXX() function [CX]need not be thread-safe; however, it[/CX] shall avoid data races with all other functions.

On page 1274 line 42669 section mblen(), and
page 1286 line 43032 section mbtowc(), and
page 2255 line 72522 section wctomb(), change RATIONALE from "None" to:
When the ISO C standard introduced threads in C11, it required XXXX() to avoid data races (with itself as well as with other functions), whereas POSIX.1-2008 did not require it to be thread-safe, and in many implementations it did not avoid data races with itself and still does not. The ISO C committee intend to change the requirements in a future version of the ISO C standard, but since POSIX.1 currently refers to C17 it is necessary for it not to defer to the ISO C standard regarding data races in order to continue to allow this function not to avoid data races with itself.

On page 1274 line 42669 section mblen(), and
page 1286 line 43034 section mbtowc(), and
page 2256 line 72524 section wctomb(), change FUTURE DIRECTIONS from "None" to:
It is expected that a change in a future version of the ISO C standard will allow a future version of this standard to remove the data race exception from the statement that it defers to the ISO C standard.

- Issue History
Date Modified Username Field Change
2013-06-07 21:23 dalias New Issue
2013-06-07 21:23 dalias Name => Rich Felker
2013-06-07 21:23 dalias Organization => musl libc
2013-06-07 21:23 dalias Section => XSH 2.9.1 Thread-Safety
2013-06-07 21:23 dalias Page Number => unknown
2013-06-07 21:23 dalias Line Number => unknown
2013-06-08 09:12 geoffclare Note Added: 0001647
2013-06-08 09:13 geoffclare Tag Attached: C11
2013-06-08 12:14 dalias Note Added: 0001648
2013-06-13 15:35 nick Note Added: 0001651
2013-12-03 21:09 torvald Issue Monitored: torvald
2022-08-15 15:11 nick Note Added: 0005933
2023-01-26 12:11 geoffclare Note Added: 0006139
2023-08-14 16:25 geoffclare Note Added: 0006433
2023-08-14 16:28 geoffclare Interp Status => ---
2023-08-14 16:28 geoffclare Final Accepted Text => Note: 0006433
2023-08-14 16:28 geoffclare Status New => Resolved
2023-08-14 16:28 geoffclare Resolution Open => Accepted As Marked
2023-08-14 16:29 geoffclare Tag Attached: issue8
2023-09-04 10:15 geoffclare Status Resolved => Applied
2023-09-04 10:16 geoffclare Tag Attached: applied_after_i8d3

