|ID||Category||Severity||Type||Date Submitted||Last Update|
|0001022||[1003.1(2013)/Issue7+TC1] System Interfaces||Comment||Clarification Requested||2016-01-10 20:47||2023-03-27 15:24|
|Final Accepted Text|
|Summary||0001022: error indicator for encoding errors in fgetwc(3)|
POSIX requires for fgetwc(3):
"If an encoding error occurs, the error indicator for the stream shall be set, fgetwc() shall return WEOF, and shall set errno to indicate the error."
This requirement is reasonable because it makes it easy to distinguish EOF from an error by inspecting the return value and calling ferror(3) only, without clearing errno before and inspecting errno after calling fgetwc(3).
However, the C standard only says (C11 7.29.3.1):
"If a read error occurs, the error indicator for the stream is set and the fgetwc function returns WEOF. If an encoding error occurs (including too few bytes), the value of the macro EILSEQ is stored in errno and the fgetwc function returns WEOF."
That could be construed to mean that setting the error indicator is not required if an encoding error occurs, though the wording might leave room for interpretation. Because POSIX explicitly defers to ISO C, this might be seen as a contradiction in the POSIX standard. At worst, people might conclude that the requirement to set the error indicator does not take effect because it's neither unambiguously required by C nor marked as a POSIX extension.
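The idiom the description relies on, telling end-of-file from an error using only the return value and ferror(3), can be sketched roughly as follows. The names `weof_cause`, `classify_weof` and `count_wide_chars` are hypothetical helpers introduced for illustration only. The sketch assumes the POSIX behaviour, in which an encoding error also sets the error indicator; on an implementation that does not set it, an encoding error would be misreported here as end-of-file.

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical result type: why did fgetwc() return WEOF? */
enum weof_cause { WEOF_END_OF_FILE, WEOF_ERROR };

/*
 * Call only after fgetwc(fp) has returned WEOF.
 * Assumes (as POSIX requires) that read errors AND encoding errors
 * set the stream's error indicator, so errno need not be inspected.
 */
static enum weof_cause classify_weof(FILE *fp)
{
    assert(fp != NULL);
    return ferror(fp) ? WEOF_ERROR : WEOF_END_OF_FILE;
}

/* Count wide characters until WEOF and report why reading stopped. */
static long count_wide_chars(FILE *fp, enum weof_cause *cause)
{
    long n = 0;

    while (fgetwc(fp) != WEOF)
        n++;
    *cause = classify_weof(fp);
    return n;
}
```

A caller would treat `WEOF_ERROR` as either a read error or an encoding error and consult errno only if it needs to tell the two apart.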
Fortunately, at least the following operating systems implement the behaviour mandated by POSIX:
FreeBSD sets the error flag since Oct 16, 2002 (rev. 105234).
NetBSD sets the error flag since July 3, 2006 (rev. 1.5), referencing SUSv3.
Dragonfly sets the error flag since April 21, 2009 (e0f95098eeba0176864b9cafe6d69b5b7bc0e73f), sync from FreeBSD.
SunOS 5.11 11.2 sun4u sparc SUNW,SPARC-Enterprise
SunOS 5.10 Generic_150400-17 sun4v sparc SUNW,SPARC-Enterprise-T5220
Linux SMP Debian 3.16.7-ckt11-1+deb8u3 (2015-08-04) i686 GNU/Linux
Clarify the situation by marking the words
"the error indicator for the stream shall be set"
in the sentence starting with
"If an encoding error occurs,"
with the [CX] marker.
If the C standard did not intend to require setting the error indicator, that is required to resolve the conflict. If the C standard did intend to require setting the error indicator (and just didn't make it very clear), the clarification does no harm.
A similar clarification should be applied to fputwc(3), which would then contain:
"Upon successful completion, fputwc() shall return wc. If a write error occurs, it shall return WEOF, the error indicator for the stream shall be set, [CX] [Option Start] and errno shall be set to indicate the error. [Option End]
If an encoding error occurs, it shall return WEOF, [CX] [Option Start] the error indicator for the stream shall be set, [Option End] and errno shall be set to EILSEQ."
C99 says in 7.19.1 that the error indicator for a stream "records whether a read/write error has occurred".
To me this implies that if only an encoding error (EILSEQ) has occurred, the error indicator must not be set, otherwise it would wrongly indicate that a read/write error has occurred.
I believe this is a genuine conflict between C99 and POSIX and needs to be raised with the C committee.
|It appears some implementations did set ferror() if errno changed for any reason, so this was made part of POSIX at some point, even if technically "wrong" for the above reason. As the C standard doesn't preclude ferror() also being set, since it leaves implementation-defined what constitutes a "read/write error", this is less a conflict than a case where no conformance distinction can be made, but imo it should be marked as CX since the text differs.|
I think there is another missing CX item that should be discussed. I feel it should be explicit as a portability matter:
If the error indicator for the stream is already set when a call to the interface is made, it shall set errno to EINVAL and return before attempting any internal operation that, had the indicator been clear, might set it as a side effect.
While the C standard leaving it to applications to call clearerr() themselves may be robust enough for a single-threaded, single-process environment, in a multi-threaded environment one thread (or process) may get focus and attempt to call a wide I/O interface before the code that caused the indicator to be set in another thread has a chance to call clearerr(), if it uses the call at all. As a potential race condition I think this applies to many other interfaces too, not just the wide char ones.
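As a rough application-level sketch of the behaviour proposed in this note (not something any standard or implementation is known to provide), a wrapper could refuse to read while the error indicator is set. The name `guarded_fgetwc` is hypothetical, and the check is not atomic with respect to other threads.

```c
#include <errno.h>
#include <stdio.h>
#include <wchar.h>

/*
 * Hypothetical wrapper illustrating the proposal: if the error
 * indicator is already set on entry, fail with EINVAL without
 * touching the stream; otherwise behave like fgetwc().
 */
static wint_t guarded_fgetwc(FILE *fp)
{
    if (ferror(fp)) {
        errno = EINVAL;   /* stale error: caller must clearerr() first */
        return WEOF;
    }
    return fgetwc(fp);
}
```

With such a wrapper a caller can no longer tell a fresh error from a stale one by ferror() alone, so errno would have to be inspected after every WEOF.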
Thanks for your feedback, geoffclare. To make really sure that i understand correctly what you are saying: Your intention is that POSIX should not be changed, that all the operating systems i listed should continue what they are already doing, and that the C standard should be amended to also require setting the error flag in case of an encoding error. Right?
I would welcome that solution.
I don't know how reasonable it is to use non-atomic I/O operations on the same FILE object in two threads if you want to recover from I/O errors. If you are forced to do that, i think you have to write your own locking code anyway: Acquire the dedicated lock you define, call the I/O function, check ferror() and/or feof(), call clearerr() if needed, release the lock. Otherwise, you get a race in the other direction, too: I/O call succeeds in thread A, I/O call fails in thread B, A calls ferror(), boom. I don't see how any change to the definition of the interfaces might cure those races in either direction. Anyway, that's not related to the original issue, i don't think it's a wise move to conflate separate topics in the same bugtracking ticket.
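The lock, call, check, clear, unlock sequence described above might look roughly like the sketch below. It swaps the "dedicated lock you define" for POSIX flockfile()/funlockfile() on the stream itself, which is an assumption of this sketch rather than what the note prescribes. The type `wread_result` and the function `locked_fgetwc` are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical result: the character read plus the stream state
 * observed while the lock was still held. */
struct wread_result {
    wint_t wc;
    bool error;   /* error indicator was set after the call */
    bool eof;     /* end-of-file indicator was set after the call */
};

static struct wread_result locked_fgetwc(FILE *fp)
{
    struct wread_result r;

    flockfile(fp);                 /* acquire the stream lock */
    r.wc = fgetwc(fp);             /* the I/O call itself */
    r.error = (r.wc == WEOF) && ferror(fp);
    r.eof = (r.wc == WEOF) && feof(fp);
    if (r.wc == WEOF)
        clearerr(fp);              /* reset before other threads look */
    funlockfile(fp);               /* release the stream lock */
    return r;
}
```

This only helps if every thread accesses the stream through the same helper; a thread that calls ferror() directly can still observe state left by another thread's call.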
edited on: 2019-09-05 15:42
WG14 have added normative text to C17:
Change 7.29.3.1p3 to require the error indicator to be set in this case:
|In view of the addition in C17, we believe no change is needed in POSIX and this bug can be closed.|
|On the musl list, https://www.openwall.com/lists/musl/2023/03/20/8 [^] points out that C17 addressed fgetwc(), but not fputwc(). As a followup, https://www.openwall.com/lists/musl/2023/03/20/10 [^] requests that the Austin Group submit a ballot comment against C23 to ensure that C23 matches POSIX for both fgetwc() and fputwc().|
|2016-01-10 20:47||schwarze||New Issue|
|2016-01-10 20:47||schwarze||Name||=> Ingo Schwarze|
|2016-01-10 20:47||schwarze||Organization||=> OpenBSD|
|2016-01-10 20:47||schwarze||Section||=> fgetwc(3)|
|2016-01-10 20:47||schwarze||Page Number||=> 0|
|2016-01-10 20:47||schwarze||Line Number||=> 0|
|2016-01-21 12:00||geoffclare||Project||2008-TC2 => 1003.1(2013)/Issue7+TC1|
|2016-01-21 12:04||geoffclare||Note Added: 0003027|
|2016-01-21 13:44||shware_systems||Note Added: 0003028|
|2016-01-22 01:06||shware_systems||Note Added: 0003029|
|2016-01-22 02:02||schwarze||Note Added: 0003030|
|2017-11-21 16:23||geoffclare||Relationship added||has duplicate 0001170|
|2019-02-04 16:31||nick||Tag Attached: C11|
|2019-02-04 16:31||nick||Tag Attached: c99|
|2019-05-02 13:03||nick||Note Added: 0004383|
|2019-09-05 15:42||nick||Note Edited: 0004383|
|2019-09-05 15:58||geoffclare||Interp Status||=> ---|
|2019-09-05 15:58||geoffclare||Note Added: 0004554|
|2019-09-05 15:58||geoffclare||Status||New => Closed|
|2019-09-05 15:58||geoffclare||Resolution||Open => Rejected|
|2023-03-27 15:24||eblake||Note Added: 0006236|
|2023-07-24 09:25||geoffclare||Relationship added||related to 0001769|