0001020: snprintf: the description of the n argument conflicts with ISO C

Notes
(0002999) jsm28 (reporter) 2016-01-05 17:49	http://austingroupbugs.net/view.php?id=761 [^] was previously rejected. While I think the rejection was clearly on spurious grounds - that these cases are clearly defined by ISO C and the POSIX specification conflicts - it might be appropriate to raise an ISO C DR to get a WG14 view and make this a liaison issue.

(0003000) shware_systems (reporter) 2016-01-05 20:07	Both versions are essentially equivalent. The POSIX version simply makes clearer that if you use an n larger than the size allocated to s you run the risk of overwriting other allocations, but the implementation isn't required to do bounds checking to prevent overruns. The C version, while plausible in theory as an isolated operation for an arbitrary n value, glosses this possibility over more and both are reliable in practice only when s has the highest allocated base address of the system and the system has at least n writeable bytes installed above that address, for an arbitrary n <= SIZE_MAX. This lack of reliability has been noted and is addressed in C11 by the snprintf_s() interface, in Annex K, which requires the implementation to use min(strlen(result), sizeof s[], n) as the number of bytes to copy. sprintf_s() adds (strlen(result) > sizeof s[] && n >= sizeof s[]) as being a constraint violation and only a \0 is stored. Both implicitly leave how sizeof s[] is determined when s refers to a dynamic allocation as implementation-defined, and unspecified what to do when s=malloc(0) or otherwise points to a non-writable address.

(0003006) ch3root (reporter) 2016-01-05 22:35	In reply to 0003000: Not sure what you are trying to say. The situation seems to be quite simple to me. Sure, an implementation is not required to do bounds checking but in POSIX version it can use s+n in the code while in ISO C version it cannot. And from user's POV: ISO C version can be called as `char s[10]; snprintf(s, SIZE_MAX, "%s", "abc");` while POSIX version cannot. Reliability is a somewhat separate question.

(0003014) shware_systems (reporter) 2016-01-07 05:02	I believe the language of the C standard also allows s+bytes_to_copy to be used as a loop termination test value. As neither standard requires bounds checking the examples given are valid for both standards also, afaik. The point is more checking for possible errors or other terminating circumstance can't be deferred until the loop reaches that value; both standards expect the implementation to behave as if the checks happen for each char stored. Changing the parameter description to a size does not change this for POSIX. As POSIX supports memory locking and threading it has more tests for each loop iteration than is needed for C99, actually, that all have priority over the s+n check. For both standards portable applications also can not rely on the implementation checking the next char to be stored isn't past the range of memory allocated to s[] as one of those tests; that is left as an extension, historically, to permit speedier and smaller unsafe implementations. In C11, the Annex K interfaces still expect the checks per loop, but because they are required to respect the size of s the implementation can use s+internal_sizeof(s[]) when this is less than s+bytes_to_copy as the check value instead and this is portable. On POSIX the implementations of the new interfaces may even be faster due to various optimizations plausible when *s is temporarily write locked.

(0003016) ch3root (reporter) 2016-01-07 10:37	C11, 6.5.6p8, prohibits calculating P+N where P+N doesn't point into the same array as P. In practice, P+N can wrap when P is near the end of address space and N is greater than the size of the array that P points into. If P+N wrapped checks like "P+I < P+N" will give bogus results. Bounds checking (and Annex K etc.) are not relevant here.

(0003018) Vincent Lefevre (reporter) 2016-01-07 11:43	This seems to be an issue for optimizing compilers. Consider the following code: char s[10]; if (foo) snprintf(s, 20, "abc"); If the snprintf here is wrong according to POSIX, then a POSIX implementation could deduce that foo is 0, yielding a behavior that could be different from an ISO C implementation (where foo being 1 is valid).

(0003019) shware_systems (reporter) 2016-01-07 13:51 edited on: 2016-01-07 14:14	6.5.6p8 leaves it undefined, not prohibited, per: "If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined." By the If clause there, the standard is saying any addition is possible, it is up to the application to ensure the result stays within the bounds of an array object declaration, not the compilers. The compiler's responsibility is to ensure that at least one char at the end of memory remains unallocated so an [n+1] address calculation won't wrap. When the application has done so then only the non-overflow guarantee applies because that last char is unallocated. It says nothing about whether p+n is required to abort a dereference attempt when p is inside the array and p+n isn't, as an lvalue or rvalue. It is up to the application, in this case the snprintf implementation, to provide separate code paths when P+N < P due to overflow; one to use when P+I > P, and the other P+I < P also.

(0003020) user229 2016-01-08 00:45	I don't think the language in Annex K is actually intended to require implementations to use a "magically determine the real size of the array argument" operation, it's just using the same kind of sloppy language that POSIX does, saying "too big for the array pointed to by s" as if it were a synonym for "greater than n".

(0003482) geoffclare (manager) 2016-11-03 15:14	Change: with the addition of the n argument which states the size of the buffer referred to by s. to: with the addition of the n argument which limits the number of bytes written to the buffer referred to by s.

(0003483) shware_systems (reporter) 2016-11-07 09:21	Re: 3020 I agree the language is more vague on details than other parts of the standard, but I don't see it as particularly sloppy. The point of the '_s' interfaces is having versions that don't have common security holes like buffer overruns, so when it says 'fit within' or 'too big' to me the intent is the implementations >>shall do what's necessary, documented or not<< to prevent those overruns. It seems there it's left open 'how' to work that magic so imps, at their discretion, can use their existing code to get it done or new methods. Either way could reference the same data in the memory manager free(void *) has to, to get the value used with malloc() or realloc(), or calculate a usable size if block chaining/coalescing used. This applies whether the malloc() calls happen at runtime or when creating object descriptions for an object file at compile time, for auto or static declarations. On POSIX systems there's the complexity of supporting dl_sym() too, in some fashion, for externs exported by libraries, but code exists for that also. The vague aspect I see is whether the sizes are 'as declared' or 'as allocated to satisfy alignment practices', not that sizing involves any magic.

Issue History
Date Modified	Username	Field	Change
2016-01-05 16:34	ch3root	New Issue
2016-01-05 16:34	ch3root	Name	=> Alexander Cherepanov
2016-01-05 16:34	ch3root	Section	=> snprintf
2016-01-05 16:34	ch3root	Page Number	=> (page or range of pages)
2016-01-05 16:34	ch3root	Line Number	=> (Line or range of lines)
2016-01-05 17:49	jsm28	Note Added: 0002999
2016-01-05 20:07	shware_systems	Note Added: 0003000
2016-01-05 20:18	eblake	Relationship added	related to 0000761
2016-01-05 22:26	Florian Weimer	Issue Monitored: Florian Weimer
2016-01-05 22:35	ch3root	Note Added: 0003006
2016-01-07 05:02	shware_systems	Note Added: 0003014
2016-01-07 10:37	ch3root	Note Added: 0003016
2016-01-07 11:43	Vincent Lefevre	Note Added: 0003018
2016-01-07 13:51	shware_systems	Note Added: 0003019
2016-01-07 14:14	shware_systems	Note Edited: 0003019
2016-01-08 00:45	user229	Note Added: 0003020
2016-01-08 02:52	Don Cragun	Section	snprintf => fprintf() description of snprintf()
2016-01-08 02:52	Don Cragun	Page Number	(page or range of pages) => 900
2016-01-08 02:52	Don Cragun	Line Number	(Line or range of lines) => 30166-30167
2016-01-08 02:52	Don Cragun	Interp Status	=> ---
2016-11-03 15:14	geoffclare	Note Added: 0003482
2016-11-03 15:15	geoffclare	Final Accepted Text	=> Note: 0003482
2016-11-03 15:15	geoffclare	Status	New => Resolved
2016-11-03 15:15	geoffclare	Resolution	Open => Accepted As Marked
2016-11-03 15:15	geoffclare	Tag Attached: tc3-2008
2016-11-07 09:21	shware_systems	Note Added: 0003483
2019-10-21 14:06	geoffclare	Status	Resolved => Applied
2024-06-11 08:58	agadmin	Status	Applied => Closed

Aardvark Mark IV