0001591: Proposed strlcpy spec has problems - Austin Group Defect Tracker

Notes
(0005872) wahern (reporter) 2022-06-30 01:12	> * To address the denial-of-service problem, drop the requirement that strlcpy must compute the string length of an overlong source. If SRC is too long, allow strlcpy(DST, SRC, sizeof DST) to return any value greater than or equal to sizeof DST. This will support the vast majority of uses of strlcpy, which do not need or use the string length.
The behavior of returning the logical length is the same or similar to several other POSIX functions, including snprintf and regerror. In fact, OpenBSD's regerror implementation uses strlcpy to provide the POSIX-required regerror return value semantics. It's not uncommon to use the returned logical length to resize a buffer appropriately--e.g. to call regerror once, and upon truncation realloc a buffer and call it again.[1] So simply surveying the immediate sites where strlcpy is used is not going to capture real-world dependencies. Changing this semantic sabotages the purpose of standardizing strlcpy.
Most of your objections seem like general gripes with C strings in general. They may be legitimate enough, but strike me as irrelevant. Ditto wrt AddressSanitizer and FORTIFY_SOURCE, which aren't even standardized and are largely platform dependent.
[1] If subsequent calls are not idempotent, e.g. because the locale has been changed, then hopefully the caller is using a loop, but in any event idempotency is often expected, rightly or wrongly. There are potential pitfalls like this all over the place. (POSIX says changing locale between regcomp and regexec is undefined, but is silent wrt regerror.)
Objections to strlcpy have always struck me as making perfect the enemy of the good. All of this hand-wringing would be better spent on the C committee advocating for proposals that improve and extend variably modified array types. Some of these proposals come close to effectively specifying dependent typing semantics between pointer and size parameters and even struct members. This would be much better--easier, more consistent, more reliable--than the __builtin_object_size hacks FORTIFY_SOURCE relies upon.
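
For illustration only, the resize-on-truncation pattern described in this note could be sketched roughly as follows. The names sketch_strlcpy and copy_with_resize are hypothetical and do not come from the thread; sketch_strlcpy is a stand-in assumed to follow the BSD contract (return strlen(src)), so the example does not depend on the platform providing strlcpy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in assumed to follow the BSD strlcpy contract:
 * copy at most size-1 bytes, terminate when size > 0, and return
 * strlen(src), i.e. the "logical length". */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t srclen = strlen(src);
    if (size > 0) {
        size_t n = srclen < size - 1 ? srclen : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/* Hypothetical caller: start with a guessed buffer size and grow it
 * once if the returned logical length shows the copy was truncated.
 * Returns NULL on allocation failure; the caller frees the result. */
char *copy_with_resize(const char *src, size_t guess)
{
    assert(src != NULL);
    if (guess == 0)
        guess = 1;
    char *buf = malloc(guess);
    if (buf == NULL)
        return NULL;
    size_t need = sketch_strlcpy(buf, src, guess);
    if (need >= guess) {
        /* Truncated: this relies on need being the exact source length. */
        char *bigger = realloc(buf, need + 1);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
        sketch_strlcpy(buf, src, need + 1);
    }
    return buf;
}
```

A caller written this way sizes the second buffer from the return value, so it would presumably stop working if an implementation were allowed to return an arbitrary value greater than or equal to the buffer size.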

(0005873) eggert (reporter) 2022-06-30 04:49	> It's not uncommon to use the returned logical length to resize a buffer appropriately
That's not the case for FreeBSD. It has thousands of calls to strlcpy. Most don't inspect the return value, and would suffer from silent truncation if the source string were too long. The relatively few places that inspect the return value almost invariably error out if the source string is too long. It is exceedingly rare for calling code to behave as you describe. Again, I'm not proposing that FreeBSD change its strlcpy implementation, only that the standard should not prohibit more-useful implementations, assuming it standardizes strlcpy at all.
> Most of your objections seem like general gripes with C strings in general.
No, they're specifically aimed at strlcpy and its three relatives.
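
As a rough sketch of the "error out on truncation" style of caller this note describes, the following may help. Everything here is hypothetical and for illustration: bsd_like_strlcpy is a stand-in assumed to follow the BSD contract, and struct name_field and set_name are invented names, not FreeBSD code.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in assumed to follow the BSD strlcpy contract;
 * returns strlen(src) regardless of truncation. */
size_t bsd_like_strlcpy(char *dst, const char *src, size_t size)
{
    size_t srclen = strlen(src);
    if (size > 0) {
        size_t n = srclen < size - 1 ? srclen : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/* Hypothetical fixed-size field, invented for this example. */
struct name_field {
    char name[16];
};

/* Reject an over-long name instead of truncating it silently.
 * Only the comparison against the buffer size matters here; the
 * exact length of an over-long source is never used. */
int set_name(struct name_field *f, const char *name)
{
    assert(f != NULL && name != NULL);
    if (bsd_like_strlcpy(f->name, name, sizeof f->name) >= sizeof f->name) {
        errno = ENAMETOOLONG;
        return -1;
    }
    return 0;
}
```

Because this caller only tests whether the return value is at least the buffer size, it would behave the same under the relaxed return-value rule quoted earlier in the thread.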

(0005874) geoffclare (manager) 2022-06-30 08:38	Updated to give the page and line numbers from draft 2.1

(0005875) geoffclare (manager) 2022-06-30 08:53	There seems to be some confusion about the status of these interfaces. Not only are they in draft 2.1 but the source for that addition is a published (non-draft) standard of The Open Group https://publications.opengroup.org/c211

(0005876) carlos (reporter) 2022-06-30 13:15	The current draft text captures the essence of the existing implementations. Changing the text as suggested could produce observably different implementations that are not compatible. This would cause portability traps for software ported from BSD to an implementation that makes use of the additional permissions that are being suggested.

(0005907) nrk (reporter) 2022-07-25 14:00	> That's not the case for FreeBSD. It has thousands of calls to strlcpy. Most
> don't inspect the return value
My experience has been similar. The MAJORITY of the strlcpy calls I see don't inspect the return value. This is because most people don't really want strlcpy; what they actually want is a string copy function that truncates if needed (assuming truncation is desired). strlcpy is simply used as a (horrible) means to this end.
What people really want is something closer to this:
    strcpy_trunc(buf, src, sizeof buf);
This function would act like the following, with no need to suffer from the O(n) strlen(src) call:
    if (memccpy(buf, src, '\0', sizeof buf) == NULL)
        buf[sizeof buf - 1] = '\0';
And perhaps return the pointer to the '\0' inside buf (similar to stpcpy) as a bonus, which can be used for quickly calculating the strlen or for efficient string concatenation.
    char *strcpy_trunc(char *restrict dest, const char *restrict src, size_t n);
Such a function would give people what they *actually* want, which is to copy a string into a fixed buffer, truncating if necessary, while being efficient as well.
I should also note that strlcpy isn't even that widely adopted. Neither Windows nor glibc provides it. IMO it's not a good idea to standardize an inefficient, not widely implemented and abused function. If this gets standardized, more people are going to abuse it, and when a better function (such as the one shown above) does later get standardized, there will already be too many code bases relying on this non-optimal one, making it hard to remove.

(0006017) Don Cragun (manager) 2022-10-31 15:33	These functions are being added to the standard because they are in widespread use, not because of any considerations relating to security or efficiency. Now that they are in the draft there is no consensus for removing them and therefore this bug is being rejected.

Issue History
Date Modified	Username	Field	Change
2022-06-29 18:39	eggert	New Issue
2022-06-29 18:39	eggert	Name	=> Paul Eggert
2022-06-29 18:39	eggert	Organization	=> UCLA
2022-06-29 18:39	eggert	Section	=> no section number yet (no draft yet)
2022-06-29 18:39	eggert	Page Number	=> no page number yet (no draft yet)
2022-06-29 18:39	eggert	Line Number	=> no line number yet (no draft yet)
2022-06-30 01:12	wahern	Note Added: 0005872
2022-06-30 04:49	eggert	Note Added: 0005873
2022-06-30 08:38	geoffclare	Section	no section number yet (no draft yet) => strlcat, wcslcat
2022-06-30 08:38	geoffclare	Page Number	no page number yet (no draft yet) => 2005, 2216
2022-06-30 08:38	geoffclare	Line Number	no line number yet (no draft yet) => 65136, 71365
2022-06-30 08:38	geoffclare	Note Added: 0005874
2022-06-30 08:38	geoffclare	version	=> Draft 2.1
2022-06-30 08:39	geoffclare	Relationship added	related to 0000986
2022-06-30 08:53	geoffclare	Note Added: 0005875
2022-06-30 13:15	carlos	Note Added: 0005876
2022-07-11 15:49	nrk	Issue Monitored: nrk
2022-07-25 14:00	nrk	Note Added: 0005907
2022-08-15 13:35	Florian Weimer	Issue Monitored: Florian Weimer
2022-10-31 15:33	Don Cragun	Note Added: 0006017
2022-10-31 15:33	Don Cragun	Status	New => Closed
2022-10-31 15:33	Don Cragun	Resolution	Open => Rejected
