Austin Group Defect Tracker

ID 0000986
Category [1003.1(2008)/Issue 7] Base Definitions and Headers
Severity Editorial
Type Clarification Requested
Date Submitted 2015-09-23 08:35
Last Update 2021-05-07 15:18
Reporter EdSchouten View Status public  
Assigned To ajosey
Priority normal Resolution Accepted As Marked  
Status Applied  
Name Ed Schouten
Organization Nuxi
User Reference
Section <string.h> and <wchar.h>
Page Number n/a
Line Number n/a
Interp Status ---
Final Accepted Text Note: 0005334
Summary 0000986: Would it be worth investigating adding strlcpy(), strlcat(), wcslcpy() and wcslcat()?
Description Back in 1998, OpenBSD 2.4 added the functions strlcpy() and strlcat(). These functions eventually made it into a whole bunch of other operating systems. By now, at least OpenBSD, FreeBSD, NetBSD, Solaris, Mac OS X, and QNX provide these functions. Implementations are also present in popular Open Source projects like SDL, GLib, ffmpeg and rsync. The Linux kernel also uses them internally.

These functions have already been in use for the last 17 years and people seem to like them, which is why I'd like to propose that we add them in the next version of POSIX.

It is important to mention that these functions do not come without any form of criticism. One important concern about these functions is that they may silently truncate the string if the output buffer is too small. My response to this would be the following:

1. Checking the return value allows you to detect truncation.

2. This is not different from other existing functions such as snprintf(), which seems to be preferred over sprintf() nowadays.

3. Even though string truncation is bad and could potentially lead to security vulnerabilities, it is important to take into account what the impact would have been if strcpy() and strcat() were used instead. At least strlcpy() and strlcat() would prevent the buffer overflow from happening, making it less likely that the integrity of the control flow of the program is affected.
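
As an illustration of point 1, here is a minimal sketch of truncation detection through the return value. The names sketch_strlcpy() and copy_checked() are hypothetical; sketch_strlcpy() is only a stand-in so that the example builds on systems whose C library lacks strlcpy(), and it is not taken from any of the implementations listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strlcpy(): copies at most dstsize - 1 bytes,
 * always terminates when dstsize > 0, and returns strlen(src). */
static size_t sketch_strlcpy(char *dst, const char *src, size_t dstsize)
{
    size_t srclen = strlen(src);

    assert(dst != NULL || dstsize == 0);
    if (dstsize > 0) {
        size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/* Returns 0 if src fit into dst, -1 if the copy was truncated. */
static int copy_checked(char *dst, size_t dstsize, const char *src)
{
    return sketch_strlcpy(dst, src, dstsize) >= dstsize ? -1 : 0;
}
```

With an 8-byte buffer, copy_checked() reports success for "short" and reports truncation for "much too long", in which case the buffer still holds a terminated 7-byte prefix.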
Desired Action Please add them. :-)
Tags issue8
Attached Files

- Relationships
related to 0001591 (Closed, Issue 8 drafts): Proposed strlcpy spec has problems

-  Notes
(0002843)
eblake (manager)
2015-09-23 14:00

Ulrich Drepper is no longer as active on developing either POSIX or glibc, but his scathing analysis of these interfaces is still true: the only correct way to use them requires MORE boilerplate than what you could do by using other existing interfaces:

https://stackoverflow.com/questions/2114896/why-is-strlcpy-and-strlcat-considered-to-be-insecure

Now, strlcat does effectively do this check, if the programmer remembers to check the result - so you can use it safely:

if (strlcat(dest, source, dest_bufferlen) >= dest_bufferlen)
{
    /* Bug out */
}

Ulrich's point is that since you have to have destlen and sourcelen around (or recalculate them, which is what strlcat effectively does), you might as well just use the more efficient memcpy anyway:

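/* dest_maxlen is assumed to be the longest string dest can hold,
 * that is, the buffer size minus one byte for the NUL. */
if (destlen + sourcelen > dest_maxlen)
{
    goto error_out;
}
/* The + 1 copies the terminating NUL along with the string. */
memcpy(dest + destlen, source, sourcelen + 1);
destlen += sourcelen;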


As such, I'm not personally in favor of standardizing them, but am happy to let implementations continue to provide them as extensions.
(0002844)
EdSchouten (updater)
2015-09-23 16:19

Hi Eric,

Thanks for the quick response!

I agree. strlcat() does suffer from the issue that it allows people to write less efficient code more easily. Once you start to call strlcat() in a loop, you could create a piece of code that runs in quadratic time where linear time is possible.
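
To make the quadratic-versus-linear point concrete, the following is a rough sketch of the linear alternative: the caller remembers the current length instead of letting every append rescan the destination, which is what a loop around strlcat() would do. The helpers append_at() and join_words() are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends src at offset *len of dst (capacity dstsize)
 * and updates *len. Returns 0 on success, or -1 with dst unchanged if src
 * plus the terminating NUL would not fit. */
static int append_at(char *dst, size_t dstsize, size_t *len, const char *src)
{
    size_t srclen = strlen(src);

    assert(*len < dstsize);
    if (srclen >= dstsize - *len)
        return -1;
    memcpy(dst + *len, src, srclen + 1);
    *len += srclen;
    return 0;
}

/* Joins n words with single spaces. Each byte of the output is written
 * once, so the running time is linear in the output length. */
static int join_words(char *dst, size_t dstsize,
                      const char *const *words, size_t n)
{
    size_t len = 0;

    if (dstsize == 0)
        return -1;
    dst[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        if (i > 0 && append_at(dst, dstsize, &len, " ") != 0)
            return -1;
        if (append_at(dst, dstsize, &len, words[i]) != 0)
            return -1;
    }
    return 0;
}
```

Unlike strlcat(), this sketch refuses to append rather than truncating; that is a design choice of the example, not a property of either interface.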

That said, I think it's important that we look at this problem from a different point of view. Let's put the ways you can copy/concatenate strings on an axis from 'bad' to 'good', based on safety, efficiency, etc.:

strcpy()/strcat() <-----> strlcpy()/strlcat() <-----> strlen()+memcpy()

Now the question becomes: if we don't think it's a good idea to add strlcpy()/strlcat() because strlen()+memcpy() are more efficient, why should we still standardize strcpy()/strcat()? As far as I know, POSIX is not a superset of the C standard[1], so there is nothing that prevents us from deprecating these functions. But my guess is that people would get quite upset if there's no standard function to copy strings.

What I'm trying to say is that we should think of strlcpy()/strlcat() as an incremental improvement over strcpy()/strcat(). They have never even tried to address the running time problem of C string concatenation. They have been designed so that existing code can be refactored to use these functions pretty easily, and practice has shown that they are good at this. Even if they end up truncating strings, it's still better than losing integrity of control flow due to a buffer overflow.

If we don't think that strlcpy()/strlcat() are the right solution to the problem, would it make sense to rephrase this bug: can we come up with a set of functions that do tackle this problem the right way?

[1] ctime() is deprecated in issue 7, making me assume it's going to be removed from issue 8 (?). It's still part of C11.
(0002851)
shware_systems (reporter)
2015-10-01 05:01

Re: 2844, Note [1]
If Issue 8 defers to C11, ctime() will probably get the new LEGACY mark, not removed, or stay marked OBSOLETE. It becomes a candidate for removal in the issue following the C standard removing it, afaik. Couldn't say whether it's on the agenda now for SC22.
(0002869)
joerg (reporter)
2015-10-09 14:00

I see no verification for the claim that strlen() + memcpy() is faster than strlcpy().

strlcpy() is the most useful method to avoid buffer overruns and to detect truncation (in case you check the return value).

strlcpy() is much faster than strncpy().

strlcpy() allows you to use the result to compute the amount of space needed for the target in an efficient way.
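
One way to read the previous point, as a hedged sketch: try a fixed buffer first, and if the return value shows truncation, use that same value to size a heap allocation. Both sketch_strlcpy() (a stand-in for strlcpy()) and copy_or_grow() are hypothetical names used only here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strlcpy(); returns strlen(src). */
static size_t sketch_strlcpy(char *dst, const char *src, size_t dstsize)
{
    size_t srclen = strlen(src);

    assert(dst != NULL || dstsize == 0);
    if (dstsize > 0) {
        size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/* Copies src into the fixed buffer if it fits; otherwise allocates exactly
 * need + 1 bytes. Returns the buffer holding the full copy (the caller
 * frees it when it is not the fixed buffer), or NULL if malloc() fails. */
static char *copy_or_grow(char *fixed, size_t fixedsize, const char *src)
{
    size_t need = sketch_strlcpy(fixed, src, fixedsize);
    char *heap;

    if (need < fixedsize)
        return fixed;
    heap = malloc(need + 1);
    if (heap != NULL)
        sketch_strlcpy(heap, src, need + 1);
    return heap;
}
```

Note that the source is traversed a second time on the slow path, which is the cost other notes in this thread object to.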

The strl*() type functions are available on *BSD, OSX, Solaris since late 1998 (*BSD) and early 1999 (Solaris). I believe that this is a verification that many people believe they are a useful addition.

Mr. Drepper is a single person that does not seem to discuss his claims with others. I cannot see why the claims for strl*() from Mr. Drepper should include a valid point.
(0002892)
steffen (reporter)
2015-11-09 20:57

Since this is a new interface for the standard i think it would be worth having a look at strscpy() that the Linux community now seems to advocate for future development; it effectively is

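   #include <errno.h>      /* E2BIG */
   #include <sys/types.h>  /* ssize_t, size_t */

   ssize_t
   strscpy(char *dst, char const *src, size_t dstsize){
      ssize_t rv;

      if(dstsize > 0){
         rv = 0;
         do{
            /* Copy one byte; on the terminator, return the string length */
            if((dst[rv] = src[rv]) == '\0')
               goto jleave;
            ++rv;
         }while(--dstsize > 0);
         /* Buffer filled before src ended: terminate within bounds */
         dst[--rv] = '\0';
      }
      rv = -E2BIG;
   jleave:
      return rv;
   }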

Nice properties: no zero padding, dst always terminated if dstsize greater 0 - what strncpy() should have been from the beginning in my not so humble opinion.
-E2BIG is possibly a bit strange in user space, -1 is a common error value in POSIX, or -dstsize for the very conservative among us (with the potential to save a temporary / not wasting a possible return register).

A strscat() in equal spirit does not yet exist, but is thinkable.
(0002894)
joerg (reporter)
2015-11-10 11:33

This strscpy() proposal looks less usable than strlcpy() as it does not return the needed size for a copy operation that does not truncate.
(0002895)
EdSchouten (updater)
2015-11-10 11:58

I agree with Joerg.
(0002896)
steffen (reporter)
2015-11-10 13:08

Well, i don't - since if buffer resizing due to E2BIG is a regular case in a particular code flow then i would definitely do a (usually highly optimized) strlen() on the source buffer at first in order to cut down needless reallocations and second runs. Definitely.
No no. You'd use this function at places where you are pretty sure that the buffer is sufficiently spaced, or where it is a regular error condition if the buffer is too small -- in which case you surely don't want the complete input buffer to be traversed needlessly.
Speedier and more sensitive than snprintf(BUF, sizeof BUF, "%s", INPUT), which is often used instead (at a cost in time, and CPU cycles and thus also energy).
And maybe even with a debug log or even assert that triggers possible buffer overflows during development in the former case.
Quite unmasked that is: strlcpy() is a kind of "go, take the money and run"; and rob another bank if it's out.
(0002897)
EdSchouten (updater)
2015-11-10 16:20

Steffen,

- strlcpy(), strlcat(), etc. have already been around for a very long time. They are used by quite a lot of existing Open Source software. Not just software that works on the BSDs, but also software that's designed to run on many other UNIX/non-UNIX operating systems.

- strscpy() has only been around for 6-7 months and is specific to Linux. More specifically, it's only available inside of the Linux kernel, and used exactly three times, all in one source file (arch/tile/gxio/mpipe.c). There is no other precedent at all. There is no strscat() either.

- strlcpy() fits within the existing set of functions like a glove. strlcpy(a, b, n) behaves identically to snprintf(a, n, "%s", b). The return value always corresponds to the number of non-null bytes that would have been written. If we truly think that this is bad design, should we come up with a new version of snprintf() that also doesn't do this? I don't think so.

- In its current form, strscpy() cannot be standardized for a couple of reasons:

1. As far as I know, there are exactly zero functions in POSIX that return negative error numbers. There are some functions that return error numbers directly, but never negated. It doesn't fit in style-wise.

2. The strscpy() function's return type is ssize_t. This is what the current <sys/types.h> article has to say on ssize_t:

"The type ssize_t shall be capable of storing values at least in the range [-1, {SSIZE_MAX}]."

In other words, it may return a value that is outside of bounds for the type that's used. Should we then introduce an SSIZE_MIN? I don't think so. That's not what ssize_t was created for.

Alternatively this could be repaired by changing the function to return -1 as you mentioned, but this leads to the following problem: it would mean that we'd be standardizing something that conflicts with existing implementations. If someone would write a piece of code in userspace that does this:

if (strscpy(....) == -1) {
   ...
}

And move that into the kernel, then you suddenly wouldn't detect truncation anymore. The same thing holds when moving code out of the kernel. And this is really not an uncommon scenario. Kernel developers do it all the time. For example, I designed FreeBSD's VT100/xterm terminal emulator entirely in userspace and only moved it into the kernel afterwards.

- The advantages of strscpy() over strlcpy() are *small*. If I read the email thread regarding its introduction, the (only?) advantage of this function over strlcpy() is that this piece of code:

if (strlcpy(a, b, some->really->long->expression->that->attempts->to->get->the->field->length) >= some->really->long->expression->that->attempts->to->get->the->field->length) {
  ...
}

can be rewritten to the following statement that is shorter:

if (strscpy(a, b, some->really->long->expression->that->attempts->to->get->the->field->length) < 0) {
  ...
}

I think this only provides very minimal gain. I just did a grep on the FreeBSD source tree and in almost all cases the expression that the function is compared against is compact. It's either a *_MAX constant from some header file or a sizeof() expression referring to an array on the stack or a struct member of low depth.
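
The snprintf() equivalence claimed earlier in this note can be checked with a small sketch like the one below. sketch_strlcpy() is a hypothetical stand-in for strlcpy(), and same_as_snprintf() is a hypothetical helper; the 64-byte scratch buffers are an arbitrary choice for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for strlcpy(); returns strlen(src). */
static size_t sketch_strlcpy(char *dst, const char *src, size_t dstsize)
{
    size_t srclen = strlen(src);

    assert(dst != NULL || dstsize == 0);
    if (dstsize > 0) {
        size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/* Returns 1 if copying src with size n yields the same bytes and the same
 * reported length through both interfaces, 0 otherwise. */
static int same_as_snprintf(const char *src, size_t n)
{
    char a[64], b[64];
    size_t ra;
    int rb;

    assert(n > 0 && n <= sizeof a);
    ra = sketch_strlcpy(a, src, n);
    rb = snprintf(b, n, "%s", src);
    return rb >= 0 && ra == (size_t)rb && strcmp(a, b) == 0;
}
```

One caveat the sketch does not exercise: snprintf() returns int, so the two can only agree while the source length fits in an int, whereas strlcpy() reports a size_t.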
(0002900)
steffen (reporter)
2015-11-10 17:07

Dear Ed, first i liked cons25, fwiw.
Yes this is a mysterious post but i like the last sentence, do i. A remarkable amount of commits that simply changed existing code to the strl* family has happened in the FreeBSD tree since you've opened this issue, too, and i think that the Linux community is better served to use this new interface for new development only, instead of converting old code to this strl interface in a three line diff context. I don't know whether Linus Torvalds referred to this FreeBSD commit series when he explicitly pointed out that this is not desired for Linux and strscpy(). (Note i personally had a possibly even odd feeling once those commit messages flew by, weeks before i read that article.)

I could only repost the first two sentences of message 0002896 again to answer you. I see you have read the Linux commit message.. hm. Quite polemical, hm, hmm. Well.

  strlcpy(a, b, n) behaves identically to snprintf(a, n, "%s", b)

Fantastic! Let's just avoid redundancy.
Ah, what is a kernel anyway, the sun will blow up at times and then, after some more time, there is nothing but a black hole, at least maybe, and if we have enough mass. But i doubt the latter.
(0002901)
steffen (reporter)
2015-11-10 17:27

P.S.: Ach! on ISO C defining size_t without a signed counterpart.
Unfortunately i miss that sense of humour that is expressed with the -1..SSIZE_MAX definition. At least. No i'd throw it overboard and define it as the signed counterpart of size_t, regularly and officially, and then nice things would be possible, like returning "-dstsize - 1", not only for the above. That'd be a much more dense way of doing things, what do you mean. But at least the latter will remain a dream.
(0003293)
shware_systems (reporter)
2016-07-07 23:18

Adding these functions would require a sponsor and some proposed text. At the 20160707 call it was decided to ask the Open Group if they would be willing to sponsor the interfaces in the Desired Action.
(0004967)
geoffclare (manager)
2020-09-03 10:45
edited on: 2020-09-03 13:59

Suggested changes to go into The Open Group company review...

Page and line numbers are for the 2016/2018 edition.

On page 363 line 12410 section <string.h>, add:
[CX]size_t strlcat(char *restrict, const char *restrict, size_t);
size_t strlcpy(char *restrict, const char *restrict, size_t);[/CX]

On page 364 line 12437 section <string.h>, add strlcat() to SEE ALSO.

On page 460 line 15893 section <wchar.h>, add:
[CX]size_t wcslcat(wchar_t *restrict, const wchar_t *restrict, size_t);
size_t wcslcpy(wchar_t *restrict, const wchar_t *restrict, size_t);[/CX]

On page 461 line 15955 section <wchar.h>, add wcslcat() to SEE ALSO.

On page 494 line 17097-17146 section 2.4.3, add strlcat(), strlcpy(), wcslcat(), and wcslcpy() to the table of async-signal-safe functions.

On page 2053 insert a new strlcat page:

NAME
strlcat, strlcpy -- size-bounded string concatenation and copying

SYNOPSIS
#include <string.h>

[CX]size_t strlcat(char *restrict dst, const char *restrict src,
    size_t dstsize);
size_t strlcpy(char *restrict dst, const char *restrict src,
    size_t dstsize);[/CX]

DESCRIPTION
The strlcpy() and strlcat() functions copy and concatenate strings, stopping when either a NUL terminator in the source string is encountered or the specified full size of the destination buffer is reached. They NUL terminate the result if there is room. The application should ensure that room for the NUL terminator is included in dstsize.

The strlcpy() function shall copy not more than dstsize - 1 bytes from the string pointed to by src to the array pointed to by dst; a NUL byte in src and bytes that follow it shall not be copied. A terminating NUL byte shall be appended to the result, unless dstsize is 0. If copying takes place between objects that overlap, the behavior is undefined.

The strlcat() function shall append not more than dstsize - strlen(dst) - 1 bytes from the string pointed to by src to the end of the string pointed to by dst; a NUL byte in src and bytes that follow it shall not be appended. The initial byte of src shall overwrite the NUL byte at the end of dst. A terminating NUL byte shall be appended to the result, unless its location would be at or beyond dst + dstsize. If copying takes place between objects that overlap, the behavior is undefined.

The strlcpy() and strlcat() functions shall not change the setting of errno on valid input.

RETURN VALUE
Upon successful completion, the strlcpy() function shall return the length of the string pointed to by src; that is, the number of bytes in the string, not including the terminating NUL byte.

Upon successful completion, the strlcat() function shall return the initial length of the string pointed to by dst plus the length of the string pointed to by src.

No return values are reserved to indicate an error.

ERRORS
No errors are defined.

EXAMPLES
The following example detects truncation while combining a path prefix (including trailing <slash>) and a filename to produce a portable pathname:
char *prefix, *filenam, pathnam[_POSIX_PATH_MAX];

if (strlcpy(pathnam, prefix, sizeof pathnam) >= sizeof pathnam ||
    strlcat(pathnam, filenam, sizeof pathnam) >= sizeof pathnam)
{
    // truncation occurred
    ...
}

This code ensures there is room for the NUL terminator by:
  • Calling strlcpy() with a non-zero dstsize argument.

  • Only calling strlcat() if the return value of strlcpy() indicated that truncation did not occur.

APPLICATION USAGE
The return value of the strlcpy() and strlcat() functions follows the same convention as snprintf(); that is, they return the total length of the string they tried to create. If the return value is greater than or equal to dstsize, the output string has been truncated.

RATIONALE
None.

FUTURE DIRECTIONS
None.

SEE ALSO
snprintf(), strlen(), strncat(), strncpy(), wcslcat()

XBD <string.h>

CHANGE HISTORY
First released in Issue 8.
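
For readers who want to see the strlcat page above in executable form, the following is one possible reading of its DESCRIPTION and RETURN VALUE sections. It is a sketch, not a reference implementation: the sketch_ names are hypothetical, and treating a dst with no NUL inside dstsize as having length dstsize (and writing nothing) is an assumption of this example rather than something the proposed text spells out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies at most dstsize - 1 bytes, terminates unless dstsize is 0,
 * and returns the length of src. */
static size_t sketch_strlcpy(char *dst, const char *src, size_t dstsize)
{
    size_t srclen = strlen(src);

    assert(src != NULL);
    if (dstsize > 0) {
        size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/* Appends at most dstsize - strlen(dst) - 1 bytes and returns the initial
 * length of dst plus the length of src. */
static size_t sketch_strlcat(char *dst, const char *src, size_t dstsize)
{
    size_t srclen = strlen(src);
    const char *end = dstsize > 0 ? memchr(dst, '\0', dstsize) : NULL;
    size_t dstlen, room, n;

    assert(src != NULL);
    if (end == NULL)                /* assumption: no NUL within dstsize */
        return dstsize + srclen;    /* nothing is written in that case */
    dstlen = (size_t)(end - dst);
    room = dstsize - dstlen - 1;    /* bytes available before the NUL slot */
    n = srclen < room ? srclen : room;
    memcpy(dst + dstlen, src, n);   /* first byte overwrites the old NUL */
    dst[dstlen + n] = '\0';
    return dstlen + srclen;
}
```

In both functions a return value greater than or equal to dstsize signals truncation, matching the APPLICATION USAGE text.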


Add strlcat() to the SEE ALSO section for each existing function page listed in the strlcat() SEE ALSO above.

On page 2256 insert a new wcslcat page:

NAME
wcslcat, wcslcpy -- size-bounded wide string concatenation and copying

SYNOPSIS
#include <wchar.h>

[CX]size_t wcslcat(wchar_t *restrict dst, const wchar_t *restrict src,
    size_t dstsize);
size_t wcslcpy(wchar_t *restrict dst, const wchar_t *restrict src,
    size_t dstsize);[/CX]

DESCRIPTION
The wcslcpy() and wcslcat() functions copy and concatenate wide strings, stopping when either a terminating null wide-character code in the source wide string is encountered or the specified full size (in wide-character codes) of the destination buffer is reached. They null terminate the result if there is room. The application should ensure that room for the terminating null wide-character code is included in dstsize.

The wcslcpy() function shall copy not more than dstsize - 1 wide-character codes from the wide string pointed to by src to the array pointed to by dst; a terminating null wide-character code in src and wide-character codes that follow it shall not be copied. A terminating null wide-character code shall be appended to the result, unless dstsize is 0. If copying takes place between objects that overlap, the behavior is undefined.

The wcslcat() function shall append not more than dstsize - wcslen(dst) - 1 wide-character codes from the wide string pointed to by src to the end of the wide string pointed to by dst; a terminating null wide-character code in src and wide-character codes that follow it shall not be appended. The initial wide-character code of src shall overwrite the terminating null wide-character code at the end of dst. A terminating null wide-character code shall be appended to the result, unless its location would be at or beyond dst + dstsize. If copying takes place between objects that overlap, the behavior is undefined.

The wcslcpy() and wcslcat() functions shall not change the setting of errno on valid input.

RETURN VALUE
Upon successful completion, the wcslcpy() function shall return the length of the wide string pointed to by src; that is, the number of wide-character codes in the wide string, not including the terminating null wide-character code.

Upon successful completion, the wcslcat() function shall return the initial length of the wide string pointed to by dst plus the length of the wide string pointed to by src.

No return values are reserved to indicate an error.

ERRORS
No errors are defined.

EXAMPLES
None.

APPLICATION USAGE
The return value of the wcslcpy() and wcslcat() functions follows the same convention as snprintf(); that is, they return the total length (in wide-character codes) of the wide string they tried to create. If the return value is greater than or equal to dstsize, the output wide string has been truncated.

RATIONALE
None.

FUTURE DIRECTIONS
None.

SEE ALSO
snprintf(), strlcat(), wcslen(), wcsncat(), wcsncpy()

XBD <wchar.h>

CHANGE HISTORY
First released in Issue 8.
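
Since the wcslcat page above has no EXAMPLES, here is a small sketch of the one point that differs in practice from the byte versions: dstsize counts wide-character codes, not bytes, so sizeof on a wchar_t array is the wrong argument. sketch_wcslcpy() and the WNELEM() macro are hypothetical names introduced for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Number of elements (wide-character codes) in a wchar_t array. */
#define WNELEM(a) (sizeof (a) / sizeof (a)[0])

/* Hypothetical stand-in for wcslcpy(): copies at most dstsize - 1
 * wide-character codes, terminates unless dstsize is 0, and returns
 * the length of src in wide-character codes. */
static size_t sketch_wcslcpy(wchar_t *dst, const wchar_t *src, size_t dstsize)
{
    size_t srclen = wcslen(src);

    assert(dst != NULL || dstsize == 0);
    if (dstsize > 0) {
        size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;
        wmemcpy(dst, src, n);
        dst[n] = L'\0';
    }
    return srclen;
}
```

A call such as sketch_wcslcpy(buf, src, WNELEM(buf)) >= WNELEM(buf) then detects truncation in the same way as the strlcpy() example on the strlcat page.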


Add wcslcat() to the SEE ALSO section for each existing function page listed in the wcslcat() SEE ALSO above.

On page 3790 line 130046 section E.1, add wcslcat() and wcslcpy() to the POSIX_C_LANG_WIDE_CHAR_EXT subprofile group.

On page 3790 line 130049 section E.1, add strlcat() and strlcpy() to the POSIX_C_LIB_EXT subprofile group.

(0005050)
geoffclare (manager)
2020-10-16 09:29

The additions for these four functions have been made in the Issue8NewAPIs branch in gitlab, based on Note: 0004967.
(0005334)
geoffclare (manager)
2021-04-29 15:17

Make the changes from "Additional APIs for Issue 8, Part 1" (Austin/1110).

- Issue History
Date Modified Username Field Change
2015-09-23 08:35 EdSchouten New Issue
2015-09-23 08:35 EdSchouten Status New => Under Review
2015-09-23 08:35 EdSchouten Assigned To => ajosey
2015-09-23 08:35 EdSchouten Name => Ed Schouten
2015-09-23 08:35 EdSchouten Organization => Nuxi
2015-09-23 08:35 EdSchouten Section => <string.h>
2015-09-23 08:35 EdSchouten Page Number => n/a
2015-09-23 08:35 EdSchouten Line Number => n/a
2015-09-23 08:46 EdSchouten Section <string.h> => <string.h> and <wchar.h>
2015-09-23 14:00 eblake Note Added: 0002843
2015-09-23 16:19 EdSchouten Note Added: 0002844
2015-10-01 05:01 shware_systems Note Added: 0002851
2015-10-09 14:00 joerg Note Added: 0002869
2015-11-09 20:57 steffen Note Added: 0002892
2015-11-10 11:33 joerg Note Added: 0002894
2015-11-10 11:58 EdSchouten Note Added: 0002895
2015-11-10 13:08 steffen Note Added: 0002896
2015-11-10 16:20 EdSchouten Note Added: 0002897
2015-11-10 17:07 steffen Note Added: 0002900
2015-11-10 17:27 steffen Note Added: 0002901
2016-07-07 23:18 shware_systems Note Added: 0003293
2016-09-14 14:25 emaste Issue Monitored: emaste
2020-09-03 10:45 geoffclare Note Added: 0004967
2020-09-03 10:47 geoffclare Note Edited: 0004967
2020-09-03 10:49 geoffclare Note Edited: 0004967
2020-09-03 13:52 geoffclare Note Edited: 0004967
2020-09-03 13:59 geoffclare Note Edited: 0004967
2020-10-16 09:29 geoffclare Note Added: 0005050
2021-04-29 15:17 geoffclare Note Added: 0005334
2021-04-29 15:18 geoffclare Interp Status => ---
2021-04-29 15:18 geoffclare Final Accepted Text => Note: 0005334
2021-04-29 15:18 geoffclare Status Under Review => Resolved
2021-04-29 15:18 geoffclare Resolution Open => Accepted As Marked
2021-04-29 15:18 geoffclare Tag Attached: issue8
2021-05-07 15:18 geoffclare Status Resolved => Applied
2022-06-29 16:05 Florian Weimer Issue Monitored: Florian Weimer
2022-06-30 08:39 geoffclare Relationship added related to 0001591

