Austin Group Defect Tracker

Aardvark Mark IV


ID 0000110
Category [1003.1(2008)/Issue 7] System Interfaces
Severity Objection
Type Omission
Date Submitted 2009-06-30 19:39
Last Update 2013-04-16 13:06
Reporter eblake
View Status public
Assigned To ajosey
Priority normal
Resolution Accepted As Marked
Status Closed
Name Eric Blake
Organization
User Reference
Section memchr
Page Number 1284
Line Number 42163
Interp Status ---
Final Accepted Text Note: 0000143
Summary 0000110: memchr input process order
Description _____________________________________________________________________________
 OBJECTION Enhancement Request Number 39
 ebb9:xxxxxxx Defect in XSH memchr (rdvk# 1)
 {ebb.memchr} Wed, 27 May 2009 13:58:36 +0100 (BST)
 _____________________________________________________________________________

Traditional implementations of memchr process the input in ascending
 order. This has the advantage that when the object size of s is not
 known, but c occurs within the object, the caller can pass a value of
 n that is larger than the actual object size without dereferencing
 inaccessible memory. However, while the standard (and C99) is
 explicit that it is permissible to pass n smaller than the object size
 of s, it is silent on whether passing a larger n is well-defined.

 In contrast, consider the wording for fprintf when dealing with the
 %.*s specifier, from line 29938:

 "If the precision is not specified or is greater than the size of the
 array, the application shall ensure that the array contains a null
 byte."

 Many implementations of the *printf family use memchr to implement
 this statement; for example,
 http://git.sv.gnu.org/cgit/gnulib.git/tree/lib/vasnprintf.c?id=d4ca645#n197

 However, if memchr does not have any strict requirement on evaluation
 order, then this invokes undefined behavior. For example, here is a
 bug report showing what happens when memchr does not have the
 traditional behavior, but dereferences memory that fits with the n
 argument to memchr but not within the actual array passed to printf:
 http://www.alphalinux.org/archives/axp-list/March2001/0337.shtml

 Likewise, application writers have noticed that it is possible to
 write faster code for finding a NUL byte, if one is present within a
 bounded length, by using memchr rather than strnlen, since the former
 has fewer conditionals (bounds check and search for NUL) than the
 latter (bounds check, search for NUL, and search for c). For example:
 http://git.sv.gnu.org/cgit/gnulib.git/tree/lib/strnlen1.c?id=d4ca645

 But again, this usage is rendered unsafe unless memchr is specified
 to behave like strnlen and not dereference past the match.
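
For illustration, a minimal sketch of the memchr(s,0,n) idiom discussed above. The helper name bounded_len is hypothetical; it is not taken from gnulib or from the standard. It returns the value that strnlen(s,n) would return.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (name assumed for this sketch): length of the
 * string at s, but never more than maxlen. Equivalent in result to
 * strnlen(s, maxlen). */
static size_t bounded_len(const char *s, size_t maxlen)
{
    /* Search for the terminating null byte within the first maxlen bytes. */
    const char *end = memchr(s, 0, maxlen);

    /* If no null byte was found, the length is capped at maxlen. */
    return end ? (size_t)(end - s) : maxlen;
}
```

When maxlen exceeds the size of the array, this is only safe if memchr stops reading at the first match, which is the guarantee this report asks the standard to make.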
Desired Action At the end of the paragraph at line 42164, append a sentence with CX shading:

 If n is larger than the object pointed to by s, the application shall
 ensure that an instance of c occurs within the object.

 Change the rationale at line 42174 from:

 None.

 to:

 Although C99 is silent on the behavior of memchr when s points to an
 array smaller than n bytes, this specification requires memchr to
 behave as if it accesses bytes in ascending order, thus making
 memchr(s,0,n) safe to use as a faster alternative to strnlen(s,n) when
 determining if the end of a null-terminated string occurs within n
 bytes.


 According to ebb9:xxxxxxx on 5/27/2009 6:58 AM:
 > Likewise, application writers have noticed that it is possible to
 > write faster code for finding a NUL byte, if one is present within a
 > bounded length, by using memchr rather than strnlen, since the former
 > has fewer conditionals (bounds check and search for NUL) than the
 > latter (bounds check, search for NUL, and search for c).

 Correction - I meant to compare memchr(s,c,n) to strchr(s,c) where c is
 known to occur in s; strchr requires a search for c and for NUL, and the
 search for two bytes in parallel is typically more expensive than a bounds
 check and single search. There is no strnchr, so nothing is standardized
 that performs all three of bounds check, search for NUL, and search for c
 at once. (That behavior is also useful--for example, gnulib provides a
 function memchr2--but it can wait for another day to be standardized).
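
To make the "all three at once" behavior concrete, here is a rough sketch of a two-byte bounded search. The name find_either and its body are assumptions made for this example; this is not gnulib's memchr2 code, which may be organized quite differently.

```c
#include <stddef.h>

/* Hypothetical sketch: return a pointer to the first byte among the
 * initial n bytes of s that equals c1 or c2 (each converted to
 * unsigned char), or NULL if neither occurs. Bytes are examined in
 * ascending order and nothing past the first match is read. */
static void *find_either(const void *s, int c1, int c2, size_t n)
{
    const unsigned char *p = s;
    unsigned char a = (unsigned char)c1;
    unsigned char b = (unsigned char)c2;

    for (size_t i = 0; i < n; i++) {
        if (p[i] == a || p[i] == b)
            return (void *)(p + i);
    }
    return NULL;
}
```

Calling find_either(s, c, 0, n) then performs the bounds check, the search for NUL, and the search for c in a single pass.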

 But one point remains - many applications use memchr(s,0,n) rather than
 strnlen(s,n) because strnlen was not present in earlier standards. So
 this aardvark is still useful in standardizing this relationship.

 > Change the rationale at line 42174 from:
 >
 > None.
 >
 > to:
 >
 > Although C99 is silent on the behavior of memchr when s points to an
 > array smaller than n bytes, this specification requires memchr to
 > behave as if it accesses bytes in ascending order, thus making
 > memchr(s,0,n) safe to use as a faster alternative to strnlen(s,n) when
 > determining if the end of a null-terminated string occurs within n
 > bytes.

 Therefore, we may want to strike the word 'faster' in this proposed rationale.

Tags c99, tc1-2008

-  Notes
(0000143)
msbrown (manager)
2009-06-30 19:39

In the DESCRIPTION remove "of the object" from

The memchr( ) function shall locate the first occurrence of c (converted
to an unsigned char) in the initial n bytes (each interpreted as unsigned
char) of the object pointed to by s.

In the RETURN VALUE section, change

The memchr( ) function shall return a pointer to the located byte,
or a null pointer if the byte does not occur in the object.

to
The memchr( ) function shall return a pointer to the located byte,
or a null pointer if the byte is not found.

Also Nick will let the C committee know about the issue

Add to DESCRIPTION
Implementations shall behave as if they read the memory byte by byte
from the beginning of the bytes pointed to by s and stop at the first
occurrence of c (if it is found in the initial n bytes).
(0000144)
eblake (manager)
2009-06-30 20:03

Note that the "Final Accepted Text" field contains two chunks of edits to the standard, but they are separated by an informative sentence ("Also Nick will let the C committee know about the issue") that should not be placed in the standard.
(0000607)
nick (manager)
2010-11-05 14:35

WG14 has added
"Implementations shall behave as if they read the memory byte by byte
from the beginning of the bytes pointed to by s and stop at the first
occurrence of c (if it is found in the initial n bytes)."
to the description of memchr in the C1x draft.
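
The "as if" wording quoted above can be pictured with a straightforward reference loop. This is only a sketch of the required observable behavior; the name memchr_ascending is hypothetical, and real implementations are free to use faster techniques as long as they behave the same way.

```c
#include <stddef.h>

/* Hypothetical reference version: examine bytes one at a time from
 * the start of s and stop at the first occurrence of c (converted to
 * unsigned char) within the initial n bytes. */
static void *memchr_ascending(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    unsigned char target = (unsigned char)c;

    for (; n > 0; n--, p++) {
        if (*p == target)
            return (void *)p;   /* no byte after the match is read */
    }
    return NULL;                /* c does not occur in the first n bytes */
}
```

Because the loop returns at the first match, a call such as memchr_ascending(buf, 'b', 100) on a 4-byte array holding "abc" never touches memory beyond buf[1].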

- Issue History
Date Modified Username Field Change
2009-06-30 19:39 msbrown New Issue
2009-06-30 19:39 msbrown Status New => Under Review
2009-06-30 19:39 msbrown Assigned To => ajosey
2009-06-30 19:39 msbrown Name => Mark Brown
2009-06-30 19:39 msbrown Organization => IBM
2009-06-30 19:39 msbrown Section => memchr
2009-06-30 19:39 msbrown Page Number => 1284
2009-06-30 19:39 msbrown Line Number => 42163
2009-06-30 19:39 msbrown Note Added: 0000143
2009-06-30 19:39 msbrown Status Under Review => Resolved
2009-06-30 19:39 msbrown Resolution Open => Accepted As Marked
2009-06-30 19:40 msbrown Final Accepted Text => Note: 0000143
2009-06-30 20:03 eblake Note Added: 0000144
2009-07-01 16:44 Don Cragun Name Mark Brown => Eric Blake
2009-07-01 16:44 Don Cragun Organization IBM =>
2009-07-01 16:44 Don Cragun Reporter msbrown => eblake
2009-08-06 16:24 nick Tag Attached: c99
2010-08-27 13:18 ajosey Tag Attached: tc1-2008
2010-11-05 14:35 nick Note Added: 0000607
2013-04-16 13:06 ajosey Status Resolved => Closed

