Austin Group Defect Tracker

Aardvark Mark IV


ID 0000945
Category [1003.1(2013)/Issue7+TC1] Shell and Utilities
Severity Comment
Type Enhancement Request
Date Submitted 2015-05-05 14:05
Last Update 2019-06-10 08:54
Reporter stephane View Status public  
Assigned To
Priority normal Resolution Accepted As Marked  
Status Closed  
Name Stephane Chazelas
Organization
User Reference
Section sed
Page Number 3182
Line Number 106427-106430
Interp Status Approved
Final Accepted Text see Note: 0002733
Summary 0000945: sed: leave behaviour unspecified for label names containing }, \, #, ; or [:space:].
Description At the moment, the spec for sed allows labels that contain any character.

The RATIONALE has a statement:


     The b, t, and : commands are documented to ignore leading white space,
     but no mention is made of trailing white space. Historical
     implementations of sed assigned different locations to the labels 'x'
     and "x ". This is not useful, and leads to subtle programming errors,
     but it is historical practice, and changing it could theoretically break
     working scripts. Implementors are encouraged to provide warning messages
     about labels that are never used or jumps to labels that do not exist.


In practice, that means that authors writing sed scripts against the standard could end up with non-portable scripts.

Even with POSIXLY_CORRECT=1, GNU sed doesn't allow }#;<SPC><TAB> in label names.

From my reading of:


     The argument text shall consist of one or more lines. Each embedded
     <newline> in the text shall be preceded by a <backslash>. Other
     <backslash> characters in text shall be removed, and the following
     character shall be treated literally.


b a\b should branch to the label defined as : ab, but it doesn't in GNU, Solaris or FreeBSD sed. With Solaris sed you can define a label containing a newline (with \<LF>), but not with FreeBSD or GNU sed.

Also note that for Solaris sed, the length limit on labels is in bytes, not characters (a stéphane label in a UTF-8 locale, where those 8 characters are written as 9 bytes, causes an error).
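
The byte/character difference can be checked with a minimal Python sketch. It is only an illustration of the counting, not of any sed implementation; it assumes the é is the precomposed U+00E9 and that the locale encodes text as UTF-8.

```python
# Compare character count and UTF-8 byte count for a candidate sed label.
# Assumption: the label uses the precomposed "é" (U+00E9).
label = "st\u00e9phane"

char_count = len(label)                  # number of code points
byte_count = len(label.encode("utf-8"))  # number of bytes in a UTF-8 locale

print(char_count, byte_count)
```

A limit counted in characters would accept this label, while a limit counted in bytes would not.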

So, in practice, the following sed scripts are POSIX compliant, but not portable:

: {foo}
s/a/b/
t {foo}

: Stéphane
s/a/b/
t Stéphane

: a\b
s/a/b/
t ab

: a;b
s/a/b/
t a;b

: a\
b
s/a/b/
t a\
b
Desired Action Change the spec to make the behaviour unspecified when label names consist of more than 8 bytes (as opposed to characters) or contain ;#}\ or [:space:] characters.

The note about backslashes being removed in arguments should probably be updated as well to mention that it doesn't apply to branching labels (where the behaviour should be unspecified if backslashes are used). The behaviour for \\<newline> is also unclear from that text, though that's a separate issue.

That way, sed implementations don't need to be modified, and application writers know to avoid such labels when writing portable scripts. That would also serve as a hint to reinforce the fact that branching commands cannot be separated from the next command with a semicolon.
Tags tc2-2008
Attached Files

- Relationships

-  Notes
(0002652)
stephane (reporter)
2015-05-05 14:41

Also note that FreeBSD sed issues a warning when labels end in blanks.
(0002733)
rhansen (manager)
2015-06-25 16:32
edited on: 2015-06-25 16:33

Interpretation response
------------------------
The standard states the requirements for sed labels, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.

Rationale:
-------------
Historical sed implementations did not consistently handle certain characters in label arguments; some interpreted backslashes as escape characters while others did not, some did not treat characters like ';', '}', and whitespace like other label characters, and some limited label length by bytes rather than characters. Limiting conforming scripts to using labels with names created from characters in the portable filename character set allows those scripts to run on all implementations.

Note that the paragraph starting at line 106382 (regarding backslash escaping in text arguments) only applies to the text argument for the a, c, and i commands. The standard does not specify any backslash escaping for labels.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
On page 3180 line 106330, change:
The standard error shall be used only for diagnostic messages.
to:
The standard error shall be used only for diagnostic and warning messages.
On page 3182 after line 106412, insert a new paragraph:
If a label argument (to a b, t, or : command) contains characters outside of the portable filename character set, or if a label is longer than 8 bytes, the behavior is unspecified. The implementation shall support label arguments recognized as unique up to at least 8 bytes; the actual length (greater than or equal to 8) supported by the implementation is unspecified. It is unspecified whether exceeding the maximum supported label length causes an error or a silent truncation.
On page 3182 lines 106426-106430 Change the entire description of the b command to:
Branch to the : command verb bearing the label argument. If label is not specified, branch to the end of the script.
On page 3187 lines 106644-106646 change:
Implementors are encouraged to provide warning messages about labels that are never used or jumps to labels that do not exist.
to:
Implementors are encouraged to provide warning messages about labels that are never referenced by a b or t command, jumps to labels that do not exist, and label arguments that are subject to truncation.
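
As an illustration of the inserted paragraph on label arguments, the following Python sketch checks whether a label stays within the specified behaviour. The function name is_portable_label and the constant names are hypothetical, and rejecting an empty label is an assumption of this sketch rather than something the text above states.

```python
import string

# Portable filename character set: A-Z a-z 0-9 . _ -
PORTABLE_CHARS = frozenset(string.ascii_letters + string.digits + "._-")
MAX_LABEL_BYTES = 8  # length that every implementation has to support


def is_portable_label(label: str) -> bool:
    """Return True if the label avoids the unspecified cases (hypothetical helper)."""
    if not label:
        return False  # assumption: an empty label is not a usable branch target
    if any(ch not in PORTABLE_CHARS for ch in label):
        return False  # e.g. ';', '}', '\\', '#', whitespace, non-ASCII
    # Every remaining character is ASCII, so bytes and characters coincide.
    return len(label.encode("utf-8")) <= MAX_LABEL_BYTES


if __name__ == "__main__":
    for candidate in ["loop", "{foo}", "a;b", "a\\b", "St\u00e9phane", "verylonglabel"]:
        print(repr(candidate), is_portable_label(candidate))
```

Of the candidates listed, only loop passes; the others either contain characters outside the portable filename character set or exceed 8 bytes.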


(0002734)
ajosey (manager)
2015-06-26 08:47

Interpretation Proposed: 26 June 2015
(0002818)
ajosey (manager)
2015-09-07 11:33

Interpretation approved: 7 Sep 2015

- Issue History
Date Modified Username Field Change
2015-05-05 14:05 stephane New Issue
2015-05-05 14:05 stephane Name => Stephane Chazelas
2015-05-05 14:05 stephane Section => sed
2015-05-05 14:05 stephane Page Number => http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html#tag_20_116_13_03
2015-05-05 14:05 stephane Line Number => http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html#tag_20_116_13_03
2015-05-05 14:41 stephane Note Added: 0002652
2015-06-25 16:32 rhansen Note Added: 0002733
2015-06-25 16:33 rhansen Note Edited: 0002733
2015-06-25 16:38 rhansen Page Number http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html#tag_20_116_13_03 => 3182
2015-06-25 16:38 rhansen Line Number http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html#tag_20_116_13_03 => 106427-106430
2015-06-25 16:38 rhansen Interp Status => Proposed
2015-06-25 16:38 rhansen Final Accepted Text => see Note: 0002733
2015-06-25 16:38 rhansen Status New => Interpretation Required
2015-06-25 16:38 rhansen Resolution Open => Accepted As Marked
2015-06-25 16:39 rhansen Tag Attached: tc2-2008
2015-06-25 18:11 Don Cragun Interp Status Proposed => Pending
2015-06-26 08:47 ajosey Interp Status Pending => Proposed
2015-06-26 08:47 ajosey Note Added: 0002734
2015-09-07 11:33 ajosey Interp Status Proposed => Approved
2015-09-07 11:33 ajosey Note Added: 0002818
2019-06-10 08:54 agadmin Status Interpretation Required => Closed

