View Issue Details

ID: 0000945
Project: 1003.1(2013)/Issue7+TC1
Category: Shell and Utilities
View Status: public
Last Update: 2019-06-10 08:54
Reporter: stephane
Assigned To:
Priority: normal
Severity: Comment
Type: Enhancement Request
Status: Closed
Resolution: Accepted As Marked
Name: Stephane Chazelas
Organization:
User Reference:
Section: sed
Page Number: 3182
Line Number: 106427-106430
Interp Status: Approved
Final Accepted Text: see 0000945:0002733
Summary: 0000945: sed: leave behaviour unspecified for label names containing }, \, #, ; or [:space:].
Description: At the moment, the spec for sed allows labels that contain any character.

The RATIONALE has a statement:


     The b, t, and : commands are documented to ignore leading white space,
     but no mention is made of trailing white space. Historical
     implementations of sed assigned different locations to the labels 'x'
     and "x ". This is not useful, and leads to subtle programming errors,
     but it is historical practice, and changing it could theoretically break
     working scripts. Implementors are encouraged to provide warning messages
     about labels that are never used or jumps to labels that do not exist.


In practice, that means that authors who write sed scripts against the standard could end up with non-portable scripts.

Even with POSIXLY_CORRECT=1, GNU sed doesn't allow }#;<SPC><TAB> in label names.

From my reading of:


     The argument text shall consist of one or more lines. Each embedded
     <newline> in the text shall be preceded by a <backslash>. Other
     <backslash> characters in text shall be removed, and the following
     character shall be treated literally.


b a\b should branch to the label defined as : ab, but it doesn't in GNU, Solaris or FreeBSD sed. Solaris sed lets you define a label containing a newline (written as \<LF>), but FreeBSD and GNU sed do not.

Also note that for Solaris sed, the length limit on labels is counted in bytes, not characters (a stéphane label in a UTF-8 locale, where those 8 characters are written as 9 bytes, causes an error).
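
The byte count mentioned above can be checked with a short shell sketch. It spells the é as its two UTF-8 bytes in octal so the result does not depend on how this page is encoded; the variable names are only illustrative.

```shell
# "Stéphane" with é written as its two UTF-8 bytes (octal 303 251).
label=$(printf 'St\303\251phane')

# wc -c counts bytes regardless of the current locale; tr strips the
# padding that some wc implementations print before the number.
bytes=$(printf '%s' "$label" | wc -c | tr -d ' ')

echo "$bytes"   # prints 9: eight characters, one of them two bytes long
```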

So, in practice, the following sed scripts are POSIX compliant, but not portable:

: {foo}
s/a/b/
t {foo}

: Stéphane
s/a/b/
t Stéphane

: a\b
s/a/b/
t ab

: a;b
s/a/b/
t a;b

: a\
b
s/a/b/
t a\
b
Desired Action: Change the spec to make the behaviour unspecified when label names consist of more than 8 bytes (as opposed to characters) or contain ;#}\ or [:space:] characters.

The note about backslashes being removed in arguments should probably be updated as well to mention that it doesn't apply to branching labels (where the behaviour should be unspecified if backslashes are used). The behaviour for \\<newline> is also unclear from that text, though that is a separate issue.

Then, sed implementations don't need to be modified, and application writers know to avoid those when writing portable scripts. That would also serve as a hint to reinforce the fact that branching commands cannot be separated from the next command with a semicolon.
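
As an illustration of the portable style argued for here, the sketch below uses a short label made only of letters and ends each branching command with a newline, by passing every command as its own -e fragment, rather than with a semicolon. The function name subst_loop is hypothetical.

```shell
# Hypothetical helper: replace every "a" with "b" using a branch loop.
# Each -e fragment is treated as a separate script line, so the label
# "loop" is never followed by ';' or trailing blanks.
subst_loop() {
  sed -e ':loop' -e 's/a/b/' -e 't loop'
}

printf 'banana\n' | subst_loop   # prints bbnbnb
```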
Tags: tc2-2008

Activities

stephane

2015-05-05 14:41

reporter   bugnote:0002652

Also note that FreeBSD sed issues a warning when labels end in blanks.

rhansen

2015-06-25 16:32

manager   bugnote:0002733

Last edited: 2015-06-25 16:33

Interpretation response
------------------------
The standard states the requirements for sed labels, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.

Rationale:
-------------
Historical sed implementations did not consistently handle certain characters in label arguments; some interpreted backslashes as escape characters while others did not, some did not treat characters like ';', '}', and whitespace like other label characters, and some limited label length by bytes rather than characters. Limiting conforming scripts to using labels with names created from characters in the portable filename character set allows those scripts to run on all implementations.

Note that the paragraph starting at line 106382 (regarding backslash escaping in text arguments) only applies to the text argument for the a, c, and i commands. The standard does not specify any backslash escaping for labels.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
On page 3180 line 106330, change:

     The standard error shall be used only for diagnostic messages.

to:

     The standard error shall be used only for diagnostic and warning messages.

On page 3182 after line 106412, insert a new paragraph:

     If a label argument (to a b, t, or : command) contains characters outside of the portable filename character set, or if a label is longer than 8 bytes, the behavior is unspecified. The implementation shall support label arguments recognized as unique up to at least 8 bytes; the actual length (greater than or equal to 8) supported by the implementation is unspecified. It is unspecified whether exceeding the maximum supported label length causes an error or a silent truncation.

On page 3182 lines 106426-106430, change the entire description of the b command to:

     Branch to the : command verb bearing the label argument. If label is not specified, branch to the end of the script.

On page 3187 lines 106644-106646, change:

     Implementors are encouraged to provide warning messages about labels that are never used or jumps to labels that do not exist.

to:

     Implementors are encouraged to provide warning messages about labels that are never referenced by a b or t command, jumps to labels that do not exist, and label arguments that are subject to truncation.
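
The restriction in the new paragraph (portable filename character set only, at most 8 bytes) can be checked mechanically. The following is a minimal sketch in POSIX shell. The name is_portable_label is a hypothetical helper, not part of the standard or of any sed implementation, and rejecting an empty string is an assumption of this sketch.

```shell
# Hypothetical helper: succeed only if the label uses characters from
# the portable filename character set (A-Z a-z 0-9 . _ -) and is 1 to
# 8 bytes long. Characters are listed explicitly so that the bracket
# expression does not depend on locale-specific range ordering.
is_portable_label() {
  case $1 in
    '') return 1 ;;
    *[!ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789._-]*)
      return 1 ;;
  esac
  # Every remaining character is single-byte ASCII, so the character
  # count reported by ${#1} equals the byte count.
  [ "${#1}" -le 8 ]
}

for label in loop '{foo}' 'a;b' 'x ' toolonglabel; do
  if is_portable_label "$label"; then
    echo "portable:     '$label'"
  else
    echo "non-portable: '$label'"
  fi
done
```

Only loop passes in this run; the other four labels are rejected for containing '{', ';', a trailing blank, or more than 8 bytes respectively.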


ajosey

2015-06-26 08:47

manager   bugnote:0002734

Interpretation Proposed: 26 June 2015

ajosey

2015-09-07 11:33

manager   bugnote:0002818

Interpretation approved: 7 Sep 2015

Issue History

Date Modified Username Field Change
2015-05-05 14:05 stephane New Issue
2015-05-05 14:05 stephane Name => Stephane Chazelas
2015-05-05 14:05 stephane Section => sed
2015-05-05 14:05 stephane Page Number => http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html#tag_20_116_13_03
2015-05-05 14:05 stephane Line Number => http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html#tag_20_116_13_03
2015-05-05 14:41 stephane Note Added: 0002652
2015-06-25 16:32 rhansen Note Added: 0002733
2015-06-25 16:33 rhansen Note Edited: 0002733
2015-06-25 16:38 rhansen Page Number http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html#tag_20_116_13_03 => 3182
2015-06-25 16:38 rhansen Line Number http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html#tag_20_116_13_03 => 106427-106430
2015-06-25 16:38 rhansen Interp Status => Proposed
2015-06-25 16:38 rhansen Final Accepted Text => see 0000945:0002733
2015-06-25 16:38 rhansen Status New => Interpretation Required
2015-06-25 16:38 rhansen Resolution Open => Accepted As Marked
2015-06-25 16:39 rhansen Tag Attached: tc2-2008
2015-06-25 18:11 Don Cragun Interp Status Proposed => Pending
2015-06-26 08:47 ajosey Interp Status Pending => Proposed
2015-06-26 08:47 ajosey Note Added: 0002734
2015-09-07 11:33 ajosey Interp Status Proposed => Approved
2015-09-07 11:33 ajosey Note Added: 0002818
2019-06-10 08:54 agadmin Status Interpretation Required => Closed