Viewing Issue Simple Details
ID | Category | Severity | Type | Date Submitted | Last Update
0000282 | [1003.1(2008)/Issue 7] Shell and Utilities | Editorial | Clarification Requested | 2010-07-12 15:20 | 2013-04-16 13:06
Reporter | bonzinip
View Status | public
Assigned To | ajosey
Priority | normal
Resolution | Accepted As Marked
Status | Closed
Name | Paolo Bonzini
Organization |
User Reference |
Section | sed
Page Number | 394
Line Number | 104833
Interp Status | Approved
Final Accepted Text | Note: 0000533
Summary | 0000282: Extended Description wrong with respect to D command
Description |
The extended description for sed says:

"In default operation, sed cyclically shall append a line of input, less its terminating <newline>, into the pattern space. Normally the pattern space will be empty, unless a D command terminated the last cycle."

The POSIX standard requires an input read on every new cycle, regardless of what's in the pattern space after a D command. It makes no mention of skipping the input read on the next cycle if the pattern space is not empty. However, this is contrary to the operation of most (all?) implementations, which do *not* append a line if the cycle was restarted due to the D command.

Looking back now at POSIX ancestors, I see that the wording *used* to be:

"sed cyclically copies a line of input, less its terminating newline character, into a pattern space (unless there is something left after a D command)..."

Which is consistent with GNU sed documentation and behavior. But that was back in 1997 (Open Group, Commands and Utilities, Issue 5).
Reference: http://www.opengroup.org/online-pubs-short?DOC=9693999599&FORM=PDF

Starting with Issue 6 (which was POSIX:2001), the wording was apparently changed to:

"sed cyclically shall append a line of input, less its terminating <newline>, into the pattern space. Normally the pattern space will be empty, unless a D command terminated the last cycle."

The "Change History" section of that document yields only one possible (vague) clue for this change:

"The EXTENDED DESCRIPTION is changed to align with the IEEE P1003.2b draft standard."
Reference: http://www.opengroup.org/onlinepubs/009695399/toc.htm

However, a fairly late draft of 1003.2b (D12, the last or next to last version) that I found still had the (presumably correct) wording:

"In default operation, sed cyclically shall copy a line of input, less its terminating <newline>, into a pattern space (unless there is something left after a D command)"

It would seem that the POSIX standard on this aspect of sed (behavior of the D command) has been broken for almost 10 years, due to some apparently unjustified and mysterious reason. It doesn't seem safe to use the D command portably until the standard is fixed.

This is a testcase:

    /[23]/q
    N
    D

with input

    1
    2
    3

should print "2" when following the GNU implementation, "2<newline>3" when following POSIX.
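The two outcomes claimed for the test case can be reproduced with a small model. The following Python sketch is not sed: it hard-codes the script /[23]/q; N; D, and the names run_testcase and always_append are hypothetical, chosen only for illustration. It assumes that N quits without output when no input remains, a case this particular input never reaches.

```python
import re

def run_testcase(lines, always_append):
    """Model the sed script  /[23]/q; N; D  and return what it prints.

    always_append=True  models the Issue 6/7 wording: every cycle appends
    a line of input to whatever D left in the pattern space.
    always_append=False models historical practice: after D the cycle
    restarts with the leftover pattern space and no line is read.
    """
    pending = list(lines)   # input lines not yet consumed
    space = ""              # pattern space
    skip_read = False       # True when D restarted the cycle without a read
    while True:
        if not skip_read:
            if not pending:
                return ""                    # input exhausted, nothing printed
            line = pending.pop(0)
            space = line if space == "" else space + "\n" + line
        skip_read = False
        if re.search(r"[23]", space):        # /[23]/q
            return space                     # q prints the pattern space, quits
        if not pending:                      # N with no next line (assumption:
            return ""                        # quit without printing)
        space = space + "\n" + pending.pop(0)    # N
        space = space.split("\n", 1)[1]          # D: drop through first newline
        skip_read = not always_append

print(repr(run_testcase(["1", "2", "3"], always_append=False)))
print(repr(run_testcase(["1", "2", "3"], always_append=True)))
```

With always_append=False the model returns "2", and with always_append=True it returns "2\n3", matching the two results described in the report.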
Desired Action | Change back the extended description to "unless there is something left after a D command, sed cyclically shall copy a line of input, less its terminating <newline>, into a pattern space".
Tags | tc1-2008
Notes | |
(0000460) nick (manager) 2010-07-12 18:14 |
Issue 6 had several base documents, and some extensive editorial work was done during the ballot cycles to use uniform language (what became known as the great "shallification"). I suspect the root of the change was in this phase, and the change may well have been unintentional. |
(0000483) msbrown (manager) 2010-07-29 15:56 |
At line 104833, replace:

"In default operation, sed cyclically shall append a line of input, less its terminating <newline>, into the pattern space. Normally the pattern space will be empty, unless a D command terminated the last cycle."

with:

"In default operation, sed cyclically shall copy a line of input, less its terminating <newline> character, into a pattern space unless there is something left after a D command."
(0000484) ajosey (manager) 2010-07-29 16:00 |
As an informational note. The change occurred between Draft 5 and Draft 6 of the original Austin Group draft ~April 2001. Draft 5 was the 1003.2b merge and included the wording as noted. The Change Request report at:
http://www.opengroup.org/austin/docs/austin_75r1.txt
notes a change due to change [DST-1939] which was applied.

_____________________________________________________________________________
OBJECTION Enhancement Request Number 453
donnte@xxxxxxxxxxxxxx Bug in xcud5 Assorted (rdvk# 434)
[DST-1939] Mon, 5 Feb 2001 19:57:08 -0800
_____________________________________________________________________________
Accept_____ Accept as marked below_X___ Duplicate_____ Reject_____

Rationale for rejected or partial changes:

In default operation, sed cyclically shall append a line of input, less its terminating <newline>, into the pattern space. Normally the pattern space will be empty, unless a D command terminated the last cycle. The sed utility shall then apply in sequence all...
_____________________________________________________________________________
Page: 3047 Line: 32102 Section: sed

Problem:

In default operation, sed cyclically shall copy a line of input, less its terminating <newline>, into a pattern space (unless there is something left after a D command), apply in sequence all

This is unclear: "unless" what? I *think* it's trying to say the following.

Action:

In default operation, sed cyclically shall append a line of input, less its terminating <newline>, into a pattern space. Normally the pattern space will be empty, but if a D command has been used it may not be empty. It shall then apply in sequence all...
(0000524) bonzinip (reporter) 2010-08-16 19:35 |
Thanks for the information! It was now brought to my attention that even the original text of Issue 5 is not correct. The right wording would be:

"In default operation, sed cyclically shall copy a line of input, less its terminating <newline> character, into a pattern space. This step shall however be skipped whenever the last executed command was a D command, and the command found no newline in pattern space."

The reason is that the wording in POSIX introduces an unwanted difference in the behavior of the D command, depending on whether the last line appended to pattern space was empty or not. For example, take the following sed invocation:

(1) $ sed -ne 'N;=;D'

GNU sed and BSD sed both implement the wording I suggested, so the POSIX behavior could be implemented with

(2) $ sed -ne 'N;=;/\n$/d;D'

in both GNU sed and BSD sed. Now, given the two inputs "\n\n\n" and "a\na\na\n", the output should be the same, since the overall structure of the file is the same (the only difference is whether lines are empty or not). However:

- for "a\na\na\n" both scripts will output "2\n3\n";
- for "\n\n\n", script 1 will output "2\n3\n", while script 2 will output "2\n"

This shows how the output of script 1 is more coherent. Thanks!
(0000525) bonzinip (reporter) 2010-08-16 19:36 |
Of course, the modification should be "In default operation, sed cyclically shall copy a line of input, less its terminating <newline> character, into a pattern space. This step shall however be skipped whenever the last executed command was a D command, and the command found a newline in pattern space." I apologize for the confusion. |
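The difference between the Issue 5 rule ("something left after a D command") and the corrected rule in this note (skip the read when D found a newline) can be reproduced with a small model of script (1), sed -ne 'N;=;D'. This is a Python sketch rather than a sed implementation. The names run_n_eq_d, issue5_rule and newline_rule are hypothetical. It assumes the input ends with a newline and that N quits silently when no further input exists.

```python
def issue5_rule(before, after):
    """Skip the read if something is left in the pattern space after D."""
    return after != ""

def newline_rule(before, after):
    """Skip the read if D found a newline in the pattern space."""
    return "\n" in before

def run_n_eq_d(text, skip_read_after_d):
    """Model  sed -ne 'N;=;D'  and return the text printed by '='."""
    pending = text.split("\n")[:-1]   # "a\na\na\n" -> ["a", "a", "a"]
    lineno = 0
    printed = []
    space, skip = "", False
    while True:
        if not skip:
            if not pending:
                break
            space = pending.pop(0)            # normal start of cycle: read a line
            lineno += 1
        if not pending:                       # N with no next line: quit, and
            break                             # -n suppresses any output
        space += "\n" + pending.pop(0)        # N
        lineno += 1
        printed.append(str(lineno))           # =
        before = space
        space = space.split("\n", 1)[1]       # D: drop through the first newline
        skip = skip_read_after_d(before, space)
    return "".join(n + "\n" for n in printed)

for data in ("a\na\na\n", "\n\n\n"):
    print(repr(data),
          repr(run_n_eq_d(data, issue5_rule)),
          repr(run_n_eq_d(data, newline_rule)))
```

Under newline_rule both inputs give "2\n3\n". Under issue5_rule the input of three empty lines gives only "2\n", which is the same output the note reports for script (2) in existing implementations.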
(0000526) geoffclare (manager) 2010-08-17 08:31 |
There is also a related problem with the description of the D command. It says:

"Delete the initial segment of the pattern space through the first <newline> and start the next cycle."

It doesn't say what happens if there is no <newline> in the pattern space.
(0000532) eblake (manager) 2010-08-19 18:35 edited on: 2010-08-26 15:20 |
Based on the additional comments, the fixed wording should be as follows.

At line 104833, replace:

In default operation, sed cyclically shall append a line of input, less its terminating <newline>, into the pattern space. Normally the pattern space will be empty, unless a D command terminated the last cycle. The sed utility shall then apply in sequence all commands whose addresses select that pattern space, and at the end of the script copy the pattern space to standard output (except when −n is specified) and delete the pattern space.

with:

In default operation, sed cyclically shall append a line of input, less its terminating <newline> character, into the pattern space. Reading from input shall be skipped if a <newline> was in the pattern space prior to a D command ending the previous cycle. The sed utility shall then apply in sequence all commands whose addresses select that pattern space, until a command starts the next cycle or quits. If no commands explicitly started a new cycle, then at the end of the script the pattern space shall be copied to standard output (except when −n is specified) and the pattern space shall be deleted.

At line 104926, replace:

[2addr]D
Delete the initial segment of the pattern space through the first <newline> and start the next cycle.

with:

[2addr]D
If the pattern space contains no <newline>, delete the pattern space and start a normal new cycle as if the d command was issued. Otherwise, delete the initial segment of the pattern space through the first <newline>, and start the next cycle with the resultant pattern space and without reading any new input.
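The revised D text reduces to a two-way branch on whether the pattern space contains a <newline>. A minimal Python sketch, assuming the pattern space is modelled as a str. The helper name command_d is hypothetical. The second element of the result says whether the next cycle reads a line of input.

```python
def command_d(pattern_space):
    """Model the revised D command.

    Returns (new_pattern_space, read_input_next_cycle).
    """
    _head, newline, rest = pattern_space.partition("\n")
    if not newline:
        # No <newline>: behave like d, i.e. empty the pattern space and
        # start a normal cycle that reads the next line of input.
        return "", True
    # Otherwise keep what follows the first <newline> and restart the
    # cycle with it, without reading any new input.
    return rest, False

print(command_d("2"))       # no newline in the pattern space
print(command_d("1\n2"))    # newline present, "2" is kept
print(command_d("\n"))      # newline present, empty remainder is kept
```

The third call is the case raised in notes 0000524 and 0000525: the remainder is empty, yet input is still not read, because the decision depends on the <newline> that D found and not on what is left afterwards.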
(0000533) Don Cragun (manager) 2010-08-26 15:23 |
Interpretation response
------------------------
The standard states the behavior of the D command, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.

Rationale:
-------------
The current text does not match historic practice.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
Make the changes specified in Note: 0000532
(0000573) ajosey (manager) 2010-10-14 11:28 |
Interpretation approved 14 October 2010 |
Issue History | |||
Date Modified | Username | Field | Change |
2010-07-12 15:20 | bonzinip | New Issue | |
2010-07-12 15:20 | bonzinip | Status | New => Under Review |
2010-07-12 15:20 | bonzinip | Assigned To | => ajosey |
2010-07-12 15:20 | bonzinip | Name | => Paolo Bonzini |
2010-07-12 15:20 | bonzinip | Section | => sed |
2010-07-12 15:20 | bonzinip | Page Number | => ? |
2010-07-12 15:20 | bonzinip | Line Number | => ? |
2010-07-12 18:14 | nick | Note Added: 0000460 | |
2010-07-29 15:56 | msbrown | Page Number | ? => 394 |
2010-07-29 15:56 | msbrown | Line Number | ? => 104833 |
2010-07-29 15:56 | msbrown | Interp Status | => --- |
2010-07-29 15:56 | msbrown | Note Added: 0000483 | |
2010-07-29 15:56 | msbrown | Severity | Objection => Editorial |
2010-07-29 15:56 | msbrown | Status | Under Review => Resolved |
2010-07-29 15:56 | msbrown | Resolution | Open => Accepted As Marked |
2010-07-29 15:56 | msbrown | Final Accepted Text | => Note: 0000483 |
2010-07-29 16:00 | ajosey | Note Added: 0000484 | |
2010-08-16 19:35 | bonzinip | Note Added: 0000524 | |
2010-08-16 19:36 | bonzinip | Note Added: 0000525 | |
2010-08-17 08:31 | geoffclare | Note Added: 0000526 | |
2010-08-17 08:31 | geoffclare | Resolution | Accepted As Marked => Reopened |
2010-08-19 18:35 | eblake | Note Added: 0000532 | |
2010-08-26 15:20 | eblake | Note Edited: 0000532 | |
2010-08-26 15:23 | Don Cragun | Note Added: 0000533 | |
2010-08-26 15:24 | Don Cragun | Final Accepted Text | Note: 0000483 => Note: 0000533 |
2010-08-26 15:24 | Don Cragun | Status | Resolved => Interpretation Required |
2010-08-26 15:24 | Don Cragun | Resolution | Reopened => Accepted As Marked |
2010-08-26 16:33 | Don Cragun | Interp Status | --- => Pending |
2010-09-13 05:48 | ajosey | Interp Status | Pending => Proposed |
2010-09-24 16:18 | geoffclare | Tag Attached: tc1-2008 | |
2010-10-14 11:28 | ajosey | Interp Status | Proposed => Approved |
2010-10-14 11:28 | ajosey | Note Added: 0000573 | |
2013-04-16 13:06 | ajosey | Status | Interpretation Required => Closed |