Austin Group Defect Tracker

ID 0000325
Category [1003.1(2008)/Issue 7] Shell and Utilities
Severity Objection
Type Omission
Date Submitted 2010-09-29 15:09
Last Update 2013-04-16 13:06
Reporter eblake
View Status public
Assigned To ajosey
Priority normal
Resolution Accepted As Marked
Status Closed
Name Eric Blake
Organization Red Hat
User Reference ebb.tr
Section tr
Page Number 3247
Line Number 108366
Interp Status ---
Final Accepted Text Note: 0000575
Summary 0000325: tr and m4 translit behavior when string1 has repeated characters
Description The description for sed's y operator states explicitly that repeating
a character in the first argument of a transliteration operation has undefined
results: "if any of the characters in string1 appear more than once, the
results are undefined." (line 105011)

However, there is no corresponding text for either tr or m4's translit
built-in. And while sed leaves the behavior undefined, existing practice for
both tr and m4 is consistent across a large set of implementations. Oddly
enough, though, the two programs have historically differed in their
interpretation: tr favors the last instance of a character in string1, while
m4 favors the first:

$ echo a | tr aa 01
1
$ echo 'translit(a,aa,01)' | m4
0
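
The difference between the two conventions can be modeled with a small Python
sketch. This is an illustration only, not the code of any tr or m4
implementation: the function name translit and its prefer parameter are
hypothetical, and both strings are assumed to be already expanded to plain
character arrays of equal length (no ranges or character classes).

```python
def translit(text, src, dst, prefer="last"):
    """Replace each character of text found in src with the character
    at the same position in dst.

    prefer="last"  models the historical tr behavior (last duplicate wins).
    prefer="first" models the historical m4 translit behavior.
    Hypothetical helper for illustration; src and dst must have equal length.
    """
    if len(src) != len(dst):
        raise ValueError("this sketch assumes src and dst have equal length")
    table = {}
    for s, d in zip(src, dst):
        if prefer == "first" and s in table:
            continue  # keep the mapping from the first occurrence
        table[s] = d  # otherwise a later occurrence overwrites an earlier one
    return "".join(table.get(c, c) for c in text)


print(translit("a", "aa", "01", prefer="last"))   # prints 1, like tr above
print(translit("a", "aa", "01", prefer="first"))  # prints 0, like m4 above
```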

The proposal below gives two alternatives. I would prefer the first, which
mandates existing practice, but it may be deemed worthwhile to use the second,
which is more conservative and mirrors sed's wording.

Meanwhile, 'tr -s a' falls into the category of invocations where -d is not
specified, but since it has no string2 argument, the text at line 108366 does
not apply to it. Both options in the proposal fix the wording to make this
clear. The wording for m4 assumes the resolution for 0000242 has been applied
first.

This proposal does not change the wording for sed, but it appears that most
sed implementations match the tr behavior of favoring the last instance of a
character.
Desired Action Option 1:

At line 108366 (XCU tr EXTENDED DESCRIPTION), change:

Each input character found in the array specified by string1 shall be replaced
by the character in the same relative position in the array specified by
string2.

to:

If string2 is present, each input character found in the array specified by
string1 shall be replaced by the character in the same relative position in
the array specified by string2. If a character occurs more than once in
string1, the replacement shall be from the final position of that character.

At line 94344 (XCU m4 EXTENDED DESCRIPTION translit), change:

The defining text of the translit macro shall be the first argument with
every character that occurs in the second argument replaced with the
corresponding character from the third argument. If no replacement character
is specified for some source character because the second argument is longer
than the third argument, that character shall be deleted from the first
argument in translit's defining text.

to:

The defining text of the translit macro shall be the first argument with
every character that occurs in the second argument replaced with the
corresponding character from the third argument. If a character appears more
than once in the second argument, the replacement shall correspond to the
first instance of the character. If no replacement character is specified for
the first instance of a source character because the second argument is longer
than the third argument, that character shall be deleted from the first
argument in translit's defining text.
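
The Option 1 wording for m4 combines two rules: the first instance of a
repeated source character governs, and a source character whose first instance
has no counterpart in the third argument is deleted. A minimal Python sketch of
that reading follows. The name m4_translit_option1 is hypothetical, and the
sketch assumes plain character arguments with no range expressions.

```python
def m4_translit_option1(text, src, dst):
    """Model of the Option 1 wording for m4 translit (illustrative only).

    text, src and dst stand for the first, second and third arguments.
    """
    table = {}
    for i, s in enumerate(src):
        if s in table:
            continue  # a later duplicate never overrides the first instance
        # None marks "no replacement available": the character is deleted.
        table[s] = dst[i] if i < len(dst) else None
    out = []
    for c in text:
        if c not in table:
            out.append(c)          # not a source character: copied unchanged
        elif table[c] is not None:
            out.append(table[c])   # replaced by the corresponding character
        # else: the character is dropped from the result
    return "".join(out)


# The second 'a' has no replacement, but the first one maps to '0'.
print(m4_translit_option1("abc", "aba", "01"))    # prints 01c
# The first 'a' has no replacement, so every 'a' is deleted.
print(m4_translit_option1("banana", "nan", "x"))  # prints bxx
```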



Option 2:

At line 108366 (XCU tr EXTENDED DESCRIPTION), change:

Each input character found in the array specified by string1 shall be replaced
by the character in the same relative position in the array specified by
string2. When the array specified by string2 is shorter that the one specified
by string1, the results are unspecified.

to:

If string2 is present, each input character found in the array specified by
string1 shall be replaced by the character in the same relative position in
the array specified by string2. If the array specified by string2 is shorter
than the one specified by string1, or if a character occurs more than once in
string1, the results are unspecified.

After line 94349 (XCU m4 EXTENDED DESCRIPTION translit), add another sentence:

The behavior is unspecified if the same character appears more than once in
the second argument.
Tags tc1-2008

-  Notes
(0000575)
msbrown (manager)
2010-10-14 15:44

The WG will be applying Option #2 from the Desired Action field; this will both clarify that this behavior is indeed undefined in the specification and also clean up the text for tr in this case.
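
With Option 2 applied, a script that wants portable results would need to avoid
the two unspecified cases for tr: a string2 array shorter than string1, and a
character repeated in string1. The following Python sketch shows one way such a
check could look. The helper names are hypothetical, and the inputs are assumed
to be the already-expanded character arrays rather than tr's range or class
syntax.

```python
from collections import Counter


def repeated_characters(string1):
    """Return the characters occurring more than once in string1,
    in order of first appearance (hypothetical helper)."""
    counts = Counter(string1)
    return [c for c in dict.fromkeys(string1) if counts[c] > 1]


def has_specified_result(string1, string2):
    """True when neither unspecified case from the Option 2 wording applies."""
    if len(string2) < len(string1):
        return False  # string2 array shorter than string1 array
    return not repeated_characters(string1)


print(repeated_characters("aa"))          # prints ['a']
print(has_specified_result("aa", "01"))   # prints False
print(has_specified_result("ab", "01"))   # prints True
```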

- Issue History
Date Modified Username Field Change
2010-09-29 15:09 eblake New Issue
2010-09-29 15:09 eblake Status New => Under Review
2010-09-29 15:09 eblake Assigned To => ajosey
2010-09-29 15:09 eblake Name => Eric Blake
2010-09-29 15:09 eblake Organization => Red Hat
2010-09-29 15:09 eblake User Reference => ebb.tr
2010-09-29 15:09 eblake Section => tr
2010-09-29 15:09 eblake Page Number => 3247
2010-09-29 15:09 eblake Line Number => 108366
2010-09-29 15:09 eblake Interp Status => ---
2010-10-14 15:44 msbrown Note Added: 0000575
2010-10-14 15:44 msbrown Resolution Open => Accepted As Marked
2010-10-14 15:46 msbrown Final Accepted Text => Note: 0000575
2010-10-14 15:46 msbrown Tag Attached: tc1-2008
2010-10-14 15:56 geoffclare Status Under Review => Resolved
2013-04-16 13:06 ajosey Status Resolved => Closed

