Austin Group Defect Tracker



ID 0001235
Category [1003.1(2016/18)/Issue7+TC2] Shell and Utilities
Severity Objection
Type Enhancement Request
Date Submitted 2019-03-09 00:46
Last Update 2019-11-14 14:31
Reporter stephane
View Status public
Assigned To
Priority normal
Resolution Accepted As Marked
Status Applied
Name Stephane Chazelas
Organization
User Reference
Section 2.9.4.3 Case Conditional Construct
Page Number
Line Number
Interp Status ---
Final Accepted Text Note: 0004434
Summary 0001235: explicitly prohibit strcmp fallback in case statement
Description (another follow up to bug:1190)

In the Bourne shell, ksh88 and ksh93:

    case [ab] in
      [ab]) echo match
    esac

outputs "match" which is quite surprising and dangerous as it could bypass input validations like:

    case $1 in
      [0123456789]) : OK;;
      *) echo >&2 not a decimal digit; exit 1;;
    esac
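
As a rough illustration of the concern, the sketch below probes whether the running shell has the fallback and shows what that would mean for the validation above. The function names has_strcmp_fallback and is_decimal_digit are hypothetical and exist only for this example.

```shell
# Hypothetical probe: succeeds only if the shell falls back to a literal
# comparison when the pattern itself does not match the word.
has_strcmp_fallback() {
  case '[ab]' in
    [ab]) return 0 ;;   # reachable only with the ksh88/ksh93-style fallback
    *)    return 1 ;;
  esac
}

# Hypothetical validator mirroring the input check shown above.
is_decimal_digit() {
  case $1 in
    [0123456789]) return 0 ;;
    *)            return 1 ;;
  esac
}

if has_strcmp_fallback; then
  echo 'fallback present: the string [0123456789] passes the digit check'
else
  echo 'no fallback: only a single character 0-9 passes the digit check'
fi
```

On a shell with the fallback, is_decimal_digit '[0123456789]' would succeed even though the argument is not a digit, which is the bypass described above.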
   
Possibly the rationale was to align with another (mis)feature introduced by the Bourne shell, where:

    rm [ab]

would remove the file named [ab] if no file matched the pattern, instead of cancelling the command as earlier sh implementations did (and as csh, tcsh, fish and zsh do).

POSIX currently doesn't allow that ksh88/ksh93 behaviour. But since it is a deviation from the reference implementation and since most certified systems whose shell is based on AT&T ksh still have that non-conformance, it would be nice to make it explicit that that behaviour is not allowed.
Desired Action Add a conformance test case that rejects that behaviour.

Add a rationale section stating something like:

The Bourne and Korn shells used to revert to a byte to byte comparison when wildcard patterns didn't match in a "case" statement. That behaviour was considered undesirable and is not allowed by this specification.
Tags tc3-2008

-  Notes
(0004294)
kre (reporter)
2019-03-11 02:17

In case it is not obvious from other notes (attached to other issues) and
from messages on the mailing list, I completely agree with this - any shell
which allows a strcmp() of the pattern and word to be considered a match
is simply abhorrent (regardless of how ancient this practice was). There is
no need for it - one can always simply do
    case word in
    ( pattern | "pattern" ) ... ;;
    esac
if it is intended to match a pattern as either a pattern or a string.
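
A minimal sketch of that idiom, assuming the pattern arrives in a variable: an unquoted expansion in a case pattern is treated as a pattern, while a quoted one is taken literally. The function name matches_pattern_or_literal is made up for illustration.

```shell
# Hypothetical helper: succeed if $1 matches $2 used as a pattern,
# or if $1 is exactly the string $2.
matches_pattern_or_literal() {
  case $1 in
    ( $2 | "$2" ) return 0 ;;   # unquoted: pattern; quoted: literal string
    ( * )         return 1 ;;
  esac
}

matches_pattern_or_literal a '[ab]'     && echo 'a: matched as a pattern'
matches_pattern_or_literal '[ab]' '[ab]' && echo '[ab]: matched as a string'
matches_pattern_or_literal c '[ab]'     || echo 'c: no match'
```

This keeps the literal match an explicit choice of the script rather than something the shell does silently.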

I might go a little further, and actually add text to the normative part
of the standard to explicitly outlaw this practice, something like

    The shell shall not treat a pattern as a string and apply an additional
    match against the pattern treated as if it contained no wildcard characters.

except with better wording.

In any case, we need to make it quite clear that shells which do this (whatever
their heritage) are non-conforming, and that it is not required of an
application (script) to attempt to defeat this behaviour, with code like

    case word in
    ( "pattern" ) the non-match code here;;
    ( pattern ) the matching code here;;
    ( * ) the non-match code here ... again;;
    esac

as that's revolting, and not always easy to accomplish. Even ;& and
such are no help at all for this.
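
For concreteness, here is one way such a defensive workaround could look for a digit check. It is only a sketch, and the function name is_decimal_digit_defensive is hypothetical.

```shell
# Hypothetical defensive validator: reject the literal text of the pattern
# before the real pattern is tried, so a strcmp fallback cannot accept it.
is_decimal_digit_defensive() {
  case $1 in
    ( '[0123456789]' ) return 1 ;;  # quoted: the pattern text itself, rejected first
    ( [0123456789] )   return 0 ;;  # the intended match
    ( * )              return 1 ;;  # the non-match code, duplicated
  esac
}

is_decimal_digit_defensive 3              && echo '3: accepted'
is_decimal_digit_defensive '[0123456789]' || echo '[0123456789]: rejected'
```

The rejection logic has to be written twice and kept in sync with the pattern, which is the burden scripts should not have to carry.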

The only thing in the description I disagree with is the labelling, as a
(mis)feature, of the glob behaviour of returning an unmatched pattern as a
literal string. Dealing with that is much easier than dealing with the
consequences of producing an error in the case of an unmatched pattern.
(0004299)
stephane (reporter)
2019-03-11 07:05

Note that in ksh93, the strcmp() seems to be done after backslash removal:

$ a='[a]' ksh -c 'case $a in $a) echo match; esac'
match
$ a='\a' ksh -c 'case $a in $a) echo match; esac'
$ a='[a]' b='[\a]' ksh -c 'case $a in $b) echo match; esac'
match
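
The three probes above can be wrapped in a small script to compare shells. This is a sketch only: probe_case_fallback is a hypothetical name, and each probe gains a default arm so that a non-match is reported explicitly. In a shell without the fallback, probes 1 and 3 are expected to report no match.

```shell
# Hypothetical helper: run the three probes in the shell named by $1 and
# report, per probe, whether the case statement matched.
probe_case_fallback() {
  shell=$1
  a='[a]' "$shell" -c \
    'case $a in $a) echo "1: match";; *) echo "1: no match";; esac'
  a='\a' "$shell" -c \
    'case $a in $a) echo "2: match";; *) echo "2: no match";; esac'
  a='[a]' b='[\a]' "$shell" -c \
    'case $a in $b) echo "3: match";; *) echo "3: no match";; esac'
}

# Probe the shell given as the first argument, defaulting to sh.
probe_case_fallback "${1:-sh}"
```

Passing ksh, or any other installed shell, as the argument would show how that shell behaves.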
(0004434)
geoffclare (manager)
2019-06-20 10:54

Suggested change:

On page 3744 line 128516 section C.2.9.4 Compound Commands, add a new paragraph:

Some historical shells would fall back to doing a byte to byte comparison with each pattern if the pattern matching rules did not produce a match. That behavior is not allowed by this standard because it allows user input to bypass input validations like:
    case $1 in
      [0123456789]) : OK;;
      *) echo >&2 not a decimal digit; exit 1;;
    esac

- Issue History
Date Modified Username Field Change
2019-03-09 00:46 stephane New Issue
2019-03-09 00:46 stephane Name => Stephane Chazelas
2019-03-09 00:46 stephane Section => 2.9.4.3 Case Conditional Construct
2019-03-11 02:17 kre Note Added: 0004294
2019-03-11 07:05 stephane Note Added: 0004299
2019-03-11 15:53 eblake Interp Status => ---
2019-03-11 15:53 eblake Summary explicitely prohibit strcmp fallback in case statement => explicitly prohibit strcmp fallback in case statement
2019-06-20 10:54 geoffclare Note Added: 0004434
2019-06-20 15:28 geoffclare Final Accepted Text => Note: 0004434
2019-06-20 15:28 geoffclare Status New => Resolved
2019-06-20 15:28 geoffclare Resolution Open => Accepted As Marked
2019-06-20 15:28 geoffclare Tag Attached: tc3-2008
2019-11-14 14:31 geoffclare Status Resolved => Applied

