Austin Group Defect Tracker

ID 0001193
Category [1003.1(2016)/Issue7+TC2] Shell and Utilities
Severity Objection
Type Omission
Date Submitted 2018-04-23 15:26
Last Update 2018-04-26 09:00
Reporter geoffclare
View Status public
Assigned To
Priority normal
Resolution Open
Status New
Name Geoff Clare
Organization The Open Group
User Reference
Section 2.6, 2.7
Page Number 2353, 2360
Line Number 74994, 75303
Interp Status ---
Final Accepted Text
Summary 0001193: Brace expansion and {var}>file redirects in the shell
Description As far as I can tell, the standard does not permit the shell to perform brace expansion, as is current practice in some shells. I.e.:

echo {1,2}

must output:

{1,2}

Even if the expansion could produce something other than the original
word, it cannot produce multiple fields because of this statement in 2.6:
It is only field splitting or pathname expansion that can create multiple fields from a single word. The single exception to this rule is the expansion of the special parameter '@' within double-quotes, ...

Likewise, the standard does not permit the useful feature:
{var}>file
and related redirections because the grammar requires:
echo foo {var}>file
to be parsed such that {var} is a WORD to be expanded and passed to echo.
Desired Action On page 2346 line 74699 section 2.2 add:
{  ,  .  }
to the list of characters that might need to be quoted.

On page 2353 line 74994 section 2.6 change:
... expand to a single field. It is only field splitting or pathname expansion that can create multiple fields from a single word. The single exception to this rule is the expansion of the special parameter '@' within double-quotes, as described in [xref to 2.5.2].
to:
... shall expand to a single field, except as described below. The shell shall create multiple fields from a single word only as a result of field splitting, pathname expansion, or the following cases:
  1. Parameter expansion of the special parameter '@' within double-quotes, as described in [xref to 2.5.2], can create multiple fields from a single word.

  2. When the expansion occurs in a context where field splitting will be performed, a word that contains somewhere within it, before any expansions are applied, either:
    a. an unquoted <left-curly-bracket> ('{'), one or more unquoted <comma> (',') characters, and an unquoted <right-curly-bracket> ('}') in that order,

    or:

    b. one of the following sequences of characters, all unquoted:

    {start..end}

    {start..end..incr}

    where start and end are both either single characters or optionally signed decimal integers, and incr is an optionally signed decimal integer,
    and the <left-curly-bracket> ('{') is not immediately preceded by an unquoted <dollar-sign> ('$'), may be subject to an additional implementation-defined form of expansion that can create multiple fields from a single word. This expansion, if supported, shall be applied before all other expansions are applied. The other expansions shall then be applied to each field that results from this expansion.

  3. When a field that results from field splitting (see below) contains somewhere within it either:
    a. an unquoted <left-curly-bracket> ('{'), one or more unquoted <comma> (',') characters, and an unquoted <right-curly-bracket> ('}') in that order,

    or:

    b. one of the following sequences of characters, all unquoted:

    {start..end}

    {start..end..incr}

    where start and end are both either single characters or optionally signed decimal integers, and incr is an optionally signed decimal integer,
    may be subject to an additional implementation-defined form of expansion that can create multiple fields from a single field. This expansion, if supported, shall be applied immediately after field splitting has been applied in the sequence of standard expansions below. The remaining expansions shall then be applied to each field that results from this expansion.
Implementations may support only one of the implementation-defined forms of expansion described in 2 and 3 above, not both.

On page 2360 line 75303 section 2.7 add a new paragraph:
The shell may support an additional format used for redirection:

{name}redir-op word

where name is a valid shell variable name. If this format is supported its behavior is implementation-defined.

On Page 2376 line 75921 section 2.10.2 change:
[NAME in for]
to:
[NAME general case]

On Page 2380 line 76116 section 2.10.2 change:
io_redirect      :           io_file
                 | IO_NUMBER io_file
                 |           io_here
                 | IO_NUMBER io_here
                 ;
to:
io_redirect      :              io_file
                 | IO_NUMBER    io_file
                 | '{' name '}' io_file /* Optionally supported */
                 |              io_here
                 | IO_NUMBER    io_here
                 | '{' name '}' io_here /* Optionally supported */
                 ;

On XRAT page 3727 line 127859 section C.2.6 add a new first paragraph:
Some shells implement brace expansion which expands, for example, <tt>file{A,B,C}.c</tt> into the fields <tt>fileA.c</tt>, <tt>fileB.c</tt> and <tt>fileC.c</tt> or <tt>file{1..3}.c</tt> into the fields <tt>file1.c</tt>, <tt>file2.c</tt> and <tt>file3.c</tt>. This form of expansion is allowed but not required by this standard. It can be implemented at two different points in the standard expansion sequence: before all other expansions (as in bash) or following field splitting (as in ksh93).

On XRAT page 3735 line 128185 section C.2.7 add a new paragraph:
The limitation to 9 file descriptors is overcome in some shells via a form of redirection whereby a shell variable stores the file descriptor number. For example:
exec {fdvar}> foo
opens the file <tt>foo</tt> on a file descriptor greater than 9 and stores the file descriptor number in shell variable <tt>fdvar</tt>. (This can later be closed using <tt>exec {fdvar}>&-</tt>.) This form of redirection is allowed but not required by this standard.
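
For comparison, a minimal sketch of the portable counterpart that the rationale alludes to: the script must pick a single-digit descriptor number itself instead of letting the shell allocate one and store it in a variable. The function name write_via_fd3, the choice of descriptor 3 and the scratch file name are assumptions made for illustration only.

```shell
# Hypothetical helper: writes two lines to the file named by $1 through
# a hard-coded descriptor. The body runs in a subshell, so descriptor 3
# of the calling shell is left untouched.
write_via_fd3() (
    exec 3> "$1"            # open the file on descriptor 3
    echo "first line" >&3
    echo "second line" >&3
    exec 3>&-               # close descriptor 3 again
)

out_file=${TMPDIR:-/tmp}/fd3_demo.$$    # scratch file for the demo
write_via_fd3 "$out_file"
cat "$out_file"
rm -f "$out_file"
```

With the optional {fdvar} form the number 3 would not appear in the script at all; the shell would choose a free descriptor and the script would refer to it through the variable.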

On XRAT page 3747 line 128638 section C.2.10.2 add a new first paragraph:
The optional redirection syntax:

{name}redir-op word

(see [xref to XCU 2.7]) is accommodated in the grammar rules by two optional elements in <tt>io_redirect</tt>. Without these, the grammar would not permit this form of redirection because it would require that, for example, <tt>echo {var}> foo</tt> is parsed such that <tt>{var}</tt> is a WORD to be expanded and passed to echo.

- Relationships
related to 0001123 (Resolved) 1003.1(2013)/Issue7+TC1 Problematic specification of execution environment for word expansions

-  Notes
(0003989)
stephane (reporter)
2018-04-23 17:47

Oh! I hadn't realised that:

$ a="x{1,2}" ksh93 -c 'echo $a'
x1 x2


One more reason to quote all expansions.

The "result of field splitting" part in "3" suggests that:

echo "{a,b}"

could output something other than {a,b}. Note that while:

echo {a","b}

disables brace expansion in all shells,

echo {a".."b}

doesn't in all (with ksh93, that seems to depend on the quoting operator).

See also ksh93's {1..10..2%02d}

About:

> {name}redir-op word

One can't restrict "name" to valid variable names as bash and ksh93 support
{array[1]}> file, also:

$ zsh -c 'exec {1}>&2; echo $1'
12
$ ksh -c 'exec {@@@}>&2; echo $1'
ksh: @@@: invalid variable name


So the "{...}" needs to be quoted for a lot more cases than just "{name}".
(0003990)
stephane (reporter)
2018-04-23 18:12

> Oh! I hadn't realised that:
>
> $ a="x{1,2}" ksh93 -c 'echo $a'
> x1 x2

That's also the case in pdksh.

See also:

$ ksh +o braceexpand -c 'echo $(echo "{a,b}")'
a b
$ a='{a,b}' ksh +o braceexpand -c 'echo x$a'
xa xb


In my opinion that pdksh/ksh93 behaviour should be treated as a bug: it can't be disabled, which renders word splitting and filename generation upon expansions useless.

To me, brace expansion should not occur unless the "{" and "}" are literal and unquoted.
(0003991)
stephane (reporter)
2018-04-23 20:58

Oh, yet another surprise, in both pdksh and ksh93, brace expansion upon other expansions is actually disabled with the noglob option (set -f).


noglob disables literal brace expansions in pdksh but not in ksh93.

It's hard to make any sense of the ksh93 behaviour.

$ a={a,b} ksh93 -fc 'echo $a'
{a,b}
$ a={a,b} mksh -fc 'echo $a'
{a,b}
$ a={a,b} mksh -fc 'echo {a,b}'
{a,b}
$ a={a,b} ksh93 -fc 'echo {a,b}'
a b
(0003992)
stephane (reporter)
2018-04-23 21:18
edited on: 2018-04-23 21:22

Which raises the question: how do you store in a variable a glob pattern that is meant to match files whose names start with {,} in ksh93?

var='{,}*'; ls -ld $var is like ls -ld * *
var='\{\,\}*'           is like ls -ld '\\'* '\\'*
var='[{][,][}]*'        is like ls -ld [][]* [][]*


It doesn't look like there's any way around it.

(0003993)
geoffclare (manager)
2018-04-24 09:00

Re: Note: 0003992: it looks like you have to resort to eval:

var='"{,}"*'; eval "ls -ld $var"

It seems to be a consequence of the design decision to do brace expansion after field splitting in ksh93. I doubt if it would be a problem in practice.
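
A rough sketch of that eval approach against a scratch directory, for anyone who wants to try it. The function name list_brace_named and the demo file names are made up for illustration; the only part taken from the note is the quoted pattern and the eval.

```shell
# Hypothetical helper: lists the entries of directory $1 whose names
# start with the literal characters {,} using the eval workaround.
# The body is a subshell so the cd does not affect the caller.
list_brace_named() (
    cd "$1" || exit 1
    var='"{,}"*'            # the braces and comma are quoted inside the value
    eval "ls -d $var"       # eval re-parses the quotes, so {,} stays literal
)

demo_dir=${TMPDIR:-/tmp}/brace_demo.$$   # scratch directory for the demo
mkdir -p "$demo_dir" &&
    : > "$demo_dir/{,}one" &&
    : > "$demo_dir/other"
list_brace_named "$demo_dir"
rm -rf "$demo_dir"
```

Because the quoting lives inside the variable's value, the value is only safe to pass through eval when the script controls it completely.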
(0003995)
geoffclare (manager)
2018-04-25 15:26
edited on: 2018-08-10 15:50

New proposed changes based on email discussion...

On page 2346 line 74699 section 2.2 add:
{  ,  }
to the list of characters that might need to be quoted and add a small-font note after the list:
Note: a future version of this standard may extend the conditions under which these characters are special. Therefore applications should quote them whenever they are intended to represent themselves. This does not apply to <hyphen-minus> ('-') since it is in the portable filename character set.
(Note to the editor: this last sentence assumes that '-' is added by 0001191.)
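
As a small illustration of the quoting advice in the proposed note (a sketch only; the function name print_literals is hypothetical), each of the following words reaches printf with its braces and commas intact, whether or not the shell implements brace expansion:

```shell
# Hypothetical helper: prints words whose braces and commas are quoted,
# so they represent themselves in shells with or without brace expansion.
print_literals() {
    printf '%s\n' '{1,2}'     # single quotes around the whole word
    printf '%s\n' \{1\,2\}    # each special character backslash-escaped
    printf '%s\n' "{a,b}"     # double quotes around the whole word
}

print_literals
```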

On page 2353 line 74994 section 2.6 change:
... expand to a single field. It is only field splitting or pathname expansion that can create multiple fields from a single word. The single exception to this rule is the expansion of the special parameter '@' within double-quotes, as described in [xref to 2.5.2].
to:
... shall expand to a single field, except as described below. The shell shall create multiple fields or no fields from a single word only as a result of field splitting, pathname expansion, or the following cases:
  1. Parameter expansion of the special parameters '@' and '*', as described in [xref to 2.5.2], can create multiple fields or no fields from a single word.

  2. When the expansion occurs in a context where field splitting will be performed, a word that contains all of the following somewhere within it, before any expansions are applied, in the order specified:

    • an unquoted <left-curly-bracket> ('{') that is not immediately preceded by an unquoted <dollar-sign> ('$'),

    • one or more unquoted <comma> (',') characters or a sequence that consists of two adjacent <period> ('.') characters surrounded by other characters (which can also be <period> characters), and

    • an unquoted <right-curly-bracket> ('}')

    may be subject to an additional implementation-defined form of expansion that can create multiple fields from a single word. This expansion, if supported, shall be applied before all the other word expansions are applied. The other expansions shall then be applied to each field that results from this expansion.
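
To make the three conditions in item 2 easier to experiment with, here is a rough sketch of a checker. It rests on loud assumptions: the word is taken to contain no quoting characters at all (so every character counts as unquoted), the ".." condition is approximated as "at least one character on each side, between the braces", and the names may_brace_expand, word and dollar_brace are hypothetical. It only reports whether a word is a candidate; it does not perform any expansion.

```shell
# Hypothetical checker: succeeds if the raw word $1 meets the conditions
# under which the proposed text allows implementation-defined expansion.
# Assumption: $1 contains no quoting, so every character is unquoted.
may_brace_expand() {
    word=$1
    dollar_brace='${'
    # Neutralise every '{' that directly follows '$' by rewriting "${"
    # to "$_", so that such a brace cannot start a candidate.
    while :; do
        case $word in
            *"$dollar_brace"*)
                word=${word%%"$dollar_brace"*}'$_'${word#*"$dollar_brace"} ;;
            *) break ;;
        esac
    done
    case $word in
        *'{'*,*'}'*)    return 0 ;;   # '{', then a comma, then '}'
        *'{'*?..?*'}'*) return 0 ;;   # '{', then x..y, then '}'
    esac
    return 1
}

for w in 'file{A,B,C}.c' 'file{1..3}.c' '${var#,}' '{}'; do
    if may_brace_expand "$w"; then
        printf '%s: candidate\n' "$w"
    else
        printf '%s: not a candidate\n' "$w"
    fi
done
```

A real shell would make this decision while tokenising, with full knowledge of which characters are quoted, so this is only a way to try the wording on sample words.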

On page 2360 line 75303 section 2.7 add a new paragraph:
The shell may support an additional format used for redirection:

{location}redir-op word

where location is non-empty and indicates a location where an integer value can be stored, such as the name of a shell variable. If this format is supported its behavior is implementation-defined.

On Page 2375 line 75883 section 2.10.1 change:
2. [...] shall be returned.

3. Otherwise, the token identifier TOKEN results.
to:
2. [...] shall result.

3. If the string contains at least three characters, begins with a <left-curly-bracket> ('{') and ends with a <right-curly-bracket> ('}'), and the delimiter character is one of '<' or '>', the token identifier IO_LOCATION may result; if the result is not IO_LOCATION, the token identifier TOKEN shall result.

4. Otherwise, the token identifier TOKEN shall result.
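
The new rule 3 can be sketched as a shell function. This is an illustration under assumptions, not part of the proposal: the name io_location_candidate and its two-argument interface (token string, then the delimiter character that ended it) are hypothetical, and since the rule says IO_LOCATION "may" result, a true answer only means the token is eligible.

```shell
# Hypothetical helper: succeeds if token $1, delimited by character $2,
# satisfies the conditions of the proposed rule 3 for IO_LOCATION.
io_location_candidate() {
    case $2 in
        '<'|'>') ;;              # delimiter must be '<' or '>'
        *) return 1 ;;
    esac
    case $1 in
        '{'?*'}') return 0 ;;    # '{', at least one character, '}'
        *) return 1 ;;
    esac
}

for tok in '{fdvar}' '{array[1]}' '{}' 'fdvar'; do
    if io_location_candidate "$tok" '>'; then
        printf '%s: may be IO_LOCATION\n' "$tok"
    else
        printf '%s: TOKEN\n' "$tok"
    fi
done
```

The check deliberately accepts anything non-empty between the braces, which matches the later rationale that an invalid location can be parsed as part of an io_redirect and rejected afterwards.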

On Page 2377 line 75968 section 2.10.2 add:
%token IO_LOCATION

On Page 2380 line 76116 section 2.10.2 change:
io_redirect      :           io_file
                 | IO_NUMBER io_file
                 |           io_here
                 | IO_NUMBER io_here
                 ;
to:
io_redirect      :             io_file
                 | IO_NUMBER   io_file
                 | IO_LOCATION io_file /* Optionally supported */
                 |             io_here
                 | IO_NUMBER   io_here
                 | IO_LOCATION io_here /* Optionally supported */
                 ;

Cross-volume changes to XRAT...

On page 3718 line 127432 section C.2.2 change:
There is no additional rationale provided for this section.
to:
Although this section contains a note indicating that a future version of this standard may extend the conditions under which some characters are special, there are no plans to do so. The note is there to encourage application writers to future-proof their shell code. In some cases existing widespread use of the characters unquoted would preclude them being given a special meaning in those use cases. For example, commas are in widespread use in filenames (notably by RCS and CVS) and it is common to pass the token "{}" as an argument to find and xargs unquoted.

On page 3727 line 127859 section C.2.6 add a new first paragraph:
Some shells implement brace expansion which expands, for example, <tt>file{A,B,C}.c</tt> into the fields <tt>fileA.c</tt>, <tt>fileB.c</tt> and <tt>fileC.c</tt> or <tt>file{1..3}.c</tt> into the fields <tt>file1.c</tt>, <tt>file2.c</tt> and <tt>file3.c</tt>. This form of expansion is allowed but not required by this standard, but if supported must be performed before all of the standard word expansions. A variant which some shells implement whereby brace expansion is performed following field splitting was considered by the standard developers and rejected because it causes surprising behavior if the results of parameter expansion and command substitution happen to produce a valid brace expansion. For example, if the shell variable <tt>patt</tt> contains an arbitrary pathname glob pattern applications cannot rely on <tt>some_command -- $patt</tt> passing a list of pathnames that match the pattern to <tt>some_command</tt>. Note that quoting the braces or commas prevents this form of expansion, but quoting the periods need not prevent it.

On page 3735 line 128185 section C.2.7 add a new paragraph:
The limitation to 9 file descriptors is overcome in some shells via a form of redirection whereby a shell variable stores the file descriptor number. For example:
exec {fdvar}> foo
opens the file <tt>foo</tt> on a file descriptor greater than 9 and stores the file descriptor number in shell variable <tt>fdvar</tt>. (This can later be closed using <tt>exec {fdvar}>&-</tt>.) This form of redirection is allowed but not required by this standard.

On page 3746 line 128634 section C.2.10 add a new paragraph:
The optional redirection syntax:

{name}redir-op word

(see [xref to XCU 2.7]) is accommodated in the grammar rules by the optional IO_LOCATION token identifier and two correspondingly optional elements in <tt>io_redirect</tt>. Without these, the grammar would not permit this form of redirection because it would require that, for example, <tt>echo {var}> foo</tt> is parsed such that <tt>{var}</tt> is a WORD to be expanded and passed to echo. The grammar does not restrict the location given between the '{' and '}' in these forms (other than requiring it to be non-empty) since shells may parse an invalid location as part of an <tt>io_redirect</tt> and later treat the invalid location as an error.


(0003996)
stephane (reporter)
2018-04-25 16:54

Re: Note: 0003995

Thanks for rejecting the ksh93/pdksh behaviour (whereby brace expansion is performed upon the result of expansions) as I agree that's not workable.

A few notes on that new text:

> On page 2346 line 74699 section 2.2 add:
>
> { , }
>
> to the list of characters that might need to be quoted and add a small-font note after the list:
>
> Note: a future version of this standard may extend the conditions under which these characters are special. Therefore applications should quote them whenever they are intended to represent themselves.

Note that I don't expect anyone having to escape "," independently of "{" and "}". "," is common in file names (RCS) and in English text passed to echo (as in echo this, this and that).

It seems that as currently worded, the text requires one to quote the "{" or "," in {${var#,} though that's not required in any implementation.

It's not clear what should happen for

echo ${var#{a,b}}

either and there is some variation in practice:

~$ a='{a,b}c' bash -c 'echo ${a#{a,b}}'
}c}
~$ a='{a,b}c' ksh -c 'echo ${a#{a,b}}'
c
~$ a='{a,b}c' zsh -c 'echo ${a#{a,b}}'
c
~$ a='{a,b}c' mksh -c 'echo ${a#{a,b}}'
}c}
~$ a='{a,b}c' yash -c 'echo ${a#{a,b}}'
}c}
~$ a='{a,b}c' yash -o braceexpand -c 'echo ${a#{a,b}}'
}c}
(0003998)
chet_ramey (reporter)
2018-04-25 21:15

Re: http://austingroupbugs.net/view.php?id=1193#c3996

Bash doesn't perform brace expansion inside ${...}. It's been that way since bash-3.0 (change first made in August, 2002).
(0004001)
geoffclare (manager)
2018-04-26 09:00
edited on: 2018-04-27 09:34

Re Note: 0003996

> Note that I don't expect anyone having to escape "," independently of "{" and "}". "," is common in file names (RCS) and in English text passed to echo (as in echo this, this and that).

I still think comma needs to be in that list, because quoting the comma in {a","b} does prevent brace expansion and the standard should reflect this.

Perhaps the C.2.2 rationale change should include comma in filenames as another example of widespread use that would preclude it being given a special meaning in future. This would also preclude it being made special in the echo example because to the shell both cases are just an argument being passed to a utility.

Update: Note: 0003995 has been edited to include commas in filenames in the C.2.2 change.

> It seems that as currently worded, the text requires one to quote the "{" or "," in {${var#,} though that's not required in any implementation.

True, but I don't see that as a problem since the proposal includes advice to application writers to quote '{' everywhere (except in the word {}) if it is to represent itself, in order for scripts to be future-proof.

> It's not clear what should happen for
>
> echo ${var#{a,b}}
>
> either and there is some variation in practice

I think the variation is accommodated by the proposal through its use of "implementation-defined".


- Issue History
Date Modified Username Field Change
2018-04-23 15:26 geoffclare New Issue
2018-04-23 15:26 geoffclare Name => Geoff Clare
2018-04-23 15:26 geoffclare Organization => The Open Group
2018-04-23 15:26 geoffclare Section => 2.6, 2.7
2018-04-23 15:26 geoffclare Page Number => 2353, 2360
2018-04-23 15:26 geoffclare Line Number => 74994, 75303
2018-04-23 15:26 geoffclare Interp Status => ---
2018-04-23 15:32 geoffclare Desired Action Updated
2018-04-23 17:47 stephane Note Added: 0003989
2018-04-23 18:12 stephane Note Added: 0003990
2018-04-23 20:58 stephane Note Added: 0003991
2018-04-23 21:18 stephane Note Added: 0003992
2018-04-23 21:22 stephane Note Edited: 0003992
2018-04-24 09:00 geoffclare Note Added: 0003993
2018-04-25 15:26 geoffclare Note Added: 0003995
2018-04-25 15:30 geoffclare Note Edited: 0003995
2018-04-25 15:35 geoffclare Note Edited: 0003995
2018-04-25 16:54 stephane Note Added: 0003996
2018-04-25 21:15 chet_ramey Note Added: 0003998
2018-04-26 09:00 geoffclare Note Added: 0004001
2018-04-27 09:31 geoffclare Note Edited: 0003995
2018-04-27 09:32 geoffclare Note Edited: 0003995
2018-04-27 09:34 geoffclare Note Edited: 0004001
2018-05-01 10:19 geoffclare Note Edited: 0003995
2018-08-10 09:12 geoffclare Relationship added related to 0001123
2018-08-10 15:49 geoffclare Note Edited: 0003995
2018-08-10 15:50 geoffclare Note Edited: 0003995

