Austin Group Defect Tracker

Aardvark Mark III


ID 0001276
Category [1003.1(2013)/Issue7+TC1] Shell and Utilities
Severity Objection
Type Error
Date Submitted 2019-07-30 13:11
Last Update 2019-08-22 06:37
Reporter stephane
View Status public
Assigned To
Priority normal
Resolution Open
Status New
Name Stephane Chazelas
Organization
User Reference
Section 2.10.2 shell grammar rules
Page Number
Line Number
Interp Status ---
Final Accepted Text
Summary 0001276: incorrect resolution in 0000839
Description As already raised in 0001094 and 0001100 (both now closed as rejected; the origin in 0000839 had not been identified at the time), the resolution of 0000839 broke rule 7.

Now 7a becomes redundant, and 7b is no longer useful to qualify the difference between a cmd_name and a cmd_word.

The difference, as seen in earlier versions of the spec, was to specify that keywords are not to be recognised as such when following redirections or assignments. With the 839 change, "a=1 for bar" can no longer be parsed as a simple command: rule 7b now defers to rule 1, which means "for" no longer gives a WORD token. That defeats the point of having a cmd_word vs cmd_name distinction.

If the point of 839 was to allow implementations to have keywords that contain = characters, then only 7a should have been modified to say: "if the token is a reserved word, return the token for that reserved word, otherwise apply 7b".
Desired Action Either undo the change for 839 (except for the "else" part) or go with the simpler/clearer grammar approach (at least when it comes to this particular issue) suggested in 0001100 or change the whole rule 7 to:


    7. [Assignment preceding command name]

         a. [When the first word]

            If the TOKEN is exactly a reserved word, the token identifier for that reserved word shall result. Otherwise, 7b shall be applied.

         b. [Not the first word]

            If the TOKEN contains an unquoted (as determined while applying rule 4 from Token Recognition) <equals-sign> character that is not
            part of an embedded parameter expansion, command substitution, or arithmetic expansion construct (as determined while applying rule
            5 from Token Recognition):

               • If the TOKEN begins with '=', then the token WORD shall be returned.

               • If all the characters in the TOKEN preceding the first such <equals-sign> form a valid name (see XBD Name), the token
                 ASSIGNMENT_WORD shall be returned.

               • Otherwise, it is unspecified whether the WORD or ASSIGNMENT_WORD is returned.

            Otherwise, the token WORD shall be returned.

       Assignment to the name within a returned ASSIGNMENT_WORD token shall occur as specified in Simple Commands.
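
The 7b bullets above can be sketched as a small POSIX sh function. This is only an illustration, not an implementation of the token recognition rules: the helper name classify_7b is hypothetical, and it assumes the token contains no quoting and no embedded parameter expansion, command substitution, or arithmetic expansion, so every '=' in it counts as an unquoted <equals-sign>. The unspecified case is reported as the made-up label UNSPECIFIED.

```shell
# Hypothetical helper: classify a token along the lines of the proposed rule 7b.
# Assumption: the token has no quoting and no $..., $(...), `...` constructs.
classify_7b() {
    token=$1
    case $token in
        *=*) ;;                         # contains '=': go on to the bullets
        *)   echo WORD; return ;;       # no '=' at all: plain WORD
    esac
    case $token in
        =*) echo WORD; return ;;        # first bullet: begins with '='
    esac
    name=${token%%=*}                   # text preceding the first '='
    case $name in
        [!A-Za-z_]* | *[!A-Za-z0-9_]*)
            echo UNSPECIFIED ;;         # third bullet: WORD or ASSIGNMENT_WORD
        *)
            echo ASSIGNMENT_WORD ;;     # second bullet: valid name before '='
    esac
}

classify_7b foo=bar
classify_7b =foo
classify_7b for
classify_7b 1a=b
```

Under the proposed 7a, a token such as "for" would only reach this classification when it is not in the first-word position, which is what lets "a=1 for bar" remain a simple command.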
Tags No tags attached.
Attached Files

- Relationships
related to 0001100 Closed 1003.1(2016)/Issue7+TC2 Rewrite of Section 2.10 Shell Grammar, of the Shell Standard, to fix previous reports, fix new issues, and improve presentation.
child of 0000839 Closed 1003.1(2013)/Issue7+TC1 problems with reduction of WORD to ASSIGNMENT_WORD

-  Notes
(0004501)
stephane (reporter)
2019-07-30 13:16

Note that rule 8 (used in the fname production) defers to 7, but doesn't really need to.

What matters there is that a NAME token is not returned when the token doesn't form a valid variable name, so whether rule 7 classifies it as WORD or ASSIGNMENT_WORD doesn't really matter as long as it's not the NAME token.
(0004508)
stephane (reporter)
2019-08-05 14:23

We may also want to rename "cmd_name" to something else as it's potentially misleading.

In

cmd arg

cmd is the WORD token identified as "cmd_name"

In


var=value < file cmd arg

cmd is identified as "cmd_word".

In those two examples, "cmd" is the name of the command being executed, but neither cmd_word nor cmd_name has to be the command's name. In $(echo cmd arg1) arg2, the cmd_name is $(echo cmd arg1) but the command's name is "cmd" (assuming the default value of $IFS). In dryrun=; $dryrun cmd arg, the cmd_name is $dryrun but the command name is "cmd".

The distinction between cmd_name and cmd_word here is about the token having different constraints depending on whether or not it is preceded by redirections/assignments (namely whether keywords are allowed).

Maybe "cmd_word_no_keyword" would be a better wording for "cmd_name".
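
The cases above can be tried out with a short script. This is only a sketch: the function cmd is a hypothetical stand-in for a real command of that name, and /dev/null replaces the "file" of the earlier example.

```shell
# Hypothetical stand-in for a command named "cmd".
cmd() { printf 'cmd ran with: %s\n' "$*"; }

# The cmd_name is the WORD $(echo cmd arg1); after expansion and field
# splitting (default $IFS) the command name is "cmd".
$(echo cmd arg1) arg2

# The cmd_name is the WORD $dryrun; it expands to nothing, so "cmd"
# ends up being the command name.
dryrun=
$dryrun cmd arg

# Here "cmd" is a cmd_word, because an assignment and a redirection precede it.
var=value < /dev/null cmd arg
```

In all three invocations the same function runs, even though the grammar labels the relevant token differently each time.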
(0004509)
stephane (reporter)
2019-08-05 14:23

One could also argue that forcing shells to interpret keywords as WORDs when preceded by assignments/redirections is not particularly useful.

Nobody's going to write:

foo=bar for arg

And expect that "for" to be looked up in $PATH.

On the other hand, a shell implementation may want to allow:

2> /dev/null [[ $a -eq $b ]]

Or

TIMEFMT=3 time cmd
...

Which would help with consistency, but is currently not allowed by POSIX as POSIX requires those to be interpreted as simple commands.
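
Whether a given shell treats "for" as a command name after an assignment can be checked with a sketch like the following. It assumes mktemp -d is available (it is not specified by POSIX) and creates a hypothetical executable named "for" in a temporary directory placed first in $PATH.

```shell
# Sketch: does the shell look up "for" in $PATH when it follows an assignment?
dir=$(mktemp -d) || exit 1

# Hypothetical executable named "for" that reports how it was called.
printf '%s\n' '#!/bin/sh' 'echo "external for: foo=$foo arg=$1"' > "$dir/for"
chmod +x "$dir/for"

# Run inside a command substitution (a subshell) and through eval, so that
# a shell that still sees the keyword and reports a syntax error does not
# abort the rest of this script.
result=$(PATH=$dir:$PATH; eval 'foo=bar for arg' 2>&1)
echo "$result"

rm -rf "$dir"
```

A shell that follows the current requirement runs the helper script as a simple command; a shell that recognises the keyword in that position would report a syntax error instead.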
(0004530)
geoffclare (manager)
2019-08-19 10:05
edited on: 2019-08-19 10:12

The new wording for rule 7 suggested in the desired action looks good to me (but noting it has further changes proposed in bug 0001279), although it is missing the word "token" in the third bullet item, which should be:
Otherwise, it is unspecified whether the token WORD or ASSIGNMENT_WORD is returned.


Re Note: 0004508 either I'm confused or you have the new names the wrong way round - cmd_word is the one that can't be a keyword. So a new name for cmd_name should imply that it can be a keyword, not that it can't.

Re Note: 0004509 how would the rule 7 wording in the desired action need to be changed if we want to allow either behaviour?

(0004532)
stephane (reporter)
2019-08-20 18:34

Re: Note: 0004530

About Note: 0004508, yes sorry my bad.

About Note: 0004509

That could be: change:

> simple_command : cmd_prefix cmd_word cmd_suffix
>                | cmd_prefix cmd_word
>                | cmd_prefix
>                | cmd_name cmd_suffix
>                | cmd_name
>                ;
> cmd_name       : WORD                   /* Apply rule 7a */
>                ;
> cmd_word       : WORD                   /* Apply rule 7b */
>                ;
> cmd_prefix     :            io_redirect
>                | cmd_prefix io_redirect
>                |            ASSIGNMENT_WORD
>                | cmd_prefix ASSIGNMENT_WORD

to (including 0001094 resolution (also included in 0001279)):

> simple_command : cmd_prefix cmd_word cmd_suffix
>                | cmd_prefix cmd_word
>                | cmd_prefix
>                | cmd_word cmd_suffix
>                | cmd_word
>                ;
> cmd_word       : WORD                   /* Apply rule 7 */
>                ;
> cmd_prefix     :            io_redirect
>                | cmd_prefix io_redirect
>                |            ASSIGNMENT_WORD   /* Apply rule 7 */
>                | cmd_prefix ASSIGNMENT_WORD   /* Apply rule 7 */
> cmd_suffix     :            io_redirect
>                | cmd_suffix io_redirect
>                |            WORD
>                | cmd_suffix WORD
and rule 7 to:

> 7. [Assignment preceding command name]
>
>    * If the TOKEN is exactly a reserved word, the token
>      identifier for that reserved word shall result.
>
>    * Otherwise
>
>      If the TOKEN contains an unquoted (as determined
>      while applying rule 4 from Token Recognition)
>      <equals-sign> character that is not part of an
>      embedded parameter expansion, command substitution,
>      or arithmetic expansion construct (as determined
>      while applying rule 5 from Token Recognition):
>
>        • If the TOKEN begins with '=', then the token
>          WORD shall be returned.
>
>        • If all the characters in the TOKEN preceding
>          the first such <equals-sign> form a valid name
>          (see XBD Name), the token ASSIGNMENT_WORD shall
>          be returned.
>
>        • Otherwise, it is unspecified whether the WORD
>          or ASSIGNMENT_WORD is returned.
>
>    * Otherwise, the token WORD shall be returned.
>
>    Assignment to the name within a returned ASSIGNMENT_WORD
>    token shall occur as specified in Simple Commands.

And remove the "Otherwise, rule 7 applies" from rule 8, which doesn't make much sense (rule 7 will never yield a NAME token). Or replace it with "Otherwise, an UNSPECIFIED token is returned" (see 0001279).


That would specify that "foo=bar for arg" is not any more a valid POSIX sh "simple command" than "for arg".

And foo=bar() { blah; } would still not be a valid sh function declaration, as foo=bar would still not be seen as a NAME token.
(0004534)
stephane (reporter)
2019-08-22 06:37

Note that 0000351 (about [command [-p]] export/readonly treating what looks like ASSIGNMENT_WORD specially) is related, but it doesn't seem like the proposed resolutions here would affect it.

That bug should be kept in mind when touching rule 7.

- Issue History
Date Modified Username Field Change
2019-07-30 13:11 stephane New Issue
2019-07-30 13:11 stephane Name => Stephane Chazelas
2019-07-30 13:11 stephane Section => 2.10.2 shell grammar rules
2019-07-30 13:16 stephane Note Added: 0004501
2019-07-30 14:26 eblake Relationship added child of 0000839
2019-07-30 14:27 eblake Relationship added related to 0001100
2019-08-05 14:23 stephane Note Added: 0004508
2019-08-05 14:23 stephane Note Added: 0004509
2019-08-19 10:05 geoffclare Note Added: 0004530
2019-08-19 10:12 geoffclare Note Edited: 0004530
2019-08-20 18:34 stephane Note Added: 0004532
2019-08-22 06:37 stephane Note Added: 0004534

