Austin Group Defect Tracker



ID 0001785
Category [Issue 8 drafts] Shell and Utilities
Severity Objection
Type Error
Date Submitted 2023-10-28 04:09
Last Update 2024-01-05 16:31
Reporter kre
View Status public
Assigned To
Priority normal
Resolution Accepted As Marked
Status Resolved
Product Version Draft 3
Name Robert Elz
Organization
User Reference
Section XCU 2.9.1.1
Page Number 2483
Line Number 80766-80778, 80790-80792
Final Accepted Text Note: 0006614
Summary 0001785: Conflict in specification of processing of declaration utilities
Description In XCU 2.9.1.1 bullet point 2, it is said:

     The first word (if any) that is not a variable assignment or redirection
     shall be expanded. If any fields remain following its expansion, the
     first field shall be considered the command name. If no fields remain,
     the next word (if any) shall be expanded, and so on, until a command name
     is found or no words remain.

All that is fine and boring, then it continues:

     If there is a command name and it is recognized as a declaration utility,
     then any remaining words after the word that expanded to produce the
     command name, that would be recognized as a variable assignment in
     isolation, shall be expanded as a variable assignment [...]

(it goes on to what all of that means, which is not important here).

Note the required sequence, "the first word shall be expanded" ... [If
there is one and] "it is recognised as a declaration utility" ... "shall be
expanded as a variable assignment" ...

There is nothing optional about what is specified there, first expand
the word(s), then having found the command name, check if it (the result
of the expansion) is a declaration utility, and if so do the special processing
that is to be required of such things.

But later, after the bullet points, at lines 80790-80792 (right at the
bottom of page 2483) it says:

    When determining whether a command name is a declaration utility, an
    implementation may use only lexical analysis.

That isn't what the previous text seems to require to me.

    It is unspecified whether assignment context will be used if the
    command name would only become recognized as a declaration utility
    after word expansions.

To me, that looks to be very explicitly specified, as quoted above.



Desired Action Reconcile this nonsense.

Best would be to delete the notion of "declaration utilities" completely,
or at least make them optional (unspecified whether such things work).

They're never needed, one can always simply write

    export FOO
    FOO=whatever-I-like

and the assignment will be handled as a var-assign, without any magic
special rules (those two statements can be written in either order, except
for "readonly" where the assignment must come first), or if you prefer,
the following also works

    FOO=whatever-I-like export FOO

if you really need to do it all in one statement.
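
As a minimal sketch of the two-statement form described above (the variable
names and the sample value are illustrative assumptions, not taken from the
report), the following shows that a plain assignment already keeps the value
intact, with no field splitting or pathname expansion, before the variable is
exported or made read-only:

```shell
# Illustrative value containing consecutive spaces and a glob character.
value='two  words *'

# Plain assignment followed by export.
FOO=$value
export FOO

# For export the order may be reversed: mark first, assign afterwards.
export BAR
BAR=$value

# For readonly the assignment has to come first.
BAZ=$value
readonly BAZ

# A child process sees the exported value unchanged.
child=$(sh -c 'printf %s "$FOO"')
```

None of this relies on export or readonly treating their arguments specially.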

This declaration utility nonsense (the special rules for arg processing)
was added just to pacify people who don't understand the order in which
the shell parses commands in general, and the syntax of the parts. What's
more, including it then leads to people wondering why (if we assume
a file named "foo bar" (without the quotes) exists in '.')

    dd if=~/foo* ...
or
    awk -v var=foo* ...

aren't parsed the same way, after all, they look just the same, your
average shell command line user has no idea what a "declaration utility"
might be. Much easier to explain that the special rules for things which
look like (and always are) var-assigns apply only to those which appear
before the command name, as soon as there is anything (other than a redirect)
all the special processing stops.
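
A small sketch of that rule, under some assumptions: the scratch directory
comes from mktemp -d (widely available, but not required by POSIX), the shell
has its default globbing behaviour, and the variable names are made up for
the illustration. Tilde handling is left out on purpose, since shells differ
there.

```shell
# Hypothetical scratch directory holding a single file named "foo bar".
dir=$(mktemp -d) && cd "$dir" || exit 1
: > 'foo bar'

# Ordinary argument: the whole word "var=foo*" is the pattern. No file name
# here starts with "var=", so the word is passed through unchanged.
set -- var=foo*
as_argument=$1

# Genuine variable assignment (before any command name): the value is not
# subject to pathname expansion or field splitting.
var=foo*

# Unquoted use of the variable as an argument: now the pattern is "foo*",
# which matches the file, and the match stays a single field.
set -- $var
matched=$1
count=$#

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```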

But in any case, this "shall do ..." followed immediately by "may be done
differently" needs to be fixed, one way or the other - either change bullet
point 2 to make all of what it says about finding declaration utilities
optional, or simply remove lines 80790-2, and require it to be implemented
as in bullet point 2.

[Aside: none of this means much to me, I have no intention of implementing
"declaration utilities" whichever scheme were to be adopted.]

Tags tc1-2024
Attached Files

- Relationships
related to 0001535 Applied Issue 8 drafts Poor description of declaration (all really) utility argument processing
related to 0001393 Applied Issue 8 drafts 'command' should not be treated as a declaration utility
related to 0000351 Applied ajosey 1003.1(2008)/Issue 7 certain shell special built-ins should expand arguments in assignment context

-  Notes
(0006557)
kre (reporter)
2023-10-28 05:48

This issue is very much related to 0001535 (the resolution to which
was applied in Feb 2022, which is well before Issue 8 Draft 3, so the
text from that bug resolution is what was considered here).

In 0001535 I pointed out this contradictory text, but that part of
the issue was completely ignored... It still needs fixing.
(0006559)
chet_ramey (reporter)
2023-10-30 14:07

Look at issue 1393 for a discussion about why it's acceptable to recognize such names lexically: so the parser can allow extended assignment syntax such as compound array assignment.

Since the NetBSD sh doesn't have those, you can ignore it.
(0006597)
geoffclare (manager)
2023-12-11 15:37

Suggested changes...

On page 2483 line 80769 section 2.9.1.1, change:
If there is a command name and it is recognized as a declaration utility, then any remaining words after the word that expanded to produce the command name, ...
to:
If there is a command name, the shell shall use one of the following methods to check whether the utility to be invoked is a declaration utility:
  • The value, prior to expansion, of the word that expanded to produce the command name is matched lexically against the names of declaration utilities.

  • The command name is matched lexically against the names of declaration utilities.
If the chosen method identifies the utility to be invoked as a declaration utility, then any remaining words after the word that expanded to produce the command name, ...

On page 2483 line 80778 section 2.9.1.1, change:
For all other command names, words after the word that produced the command name shall be subject only to regular expansion.
to:
If the utility to be invoked is not identified as a declaration utility, words after the word that produced the command name shall be subject only to regular expansion.

On page 2483 line 80790 section 2.9.1.1, delete:
When determining whether a command name is a declaration utility, an implementation may use only lexical analysis. It is unspecified whether assignment context will be used if the command name would only become recognized as a declaration utility after word expansions.
(0006601)
kre (reporter)
2023-12-11 23:19

Re Note: 0006597

The change proposed for line 80769 changes nothing, just adds more words.
Since the earlier sentences (lines 80766-80769) have already required
that the words, up to the command word, be expanded first, it really makes
no sense to allow the word that expanded to become the command word
to be compared lexically when the expanded form is already
available. That benefits no-one, and isn't what I believe that any shell
which does "lexical" detection of these commands does or wants.

That is, shells that do lexical matching are going to fail to recognise

    E= ; $E export foo=whatever

as a declaration utility, as "$E" looks to be the command word position
at lexical analysis time (that is, when parsing) as the assignment to E
might not even have been seen yet, and furthermore, might sometimes be
E=echo, as in a case like

     fn()
     {
          $E export foo=whatever
     }

     E=echo fn
     E= fn

The text you're proposing requires the parser to parse the contents of the
function in two different ways, once for each of the two invocations of fn.

I doubt that you're going to find any shells which implement it that way.

Shells which don't need different parsing to handle the declaration utilities
can make this work, as all that changes is the method by which they expand
the remaining args. Shells which do require different parsing cannot do the
"find the command word by expanding the words until something non-empty is
found, and then ..."
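
A runnable form of the fn() example, as a sketch only. The value is kept
deliberately free of spaces, tildes and glob characters, so the result is the
same whether or not the shell applies assignment context; with a value that
needs splitting or tilde expansion, the outcome is exactly the
shell-dependent part under discussion.

```shell
fn()
{
    # The command name here is known only after $E has been expanded.
    $E export foo=whatever
}

# $E expands to "echo": "export" is merely an argument and nothing is
# exported. The command substitution runs in a subshell anyway.
out=$(E=echo fn)

# $E expands to nothing: the next word, "export", becomes the command name.
E= fn
```

The same function body thus yields two different command names at run time,
which a purely lexical check made while parsing fn cannot know.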

The other two proposed changes are OK I believe, though the text being deleted
in the third change (line 80790) better describes the requirements of shells
which do only lexical analysis.


I would still very much like to change the word "shall" in line 80722 into "may"
though, that is

   shall be expanded as a variable assignment
into
    may be expanded as a variable assignment

That allows shells that believe this is required (for some reason - that is,
that users cannot simply write the assignment, followed by export/readonly/
or in the case of "local" for shells that have it (all of them) local first,
and the assignment after, if variable assignment syntax is required) to
still do what they are doing now, and those of us who believe this is all a
waste of time, to continue ignoring it.

It would require portable scripts to use the two command approach, rather
than one, but in a script, that's a fairly painless requirement. Interactive
users can use whatever their shell permits, as always.
(0006612)
shware_systems (reporter)
2023-12-18 16:54

Note also that since it is the shell recognizing and processing these assignments during lexical analysis this disagrees with the grammar that requires them to be considered operands to the utility, as they don't parse as io redirects. Either the grammar needs updating to allow ASSIGNMENT_WORD in cmd_suffix productions or the onus for updating the current environment is up to each declaration utility that sees an assignment as operands, it looks to me.

I also think it needs to be explicit the check for if it's a cmd_name has to occur after alias processing, because the aliases may expand to a declaration utility name. This may be implicit already by describing it within XCU 2.9.1.1, but some implementations might try to do otherwise.

Desirable but not entirely necessary, I think the standard should require an implementation-defined means of being able to add names to any list of names of declaration utilities built into a shell. Otherwise only declaration utilities added to the standard can qualify as portable.
(0006614)
geoffclare (manager)
2024-01-04 16:54

On page 2483 line 80766 section 2.9.1.1, change:
The first word (if any) that is not a variable assignment or redirection shall be expanded. If any fields remain following its expansion, the first field shall be considered the command name. If no fields remain, the next word (if any) shall be expanded, and so on, until a command name is found or no words remain. If there is a command name and it is recognized as a declaration utility, then any remaining words after the word that expanded to produce the command name, that would be recognized as a variable assignment in isolation, shall be expanded as a variable assignment (tilde expansion after the first <equals-sign> and after any unquoted <colon>, parameter expansion, command substitution, arithmetic expansion, and quote removal, but no field splitting or pathname expansion); while remaining words that would not be a variable assignment in isolation shall be subject to regular expansion (tilde expansion for only a leading <tilde>, parameter expansion, command substitution, arithmetic expansion, field splitting, pathname expansion, and quote removal). For all other command names, words after the word that produced the command name shall be subject only to regular expansion. All fields resulting from the expansion of the word that produced the command name and the subsequent words, except for the field containing the command name, shall be the arguments for the command.
to:
The first word (if any) that is not a variable assignment or redirection, and any subsequent words, shall be processed as follows:

  1. The first word may be matched lexically against the names of declaration utilities.

  2. The first word shall be expanded.

  3. If any fields remain following expansion of the first word, the first field shall be considered the command name. If no fields remain, the next word (if any) shall be expanded, and so on, until a command name is found or no words remain.

  4. If the above optional matching against the names of declaration utilities was not performed and there is a command name, the command name shall be matched lexically against the names of declaration utilities.

  5. If whichever of the matching operations that was performed produced a successful match, any remaining words after the word that expanded to produce the command name, that would be recognized as a variable assignment in isolation, shall be expanded as a variable assignment (tilde expansion after the first <equals-sign> and after any unquoted <colon>, parameter expansion, command substitution, arithmetic expansion, and quote removal, but no field splitting or pathname expansion); while remaining words that would not be a variable assignment in isolation shall be subject to regular expansion (tilde expansion for only a leading <tilde>, parameter expansion, command substitution, arithmetic expansion, field splitting, pathname expansion, and quote removal). If the matching operation did not produce a successful match, words after the word that produced the command name shall be subject only to regular expansion.

  6. All fields resulting from the expansion of the word that produced the command name and the subsequent words, except for the field containing the command name, shall be the arguments for the command.

[Note to the editor: use letters a, b, c, etc. for the above list.]
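
The choice left open by the list above (match the unexpanded first word in
step 1, or the command name in step 4) can be modelled roughly as follows.
This is a sketch, not how any particular shell implements it: the function
names are hypothetical, and only export and readonly are listed as
declaration utility names.

```shell
# Hypothetical helper: succeeds if its argument is exactly the name of a
# declaration utility. A real shell may recognise more names than these.
is_declaration_name()
{
    case $1 in
        export|readonly) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: uses_assignment_context MODE RAW_WORD COMMAND_NAME
# MODE "early" models the optional match on the unexpanded first word
# (step 1); any other MODE models the match on the command name obtained
# after expansion (step 4).
uses_assignment_context()
{
    if [ "$1" = early ]; then
        is_declaration_name "$2"
    else
        is_declaration_name "$3"
    fi
}

# Print "yes" or "no" for a given combination.
result()
{
    if uses_assignment_context "$@"; then echo yes; else echo no; fi
}

result early export export    # literal "export": a match either way
result early '$E' export      # matched before expansion: no match
result late '$E' export       # matched after expansion: a match
```

Under this model, a literal export is treated the same by both methods; the
two differ only when the command name appears through expansion.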


On page 2483 line 80790 section 2.9.1.1, delete:
When determining whether a command name is a declaration utility, an implementation may use only lexical analysis. It is unspecified whether assignment context will be used if the command name would only become recognized as a declaration utility after word expansions.
(0006616)
kre (reporter)
2024-01-05 16:31

That wording is better, but still misses the point (and the real reason
declaration utilities exist at all - when and how tilde expansion happens,
and field splitting doesn't, are minor side issues).

That is, for the shells which really need this concept (the ones with
extensions to POSIX - particularly arrays) the detection of what is a
declaration utility must be done before parse time (as part of tokenisation)
and cannot be deferred until the command is to be executed.

That's what the fn() example was intended to show (kind of) in Note: 0006601

That's what the old wording "may use only lexical analysis" was getting at,
it all happens (or is allowed to) before parsing of the command even starts,
so that the parser can use the correct rules to parse the command in question.
That is, the names of the declaration utilities are treated just like reserved
words, recognised at the same time as those are, so that "export" has a very
similar effect on the parsing as "if" or "case" would have.

While it is certainly true that a shell which implements only what POSIX says
must be implemented (in this area anyway) can implement it as described in
Note: 0006614 so that is technically sufficient - as shells that need more are
already outside the standard, it seems as if (given ksh is one of those shells)
that it might be reasonable to be a little more liberal in how the standard
says this has to be done - and perhaps explicitly allow for recognition of
the names of declaration utilities as if they were reserved words.

None of this makes any difference to me - the shell I'm currently looking
after doesn't have arrays (never will I hope) so doesn't need to fiddle the
syntax, and as that is the only good reason for declaration utilities to
exist as a thing, I won't be implementing them at all, without that syntax
requirement, they're just a waste of space.

- Issue History
Date Modified Username Field Change
2023-10-28 04:09 kre New Issue
2023-10-28 04:09 kre Name => Robert Elz
2023-10-28 04:09 kre Section => XCU 2.9.1.1
2023-10-28 04:09 kre Page Number => 2483
2023-10-28 04:09 kre Line Number => 80766-80778, 80790-80792
2023-10-28 05:48 kre Note Added: 0006557
2023-10-28 06:27 Don Cragun Relationship added related to 0001535
2023-10-28 06:28 Don Cragun Relationship added related to 0001393
2023-10-28 06:30 Don Cragun Relationship added related to 0000351
2023-10-30 14:07 chet_ramey Note Added: 0006559
2023-12-11 15:37 geoffclare Note Added: 0006597
2023-12-11 23:19 kre Note Added: 0006601
2023-12-18 16:54 shware_systems Note Added: 0006612
2024-01-04 16:54 geoffclare Note Added: 0006614
2024-01-04 16:56 geoffclare Final Accepted Text => Note: 0006614
2024-01-04 16:56 geoffclare Status New => Resolved
2024-01-04 16:56 geoffclare Resolution Open => Accepted As Marked
2024-01-04 16:57 geoffclare Tag Attached: tc1-2024
2024-01-05 16:31 kre Note Added: 0006616

