Austin Group Defect Tracker

Aardvark Mark IV

ID 0001813
Category [Issue 8 drafts] Shell and Utilities
Severity Editorial
Type Error
Date Submitted 2024-02-16 14:42
Last Update 2024-04-18 16:25
Reporter kre
View Status public
Assigned To
Priority normal
Resolution Accepted As Marked
Status Resolution Proposed
Product Version
Name Robert Elz
User Reference
Section XCU 3 / xargs
Page Number 3600-3603
Line Number 123177-8 123183 123184-6 123228-32 123233-7 123252-3 123263 123304-6
Final Accepted Text Note: 0006767
Summary 0001813: generic xargs description cleanups
Description Since "xargs" has "been in the news" recently... And while this issue is
specified to apply to Issue 8 Draft 4 (which is where the page/line numbers
are from) I would assume it would eventually be moved to Issue 8, after it
is published, and probably be considered for Issue 8 TC 1.

Most of this is just text that should be worded better, but there are a
few omissions that ought to be included.

Lines 123177-8 - two different issues here. First, arguments are said
to be delimited by an alternation of 3 terms - unquoted <blank>, unescaped
<blank> or newline. The problem is that nothing quoted can be escaped,
and nothing escaped can be quoted, hence any <blank> that appears is either
an unquoted <blank> or an unescaped <blank> and hence is a delimiter character
according to that definition.
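
The point can be checked by enumeration. This is only a sketch: it assumes a <blank> is in exactly one of three states (plain, quoted, escaped), and the function names are made up for illustration.

```python
# Each <blank> is in one of three states; it cannot be both quoted and escaped.
STATES = [
    {"quoted": False, "escaped": False},  # plain blank
    {"quoted": True, "escaped": False},   # inside '...' or "..."
    {"quoted": False, "escaped": True},   # preceded by a backslash
]

def delimits_as_drafted(state):
    # Draft wording: "unquoted <blank>" OR "unescaped <blank>".
    return (not state["quoted"]) or (not state["escaped"])

def delimits_as_intended(state):
    # Presumed intent: a <blank> that is neither quoted nor escaped.
    return (not state["quoted"]) and (not state["escaped"])

drafted = [delimits_as_drafted(s) for s in STATES]
intended = [delimits_as_intended(s) for s in STATES]
print(drafted)   # [True, True, True]
print(intended)  # [True, False, False]
```

Under the drafted wording every <blank> is a delimiter, whatever its state; only the "and" form excludes quoted and escaped ones.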

Second (same lines, but continuing perhaps) we have quoting, and escaping,
but nothing here actually says that the quoting " or ' characters, or the
escaping \ character are removed from the arg text - which is what I assume
should happen, ie: in "abc def'"'ghi\jkl'\ m I would assume the argument
string is to be abc def'ghi\jkl m with the quoting and escaping
characters removed. But nothing says that, one could also read the text
as including those chars in the arg string, with the quoting or escaping
simply preventing enclosed <blank> characters from being delimiters (and
quoted or escaped quote or escape characters from being quote or escape
characters).

Line 123183 "Any unquoted character" can be escaped, must include <newline>
as that's certainly a character, and isn't allowed to be quoted (incidentally
nothing says what is to be done if a <newline> appears after an initial
opening quote before its companion closing quote - that might be intended to
be treated as an error, or the <newline> might just terminate the quoting,
or the <newline> might just be a single unquoted character (which would be
a delimiter if -0 is not used) in the middle of otherwise quoted text, so
would be two args, "abc" (and any invisible here following <blank> chars),
and those leading <blank> chars followed by "def". Unlikely, but possible).

That was a side issue ... for line 123183, the issue is what does an escaped
<newline> mean - at line 123178 it doesn't say that only non-escaped <newline>
chars are delimiters, all <newline> characters are. About all I can see
about escaped <newline> is that the results are unspecified if the eof string
follows one of those. I might guess that an escaped <newline> is intended to
be removed from the input (both the escape and the <newline>) but at the minute
I don't see where anything says that.

Lines 123228-32: In the first bullet point, if -s is not specified, then
if number generates a command line longer than LINE_MAX but shorter than
{ARG_MAX}-2048 then it seems to imply that less than number args must be
used, to keep the command line length shorter than LINE_MAX. Why? That
seems like a thinko, lines 123202-3 just say that the default command line
length is at least LINE_MAX - in most cases it will be considerably larger.

Then, same lines, the second bullet point says that fewer args (than number)
shall be used if the last iteration has fewer than number operands remaining
(that makes sense) - but not if there are zero. A strict reading of that
would result in the interpretation that the last invocation in that case must
be padded to have number args ... (no idea how to accomplish that) - but
that's clearly not what it intended to say. More importantly, if there are
zero args remaining, the previous iteration would have been the last, this
one would not exist. Normally - there's still the case where the last
iteration is the first iteration, and -r was not given. In that case, one
would assume that the utility should be run with no args (as it would if there
was no -n option given) but that doesn't reconcile with the description of -n.

Lines 123233-7: There should be an XREF to XBD 3.7 attached to "an
affirmative response".

[Aside: I expect this is just because this is how existing implementations
behave, but if xargs is going to be opening /dev/tty to read the response
to the prompt, why is it not writing the trace output and prompt string
to /dev/tty instead of to stderr (which might have been redirected, in order
to redirect stderr for the utility invocations)? Very odd. Note this would
be just for -p tracing/prompt, if just -t is used, stderr is fine.]

Lines 123252-3: (-t) "Each generated command line shall be written to standard
error just prior to invocation" - Really? The "command lines" generated are
actually arg lists for exec, with the args terminated by nul bytes, and with
no delimiting text of any kind. That is to be written to stderr? Amazing.
Surely there should be spaces inserted between the args, and as it is a
"command line" one would assume that a newline should also be appended.
But what of <blank> characters that are not separators, should those be
escaped or something, or is the user just supposed to guess? And what
about embedded <newline> characters, which are possible if -0 was used.

Line 123263: "utility" is "The name of the utility to be invoked, found
by search path using the PATH"... Really? That's all that is permitted?
No fully specified paths to the utility, no relative paths to "." (like
"bin/command" - everything must be found by a search of PATH) ??

Lines 123304-6: "If -p is specified, a prompt of the following format
shall be written (in the POSIX locale)" ... What's that supposed to
mean? A stupidly literal reading might assume that the prompt is
somehow intended to be written somewhere into the POSIX locale (whatever
that might mean) but that's clearly not right, there's enough text elsewhere
to make it clear the output goes to stderr - but what does that parenthesised
phrase mean? Does it mean this specification only applies if the current
locale is POSIX, and what happens for other locales is unspecified? Or
perhaps it means that the current locale must be switched to POSIX to write
that string (and then presumably switched back again). In any case it
is not clear.
Desired Action For the first, perhaps change the alternation to just two terms, the first
being something like "an unquoted and unescaped <blank>" (and then "or
<newline>" just like now).

For the quoting stuff, explicitly say that the quoting and escaping characters
do not form part of the argument string (if they're quoted or escaped themselves
they're not quoting or escaping chars, so nothing really needs to be said
about that).

No idea what is intended to happen with an escaped <newline>, but I'd guess
something like "An escape <newline> pair of characters shall be removed from
the input and not delimit an argument string", assuming that is what is to happen.

Also say something, but here I have no idea what, about what happens if
quoting is ongoing when a <newline> is encountered.

For the paragraph at lines 123184-6, I'd write the paragraph starting at
line 123176 something more like (here I'll abbreviate, not include every
word, but I don't intend to change anything - just type less here!)

   If the -0 option is not specified, the application ... are delimited by
   a sequence of one or more unquoted and unescaped <blank> characters, or
   <newline> characters, adjacent delimiter characters shall be treated as
   a single delimiter (not produce empty arguments for the utility). Note
   that if the input is not empty and does not end in a <newline> the
   behaviour is undefined (because...). Quoting and escaping shall be
   interpreted as follows, with any quote or escape characters removed
   after they have been processed. (Then the three bullet points, with
   added text to explain what is to happen if a <newline> is (seems to be)

For the -n option, change the LINE_MAX in the first bullet point to be
"the default line length as described above", and for the second bullet
point, just say "for the last iteration, if there are fewer than number
operands remaining" - no need to mention the zero case, if there are none
left, then if there has already been an iteration, the previous one was
the last (the one that used the final operand), there is no next one with
zero operands - if there was no previous iteration, then we just do what
we'd do with no -n (which depends upon -r).

For lines 123233-7 add an xref to XBD 3.7. (Sure would be nice to change
the output to go to /dev/tty - maybe that could at least be made an option?)

For lines 123252-3 specify the format in which the lines are to be written,
which cannot just be xargs internal form, but I have never used -t, so I
have no idea what is actually done.

For line 123263 - I suspect that the intent is to specify the same rules
for finding the utility as execvp() (or execlp()) uses - perhaps simply say
that, and xref it. (Not the shell command search rules, they're way too
complex). That section xref's XBD 8 (page 167). The xref to add would
be to page 867 here (in I8 D4).

I can't suggest what to do with that "(in the POSIX locale)" as I have
absolutely no idea what it is intended to mean.
Tags tc1-2024
Attached Files

- Relationships

-  Notes
geoffclare (manager)
2024-04-18 16:24

Interpretation response
The standard is unclear on this issue, and no conformance distinction can be made between alternative implementations based on this. This is being
referred to the sponsor.


Notes to the Editor (not part of this interpretation):

On page 3600 line 123176, change:
the application shall ensure that arguments in the standard input are delimited by unquoted <blank> characters, unescaped <blank> characters, or <newline> characters

to:

the application shall ensure that arguments in the standard input are delimited by <blank> characters that are neither quoted nor escaped, or by unescaped <newline> characters

After page 3600 line 123183, add two bullet items:

  • Quoting and escaping characters shall not be included in the arguments passed to utility. Escaped <newline> characters shall be included in the arguments.

  • It shall be an error if an attempt is made to quote a <newline> character.
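
One possible reading of the input rules as amended above (no -0 option) can be sketched as a small splitter. This is an illustration, not a reference implementation: the function name is hypothetical, <blank> is assumed to mean space or tab, and the handling of a trailing backslash, of an unterminated quote, and of empty quotes (treated here as producing an empty argument) is an assumption the text does not settle.

```python
def split_xargs_input(text):
    """Split text into arguments following the amended rules (no -0)."""
    args = []
    current = []     # pieces of the argument being built
    in_arg = False   # True once any character or quote has been seen
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        if ch == "\\":
            # Escape: drop the backslash, keep the next character literally.
            # An escaped <newline> stays in the argument.
            if i + 1 >= n:
                raise ValueError("backslash at end of input (assumed error)")
            current.append(text[i + 1])
            in_arg = True
            i += 2
        elif ch in "'\"":
            # Quote: keep everything up to the matching quote, drop the quotes.
            end = text.find(ch, i + 1)
            if end == -1:
                raise ValueError("unterminated quote (assumed error)")
            body = text[i + 1:end]
            if "\n" in body:
                raise ValueError("attempt to quote a <newline>")
            current.append(body)
            in_arg = True
            i = end + 1
        elif ch in " \t\n":
            # Unquoted, unescaped <blank> or unescaped <newline>: delimiter.
            # Adjacent delimiters do not produce empty arguments.
            if in_arg:
                args.append("".join(current))
                current = []
                in_arg = False
            i += 1
        else:
            current.append(ch)
            in_arg = True
            i += 1
    if in_arg:
        args.append("".join(current))
    return args


# The example from the description: "abc def'"'ghi\jkl'\ m
example = split_xargs_input('"abc def\'"\'ghi\\jkl\'\\ m\n')
print(example)  # ["abc def'ghi\\jkl m"]
```

With this reading the example from the description yields the single argument abc def'ghi\jkl m, with the quoting and escaping characters removed.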

On page 3601 line 123231, change:
or {LINE_MAX} if there is no −s option

to:

or the default command line length if there is no −s option

On page 3601 line 123232, change:
The last iteration has fewer than number, but not zero, operands remaining.

to:

The last iteration has fewer than number operands remaining, or zero arguments were read from standard input (and the -r option is not specified).
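
The revised second bullet can be sketched as follows. This is a simplified model under stated assumptions: the function and parameter names are hypothetical, and the command line length limit (the first bullet) is ignored entirely.

```python
def batches(args, number, no_run_if_empty=False):
    """Yield the operand list for each invocation under -n number."""
    if not args:
        # Zero arguments were read: run the utility once with no operands,
        # unless -r was given.
        if not no_run_if_empty:
            yield []
        return
    for start in range(0, len(args), number):
        # The final slice is shorter when fewer than `number` operands remain.
        yield args[start:start + number]


print(list(batches(["a", "b", "c", "d", "e"], 2)))
# [['a', 'b'], ['c', 'd'], ['e']]
```

There is never a trailing iteration with zero operands after a non-empty one; the only empty invocation is the first and only one, when nothing was read and -r is absent.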

On page 3602 line 123263, change:
The name of the utility to be invoked, found by search path using ...

to:

The name of the utility to be invoked. If the name does not contain a <slash> character, the utility shall be found by search path using ...
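
The lookup rule in the new wording resembles what execvp() does, as the Desired Action suggests. A rough sketch, with a hypothetical function name and a simplified treatment of PATH (an unset PATH is treated as empty here, which is an assumption):

```python
import os

def find_utility(name, path=None):
    """Return the pathname to execute for `name`, or None if not found."""
    if "/" in name:
        # The name contains a <slash>: use it as given, with no PATH search.
        return name
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(":"):
        # An empty PATH entry stands for the current directory.
        candidate = os.path.join(directory or ".", name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


print(find_utility("bin/command"))  # bin/command
```

So a relative name such as bin/command or an absolute pathname is passed through unchanged, and only slash-free names are searched for.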

On page 3603 line 123302, change:
If the −t option is specified, the utility and its constructed argument list shall be written to standard error, as it will be invoked, prior to invocation.

to:

If the −t option is specified, the utility and its constructed argument list, with a <space> character preceding each argument and a terminating <newline>, shall be written to standard error, as it will be invoked, prior to invocation. Implementations may insert quoting and escaping characters in the output produced by -t such that the output (minus the utility name) can be unambiguously used as input to a subsequent xargs command and result in the same constructed argument list.
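
The trace format described in the new text could be produced along these lines. This is a sketch only: the function names are hypothetical, the backslash-escaping scheme is just one of the quoting choices an implementation may make, and empty arguments are not handled by the quoting helper.

```python
import sys

def quote_for_xargs(arg):
    """Backslash-escape characters that xargs input parsing treats specially."""
    out = []
    for ch in arg:
        if ch in " \t\n\\'\"":
            out.append("\\")
        out.append(ch)
    return "".join(out)

def trace_line(utility, args, quote=False):
    """Utility name, a <space> before each argument, then a <newline>."""
    if quote:
        args = [quote_for_xargs(a) for a in args]
    return utility + "".join(" " + a for a in args) + "\n"


# -t writes the line to standard error just before the invocation.
sys.stderr.write(trace_line("echo", ["a", "b c"], quote=True))
```

Without quoting, the arguments a and "b c" would print as echo a b c, which is the ambiguity the description complains about; with quoting the line reads echo a b\ c.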

On page 3603 line 123304, change:
a prompt of the following format shall be written (in the POSIX locale):

at the end of the line of the output from −t.

to:

a prompt shall be written at the end of the line of the output from −t. In the POSIX locale, the format of the prompt shall be:

- Issue History
Date Modified Username Field Change
2024-02-16 14:42 kre New Issue
2024-02-16 14:42 kre Name => Robert Elz
2024-02-16 14:42 kre Section => XCU 3 / xargs
2024-02-16 14:42 kre Page Number => 3600-3603
2024-02-16 14:42 kre Line Number => 123177-8 123183 123184-6 123228-32 123233-7 123252-3 123263 123304-6
2024-04-18 16:24 geoffclare Note Added: 0006767
2024-04-18 16:25 geoffclare Final Accepted Text => Note: 0006767
2024-04-18 16:25 geoffclare Status New => Resolution Proposed
2024-04-18 16:25 geoffclare Resolution Open => Accepted As Marked
2024-04-18 16:26 geoffclare Tag Attached: tc1-2024
