Austin Group Defect Tracker

ID 0001563
Category [1003.1(2016/18)/Issue7+TC2] Shell and Utilities
Severity Editorial
Type Clarification Requested
Date Submitted 2022-02-18 15:07
Last Update 2022-05-23 11:26
Reporter andras_farkas View Status public  
Assigned To
Priority normal Resolution Accepted As Marked  
Status Interpretation Required  
Name Andras Farkas
Organization
User Reference
Section what
Page Number
Line Number
Interp Status Approved
Final Accepted Text Note: 0005800
Summary 0001563: Wording for what seems to imply odd behavior. "all occurrences of @(#)"
Description On:
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/what.html
There's the text:
> The what utility shall search the given files for all occurrences of the pattern that get (see get) substitutes for the %Z% keyword ( "@(#)" ) and shall write to standard output what follows until the first occurrence of one of the following:

Does this output a line for every occurrence of @(#) even if there are multiple before the occurrence of a character that ends an identification string?

For example: the text "@(#)ABC@(#)DEF@(#)GHI\n" (without the quotes) in a binary file. Will it treat it only as one line to output? "ABC@(#)DEF@(#)GHI" or three lines?
"ABC@(#)DEF@(#)GHI
DEF@(#)GHI
GHI"

I believe the standard itself implies it will only output one line.
My case for why the standard implies it will only output one line:
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/get.html
The %A% keyword expands to the same text that %Z%%Y%%M%%I%%Z% expands to. Since %Z% is @(#) notice that it'd be expanded to "@(#)TextHere@(#)". (without the quotes)
If 'what' were to produce multiple lines for such a line, one of those lines would be a useless line with only a tab character on it.
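
To make the literal "every occurrence" reading concrete, here is a minimal Python sketch of it, in which a new identification string starts at each "@(#)" even when that marker lies inside a string already being written. The function name every_occurrence and the constants MARK and TERMINATORS are illustrative assumptions, not taken from any implementation.

```python
MARK = b"@(#)"
TERMINATORS = b'">\n\\\0'   # ", >, newline, backslash, NUL

def every_occurrence(data):
    """Literal reading: one output string per occurrence of MARK."""
    out = []
    pos = data.find(MARK)
    while pos != -1:
        start = pos + len(MARK)
        end = start
        # Collect bytes up to (not including) the first terminator or end of data.
        while end < len(data) and data[end] not in TERMINATORS:
            end += 1
        out.append(data[start:end])
        # Resume right after this marker, not after the terminator.
        pos = data.find(MARK, start)
    return out

print(every_occurrence(b"@(#)TextHere@(#)\n"))
# [b'TextHere@(#)', b'']
print(every_occurrence(b"@(#)ABC@(#)DEF@(#)GHI\n"))
# [b'ABC@(#)DEF@(#)GHI', b'DEF@(#)GHI', b'GHI']
```

Under this reading the %A%-style input produces an empty second string, which would be written as a line holding only a tab.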

This bug was first noticed in note 0005680 on bug 0001538, by kre. Some of this bug is paraphrased from his note.
https://austingroupbugs.net/view.php?id=1538

I've tested behavior using FreeBSD what, and using the Schily SCCS what available at:
http://sccs.sourceforge.net/
Schily SCCS what is descended from Solaris's, while the BSD whats are not descended from the original SCCS.
Both have the same behavior: outputting only one line, for the example above.
Both do the same thing: output only one line, for the example above.
Desired Action If it is found that all (or most) implementations of what output only one line rather than multiple for strings where multiple @(#) precede a character terminating an identification string, I'd like the following change in text:

Original:
The what utility shall search the given files for all occurrences of the pattern that get (see get) substitutes for the %Z% keyword ( "@(#)" ) and shall write to standard output what follows until the first occurrence of one of the following:
" > newline \ NUL

Desired text:
The what utility shall search the given files for all occurrences of the pattern that get (see get) substitutes for the %Z% keyword ( "@(#)" ). The what utility shall write to standard output what follows until the first occurrence of one of the following:
" > newline \ NUL
The what utility shall then look for the next occurrence of "@(#)" after one of those characters.

My wording can probably be improved.

Please research 'what' behavior across more systems, for me, please. I don't have access to System V systems or certified standards-compliant systems, since they're generally closed-source and cost money. (OpenSolaris being the only exception to this I know of, and only an exception to part of what I stated)
Tags tc3-2008
Attached Files

- Relationships

-  Notes
(0005684)
andras_farkas (reporter)
2022-02-18 15:23

Already spotted one part of my wording which could be improved:
> The what utility shall then look for
s/look/search

(also, oops! I didn't intend to have a near-duplicate last line in my Description)
(0005686)
shware_systems (reporter)
2022-02-18 16:02
edited on: 2022-02-18 16:17

AFAICT, the subsequent @(#) entries are all part of the identification string data as non-special text, terminated by the \n. So it should go on one line. What appears to be missing is making it explicit searches continue after the identification string terminator, when -s is not specified. That was added, I see, with different phrasing. If an identification string doesn't have one of the terminators, I'd think it's on the storing application to add one. For the example, using '\n' as the default for the purpose, this would change it to:
"@(#)ABC\n@(#)DEF\n@(#)GHI\n"
to get all instances output on separate lines.

An alternative is adding '@' to the list of terminating chars, and explicitly starting each additional search at the terminating character, not after it. That would pick up the 3 entries also.

(0005689)
kre (reporter)
2022-02-18 19:40

Re Note: 0005686

    What appears to be missing is making it explicit searches continue after the
    identification string terminator,

Exactly, if that is the intent (which I would expect it is).

    If an identification string doesn't have one of the terminators,

The only way that can happen is for there to be no terminating character
between the @(#) and EOF - which is probably a case that merits explicit
mention - EOF should also be one of the terminating conditions for the
identification string.

The issue if one wanted to get 3 separate outputs is simple, and doesn't
need discussing (when I tested it, I used " instead of \n as the separator,
but any of the listed set would do). But note, that would be
different than what might, improbably but possibly, be intended now, as
with none of @ ( # or ) being terminating characters, the alternate
implementation is that the first line would be long, containing (in the
example) all 3 "perhaps" id strings, and two @(#) sequences included,
the 2nd line would be a trailing substring of that, and the third a
trailing substring of that one. This is unlikely to be useful...

Making @ an additional terminator would be an invention, and not something
that we should be doing here (unless there is evidence of implementations
already doing that - and I very much doubt there are any).
(0005697)
geoffclare (manager)
2022-02-21 12:09

> Please research 'what' behavior across more systems, for me, please.

Solaris, macOS, and HP-UX all do this:
$ cat what3
@(#)ABC@(#)DEF@(#)GHI
@(#)ABC"@(#)DEF@(#)GHI
$ what what3
what3:  
        ABC@(#)DEF@(#)GHI
        ABC
        DEF@(#)GHI
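
The output above is consistent with a scan that resumes only after the end of each identification string. A short Python sketch, assuming that behavior can be modelled with a non-overlapping regular expression search (the name IDENT and the in-memory copy of what3 are illustrative, not part of any implementation):

```python
import re

# "@(#)" followed by the longest run of bytes that are not
# ", >, newline, backslash or NUL.
IDENT = re.compile(rb'@\(#\)([^">\n\\\x00]*)')

# Contents of the what3 test file shown above.
what3 = b'@(#)ABC@(#)DEF@(#)GHI\n@(#)ABC"@(#)DEF@(#)GHI\n'

print("what3:")
for ident in IDENT.findall(what3):
    print("\t" + ident.decode("ascii"))
```

Because findall returns non-overlapping matches, a "@(#)" that falls inside an identification string is treated as ordinary text, which gives the same three strings as the Solaris, macOS, and HP-UX output.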
(0005698)
andras_farkas (reporter)
2022-02-21 18:18

Looks like they all perform as expected, and the standards text for the 'what' utility could be changed to make expected behavior more clear.
(0005800)
geoffclare (manager)
2022-04-14 16:34
edited on: 2022-04-14 16:35

Interpretation response
------------------------
The standard states the identification strings written by what, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.

Rationale:
-------------
The description in the standard does not match existing practice when the number of identification strings in a file is not one. Additionally, an end-of-file condition was not listed as a delimiter.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------

On page 3437 line 116031 section what, change:
The what utility shall search the given files for all occurrences of the pattern that get (see [xref to get]) substitutes for the %Z% keyword ( "@(#)" ) and shall write to standard output what follows until the first occurrence of one of the following:
" > newline \ NUL
to:
The what utility shall search the given files for all occurrences of the pattern that get (see [xref to get]) substitutes for the %Z% keyword ( "@(#)" ). The what utility shall write to standard output what follows until the first occurrence of one of the following: <double-quote> ('"'), <greater-than-sign> ('>'), <newline>, <backslash> ('\\'), <NUL> ('\0'), or an end-of-file condition on the input file. If not at end-of-file, the what utility shall then look for the next occurrence of "@(#)" after one of those characters.
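
As an illustration only (not part of the interpretation), the revised wording can be sketched as a byte-at-a-time scanner in Python. The function name scan and the constants are assumptions made for this sketch. The -s option is not modelled.

```python
import io

MARK = b"@(#)"
TERMINATORS = b'">\n\\\0'   # ", >, newline, backslash, NUL

def scan(stream):
    """Yield identification strings from a binary stream."""
    matched = 0                      # bytes of MARK matched so far
    while True:
        b = stream.read(1)
        if not b:                    # end-of-file while searching
            return
        if b[0] == MARK[matched]:
            matched += 1
        else:
            # '@' only occurs at the start of MARK, so restart from there.
            matched = 1 if b[0] == MARK[0] else 0
        if matched < len(MARK):
            continue
        matched = 0
        ident = bytearray()
        while True:
            b = stream.read(1)
            if not b:                # end-of-file also ends the string
                yield bytes(ident)
                return
            if b in TERMINATORS:     # search resumes after this byte
                break
            ident += b
        yield bytes(ident)

print(list(scan(io.BytesIO(b'@(#)ABC"@(#)DEF@(#)GHI\n'))))
# [b'ABC', b'DEF@(#)GHI']
print(list(scan(io.BytesIO(b"@(#)no terminator"))))
# [b'no terminator']
```

The second call shows the end-of-file case: the string is still reported even though none of the five terminating characters follows it.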

On page 3437 line 116063 section what, change:
The standard output shall consist of the following for each file operand:
"%s:\n\t%s\n", <pathname>, <identification string>
to:
For each file operand, the standard output shall consist of:
"%s:\n", <pathname>
followed by zero or more of:
"\t%s\n", <identification string>
one for each identification string located.
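
A small Python sketch of the revised output format, for illustration only. The function name format_output and the file name empty.o are hypothetical.

```python
def format_output(pathname, idents):
    """Build the standard output text for one file operand."""
    lines = ["%s:\n" % pathname]                       # always written
    lines += ["\t%s\n" % ident for ident in idents]    # zero or more
    return "".join(lines)

print(format_output("what3", ["ABC@(#)DEF@(#)GHI", "ABC", "DEF@(#)GHI"]), end="")
print(format_output("empty.o", []), end="")
```

With no identification strings, only the "%s:\n" line is produced, which the original single format string could not express.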


(0005801)
andras_farkas (reporter)
2022-04-14 19:39

Looks good to me.
(0005811)
agadmin (administrator)
2022-04-21 15:06

Interpretation proposed: 21 April 2022
(0005842)
agadmin (administrator)
2022-05-23 11:26

Interpretation approved: 23 May 2022

- Issue History
Date Modified Username Field Change
2022-02-18 15:07 andras_farkas New Issue
2022-02-18 15:07 andras_farkas Name => Andras Farkas
2022-02-18 15:07 andras_farkas Section => what
2022-02-18 15:23 andras_farkas Note Added: 0005684
2022-02-18 16:02 shware_systems Note Added: 0005686
2022-02-18 16:17 shware_systems Note Edited: 0005686
2022-02-18 19:40 kre Note Added: 0005689
2022-02-21 12:09 geoffclare Note Added: 0005697
2022-02-21 18:18 andras_farkas Note Added: 0005698
2022-04-14 16:34 geoffclare Note Added: 0005800
2022-04-14 16:35 geoffclare Note Edited: 0005800
2022-04-14 16:36 geoffclare Interp Status => Pending
2022-04-14 16:36 geoffclare Final Accepted Text => Note: 0005800
2022-04-14 16:36 geoffclare Status New => Interpretation Required
2022-04-14 16:36 geoffclare Resolution Open => Accepted As Marked
2022-04-14 16:36 geoffclare Tag Attached: tc3-2008
2022-04-14 19:39 andras_farkas Note Added: 0005801
2022-04-21 15:06 agadmin Interp Status Pending => Proposed
2022-04-21 15:06 agadmin Note Added: 0005811
2022-05-23 11:26 agadmin Interp Status Proposed => Approved
2022-05-23 11:26 agadmin Note Added: 0005842

