Austin Group Defect Tracker

ID 0001161
Category [1003.1(2016/18)/Issue7+TC2] Shell and Utilities
Severity Editorial
Type Clarification Requested
Date Submitted 2017-09-04 13:09
Last Update 2019-11-07 09:39
Reporter steffen View Status public  
Assigned To
Priority normal Resolution Accepted As Marked  
Status Applied  
Name steffen
Organization
User Reference
Section command
Page Number 2596
Line Number 84274 ff.
Interp Status ---
Final Accepted Text See Note: 0004220
Summary 0001161: command -v must find something executable
Description It just has been spelled out to me that some popular shells (bash, dash) evaluate non-executable text files when asked for a program via "command -v".
For example i was shown, after revealing myself as a non-believer

  $ >/tmp/foo
  $ PATH=/tmp:$PATH bash -c 'command -v foo'
  /tmp/foo
Desired Action It seems the text is misleading and it should be reiterated that the "simple command" should really be executable.

In addition, the program "which" should possibly be standardized. To work around the issue shown above, rather complicated path transformations have to be applied and iterated over in order to find an executable utility (this does not find shell functions, for example, but that would be perfect for me).

P.S.: bash does not find plain text files when enforced via POSIXLY_CORRECT, but at least dash still does in that case.
Tags tc3-2008
Attached Files

- Relationships
related to 0001226Applied shell can not test if a file is text 

-  Notes
(0003821)
kre (reporter)
2017-09-04 15:49

While I have some sympathy with the intent, the problem with that is
that there would need to be a definition of what "executable" really
means for this purpose.

dash (I assume) inherits its behaviour here from ash - the NetBSD (and from
what I can see) FreeBSD shells are the same. It looks to have been like
this in FreeBSD as far back as their publicly available repo goes (1993,
with the 4.4-lite BSD version) - NetBSD's has been the same since then, but
the history goes back a little earlier, and it is possible to see several
different attempts to work out how to do the "executable" test so that it
works correctly.

It would be nice if one could use access() for this purpose, but that checks
the "wrong" ID for the purpose when effective != real. Without that, the
shell itself needs to deal with the different rules for root (id == 0) and
checking whether the group of the file (if the 'x' group bit is set, and the
owner 'x' permission does not apply) is one of the groups to which the process
belongs. It all gets very messy.
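
As a rough illustration of why this gets messy, here is a minimal sketch of such a mode-bit check in POSIX sh. It rests on several assumptions and is not what any shell actually implements: the function name mode_says_executable is hypothetical, it parses "ls -ldnL" output, and it ignores ACLs, noexec mounts, and the file format entirely.

```shell
# Hypothetical helper: exit 0 if the mode bits of regular file $1 grant
# execute permission to the effective uid/gids, 1 if not, 2 on lookup error.
# Ignores ACLs, noexec mounts, and whether the kernel accepts the format.
mode_says_executable() (
  line=$(ls -ldnL -- "$1" 2>/dev/null) || exit 2
  # With -n, the fields are: mode, link count, numeric owner, numeric group, ...
  read -r mode _links owner group _rest <<EOF
$line
EOF
  case $mode in
    -*) ;;            # regular file
    *) exit 1 ;;      # directories, devices, ... are not candidates
  esac
  euid=$(id -u)
  if [ "$euid" -eq 0 ]; then
    # root: any one of the three 'x' bits is enough
    case $mode in
      ???[xs]* | ??????[xs]* | ?????????[xt]*) exit 0 ;;
    esac
    exit 1
  fi
  if [ "$euid" -eq "$owner" ]; then
    # owner class applies; group and other bits are not consulted
    case $mode in ???[xs]*) exit 0 ;; esac
    exit 1
  fi
  for gid in $(id -G); do
    if [ "$gid" -eq "$group" ]; then
      case $mode in ??????[xs]*) exit 0 ;; esac
      exit 1
    fi
  done
  case $mode in ?????????[xt]*) exit 0 ;; esac
  exit 1
)
```

Even this much only answers "do the mode bits allow it for the effective uid", which is not the same as "exec() would succeed".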

That is (I am guessing) why the decision was made to simply punt that code,
and simply find the name in PATH somewhere.

I don't think which belongs, it was a csh related command when invented, and
is supposed to read .cshrc which is hardly useful for POSIX.
(0003823)
stephane (reporter)
2017-09-05 06:58

In a separate discussion (about whether "exec" or "env" may execute builtins or functions), we mentioned adding a "command -e" to only execute *external* commands (already implemented in "yash").

That would also be useful here as a "command -ve" to return the path of external executables only.

About bash and non-executable files, note that the behaviour changes when bash is in POSIX mode. The behaviour is mentioned at https://unix.stackexchange.com/questions/85249/why-not-use-which-what-to-use-then/85250#85250 (messy article but with some valid points).
(0003824)
stephane (reporter)
2017-09-05 08:30

Re: Note: 0003821

The euid != ruid case is not the common one though (it's commonly accepted that one should never do that, and bash for instance will not allow it: it does a seteuid(getuid()), and the same for the gid, unless you use -p). So "command -v" and "type" could at least give a better answer in that common case.

The behaviour of dash can also be considered bogus. bash returns the executable one; it only falls back to the non-executable one if that's the best it could find (and only when not in POSIX mode), and will return a "permission denied" if you run cmd. In dash, if there's a non-executable "cmd" ahead of an executable one in $PATH, "command -v" will return the non-executable one, but "cmd" will execute the correct (executable) one and after that hash the wrong one:

$ dash -c 'command -v ls; ls -l /proc/self/exe; hash'
/home/stephane/bin/ls
lrwxrwxrwx 1 stephane stephane 0 Sep  5 09:28 /proc/self/exe -> /bin/ls
/home/stephane/bin/ls
(0003825)
steffen (reporter)
2017-09-05 12:34

re kre:
To me the problem is that number one, and the most widely used alternative shell on the most widely used Unix (alike) system, and the one which claims to be POSIX exactly and is understood like that by i think many normal users, has introduced this behaviour. You ask for a command and can possibly get a plain text file, for a reason i do not understand. So what i thought could be achieved with command -v effectively needs

thecmd() {
   fail=${1} pname=${2}
   oifs=${IFS} IFS=:
   set -- ${PATH}
   IFS=${oifs}
   for path
   do
      if [ -x "${path}/${pname}" ]; then
         echo "${path}/${pname}"
         return 0
      fi
   done
   [ ${fail} -eq 0 ] && return 1
   echo >&2 'ERROR: no trace of utility '"${pname}"
   exit 1
}

Unless you want to be very backward compatible and replace [(1) with test(1) in addition (thanks to Ralph and his favourite autotools).
I never understood why i cannot say "IFS=: set -- $PATH", by the way, has anyone an answer for this?

csh is not standardized by POSIX, so that is no reason not to go for which, which seems to be what users are forced to turn to? I do not know; being more explicit in the standard, and thus declaring behaviour found in the wild to be buggy, is an option.
(0003826)
steffen (reporter)
2017-09-05 12:39

Re Stéphane Chazelas:
i do not know what this access(2) discussion of yours and kre is about. This issue is about getting back something non-executable via /bin/sh on number one. I would be fine if we could get some command which gives us back an external utility only, which(1) could be an option, like you show very thoroughly again in the linked page. That was exactly what i was saying, no?
(0003827)
kre (reporter)
2017-09-05 12:47

I understand that setuid for a shell is not common, and is usually a poor idea
(the NetBSD sh also requires -p to allow it) but we have to give sane and
consistent results in all cases, not just the common case.

I am not trying to justify the behaviour of dash (and other *BSD derived shells).
Chet gave me access to an original ash source, and I see it attempts to
implement an executable check - but does it incorrectly - the changes to delete
it must have been made as part of the integration into BSD when the incorrect
executable tests were discovered. The NetBSD CVS repo shows several different
methods attempted to make it work properly, eventually all simply removed.

My point is that it is very hard to do an executable test in any way other than
by doing an exec() sys call - the kernel does all the correct tests (whatever
is correct for it); attempting to duplicate that is difficult. If all you need
is the 'x' bit in the flags, that is easy, but that does not make a file
executable. Eg: some systems have extra ACLs that modify permission lookups,
is the shell supposed to duplicate those tests as well? Or if you copy an
executable file from an a.out sparc system to an ELF alpha, is that file
executable on the alpha? (It has the 'x' flag set, but exec() would fail,
both because it is the wrong format, and the wrong architecture.)

So, once again, if you are going to demand that "command -v" return only
executable files, you need to define what it means to be "executable" for
that purpose - and justify that definition.

Ideally the result would be the name of the file that actually ends up being
run when the shell executes the same command name, but without actually
attempting the exec, or copying the local kernel's tests (all of them) into
the shell, I see no way to achieve that.

Are you suggesting that the shell do a fork/exec sequence for each potential
name found from the PATH search, and then print the one that succeeds?
If it does that, how does it reliably prevent the command run from doing
anything, or does it also have to arrange to run it in ptrace() mode so it
halts before executing any real code?

This all just sounds too complicated to me - probably better to just leave
it unspecified what happens to "command -v name" if "name" exists in a
directory in $PATH but is not executable.
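
To make the gap between "has the 'x' bit" and "can be executed" concrete, here is a small sketch. The helper name xbit_vs_exec and the interpreter path /nonexistent/interpreter are made up for the demonstration, and mktemp is assumed to be available (it is common but not required by POSIX).

```shell
# Hypothetical demo: a file that passes the 'x' bit test but cannot be run,
# because its "#!" line names an interpreter that does not exist.
xbit_vs_exec() (
  dir=$(mktemp -d) || exit 2
  file=$dir/demo
  printf '#!/nonexistent/interpreter\n' > "$file"
  chmod +x "$file"

  # Permission check only: looks at mode bits (and perhaps ACLs), not content.
  if [ -x "$file" ]; then echo "test -x: true"; else echo "test -x: false"; fi

  # Actual attempt: the kernel rejects the exec, so the shell reports failure.
  if "$file" 2>/dev/null; then echo "exec: succeeded"; else echo "exec: failed"; fi

  rm -rf "$dir"
)

xbit_vs_exec
```

Only the second check reflects what the kernel will do, and it can only be made by actually attempting the exec.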
(0003828)
steffen (reporter)
2017-09-05 13:37

P.S.: i have to correct, thanks Ralph, it should have been

 [ -f FILE ] && [ -x FILE ]

instead.

re kre:
Now you are really going too far, in my opinion. Because the result can of course be wrong already once command(1) returns it.
So you think the description is fine as is.
Likely you are right, even bash(1) works as desired if driven via /bin/sh.
Whatever the reasoning is.
Having something like -p that goes for all the path and is guaranteed to find an external utility would nonetheless be valuable. Thanks.
(0003829)
chet_ramey (reporter)
2017-09-05 15:37

re: comment 3825

> I never understood why i cannot say "IFS=: set -- $PATH", by the way, has
> anyone an answer for this?

Because the standard says that variable assignments preceding a simple command
are processed after the command words are expanded.

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_01
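
A minimal sketch of that ordering, assuming IFS has its default value when the functions are called. The variable and function names are illustrative only; both functions run in a subshell so the IFS change does not leak.

```shell
list='a:b:c'

# Assignment as a prefix: $list is expanded (and split with the old IFS)
# before IFS=: takes effect, so only one field results.
split_with_prefix() (
  IFS=: set -- $list
  echo "$#"
)

# Assignment as a separate command: IFS=: is in effect when $list is split.
split_with_separate_assignment() (
  IFS=:
  set -- $list
  echo "$#"
)

split_with_prefix                # prints 1
split_with_separate_assignment   # prints 3
```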
(0003830)
stephane (reporter)
2017-09-05 15:41

Re: Note: 0003827

No, I'm not suggesting the shell forks and tries to exec (or fork()+setreuid()+access()), but that it calls access() when euid == ruid and resorts to heuristics when euid != ruid like bash does (or gives wrong results based on access() then, which wouldn't be more wrong than what dash currently does). So opting for "saner" (in 99.99% of the cases) over "consistent" (for the odd case of euid!=ruid where typically you won't be using "command -v" or "type").

At the moment, IMO, dash's behaviour is broken as "type"/"command -v/V" doesn't give you the right executable in easy to address cases. For cases where the file is not executable for other reasons than EACCES permissions (like EPERM, ENOEXEC...), or for euid!=ruid where access() can't be used and the heuristic may be fooled by ACLs, there's the bash option where you get an error and bash doesn't try the next one in $PATH; or the option to get the next one in $PATH and acknowledge that "type"/"command -v/V" can't always get you the right results.

Example:

$ echo '#!/' > ~/bin/ls; chmod +x ~/bin/ls
$ PATH=~/bin:$PATH bash -o posix -c 'type ls; ls -l /proc/self/exe'
ls is /home/chazelas/bin/ls
bash: /home/chazelas/bin/ls: /: bad interpreter: Permission denied

(consistent behaviour, but the valid /bin/ls doesn't get executed. Just as well as that lets us know there's an issue with our ~/bin/ls)

$ PATH=~/bin:$PATH zsh -c 'type ls; ls -l /proc/self/exe'
ls is /home/chazelas/bin/ls
lrwxrwxrwx 1 chazelas chazelas 0 Sep 5 16:39 /proc/self/exe -> /bin/ls

We manage to find an executable "ls", but "type" gives the wrong result.

Re: Note: 0003825

bash will behave as you expect when called as "sh" (with the "posix" option). bash only aims for POSIX compliance in that mode (though you need more to get full compliance, like xpg_echo).
(0003831)
stephane (reporter)
2017-09-05 16:01

Re: Note: 0003829
>> I never understood why i cannot say "IFS=: set -- $PATH", by the way, has
>> anyone an answer for this?
>
> Because the standard says that variable assignments preceding a simple command
> are processed after the command words are expanded.

Also, one needs to disable globbing with "set -o noglob" or "set -f" which is the other effect of leaving that $PATH unquoted, and splitting $PATH that way is invalid as it splits "/bin:/usr/bin:" into "/bin", "/usr/bin" instead of "/bin", "/usr/bin" and "".

One would also need to take care of the "" and "/" cases specially. And avoid echo

Using:

thecmd() (
  IFS=:
  set -o noglob
  for p in $PATH''; do
    case $p in
      ("") file=$1;;
      (*/) file=$p$1;;
      (*) file=$p/$1;;
    esac
    if [ -f "$file" ] && [ -x "$file" ]; then
      printf '%s\n' "$file"
      exit 0
    fi
  done
  printf >&2 '"%s" not found\n' "$1"
  exit 1
)

Should work in most POSIX-like shells (not in the Bourne shell obviously)

Doesn't necessarily give you an executable file, but in the case of bash, would (I believe, it's been a long time since I looked into that) give you the one it would execute as long as euid==ruid (and report an error if it fails).
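
The trailing-empty-element pitfall mentioned above can be seen in isolation with the following sketch. The function names are made up for the example; the second one uses the same $var'' idiom as the "thecmd" function.

```shell
# Count the fields obtained by splitting $1 on ':' the naive way.
count_plain() (
  p=$1
  IFS=:
  set -o noglob
  set -- $p
  echo "$#"
)

# Same, but with an empty quoted string appended so that a trailing
# empty element (meaning "current directory" in $PATH) is preserved.
count_padded() (
  p=$1
  IFS=:
  set -o noglob
  set -- $p''
  echo "$#"
)

count_plain  '/bin:/usr/bin:'   # prints 2: the trailing "" is lost
count_padded '/bin:/usr/bin:'   # prints 3: "/bin", "/usr/bin", ""
count_padded '/bin:/usr/bin'    # prints 2: nothing extra is added
```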
(0003832)
stephane (reporter)
2017-09-05 16:36

Re: Note: 0003831
> Doesn't necessarily give you an executable file, but in the
> case of bash, would (I believe, it's been a long time since I
> looked into that) give you the one it would execute as long as
> euid==ruid (and report an error if it fails).

Well, if we ignore the issues with commands being hashed, or
$PATH being unset and the other issues mentioned at
https://unix.stackexchange.com/questions/85249/why-not-use-which-what-to-use-then
...

In that "thecmd" function, since we're already spawning a subshell, we may as well
change uid and gid before calling "[ -x" (like
with UID=$EUID GID=$EGID in zsh), or do the whole thing in perl
with some

perl -Mfiletest=access -e '$(=$);$<=$>; ... -f $f && -x $f...'

Note that glibc has some eaccess()/euidaccess() that implements
the heuristics described earlier in user space (it only checks the
permissions AFAICT though: no ACLs, noexec mount flags...).
(0003833)
stephane (reporter)
2017-09-05 17:30

Re: Note: 0003827
> So, once again, if you are going to demand that "command -v"
> return only executable files, you need to define what it means
> to be "executable" for that purpose - and justify that
> definition.

IMO, it should return whatever command the shell is going to
execute. That's why I have no issue with bash returning
a non-executable file as long as it's the non-executable file
it's trying (and failing with an error message) to execute.

I agree it can't be done consistently with shells that don't
take bash approach, but it can certainly be improved in
Almquist-based shells.

(and I think "command -e" should be added).
(0003834)
steffen (reporter)
2017-09-05 20:17

re chet_ramey:
So excuse my thoughtless uneducated comment, please.
Thanks for pointing to the corresponding standard passage.

re Stéphane:
Hm, i see that even shipout scripts of mine do not do this right, regarding set -o noglob.
And not working with names that contain a LF, even then.

How is eaccess(3) implemented? I do not see an eaccess(2) system call?
Interesting can of worms. Will improve my script workaround.
(0003835)
steffen (reporter)
2017-09-05 20:19

Of course it works with LF contained, need to quote the assignment. Sigh.
(0003836)
kre (reporter)
2017-09-05 20:27

Re note 3833:

    IMO, it should return whatever command the shell is going to execute.

Does that mean "successfully execute" or "attempt to execute". In a shell
where those are the same, it obviously makes no difference, but for ash
based shells, there is no one "attempt to execute" path name (for commands
not containing a '/' in their names). If you just want the first of them
then I think that is what dash is giving now -- the exec attempt of that fails,
and then it goes on to find (and successfully exec) some later instance of the
same command name later in PATH.

To me that is more useful execution behaviour than simply failing when a
non-executable file (for whatever reason that exec cannot be performed)
is in a directory in PATH earlier than an executable version of the same
command name.

Aside: in a few tests I have run I cannot get bash (not in posix mode) to
behave the way it has been claimed to behave - I can make it list a
non-executable file from command -v, but attempting to execute the same name
actually executes an executable version later in PATH (whether it does it the
same way as ash shells, or whether it is simply checking 'x' permission, I
have no idea.)

And while I agree that command -e might be useful to have, this is not the place
to get it added - it needs to be implemented first, which means convincing
some shell implementers to actually add it ... while I can see the benefits,
no-one is clamoring for it to be added to the NetBSD sh, so I have not added
it - as long as that kind of thing remains the status quo, it cannot be added
to the standard - we are not here to legislate for the addition of new features.
(0003837)
chet_ramey (reporter)
2017-09-05 20:29

re: 3834

Take a look at

http://git.savannah.gnu.org/cgit/bash.git/tree/lib/sh/eaccess.c?h=devel

(and note that FreeBSD, at least, has eaccess(2))
(0003838)
eblake (manager)
2017-09-05 21:27

re: 3834
Also, the standard has:
faccessat(..., AT_EACCESS)
and presumably that is what eaccess() would do.
(0003840)
eblake (manager)
2017-09-05 22:21

faccessat(,0) is different from faccessat(,AT_EACCESS) - the use of the flag option controls which ids (effective or real) are used in performing the permission checks.
(0003843)
stephane (reporter)
2017-09-06 08:30
edited on: 2017-09-06 08:32

Re: Note: 0003837

Looks like it's a recent addition to the standard (function added in issue 7). AFAICT, AT_EACCESS is not implemented on Linux; the GNU libc implements it in user space with a heuristic based on permissions only like eaccess()/euidaccess(), same one as used by bash.

(0003844)
stephane (reporter)
2017-09-06 09:30

Re: Note: 0003836

Historically (and still the case in most implementations), $PATH lookup was done by looping through the elements of $PATH and exec()ing until exec() doesn't return. And if all the exec()s return, an error is reported describing the errno for the last failing exec(). So that executes the first command that execve() will accept for the current process ("successfully" or not: for instance, the dynamic linker could still fail to load some libraries, the script interpreter could lack permission to read the script, the command may fail with a syntax error...).

That has several issues. One is that "type"/"command -v" can't know which command will eventually exec() successfully.
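
A sketch of the order in which such a loop would try candidates, without executing anything. The function name path_candidates is made up for illustration; it lists every regular file named $1 along $PATH, flagged by whether "[ -x" accepts it, so that "first found" and "first with execute permission" can be compared. It says nothing about whether execve() would actually succeed.

```shell
# Hypothetical helper: print each regular file named $1 found along $PATH,
# in lookup order, prefixed with "x" if [ -x ] accepts it and "-" otherwise.
path_candidates() (
  name=$1
  IFS=:
  set -o noglob
  for dir in $PATH''; do          # '' keeps a trailing empty element
    case $dir in
      "") file=./$name ;;         # empty element means current directory
      */) file=$dir$name ;;
      *)  file=$dir/$name ;;
    esac
    [ -f "$file" ] || continue
    if [ -x "$file" ]; then flag=x; else flag=-; fi
    printf '%s %s\n' "$flag" "$file"
  done
)
```

For example, "path_candidates ls" with a non-executable ~/bin/ls ahead of /bin/ls would list both, the first flagged "-" and the second "x".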

Another one is that it gives non-deterministic behaviour. For instance, here on Linux with zsh that does it that way:

$ cp /bin/ls ~/bin
$ path=(~/bin $path)
$ limit stacksize 200k
$ a=${(l:126600:)} ls -l /proc/self/exe
lrwxrwxrwx 1 stephane stephane 0 Sep 6 09:40 /proc/self/exe -> /home/stephane/bin/ls
$ a=${(l:126700:)} ls -l /proc/self/exe
lrwxrwxrwx 1 stephane stephane 0 Sep 6 09:40 /proc/self/exe -> /bin/ls

The execve("/home/stephane/bin/ls") returns with E2BIG, but not the one with "/bin/ls" (probably because of the $_ variable that makes the difference between the two).

So, with a large arglist or environ, you could end-up running the wrong command. That applies to other "transient" errors as well. And for some errors like EIO, that also hides problems. If there's a bad sector in /usr/local/bin/python, you'll end up running /usr/bin/python without ever being told of that bad sector.

Some shells (like zsh) improve matters wrt "type"/"command -v" by trying to guess which of the files execve() will likely fail on. They only consider the non-regular file case and the access permission case. dash does the former and not the latter.

IMO, the bash behaviour that determines the path of the command in advance the same way for execution and for "type"/"command -v" and commits to it is better as it avoids all those problems above.

One could imagine going further in that direction and consider the first file found, whether it's regular or not, with execute access permission or not and commit to that and return with an error if exec() fails. That would help spot files in $PATH with incorrect permissions. But that would break schemes that have $PATH directories with executables meant for different users and I don't think any shell implements that.
(0003845)
stephane (reporter)
2017-09-06 09:37
edited on: 2017-09-06 09:54

Re: Note: 0003836
> And while I agree that command -e might be useful to have,
> this is not the place to get it added - it needs to be
> implemented first, which means convincing some shell
> implementers to actually add it ... while I can see the
> benefits, no-one is clamoring for it to be added to the NetBSD
> sh, so I have not added it - as long as that kind of thing
> remains the status quo, it cannot be added to the standard -
> we are not here to legislate for the addition of new features.

AFAIK, "command" was a POSIX invention in the first place, so
was a feature whose addition was legislated by POSIX.

"command -v" was meant to palliate the need for an equivalent to
the (broken / csh-only) "which" command. Since then, "which" has
evolved (on many systems) to a command that does the equivalent of
yash's "command -ve" or ksh's "whence -p", so it would make sense
to add that.

Not to mention that some people have objected in that recent
discussion about the "exec" special builtin to "env" being
guaranteed to execute external commands any longer, so we need an
alternative like yash's "command -e".

(0003846)
joerg (reporter)
2017-09-06 09:50
edited on: 2017-09-06 09:50

Re: Note: 0003824

truss -o o bosh -c 'command -v ls'
/usr/bin/ls

grep access o
access("/usr/bin/ls", X_OK|E_OK) = 0

This is already in effect.

(0003847)
steffen (reporter)
2017-09-06 12:55

Thank you, all.
That is fantastic, the standard already supports working around kre's issue, though not via E_OK for access(2) but only with AT_EACCESS and faccessat(2).

What stands against clarifying command -v accordingly?
And test -x on page 3289, line 110679 ff. also states "permission to execute the file (or search it, if it is a directory) will be granted", which would imply usage of the new interface if available. (Different to, e.g., the manual of bash, which states "True if file exists and is executable".)
And being able to simply get the path of an external utility that can be passed to and then used from within non-shell environments, this seems to be an improvement.
(0003848)
stephane (reporter)
2017-09-06 13:41

Re: Note: 0003846

Interesting.

Is that Solaris? That E_OK doesn't seem to be documented there
https://docs.oracle.com/cd/E26502_01/html/E29032/access-2.html#scrolltoc

Your bosh code points to EFF_ONLY_OK on IRIX (documented: http://www.polarhome.com/service/man/?qf=access&af=0&sf=2&of=IRIX&tf=2)

And EUID_OK on UNICOS (also documented: https://users.757.org/~ethan/comp_cray/MANPAGES/2012_10.0/03_pgs_1-210.pdf)

Looks like many systems support checking access with effective uid/gid one way or another. A shame that it's not available on Linux except via some kludge in userspace. That shouldn't be hard to add as it's less work than a normal access(). I wonder why the glibc folks added the code there rather than suggesting the Linux kernel folks add it in kernel space.
(0003849)
kre (reporter)
2017-09-07 17:58

Before we attempt to settle upon what the wording for "command -v" should be,
can we go back to discovering its purpose? Why it exists at all, and what it
is intended to be used for (and for this, whether we call it command -v, or
which, or whence, or ... doesn't really matter.)

What is the use case for this functionality, and what must I be able to
conclude from the answer, when it appears, for that use case to be met?

I remain unconvinced that there is a use case for which the requirements can
be met (reliably, which implies that "usually" in the "I know from the
answer that ..." is not acceptable) which does not involve actually attempting
an exec() sys call on the intended answer and verifying that the exec works
(at least as far as the sys call not failing, whatever happens after that)
and if that is to be required, we would need a way to make that safe, which
it currently is not. If more is needed than just that the exec() sys call will
not fail, then we have even bigger problems.

If the requirement is simply that "command -v name" finds the first
$PATH[n]/name (where $PATH[n] has the obvious interpretation - not suggesting
that that is, or should be, sh syntax) which has the 'x' bit set, then first
I am unconvinced that is actually useful for anything much, but for what uses
it might have, it has already been shown in earlier notes that it is not
difficult to achieve using existing functionality - we do not need
"command -v" for that.

Perhaps the proper solution here is simply to remove "command -v" from the
standard (implementations could continue to handle it, for backwards compat,
but users would not, as they cannot today, expect any particular portable
answer.)
(0003850)
chet_ramey (reporter)
2017-09-07 18:29

"The command -v and -V options were added to satisfy requirements from users that are currently accomplished by three different historical utilities: type in the System V shell, whence in the KornShell, and which in the C shell. Since there is no historical agreement on how and what to accomplish here, the POSIX command utility was enhanced and the historical utilities were left unmodified. The C shell which merely conducts a path search. The KornShell whence is more elaborate-in addition to the categories required by POSIX, it also reports on tracked aliases, exported aliases, and undefined functions."
(0003851)
kre (reporter)
2017-09-07 21:17
edited on: 2017-09-07 21:18

The rationale for why it was originally added is not all that
interesting - what matters is if it is actually useful. Meaningless
demands from users for what they'd like to have are only relevant if
it is actually possible to provide something that achieves what they
want in a rational way.

Here, I am not sure it is - but then I don't know what any of the ancient
commands were actually useful for (aside from just satisfying curiosity,
for which we do not need a standard solution - each shell can provide
that for its users in any way that seems productive).

The question is more whether there is any way to sanely and reliably use
the output of command -v, or type, or which, or whence, or anything else
similar which has existed in the past, in a way that is dependable, and
which a script writer could depend upon always working (or always failing,
when an error is appropriate.)

If there is actually a demand here, someone should write the code (actually
make an implementation) which works properly, in all the hard cases, not just
"usually", and demonstrate it. It won't be me, as I have no idea how.
But if someone shows how it can be done, and the results are useful, I'd be
happy to copy (the idea, not necessarily the actual code.)

Then we might have something worth documenting.

(0003852)
chet_ramey (reporter)
2017-09-09 19:10

re: 3851

If the question is "what is the purpose?" or "why does it exist at all?", as it was in comment 3849, then looking at the original rationale should inform the answer.

Its utility is a separate question. I personally would rather have seen the -v/-V functionality implemented as options to `type', but type as the standard has it is minimalist.
(0003997)
McDutchie (reporter)
2018-04-25 17:57

Of course, kre is right in that 'command -v' and friends ('command -V' and 'type') cannot be made entirely reliable.

One thing that has not been mentioned is that, even if the result of 'command -v' can be made 100% accurate, it would still be impossible to eliminate the race condition between that test and the actual attempt to exec the returned file.

However, I strongly disagree that this means 'command -v' and friends are not useful at all.

One simple use case is a sanity check when initialising a shell script. A script that depends on certain utilities may use 'command -v' to check for their presence and refuse to initialise if they are not found. This would be much preferable to the program continuing and failing halfway (or, much worse, *not* failing properly and continuing to run in an inconsistent state instead, as is all too common for real-world shell scripts).

True, such a sanity check cannot be made "reliable". But it will work fine in 99.99...% of cases, which is good enough for it to be quite useful indeed. It should go without saying that this does not remove the need for proper exception handling.

So the real purpose of 'command -v' is not "can I definitely execute this utility?" but "can I reasonably expect to succeed at executing this utility, barring exceptions such as race conditions, I/O errors and ACL shenanigans?" This is useful data to decide whether or not to attempt to perform an operation.

Checking for the x bit is a minimum requirement for that use case. Being aware of things like 'noexec' mounts and ACLs, if present on the system, would be even better.

Unfortunately, ash and derivatives don't even attempt to check the x bit, so their 'command -v' commonly returns files that you cannot reasonably expect to execute.
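
A minimal sketch of such an initialisation check, under stated assumptions: the function name require_utils is hypothetical, and the extra "[ -f ] && [ -x ]" test on pathname results is one possible way to compensate for shells whose 'command -v' does not check the x bit. Like any such check it is advisory only and does not replace error handling when the utility is actually run.

```shell
# Hypothetical helper: succeed only if every named utility looks usable.
# Builtins and functions (reported without a '/') are accepted as-is;
# pathname results must additionally be regular files with execute access.
require_utils() {
  missing=
  for util do
    found=$(command -v "$util" 2>/dev/null) || found=
    case $found in
      /*) { [ -f "$found" ] && [ -x "$found" ]; } || found= ;;
    esac
    [ -n "$found" ] || missing="$missing $util"
  done
  [ -z "$missing" ] && return 0
  printf 'missing utilities:%s\n' "$missing" >&2
  return 1
}
```

A script would then start with something like "require_utils awk sed sort || exit 1". Note that in a shell that returns a non-executable file found early in $PATH, this sketch reports the utility as missing even if a later, executable copy would be run.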
(0004220)
nick (manager)
2019-01-24 17:11

At Page 2596 line 84277 Change "Utilities" to "Executable utilities" at the beginning of the sentence.
(0004223)
kre (reporter)
2019-01-24 23:33

This explains which kinds of files command -v is supposed to locate,
as XBD 3.154 defines "executable file" - but now I'd like someone to
please explain how a shell is supposed to implement that?

As best I can tell, the only way (using defined interfaces) to find out if
a file is:
      A regular file acceptable as a new process image file by the
      equivalent of the exec family of functions,
is to apply one of the exec family of functions to it, and see what
happens. But that is not safe, as if it succeeds (an exec attempt) we
cannot know what will happen next (consider "command -v halt" (run as root))

Of course, there are so many interfaces that I might have missed one.
In that case, someone please tell me which one I missed.
(0004224)
kre (reporter)
2019-01-25 00:05

To clarify the problem: the wording in the description
of "command -v" that is the real issue does not relate
to the bullet points (that which is proposed to be changed)
which really are just defining what the output should be
in various cases, but to this sentence:

        Write a string to standard output that indicates the
        pathname or command that will be used by the shell,
        in the current shell execution environment (see Section
        2.12, on page 2381), to invoke command_name, but do
        not invoke command_name.

The problem is that the way that the shell works out what
pathname that "will be used" (when a pathname is to be used,
the other cases, when some internal shell object is used are
not a problem) is by actually invoking the command name -
which this text (quite correctly) says not to do.

If that text instead said:
        Write a string to standard output that indicates the
        pathname or command that will be first attempted to be
        used by the shell, [.....]
then the issue would go away.

Whether this is adequate or not, demands an answer to the
question I asked in note 3849. The only (possible) answer
to that was given in note 3997. If that is agreed, then the
wording I suggest above (rephrased into standards speak however
is appropriate) would be just fine. If the intended purpose
of command -v is something different, then we need to discover
what that is, and then how to meet that demand. If we have
no idea what the purpose is (we cannot agree on any useful
purpose for it to have) then we should simply delete it completely
-- I know that cannot be done for tc3, but the wording could
be changed to be so unspecific, that anything is OK, so that
command -v is effectively useless to scripts - and a future
direction could be added to say that it might go away in
some future edition of the standard.
(0004225)
steffen (reporter)
2019-01-25 00:53

I say good night by posting the solution that the wonderful kre has worked out for me, i am using it in some of my scripts.

May it help other POSIX shell programmers. The Unix/POSIX shell is such a flexible beast, i am afraid i will never land there with the syntax of my mailx.

P.S.: i have to add that i never have seen something similar in other code.

# which(1) not standardized, command(1) -v may return non-executable: unroll!
acmd_test() { __acmd "${1}" 1 0 0; }
acmd_test_fail() { __acmd "${1}" 1 1 0; }
acmd_set() { __acmd "${2}" 0 0 0 "${1}"; }
acmd_set_fail() { __acmd "${2}" 0 1 0 "${1}"; }
acmd_testandset() { __acmd "${2}" 1 0 0 "${1}"; }
acmd_testandset_fail() { __acmd "${2}" 1 1 0 "${1}"; }
thecmd_set() { __acmd "${2}" 0 0 1 "${1}"; }
thecmd_set_fail() { __acmd "${2}" 0 1 1 "${1}"; }
thecmd_testandset() { __acmd "${2}" 1 0 1 "${1}"; }
thecmd_testandset_fail() { __acmd "${2}" 1 1 1 "${1}"; }
__acmd() {
   pname=${1} dotest=${2} dofail=${3} verbok=${4} varname=${5}

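   # Only consult the cache variable when the caller named one; with an empty
   # varname the eval would assign a literal "$" and always report success
   if [ "${dotest}" -ne 0 ] && [ -n "${varname}" ]; then
      eval dotest=\$${varname}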
      if [ -n "${dotest}" ]; then
         [ -n "${VERBOSE}" ] && [ ${verbok} -ne 0 ] &&
            msg ' . ${%s} ... %s' "${pname}" "${dotest}"
         return 0
      fi
   fi

   oifs=${IFS} IFS=:
   [ -n "${noglob_shell}" ] && set -o noglob
   set -- ${PATH}
   [ -n "${noglob_shell}" ] && set +o noglob
   IFS=${oifs}
   for path
   do
      if [ -z "${path}" ] || [ "${path}" = . ]; then
         if [ -d "${PWD}" ]; then
            path=${PWD}
         else
            path=.
         fi
      fi
      if [ -f "${path}/${pname}" ] && [ -x "${path}/${pname}" ]; then
         [ -n "${VERBOSE}" ] && [ ${verbok} -ne 0 ] &&
            msg ' . ${%s} ... %s' "${pname}" "${path}/${pname}"
         # Defer expansion to eval so that paths containing blanks survive
         [ -n "${varname}" ] && eval ${varname}=\"\${path}/\${pname}\"
         return 0
      fi
   done
   # We may have no builtin string functions, we yet have no programs we can
   # use, try to access once from the root, assuming it is an absolute path if
   # that finds the executable
   if ( cd / && [ -f "${pname}" ] && [ -x "${pname}" ] ); then
      [ -n "${VERBOSE}" ] && [ ${verbok} -ne 0 ] &&
         msg ' . ${%s} ... %s' "${pname}" "${pname}"
      [ -n "${varname}" ] && eval ${varname}=\"\${pname}\"
      return 0
   fi
   [ -n "${varname}" ] && eval ${varname}=

   [ ${dofail} -eq 0 ] && return 1
   msg 'ERROR: no trace of utility %s' "${pname}"
   exit 1
}
msg() {
   fmt=${1}
   shift
   printf >&2 -- "${fmt}\n" "${@}"
}
(0004229)
kre (reporter)
2019-01-25 02:58

One additional note:

In note 3852 Chet attempted to answer the "why does it
exist at all" issue with a pointer to the Rationale.

That tells us why it was added by POSIX ("Users liked
which/whence/type and demanded something like that")
but not what it is supposed to be useful for. Without
knowing its purpose, we cannot possibly write text
that says how it is expected to work.

If the purpose is to solve the problem:
User: "I said foo, but something happened that
I did not expect, the foo command I thought
would run did not, which I verified by ...."
then the proposed wording in note 4224 would not
be adequate: if the shell's first exec attempt
fails, it might go on to make another, which
succeeds. Telling the user the path name of
the first attempt (which might be the command
the user is expecting to be run) is then useless.
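
A small sketch of that situation, under stated assumptions: two PATH directories both hold a file named foo, and only the second copy is executable. The directory layout and names are invented for illustration, and mktemp -d is widely available but not specified by POSIX. What "command -v" reports is deliberately only printed, since it differs between shells.

```sh
# Demo setup: two PATH directories, both containing "foo".
tmp=$(mktemp -d) || exit 1
mkdir "$tmp/a" "$tmp/b"
printf '#!/bin/sh\necho first\n' > "$tmp/a/foo"     # no execute permission
printf '#!/bin/sh\necho second\n' > "$tmp/b/foo"
chmod +x "$tmp/b/foo"

old_path=$PATH
PATH=$tmp/a:$tmp/b:$PATH

reported=$(command -v foo)   # which pathname is reported varies by shell
ran=$(foo)                   # a/foo cannot be executed, so b/foo runs

PATH=$old_path
printf 'command -v reported: %s\n' "$reported"
printf 'actually ran:        %s\n' "$ran"
rm -rf "$tmp"
```

In a shell that reports the first regular file it finds, the two lines of output name different files, which is exactly the case where the reported pathname does not help the user.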

A better solution to that problem would be something
that indicates what pathname was last successfully
used for an exec, so the user could say "foo",
have it fail, then use the "what-happened-there"
command to find out what was just executed. Of
course, this would be pure invention, and so not
suitable to be defined here.

This is also an issue where only a human needs the
output, so command -V is a better choice than -v
anyway.


On the other hand, I have had need to determine
whether a command is built into the shell.
For this, command -v is useless, as it produces
the same output for functions and builtins,
and in the case in question, while there might
have been a function existing, that would not
have been relevant (the command in question was
too complex to be implemented entirely as a
function ... so such a function, if it wasn't
doing something else entirely different, would
need to either invoke a builtin, or a filesystem
version, to do the real work ... and which of
those would happen is what would need discovering).
I had to resort to type (or command -V) which
are essentially the same, and just hope that
shells would report builtin commands using something
I could guess at and match.
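
A rough sketch of that kind of guessing, assuming the shell's "command -V" output contains the word "builtin" for builtin utilities. The output format is unspecified, so this is a heuristic and not a portable test; the helper name is_builtin and the function demo_fn are made up for illustration.

```sh
# Hypothetical helper: guess from the (unspecified) "command -V" output
# whether NAME would resolve to a shell builtin.
is_builtin() {
    case $(command -V "$1" 2>/dev/null) in
    *builtin*) return 0 ;;
    *)         return 1 ;;
    esac
}

demo_fn() { :; }

# "command -v" prints only the name for both a builtin and a function.
printf '%s / %s\n' "$(command -v cd)" "$(command -v demo_fn)"

if is_builtin cd; then cd_kind=builtin; else cd_kind=other; fi
if is_builtin demo_fn; then fn_kind=builtin; else fn_kind=other; fi
printf 'cd: %s, demo_fn: %s\n' "$cd_kind" "$fn_kind"
```

The match is deliberately loose, so a pathname that happens to contain "builtin" would be misclassified, and a function of the same name hides the builtin from this check.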

There are likely many other potential uses, but until
we know what they are, we cannot know if the
specification is adequate to meet them.

Really nothing should have been added to meet
the apparent user demand without answering this
question first, which the rationale acknowledges
did not happen:
    Since there is no historical agreement on how
    and what to accomplish here, the POSIX command
    utility was enhanced ...
which is just saying "we don't know what we're doing,
or why, but we're going to do it anyway" but it is
too late to worry about that now.

But something needs to be done; as it stands, it is all
worse than useless. The proposed resolution does not
help.

- Issue History
Date Modified Username Field Change
2017-09-04 13:09 steffen New Issue
2017-09-04 13:09 steffen Name => steffen
2017-09-04 13:09 steffen Section => command
2017-09-04 13:09 steffen Page Number => 2596
2017-09-04 13:09 steffen Line Number => 84274 ff.
2017-09-04 15:49 kre Note Added: 0003821
2017-09-05 06:58 stephane Note Added: 0003823
2017-09-05 08:30 stephane Note Added: 0003824
2017-09-05 12:34 steffen Note Added: 0003825
2017-09-05 12:39 steffen Note Added: 0003826
2017-09-05 12:47 kre Note Added: 0003827
2017-09-05 13:37 steffen Note Added: 0003828
2017-09-05 15:37 chet_ramey Note Added: 0003829
2017-09-05 15:41 stephane Note Added: 0003830
2017-09-05 16:01 stephane Note Added: 0003831
2017-09-05 16:36 stephane Note Added: 0003832
2017-09-05 17:30 stephane Note Added: 0003833
2017-09-05 20:17 steffen Note Added: 0003834
2017-09-05 20:19 steffen Note Added: 0003835
2017-09-05 20:27 kre Note Added: 0003836
2017-09-05 20:29 chet_ramey Note Added: 0003837
2017-09-05 21:27 eblake Note Added: 0003838
2017-09-05 21:50 kre Note Added: 0003839
2017-09-05 21:52 kre Note Edited: 0003839
2017-09-05 22:21 eblake Note Added: 0003840
2017-09-05 22:23 Don Cragun Note Added: 0003841
2017-09-05 22:24 Don Cragun Note Deleted: 0003841
2017-09-06 00:00 kre Note Added: 0003842
2017-09-06 00:00 kre Note Deleted: 0003839
2017-09-06 08:30 stephane Note Added: 0003843
2017-09-06 08:32 stephane Note Edited: 0003843
2017-09-06 09:30 stephane Note Added: 0003844
2017-09-06 09:37 stephane Note Added: 0003845
2017-09-06 09:50 joerg Note Added: 0003846
2017-09-06 09:50 joerg Note Edited: 0003846
2017-09-06 09:50 joerg Note Edited: 0003846
2017-09-06 09:54 stephane Note Edited: 0003845
2017-09-06 12:55 steffen Note Added: 0003847
2017-09-06 13:41 stephane Note Added: 0003848
2017-09-07 17:31 kre Note Deleted: 0003842
2017-09-07 17:58 kre Note Added: 0003849
2017-09-07 18:29 chet_ramey Note Added: 0003850
2017-09-07 21:17 kre Note Added: 0003851
2017-09-07 21:18 kre Note Edited: 0003851
2017-09-09 19:10 chet_ramey Note Added: 0003852
2018-04-25 17:57 McDutchie Note Added: 0003997
2019-01-24 17:11 nick Note Added: 0004220
2019-01-24 17:11 nick Interp Status => ---
2019-01-24 17:11 nick Final Accepted Text => See Note: 0004220
2019-01-24 17:11 nick Status New => Resolved
2019-01-24 17:11 nick Resolution Open => Accepted As Marked
2019-01-24 17:12 nick Tag Attached: tc3-2008
2019-01-24 23:33 kre Note Added: 0004223
2019-01-25 00:05 kre Note Added: 0004224
2019-01-25 00:53 steffen Note Added: 0004225
2019-01-25 02:46 eblake Relationship added related to 0001226
2019-01-25 02:58 kre Note Added: 0004229
2019-11-07 09:39 geoffclare Status Resolved => Applied

