0001150: exit status of command substitution not properly specified

Notes
(0003766) kre (reporter) 2017-06-16 01:46	Oops, I should have tested my example, when I wrote ... set -e; printf %s $( exit 1; printf foo ) I (of course) really meant ... set -e; printf %s $( (exit 1); printf foo ) If the exit in there was explicit, rather than just an exit status, then the -e option setting would be irrelevant. Alternatively ... set -e; printf %s $( false; printf foo ) would have worked as the example.

(0003767) stephane (reporter) 2017-06-16 06:10	See also
    exit $(exit 1)
    exit < "file$(exit 1)"
in zsh and bash (I suppose the two "popular" shell implementations you're referring to).
Command-less redirections are also cases to consider:
    < file$(exit 4)
    < file$(exit 5) var=$(exit 6)
Which give different results depending on the shell.
There's also:
    echo < file$(exit 5) "$? $(exit 3)$? $(exit 4)$?"
And:
    var=$(exit 5) eval 'echo "$?"'
And:
    eval 'echo "$?"' < file$(exit 5)
And:
    { echo "$?"; } < file$(exit 5)
And:
    exec 3< file$(exit 5)

(0003768) joerg (reporter) 2017-06-16 09:38	The standard is obvious here: If return or exit are called without parameter, the return/exit status is the status of the last command executed. Given that with "return $(exit 1)", the last command executed was "exit 1", it seems to be obvious that the expected return value is 1. This is important as we like to have an orthogonal behavior and: sh -c 'FOO=$(exit 99); echo $?' is expected to print 99 while sh -c 'FOO=$(exit 99) :; echo $?' is expected to print 0

(0003769) joerg (reporter) 2017-06-16 09:54	Re: Note: 0003767 It would be nice if you did post some expected exit status values.... Following the behavior of existing shells may be a bad idea, as these shells may suffer from bugs that we don't like to standardize.

(0003770) kre (reporter) 2017-06-16 10:06	Re note 3768:
It is not nearly as obvious as you think. In "return $(exit 1)" the "exit 1" is executed in a subshell environment (that is a requirement), and:
    Changes made to the subshell environment shall not affect the shell environment.
That's from XCU 2.12. The exit status change in the subshell environment cannot directly affect the shell environment. The only way to make that happen would be to export the exit status of the command substitution, and whether or when that can happen was the subject of the bug report - it is just not stated, except for the one case in 2.9.1.
Most shells (incl ksh93) do the "obvious" thing and return the status of the last command that occurred before the return $(exit 1), leaving the exit status of the command substitution unused, as it is in just about every other situation. Making an exception for this case is exactly the opposite of achieving orthogonal behavior.
Your exit 99 examples I agree with, but also remember that
    sh -c ': $(exit 99); echo $?'
is expected to print 0.
And re note 3769, I believe that in all of Stephane's examples, the $(exit N) should simply be ignored (unless I missed one somewhere). They produce no output, and were not used in a context where the exit status of the command substitution should be extracted, so they just turn into meaningless syntactic noise.

(0003771) joerg (reporter) 2017-06-16 10:16 edited on: 2017-06-16 10:21	Re: Note: 0003770
The main shell waits for the sub-shell, and the exit status remembered by the main shell is the exit status returned via the wait() or waitid() call from the sub-shell. This behavior is responsible for the documented and expected result that
    sh -c 'FOO=$(exit 99); echo $?'
is expected to print 99 while
    sh -c 'FOO=$(exit 99) :; echo $?'
is expected to print 0.
If you would like to see a different behavior from the one I explained in Note: 0003768, extra code would need to be introduced into the shell in order to make it behave non-orthogonally.
BTW I wrote Note: 0003768 because I know that:
    sh -c ': $(exit 99); echo $?'
is expected to print 0.
As I prefer orthogonal behavior, I believe that requiring:
    $SHELL -c 'f() { return $( exit 1); } ; f; echo $?'
to print 0 would be a mistake.
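
The three cases that both sides of this exchange agree on can be checked with a small sketch like the following. The variable names r1, r2 and r3 are illustrative only; each case runs in its own command substitution so that one status cannot leak into the next.

```shell
# No command word: the assignment takes the status of the command substitution.
r1=$( FOO=$(exit 99); echo $? )

# Command word ":" is present: its status (0) is what $? reports.
r2=$( FOO=$(exit 99) :; echo $? )

# The substitution only supplies (empty) arguments to ":"; its status is unused.
r3=$( : $(exit 99); echo $? )

printf '%s %s %s\n' "$r1" "$r2" "$r3"
```

The disputed case, return $(exit 1), is deliberately left out because shells differ on it.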

(0003772) kre (reporter) 2017-06-16 11:07	Re note 3767, First, thanks for the additional cases to consider. In this example < file$(exit 4) (assuming "file" exists and is readable) I think the wording of 2.9.1 is clear enough, and the exit status from this is 4. And I think just about all shells (that pretend to be posix, or close) implement that. For this one < file$(exit 5) var=$(exit 6) we have a similar situation (no command word) so the exit status of the last command substitution applies. But here we are explicitly allowed to perform steps 3 and 4 (redirections and var assignments, resp) in whichever order we prefer, so the shell can set the exit status to 5 or 6 for this one. In all the other cases, where there is a command word, I would expect the command substitutions, which produce no output, to have no effect whatever. Their exit status should simply be ignored.

(0003773) stephane (reporter) 2017-06-16 15:38	Re: Note: 0003772 I wouldn't say it's that clear cut. Not many people would object that a=$(exit 4) b=$? may (if not should) assign 4 to $b (most do, dash and pdksh don't). With that in mind, it's hard to justify that a=$(exit 4)$? may not. Even if I agree it's not desirable. It's very unlikely anyone may want to do something like the above and expect $? to be the exit status of the previous cmdsubst. Things like this would be more likely: msg="$(translate 'last exit status'): $?" where you don't want $? to be the exit status of that $(translate). The question is if we mandate one way or the other, is it going to break existing scripts (which were probably not very portable anyway) which make the other assumption? Most of those are corner cases. We could leave it unspecified but raise awareness and give ways to avoid the problem. Like the above can be written portably: a=$(exit 4); b=$? # or b=$?; a=$(exit 4) # if you want $? to be the exit status of the last command ret=$? msg="$(translate 'last exit status'): $ret"

(0003774) joerg (reporter) 2017-06-16 15:52 edited on: 2017-06-16 15:59	Re: Note: 0003773
It seems that many shells are broken here... Does anybody disagree that:
    $SHELL -c 'f() { (exit 3); return $( exit 1); } ; f; echo $?'
should print "3"?
The reason for the diverging behavior is the places where the shells implement checkpoints to save the current value of "exitcode" for the next expansion of $?. The Bourne Shell and its descendants have such a checkpoint at the end of the interpreter function, and this function is called from the command substitution. ksh88 removed one such checkpoint after the wait() for the $(cmd), but left the checkpoint in the interpreter function "execute()".
I could make it print "3" in bosh after I removed the checkpoint past the wait() from $(cmd) and after I restored the old saved value of the exit code after calling "execute()" for $(cmd).

(0003775) stephane (reporter) 2017-06-16 16:05	Re: Note: 0003774 Here, that's one case where it doesn't really matter. I agree 3 is the most sensible outcome. Like in $SHELL -c 'f() { (exit 3); a=$(exit 1) return; } ; f; echo $?' (ksh93 returns 0 which clearly is a bug) But that doesn't really matter because nobody is going to pass a cmdsubst that expands to nothing to exit or return or call return with a preceding assignment.

(0003776) kre (reporter) 2017-06-16 16:19	Re note 3773:
    | Not many people would object that
    | a=$(exit 4) b=$?
    | may (if not should) assign 4 to $b
I would not object to may, but would to should. In
    a=a; a=x b=$a
it is explicitly unspecified whether b=a or b=x at the conclusion. It would be astounding if there was to be a stricter requirement for $?.
ash based shells (not just dash) only set $? after this whole simple command has completed, then set it to 4 (the result of the last cmdsub in a command with no command word, as required).
Exactly when $? is required to be updated is perhaps something else that needs better specification; this relates to the "checkpoints" that Joerg mentions in note 3774.
I doubt much, if anything, will be broken, whatever ends up being specified here. Until the issue appeared during an academic discussion on a related topic a day or two ago, I had no idea that shells behaved differently. If any scripts were being affected by that, I think we would have heard about it before now.

(0003777) shware_systems (reporter) 2017-06-16 16:33	Re: "There is what is said in 2.9.1 about a=$(command) which uses the status of command as the status of the assignment, when there is no command word, but that is the only case that is explicit." This can be construed that assignments and redirections are not included, and command as used there is only the words that are what should be the command name and any arguments. Example: a="" $(does-not-write-a-command-to-stdout-but-exits-5) argish1 would have the 5 as new value for $?. If the command name does not name a function, built-in, or file on the path, the result from the command substitution would override the 127 return of "command not found". I'm not saying this is the intent, just that it can be read that way and lead to some of the differences discussed.

(0003778) kre (reporter) 2017-06-16 16:35	And referring to 3773 again, I would say it is that clear cut (what was in 3772) - you changed the goalposts. That is, in a=$(exit 4) b=$? what gets assigned to b may be debatable, what is not however is what gets printed from a=$(exit 4) b=$?; echo $? which must be 4 (at least if you believe section 2.9.1's requirement on the exit status of a simple command with no command word.) That was the point of note 3772. This is the one clear place where the exit status of a command substitution is used. It might be the only one.
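
A minimal sketch of the distinction drawn in this note, with illustrative variable names: the status of the whole assignment command is fixed by 2.9.1, while the value that ends up in b depends on when the shell updates $?.

```shell
# Status of the assignment command itself: required to be 4 by 2.9.1.
st=$( a=$(exit 4) b=$?; echo $? )

# Value assigned to b: shells differ. "true" sets the previous status to 0,
# so b is 0 where $? is only updated after the whole command completes,
# and 4 where it is updated after each command substitution.
b_val=$( true; a=$(exit 4) b=$?; echo "$b" )

printf 'status=%s b=%s\n' "$st" "$b_val"
```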

(0003779) kre (reporter) 2017-06-16 16:40 edited on: 2017-06-16 16:43	Re 3777, no it can't be read that way - if there is a command word at all then 2.9.1 does not apply, and the exit status is that that comes from executing the command word - even if that was originally planned on being an arg to a command which failed to be produced, and so ends up producing a 127 exit code. If there was no "argish1" in your example, then the exit code would be 5. And when I say 2.9.1 here I am of course referring to just the one sentence in that long section that relates to using the exit status of the last command substitution performed when there is no command word. All the rest of it still applies, obviously (I hope.)

(0003780) joerg (reporter) 2017-06-16 16:45 edited on: 2017-06-16 16:45	What gets assigned with a=$(exit 4) b=$? may depend on whether the macro expansion is done for all assignments first or step by step and whether/where is a checkpoint to save the exit code for $? expansion. You are right for a=$(exit 4) b=$?; echo $?

(0003781) shware_systems (reporter) 2017-06-17 07:51	Re: 3779 I agree the intent is along those lines, but isn't explicit enough to rule out the case presented. It's little different from 'a=echo; $a "Output TexT";', which I'd expect to write to stdout. The substitution results get evaluated as the command name that affects $?, whether arguments present or not. An implementation might limit 127 to case where './echox "Text";' not found on the path, for the simpler usage not involving command substitution, to resolve the conflicting requirements.

(0003782) kre (reporter) 2017-06-17 08:31	Re 3781...
It is actually quite explicit; re-read 2.9.1 (steps 1-4 in the prelude, and what immediately follows) and you will see it (should see it).
The shell takes the (unexpanded) command line, removes any words that are var-assigns (start with unquoted name= and precede any non-redirect word which does not start that way) and any redirect words (contain an unquoted redirect operator), then applies the expansion rules to all the words that are left. The result of this can have some words vanish (expand to nothing) and new words appear (file name expansion, field splitting).
Once that is complete we look at what resulted. If there are any words, it is some kind of command (function, built in, whatever), and is processed that way (and its exit status comes from the command - which can include the 126 and 127 exit codes from the shell attempting to execute the command).
If there are no words remaining (either there never were any, or they all vanished) then the rules for processing commandless redirections and variable expansions apply. In this case (and as stated at least, only in this case) does the exit status of the last executed command substitution (when the redirects and var-assigns are expanded, or possibly from the earlier word expansions) get used to set $? (the exit status of the empty command).
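
The procedure described above can be illustrated with a short sketch. The variable names are illustrative, and no_such_command_xyz_1150 is a hypothetical name assumed not to exist on PATH.

```shell
# All words vanish after expansion: the status comes from the command substitution.
s_empty=$( $(exit 5); echo $? )

# Same situation with a variable assignment present.
s_assign=$( a="" $(exit 5); echo $? )

# One word survives expansion, so it becomes the command name. Because the
# (hypothetical) command cannot be found, the status is 127, not 5.
s_word=$( a="" $(exit 5) no_such_command_xyz_1150 2>/dev/null; echo $? )

printf '%s %s %s\n' "$s_empty" "$s_assign" "$s_word"
```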

(0003936) stephane (reporter) 2018-03-09 22:55	Another case where we see diverging behaviours: for i in $(exit 4); do not run done echo "$?" Or case $(exit 4) in (x) ;; esac echo "$?" Most shells output 0, ksh93 outputs 4 (doesn't exit with errexit). In case x in ($(exit 4));; esac; echo "$?" zsh and ksh93 output 4. With errexit, zsh exits, but not ksh93

(0004176) kre (reporter) 2018-12-11 10:09	There is one more issue that ought to be resolved with all of this, as it is all related.
In XCU section 2.9.1 (page 2366, lines 75541-3) we see:
    If there is no command name, but the command contained a command substitution, the command shall complete with the exit status of the last command substitution performed.
This is the one case where the use of the exit status of a cmd sub is stated explicitly - but even this is not really properly defined. Which is "the last" command substitution performed? This turns on just what "performed" means for this, I think. And I do not know the answer in general.
It is easy in the simple cases:
    a=$(cmd1) b=$(cmd2)
it is the exit status of cmd2, as those are eval'd left to right. Same in:
    >$(echo file1; exit 1) 2>$(echo file2; exit 2)
it is 2, as those are also eval'd L to R. Even in
    a=$(cmd) >$(echo file; exit 17)
where it can be either the exit status of cmd, or 17, as it depends upon which order the shell decides to eval var-assigns and cmd-subs when there is no command word (which can vary according to lines 75498-500), so even though a user cannot rely upon which it will be, the status that should be returned by any particular shell is clear.
But what about:
    a=$(cmd $(exit 3))
Which command substitution is performed last? Is it the one started last, or the one that finished last? In that, cmd is run with no args, its stdout is assigned to a, and the exit status is???
It seems obvious to say the exit status of cmd, as that one finishes last, and one might say that the command substitution has not been performed until it has completed, but what if cmd is $(exit 1), so what we have is:
    a=$( $(exit 1) $(exit 3) )
Which one is now the command substitution that is performed last? Is it still "cmd"?
Note here that we are talking about the exit status of the command which is the empty command word with the variable assignment to a, not the exit status of the command substitution performed in order to discover the value to be assigned - except if that one is the one picked as the "last performed".
I have not tested what the various shells do with this (not even the NetBSD shell) - I really do not care. Nor does anyone else. However, the standard should be complete - and not leave things as "anyone's guess".
What the shells currently do with any of these is, I would submit, irrelevant - they can all be changed if needed (nothing is going to break because of that as no-one in any real script is insane enough to actually use any of these constructs, and if there is one, they deserve whatever misfortune lands upon them). If they were all the same that might be useful, but we know that is not the case.
So, what should be done is to define this (all of this in this bug report) in the way that makes most logical sense, which also should be the way that is easiest to fully specify.

(0004177) joerg (reporter) 2018-12-11 12:49 edited on: 2018-12-11 12:50	While newer shells seem to agree that
    a=1 b=2
is evaluated from left to right, note that the original Bourne Shell does it from right to left and I am not aware of a text in POSIX that requires a specific order.
    a=$( $(exit 1) $(exit 3) )
behaves differently from
    a=$(ls $(exit 1) $(exit 3) )
and in this area, most shells (except zsh) seem to be aligned.
There is a nasty list of things that I believe should be defined in POSIX, e.g.:
- the order of evaluation for a=1 b=2
- whether the command echo foo | read var should put foo in $var
and similar things. The problem is that changing the behavior for exit or shell -ce should take less than a day if the maintainer knows his shell, but changing the behavior for echo foo | read var may take more than a month.

(0004178) chet_ramey (reporter) 2018-12-11 14:52	Re: 4177 POSIX already uses "beginning to end" where that makes a difference: the order in which the shell expands words, and the order of processing words in a simple command, to name two. Whether or not the last element of a pipeline should run in the parent shell context is already covered (and made optional) in 2.12. Changing the behavior of `echo foo | read var' to run the last element in the parent's context in a shell where job control is enabled and you have to support suspending the processes is a nasty job, and I, for one, am not interested at this time.

(0004179) chet_ramey (reporter) 2018-12-11 15:38	Re: 4176 Given that command substitutions are performed "beginning to end," as are all the word expansions, and they are intended to be synchronous, isn't the last one started the last one finished? Maybe the thing to do is to make the synchronous nature of command substitution explicit. I couldn't find anything that says the shell waits for the subshell environment to finish before moving onto the next expansion, but everyone seems to do it that way.

(0004180) kre (reporter) 2018-12-11 16:25	Re 4179 In
    a=$(cmd $(exit 1) )
the outer command substitution has to start first (or the inner one is never expanded) yet the outer one cannot finish until the inner one is finished. It may seem obvious that the inner one cannot provide the exit status of the outer var-assign, but there's nothing that actually says that. All we have is that the last one performed is the one that provides the status. What the answer should be I am taking no position on at all, just that all of this needs to be cleaned up, one way or another.

(0004181) chet_ramey (reporter) 2018-12-11 16:33	Re: 4180 But they're not performed by the same execution environment. From the perspective of the assignment statement, there is only one command substitution. Maybe that's the assumption that needs to be cleaned up.
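
The nested case under discussion can be tried with a sketch such as the one below, using true as a stand-in for "cmd" (an assumption made here for illustration). In shells where only the outer substitution counts from the assignment's point of view, the first result is the status of true. The second case has no command word inside the outer substitution either, so its result is printed without any claim about what it should be.

```shell
# Outer substitution runs "true" with no arguments; the inner $(exit 3)
# only contributes (empty) arguments to it.
outer=$( a=$(true $(exit 3)); echo $? )

# Inner command is itself an empty command made of two substitutions;
# shells are reported in this thread to differ here, so no value is assumed.
both=$( a=$( $(exit 1) $(exit 3) ); echo $? )

printf 'outer=%s both=%s\n' "$outer" "$both"
```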

(0004184) geoffclare (manager) 2018-12-13 16:24	Interpretation response
------------------------
The standard is unclear on this issue, and no conformance distinction can be made between alternative implementations based on this. This is being referred to the sponsor.
Rationale:
-------------
None.
Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
On page 2350 line 74877 section 2.5.2, change:
    Expands to the decimal exit status of the most recent pipeline (see [xref to 2.9.2]).
to:
    Expands to the decimal exit status of the most recent pipeline (see [xref to 2.9.2]) that was not within a command substitution (see [xref to 2.6.3]).
    Note: In <tt>var=$(some_command); echo $?</tt> the output is the exit status of <tt>some_command</tt> but this is because its exit status becomes the exit status of the assignment command <tt>var=$(some_command)</tt> (see [xref to 2.9.1]) and this assignment command is the most recent pipeline.
On page 2366 line 75543 section 2.9.1, change:
    with the exit status of the last command substitution performed
to:
    with the exit status of the command substitution whose exit status was the last to be obtained
On page 2410 line 77132 section 2.14 set, change:
    The failure of any individual command in a multi-command pipeline shall not cause the shell to exit. Only the failure of the pipeline itself shall be considered.
to:
    The failure of any individual command in a multi-command pipeline, or of any subshell environments in which command substitution was performed during word expansion, shall not cause the shell to exit. Only the failure of the pipeline itself shall be considered.
On page 2410 line 77145 section 2.14 set, add:
    In
        set -e; echo $(false; echo one) two
    the false command causes the subshell in which the command substitution is performed to exit without executing <tt>echo one</tt>; the exit status of the subshell is ignored and the shell then executes the word-expanded command <tt>echo two</tt>.
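
The set -e example from the added text can be exercised as follows; this is only a sketch, and the variable name out is illustrative. The part that holds regardless of shell is that the parent continues and runs echo with "two" as its final argument. Whether "one" also appears depends on whether the shell propagates -e into the command substitution: bash outside POSIX mode does not, unless its inherit_errexit option is set.

```shell
# Run in a command substitution so that set -e does not affect the caller.
out=$( set -e; echo $(false; echo one) two )

# Prints "two" where -e reaches the inner subshell, "one two" otherwise.
printf '%s\n' "$out"
```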

(0004199) agadmin (administrator) 2019-01-08 14:40	Interpretation proposed: 8 Jan 2019

(0004248) agadmin (administrator) 2019-02-11 17:28	Interpretation approved: 11 Feb 2019

Issue History
Date Modified	Username	Field	Change
2017-06-16 01:33	kre	New Issue
2017-06-16 01:33	kre	Name	=> Robert Elz
2017-06-16 01:33	kre	Section	=> 2.6.3
2017-06-16 01:33	kre	Page Number	=> 2357 - 2358
2017-06-16 01:33	kre	Line Number	=> 75182 - 75224
2017-06-16 01:46	kre	Note Added: 0003766
2017-06-16 06:10	stephane	Note Added: 0003767
2017-06-16 09:38	joerg	Note Added: 0003768
2017-06-16 09:54	joerg	Note Added: 0003769
2017-06-16 10:06	kre	Note Added: 0003770
2017-06-16 10:16	joerg	Note Added: 0003771
2017-06-16 10:21	joerg	Note Edited: 0003771
2017-06-16 11:07	kre	Note Added: 0003772
2017-06-16 15:38	stephane	Note Added: 0003773
2017-06-16 15:52	joerg	Note Added: 0003774
2017-06-16 15:59	joerg	Note Edited: 0003774
2017-06-16 16:05	stephane	Note Added: 0003775
2017-06-16 16:19	kre	Note Added: 0003776
2017-06-16 16:33	shware_systems	Note Added: 0003777
2017-06-16 16:35	kre	Note Added: 0003778
2017-06-16 16:40	kre	Note Added: 0003779
2017-06-16 16:43	kre	Note Edited: 0003779
2017-06-16 16:45	joerg	Note Added: 0003780
2017-06-16 16:45	joerg	Note Edited: 0003780
2017-06-17 07:51	shware_systems	Note Added: 0003781
2017-06-17 08:31	kre	Note Added: 0003782
2018-03-09 22:55	stephane	Note Added: 0003936
2018-12-11 10:09	kre	Note Added: 0004176
2018-12-11 12:49	joerg	Note Added: 0004177
2018-12-11 12:50	joerg	Note Edited: 0004177
2018-12-11 14:52	chet_ramey	Note Added: 0004178
2018-12-11 15:38	chet_ramey	Note Added: 0004179
2018-12-11 16:25	kre	Note Added: 0004180
2018-12-11 16:33	chet_ramey	Note Added: 0004181
2018-12-13 16:24	geoffclare	Note Added: 0004184
2018-12-13 16:25	geoffclare	Interp Status	=> Pending
2018-12-13 16:25	geoffclare	Final Accepted Text	=> Note: 0004184
2018-12-13 16:25	geoffclare	Status	New => Interpretation Required
2018-12-13 16:25	geoffclare	Resolution	Open => Accepted As Marked
2018-12-13 16:26	geoffclare	Tag Attached: tc3-2008
2019-01-08 14:40	agadmin	Interp Status	Pending => Proposed
2019-01-08 14:40	agadmin	Note Added: 0004199
2019-02-11 17:28	agadmin	Interp Status	Proposed => Approved
2019-02-11 17:28	agadmin	Note Added: 0004248
2019-11-07 09:21	geoffclare	Status	Interpretation Required => Applied
2020-01-20 14:58	geoffclare	Relationship added	related to 0001309
2020-01-20 15:03	geoffclare	Relationship added	related to 0000051
