View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000842 | 1003.1(2013)/Issue7+TC1 | Shell and Utilities | public | 2014-06-02 05:55 | 2019-06-10 08:54 |
Reporter | rhansen | Assigned To | |||
Priority | normal | Severity | Objection | Type | Omission |
Status | Closed | Resolution | Accepted As Marked | ||
Name | Richard Hansen | ||||
Organization | BBN | ||||
User Reference | |||||
Section | XCU 2.14 break, continue | ||||
Page Number | 2358, 2362 | ||||
Line Number | 75117-75121, 75244-75250 | ||||
Interp Status | --- | ||||
Final Accepted Text | See 0000842:0002257. | ||||
Summary | 0000842: meaning of "enclosing loop" with break/continue unclear | ||||
Description | The results of running 'break' and 'continue' are specified in terms of the enclosing for, while, or until loop(s). However, it is unclear what "enclosing" means. For example, how should the following script behave:

    foo() {
        for i in 1 2; do
            echo " running bar ($i)..."
            bar
            echo " bar returned $?"
        done
    }
    bar() {
        for j in 1 2; do
            do_break() {
                echo " breaking..."
                false
                break 2
                echo " break returned $?"
            }
            echo " running do_break ($j)..."
            do_break
            echo " do_break returned $?"
        done
    }
    echo "running foo..."
    foo
    echo "foo returned $?"

Should the 'break' command do nothing because it is not a command in a compound list associated with a loop? Should it only break out of the inner loop because it is syntactically "inside" the inner loop but not the outer loop? Should it break out of the outer loop because both loops were executing at the time break was run?

One could interpret "enclosing" as implying lexical/static scope: a break/continue command must be one of the commands in the compound list associated with the loop in order for the loop to qualify as enclosing the command. Alternatively, one could interpret "enclosing" as implying dynamic scope: A break/continue command is enclosed by a loop if the loop is executing when the command is executed.

The following is a list of how the 'break' in the above script behaves in some existing implementations:

* bash (POSIX mode), zsh (sh emulation mode), dash, and NetBSD's /bin/sh: breaks out of the outer loop
* ksh93: does nothing (no error message, exit status is 0)
* mksh: prints an error message but otherwise does nothing (exit status is 0)

Scripts sourced with the dot command lead to similar questions. Here's an example script:

    cat <<\EOF >/tmp/foo
    for i in 1 2; do
        echo " running bar ($i)..."
        . /tmp/bar
        echo " bar returned $?"
    done
    EOF
    cat <<\EOF >/tmp/bar
    for j in 1 2; do
        echo " running do_break ($j)..."
        . /tmp/do_break
        echo " do_break returned $?"
    done
    EOF
    cat <<\EOF >/tmp/do_break
    echo " breaking..."
    false
    break 2
    echo " break returned $?"
    EOF
    echo "running foo..."
    . /tmp/foo
    echo "foo returned $?"

With the above script, the 'break' behaves as follows:

* ksh93, bash (POSIX mode) and now NetBSD's /bin/sh [1]: breaks out of the outer loop
* zsh (sh emulation mode): errors out
* dash: acts like 'return 0'
* mksh: prints an error message but otherwise does nothing (exit status is 0)

[1] http://thread.gmane.org/gmane.os.netbsd.bugs/70663

Also, the behavior when break or continue is run outside of a loop is unclear. The specification for break without arguments says "shall exit from the smallest enclosing [...] loop, if any", but then goes on to talk about breaking out of the outermost enclosing loop (because the default is equivalent to n=1, and 1 is greater than the number of enclosing loops). The specification for continue doesn't have the "if any" qualifier. | ||||
Desired Action | (I will post proposed wording changes as a bug note later. Given the implementation diversity, I'm guessing we'll want to go with "unspecified" for Issue 7 TC2. For Issue 8, I don't know if lexical or dynamic scope is preferred.) | ||||
Tags | tc2-2008 |
parent of | 0001058 | Closed | is it an error to call "break [n]" (or continue) when not in a loop? |
|
Dynamic scope for break/continue can be thought of as an extension to lexical scope: Any shell script written assuming lexical scoping will behave as intended if the shell implementation actually uses dynamic scoping for break/continue. Thus, I think it will be sufficient to simply specify lexical scoping for Issue 7 TC2 and leave dynamic scoping as an extension to the standard. EDIT: The above statement is incorrect if a script uses 'break 1000' with the intention of breaking out of the outermost lexically enclosing loop and there is a non-lexically enclosing loop in progress. Thus, dynamic scoping isn't a pure extension to lexical scoping when n is greater than the number of lexically enclosing loops. |
|
At page 2358 lines 75117-75121 (XCU 2.14 break description), change:

The break utility shall exit from the smallest enclosing for, while, or until loop, if any; or from the nth enclosing loop if n is specified. The value of n is an unsigned decimal integer greater than or equal to 1. The default shall be equivalent to n=1. If n is greater than the number of enclosing loops, the outermost enclosing loop shall be exited. Execution shall continue with the command immediately following the loop.

to:

If n is specified, the break utility shall exit from the nth enclosing for, while, or until loop. If n is unspecified, break shall behave as if n was specified as 1. Execution shall continue with the command immediately following the exited loop. The value of n is a positive decimal integer. If n is greater than the number of enclosing loops, the outermost enclosing loop shall be exited. If there is no enclosing loop, the behavior is unspecified.

After page 2359 line 75155 (XCU 2.14 break examples), insert:

The results of running the following example are unspecified: There are two loops in progress when the break command is executed, and they are in the same execution environment, but neither loop is lexically enclosing the break command. (There are no loops lexically enclosing the continue commands, either.)

    foo() {
        for j in 1 2; do
            echo 'break 2' >/tmp/do_break
            echo " sourcing /tmp/do_break ($j)..."
            # the behavior of the break from running the following command
            # results in unspecified behavior:
            . /tmp/do_break

            do_continue() { continue 2; }
            echo " running do_continue ($j)..."
            # the behavior of the continue in the following function call
            # results in unspecified behavior (if execution reaches this
            # point):
            do_continue

            trap 'continue 2' USR1
            echo " sending SIGUSR1 to self ($j)..."
            # the behavior of the continue in the trap invoked from the
            # following signal results in unspecified behavior (if
            # execution reaches this point):
            kill -USR1 $$
            sleep 1
        done
    }
    for i in 1 2; do
        echo "running foo ($i)..."
        foo
    done

At page 2362 lines 75244-75250 (XCU 2.14 continue description), change:

The continue utility shall return to the top of the smallest enclosing for, while, or until loop, or to the top of the nth enclosing loop, if n is specified. This involves repeating the condition list of a while or until loop or performing the next assignment of a for loop, and re-executing the loop if appropriate.

to:

If n is specified, the continue utility shall return to the top of the nth enclosing for, while, or until loop. If n is unspecified, continue shall behave as if n was specified as 1. Returning to the top of the loop involves repeating the condition list of a while or until loop or performing the next assignment of a for loop, and re-executing the loop if appropriate. The value of n is a positive decimal integer. If n is greater than the number of enclosing loops, the outermost enclosing loop shall be used. If there is no enclosing loop, the behavior is unspecified. |
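For contrast with the unspecified example above, here is a minimal sketch of the cases the proposed wording does specify, where every loop lexically encloses the break or continue. The function names (break_demo, continue_demo, big_n_demo) are hypothetical and exist only for this illustration. The last function uses an n larger than the number of enclosing loops; as noted elsewhere in this report, some shells may additionally print a diagnostic in that case.

```shell
# Hypothetical helper names; in each function both loops lexically
# enclose the break/continue command.

break_demo() {
    for i in 1 2; do
        for j in a b; do
            echo "$i$j"
            break 2          # exits the inner and the outer loop
        done
        echo "not reached"
    done
    echo "after loops"
}

continue_demo() {
    for i in 1 2; do
        for j in a b; do
            echo "$i$j"
            continue 2       # resumes the outer loop with its next value
        done
        echo "not reached"
    done
    echo "after loops"
}

big_n_demo() {
    for i in 1 2; do
        for j in a b; do
            echo "$i$j"
            break 5          # n exceeds the two enclosing loops
        done
    done
    echo "after loops"
}

break_demo
continue_demo
big_n_demo
```

In all three functions the result should be the same under lexical and dynamic scoping, because no loop is in progress other than the ones written around the command.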
|
Here are two more strange cases:

A break or continue could be in a trap handler:

    sh -c 'trap break USR1; for i in a b; do kill -USR1 $$; echo $i; done'

Most shells (except mksh) exit from the loop in this example. This can be used to good effect, although something similar can be done by setting a variable in the trap handler and checking it in the mainline code.

A break could be in a subshell nested in a loop:

    sh -c 'for i in a b; do (break; echo $i) done'

Most shells (except mksh) exit from the subshell in this example. In FreeBSD sh, this is because the shell attempts to break from the loop, but the attempt stops at the subshell.

I think the dynamic scope is most compatible with existing shells and scripts. |
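The variable-based alternative mentioned in the note above might look like the following sketch. The flag name stop and the wrapper function flag_demo are hypothetical, chosen only for this illustration. The loop runs under sh -c, as in the note's own examples, so that $$ refers to the shell that installed the trap. Because the break here is lexically inside the loop, the sketch does not depend on how a shell scopes break in a trap action.

```shell
# Hypothetical sketch: the trap action only sets a flag, and the loop
# body checks the flag and runs a lexically enclosed break.
flag_demo() {
    sh -c '
        stop=0
        trap "stop=1" USR1
        for i in a b c; do
            echo "$i"
            # Send the signal to this shell on the second iteration.
            if [ "$i" = b ]; then kill -USR1 $$; fi
            # Mainline check of the flag set by the trap action.
            if [ "$stop" -eq 1 ]; then break; fi
        done
        echo "loop finished"
    '
}

flag_demo
```

This assumes the pending trap action runs before the flag check that follows the kill command, which is what one would expect when kill is a shell built-in.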
|
Thank you for the additional strange cases!

> A break or continue could be in a trap handler:

Because no loop lexically encloses the break in the trap, the proposed wording would cause this to fall under unspecified behavior. However, the trap is executed in the same environment as the loop, so if the shell did dynamic scope and the trap was invoked while executing the loop then it should work as expected.

> A break could be in a subshell nested in a loop:

The proposed wording makes this unspecified behavior because the subshell puts the break in a different shell execution environment.

> I think the dynamic scope is most compatible with existing shells and scripts.

Agreed, although there are some implementations that don't do dynamic scoping. Also, I agree with what Geoff said on the mailing list:

The current situation has persisted for at least 20 years, so I don't have a problem with it continuing. I imagine that almost all shell script authors just naturally use break and continue in a way that works the same with either scope, without being aware of the issue at all. |
|
Two other cases to consider:

    for i in 1 2; do
        echo "$i"
        eval break
    done

mksh and other pdksh-derived shells choke on that. yash, ksh88, ksh93, zsh, bash, and ash-based shells don't.

    for i in 1 2; do
        echo "$i"
        command break
    done

As a non-special built-in, "command" is required to run in a separate environment. That would be a bug in the spec, as "command" as specified clearly needs to run in the current environment. That should be addressed in a different bug, but once that's fixed, a note about "command break" being valid (or not, as implementations are allowed to implement "command" as a function) should probably be added here as well.

Also, it would help to clarify whether "break n" or "continue n" where n is greater than the number of enclosing loops should be considered an error or not (and the consequence on stderr and exit status). Some shells (yash, mksh) do report an error. bash and zsh do so only if the number of enclosing loops is 0. yash and zsh return a non-zero exit status when giving an error; others don't. |
|
Regarding implementations that choke on 'eval break', I believe that is clearly a bug in those implementations (they do not implement 'eval' properly).

Regarding 'command break', I believe the standard is clear that it should behave just like 'break' except that an error in 'break' shall not cause the shell to abort, and variable assignments before 'command break' do not remain in effect after 'command break' completes.

Regarding 'command' needing to run in the current environment: This is something that I have always found to be a bit confusing. Note that the standard does support non-special built-ins affecting the current execution environment; section 2.12 says:

The environment of the shell process shall not be changed by the utility unless explicitly specified by the utility description (for example, cd and umask).

In the past when we've discussed utilities (that are not special built-ins) that affect or depend on the current execution environment I believe the argument was that they could run in a separate execution environment but somehow (in an implementation-defined manner) communicate with the invoking shell to achieve their effects or acquire the relevant information. Usually this communication is trivial because those utilities are implemented as (regular) built-ins.

If 'command' is implemented as a function, the standard requires it to behave as if it was not implemented as a function.

It may be worthwhile to add a clarifying (non-normative) note saying that 'command break n' works like 'break n' except it doesn't abort the shell on errors or preserve in-line variable assignments.

I agree that we should clarify whether it is an error if 'n' is greater than the number of enclosing loops. Would you mind filing a new bug report about that? |
|
> Regarding implementations that choke on 'eval break', I
> believe that is clearly a bug in those implementations (they
> do not implement 'eval' properly).

"eval" is the command that causes the shell to interpret the code made of the concatenation of the arguments. "." is the command that causes the shell to interpret the code in the file given as argument. Those are two cases of a new instance of the parser being started to invoke some code.

I don't object to requiring "eval break" to break out of a loop, and am even in favour of it, but if we're to make the behaviour unspecified for "." (even though all shells seem to concur on that one), I think we should make it explicit in the spec that "eval" is OK.

[...]

> Regarding 'command' needing to run in the current environment:
> This is something that I have always found to be a bit
> confusing. Note that the standard does support non-special
> built-ins affecting the current execution environment; section
> 2.12 says:
>
> The environment of the shell process shall not be changed
> by the utility unless explicitly specified by the utility
> description (for example, cd and umask).
>
> In the past when we've discussed utilities (that are not
> special built-ins) that affect or depend on the current
> execution environment I believe the argument was that they
> could run in a separate execution environment but somehow (in
> an implementation-defined manner) communicate with the
> invoking shell to achieve their effects or acquire the
> relevant information. Usually this communication is trivial
> because those utilities are implemented as (regular)
> built-ins.

[...]

So, for instance in:

    command eval '
      a=test
      sleep 1
      kill "$$"'

"command" would be spawned in a new process. It would ask the shell via some implementation-defined IPC mechanism to run that "eval" command. So the shell would run eval while at the same time waiting for "command". And when finished running eval it would communicate the exit status of eval back to "command", which would then exit with it.

What should happen if the process running "command" is killed? Or, as in this example, if the process running the shell is killed? In an interactive shell, would command and eval (and sleep) run in the same process group?

Wouldn't it be simpler to say that "command" runs in the current shell environment *and in the same process* as a built-in, as there's not really any other sane way to implement it?

> If 'command' is implemented as a function, the standard
> requires it to behave as if it was not implemented as a
> function.

Even if the application redefines any of the utilities called by that function as functions?

> It may be worthwhile to add a clarifying (non-normative) note
> saying that 'command break n' works like 'break n' except it
> doesn't abort the shell on errors or preserve in-line variable
> assignments.

It wouldn't harm indeed.

> I agree that we should clarify whether it is an error if 'n'
> is greater than the number of enclosing loops. Would you mind
> filing a new bug report about that?

I'll do that. |
Date Modified | Username | Field | Change |
---|---|---|---|
2014-06-02 05:55 | rhansen | New Issue | |
2014-06-02 05:55 | rhansen | Name | => Richard Hansen |
2014-06-02 05:55 | rhansen | Organization | => BBN |
2014-06-02 05:55 | rhansen | Section | => XCU 2.14 break, continue |
2014-06-02 05:55 | rhansen | Page Number | => 2358, 2362 |
2014-06-02 05:55 | rhansen | Line Number | => 75117-75121, 75244-75250 |
2014-06-02 20:42 | rhansen | Note Added: 0002256 | |
2014-06-02 20:43 | rhansen | Note Added: 0002257 | |
2014-06-03 22:23 | jilles | Note Added: 0002258 | |
2014-06-05 03:49 | rhansen | Note Edited: 0002256 | |
2014-06-05 03:58 | rhansen | Note Edited: 0002256 | |
2014-06-05 05:43 | rhansen | Note Edited: 0002257 | |
2014-06-05 05:59 | rhansen | Note Edited: 0002257 | |
2014-06-05 06:01 | rhansen | Note Edited: 0002257 | |
2014-06-05 06:07 | rhansen | Note Edited: 0002257 | |
2014-06-05 06:24 | rhansen | Note Added: 0002259 | |
2014-06-05 06:30 | rhansen | Note Edited: 0002259 | |
2014-06-05 16:12 | rhansen | Note Edited: 0002257 | |
2014-06-05 16:20 | rhansen | Note Edited: 0002257 | |
2014-06-05 16:33 | Don Cragun | Interp Status | => --- |
2014-06-05 16:33 | Don Cragun | Final Accepted Text | => See 0000842:0002257. |
2014-06-05 16:33 | Don Cragun | Status | New => Resolved |
2014-06-05 16:33 | Don Cragun | Resolution | Open => Accepted As Marked |
2014-06-05 16:34 | Don Cragun | Tag Attached: tc2-2008 | |
2016-05-17 22:21 | stephane | Note Added: 0003230 | |
2016-05-26 16:21 | rhansen | Note Added: 0003234 | |
2016-06-11 21:17 | stephane | Note Added: 0003264 | |
2017-08-10 16:00 | Don Cragun | Relationship added | parent of 0001058 |
2019-06-10 08:54 | agadmin | Status | Resolved => Closed |