0000842: meaning of "enclosing loop" with break/continue unclear

ID	Project	Category	View Status	Date Submitted	Last Update

0000842	1003.1(2013)/Issue7+TC1	Shell and Utilities	public	2014-06-02 05:55	2019-06-10 08:54

Reporter	rhansen	Assigned To
Priority	normal	Severity	Objection	Type	Omission
Status	Closed	Resolution	Accepted As Marked

Name	Richard Hansen
Organization	BBN
User Reference
Section	XCU 2.14 break, continue
Page Number	2358, 2362
Line Number	75117-75121, 75244-75250
Interp Status	---
Final Accepted Text	See 0000842:0002257.


Summary	0000842: meaning of "enclosing loop" with break/continue unclear
Description	The results of running 'break' and 'continue' are specified in terms of the enclosing for, while, or until loop(s). However, it is unclear what "enclosing" means. For example, how should the following script behave:

    foo() {
        for i in 1 2; do
            echo " running bar ($i)..."
            bar
            echo " bar returned $?"
        done
    }
    bar() {
        for j in 1 2; do
            do_break() {
                echo " breaking..."
                false
                break 2
                echo " break returned $?"
            }
            echo " running do_break ($j)..."
            do_break
            echo " do_break returned $?"
        done
    }
    echo "running foo..."
    foo
    echo "foo returned $?"

Should the 'break' command do nothing because it is not a command in a compound list associated with a loop? Should it only break out of the inner loop because it is syntactically "inside" the inner loop but not the outer loop? Should it break out of the outer loop because both loops were executing at the time break was run?

One could interpret "enclosing" as implying lexical/static scope: a break/continue command must be one of the commands in the compound list associated with the loop in order for the loop to qualify as enclosing the command. Alternatively, one could interpret "enclosing" as implying dynamic scope: A break/continue command is enclosed by a loop if the loop is executing when the command is executed.

The following is a list of how the 'break' in the above script behaves in some existing implementations:

* bash (POSIX mode), zsh (sh emulation mode), dash, and NetBSD's /bin/sh: breaks out of the outer loop
* ksh93: does nothing (no error message, exit status is 0)
* mksh: prints an error message but otherwise does nothing (exit status is 0)

Scripts sourced with the dot command lead to similar questions. Here's an example script:

    cat <<\EOF >/tmp/foo
    for i in 1 2; do
        echo " running bar ($i)..."
        . /tmp/bar
        echo " bar returned $?"
    done
    EOF
    cat <<\EOF >/tmp/bar
    for j in 1 2; do
        echo " running do_break ($j)..."
        . /tmp/do_break
        echo " do_break returned $?"
    done
    EOF
    cat <<\EOF >/tmp/do_break
    echo " breaking..."
    false
    break 2
    echo " break returned $?"
    EOF
    echo "running foo..."
    . /tmp/foo
    echo "foo returned $?"

With the above script, the 'break' behaves as follows:

* ksh93, bash (POSIX mode) and now NetBSD's /bin/sh [1]: breaks out of the outer loop
* zsh (sh emulation mode): errors out
* dash: acts like 'return 0'
* mksh: prints an error message but otherwise does nothing (exit status is 0)

[1] http://thread.gmane.org/gmane.os.netbsd.bugs/70663

Also, the behavior when break or continue is run outside of a loop is unclear. The specification for break without arguments says "shall exit from the smallest enclosing [...] loop, if any", but then goes on to talk about breaking out of the outermost enclosing loop (because the default is equivalent to n=1, and 1 is greater than the number of enclosing loops). The specification for continue doesn't have the "if any" qualifier.
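For scripts that have to behave the same on all of the implementations listed in the description, one way to sidestep the question is to keep every break lexically inside the loop it exits and let helper functions report their intent through their exit status. The following is only a sketch of that idea; the names should_stop and run_items are hypothetical and do not come from the report.

```shell
# Hypothetical helper: decides whether the caller should stop looping.
# It reports the decision through its exit status instead of running
# 'break' itself, so no loop has to "enclose" a command in a function.
should_stop() {
    [ "$1" = 2 ]
}

run_items() {
    for i in 1 2 3; do
        if should_stop "$i"; then
            # This break is one of the commands in the loop's own
            # compound list, so both interpretations agree on it.
            break
        fi
        echo "item $i"
    done
    echo "loop finished"
}

run_items
```

Because the break never leaves the loop body, the script does not depend on whether the shell treats "enclosing" lexically or dynamically.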
Desired Action	(I will post proposed wording changes as a bug note later. Given the implementation diversity, I'm guessing we'll want to go with "unspecified" for Issue 7 TC2. For Issue 8, I don't know if lexical or dynamic scope is preferred.)
Tags	tc2-2008

rhansen 2014-06-02 20:42 manager bugnote:0002256 Last edited: 2014-06-05 03:58	Dynamic scope for break/continue can be thought of as an extension to lexical scope: Any shell script written assuming lexical scoping will behave as intended if the shell implementation actually uses dynamic scoping for break/continue. Thus, I think it will be sufficient to simply specify lexical scoping for Issue 7 TC2 and leave dynamic scoping as an extension to the standard. EDIT: The above statement is incorrect if a script uses 'break 1000' with the intention of breaking out of the outermost lexically enclosing loop and there is a non-lexically enclosing loop in progress. Thus, dynamic scoping isn't a pure extension to lexical scoping when n is greater than the number of lexically enclosing loops.
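The situation described in the EDIT can be sketched as follows. The function name inner_loops is hypothetical. Called on its own, only the two lexically enclosing loops are in progress, so 'break 1000' exits the outermost of them under either interpretation; the commented-out call shows where the two interpretations can diverge.

```shell
inner_loops() {
    for a in 1 2; do
        for b in x y; do
            echo "a=$a b=$b"
            # n exceeds the number of lexically enclosing loops (2),
            # so the outermost enclosing loop is exited.
            break 1000
        done
        echo "not printed: the 'for a' loop has already been exited"
    done
    echo "after loops"
}

# No other loop is in progress here, so the result does not depend on
# how the shell interprets "enclosing".
inner_loops

# Called from inside another loop, the result depends on the shell:
#   for k in 1 2; do inner_loops; done
# With lexical scoping the 'for k' loop runs inner_loops twice; with
# dynamic scoping the 'break 1000' may exit the 'for k' loop as well.
```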

rhansen 2014-06-02 20:43 manager bugnote:0002257 Last edited: 2014-06-05 16:20	At page 2358 lines 75117-75121 (XCU 2.14 break description), change:

    The break utility shall exit from the smallest enclosing for, while, or until loop, if any; or from the nth enclosing loop if n is specified. The value of n is an unsigned decimal integer greater than or equal to 1. The default shall be equivalent to n=1. If n is greater than the number of enclosing loops, the outermost enclosing loop shall be exited. Execution shall continue with the command immediately following the loop.

to:

    If n is specified, the break utility shall exit from the nth enclosing for, while, or until loop. If n is unspecified, break shall behave as if n was specified as 1. Execution shall continue with the command immediately following the exited loop. The value of n is a positive decimal integer. If n is greater than the number of enclosing loops, the outermost enclosing loop shall be exited. If there is no enclosing loop, the behavior is unspecified.

    A loop shall enclose a break or continue command if the loop lexically encloses the command. A loop lexically encloses a break or continue command if the command is:

    * executing in the same execution environment (see section 2.12) as the compound-list of the loop's do-group (see section 2.10.2), and
    * contained in a compound-list associated with the loop (either in the compound-list of the loop's do-group or, if the loop is a while or until loop, in the compound-list following the while or until reserved word), and
    * not in the body of a function whose function definition command (see section 2.9.5) is contained in a compound-list associated with the loop.

    If n is greater than the number of lexically enclosing loops and there is a non-lexically enclosing loop in progress in the same execution environment as the break or continue command, it is unspecified whether that loop encloses the command.

After page 2359 line 75155 (XCU 2.14 break examples), insert:

    The results of running the following example are unspecified: There are two loops in progress when the break command is executed, and they are in the same execution environment, but neither loop is lexically enclosing the break command. (There are no loops lexically enclosing the continue commands, either.)

        foo() {
            for j in 1 2; do
                echo 'break 2' >/tmp/do_break
                echo " sourcing /tmp/do_break ($j)..."
                # the behavior of the break from running the following command
                # results in unspecified behavior:
                . /tmp/do_break

                do_continue() { continue 2; }
                echo " running do_continue ($j)..."
                # the behavior of the continue in the following function call
                # results in unspecified behavior (if execution reaches this
                # point):
                do_continue

                trap 'continue 2' USR1
                echo " sending SIGUSR1 to self ($j)..."
                # the behavior of the continue in the trap invoked from the
                # following signal results in unspecified behavior (if
                # execution reaches this point):
                kill -USR1 $$
                sleep 1
            done
        }
        for i in 1 2; do
            echo "running foo ($i)..."
            foo
        done

At page 2362 lines 75244-75250 (XCU 2.14 continue description), change:

    The continue utility shall return to the top of the smallest enclosing for, while, or until loop, or to the top of the nth enclosing loop, if n is specified. This involves repeating the condition list of a while or until loop or performing the next assignment of a for loop, and re-executing the loop if appropriate. The value of n is a decimal integer greater than or equal to 1. The default shall be equivalent to n=1. If n is greater than the number of enclosing loops, the outermost enclosing loop shall be used.

to:

    If n is specified, the continue utility shall return to the top of the nth enclosing for, while, or until loop. If n is unspecified, continue shall behave as if n was specified as 1. Returning to the top of the loop involves repeating the condition list of a while or until loop or performing the next assignment of a for loop, and re-executing the loop if appropriate. The value of n is a positive decimal integer. If n is greater than the number of enclosing loops, the outermost enclosing loop shall be used. If there is no enclosing loop, the behavior is unspecified. The meaning of "enclosing" shall be as specified in the description of the break utility.
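By contrast with the unspecified example in this note, the following sketch stays within the lexical definition: the continue command is contained in the do-groups of both loops, runs in the same execution environment, and is not inside a function body. The function name count_pairs is hypothetical and serves only to make the example callable.

```shell
count_pairs() {
    for i in 1 2; do
        for j in a b; do
            if [ "$j" = b ]; then
                # Both for loops lexically enclose this command, so
                # 'continue 2' returns to the top of the 'for i' loop.
                continue 2
            fi
            echo "$i$j"
        done
        echo "not printed: 'continue 2' always skips this line"
    done
}

count_pairs
```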

jilles 2014-06-03 22:23 reporter bugnote:0002258	Here are two more strange cases:

A break or continue could be in a trap handler:

    sh -c 'trap break USR1; for i in a b; do kill -USR1 $$; echo $i; done'

Most shells (except mksh) exit from the loop in this example. This can be used to good effect, although something similar can be done by setting a variable in the trap handler and checking it in the mainline code.

A break could be in a subshell nested in a loop:

    sh -c 'for i in a b; do (break; echo $i) done'

Most shells (except mksh) exit from the subshell in this example. In FreeBSD sh, this is because the shell attempts to break from the loop, but the attempt stops at the subshell.

I think the dynamic scope is most compatible with existing shells and scripts.
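The alternative mentioned in this note, setting a variable in the trap handler and checking it in the mainline code, might look like the sketch below. The wrapper name trap_flag_demo and the variable name stop are hypothetical. A child sh is used, as in the one-liners in this note, so that $$ and the trap belong to the demonstration alone.

```shell
trap_flag_demo() {
    sh -c '
        stop=
        # The handler only records that the signal arrived.
        trap "stop=1" USR1
        for i in a b c; do
            echo "$i"
            if [ "$i" = b ]; then kill -USR1 "$$"; fi
            # The break is lexically inside the loop it exits.
            if [ -n "$stop" ]; then break; fi
        done
        echo "loop exited"
    '
}

trap_flag_demo
```

The loop only notices the signal at the point where it checks the flag, which is the price of not relying on a break inside the trap action.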

rhansen 2014-06-05 06:24 manager bugnote:0002259 Last edited: 2014-06-05 06:30	Thank you for the additional strange cases!

> A break or continue could be in a trap handler:

Because no loop lexically encloses the break in the trap, the proposed wording would cause this to fall under unspecified behavior. However, the trap is executed in the same environment as the loop, so if the shell did dynamic scope and the trap was invoked while executing the loop then it should work as expected.

> A break could be in a subshell nested in a loop:

The proposed wording makes this unspecified behavior because the subshell puts the break in a different shell execution environment.

> I think the dynamic scope is most compatible with existing shells and scripts.

Agreed, although there are some implementations that don't do dynamic scoping. Also, I agree with what Geoff said on the mailing list:

The current situation has persisted for at least 20 years, so I don't have a problem with it continuing. I imagine that almost all shell script authors just naturally use break and continue in a way that works the same with either scope, without being aware of the issue at all.
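For the subshell case, a script can keep the break in the loop's own execution environment and let the subshell request it through its exit status. This is a minimal sketch; the function name scan and the convention that a non-zero exit status means "leave the loop" are assumptions made for illustration.

```shell
scan() {
    for i in a b c; do
        (
            # Work that runs in a separate execution environment.
            # A non-zero exit status asks the caller to leave the loop.
            [ "$i" = b ] && exit 1
            echo "$i"
        ) || break   # runs in the same environment as the loop
    done
    echo "after loop"
}

scan
```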

stephane 2016-05-17 22:21 reporter bugnote:0003230	Two other cases to consider:

    for i in 1 2; do
        echo "$i"
        eval break
    done

mksh and other pdksh-derived shells choke on that. yash, ksh88, ksh93, zsh, bash, ash based shells don't.

    for i in 1 2; do
        echo "$i"
        command break
    done

As a non-special built-in, "command" is required to run in a separate environment. That would be a bug in the spec as "command" as specified clearly needs to run in the current environment. That should be addressed in a different bug but once that's fixed, a note about "command break" being valid (or not as implementations are allowed to implement "command" as a function) should probably be added here as well.

Also, it would help to clarify whether "break n" or "continue n" where n is greater than the number of enclosing loops should be considered an error or not (and the consequence on stderr and exit status). Some shells (yash, mksh) do report an error. bash and zsh do so only if the number of enclosing loops is 0. yash and zsh return a non-zero exit status when giving an error, others don't.

rhansen 2016-05-26 16:21 manager bugnote:0003234	Regarding implementations that choke on 'eval break', I believe that is clearly a bug in those implementations (they do not implement 'eval' properly).

Regarding 'command break', I believe the standard is clear that it should behave just like 'break' except that an error in 'break' shall not cause the shell to abort, and variable assignments before 'command break' do not remain in effect after 'command break' completes.

Regarding 'command' needing to run in the current environment: This is something that I have always found to be a bit confusing. Note that the standard does support non-special built-ins affecting the current execution environment; section 2.12 says:

    The environment of the shell process shall not be changed by the utility unless explicitly specified by the utility description (for example, cd and umask).

In the past when we've discussed utilities (that are not special built-ins) that affect or depend on the current execution environment I believe the argument was that they could run in a separate execution environment but somehow (in an implementation-defined manner) communicate with the invoking shell to achieve their effects or acquire the relevant information. Usually this communication is trivial because those utilities are implemented as (regular) built-ins.

If 'command' is implemented as a function, the standard requires it to behave as if it was not implemented as a function.

It may be worthwhile to add a clarifying (non-normative) note saying that 'command break n' works like 'break n' except it doesn't abort the shell on errors or preserve in-line variable assignments.

I agree that we should clarify whether it is an error if 'n' is greater than the number of enclosing loops. Would you mind filing a new bug report about that?
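A minimal sketch of the two forms discussed in this exchange, under the reading given in this note that both should behave like a plain 'break'. The function names are hypothetical. Note 0003230 reports that mksh and other pdksh-derived shells choke on the 'eval break' form, so results may differ there.

```shell
eval_break() {
    for i in 1 2; do
        echo "$i"
        # Under the reading in this note, this exits the for loop.
        eval break
    done
    echo "eval loop left"
}

command_break() {
    for i in 1 2; do
        echo "$i"
        # Expected to act like 'break', except that an error in
        # 'break' does not abort the shell.
        command break
    done
    echo "command loop left"
}

eval_break
command_break
```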

stephane 2016-06-11 21:17 reporter bugnote:0003264	> Regarding implementations that choke on 'eval break', I
> believe that is clearly a bug in those implementations (they
> do not implement 'eval' properly).

"eval" is the command that causes the shell to interpret the code made of the concatenation of the arguments. "." is the command that causes the shell to interpret the code in the file given as arguments.

That's two cases of a new instance of the parser being started to invoke some code. I don't object to requiring "eval break" to break out of a loop, and am even in favour of it, but if we're to make the behaviour unspecified for "." (even though all shells seem to concur on that one), I think we should make it explicit in the spec that "eval" is OK.

[...]
> Regarding 'command' needing to run in the current environment:
> This is something that I have always found to be a bit
> confusing. Note that the standard does support non-special
> built-ins affecting the current execution environment; section
> 2.12 says:
>
>     The environment of the shell process shall not be changed
>     by the utility unless explicitly specified by the utility
>     description (for example, cd and umask).
>
> In the past when we've discussed utilities (that are not
> special built-ins) that affect or depend on the current
> execution environment I believe the argument was that they
> could run in a separate execution environment but somehow (in
> an implementation-defined manner) communicate with the
> invoking shell to achieve their effects or acquire the
> relevant information. Usually this communication is trivial
> because those utilities are implemented as (regular)
> built-ins.
[...]

So, for instance in:

    command eval '
      a=test
      sleep 1
      kill "$$"'

"command" would be spawned in a new process. It would ask the shell via some implementation-defined IPC mechanism to run that "eval" command. So the shell would run eval while at the same time waiting for "command". And when finished running eval it would communicate the exit status of eval back to "command" which would then exit with it.

What should happen if the process running "command" is killed? Or like in this example if the process running the shell is killed? In an interactive shell, would command and eval (and sleep) run in the same process group?

Wouldn't it be simpler to say that "command" runs in the current shell environment and in the same process as a built-in as there's not really any other sane way to implement it?

> If 'command' is implemented as a function, the standard
> requires it to behave as if it was not implemented as a
> function.

Even if the application redefines any of the utilities called by that function as functions?

> It may be worthwhile to add a clarifying (non-normative) note
> saying that 'command break n' works like 'break n' except it
> doesn't abort the shell on errors or preserve in-line variable
> assignments.

It wouldn't harm indeed.

> I agree that we should clarify whether it is an error if 'n'
> is greater than the number of enclosing loops. Would you mind
> filing a new bug report about that?

I'll do.

Date Modified	Username	Field	Change
2014-06-02 05:55	rhansen	New Issue
2014-06-02 05:55	rhansen	Name	=> Richard Hansen
2014-06-02 05:55	rhansen	Organization	=> BBN
2014-06-02 05:55	rhansen	Section	=> XCU 2.14 break, continue
2014-06-02 05:55	rhansen	Page Number	=> 2358, 2362
2014-06-02 05:55	rhansen	Line Number	=> 75117-75121, 75244-75250
2014-06-02 20:42	rhansen	Note Added: 0002256
2014-06-02 20:43	rhansen	Note Added: 0002257
2014-06-03 22:23	jilles	Note Added: 0002258
2014-06-05 03:49	rhansen	Note Edited: 0002256
2014-06-05 03:58	rhansen	Note Edited: 0002256
2014-06-05 05:43	rhansen	Note Edited: 0002257
2014-06-05 05:59	rhansen	Note Edited: 0002257
2014-06-05 06:01	rhansen	Note Edited: 0002257
2014-06-05 06:07	rhansen	Note Edited: 0002257
2014-06-05 06:24	rhansen	Note Added: 0002259
2014-06-05 06:30	rhansen	Note Edited: 0002259
2014-06-05 16:12	rhansen	Note Edited: 0002257
2014-06-05 16:20	rhansen	Note Edited: 0002257
2014-06-05 16:33	~~Don Cragun~~	Interp Status	=> ---
2014-06-05 16:33	~~Don Cragun~~	Final Accepted Text	=> See 0000842:0002257.
2014-06-05 16:33	~~Don Cragun~~	Status	New => Resolved
2014-06-05 16:33	~~Don Cragun~~	Resolution	Open => Accepted As Marked
2014-06-05 16:34	~~Don Cragun~~	Tag Attached: tc2-2008
2016-05-17 22:21	stephane	Note Added: 0003230
2016-05-26 16:21	rhansen	Note Added: 0003234
2016-06-11 21:17	stephane	Note Added: 0003264
2017-08-10 16:00	~~Don Cragun~~	Relationship added	parent of 0001058
2019-06-10 08:54	agadmin	Status	Resolved => Closed
