Austin Group Defect Tracker

Aardvark Mark III


ID 0000842
Category [1003.1(2013)/Issue7+TC1] Shell and Utilities
Severity Objection
Type Omission
Date Submitted 2014-06-02 05:55
Last Update 2016-06-11 21:17
Reporter rhansen
View Status public
Assigned To
Priority normal
Resolution Accepted As Marked
Status Resolved
Name Richard Hansen
Organization BBN
User Reference
Section XCU 2.14 break, continue
Page Number 2358, 2362
Line Number 75117-75121, 75244-75250
Interp Status ---
Final Accepted Text See Note: 0002257.
Summary 0000842: meaning of "enclosing loop" with break/continue unclear
Description The results of running 'break' and 'continue' are specified in terms of the enclosing for, while, or until loop(s). However, it is unclear what "enclosing" means. For example, how should the following script behave:

    foo() {
        for i in 1 2; do
            echo " running bar ($i)..."
            bar
            echo " bar returned $?"
        done
    }
    bar() {
        for j in 1 2; do
            do_break() {
                echo " breaking..."
                false
                break 2
                echo " break returned $?"
            }
            echo " running do_break ($j)..."
            do_break
            echo " do_break returned $?"
        done
    }
    echo "running foo..."
    foo
    echo "foo returned $?"

Should the 'break' command do nothing because it is not a command in a compound list associated with a loop? Should it only break out of the inner loop because it is syntactically "inside" the inner loop but not the outer loop? Should it break out of the outer loop because both loops were executing at the time break was run?

One could interpret "enclosing" as implying lexical/static scope: a break/continue command must be one of the commands in the compound list associated with the loop in order for the loop to qualify as enclosing the command.

Alternatively, one could interpret "enclosing" as implying dynamic scope: A break/continue command is enclosed by a loop if the loop is executing when the command is executed.

The following is a list of how the 'break' in the above script behaves in some existing implementations:

  * bash (POSIX mode), zsh (sh emulation mode), dash, and NetBSD's
    /bin/sh: breaks out of the outer loop
  * ksh93: does nothing (no error message, exit status is 0)
  * mksh: prints an error message but otherwise does nothing (exit
    status is 0)

Scripts sourced with the dot command lead to similar questions. Here's an example script:

    cat <<\EOF >/tmp/foo
        for i in 1 2; do
            echo " running bar ($i)..."
            . /tmp/bar
            echo " bar returned $?"
        done
    EOF
    cat <<\EOF >/tmp/bar
        for j in 1 2; do
            echo " running do_break ($j)..."
            . /tmp/do_break
            echo " do_break returned $?"
        done
    EOF
    cat <<\EOF >/tmp/do_break
        echo " breaking..."
        false
        break 2
        echo " break returned $?"
    EOF
    echo "running foo..."
    . /tmp/foo
    echo "foo returned $?"

With the above script, the 'break' behaves as follows:

  * ksh93, bash (POSIX mode) and now NetBSD's /bin/sh [1]: breaks out
    of the outer loop
  * zsh (sh emulation mode): errors out
  * dash: acts like 'return 0'
  * mksh: prints an error message but otherwise does nothing (exit
    status is 0)

[1] http://thread.gmane.org/gmane.os.netbsd.bugs/70663

Also, the behavior when break or continue is run outside of a loop is unclear. The specification for break without arguments says "shall exit from the smallest enclosing [...] loop, if any", but then goes on to talk about breaking out of the outermost enclosing loop (because the default is equivalent to n=1, and 1 is greater than the number of enclosing loops). The specification for continue doesn't have the "if any" qualifier.
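For contrast, here is a minimal sketch of the one case the quoted text does settle: n is greater than the number of enclosing loops, every loop lexically encloses the break, and no other loop is in progress. The function name demo_break_excess is hypothetical and used only for illustration. A shell that follows the quoted rule should print "1a" and then "after loops"; some shells may also write a diagnostic to standard error.

```sh
# Hypothetical demo: n exceeds the number of enclosing loops, and both
# loops lexically enclose the break command.
demo_break_excess() {
    for i in 1 2; do
        for j in a b; do
            echo "$i$j"
            # Only two loops enclose this command, so the outermost
            # one (the "for i" loop) is the loop that gets exited.
            break 5
        done
        echo "not reached"
    done
    echo "after loops"
}
demo_break_excess
```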
Desired Action (I will post proposed wording changes as a bug note later. Given the implementation diversity, I'm guessing we'll want to go with "unspecified" for Issue 7 TC2. For Issue 8, I don't know if lexical or dynamic scope is preferred.)
Tags tc2-2008
Attached Files

- Relationships
parent of 0001058 (Resolved): is it an error to call "break [n]" (or continue) when not in a loop?

-  Notes
(0002256)
rhansen (manager)
2014-06-02 20:42
edited on: 2014-06-05 03:58

Dynamic scope for break/continue can be thought of as an extension to lexical scope: Any shell script written assuming lexical scoping will behave as intended if the shell implementation actually uses dynamic scoping for break/continue. Thus, I think it will be sufficient to simply specify lexical scoping for Issue 7 TC2 and leave dynamic scoping as an extension to the standard.

EDIT: The above statement is incorrect if a script uses 'break 1000' with the intention of breaking out of the outermost lexically enclosing loop and there is a non-lexically enclosing loop in progress. Thus, dynamic scoping isn't a pure extension to lexical scoping when n is greater than the number of lexically enclosing loops.

(0002257)
rhansen (manager)
2014-06-02 20:43
edited on: 2014-06-05 16:20

At page 2358 lines 75117-75121 (XCU 2.14 break description), change:
The break utility shall exit from the smallest enclosing for, while, or until loop, if any; or from the nth enclosing loop if n is specified. The value of n is an unsigned decimal integer greater than or equal to 1. The default shall be equivalent to n=1. If n is greater than the number of enclosing loops, the outermost enclosing loop shall be exited. Execution shall continue with the command immediately following the loop.

to:
If n is specified, the break utility shall exit from the nth enclosing for, while, or until loop. If n is unspecified, break shall behave as if n was specified as 1. Execution shall continue with the command immediately following the exited loop. The value of n is a positive decimal integer. If n is greater than the number of enclosing loops, the outermost enclosing loop shall be exited. If there is no enclosing loop, the behavior is unspecified.

A loop shall enclose a break or continue command if the loop lexically encloses the command. A loop lexically encloses a break or continue command if the command is:
  • executing in the same execution environment (see section 2.12) as the compound-list of the loop's do-group (see section 2.10.2), and
  • contained in a compound-list associated with the loop (either in the compound-list of the loop's do-group or, if the loop is a while or until loop, in the compound-list following the while or until reserved word), and
  • not in the body of a function whose function definition command (see section 2.9.5) is contained in a compound-list associated with the loop.

If n is greater than the number of lexically enclosing loops and there is a non-lexically enclosing loop in progress in the same execution environment as the break or continue command, it is unspecified whether that loop encloses the command.

After page 2359 line 75155 (XCU 2.14 break examples), insert:
The results of running the following example are unspecified: There are two loops in progress when the break command is executed, and they are in the same execution environment, but neither loop is lexically enclosing the break command. (There are no loops lexically enclosing the continue commands, either.)
foo() {
    for j in 1 2; do
        echo 'break 2' >/tmp/do_break
        echo "  sourcing /tmp/do_break ($j)..."
        # the behavior of the break from running the following command
        # results in unspecified behavior:
        . /tmp/do_break

        do_continue() { continue 2; }
        echo "  running do_continue ($j)..."
        # the behavior of the continue in the following function call
        # results in unspecified behavior (if execution reaches this
        # point):
        do_continue

        trap 'continue 2' USR1
        echo "  sending SIGUSR1 to self ($j)..."
        # the behavior of the continue in the trap invoked from the
        # following signal results in unspecified behavior (if
        # execution reaches this point):
        kill -USR1 $$
        sleep 1
    done
}
for i in 1 2; do
    echo "running foo ($i)..."
    foo
done
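
(Illustration only, not part of the proposed wording.) A sketch of how a script can stay clear of the unspecified cases in the example above: the function only reports a status, and the break remains a command in the loop's own do-group. The names should_stop and run_items are hypothetical.

```sh
# Hypothetical helper: succeeds when the loop should stop.
should_stop() {
    [ "$1" = stop ]
}

# Hypothetical driver: processes its arguments until one equals "stop".
run_items() {
    for item in "$@"; do
        # break is a command in this loop's do-group, so the loop
        # lexically encloses it under the proposed definition.
        if should_stop "$item"; then
            break
        fi
        echo "processing $item"
    done
}
run_items a b stop c
```

Because the decision is passed back as an exit status, the result does not depend on whether the shell uses lexical or dynamic scope for break.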

At page 2362 lines 75244-75250 (XCU 2.14 continue description), change:
The continue utility shall return to the top of the smallest enclosing for, while, or until loop, or to the top of the nth enclosing loop, if n is specified. This involves repeating the condition list of a while or until loop or performing the next assignment of a for loop, and re-executing the loop if appropriate.

The value of n is a decimal integer greater than or equal to 1. The default shall be equivalent to n=1. If n is greater than the number of enclosing loops, the outermost enclosing loop shall be used.

to:
If n is specified, the continue utility shall return to the top of the nth enclosing for, while, or until loop. If n is unspecified, continue shall behave as if n was specified as 1. Returning to the top of the loop involves repeating the condition list of a while or until loop or performing the next assignment of a for loop, and re-executing the loop if appropriate. The value of n is a positive decimal integer. If n is greater than the number of enclosing loops, the outermost enclosing loop shall be used. If there is no enclosing loop, the behavior is unspecified.

The meaning of "enclosing" shall be as specified in the description of the break utility.


(0002258)
jilles (reporter)
2014-06-03 22:23

Here are two more strange cases:

A break or continue could be in a trap handler:
    sh -c 'trap break USR1; for i in a b; do kill -USR1 $$; echo $i; done'
Most shells (except mksh) exit from the loop in this example. This can be used to good effect, although something similar can be done by setting a variable in the trap handler and checking it in the mainline code.
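
A minimal sketch of that variable-based alternative, assuming the trap action runs before the next command in the loop body is executed. The function name run_flag_demo and the variable name stop are hypothetical; the script runs in a child sh so that $$ refers to that child.

```sh
# Hypothetical demo: the trap handler only sets a flag, and the break
# stays in the mainline code of the loop.
run_flag_demo() {
    sh -c '
        stop=0
        trap "stop=1" USR1
        for i in a b c; do
            # Signal ourselves on the second iteration.
            if [ "$i" = b ]; then kill -s USR1 "$$"; fi
            # Mainline check of the flag set by the trap handler.
            if [ "$stop" -eq 1 ]; then break; fi
            echo "$i"
        done
        echo "done"
    '
}
run_flag_demo
```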

A break could be in a subshell nested in a loop:
    sh -c 'for i in a b; do (break; echo $i) done'
Most shells (except mksh) exit from the subshell in this example. In FreeBSD sh, this is because the shell attempts to break from the loop, but the attempt stops at the subshell.

I think the dynamic scope is most compatible with existing shells and scripts.
(0002259)
rhansen (manager)
2014-06-05 06:24
edited on: 2014-06-05 06:30

Thank you for the additional strange cases!

> A break or continue could be in a trap handler:

Because no loop lexically encloses the break in the trap, the proposed wording would cause this to fall under unspecified behavior. However, the trap is executed in the same environment as the loop so if the shell did dynamic scope and the trap was invoked while executing the loop then it should work as expected.

> A break could be in a subshell nested in a loop:

The proposed wording makes this unspecified behavior because the subshell puts the break in a different shell execution environment.
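
A sketch of a form that avoids the question, assuming the script only needs the subshell to decide whether to stop: the subshell reports a status and the break runs in the loop's own environment. The function name scan_until_b is hypothetical.

```sh
# Hypothetical demo: the subshell decides, the enclosing loop breaks.
scan_until_b() {
    for i in a b c; do
        # The subshell only returns an exit status; break is executed
        # in the same execution environment as the loop.
        ( [ "$i" != b ] ) || break
        echo "$i"
    done
}
scan_until_b
```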

> I think the dynamic scope is most compatible with existing shells and scripts.

Agreed, although there are some implementations that don't do dynamic scoping. Also, I agree with what Geoff said on the mailing list:
The current situation has persisted for at least 20 years, so I don't have a problem with it continuing. I imagine that almost all shell script authors just naturally use break and continue in a way that works the same with either scope, without being aware of the issue at all.
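
As an illustration of that kind of usage, here is a sketch in which both loops lexically enclose the continue, so lexical and dynamic scope give the same result. The function name list_pairs is hypothetical.

```sh
# Hypothetical demo: continue 2 written directly inside both loops.
list_pairs() {
    for i in 1 2 3; do
        for j in 1 2 3; do
            # Once j reaches i, move on to the next value of i.
            if [ "$j" -ge "$i" ]; then continue 2; fi
            echo "$i>$j"
        done
    done
}
list_pairs
```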


(0003230)
stephane (reporter)
2016-05-17 22:21

Two other cases to consider:

for i in 1 2; do
  echo "$i"
  eval break
done

mksh and other pdksh-derived shells choke on that. yash, ksh88, ksh93, zsh, bash, and ash-based shells don't.

for i in 1 2; do
  echo "$i"
  command break
done

As a non-special built-in, "command" is required to run in a separate environment. That would be a bug in the spec as "command" as specified clearly needs to run in the current environment. That should be addressed in a different bug but once that's fixed, a note about "command break" being valid (or not as implementations are allowed to implement "command" as a function) should probably be added here as well.

Also, it would help to clarify whether "break n" or "continue n" where n is greater than the number of enclosing loops should be considered an error or not (and the consequence on stderr and exit status). Some shells (yash, mksh) do report an error. bash and zsh do so only if the number of enclosing loops is 0. yash and zsh return a non-zero exit status when giving an error; the others don't.
(0003234)
rhansen (manager)
2016-05-26 16:21

Regarding implementations that choke on 'eval break', I believe that is clearly a bug in those implementations (they do not implement 'eval' properly).

Regarding 'command break', I believe the standard is clear that it should behave just like 'break' except that an error in 'break' shall not cause the shell to abort, and variable assignments before 'command break' do not remain in effect after 'command break' completes.

Regarding 'command' needing to run in the current environment: This is something that I have always found to be a bit confusing. Note that the standard does support non-special built-ins affecting the current execution environment; section 2.12 says:
The environment of the shell process shall not be changed by the utility unless explicitly specified by the utility description (for example, cd and umask).
In the past when we've discussed utilities (that are not special built-ins) that affect or depend on the current execution environment I believe the argument was that they could run in a separate execution environment but somehow (in an implementation-defined manner) communicate with the invoking shell to achieve their effects or acquire the relevant information. Usually this communication is trivial because those utilities are implemented as (regular) built-ins.

If 'command' is implemented as a function, the standard requires it to behave as if it was not implemented as a function.

It may be worthwhile to add a clarifying (non-normative) note saying that 'command break n' works like 'break n' except it doesn't abort the shell on errors or preserve in-line variable assignments.

I agree that we should clarify whether it is an error if 'n' is greater than the number of enclosing loops. Would you mind filing a new bug report about that?
(0003264)
stephane (reporter)
2016-06-11 21:17

> Regarding implementations that choke on 'eval break', I
> believe that is clearly a bug in those implementations (they
> do not implement 'eval' properly).

"eval" is the command that causes the shell to interpret the
code made of the concatenation of the arguments. "." is the
command that causes the shell to interpret the code in the file
given as arguments.

That's two cases of a new instance of the parser being started
to invoke some code.

I don't object to requiring "eval break" to break out of a loop,
and am even in favour of it, but if we're to make the behaviour
unspecified for "." (even though all shells seem to concur on
that one), I think we should make it explicit in the spec that
"eval" is OK.

[...]
> Regarding 'command' needing to run in the current environment:
> This is something that I have always found to be a bit
> confusing. Note that the standard does support non-special
> built-ins affecting the current execution environment; section
> 2.12 says:
>
> The environment of the shell process shall not be changed
> by the utility unless explicitly specified by the utility
> description (for example, cd and umask).
>
> In the past when we've discussed utilities (that are not
> special built-ins) that affect or depend on the current
> execution environment I believe the argument was that they
> could run in a separate execution environment but somehow (in
> an implementation-defined manner) communicate with the
> invoking shell to achieve their effects or acquire the
> relevant information. Usually this communication is trivial
> because those utilities are implemented as (regular)
> built-ins.
[...]

So, for instance in:

command eval '
  a=test
  sleep 1
  kill "$$"'

"command" would be spawned in a new process. It would
ask the shell via some implementation-defined IPC mechanism to
run that "eval" command. So the shell would run eval while at
the same time waiting for "command". And when finished running
eval it would communicate the exit status of eval back to "command"
which would then exit with it. What should happen if the
process running "command" is killed? Or like in this example if
the process running the shell is killed?

In an interactive shell, would command and eval (and sleep) run
in the same process group?

Wouldn't it be simpler to say that "command" runs in the current
shell environment *and in the same process* as a built-in as
there's not really any other sane way to implement it?

> If 'command' is implemented as a function, the standard
> requires it to behave as if it was not implemented as a
> function.

Even if the application redefines any of the utilities called by
that function as functions?

> It may be worthwhile to add a clarifying (non-normative) note
> saying that 'command break n' works like 'break n' except it
> doesn't abort the shell on errors or preserve in-line variable
> assignments.

It wouldn't harm indeed.

> I agree that we should clarify whether it is an error if 'n'
> is greater than the number of enclosing loops. Would you mind
> filing a new bug report about that?

I'll do so.

- Issue History
Date Modified Username Field Change
2014-06-02 05:55 rhansen New Issue
2014-06-02 05:55 rhansen Name => Richard Hansen
2014-06-02 05:55 rhansen Organization => BBN
2014-06-02 05:55 rhansen Section => XCU 2.14 break, continue
2014-06-02 05:55 rhansen Page Number => 2358, 2362
2014-06-02 05:55 rhansen Line Number => 75117-75121, 75244-75250
2014-06-02 20:42 rhansen Note Added: 0002256
2014-06-02 20:43 rhansen Note Added: 0002257
2014-06-03 22:23 jilles Note Added: 0002258
2014-06-05 03:49 rhansen Note Edited: 0002256
2014-06-05 03:58 rhansen Note Edited: 0002256
2014-06-05 05:43 rhansen Note Edited: 0002257
2014-06-05 05:59 rhansen Note Edited: 0002257
2014-06-05 06:01 rhansen Note Edited: 0002257
2014-06-05 06:07 rhansen Note Edited: 0002257
2014-06-05 06:24 rhansen Note Added: 0002259
2014-06-05 06:30 rhansen Note Edited: 0002259
2014-06-05 16:12 rhansen Note Edited: 0002257
2014-06-05 16:20 rhansen Note Edited: 0002257
2014-06-05 16:33 Don Cragun Interp Status => ---
2014-06-05 16:33 Don Cragun Final Accepted Text => See Note: 0002257.
2014-06-05 16:33 Don Cragun Status New => Resolved
2014-06-05 16:33 Don Cragun Resolution Open => Accepted As Marked
2014-06-05 16:34 Don Cragun Tag Attached: tc2-2008
2016-05-17 22:21 stephane Note Added: 0003230
2016-05-26 16:21 rhansen Note Added: 0003234
2016-06-11 21:17 stephane Note Added: 0003264
2017-08-10 16:00 Don Cragun Relationship added parent of 0001058

