Notes |
(0002256)
rhansen (manager)
2014-06-02 20:42
edited on: 2014-06-05 03:58
|
Dynamic scope for break/continue can be thought of as an extension to lexical scope: Any shell script written assuming lexical scoping will behave as intended if the shell implementation actually uses dynamic scoping for break/continue. Thus, I think it will be sufficient to simply specify lexical scoping for Issue 7 TC2 and leave dynamic scoping as an extension to the standard.
EDIT: The above statement is incorrect if a script uses 'break 1000' with the intention of breaking out of the outermost lexically enclosing loop and there is a non-lexically enclosing loop in progress. Thus, dynamic scoping isn't a pure extension to lexical scoping when n is greater than the number of lexically enclosing loops.
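A minimal sketch of the case described in the EDIT, assuming a POSIX-style shell. The function name scan and the loop variables are hypothetical and are not taken from the standard. The break 1000 is written with the intention of leaving both loops inside scan and nothing more.

```shell
# Hypothetical helper: 'break 1000' is meant to leave both loops in scan.
scan() {
    for outer in 1 2 3; do
        for inner in a b c; do
            printf '%s%s ' "$outer" "$inner"
            break 1000   # n exceeds the two lexically enclosing loops
        done
        printf 'not-reached '
    done
    printf 'end-of-scan\n'
}
```

Called on its own (no other loop in progress), a shell that follows the current wording prints "1a end-of-scan" under either scoping model. The models can diverge for a call such as: for k in x y; do scan; done. A lexically scoped shell runs scan twice, while a dynamically scoped shell may treat the "for k" loop as the outermost enclosing loop and exit it during the first call.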
|
|
(0002257)
rhansen (manager)
2014-06-02 20:43
edited on: 2014-06-05 16:20
|
At page 2358 lines 75117-75121 (XCU 2.14 break description), change:
The break utility shall exit from the smallest enclosing for, while, or until loop, if any; or from the nth enclosing loop if n is specified. The value of n is an unsigned decimal integer greater than or equal to 1. The default shall be equivalent to n=1. If n is greater than the number of enclosing loops, the outermost enclosing loop shall be exited. Execution shall continue with the command immediately following the loop.
to:
If n is specified, the break utility shall exit from the nth enclosing for, while, or until loop. If n is unspecified, break shall behave as if n was specified as 1. Execution shall continue with the command immediately following the exited loop. The value of n is a positive decimal integer. If n is greater than the number of enclosing loops, the outermost enclosing loop shall be exited. If there is no enclosing loop, the behavior is unspecified.
A loop shall enclose a break or continue command if the loop lexically encloses the command. A loop lexically encloses a break or continue command if the command is:
- executing in the same execution environment (see section 2.12) as the compound-list of the loop's do-group (see section 2.10.2), and
- contained in a compound-list associated with the loop (either in the compound-list of the loop's do-group or, if the loop is a while or until loop, in the compound-list following the while or until reserved word), and
- not in the body of a function whose function definition command (see section 2.9.5) is contained in a compound-list associated with the loop.
If n is greater than the number of lexically enclosing loops and there is a non-lexically enclosing loop in progress in the same execution environment as the break or continue command, it is unspecified whether that loop encloses the command.
After page 2359 line 75155 (XCU 2.14 break examples), insert:
The results of running the following example are unspecified: There are two loops in progress when the break command is executed, and they are in the same execution environment, but neither loop is lexically enclosing the break command. (There are no loops lexically enclosing the continue commands, either.)
foo() {
    for j in 1 2; do
        echo 'break 2' >/tmp/do_break
        echo " sourcing /tmp/do_break ($j)..."
        # the behavior of the break from running the following command
        # results in unspecified behavior:
        . /tmp/do_break
        do_continue() { continue 2; }
        echo " running do_continue ($j)..."
        # the behavior of the continue in the following function call
        # results in unspecified behavior (if execution reaches this
        # point):
        do_continue
        trap 'continue 2' USR1
        echo " sending SIGUSR1 to self ($j)..."
        # the behavior of the continue in the trap invoked from the
        # following signal results in unspecified behavior (if
        # execution reaches this point):
        kill -USR1 $$
        sleep 1
    done
}
for i in 1 2; do
    echo "running foo ($i)..."
    foo
done
At page 2362 lines 75244-75250 (XCU 2.14 continue description), change:
The continue utility shall return to the top of the smallest enclosing for, while, or until loop, or to the top of the nth enclosing loop, if n is specified. This involves repeating the condition list of a while or until loop or performing the next assignment of a for loop, and re-executing the loop if appropriate.
The value of n is a decimal integer greater than or equal to 1. The default shall be equivalent to n=1. If n is greater than the number of enclosing loops, the outermost enclosing loop shall be used.
to:
If n is specified, the continue utility shall return to the top of the nth enclosing for, while, or until loop. If n is unspecified, continue shall behave as if n was specified as 1. Returning to the top of the loop involves repeating the condition list of a while or until loop or performing the next assignment of a for loop, and re-executing the loop if appropriate. The value of n is a positive decimal integer. If n is greater than the number of enclosing loops, the outermost enclosing loop shall be used. If there is no enclosing loop, the behavior is unspecified.
The meaning of "enclosing" shall be as specified in the description of the break utility.
|
|
(0002258)
jilles (reporter)
2014-06-03 22:23
|
Here are two more strange cases:
A break or continue could be in a trap handler:
sh -c 'trap break USR1; for i in a b; do kill -USR1 $$; echo $i; done'
Most shells (except mksh) exit from the loop in this example. This can be used to good effect, although something similar can be done by setting a variable in the trap handler and checking it in the mainline code.
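The variable-based alternative mentioned above can be sketched as follows. This is only an illustration: the function name flag_break_demo and the variable stop are hypothetical, and the sketch assumes that sh on the PATH is a POSIX shell. It is wrapped in sh -c so that $$ refers to the shell running the loop.

```shell
# Hypothetical demo: the trap only sets a flag; the break itself stays
# lexically inside the loop, so it does not depend on the scoping model.
flag_break_demo() {
    sh -c '
        stop=0
        trap "stop=1" USR1
        for i in a b c; do
            if [ "$i" = b ]; then kill -USR1 "$$"; fi
            if [ "$stop" -eq 1 ]; then break; fi
            echo "$i"
        done
        echo "stopped before $i"
    '
}
```

Because the break is lexically enclosed by the for loop, this form should behave the same way regardless of how the shell treats a break inside a trap action.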
A break could be in a subshell nested in a loop:
sh -c 'for i in a b; do (break; echo $i) done'
Most shells (except mksh) exit from the subshell in this example. In FreeBSD sh, this is because the shell attempts to break from the loop, but the attempt stops at the subshell.
I think the dynamic scope is most compatible with existing shells and scripts. |
|
(0002259)
rhansen (manager)
2014-06-05 06:24
edited on: 2014-06-05 06:30
|
Thank you for the additional strange cases!
> A break or continue could be in a trap handler:
Because no loop lexically encloses the break in the trap, the proposed wording would cause this to fall under unspecified behavior. However, the trap is executed in the same environment as the loop so if the shell did dynamic scope and the trap was invoked while executing the loop then it should work as expected.
> A break could be in a subshell nested in a loop:
The proposed wording makes this unspecified behavior because the subshell puts the break in a different shell execution environment.
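As an illustration of the separate-environment point (a sketch only; the function name subshell_break_demo is hypothetical): whatever a shell does with the break inside the parentheses, the subshell cannot alter the control flow of the loop in the parent environment, so the parent loop still runs both iterations.

```shell
subshell_break_demo() {
    for i in a b; do
        # Unspecified under the proposed wording: the subshell may exit
        # at the break, or report an error and carry on.
        (break; echo "inside subshell $i")
        # Always reached: the parent loop is in a different environment.
        echo "after subshell $i"
    done
}
```

Only the "after subshell" lines are predictable here; whether any "inside subshell" line or diagnostic appears depends on the shell.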
> I think the dynamic scope is most compatible with existing shells and scripts.
Agreed, although there are some implementations that don't do dynamic scoping. Also, I agree with what Geoff said on the mailing list:
The current situation has persisted for at least 20 years, so I don't have a problem with it continuing. I imagine that almost all shell script authors just naturally use break and continue in a way that works the same with either scope, without being aware of the issue at all.
|
|
(0003230)
stephane (reporter)
2016-05-17 22:21
|
Two other cases to consider:
for i in 1 2; do
    echo "$i"
    eval break
done
mksh and other pdksh-derived shells choke on that. yash, ksh88, ksh93, zsh, bash, and ash-based shells don't.
for i in 1 2; do
    echo "$i"
    command break
done
As a non-special built-in, "command" is required to run in a separate environment. That would be a bug in the spec as "command" as specified clearly needs to run in the current environment. That should be addressed in a different bug but once that's fixed, a note about "command break" being valid (or not as implementations are allowed to implement "command" as a function) should probably be added here as well.
Also, it would help to clarify whether "break n" or "continue n" where n is greater than the number of enclosing loops should be considered an error or not (and the consequence on stderr and exit status). Some shells (yash, mksh) do report an error. bash and zsh do so only if the number of enclosing loops is 0. yash and zsh return a non-zero exit status when giving an error; others don't. |
|
(0003234)
rhansen (manager)
2016-05-26 16:21
|
Regarding implementations that choke on 'eval break', I believe that is clearly a bug in those implementations (they do not implement 'eval' properly).
Regarding 'command break', I believe the standard is clear that it should behave just like 'break' except that an error in 'break' shall not cause the shell to abort, and variable assignments before 'command break' do not remain in effect after 'command break' completes.
Regarding 'command' needing to run in the current environment: This is something that I have always found to be a bit confusing. Note that the standard does support non-special built-ins affecting the current execution environment; section 2.12 says:
    The environment of the shell process shall not be changed by the utility unless explicitly specified by the utility description (for example, cd and umask).
In the past when we've discussed utilities (that are not special built-ins) that affect or depend on the current execution environment I believe the argument was that they could run in a separate execution environment but somehow (in an implementation-defined manner) communicate with the invoking shell to achieve their effects or acquire the relevant information. Usually this communication is trivial because those utilities are implemented as (regular) built-ins.
If 'command' is implemented as a function, the standard requires it to behave as if it was not implemented as a function.
It may be worthwhile to add a clarifying (non-normative) note saying that 'command break n' works like 'break n' except it doesn't abort the shell on errors or preserve in-line variable assignments.
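A sketch of what such a note could illustrate, assuming a shell that implements 'command' as described above (shells that do not may behave differently). The function name command_break_demo is hypothetical.

```shell
command_break_demo() {
    for outer in 1 2; do
        for inner in a b; do
            echo "$outer$inner"
            command break 2   # expected to act like 'break 2'
        done
        echo "not reached if 'command break 2' exits both loops"
    done
    echo "after loops"
}
```

In a shell that treats 'command break 2' like 'break 2', only the first inner iteration runs and execution resumes after the outer loop.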
I agree that we should clarify whether it is an error if 'n' is greater than the number of enclosing loops. Would you mind filing a new bug report about that? |
|
(0003264)
stephane (reporter)
2016-06-11 21:17
|
> Regarding implementations that choke on 'eval break', I
> believe that is clearly a bug in those implementations (they
> do not implement 'eval' properly).
"eval" is the command that causes the shell to interpret the
code made of the concatenation of the arguments. "." is the
command that causes the shell to interpret the code in the file
given as arguments.
That's two cases of a new instance of the parser being started
to invoke some code.
I don't object to requiring "eval break" to break out of a loop,
and am even in favour of it, but if we're to make the behaviour
unspecified for "." (even though all shells seem to concur on
that one), I think we should make it explicit in the spec that
"eval" is OK.
[...]
> Regarding 'command' needing to run in the current environment:
> This is something that I have always found to be a bit
> confusing. Note that the standard does support non-special
> built-ins affecting the current execution environment; section
> 2.12 says:
>
> The environment of the shell process shall not be changed
> by the utility unless explicitly specified by the utility
> description (for example, cd and umask).
>
> In the past when we've discussed utilities (that are not
> special built-ins) that affect or depend on the current
> execution environment I believe the argument was that they
> could run in a separate execution environment but somehow (in
> an implementation-defined manner) communicate with the
> invoking shell to achieve their effects or acquire the
> relevant information. Usually this communication is trivial
> because those utilities are implemented as (regular)
> built-ins.
[...]
So, for instance in:
command eval '
a=test
sleep 1
kill "$$"'
"command" would be spawned in a new process. It would
ask the shell via some implementation-defined IPC mechanism to
run that "eval" command. So the shell would run eval while at
the same time waiting for "command". And when finished running
eval it would communicate the exit status of eval back to "command"
which would then exit with it. What should happen if the
process running "command" is killed? Or like in this example if
the process running the shell is killed?
In an interactive shell, would command and eval (and sleep) run
in the same process group?
Wouldn't it be simpler to say that "command" runs in the current
shell environment *and in the same process* as a built-in as
there's not really any other sane way to implement it?
> If 'command' is implemented as a function, the standard
> requires it to behave as if it was not implemented as a
> function.
Even if the application redefines any of the utilities called by
that function as functions?
> It may be worthwhile to add a clarifying (non-normative) note
> saying that 'command break n' works like 'break n' except it
> doesn't abort the shell on errors or preserve in-line variable
> assignments.
It wouldn't harm indeed.
> I agree that we should clarify whether it is an error if 'n'
> is greater than the number of enclosing loops. Would you mind
> filing a new bug report about that?
I'll do. |
|