Austin Group Defect Tracker

ID 0001607
Category [1003.1(2016/18)/Issue7+TC2] Shell and Utilities
Severity Editorial
Type Clarification Requested
Date Submitted 2022-09-26 12:22
Last Update 2024-06-11 09:07
Reporter nmeum
View Status public
Assigned To
Priority normal
Resolution Accepted As Marked
Status Closed
Name Sören Tempel
Organization
User Reference
Section ed
Page Number 2691
Line Number 87825
Interp Status ---
Final Accepted Text Note: 0005984
Summary 0001607: Operator associativity for address chain operator is not specified
Description The ed specification does not specify the operator associativity for the address chain operators (i.e. , and ;) in a normative section. The rationale section (which is, in my understanding, only informative) mandates the following address handling rules in the table at lines 87825 and 87834:

Address Addr1 Addr2
7,5,    5     5
7;5;    5     5

As such, it seems that the address chain operators are right-associative. That is, the address 7,5, must be evaluated as 7,(5,) and not as (7,5), since the latter would yield the address 1,$. However, to the best of my knowledge, this is never explicitly stated in the specification.

I already mentioned this as a side note in https://austingroupbugs.net/view.php?id=1582 but it seems to me that this issue was overlooked and thus not addressed as part of the proposed interpretation, hence I am creating a separate bug report for it.
Desired Action After the sentence starting in line 87370 add an additional sentence which clarifies that the <comma> and <semicolon> operators are right-associative and not left-associative.
Tags tc3-2008
Attached Files

- Relationships

-  Notes
(0005975)
kre (reporter)
2022-09-26 14:01
edited on: 2022-09-26 19:46

I would agree that perhaps a better specification of address chains might
be required, but I don't believe that defining some kind of associativity
is the way to do it: the , and ; aren't operators as such, they're separators.

ed addresses are evaluated left to right, in order, and then the last N (where
the value of N depends upon the command being executed) are used.

It really is as simple as that.

When the separator is a ';' "." is set to the value of the preceding address
before the following one is evaluated.

I have no idea where the:
      That is, the address 7,5, must be evaluated as 7,(5,) and not as (7,5),
      since the latter would yield the address 1,$.
1,$ conclusion can possibly originate - where does the 1 come from? A first
address (7,5) would be "5" and "5," (as the table at lines 87376-82 shows)
 ==> 5,5 not 5,$. On the other hand 7,(5,) would be 7,(5,5) which would
be (if it meant anything at all) 7,5 which is eventually going to produce an
error if the command needs 2 addresses.

(0005976)
nmeum (reporter)
2022-09-26 16:54

> I have no idea where the: 1,$ conclusion can possibly originate - where does the 1 come from?

Apologies if I am misunderstanding the specification here, but isn't this a question of applying the omission rules? According to the omission rules "," is expanded to "1,$" (line number 87377). Therefore, I assumed 7,5, to be evaluated (left to right) as follows when grouped from the left:

  7,5, -> (7,5), -> (7,5)(1,$) -> 1,$

Since 7,5 is already a valid address there are no omission rules to apply. As such, you would then expand "," to "1,$" (line number 87377) and then discard addresses until the "maximum number of valid addresses remain" (line number 87368). Assuming the command takes two addresses, my understanding would be that one thus ends up with "1,$" with left to right evaluation order. This is, of course, in violation of line number 87825, thus I assumed the specification requires grouping from the right:

  7,5, -> 7,(5,) -> 7,5,5 -> 5,5

> A first address (7,5) would be 5 and 5, [...]

Why would "(7,5)" be "5" and "5,"? Is this a typo and you mean "7" and "5,"? If so: Are you not grouping from the right then? Or do you mean that 7,5 should evaluate to 5? If so: where is this stated in the spec?
(0005977)
kre (reporter)
2022-09-26 19:30

Re Note: 0005976

OK, I see now how you got the "1,$" but that's not the way things work.

    but isn't this a question of applying the omission rules?

Eventually, yes.

   Since 7,5 is already a valid address there are no omission rules to apply.

It is actually 2 addresses, but yes, that's correct otherwise.

   As such, you would then expand "," to "1,$"

No, that rule only applies when there is no address before (or after)
the ',', but that's not the case here, here there is an address before
the ',', "5" (and another before that, but that one is irrelevant now).
What we have is "5," not a bare "," and in that case line 87379 applies.

When you do: 7,5, -> (7,5), -> (7,5)(1,$)

you have managed to make 4 addresses (7 5 1 and $) with only 2
separators, and that makes no sense at all. There are 3 addresses,
you cannot invent a new one.

You wouldn't do that kind of transformation with arithmetic, you
don't do it here either -- consider 2*3*4 - treated as (2*3)*4 you
cannot make that into (2*3)(0*4) or even (2*3)(1*4) it just makes
no sense. We would end up with (6)(0) (or (6)(4)) - so should the
answer be either 60 or 64? I hope not.

    Why would "(7,5)" be "5" and "5,"?

No, not a typo. 7,5 is two addresses; when we proceed to the next addr,
only the 2nd of those means anything, hence "5". The separator is ","
so we are now evaluating "5," and line 87379 applies.
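
The rule described here (a trailing "5," repeats the preceding address, while a bare "," or ";" falls back to defaults) can be sketched in Python. This is only an illustration under stated assumptions, not text from the standard or code from any ed implementation: the function name evaluate_with_omissions and its parameters are hypothetical, only plain line numbers are supported, and the defaults for a leading omitted address (1 before ',' and '.' before ';') follow the omission table cited in these notes. Chains with an omitted address in the middle (such as "7,,5") are not discussed in this issue, so whatever the sketch does with them should not be read as intended behaviour.

```python
import re


def evaluate_with_omissions(chain, dot, last):
    """Evaluate a chain of numeric addresses left to right.

    chain -- e.g. "7,5," (numbers separated by ',' or ';')
    dot   -- current line number ('.')
    last  -- last line number in the buffer ('$')
    Returns every evaluated address, in order.
    """
    # With a capturing group, re.split keeps the separators:
    # "7,5," -> ['7', ',', '5', ',', '']  (addresses at even indexes)
    pieces = re.split(r"([,;])", chain)
    values = []
    prev_explicit = False
    for i in range(0, len(pieces), 2):
        text = pieces[i]
        before = pieces[i - 1] if i > 0 else None
        after = pieces[i + 1] if i + 1 < len(pieces) else None
        if text:
            value = int(text)
        elif before is None:
            # Omitted address in front of the first separator.
            value = 1 if after == "," else dot
        elif prev_explicit:
            # "addr," and "addr;" repeat the preceding address.
            value = values[-1]
        else:
            # Bare "," or ";": the second address defaults to '$'.
            value = last
        values.append(value)
        if after == ";":
            dot = value  # ';' sets '.' before the next address
        prev_explicit = bool(text)
    return values


# A two-address command keeps only the last two evaluated addresses.
print(evaluate_with_omissions("7,5,", dot=1, last=10)[-2:])  # [5, 5]
print(evaluate_with_omissions(",", dot=3, last=10))          # [1, 10]
```

Under these assumptions "7,5," and "7;5;" both end up as 5,5, which agrees with the rationale table quoted in the description, without any grouping from the left or the right.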

   where is this stated in the spec?

That goes back to the first sentence of my Note: 0005975 .. it probably
is not stated, and should be. But it is very dangerous to ever claim
something is not in the standard, one would need to know every word of
all of it, to be sure of that, and I don't think anyone claims that
ability. This doesn't mean that sections can't be improved, even if
that is not strictly necessary.

I suspect this might be one of those "everyone just knows" kinds of
things, no-one who really knows ed can even imagine address chains
being evaluated any other way. This is not the only time this
(apparently) has led to sloppy wording.

Perhaps the best demonstration of what is intended is this sentence,
also from the Rationale (and so, as you surmised, not normative, not
really a part of the standard - though it is possible to use to
resolve ambiguities).

From lines 87815-8:

    For example, the command "3;/foo/;+2p" will display the first
    line after line 3 that contains the pattern foo, plus the next
    two lines. Note that the address "3;" must still be evaluated
    before being discarded, because the search origin for the
    "/foo/" command depends on this.

["/foo/" is not a command, but an address, and that ought to be fixed,
but that isn't the point here.]

If you attempt to apply your method to this address chain you will
absolutely not get the desired result.

But if you do simple left to right address evaluation, where the
',', or in this case ';' (do note that an address chain can use both
',' and ';' as separators) marks the division between the addresses
(only when not in a pattern of course) the result is obvious.

First address, "3" that's a simple case, line 3. Separator is ';'
so set '.' to 3. Second address "/foo/" search forward from '.'
looking for the r.e. (in this case, just a string) "foo". Let's
assume that is found in line 12. Next separator is ';' so set '.'
to 12. Third address "+2" which means ".+2" or 14. So the
evaluated addr chain is 3 12 14

The 'p' command takes only (max of) 2 addresses, so the 3 is
ignored at this stage, and we print lines 12 13 and 14 (ie:
the first line after line 3 containing "foo" and the two following
lines, just as the text says will happen).
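
The walk-through above can be sketched in Python. This is a minimal illustration, not taken from the standard or from any ed implementation: the function names evaluate_chain and resolve, the sample buffer, and the supported subset (line numbers, ".", "$", "+N", "-N" and "/re/" forward search, with no omitted addresses) are all assumptions made for the example, and Python's re syntax stands in for the basic regular expressions that ed uses.

```python
import re


def resolve(addr, lines, dot):
    """Turn one address into a line number, relative to dot."""
    last = len(lines)
    if addr == ".":
        return dot
    if addr == "$":
        return last
    if addr[0] in "+-":
        return dot + int(addr)  # "+2" means ".+2"
    if addr.startswith("/"):
        pattern = re.compile(addr[1:-1])
        # Search forward from the line after dot, wrapping around
        # the buffer and ending at dot itself.
        for step in range(1, last + 1):
            n = (dot - 1 + step) % last + 1
            if pattern.search(lines[n - 1]):
                return n
        raise ValueError("pattern not found: " + addr)
    return int(addr)


def evaluate_chain(chain, lines, dot):
    """Evaluate an address chain left to right; return all results."""
    # Tokens are /re/ patterns, separators, or anything else.
    # Separators inside a /re/ pattern are not treated as separators.
    tokens = re.findall(r"/(?:\\.|[^/\\])*/|[,;]|[^,;/]+", chain)
    results = []
    for tok in tokens:
        if tok == ",":
            continue
        if tok == ";":
            dot = results[-1]  # ';' sets '.' to the preceding address
            continue
        results.append(resolve(tok, lines, dot))
    return results


# Sample buffer: 20 lines, with "foo" on lines 2 and 12.
buffer_lines = ["line %d" % n for n in range(1, 21)]
buffer_lines[1] = "foo early"
buffer_lines[11] = "foo here"

evaluated = evaluate_chain("3;/foo/;+2", buffer_lines, dot=1)
print(evaluated)       # [3, 12, 14]
print(evaluated[-2:])  # [12, 14], the range a two-address command uses
```

With ',' in place of ';' the same sketch gives [3, 2, 3] for "3,/foo/,+2", because '.' stays at line 1 throughout. That is why the "3;" has to be evaluated even though it is later discarded.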

ed really is a very simple beast: if there is an easy way to explain
how something might work, and a complex one, the easy one will be
correct every time.
(0005978)
kre (reporter)
2022-09-26 19:55

While we are here, playing with wording in ed - and at just about
the place a change ought to be made, there's another piece of
bizarre (though not incorrect) wording:

Lines 87365-6

    If more than the required number of addresses
    are provided to a command that requires zero addresses,
    it shall be an error.

"more than the required number of addresses ... to a command
that requires zero addresses"

Really!

Could this be changed into

    If addresses are provided to a command that takes zero addresses...
(0005979)
nmeum (reporter)
2022-09-27 08:59

Thank you for your detailed comments.

> That goes back to the first sentence of my Note: 0005975 .. it probably is not stated, and should be.

But in that case we both agree that the specification currently does not describe how an address chain like "7,5," should be evaluated, that is the point of this clarification request. I think the discussion here clearly demonstrates that "reverse-engineering" the evaluation algorithm from examples such as "3;/foo/;+2p" or "7,5," can obviously lead to wrong results. All I am asking for is a clarification of this algorithm in a normative section. Grouping separators from the right was just me trying to make sense of the example chains in the rationale section.

> I suspect this might be one of those "everyone just knows" kinds of things, no-one who really knows ed can even imagine address chains being evaluated any other way. This is not the only time this (apparently) has led to sloppy wording.

I would just propose adding an additional sentence to the paragraph in line number 87370 to describe the address chain evaluation algorithm that "everyone just knows" but isn't stated in the specification currently.

> But it is very dangerous to ever claim something is not in the standard [...]

The "Addresses in ed" section (line number 87317), or more specifically, the paragraph in 87370 - 87373 (where the "," and ";" separators are introduced) is where I would have expected the algorithm to be described. As far as I can tell, it is not described in this section. For the purpose of clarification, it makes sense to describe the algorithm in this section.
(0005980)
geoffclare (manager)
2022-09-27 09:24

The relevant part of "Addresses in ed" is 87366-87369:
if more than the required number of addresses are provided to a command, the addresses specified first shall be evaluated and then discarded until the maximum number of valid addresses remain, for the specified command.

Once this rule has been followed there are no extra addresses and so the question of associativity simply never arises.
(0005981)
nmeum (reporter)
2022-09-27 12:39

Feel free to close this issue if you feel that this is sufficiently specified. Maybe it is just me.
(0005982)
kre (reporter)
2022-09-27 16:06
edited on: 2022-09-27 16:12

Re Note: 0005980

I agree the question of associativity never arises, but for a different
reason - it doesn't apply because the ',' and ';' are not operators
(the text is quite clear already that they are separators), the idea
that they could associate one way or the other would mean treating an
address chain as some kind of expression, which it isn't. Each address
is an entity in itself.

Re Note: 0005981

I am not sure that there is nothing to change however. In particular,
the words that Geoff quoted, which have an obvious meaning
to anyone who already knows what that obvious meaning is, are by no means
clear.

   the addresses specified first shall be evaluated and then discarded

might be

  (the addresses specified first) shall be evaluated and then discarded
or
   the addresses specified (first shall be evaluated and then discarded)

and in neither case does that say which order the addresses should be
evaluated, though lines 87371-2 say

   In the case of a <semicolon> separator, the current line ('.') shall
   be set to the first address,
that part isn't important for the current issue, but..
   and only then will the second address be calculated.
this is, as it requires that the first addr be calculated before the
second. But this only applies to ';' separators.

Consider
          /foo/;+2,/bar/

In ed, we all (or most of us) know that we search from . to find foo,
set . to that (and addr#1), addr#2 is addr#1+2, and then we search
forward from . (the updated .) to find bar, which gives us addr#3.
Then we use addr#3 (for one addr commands) or addr#2 and addr#3 for
two addr commands.

But nothing I see in the normative text requires that. If we first
evaluate the addresses, in some unspecified order except where the
separator is ';', then why not evaluate /bar/ first? Then /foo/
and finally +2 ? Then we discard the unnecessary ones (but which?)
Where does the normative text prohibit that?

This should be easy to fix, just change those lines 87366-87369
from

    if more than the required number of addresses are provided to
    a command, the addresses specified first shall be evaluated and
    then discarded until the maximum number of valid addresses remain,
    for the specified command.

into

    if more than the required number of addresses are provided to
    a command, the addresses shall be evaluated from first to last, and
    then discarded, until the maximum number of valid addresses remain,
    for the specified command.

And as well as doing that, in lines 87365-6 make the change suggested in
Note: 0005978 or something similar.

And third, in line 87818 change the word "command" to "address".

(0005984)
geoffclare (manager)
2022-09-29 11:05

Suggested changes ...

On page 2680 line 87365 section ed, change:
Commands accept zero, one, or two addresses. If more than the required number of addresses are provided to a command that requires zero addresses, it shall be an error. Otherwise, if more than the required number of addresses are provided to a command, the addresses specified first shall be evaluated and then discarded until the maximum number of valid addresses remain, for the specified command.
to:
Commands accept zero, one, or two addresses. If one or more addresses are provided to a command that accepts zero addresses, it shall be an error. Otherwise, if more than the maximum number of accepted addresses are provided to a command, the addresses shall be evaluated from first to last and then discarded, until the maximum number of accepted addresses for that command remain.

On page 2691 line 87812 section ed, change:
Any number of addresses can be provided to commands taking addresses; for example, "1,2,3,4,5p" prints lines 4 and 5, because two is the greatest valid number of addresses accepted by the print command.
to:
More than the maximum number of accepted addresses can be provided to commands taking addresses; for example, "1,2,3,4,5p" prints lines 4 and 5, because two is the maximum number of addresses accepted by the print command.

On page 2691 line 87818 section ed, change:
the search origin for the "/foo/" command depends on this.
to:
the search origin for the "/foo/" address depends on this.
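
The effect of the revised wording can be sketched in Python, assuming the addresses have already been evaluated from first to last. The table MAX_ADDRESSES and the function select_addresses are hypothetical names used only for this illustration, and the table lists just three ed commands as examples (p accepts up to two addresses, a accepts one, q accepts none).

```python
# Illustrative subset: maximum number of addresses each command accepts.
MAX_ADDRESSES = {"p": 2, "a": 1, "q": 0}


def select_addresses(evaluated, command):
    """Apply the discard rule to addresses already evaluated in order.

    evaluated -- line numbers, first to last, as written in the chain
    command   -- single-letter command name (key of MAX_ADDRESSES)
    """
    limit = MAX_ADDRESSES[command]
    if limit == 0:
        # One or more addresses on a zero-address command is an error.
        if evaluated:
            raise ValueError("command %r accepts no addresses" % command)
        return []
    # Discard from the front until at most `limit` addresses remain.
    return evaluated[-limit:]


print(select_addresses([1, 2, 3, 4, 5], "p"))  # [4, 5]
print(select_addresses([3, 12, 14], "a"))      # [14]
```

For "1,2,3,4,5p" the sketch keeps 4 and 5, matching the example in the rationale text; how a command supplies its own defaults when fewer addresses are given is outside the scope of this sketch.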

- Issue History
Date Modified Username Field Change
2022-09-26 12:22 nmeum New Issue
2022-09-26 12:22 nmeum Name => Sören Tempel
2022-09-26 12:22 nmeum Section => ed
2022-09-26 12:22 nmeum Page Number => 2691
2022-09-26 12:22 nmeum Line Number => 87825
2022-09-26 14:01 kre Note Added: 0005975
2022-09-26 16:54 nmeum Note Added: 0005976
2022-09-26 19:30 kre Note Added: 0005977
2022-09-26 19:46 kre Note Edited: 0005975
2022-09-26 19:55 kre Note Added: 0005978
2022-09-27 08:59 nmeum Note Added: 0005979
2022-09-27 09:24 geoffclare Note Added: 0005980
2022-09-27 12:39 nmeum Note Added: 0005981
2022-09-27 16:06 kre Note Added: 0005982
2022-09-27 16:08 kre Note Edited: 0005982
2022-09-27 16:12 kre Note Edited: 0005982
2022-09-29 11:05 geoffclare Note Added: 0005984
2022-10-20 16:02 geoffclare Interp Status => ---
2022-10-20 16:02 geoffclare Final Accepted Text => Note: 0005984
2022-10-20 16:02 geoffclare Status New => Resolved
2022-10-20 16:02 geoffclare Resolution Open => Accepted As Marked
2022-10-20 16:02 geoffclare Tag Attached: tc3-2008
2022-11-01 15:22 geoffclare Status Resolved => Applied
2024-06-11 09:07 agadmin Status Applied => Closed

