0001607: Operator associativity for address chain operator is not specified

ID	Project	Category	View Status	Date Submitted	Last Update

0001607	1003.1(2016/18)/Issue7+TC2	Shell and Utilities	public	2022-09-26 12:22	2024-06-11 09:07

Reporter	nmeum	Assigned To
Priority	normal	Severity	Editorial	Type	Clarification Requested
Status	Closed	Resolution	Accepted As Marked

Name	Sören Tempel
Organization
User Reference
Section	ed
Page Number	2691
Line Number	87825
Interp Status	---
Final Accepted Text	0001607:0005984


Summary	0001607: Operator associativity for address chain operator is not specified
Description	The ed specification does not specify the operator associativity for the address chain operators (i.e. , and ;) in a normative section. The rationale section (which, in my understanding, is only informative) mandates the following address handling rules in the table at line numbers 87825 and 87834:
Address	Addr1	Addr2
7,5,	5	5
7;5;	5	5
As such, it seems that the address chain operators are right-associative. That is, the address 7,5, must be evaluated as 7,(5,) and not as (7,5), since the latter would yield the address 1,$. However, to the best of my knowledge, this is never explicitly stated in the specification.
I already mentioned this as a side note in https://austingroupbugs.net/view.php?id=1582 but it seems to me that this issue was overlooked and thus not addressed as part of the proposed interpretation, hence I am creating a separate bug report for it.
Desired Action	After the sentence starting in line 87370 add an additional sentence which clarifies that the <comma> and <semicolon> operators are right-associative and not left-associative.
Tags	tc3-2008

kre 2022-09-26 14:01 reporter bugnote:0005975 Last edited: 2022-09-26 19:46	I would agree that perhaps a better specification of address chains might be required, but I don't believe that defining some kind of associativity is the way to do it. The , and ; aren't operators as such, they're separators. ed addresses are evaluated left to right, in order, and then the last N (where the value of N depends upon the command being executed) are used. It really is as simple as that. When the separator is a ';', "." is set to the value of the preceding address before the following one is evaluated.
I have no idea where the:
> That is, the address 7,5, must be evaluated as 7,(5,) and not as (7,5), since the latter would yield the address 1,$.
1,$ conclusion can possibly originate - where does the 1 come from?
A first address (7,5) would be 5 and 5, (as the table at lines 87376-82 shows) ==> 5,5 not 5,$
On the other hand 7,(5,) would be 7,(5,5) which would be (if it meant anything at all) 7,5 which is eventually going to produce an error if the command needs 2 addresses.

nmeum 2022-09-26 16:54 reporter bugnote:0005976	> I have no idea where the: 1,$ conclusion can possibly originate - where does the 1 come from?
Apologies if I am misunderstanding the specification here, but isn't this a question of applying the omission rules? According to the omission rules "," is expanded to "1,$" (line number 87377). Therefore, I assumed 7,5, to be evaluated (left to right) as follows when grouped from the left:
7,5, -> (7,5), -> (7,5)(1,$) -> 1,$
Since 7,5 is already a valid address there are no omission rules to apply. As such, you would then expand "," to "1,$" (line number 87377) and then discard addresses until the "maximum number of valid addresses remain" (line number 87368). Assuming the command takes two addresses, my understanding would be that one thus ends up with "1,$" with left to right evaluation order. This is, of course, in violation of line number 87825, thus I assumed the specification requires grouping from the right:
7,5, -> 7,(5,) -> 7,5,5 -> 5,5
> A first address (7,5) would be 5 and 5, [...]
Why would "(7,5)" be "5" and "5,"? Is this a typo and you mean "7" and "5,"? If so: Are you not grouping from the right then? Or do you mean that 7,5 should evaluate to 5? If so: where is this stated in the spec?

kre 2022-09-26 19:30 reporter bugnote:0005977	Re 0001607:0005976
OK, I see now how you got the "1,$" but that's not the way things work.
> but isn't this a question of applying the omission rules?
Eventually, yes.
> Since 7,5 is already a valid address there are no omission rules to apply.
It is actually 2 addresses, but yes, that's correct otherwise.
> As such, you would then expand "," to "1,$"
No, that rule only applies when there is no address before (or after) the ',', but that's not the case here, here there is an address before the ',', "5" (and another before that, but that one is irrelevant now). What we have is "5," not a bare "," and in that case line 87379 applies.
When you do:
> 7,5, -> (7,5), -> (7,5)(1,$)
you have managed to make 4 addresses (7 5 1 and $) with only 2 separators, and that makes no sense at all. There are 3 addresses, you cannot invent a new one. You wouldn't do that kind of transformation with arithmetic, you don't do it here either -- consider 2*3*4 - treated as (2*3)*4 you cannot make that into (2*3)(0*4) or even (2*3)(1*4), it just makes no sense. We would end up with (6)(0) (or (6)(4)) - so should the answer be either 60 or 64? I hope not.
> Why would "(7,5)" be "5" and "5,"?
No, not a typo, 7,5 is two addresses, when we proceed to the next addr, only the 2nd of those means anything, hence "5". The separator is "," so we are now evaluating "5," and line 87379 applies.
> where is this stated in the spec?
That goes back to the first sentence of my 0001607:0005975 .. it probably is not stated, and should be. But it is very dangerous to ever claim something is not in the standard, one would need to know every word of all of it, to be sure of that, and I don't think anyone claims that ability. This doesn't mean that sections can't be improved, even if that is not strictly necessary.
I suspect this might be one of those "everyone just knows" kinds of things, no-one who really knows ed can even imagine address chains being evaluated any other way.
This is not the only time this (apparently) has led to sloppy wording. Perhaps the best demonstration of what is intended is this sentence, also from the Rationale (and so, as you surmised, not normative, not really a part of the standard - though it is possible to use it to resolve ambiguities). From lines 87815-8:
> For example, the command "3;/foo/;+2p" will display the first line after line 3 that contains the pattern foo, plus the next two lines. Note that the address "3;" must still be evaluated before being discarded, because the search origin for the "/foo/" command depends on this.
["/foo/" is not a command, but an address, and that ought be fixed, but that isn't the point here.]
If you attempt to apply your method to this address chain you will absolutely not get the desired result. But if you do simple left to right address evaluation, where the ',', or in this case ';' (do note that an address chain can use both ',' and ';' as separators) marks the division between the addresses (only when not in a pattern of course) the result is obvious.
First address, "3": that's a simple case, line 3. Separator is ';' so set '.' to 3.
Second address, "/foo/": search forward from '.' looking for the r.e. (in this case, just a string) "foo". Let's assume that is found in line 12. Next separator is ';' so set '.' to 12.
Third address, "+2": which means ".+2" or 14.
So the evaluated addr chain is 3 12 14
The 'p' command takes only (max of) 2 addresses, so the 3 is ignored at this stage, and we print lines 12 13 and 14 (ie: the first line after line 3 containing "foo" and the two following lines, just as the text says will happen).
ed really is a very simple beast, if there is an easy way to explain how something might work, and a complex one, the easy one will be correct every time.
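
The left-to-right walk-through in the note above can be sketched in Python. This is only an illustration under stated assumptions, not text from the standard and not code from any ed implementation: the names split_chain, eval_address and eval_chain are hypothetical, only line numbers, ".", "$", "+n" and forward "/re/" searches are supported, escaped slashes inside a pattern are not handled, and an omitted address after a separator is treated as a repeat of the preceding address (the "addr," case the note refers to at line 87379).

```python
import re

def split_chain(chain):
    """Split an address chain into (address, separator) pairs.

    A ',' or ';' inside a /pattern/ is not a separator. Escaped slashes
    are not handled; that is a simplification made for this sketch.
    """
    pairs, current, in_pattern = [], "", False
    for ch in chain:
        if ch == "/":
            in_pattern = not in_pattern
        if ch in ",;" and not in_pattern:
            pairs.append((current, ch))
            current = ""
        else:
            current += ch
    pairs.append((current, ""))  # the last address has no separator after it
    return pairs

def eval_address(addr, lines, dot):
    """Evaluate one address against a buffer (line numbers are 1-based)."""
    if addr == ".":
        return dot
    if addr == "$":
        return len(lines)
    if addr.startswith("+"):
        return dot + int(addr[1:])
    if len(addr) >= 2 and addr.startswith("/") and addr.endswith("/"):
        pattern = re.compile(addr[1:-1])
        n = len(lines)
        # Search forward from the line after dot, wrapping around the buffer.
        for step in range(1, n + 1):
            lineno = (dot - 1 + step) % n + 1
            if pattern.search(lines[lineno - 1]):
                return lineno
        raise ValueError("pattern not found")
    return int(addr)

def eval_chain(chain, lines, dot):
    """Evaluate every address from first to last and return all values."""
    values = []
    for addr, sep in split_chain(chain):
        if addr == "":
            if not values:
                raise ValueError("leading omitted address is not handled here")
            value = values[-1]  # assumption: "addr," is treated as "addr,addr"
        else:
            value = eval_address(addr, lines, dot)
        values.append(value)
        if sep == ";":
            dot = value  # ';' sets the current line before the next address
    return values

buffer = [f"line {i}" for i in range(1, 21)]
buffer[1] = "foo near the top"    # line 2: lies before the search origin
buffer[11] = "foo further down"   # line 12: first match after line 3

addrs = eval_chain("3;/foo/;+2", buffer, dot=1)
print(addrs)       # [3, 12, 14]
print(addrs[-2:])  # [12, 14]: the two addresses a 'p' command would keep
print(eval_chain("7,5,", buffer, dot=1)[-2:])  # [5, 5]
```

The extra "foo" on line 2 is there to show why "3;" has to be evaluated even though it is later discarded: without it the search would start from the original current line and stop at line 2.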

kre 2022-09-26 19:55 reporter bugnote:0005978	While we are here, playing with wording in ed - and at just about the place a change ought to be made, there's another piece of bizarre (though not incorrect) wording: Lines 87365-6
> If more than the required number of addresses are provided to a command that requires zero addresses, it shall be an error.
"more than the required number of addresses ... to a command that requires zero addresses" Really! Could this be changed into
> If addresses are provided to a command that takes zero addresses...

nmeum 2022-09-27 08:59 reporter bugnote:0005979	Thank you for your detailed comments.
> That goes back to the first sentence of my Note: 0005975 .. it probably is not stated, and should be.
But in that case we both agree that the specification currently does not describe how an address chain like "7,5," should be evaluated, that is the point of this clarification request. I think the discussion here clearly demonstrates that "reverse-engineering" the evaluation algorithm from examples such as "3;/foo/;+2p" or "7,5," can obviously lead to wrong results. All I am asking for is a clarification of this algorithm in a normative section. Grouping separators from the right was just me trying to make sense of the example chains in the rationale section.
> I suspect this might be one of those "everyone just knows" kinds of things, no-one who really knows ed can even imagine address chains being evaluated any other way. This is not the only time this (apparently) has led to sloppy wording.
I would just propose adding an additional sentence to the paragraph in line number 87370 to describe the address chain evaluation algorithm that "everyone just knows" but isn't stated in the specification currently.
> But it is very dangerous to ever claim something is not in the standard [...]
The "Addresses in ed" section (line number 87317), or more specifically, the paragraph in 87370 - 87373 (where the "," and ";" separators are introduced) is where I would have expected the algorithm to be described. As far as I can tell, it is not described in this section. For the purpose of clarification, it makes sense to describe the algorithm in this section.

geoffclare 2022-09-27 09:24 manager bugnote:0005980	The relevant part of "Addresses in ed" is 87366-87369:
> if more than the required number of addresses are provided to a command, the addresses specified first shall be evaluated and then discarded until the maximum number of valid addresses remain, for the specified command.
Once this rule has been followed there are no extra addresses and so the question of associativity simply never arises.

nmeum 2022-09-27 12:39 reporter bugnote:0005981	Feel free to close this issue if you feel that this is sufficiently specified. Maybe it is just me.

kre 2022-09-27 16:06 reporter bugnote:0005982 Last edited: 2022-09-27 16:12	Re 0001607:0005980
I agree the question of associativity never arises, but for a different reason - it doesn't apply because the ',' and ';' are not operators (the text is quite clear already that they are separators), the idea that they could associate one way or the other would mean treating an address chain as some kind of expression, which it isn't. Each address is an entity in itself.
Re 0001607:0005981
I am not sure that there is nothing to change however. In particular, the words that Geoff quoted, which have an obvious meaning to anyone who already knows what that obvious meaning is, are by no means clear.
> the addresses specified first shall be evaluated and then discarded
might be
> (the addresses specified first) shall be evaluated and then discarded
or
> the addresses specified (first shall be evaluated and then discarded)
and in neither case does that say in which order the addresses should be evaluated, though lines 87371-2 say
> In the case of a <semicolon> separator, the current line ('.') shall be set to the first address,
that part isn't important for the current issue, but..
> and only then will the second address be calculated.
this is, as it requires that the first addr be calculated before the second. But this only applies to ';' separators.
Consider
/foo/;+2,/bar/
In ed, we all (or most of us) know that we search from . to find foo, set . to that (and addr#1), addr#2 is addr#1+2, and then we search forward from . (the updated .) to find bar, which gives us addr#3. Then we use addr#3 (for one addr commands) or addr#2 and addr#3 for two addr commands. But nothing I see in the normative text requires that. If we first evaluate the addresses, in some unspecified order except where the separator is ';', then why not evaluate /bar/ first? Then /foo/ and finally +2 ? Then we discard the unnecessary ones (but which?) Where does the normative text prohibit that?
This should be easy to fix, just change those lines 87366-87369 from
> if more than the required number of addresses are provided to a command, the addresses specified first shall be evaluated and then discarded until the maximum number of valid addresses remain, for the specified command.
into
> if more than the required number of addresses are provided to a command, the addresses shall be evaluated from first to last, and then discarded, until the maximum number of valid addresses remain, for the specified command.
And as well as doing that, in lines 87365-6 make the change suggested in 0001607:0005978 or something similar. And third, in line 87818 change the word "command" to "address".

geoffclare 2022-09-29 11:05 manager bugnote:0005984	Suggested changes ...
On page 2680 line 87365 section ed, change:
> Commands accept zero, one, or two addresses. If more than the required number of addresses are provided to a command that requires zero addresses, it shall be an error. Otherwise, if more than the required number of addresses are provided to a command, the addresses specified first shall be evaluated and then discarded until the maximum number of valid addresses remain, for the specified command.
to:
> Commands accept zero, one, or two addresses. If one or more addresses are provided to a command that accepts zero addresses, it shall be an error. Otherwise, if more than the maximum number of accepted addresses are provided to a command, the addresses shall be evaluated from first to last and then discarded, until the maximum number of accepted addresses for that command remain.
On page 2691 line 87812 section ed, change:
> Any number of addresses can be provided to commands taking addresses; for example, "1,2,3,4,5p" prints lines 4 and 5, because two is the greatest valid number of addresses accepted by the print command.
to:
> More than the maximum number of accepted addresses can be provided to commands taking addresses; for example, "1,2,3,4,5p" prints lines 4 and 5, because two is the maximum number of addresses accepted by the print command.
On page 2691 line 87818 section ed, change:
> the search origin for the "/foo/" command depends on this.
to:
> the search origin for the "/foo/" address depends on this.
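
The address-count rule in the accepted wording can be illustrated with a short Python sketch. It assumes the addresses have already been evaluated from first to last into a list of line numbers; the function name select_addresses and the use of ValueError to signal the error case are hypothetical choices made for the illustration, not anything the standard prescribes.

```python
def select_addresses(values, max_addrs):
    """Apply the address-count rule to already-evaluated addresses.

    values    -- line numbers, evaluated from first to last
    max_addrs -- maximum number of addresses the command accepts (0, 1 or 2)
    """
    if max_addrs == 0:
        if values:
            # One or more addresses given to a command that accepts none.
            raise ValueError("addresses given to a command that accepts none")
        return []
    # Earlier addresses are discarded; at most the last max_addrs remain.
    return values[-max_addrs:]

print(select_addresses([1, 2, 3, 4, 5], 2))  # [4, 5], as in "1,2,3,4,5p"
print(select_addresses([3, 12, 14], 1))      # [14]
```

Because every address is evaluated before any is discarded, the discarded ones can still influence the result, for example by moving the current line when followed by a <semicolon>.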

Date Modified	Username	Field	Change
2022-09-26 12:22	nmeum	New Issue
2022-09-26 12:22	nmeum	Name	=> Sören Tempel
2022-09-26 12:22	nmeum	Section	=> ed
2022-09-26 12:22	nmeum	Page Number	=> 2691
2022-09-26 12:22	nmeum	Line Number	=> 87825
2022-09-26 14:01	kre	Note Added: 0005975
2022-09-26 16:54	nmeum	Note Added: 0005976
2022-09-26 19:30	kre	Note Added: 0005977
2022-09-26 19:46	kre	Note Edited: 0005975
2022-09-26 19:55	kre	Note Added: 0005978
2022-09-27 08:59	nmeum	Note Added: 0005979
2022-09-27 09:24	geoffclare	Note Added: 0005980
2022-09-27 12:39	nmeum	Note Added: 0005981
2022-09-27 16:06	kre	Note Added: 0005982
2022-09-27 16:08	kre	Note Edited: 0005982
2022-09-27 16:12	kre	Note Edited: 0005982
2022-09-29 11:05	geoffclare	Note Added: 0005984
2022-10-20 16:02	geoffclare	Interp Status	=> ---
2022-10-20 16:02	geoffclare	Final Accepted Text	=> 0001607:0005984
2022-10-20 16:02	geoffclare	Status	New => Resolved
2022-10-20 16:02	geoffclare	Resolution	Open => Accepted As Marked
2022-10-20 16:02	geoffclare	Tag Attached: tc3-2008
2022-11-01 15:22	geoffclare	Status	Resolved => Applied
2024-06-11 09:07	agadmin	Status	Applied => Closed
