Austin Group Defect Tracker

Aardvark Mark IV


ID 0001015
Category [1003.1(2013)/Issue7+TC1] Shell and Utilities
Severity Editorial
Type Clarification Requested
Date Submitted 2015-12-28 12:36
Last Update 2019-10-21 13:53
Reporter mirabilos
View Status public
Assigned To
Priority normal
Resolution Accepted As Marked
Status Applied
Name mirabilos
Organization The MirOS Project
User Reference http://article.gmane.org/gmane.comp.standards.posix.austin.general/11817
Section 2.6.3
Page Number http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_03
Line Number third paragraph
Interp Status Approved
Final Accepted Text Note: 0003445
Summary 0001015: Handling of <backslash><double-quote> inside a (double-quoted) backquote command substitution
Description Clarification of section 2.6.3 (along with 2.2.3’s discussion of the accent gravis) is requested.

According to my, and the yash developers’, reading of the standard, the <backslash> only acts as escape character within the backquoted style of command substitution if followed by exactly <dollar>, <accent gravis> or <backslash>, but not <double-quote> (2.6.3 says so, supported by 2.2.3), making it an ordinary (and to-be-preserved) character, which means that the shell script…

ACRO_INSTALL_DIR=/usr/Acroread/Reader
directory="`basename \"$ACRO_INSTALL_DIR\"`"
echo "<$directory>"

… has exactly one valid output: <Reader">

This is because, under this reading,
| directory="`basename \"$ACRO_INSTALL_DIR\"`"
is exactly equivalent to (with the outer double quotes being redundant):
| directory="$(basename \"$ACRO_INSTALL_DIR\")"

Historic and current practice, however, differs; shells output <Reader> for that. Under that reading, the aforementioned string is exactly equivalent to (with the outer double quotes again being redundant):
| directory="$(basename "$ACRO_INSTALL_DIR")"
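
The two readings can be told apart without involving backquotes at all, because the $(...) forms quoted above are unambiguous. A minimal sketch, reusing the variable from the report; the names literal_reading and practice_reading are made up for illustration and are not part of the report:

```shell
ACRO_INSTALL_DIR=/usr/Acroread/Reader

# Reading 1: the backslashes survive into the inner command, so basename
# receives the double-quote characters as part of its argument.
literal_reading="$(basename \"$ACRO_INSTALL_DIR\")"

# Reading 2: the backslashes are consumed first, so the inner double
# quotes merely quote the expansion.
practice_reading="$(basename "$ACRO_INSTALL_DIR")"

printf '%s\n' "<$literal_reading>" "<$practice_reading>"
```

This should print <Reader"> followed by <Reader>, which are the two candidate outputs the question is about.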
Desired Action Please clarify:

– if my (and the yash developers’) reading of the standard is correct, and
– if <Reader"> is indeed the only valid output, or
– if <Reader> is the only, or an additional permitted, output.

If <Reader> is permitted, it is likely that 2.6.3 has to be changed from…

Within the backquoted style of command substitution, <backslash> shall retain its literal meaning, except when followed by: '$', '`', or <backslash>.

… to:

Within the backquoted style of command substitution, <backslash> shall retain its literal meaning, except when followed by: '"', '$', '`', or <backslash>.

The desired outcome is for the standard to be changed to match current practice in virtually all shells (of those I tested, only mksh in POSIX mode, pdksh in POSIX mode, and yash differ), unless there is a very good and deliberate reason not to, and for the wording in the standard to be clarified.
Tags tc3-2008
Attached Files

- Relationships

-  Notes
(0003024)
mirabilos (reporter)
2016-01-15 14:40

In fact, I would like to formally request the standard to be changed to at least allow, if not require, the version that unescapes double quotes, as I’ve found at least two more recent pieces of software (mktexlsr from TeXlive, and dbconfig-common from Debian) that require it, while I’ve yet to see software that requires the version that does not unescape double quotes (as there are very few systems out in the field that use yash, or mksh R52 or later in POSIX mode, as /bin/sh).
(0003086)
kre (reporter)
2016-02-25 11:14
edited on: 2016-02-25 11:34

I think I disagree with your reading of the std. In particular, since
you are doing

    directory="`basename \"$ACRO_INSTALL_DIR\"`"

2.2.3 applies first. In that ...

   Enclosing characters in double-quotes ( "" ) shall preserve the literal
   value of all characters within the double-quotes, with the exception of the
   characters backquote, <dollar-sign>, and <backslash>, as follows:

\
    The <backslash> shall retain its special meaning as an escape character
    (see Escape Character (Backslash)) only when followed by one of the
    following characters when considered special:

           $ ` " \ <newline>

Since we are inside " the \ should be considered as escaping the " rather
than as a literal \. You might believe that the quoting rules start all over
again inside the ``, but I believe that is not correct, as 2.2.3 says:

    The input characters within the quoted string that are also enclosed
    between "$(" and the matching ')' shall not be affected by the
    double-quotes, but rather...

Making it quite clear that is true for a $( ) type command expansion.
But it says nothing of the kind about `...` expansions.

I suspect that's because that's the way shells all parse `...` and (for
most of them) always have. That's why almost all the shells you tested
act the way they do.

This is further reinforced by the 2nd of the two undefined cases where
2.2.3 talks about ` inside "".

     A "`...`" sequence that begins, but does not end, within the same
     double-quoted string

That would make no sense at all if the interpretation were the one you seem to
be making.

If we have "text ` " (no \ chars anywhere) and if the ` started
a whole new parsing context, then the " would be the start of a new
quoted string. Until that ends, the `` expression could not possibly
end, as a new ` in there would have to be beginning of a new command
substitution.

But that is not how it works, rather that second " terminated the
double quoted string, and we have a case of a `...` expression that
begins, but does not end, within a double quoted string, which is the
undefined syntax that 2.2.3 is talking about.

So, I think the std is correct as it is, and bash and the other shells
you mentioned are doing the right thing. NetBSD's sh (if anyone cares)
does the same. And if the (seemingly redundant) outer quotes are
removed, so that we execute

    directory=`basename \"$ACRO_INSTALL_DIR\"`

Then NetBSD's sh (and I suspect most others) does indeed produce
     <Reader">
as the output.

kre
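
The two backquote cases discussed in this note can be compared side by side. This is only a sketch (the variable names quoted and unquoted are illustrative). As the report states, the result of the double-quoted case differs between shells, so no single output is claimed for it:

```shell
ACRO_INSTALL_DIR=/usr/Acroread/Reader

# Backquotes inside double quotes: most shells tested in this thread
# apply the 2.2.3 rules, strip the backslashes and yield Reader.
# yash (and, per the report, mksh and pdksh in POSIX mode) were
# reported to yield Reader" instead.
quoted="`basename \"$ACRO_INSTALL_DIR\"`"

# Backquotes outside double quotes: the backslash before the double
# quote is kept (2.6.3), so basename sees the quote characters literally.
unquoted=`basename \"$ACRO_INSTALL_DIR\"`

printf '%s\n' "quoted:   <$quoted>" "unquoted: <$unquoted>"
```

Running the same file under several shells, as done later in this thread, is the easiest way to see which reading a given shell implements.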

(0003088)
mirabilos (reporter)
2016-03-01 18:59

Hi kre,

interesting reading, but your post opens up another question.

Consider this:

#-----BEGIN-----
    $ cat x
ACRO_INSTALL_DIR=/usr/Acroread/Reader

directory="`echo ba\s\te\
name \"$ACRO_INSTALL_DIR\"`"
printf '%s\n' "<$directory>"

directory=`echo ba\s\te\
name \"$ACRO_INSTALL_DIR\"`
printf '%s\n' "<$directory>"

directory="$(echo ba\s\te\
name \"$ACRO_INSTALL_DIR\")"
printf '%s\n' "<$directory>"

directory=$(echo ba\s\te\
name \"$ACRO_INSTALL_DIR\")
printf '%s\n' "<$directory>"

directory="`echo "ba\s\te\
name" \"$ACRO_INSTALL_DIR\"`"
printf '%s\n' "<$directory>"

directory=`echo "ba\s\te\
name" \"$ACRO_INSTALL_DIR\"`
printf '%s\n' "<$directory>"

directory="$(echo "ba\s\te\
name" \"$ACRO_INSTALL_DIR\")"
printf '%s\n' "<$directory>"

directory=$(echo "ba\s\te\
name" \"$ACRO_INSTALL_DIR\")
printf '%s\n' "<$directory>"

directory="`echo \"ba\s\te\
name\" \"$ACRO_INSTALL_DIR\"`"
printf '%s\n' "<$directory>"

directory=`echo \"ba\s\te\
name\" \"$ACRO_INSTALL_DIR\"`
printf '%s\n' "<$directory>"

directory="$(echo \"ba\s\te\
name\" \"$ACRO_INSTALL_DIR\")"
printf '%s\n' "<$directory>"

directory=$(echo \"ba\s\te\
name\" \"$ACRO_INSTALL_DIR\")
printf '%s\n' "<$directory>"
    $ mksh x
<bastename /usr/Acroread/Reader>
<bastename "/usr/Acroread/Reader">
<bastename "/usr/Acroread/Reader">
<bastename "/usr/Acroread/Reader">
<ba\s ename /usr/Acroread/Reader>
<ba\s ename "/usr/Acroread/Reader">
<ba\s ename "/usr/Acroread/Reader">
<ba\s ename "/usr/Acroread/Reader">
<ba\s ename /usr/Acroread/Reader>
<"bastename" "/usr/Acroread/Reader">
<"bastename" "/usr/Acroread/Reader">
<"bastename" "/usr/Acroread/Reader">
    $ mksh -o posix x # this is with disabled yash interpretation
<bastename /usr/Acroread/Reader>
<bastename "/usr/Acroread/Reader">
<bastename "/usr/Acroread/Reader">
<bastename "/usr/Acroread/Reader">
<ba\s\tename /usr/Acroread/Reader>
<ba\s\tename "/usr/Acroread/Reader">
<ba\s\tename "/usr/Acroread/Reader">
<ba\s\tename "/usr/Acroread/Reader">
<ba\s\tename /usr/Acroread/Reader>
<"bastename" "/usr/Acroread/Reader">
<"bastename" "/usr/Acroread/Reader">
<"bastename" "/usr/Acroread/Reader">
    $ bash --posix x
<bastename /usr/Acroread/Reader>
<bastename "/usr/Acroread/Reader">
<bastename "/usr/Acroread/Reader">
<bastename "/usr/Acroread/Reader">
<ba\s\tename /usr/Acroread/Reader>
<ba\s\tename "/usr/Acroread/Reader">
<ba\s\tename "/usr/Acroread/Reader">
<ba\s\tename "/usr/Acroread/Reader">
<ba\s\tename /usr/Acroread/Reader>
<"bastename" "/usr/Acroread/Reader">
<"bastename" "/usr/Acroread/Reader">
<"bastename" "/usr/Acroread/Reader">
    $ yash x
<bastename "/usr/Acroread/Reader">
<bastename "/usr/Acroread/Reader">
<bastename "/usr/Acroread/Reader">
<bastename "/usr/Acroread/Reader">
<ba\s ename "/usr/Acroread/Reader">
<ba\s ename "/usr/Acroread/Reader">
<ba\s ename "/usr/Acroread/Reader">
<ba\s ename "/usr/Acroread/Reader">
<"bastename" "/usr/Acroread/Reader">
<"bastename" "/usr/Acroread/Reader">
<"bastename" "/usr/Acroread/Reader">
<"bastename" "/usr/Acroread/Reader">
    $ dash x
<bastename /usr/Acroread/Reader>
<bastename "/usr/Acroread/Reader">
<bastename "/usr/Acroread/Reader">
<bastename "/usr/Acroread/Reader">
<ba\s ename /usr/Acroread/Reader>
<ba\s ename "/usr/Acroread/Reader">
<ba\s ename "/usr/Acroread/Reader">
<ba\s ename "/usr/Acroread/Reader">
<ba\s ename /usr/Acroread/Reader>
<"bastename" "/usr/Acroread/Reader">
<"bastename" "/usr/Acroread/Reader">
<"bastename" "/usr/Acroread/Reader">
    $ ksh93 x
<bastename /usr/Acroread/Reader>
<bastename "/usr/Acroread/Reader">
<bastename "/usr/Acroread/Reader">
<bastename "/usr/Acroread/Reader">
<ba\s\tename /usr/Acroread/Reader>
<ba\s\tename "/usr/Acroread/Reader">
<ba\s\tename "/usr/Acroread/Reader">
<ba\s\tename "/usr/Acroread/Reader">
<ba\s\tename /usr/Acroread/Reader>
<"bastename" "/usr/Acroread/Reader">
<"bastename" "/usr/Acroread/Reader">
<"bastename" "/usr/Acroread/Reader">
    $ zsh x
<bastename /usr/Acroread/Reader>
<bastename "/usr/Acroread/Reader">
<bastename "/usr/Acroread/Reader">
<bastename "/usr/Acroread/Reader">
<ba\s ename /usr/Acroread/Reader>
<ba\s ename "/usr/Acroread/Reader">
<ba\s ename "/usr/Acroread/Reader">
<ba\s ename "/usr/Acroread/Reader">
<ba\s ename /usr/Acroread/Reader>
<"bastename" "/usr/Acroread/Reader">
<"bastename" "/usr/Acroread/Reader">
<"bastename" "/usr/Acroread/Reader">
    $ /usr/mpkg/5bin/sh x
<bastename /usr/Acroread/Reader>
<bastename "/usr/Acroread/Reader">
<$(echo ba\s\tename "/usr/Acroread/Reader")>
x: syntax error at line 15: `directory=$' unexpected
    $ /usr/mpkg/5bin/sh y # same as x w/o the $(…) cases
<bastename /usr/Acroread/Reader>
<bastename "/usr/Acroread/Reader">
<ba\s ename /usr/Acroread/Reader>
<ba\s ename "/usr/Acroread/Reader">
<ba\s ename /usr/Acroread/Reader>
<"bastename" "/usr/Acroread/Reader">
#-----END-----

Differences in echo’s backslash interpretation aside,
I’m not entirely sure why the backslash+newline is
always removed in these scenarios… but I’ll accept it
because all shells I tested behave, modulo known things,
the same. (pdksh -o posix behaved like yash and like
mksh did before I removed that again.)

There’s still the note in the GNU autoconf texinfo manual’s
section about portable shell programming stating that both
have been seen in the wild, but I’ll consider this solved
for me, for now.

Thank you.
(0003096)
kre (reporter)
2016-03-22 04:38

Sorry, did not notice your note... (or forgot it, or something.)

    I’m not entirely sure why the backslash+newline is
    always removed in these scenarios…

That one is simple: except when single-quoted (or when obtained via read -r,
which is essentially the same thing), a backslash followed by a newline is
always simply removed, as if neither character existed. That is one of the
most basic and lowest-level lexical operations.
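
The rule just described can be checked with three assignments. A small sketch with illustrative variable names, not taken from the test script in this thread:

```shell
# Unquoted: the backslash-newline pair is removed, joining the two lines.
unquoted=foo\
bar

# Double-quoted: the pair is removed as well.
double="foo\
bar"

# Single-quoted: both the backslash and the newline are kept literally.
single='foo\
bar'

printf '<%s>\n' "$unquoted" "$double" "$single"
```

The first two values come out as foobar; only the single-quoted value still contains the backslash and the line break.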

For reference, both the NetBSD and FreeBSD shells (both derived from ash)
produce the same output as bash and, modulo \t expansion from echo, as
everything else except /usr/mpkg/5bin/sh (which looks to be simply broken),
from what I could see from an eyeball comparison.
(0003415)
joerg (reporter)
2016-10-14 12:00
edited on: 2016-10-14 12:03

From the teleconference on October 13, we discovered the following with the
script from Note: 0003088

- Shells that do not expand \t to TAB are not XSI compliant

- The ksh93 variant that has been used for the report was compiled in a non-standard way and thus disabled XSI support. The standard compilation for ksh93 on Solaris creates a ksh93 binary with XSI support enabled and thus expands \t.

- bash is not XSI compliant

- mksh -o posix disables XSI compliance which seems to be a bad idea.
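
Because echo's treatment of \t differs between the shells listed above, a script that cares about the result can avoid echo altogether. A sketch using printf, whose %s and %b conversions are specified by POSIX; the variable names are illustrative, and the \s from the test script is left out here because it is not an escape sequence that %b defines:

```shell
text='ba\tename'

# %s prints the argument unchanged, backslash included.
literal=$(printf '%s' "$text")

# %b interprets backslash escapes in the argument, so \t becomes a TAB,
# matching what an echo with XSI behaviour would print.
expanded=$(printf '%b' "$text")

printf '<%s>\n' "$literal" "$expanded"
```

With this approach the output no longer depends on whether the shell's echo follows the XSI behaviour.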

(0003418)
chet_ramey (reporter)
2016-10-15 19:39

Bash claims XSI conformance when used with the `posix' and `xpg_echo' options enabled.
(0003445)
geoffclare (manager)
2016-10-20 15:47
edited on: 2016-10-20 16:30

Interpretation response
------------------------
The standard is unclear on this issue, and no conformance distinction can be made between alternative implementations based on this. This is being referred to the sponsor.

Rationale:
-------------
The handling of <backslash> inside a double-quoted `...` command substitution is described in XCU 2.2.3. However, it is not clear in 2.6.3 that this section applies instead of the description in 2.6.3 itself.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
On (2016 edition) page 2357 line 75195 section 2.6.3 change:
Within the backquoted style of command substitution, <backslash> shall retain its literal meaning, except when followed by: '$', '`', or <backslash>.
to:
Within the backquoted style of command substitution, if the command substitution is not within double-quotes, <backslash> shall retain its literal meaning, except when followed by: '$', '`', or <backslash>. See [xref to 2.2.3] for the handling of <backslash> when the command substitution is within double-quotes.
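
For the case the revised 2.6.3 text still covers (backquotes that are not within double-quotes), the effect of the backslash rule can be seen by comparing it with $(...). A minimal sketch with made-up variable names, not part of the interpretation:

```shell
var=value

# Unquoted backquotes: the backslash before the dollar sign is removed
# first, so the inner command is: printf '%s' $var
# and the variable is expanded.
from_backquotes=`printf '%s' \$var`

# $(...): the text is parsed as written, so the inner command sees an
# escaped dollar sign and nothing is expanded.
from_dollar_paren=$(printf '%s' \$var)

printf '<%s>\n' "$from_backquotes" "$from_dollar_paren"
```

This should print <value> and then <$var>, showing that backslash acts as an escape before '$' in the backquoted form only.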


(0003519)
ajosey (manager)
2016-12-15 18:10

Interpretation proposed: 15 Dec 2016
(0003548)
ajosey (manager)
2017-01-18 15:23

Interpretation Approved: 18 Jan 2017

- Issue History
Date Modified Username Field Change
2015-12-28 12:36 mirabilos New Issue
2015-12-28 12:36 mirabilos Name => mirabilos
2015-12-28 12:36 mirabilos Organization => The MirOS Project
2015-12-28 12:36 mirabilos User Reference => http://article.gmane.org/gmane.comp.standards.posix.austin.general/11817
2015-12-28 12:36 mirabilos Section => 2.6.3
2015-12-28 12:36 mirabilos Page Number => http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_03
2015-12-28 12:36 mirabilos Line Number => third paragraph
2016-01-15 14:40 mirabilos Note Added: 0003024
2016-02-25 11:14 kre Note Added: 0003086
2016-02-25 11:34 kre Note Edited: 0003086
2016-03-01 18:59 mirabilos Note Added: 0003088
2016-03-22 04:38 kre Note Added: 0003096
2016-10-14 12:00 joerg Note Added: 0003415
2016-10-14 12:00 joerg Note Edited: 0003415
2016-10-14 12:03 joerg Note Edited: 0003415
2016-10-15 19:39 chet_ramey Note Added: 0003418
2016-10-20 15:47 geoffclare Note Added: 0003445
2016-10-20 15:48 geoffclare Interp Status => Pending
2016-10-20 15:48 geoffclare Final Accepted Text => Note: 0003445
2016-10-20 15:48 geoffclare Status New => Interpretation Required
2016-10-20 15:48 geoffclare Resolution Open => Accepted As Marked
2016-10-20 15:48 geoffclare Tag Attached: tc3-2008
2016-10-20 16:30 geoffclare Note Edited: 0003445
2016-12-15 18:10 ajosey Interp Status Pending => Proposed
2016-12-15 18:10 ajosey Note Added: 0003519
2017-01-18 15:23 ajosey Interp Status Proposed => Approved
2017-01-18 15:23 ajosey Note Added: 0003548
2019-10-21 13:53 geoffclare Status Interpretation Required => Applied

