Austin Group Defect Tracker


ID 0001073
Category [1003.1(2008)/Issue 7] Shell and Utilities
Severity Editorial
Type Clarification Requested
Date Submitted 2016-08-29 20:17
Last Update 2020-04-21 14:06
Reporter EdSchouten
View Status public
Assigned To ajosey
Priority normal
Resolution Accepted As Marked
Status Applied
Name Ed Schouten
Organization The FreeBSD Project
User Reference
Section dirname utility and dirname() function
Page Number -
Line Number -
Interp Status Approved
Final Accepted Text See Note: 0003928
Summary 0001073: dirname utility: algorithm for computing pathname string is stricter than the corresponding dirname() function
Description The dirname() function is described in a pretty relaxed way:

"The dirname() function shall take a pointer to a character string that contains a pathname, and return a pointer to a string that is a pathname of the parent directory of that file. Trailing '/' characters in the path are not counted as part of the path."

In FreeBSD HEAD, we're making use of this fact by implementing this function as follows:

https://svnweb.freebsd.org/base/head/lib/libc/gen/dirname.c?view=markup

It is a simple linear scan through the input string that copies all but the last pathname component to the output. Pathname components consisting of a single dot are omitted, so that the output is the shortest sequence leading up to the path. As far as I know, this implementation complies with the spec.

Now the interesting part. It looks like the description of the dirname utility has a definition that's a lot stricter than that of dirname(). It explicitly describes all of the steps that need to be performed to get the output string. This means that there are some inconsistencies between the potential output of the utility and the function:

- Input: //a//b//
- Output utility: //a
- Output function: //a or /a

- Input: //.//b//
- Output utility: //.
- Output function: //., /., // or /

In other words, the dirname utility cannot be implemented on top of the dirname() function, which does seem to be done pretty often. My question is, is this really what's intended?
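The strictly textual behavior contrasted above can be sketched in C. This is a minimal illustration, not the normative step list from the standard and not the FreeBSD code: it drops trailing slashes, drops the final component, then drops the slashes that separated it, and collapses nothing else. The name textual_dirname, the caller-supplied buffer, and the choice to keep a result of exactly "//" are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the first len bytes of src into out as a C string; -1 if it won't fit. */
static int emit(char *out, size_t outsz, const char *src, size_t len)
{
    if (len + 1 > outsz)
        return -1;
    memcpy(out, src, len);
    out[len] = '\0';
    return 0;
}

/*
 * textual_dirname() is a hypothetical name used for illustration only.
 * It never collapses interior slashes and never drops "." components.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int textual_dirname(const char *path, char *out, size_t outsz)
{
    assert(path != NULL && out != NULL);

    size_t n = strlen(path);
    size_t end = n;

    /* Drop trailing slashes. */
    while (end > 0 && path[end - 1] == '/')
        end--;
    if (end == 0) {
        if (n == 0)
            return emit(out, outsz, ".", 1);   /* empty string */
        if (n == 2)
            return emit(out, outsz, "//", 2);  /* assumption: "//" is kept */
        return emit(out, outsz, "/", 1);       /* "/", "///", ... */
    }

    /* Drop the final pathname component. */
    while (end > 0 && path[end - 1] != '/')
        end--;
    if (end == 0)
        return emit(out, outsz, ".", 1);       /* no slash in the path */

    /* Assumption: a remainder of exactly "//" is kept as-is. */
    if (end == 2 && path[0] == '/')
        return emit(out, outsz, "//", 2);

    /* Drop the slashes that separated the final component. */
    while (end > 0 && path[end - 1] == '/')
        end--;
    if (end == 0)
        return emit(out, outsz, "/", 1);
    return emit(out, outsz, path, end);
}
```

With this sketch, "//a//b//" yields "//a" and "//.//b//" yields "//.", matching the utility column above; the shorter forms listed for the function would need an extra normalization pass.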
Desired Action The description of the dirname utility could be simplified a lot to just say:

"The output generated by this utility is identical to that of the dirname() function."

Done.
Tags issue8
Attached Files

- Relationships
related to 0000612 Closed ajosey 1003.1(2008)/Issue 7 dirname of "usr/" or "/" are not clear
related to 0001064 Applied ajosey 1003.1(2008)/Issue 7 basename() and dirname(): Specification is not complete enough to allow existing thread-unsafe implementations
related to 0000830 Closed 1003.1(2013)/Issue7+TC1 not clear that dirname() is purely a string operation

- Notes
(0003369)
eblake (manager)
2016-08-29 20:45

Careful - a leading // already falls under implementation-defined behavior. Your example would be more compelling as:

Input: ///a///b///
Utility output: ///a
Function output: ///a or /a

where you are avoiding the implementation-defined escape clause rules of leading //.

That said, I think it is INTENTIONAL that the dirname utility is defined as a strictly textual operation, and one that does NOT normalize redundant / or eliminate '.' elements. If anything, that argues that the dirname() function should be made stricter, not the dirname utility made looser, if we do indeed want to require the two to behave identically.

If there is existing practice for yet another function that properly normalizes redundant slashes and directory name components (some platforms have a 'realpath' utility with particular command line flags to achieve this, for example, but I'm not sure of any particular libc function that has equivalent flexibility), then maybe that would be worth standardizing, but it seems out of scope for this bug.

Meanwhile, I personally find the basename() and dirname() function specifications to be useless: the results need not be thread-safe, and need not return the input string, making it impossible to portably use in a multi-threaded application. Standardizing a function that is safe to use (perhaps by always malloc'ing its result) would be a smarter move than trying to band-aid the already-broken dirname() function.
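One way to picture the "always malloc the result" idea is a wrapper around the existing dirname(). The sketch below is hypothetical: dirname_dup is an invented name, not a standard or proposed interface. It works on a private copy of the input because dirname() may modify its argument, and it copies the result out under a mutex because dirname() may return static storage. The locking only helps if every caller in the process goes through the wrapper, which is an assumption of the example.

```c
#include <assert.h>
#include <libgen.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Serializes every call to dirname() made through this wrapper. */
static pthread_mutex_t dirname_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * dirname_dup() is a hypothetical name, not a standard function.
 * Returns a malloc'ed copy of dirname(path), or NULL on allocation failure.
 * The caller owns the result and must free() it.
 */
char *dirname_dup(const char *path)
{
    assert(path != NULL);

    /* dirname() may modify its argument, so work on a private copy. */
    size_t len = strlen(path);
    char *scratch = malloc(len + 1);
    if (scratch == NULL)
        return NULL;
    memcpy(scratch, path, len + 1);

    /* dirname() may return static storage, so copy out under the lock. */
    pthread_mutex_lock(&dirname_lock);
    const char *dir = dirname(scratch);
    size_t dlen = strlen(dir);
    char *result = malloc(dlen + 1);
    if (result != NULL)
        memcpy(result, dir, dlen + 1);
    pthread_mutex_unlock(&dirname_lock);

    /* dir may point into scratch, so free it only after the copy. */
    free(scratch);
    return result;
}
```

The input string is never touched and the returned buffer is independent of any later call, at the cost of two allocations and a global lock per call.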
(0003370)
geoffclare (manager)
2016-08-30 08:46

It is certainly the intention that the dirname() function should behave the same as is required for the dirname utility. This is evident from the TC2 update to the basename() examples section (see Note: 0001394) which added dirname() and the two utilities to the table; it shows the dirname() output for "/home//dwc//test" as "/home//dwc".
(0003926)
nick (manager)
2018-02-22 19:02

Note that the proposed FreeBSD implementation of the dirname() function in revision 304860, which removes "." components, will lead to very different results from those required by the standard:
dirname("/a/b/c/.");

produces "/a/b/c" for most implementations, but just "/a/b" for the quoted FreeBSD version (which has now been replaced by a more conforming one).

Do we need to tighten the wording for the dirname() function itself to be clearer about trailing "." components?
(0003927)
joerg (reporter)
2018-02-23 10:23
edited on: 2018-02-23 10:23

Re: Note: 0003926 I just verified that the FreeBSD implementation mentioned in the Description behaves correctly and returns "/a/b/c" for dirname("/a/b/c/.").

It seems that the current version from Sun Sep 18 20:47:55 2016 UTC has been corrected already.

(0003928)
nick (manager)
2018-03-01 17:19
edited on: 2018-03-01 17:30

Interpretation response
------------------------
The standard states the steps required for the dirname utility, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.

Rationale:
-------------
There is an inconsistency between the specifications of the dirname function and utility. Additionally, it seems reasonable that the pathname could be rationalized, removing additional <slash> characters and "." components, and some implementations have experimented with this.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
On page 76 lines 2199-2200 (XBD 3.271 Pathname), change:

except for the case of exactly two leading <slash> characters.


to:

except it is implementation-defined whether exactly two leading <slash> characters is treated specially.


On page 625 lines 21586-21596 (XSH basename()), change:


<table>
<tr>
  <td><tt>"usr"</tt></td>
  <td><tt>"usr"</tt></td>
  <td><tt>"."</tt></td>
  <td><tt>usr</tt></td>
  <td><tt>usr</tt></td>
  <td><tt>.</tt></td>
</tr>
<tr>
  <td><tt>"usr/"</tt></td>
  <td><tt>"usr"</tt></td>
  <td><tt>"."</tt></td>
  <td><tt>usr/</tt></td>
  <td><tt>usr</tt></td>
  <td><tt>.</tt></td>
</tr>
<tr>
  <td><tt>""</tt></td>
  <td><tt>"."</tt></td>
  <td><tt>"."</tt></td>
  <td><tt>""</tt></td>
  <td><tt>.</tt> or empty string</td>
  <td><tt>.</tt></td>
</tr>
<tr>
  <td><tt>"/"</tt></td>
  <td><tt>"/"</tt></td>
  <td><tt>"/"</tt></td>
  <td><tt>/</tt></td>
  <td><tt>/</tt></td>
  <td><tt>/</tt></td>
</tr>
<tr>
  <td><tt>"//"</tt></td>
  <td><tt>"/"</tt> or <tt>"//"</tt></td>
  <td><tt>"/"</tt> or <tt>"//"</tt></td>
  <td><tt>//</tt></td>
  <td><tt>/</tt> or <tt>//</tt></td>
  <td><tt>/</tt> or <tt>//</tt></td>
</tr>
<tr>
  <td><tt>"///"</tt></td>
  <td><tt>"/"</tt></td>
  <td><tt>"/"</tt></td>
  <td><tt>///</tt></td>
  <td><tt>/</tt></td>
  <td><tt>/</tt></td>
</tr>
<tr>
  <td><tt>"/usr/"</tt></td>
  <td><tt>"usr"</tt></td>
  <td><tt>"/"</tt></td>
  <td><tt>/usr/</tt></td>
  <td><tt>usr</tt></td>
  <td><tt>/</tt></td>
</tr>
<tr>
  <td><tt>"/usr/lib"</tt></td>
  <td><tt>"lib"</tt></td>
  <td><tt>"/usr"</tt></td>
  <td><tt>/usr/lib</tt></td>
  <td><tt>lib</tt></td>
  <td><tt>/usr</tt></td>
</tr>
<tr>
  <td><tt>"//usr//lib//"</tt></td>
  <td><tt>"lib"</tt></td>
  <td><tt>"//usr"</tt></td>
  <td><tt>//usr//lib//</tt></td>
  <td><tt>lib</tt></td>
  <td><tt>//usr</tt></td>
</tr>
<tr>
  <td><tt>"/home//dwc//test"</tt></td>
  <td><tt>"test"</tt></td>
  <td><tt>"/home//dwc"</tt></td>
  <td><tt>/home//dwc//test</tt></td>
  <td><tt>test</tt></td>
  <td><tt>/home//dwc</tt></td>
</tr>
</table>


to:


<table>
<tr>
  <td><tt>"usr"</tt></td>
  <td><tt>"usr"</tt></td>
  <td><tt>"."</tt></td>
  <td><tt>usr</tt></td>
  <td><tt>usr</tt></td>
  <td><tt>.</tt></td>
</tr>
<tr>
  <td><tt>"usr/"</tt></td>
  <td><tt>"usr"</tt></td>
  <td><tt>"."</tt></td>
  <td><tt>usr/</tt></td>
  <td><tt>usr</tt></td>
  <td><tt>.</tt></td>
</tr>
<tr>
  <td><tt>""</tt></td>
  <td><tt>"."</tt></td>
  <td><tt>"."</tt></td>
  <td><tt>""</tt></td>
  <td><tt>.</tt> or empty string</td>
  <td><tt>.</tt></td>
</tr>
<tr>
  <td><tt>"/"</tt></td>
  <td><tt>"/"</tt></td>
  <td><tt>"/"</tt></td>
  <td><tt>/</tt></td>
  <td><tt>/</tt></td>
  <td><tt>/</tt></td>
</tr>
<tr>
  <td><tt>"//"</tt></td>
  <td><tt>"/"</tt> or <tt>"//"</tt> (see note 1)</td>
  <td><tt>"/"</tt> or <tt>"//"</tt> (see note 1)</td>
  <td><tt>//</tt></td>
  <td><tt>/</tt> or <tt>//</tt> (see note 1)</td>
  <td><tt>/</tt> or <tt>//</tt> (see note 1)</td>
</tr>
<tr>
  <td><tt>"///"</tt></td>
  <td><tt>"/"</tt></td>
  <td><tt>"/"</tt> or <tt>"///"</tt></td>
  <td><tt>///</tt></td>
  <td><tt>/</tt></td>
  <td><tt>/</tt> or <tt>///</tt></td>
</tr>
<tr>
  <td><tt>"/usr/"</tt></td>
  <td><tt>"usr"</tt></td>
  <td><tt>"/"</tt></td>
  <td><tt>/usr/</tt></td>
  <td><tt>usr</tt></td>
  <td><tt>/</tt></td>
</tr>
<tr>
  <td><tt>"/usr/lib"</tt></td>
  <td><tt>"lib"</tt></td>
  <td><tt>"/usr"</tt></td>
  <td><tt>/usr/lib</tt></td>
  <td><tt>lib</tt></td>
  <td><tt>/usr</tt></td>
</tr>
<tr>
  <td><tt>"//usr//lib//"</tt></td>
  <td><tt>"lib"</tt></td>
  <td><tt>"//usr"</tt> or <tt>"/usr"</tt> (see note 1)</td>
  <td><tt>//usr//lib//</tt></td>
  <td><tt>lib</tt></td>
  <td><tt>//usr</tt> or <tt>/usr</tt> (see note 1)</td>
</tr>
<tr>
  <td><tt>"/home//dwc//test"</tt></td>
  <td><tt>"test"</tt></td>
  <td><tt>"/home//dwc"</tt> or <tt>"/home/dwc"</tt></td>
  <td><tt>/home//dwc//test</tt></td>
  <td><tt>test</tt></td>
  <td><tt>/home//dwc</tt> or <tt>/home/dwc</tt></td>
</tr>
<tr>
  <td><tt>"/home/.././test"</tt></td>
  <td><tt>"test"</tt></td>
  <td><tt>"/home/../."</tt> or <tt>"/home/.."</tt></td>
  <td><tt>/home/.././test</tt></td>
  <td><tt>test</tt></td>
  <td><tt>/home/../.</tt> or <tt>/home/..</tt></td>
</tr>
<tr>
  <td><tt>"/home/dwc/."</tt></td>
  <td><tt>"."</tt></td>
  <td><tt>"/home/dwc"</tt></td>
  <td><tt>/home/dwc/.</tt></td>
  <td><tt>.</tt></td>
  <td><tt>/home/dwc</tt></td>
</tr>
</table>
note 1: Whether leading // can be converted to / depends on the implementation-defined behavior of // (see [xref to XBD 4.13 Pathname Resolution]; although the basename() and dirname() functions, and basename and dirname utilities, do not themselves perform pathname resolution, their results can be passed to a function or utility which does).


(Rows 5, 6, 9, and 10 are changed, and two new rows plus a note are added at the end.)

On page 736 line 25067 section dirname() change:
    
return a pointer to a string that is a pathname of the parent directory of that file.


to:

return a pointer to a string that is a pathname of the directory containing the entry of the final pathname component.


On page 736 line 25072 section dirname() add a new paragraph:

It is unspecified whether redundant '/' characters and '.' pathname components in path are removed after determining the pathname to output. However, ".." pathname components occurring prior to the final component shall not be removed.


On page 737 line 25113, change the RATIONALE section from:
None.

to:
An implementation should prefer the shortest output possible; however, this is not required, in part because earlier versions of the standard did not mention whether elision of redundant <slash> characters or dot (".") components was permitted. Removal of the dot-dot ("..") pathname component is not permitted, because eliding it correctly would require performing pathname resolution to ensure the resulting string would still point to the correct pathname if the original string resolved as a pathname. On implementations where pathname "//" has an implementation-defined meaning distinct from the pathname "/", the dirname of "//" will be "//".


On page 2667, change the entire DESCRIPTION section (lines 86879-86894) from:

The string operand shall be treated as a pathname, as defined in [xref to XBD Section 3.271]. The string string shall be converted to the name of the directory containing the filename corresponding to the last pathname component in string, performing actions equivalent to the following steps in order:

[numbered list ...]

The resulting string shall be written to standard output.


to:

The string operand shall be treated as a pathname, as defined in [xref to XBD Section 3.271 Pathname], and shall be converted to a pathname of the directory containing the entry of the final pathname component. The resulting string shall be written to standard output. The dirname utility shall not perform pathname resolution; the result shall not be affected by whether or not a file with the pathname string exists or by its file type. Trailing '/' characters in string that are not also leading '/' characters shall not be counted as part of the pathname. If the pathname does not contain a '/', the resulting string shall be ".". If string is an empty string, the resulting string shall be ".".

It is unspecified whether redundant '/' characters and '.' pathname components in string are removed after determining the pathname to output. However, ".." pathname components occurring prior to the final component shall not be removed.


In RATIONALE, page 2668 after line 86955, insert a new paragraph:
The dirname utility is not specified in terms of the dirname() function, because the two may produce slightly different output where both output forms are still compliant. An implementation should prefer the shortest output possible; however, this is not required, in part because earlier versions of the standard did not permit elision of redundant <slash> characters or dot (".") components. Removal of the dot-dot ("..") pathname component is not permitted, because eliding it correctly would require performing pathname resolution to ensure the resulting string would still point to the correct pathname if the original string resolved as a pathname. On implementations where pathname "//" has an implementation-defined meaning distinct from the pathname "/", the dirname of "//" will be "//".
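The optional shortening described above can be sketched as a separate pass applied to an already computed dirname result. The function below is an illustration under stated assumptions, not text from the standard: squeeze_path is a hypothetical name, the caller supplies the buffer, and exactly two leading slashes are preserved on the assumption that the implementation treats "//" specially. It collapses runs of slashes and drops "." components but never touches "..".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * squeeze_path() is a hypothetical helper for illustration only.
 * Writes a shortened form of `in` to `out`; returns 0 on success,
 * -1 if the buffer might be too small (conservative check).
 */
int squeeze_path(const char *in, char *out, size_t outsz)
{
    assert(in != NULL && out != NULL);

    /* The output is never longer than the input, except "" -> ".". */
    if (outsz < strlen(in) + 2)
        return -1;

    size_t o = 0;
    size_t lead = strspn(in, "/");      /* number of leading slashes */

    if (lead == 2) {
        out[o++] = '/';                 /* assumption: "//" is special, keep it */
        out[o++] = '/';
    } else if (lead > 0) {
        out[o++] = '/';                 /* "/", "///", ... become "/" */
    }

    const char *p = in + lead;
    int need_sep = 0;                   /* set once a component was written */

    while (*p != '\0') {
        size_t clen = strcspn(p, "/");  /* length of this component */
        if (clen != 1 || p[0] != '.') { /* keep everything except "." */
            if (need_sep)
                out[o++] = '/';
            memcpy(out + o, p, clen);
            o += clen;
            need_sep = 1;
        }
        p += clen;
        p += strspn(p, "/");            /* skip the run of slashes */
    }

    if (o == 0)
        out[o++] = '.';                 /* nothing left: current directory */
    out[o] = '\0';
    return 0;
}
```

The order of operations matters. Applied to the dirname result "/home/../." this gives "/home/..", one of the two permitted outputs in the table. Applied to the input "/a/b/c/." before taking the dirname, it would give "/a/b/c", whose dirname is "/a/b", which is the non-conforming result discussed in Note: 0003926.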


(0004139)
ajosey (manager)
2018-09-30 18:29

Interpretation Proposed: 30 Sept 2018
(0004161)
ajosey (manager)
2018-11-12 15:07

Interpretation approved: 12 November 2018

- Issue History
Date Modified Username Field Change
2016-08-29 20:17 EdSchouten New Issue
2016-08-29 20:17 EdSchouten Status New => Under Review
2016-08-29 20:17 EdSchouten Assigned To => ajosey
2016-08-29 20:17 EdSchouten Name => Ed Schouten
2016-08-29 20:17 EdSchouten Organization => The FreeBSD Project
2016-08-29 20:17 EdSchouten Section => dirname utility and dirname() function
2016-08-29 20:17 EdSchouten Page Number => -
2016-08-29 20:17 EdSchouten Line Number => -
2016-08-29 20:41 emaste Issue Monitored: emaste
2016-08-29 20:45 eblake Note Added: 0003369
2016-08-30 08:46 geoffclare Note Added: 0003370
2016-08-30 08:47 geoffclare Relationship added related to 0000612
2016-08-30 13:33 eblake Relationship added related to 0000830
2018-02-15 16:17 eblake Relationship added related to 0001064
2018-02-22 19:02 nick Note Added: 0003926
2018-02-23 10:23 joerg Note Added: 0003927
2018-02-23 10:23 joerg Note Edited: 0003927
2018-03-01 17:19 nick Note Added: 0003928
2018-03-01 17:21 nick Interp Status => ---
2018-03-01 17:21 nick Final Accepted Text => See Note: 0003928
2018-03-01 17:21 nick Status Under Review => Resolved
2018-03-01 17:21 nick Resolution Open => Accepted As Marked
2018-03-01 17:21 nick Tag Attached: issue8
2018-03-01 17:28 nick Note Edited: 0003928
2018-03-01 17:28 nick Note Edited: 0003928
2018-03-01 17:29 nick Status Resolved => Interpretation Required
2018-03-01 17:29 nick Interp Status --- => Pending
2018-03-01 17:30 eblake Note Edited: 0003928
2018-09-30 18:29 ajosey Interp Status Pending => Proposed
2018-09-30 18:29 ajosey Note Added: 0004139
2018-11-12 15:07 ajosey Interp Status Proposed => Approved
2018-11-12 15:07 ajosey Note Added: 0004161
2020-04-21 14:06 geoffclare Status Interpretation Required => Applied

