View Issue Details

ID: 0001073
Project: 1003.1(2008)/Issue 7
Category: Shell and Utilities
View Status: public
Last Update: 2024-06-11 08:52
Reporter: EdSchouten
Assigned To: ajosey
Priority: normal
Severity: Editorial
Type: Clarification Requested
Status: Closed
Resolution: Accepted As Marked
Name: Ed Schouten
Organization: The FreeBSD Project
User Reference:
Section: dirname utility and dirname() function
Page Number: -
Line Number: -
Interp Status: Approved
Final Accepted Text: See 0001073:0003928
Summary: 0001073: dirname utility: algorithm for computing pathname string is stricter than the corresponding dirname() function
Description: The dirname() function is described in a pretty relaxed way:

"The dirname() function shall take a pointer to a character string that contains a pathname, and return a pointer to a string that is a pathname of the parent directory of that file. Trailing '/' characters in the path are not counted as part of the path."

In FreeBSD HEAD, we're making use of this fact by implementing this function as follows:

https://svnweb.freebsd.org/base/head/lib/libc/gen/dirname.c?view=markup

It is a simple linear scan through the input string, copying all but the last pathname component over to the output. Pathname components consisting of a single dot are omitted, so that the output corresponds to the shortest sequence leading up to the path. As far as I know, this implementation complies with the spec.

Now the interesting part. It looks like the description of the dirname utility has a definition that's a lot stricter than that of dirname(). It explicitly describes all of the steps that need to be performed to get the output string. This means that there are some inconsistencies between the potential output of the utility and the function:

- Input: //a//b//
- Output utility: //a
- Output function: //a or /a

- Input: //.//b//
- Output utility: //.
- Output function: //., /., // or /

In other words, the dirname utility cannot be implemented on top of the dirname() function, which does seem to be done pretty often. My question is, is this really what's intended?
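
For illustration, the common pattern of layering the utility on the function might look roughly like the sketch below. The helper name dirname_via_libc and its signature are hypothetical (they come from neither the FreeBSD sources nor the standard); only dirname() from <libgen.h> is real. Whatever string the C library's dirname() returns becomes the utility's output, so any normalization done by the function leaks into the utility.

```c
#include <libgen.h>   /* dirname() */
#include <stdlib.h>   /* malloc(), free() */
#include <string.h>   /* strlen(), memcpy() */

/*
 * Hypothetical helper: computes the directory part of `operand` with the
 * C library's dirname() and copies it into `out`.
 * Returns 0 on success, -1 on allocation failure or if the result does
 * not fit in `outsz` bytes.
 */
int dirname_via_libc(const char *operand, char *out, size_t outsz)
{
    size_t len = strlen(operand);
    char *scratch = malloc(len + 1);
    if (scratch == NULL)
        return -1;
    memcpy(scratch, operand, len + 1);

    /* dirname() may modify its argument and may return a pointer into it
     * or into static storage, so copy the result out before freeing. */
    const char *dir = dirname(scratch);
    size_t dlen = strlen(dir);
    int rc = -1;
    if (dlen < outsz) {
        memcpy(out, dir, dlen + 1);
        rc = 0;
    }
    free(scratch);
    return rc;
}
```

A utility built this way would simply print `out` followed by a newline; the exact string for inputs with redundant slashes depends on the C library in use.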
Desired Action: The description of the dirname utility could be simplified a lot to just say:

"The output generated by this utility is identical to that of the dirname() function."

Done.
Tags: issue8

Relationships

related to 0000612 Closed ajosey 1003.1(2008)/Issue 7 dirname of "usr/" or "/" are not clear 
related to 0001064 Closed ajosey 1003.1(2008)/Issue 7 basename() and dirname(): Specification is not complete enough to allow existing thread-unsafe implementations 
related to 0000830 Closed 1003.1(2013)/Issue7+TC1 not clear that dirname() is purely a string operation 

Activities

eblake

2016-08-29 20:45

manager   bugnote:0003369

Careful - the use of leading //a is already in implementation-defined behavior. Your example would be more compelling as:

Input: ///a///b///
Utility output: ///a
Function output: ///a or /a

where you are avoiding the implementation-defined escape clause rules of leading //.

That said, I think it is INTENTIONAL that the dirname utility is defined as a strictly textual operation, and one that does NOT normalize redundant / or eliminate '.' elements. If anything, that argues that the dirname() function should be made stricter, not the dirname utility made looser, if we do indeed want to require the two to behave identically.

If there is existing practice for yet another function that properly normalizes redundant slashes and directory name components (some platforms have a 'realpath' utility with particular command line flags to achieve this, for example, but I'm not sure of any particular libc function that has equivalent flexibility), then maybe that would be worth standardizing, but it seems out of scope for this bug.

Meanwhile, I personally find the basename() and dirname() function specifications to be useless: the results need not be thread-safe, and need not return the input string, making it impossible to portably use in a multi-threaded application. Standardizing a function that is safe to use (perhaps by always malloc'ing its result) would be a smarter move than trying to band-aid the already-broken dirname() function.
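
As a rough sketch of what such a safer interface could look like, the function below does the strictly textual computation and always returns freshly allocated storage. The name dirname_alloc is hypothetical and is not part of any standard or known libc. The sketch also assumes that a leading "//" has no special meaning, so an input whose directory part consists only of slashes yields "/".

```c
#include <stdlib.h>   /* malloc() */
#include <string.h>   /* strlen(), memcpy() */

/* Return a malloc'ed, NUL-terminated copy of the first n bytes of s. */
static char *copy_prefix(const char *s, size_t n)
{
    char *r = malloc(n + 1);
    if (r != NULL) {
        memcpy(r, s, n);
        r[n] = '\0';
    }
    return r;
}

/*
 * Hypothetical thread-safe variant of dirname(): never modifies `path`,
 * never uses static storage, and performs no normalization beyond
 * stripping the final component. The caller frees the result.
 * Returns NULL only if allocation fails.
 */
char *dirname_alloc(const char *path)
{
    size_t len = (path != NULL) ? strlen(path) : 0;

    if (len == 0)                       /* NULL or empty string */
        return copy_prefix(".", 1);

    /* 1. Ignore trailing slashes. */
    while (len > 0 && path[len - 1] == '/')
        len--;
    if (len == 0)                       /* path was all slashes */
        return copy_prefix("/", 1);

    /* 2. Drop the final pathname component. */
    while (len > 0 && path[len - 1] != '/')
        len--;
    if (len == 0)                       /* no slash before the component */
        return copy_prefix(".", 1);

    /* 3. Drop the slashes that separated it from its parent. */
    while (len > 0 && path[len - 1] == '/')
        len--;
    if (len == 0)                       /* parent is the root directory */
        return copy_prefix("/", 1);

    return copy_prefix(path, len);
}
```

Because the input is const and every result is separately allocated, two threads can call this concurrently without interfering with each other, at the cost of one allocation per call.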

geoffclare

2016-08-30 08:46

manager   bugnote:0003370

It is certainly the intention that the dirname() function should behave the same as is required for the dirname utility. This is evident from the TC2 update to the basename() examples section (see 0000612:0001394) which added dirname() and the two utilities to the table; it shows the dirname() output for "/home//dwc//test" as "/home//dwc".

nick

2018-02-22 19:02

manager   bugnote:0003926

Note that the proposed FreeBSD implementation of the dirname() function in revision 304860, which removes "." components, will lead to very different results from those required by the standard:
dirname("/a/b/c/.");

produces "/a/b/c" for most implementations, but just "/a/b" for the quoted FreeBSD version (which has now been replaced by a more conforming one).

Do we need to tighten the wording for the dirname() function itself to be clearer about trailing "." components?

joerg

2018-02-23 10:23

reporter   bugnote:0003927

Last edited: 2018-02-23 10:23

Re: 0001073:0003926 I just verified that the FreeBSD implementation mentioned in the Description behaves correctly and returns "/a/b/c" for dirname("/a/b/c/.").

It seems that the current version from Sun Sep 18 20:47:55 2016 UTC has been corrected already.

nick

2018-03-01 17:19

manager   bugnote:0003928

Last edited: 2018-03-01 17:30

Interpretation response
------------------------
The standard states the steps required for the dirname utility, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.

Rationale:
-------------
There is an inconsistency between the specifications of the dirname function and utility. Additionally, it seems reasonable that the pathname could be rationalized, removing additional <slash> characters and "." components, and some implementations have experimented with this.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
On page 76 lines 2199-2200 (XBD 3.271 Pathname), change:

except for the case of exactly two leading <slash> characters.


to:

except it is implementation-defined whether exactly two leading <slash> characters is treated specially.


On page 625 lines 21586-21596 (XSH basename()), change:


<table>
<tr>
  <td><tt>"usr"</tt></td>
  <td><tt>"usr"</tt></td>
  <td><tt>"."</tt></td>
  <td><tt>usr</tt></td>
  <td><tt>usr</tt></td>
  <td><tt>.</tt></td>
</tr>
<tr>
  <td><tt>"usr/"</tt></td>
  <td><tt>"usr"</tt></td>
  <td><tt>"."</tt></td>
  <td><tt>usr/</tt></td>
  <td><tt>usr</tt></td>
  <td><tt>.</tt></td>
</tr>
<tr>
  <td><tt>""</tt></td>
  <td><tt>"."</tt></td>
  <td><tt>"."</tt></td>
  <td><tt>""</tt></td>
  <td><tt>.</tt> or empty string</td>
  <td><tt>.</tt></td>
</tr>
<tr>
  <td><tt>"/"</tt></td>
  <td><tt>"/"</tt></td>
  <td><tt>"/"</tt></td>
  <td><tt>/</tt></td>
  <td><tt>/</tt></td>
  <td><tt>/</tt></td>
</tr>
<tr>
  <td><tt>"//"</tt></td>
  <td><tt>"/"</tt> or <tt>"//"</tt></td>
  <td><tt>"/"</tt> or <tt>"//"</tt></td>
  <td><tt>//</tt></td>
  <td><tt>/</tt> or <tt>//</tt></td>
  <td><tt>/</tt> or <tt>//</tt></td>
</tr>
<tr>
  <td><tt>"///"</tt></td>
  <td><tt>"/"</tt></td>
  <td><tt>"/"</tt></td>
  <td><tt>///</tt></td>
  <td><tt>/</tt></td>
  <td><tt>/</tt></td>
</tr>
<tr>
  <td><tt>"/usr/"</tt></td>
  <td><tt>"usr"</tt></td>
  <td><tt>"/"</tt></td>
  <td><tt>/usr/</tt></td>
  <td><tt>usr</tt></td>
  <td><tt>/</tt></td>
</tr>
<tr>
  <td><tt>"/usr/lib"</tt></td>
  <td><tt>"lib"</tt></td>
  <td><tt>"/usr"</tt></td>
  <td><tt>/usr/lib</tt></td>
  <td><tt>lib</tt></td>
  <td><tt>/usr</tt></td>
</tr>
<tr>
  <td><tt>"//usr//lib//"</tt></td>
  <td><tt>"lib"</tt></td>
  <td><tt>"//usr"</tt></td>
  <td><tt>//usr//lib//</tt></td>
  <td><tt>lib</tt></td>
  <td><tt>//usr</tt></td>
</tr>
<tr>
  <td><tt>"/home//dwc//test"</tt></td>
  <td><tt>"test"</tt></td>
  <td><tt>"/home//dwc"</tt></td>
  <td><tt>/home//dwc//test</tt></td>
  <td><tt>test</tt></td>
  <td><tt>/home//dwc</tt></td>
</tr>
</table>


to:


<table>
<tr>
  <td><tt>"usr"</tt></td>
  <td><tt>"usr"</tt></td>
  <td><tt>"."</tt></td>
  <td><tt>usr</tt></td>
  <td><tt>usr</tt></td>
  <td><tt>.</tt></td>
</tr>
<tr>
  <td><tt>"usr/"</tt></td>
  <td><tt>"usr"</tt></td>
  <td><tt>"."</tt></td>
  <td><tt>usr/</tt></td>
  <td><tt>usr</tt></td>
  <td><tt>.</tt></td>
</tr>
<tr>
  <td><tt>""</tt></td>
  <td><tt>"."</tt></td>
  <td><tt>"."</tt></td>
  <td><tt>""</tt></td>
  <td><tt>.</tt> or empty string</td>
  <td><tt>.</tt></td>
</tr>
<tr>
  <td><tt>"/"</tt></td>
  <td><tt>"/"</tt></td>
  <td><tt>"/"</tt></td>
  <td><tt>/</tt></td>
  <td><tt>/</tt></td>
  <td><tt>/</tt></td>
</tr>
<tr>
  <td><tt>"//"</tt></td>
  <td><tt>"/"</tt> or <tt>"//"</tt> (see note 1)</td>
  <td><tt>"/"</tt> or <tt>"//"</tt> (see note 1)</td>
  <td><tt>//</tt></td>
  <td><tt>/</tt> or <tt>//</tt> (see note 1)</td>
  <td><tt>/</tt> or <tt>//</tt> (see note 1)</td>
</tr>
<tr>
  <td><tt>"///"</tt></td>
  <td><tt>"/"</tt></td>
  <td><tt>"/"</tt> or <tt>"///"</tt></td>
  <td><tt>///</tt></td>
  <td><tt>/</tt></td>
  <td><tt>/</tt> or <tt>///</tt></td>
</tr>
<tr>
  <td><tt>"/usr/"</tt></td>
  <td><tt>"usr"</tt></td>
  <td><tt>"/"</tt></td>
  <td><tt>/usr/</tt></td>
  <td><tt>usr</tt></td>
  <td><tt>/</tt></td>
</tr>
<tr>
  <td><tt>"/usr/lib"</tt></td>
  <td><tt>"lib"</tt></td>
  <td><tt>"/usr"</tt></td>
  <td><tt>/usr/lib</tt></td>
  <td><tt>lib</tt></td>
  <td><tt>/usr</tt></td>
</tr>
<tr>
  <td><tt>"//usr//lib//"</tt></td>
  <td><tt>"lib"</tt></td>
  <td><tt>"//usr"</tt> or <tt>"/usr"</tt> (see note 1)</td>
  <td><tt>//usr//lib//</tt></td>
  <td><tt>lib</tt></td>
  <td><tt>//usr</tt> or <tt>/usr</tt> (see note 1)</td>
</tr>
<tr>
  <td><tt>"/home//dwc//test"</tt></td>
  <td><tt>"test"</tt></td>
  <td><tt>"/home//dwc"</tt> or <tt>"/home/dwc"</tt></td>
  <td><tt>/home//dwc//test</tt></td>
  <td><tt>test</tt></td>
  <td><tt>/home//dwc</tt> or <tt>/home/dwc</tt></td>
</tr>
<tr>
  <td><tt>"/home/.././test"</tt></td>
  <td><tt>"test"</tt></td>
  <td><tt>"/home/../."</tt> or <tt>"/home/.."</tt></td>
  <td><tt>/home/.././test</tt></td>
  <td><tt>test</tt></td>
  <td><tt>/home/../.</tt> or <tt>/home/..</tt></td>
</tr>
<tr>
  <td><tt>"/home/dwc/."</tt></td>
  <td><tt>"."</tt></td>
  <td><tt>"/home/dwc"</tt></td>
  <td><tt>/home/dwc/.</tt></td>
  <td><tt>.</tt></td>
  <td><tt>/home/dwc</tt></td>
</tr>
</table>
note 1: Whether leading // can be converted to / depends on the implementation-defined behavior of // (see [xref to XBD 4.13 Pathname Resolution]; although the basename() and dirname() functions, and basename and dirname utilities, do not themselves perform pathname resolution, their results can be passed to a function or utility which does).


(Rows 5, 6, 9, and 10 are changed, and two new rows plus a note are added at the end.)

On page 736 line 25067 section dirname() change:
    
return a pointer to a string that is a pathname of the parent directory of that file.


to:

return a pointer to a string that is a pathname of the directory containing the entry of the final pathname component.


On page 736 line 25072 section dirname() add a new paragraph:

It is unspecified whether redundant '/' characters and '.' pathname components in path are removed after determining the pathname to output. However, ".." pathname components occurring prior to the final component shall not be removed.


On page 737 line 25113, change the RATIONALE section from:
None.

to:
An implementation should prefer the shortest output possible; however, this is not required, in part because earlier versions of the standard did not mention whether elision of redundant <slash> characters or dot (".") components was permitted. Removal of the dot-dot ("..") pathname component is not permitted, because eliding it correctly would require performing pathname resolution to ensure the resulting string would still point to the correct pathname if the original string resolved as a pathname. On implementations where pathname "//" has an implementation-defined meaning distinct from the pathname "/", the dirname of "//" will be "//".


On page 2667, change the entire DESCRIPTION section (lines 86879-86894) from:

The string operand shall be treated as a pathname, as defined in [xref to XBD Section 3.271]. The string string shall be converted to the name of the directory containing the filename corresponding to the last pathname component in string, performing actions equivalent to the following steps in order:

[numbered list ...]

The resulting string shall be written to standard output.


to:

The string operand shall be treated as a pathname, as defined in [xref to XBD Section 3.271 Pathname], and shall be converted to a pathname of the directory containing the entry of the final pathname component. The resulting string shall be written to standard output. The dirname utility shall not perform pathname resolution; the result shall not be affected by whether or not a file with the pathname string exists or by its file type. Trailing '/' characters in string that are not also leading '/' characters shall not be counted as part of the pathname. If the pathname does not contain a '/', the resulting string shall be ".". If string is an empty string, the resulting string shall be ".".

It is unspecified whether redundant '/' characters and '.' pathname components in string are removed after determining the pathname to output. However, ".." pathname components occurring prior to the final component shall not be removed.


In RATIONALE, page 2668 after line 86955, insert a new paragraph:
The dirname utility is not specified in terms of the dirname() function, because the two may produce slightly different output where both output forms are still compliant. An implementation should prefer the shortest output possible; however, this is not required, in part because earlier versions of the standard did not permit elision of redundant <slash> characters or dot (".") components. Removal of the dot-dot ("..") pathname component is not permitted, because eliding it correctly would require performing pathname resolution to ensure the resulting string would still point to the correct pathname if the original string resolved as a pathname. On implementations where pathname "//" has an implementation-defined meaning distinct from the pathname "/", the dirname of "//" will be "//".
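
The optional shortening described above could be sketched as a post-processing pass over an already computed dirname result. The function name squeeze_path is hypothetical and not part of the standard text. The sketch assumes an implementation where a leading "//" is not treated specially (so it collapses to "/"), and it deliberately leaves ".." components in place.

```c
#include <stddef.h>   /* size_t */
#include <string.h>   /* memmove() */

/*
 * Hypothetical helper: rewrites the pathname in `p` in place, removing
 * redundant '/' characters and "." components while keeping every ".."
 * component. The result is never longer than the input.
 * Assumption: leading "//" has no special meaning on this implementation.
 */
void squeeze_path(char *p)
{
    const char *src = p;
    char *dst = p;

    if (*p == '\0')
        return;                     /* nothing to do for an empty string */

    if (*src == '/')
        *dst++ = '/';               /* keep exactly one leading slash */

    while (*src != '\0') {
        while (*src == '/')         /* skip a run of slashes */
            src++;
        if (*src == '\0')
            break;

        const char *end = src;      /* find the end of this component */
        while (*end != '\0' && *end != '/')
            end++;
        size_t n = (size_t)(end - src);

        if (n == 1 && src[0] == '.') {
            src = end;              /* drop a "." component */
            continue;
        }

        if (dst != p && dst[-1] != '/')
            *dst++ = '/';           /* single separator between components */
        memmove(dst, src, n);       /* regions may overlap */
        dst += n;
        src = end;
    }

    if (dst == p)
        *dst++ = '.';               /* only "." components: result is "." */
    *dst = '\0';
}
```

Applied to the table rows above, this turns "/home//dwc" into "/home/dwc" and "/home/../." into "/home/..", matching the alternative outputs the new rows allow.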


ajosey

2018-09-30 18:29

manager   bugnote:0004139

Interpretation Proposed: 30 Sept 2018

ajosey

2018-11-12 15:07

manager   bugnote:0004161

Interpretation approved: 12 November 2018

Issue History

Date Modified Username Field Change
2016-08-29 20:17 EdSchouten New Issue
2016-08-29 20:17 EdSchouten Status New => Under Review
2016-08-29 20:17 EdSchouten Assigned To => ajosey
2016-08-29 20:17 EdSchouten Name => Ed Schouten
2016-08-29 20:17 EdSchouten Organization => The FreeBSD Project
2016-08-29 20:17 EdSchouten Section => dirname utility and dirname() function
2016-08-29 20:17 EdSchouten Page Number => -
2016-08-29 20:17 EdSchouten Line Number => -
2016-08-29 20:45 eblake Note Added: 0003369
2016-08-30 08:46 geoffclare Note Added: 0003370
2016-08-30 08:47 geoffclare Relationship added related to 0000612
2016-08-30 13:33 eblake Relationship added related to 0000830
2018-02-15 16:17 eblake Relationship added related to 0001064
2018-02-22 19:02 nick Note Added: 0003926
2018-02-23 10:23 joerg Note Added: 0003927
2018-02-23 10:23 joerg Note Edited: 0003927
2018-03-01 17:19 nick Note Added: 0003928
2018-03-01 17:21 nick Interp Status => ---
2018-03-01 17:21 nick Final Accepted Text => See 0001073:0003928
2018-03-01 17:21 nick Status Under Review => Resolved
2018-03-01 17:21 nick Resolution Open => Accepted As Marked
2018-03-01 17:21 nick Tag Attached: issue8
2018-03-01 17:28 nick Note Edited: 0003928
2018-03-01 17:28 nick Note Edited: 0003928
2018-03-01 17:29 nick Status Resolved => Interpretation Required
2018-03-01 17:29 nick Interp Status --- => Pending
2018-03-01 17:30 eblake Note Edited: 0003928
2018-09-30 18:29 ajosey Interp Status Pending => Proposed
2018-09-30 18:29 ajosey Note Added: 0004139
2018-11-12 15:07 ajosey Interp Status Proposed => Approved
2018-11-12 15:07 ajosey Note Added: 0004161
2020-04-21 14:06 geoffclare Status Interpretation Required => Applied
2024-06-11 08:52 agadmin Status Applied => Closed