|ID||Category||Severity||Type||Date Submitted||Last Update|
|0001073||[1003.1(2008)/Issue 7] Shell and Utilities||Editorial||Clarification Requested||2016-08-29 20:17||2020-04-21 14:06|
|Priority||normal||Resolution||Accepted As Marked|
|Organization||The FreeBSD Project|
|Section||dirname utility and dirname() function|
|Final Accepted Text||See Note: 0003928|
|Summary||0001073: dirname utility: algorithm for computing pathname string is stricter than the corresponding dirname() function|
The dirname() function is described in a pretty relaxed way:
"The dirname() function shall take a pointer to a character string that contains a pathname, and return a pointer to a string that is a pathname of the parent directory of that file. Trailing '/' characters in the path are not counted as part of the path."
In FreeBSD HEAD, we're making use of this fact by implementing this function as follows:
A simple linear scan through the input string, copying all but the last pathname component over to the output. Pathname components consisting of a single dot are omitted, so that the output corresponds to the shortest sequence leading up to the path. As far as I know, this implementation complies with the spec.
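For illustration, the scan described above can be sketched roughly as follows. This is not the FreeBSD source: the function name dirname_scan, the caller-supplied buffer, and the return convention are all assumptions made for this sketch, and a leading "//" is collapsed to "/" (which assumes the implementation gives "//" no special meaning).

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch of a single-pass dirname: copy every component
 * except the last into out, skipping empty components (redundant
 * slashes) and "." components while scanning.
 * Returns 0 on success, -1 if out is too small.
 */
int dirname_scan(const char *path, char *out, size_t outsz)
{
    const char *p = (path != NULL) ? path : "";
    size_t n = 0;     /* bytes written to out so far */
    size_t last = 0;  /* value of n just before the latest component */

    if (outsz < 2)
        return -1;    /* need room for "." or "/" plus the NUL */
    if (p[0] == '/')
        out[n++] = '/';
    last = n;

    while (*p != '\0') {
        while (*p == '/')
            p++;                          /* skip separators */
        size_t clen = strcspn(p, "/");    /* length of this component */
        if (clen == 0)
            break;                        /* only trailing slashes left */
        if (!(clen == 1 && p[0] == '.')) {
            size_t sep = (n > 0 && out[n - 1] != '/') ? 1 : 0;
            if (n + sep + clen + 1 > outsz)
                return -1;
            last = n;                     /* remember where to truncate */
            if (sep)
                out[n++] = '/';
            memcpy(out + n, p, clen);
            n += clen;
        }
        p += clen;
    }

    n = last;                             /* drop the final component */
    if (n == 0)
        out[n++] = '.';
    out[n] = '\0';
    return 0;
}
```

With this sketch, "//a//b//" yields "/a" and "//.//b//" yields "/". Because "." components are dropped before the final component is identified, "/a/b/c/." yields "/a/b".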
Now the interesting part. It looks like the description of the dirname utility has a definition that's a lot stricter than that of dirname(). It explicitly describes all of the steps that need to be performed to get the output string. This means that there are some inconsistencies between the potential output of the utility and the function:
- Input: //a//b//
- Output utility: //a
- Output function: //a or /a
- Input: //.//b//
- Output utility: //.
- Output function: //., /., // or /
In other words, the dirname utility cannot be implemented on top of the dirname() function, which does seem to be done pretty often. My question is, is this really what's intended?
The description of the dirname utility could be simplified a lot to just say:
"The output generated by this utility is identical to that of the dirname() function."
Careful - the use of leading //a already falls under implementation-defined behavior. Your example would be more compelling as:
Utility output: ///a
Function output: ///a or /a
where you are avoiding the implementation-defined escape clause rules of leading //.
That said, I think it is INTENTIONAL that the dirname utility is defined as a strictly textual operation, and one that does NOT normalize redundant / or eliminate '.' elements. If anything, that argues that the dirname() function should be made stricter, not the dirname utility made looser, if we do indeed want to require the two to behave identically.
If there is existing practice for yet another function that properly normalizes redundant slashes and directory name components (some platforms have a 'realpath' utility with particular command line flags to achieve this, for example, but I'm not sure of any particular libc function that has equivalent flexibility), then maybe that would be worth standardizing, but it seems out of scope for this bug.
Meanwhile, I personally find the basename() and dirname() function specifications to be useless: the results need not be thread-safe, and need not return the input string, making it impossible to portably use in a multi-threaded application. Standardizing a function that is safe to use (perhaps by always malloc'ing its result) would be a smarter move than trying to band-aid the already-broken dirname() function.
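As a rough idea of what such a function could look like, here is a minimal sketch. The name dirname_alloc is hypothetical (nothing like it is in the standard); it applies the purely textual rules, never modifies its argument, and returns malloc'ed storage that the caller frees. A result that would consist only of slashes is reduced to "/", which assumes "//" carries no special meaning on the implementation.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical thread-safe dirname: purely textual, no normalization
 * of interior "//" or "." components. Returns NULL if malloc fails.
 */
char *dirname_alloc(const char *path)
{
    const char *src = ".";   /* default for NULL, "", or no slash */
    size_t len = 1;

    if (path != NULL && path[0] != '\0') {
        size_t end = strlen(path);

        while (end > 0 && path[end - 1] == '/')
            end--;                        /* trailing slashes */
        if (end == 0) {
            src = "/";                    /* path was all slashes */
        } else {
            while (end > 0 && path[end - 1] != '/')
                end--;                    /* final component */
            if (end > 0) {
                while (end > 0 && path[end - 1] == '/')
                    end--;                /* separators before it */
                if (end == 0) {
                    src = "/";            /* parent is the root */
                } else {
                    src = path;
                    len = end;
                }
            }
        }
    }

    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, src, len);
    out[len] = '\0';
    return out;
}
```

For example, dirname_alloc("/home//dwc//test") returns "/home//dwc" and dirname_alloc("//a//b//") returns "//a"; each result must be released with free().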
It is certainly the intention that the dirname() function should behave the same as is required for the dirname utility. This is evident from the TC2 update to the basename() examples section (see Note: 0001394) which added dirname() and the two utilities to the table; it shows the dirname() output for "/home//dwc//test" as "/home//dwc".
Note that the proposed FreeBSD implementation of the dirname() function in revision 304860, which removes "." components, will lead to very different results from those required by the standard:
dirname("/a/b/c/.") produces "/a/b/c" for most implementations, but just "/a/b" for the quoted FreeBSD version (which has now been replaced by a more conforming one).
Do we need to tighten the wording for the dirname() function itself to be clearer about trailing "." components?
edited on: 2018-02-23 10:23
Re: Note: 0003926 I just verified that the FreeBSD implementation mentioned in the Description behaves correctly and returns "/a/b/c" for dirname("/a/b/c/.").
It seems that the current version from Sun Sep 18 20:47:55 2016 UTC has been corrected already.
edited on: 2018-03-01 17:30
The standard states the steps required for the dirname utility, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.
There is an inconsistency between the specifications of the dirname function and utility. Additionally, it seems reasonable that the pathname could be rationalized, removing additional <slash> characters and "." components, and some implementations have experimented with this.
Notes to the Editor (not part of this interpretation):
On page 76 lines 2199-2200 (XBD 3.271 Pathname), change:
except for the case of exactly two leading <slash> characters.
to:
except it is implementation-defined whether exactly two leading <slash> characters is treated specially.
On page 625 lines 21586-21596 (XSH basename()), change:
(Rows 5, 6, 9, and 10 are changed, and two new rows plus a note are added at the end.)
On page 736 line 25067 section dirname() change:
return a pointer to a string that is a pathname of the parent directory of that file.
to:
return a pointer to a string that is a pathname of the directory containing the entry of the final pathname component.
On page 736 line 25072 section dirname() add a new paragraph:
It is unspecified whether redundant '/' characters and '.' pathname components in path are removed after determining the pathname to output. However, ".." pathname components occurring prior to the final component shall not be removed.
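The ordering in this paragraph (determine the output first, then optionally tidy it) can be illustrated with a small post-processing step. The helper name squeeze_path is hypothetical, and collapsing a leading "//" to "/" assumes the implementation does not treat "//" specially. It would be applied to the already-determined dirname result, so the final component has been chosen before any "." is removed.

```c
#include <string.h>

/*
 * Hypothetical helper: remove redundant '/' characters and "."
 * components from s in place. ".." components are always kept,
 * since removing them would require pathname resolution.
 */
void squeeze_path(char *s)
{
    char *r = s;   /* read position */
    char *w = s;   /* write position, never ahead of r */

    if (s[0] == '\0')
        return;                    /* nothing to do, no room to grow */
    if (s[0] == '/')
        *w++ = '/';                /* keep a single leading slash */

    while (*r != '\0') {
        while (*r == '/')
            r++;                   /* skip separators */
        size_t clen = strcspn(r, "/");
        if (clen == 0)
            break;
        if (!(clen == 1 && r[0] == '.')) {
            if (w > s && w[-1] != '/')
                *w++ = '/';
            memmove(w, r, clen);   /* regions may overlap */
            w += clen;
        }
        r += clen;
    }

    if (w == s)
        *w++ = '.';                /* everything was "." components */
    *w = '\0';
}
```

Applied to "/home//dwc" this gives "/home/dwc", and "a/./../b" becomes "a/../b" with the ".." component intact.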
On page 737 line 25113, change the RATIONALE section from:
An implementation should prefer the shortest output possible; however, this is not required, in part because earlier versions of the standard did not mention whether elision of redundant <slash> characters or dot (".") components was permitted. Removal of the dot-dot ("..") pathname component is not permitted, because eliding it correctly would require performing pathname resolution to ensure the resulting string would still point to the correct pathname if the original string resolved as a pathname. On implementations where pathname "//" has an implementation-defined meaning distinct from the pathname "/", the dirname of "//" will be "//".
On page 2667, change the entire DESCRIPTION section (lines 86879-86894) from:
The string operand shall be treated as a pathname, as defined in [xref to XBD Section 3.271]. The string string shall be converted to the name of the directory containing the filename corresponding to the last pathname component in string, performing actions equivalent to the following steps in order:
to:
The string operand shall be treated as a pathname, as defined in [xref to XBD Section 3.271 Pathname], and shall be converted to a pathname of the directory containing the entry of the final pathname component. The resulting string shall be written to standard output. The dirname utility shall not perform pathname resolution; the result shall not be affected by whether or not a file with the pathname string exists or by its file type. Trailing '/' characters in string that are not also leading '/' characters shall not be counted as part of the pathname. If the pathname does not contain a '/', the resulting string shall be ".". If string is an empty string, the resulting string shall be ".".
In RATIONALE, page 2668 after line 86955, insert a new paragraph:
The dirname utility is not specified in terms of the dirname() function, because the two may produce slightly different output where both output forms are still compliant. An implementation should prefer the shortest output possible; however, this is not required, in part because earlier versions of the standard did not permit elision of redundant <slash> characters or dot (".") components. Removal of the dot-dot ("..") pathname component is not permitted, because eliding it correctly would require performing pathname resolution to ensure the resulting string would still point to the correct pathname if the original string resolved as a pathname. On implementations where pathname "//" has an implementation-defined meaning distinct from the pathname "/", the dirname of "//" will be "//".
Interpretation Proposed: 30 Sept 2018
Interpretation approved: 12 November 2018
|2016-08-29 20:17||EdSchouten||New Issue|
|2016-08-29 20:17||EdSchouten||Status||New => Under Review|
|2016-08-29 20:17||EdSchouten||Assigned To||=> ajosey|
|2016-08-29 20:17||EdSchouten||Name||=> Ed Schouten|
|2016-08-29 20:17||EdSchouten||Organization||=> The FreeBSD Project|
|2016-08-29 20:17||EdSchouten||Section||=> dirname utility and dirname() function|
|2016-08-29 20:17||EdSchouten||Page Number||=> -|
|2016-08-29 20:17||EdSchouten||Line Number||=> -|
|2016-08-29 20:41||emaste||Issue Monitored: emaste|
|2016-08-29 20:45||eblake||Note Added: 0003369|
|2016-08-30 08:46||geoffclare||Note Added: 0003370|
|2016-08-30 08:47||geoffclare||Relationship added||related to 0000612|
|2016-08-30 13:33||eblake||Relationship added||related to 0000830|
|2018-02-15 16:17||eblake||Relationship added||related to 0001064|
|2018-02-22 19:02||nick||Note Added: 0003926|
|2018-02-23 10:23||joerg||Note Added: 0003927|
|2018-02-23 10:23||joerg||Note Edited: 0003927|
|2018-03-01 17:19||nick||Note Added: 0003928|
|2018-03-01 17:21||nick||Interp Status||=> ---|
|2018-03-01 17:21||nick||Final Accepted Text||=> See Note: 0003928|
|2018-03-01 17:21||nick||Status||Under Review => Resolved|
|2018-03-01 17:21||nick||Resolution||Open => Accepted As Marked|
|2018-03-01 17:21||nick||Tag Attached: issue8|
|2018-03-01 17:28||nick||Note Edited: 0003928|
|2018-03-01 17:28||nick||Note Edited: 0003928|
|2018-03-01 17:29||nick||Status||Resolved => Interpretation Required|
|2018-03-01 17:29||nick||Interp Status||--- => Pending|
|2018-03-01 17:30||eblake||Note Edited: 0003928|
|2018-09-30 18:29||ajosey||Interp Status||Pending => Proposed|
|2018-09-30 18:29||ajosey||Note Added: 0004139|
|2018-11-12 15:07||ajosey||Interp Status||Proposed => Approved|
|2018-11-12 15:07||ajosey||Note Added: 0004161|
|2020-04-21 14:06||geoffclare||Status||Interpretation Required => Applied|