ID | Category | Severity | Type | Date Submitted | Last Update | |||||||
0001798 | [1003.1(2024)/Issue8] System Interfaces | Objection | Clarification Requested | 2024-01-22 15:13 | 2024-07-24 14:32 | |||||||
Reporter | eblake | View Status | public | |||||||||
Assigned To | ||||||||||||
Priority | normal | Resolution | Accepted As Marked | |||||||||
Status | Interpretation Required | |||||||||||
Name | Eric Blake | |||||||||||
Organization | Red Hat | |||||||||||
User Reference | ebb.posix_getdents | |||||||||||
Section | XSH posix_getdents | |||||||||||
Page Number | 1567 | |||||||||||
Line Number | 52601 | |||||||||||
Interp Status | Approved | |||||||||||
Final Accepted Text | Note: 0006819 | |||||||||||
Summary | 0001798: Must posix_getdents remember file offsets across exec? | |||||||||||
Description |
The RATIONALE for fdopendir( ) (page 922) states that POSIX imposes no constraints on what may happen for "the use or referencing of a dirp value or a dirent structure value ... after a fork( ) or one of the exec function calls." Issue 8 added the posix_getdents( ) interface, and one of our goals was to allow it to be implemented on top of a hidden DIR* object for implementations where readdir( ) and friends already track file types as an extension. In trying to implement posix_getdents( ) for Cygwin, the choice was made to use a hidden DIR* object, opened on the first call to posix_getdents() for any fd, and where subsequent lseek() calls on the fd map to telldir()/seekdir() of the underlying DIR*. This works even for the case of dup() within a single process; but for fork() without exec, it is prohibitive to keep synchronization of the offset between the two copies, and after exec the underlying DIR* state is no longer available on the newly-exec'd process. It seems like most portable uses of getdents() were limited to a single process; it might help if the standard explicitly calls out the non-portability of expecting directory offsets to be preserved across fork() and exec(), so that an implementation that uses an underlying DIR* is not hitting hard walls about the synchronized use of that DIR* across fork. |
|||||||||||
Desired Action |
(Draft 4 locations) On page 1567, line 52616 (posix_getdents DESCRIPTION), change: The behavior is unspecified if lseek( ) is used to set the file offset to a value other than zero or a value returned by a previous call to lseek( ) on the same open file description. to: The behavior is unspecified if lseek( ) is used to set the file offset to a value other than zero or a value returned by a previous call to lseek( ) on the same open file description; likewise, the behavior is unspecified if attempting to use posix_getdents( ) on a file descriptor after an exec call or in the child process of a fork( ) or _Fork( ) call if the file descriptor was at a non-zero offset before the call, without first using lseek( ) to set the file offset back to zero. |
|||||||||||
Tags | tc1-2024 | |||||||||||
Attached Files | ||||||||||||
|
Notes | |
(0006632) eblake (manager) 2024-01-22 15:30 edited on: 2024-01-22 15:39 |
Correction - I'm told that the attempted Cygwin implementation also has problems after dup(); it is unclear whether the states should be linked (reading an entry on one fd, grabbing its offset, then using the other fd to read entries, it is unclear whether the second fd starts reading from the point where the fd was at the time of dup() or at the subsequent point reached by the first fd, and whether the second fd can safely lseek() to any subsequent offset read using the first fd). Easiest would be to state that dup() has the same limitations as fork()/exec - namely, that resuming any mid-stream directory traversal in either side of the split is unspecified, and the only portable thing is to start a new traversal by lseek'ing back to 0 (at which point, the implementation no longer has to worry about sharing a half-read DIR* across fd copies or processes). |
(0006658) corinna_vinschen (reporter) 2024-02-16 10:18 |
From Cygwin's side, the problem is this: The underlying non-POSIXy kernel does not allow lseek(2) operations on directory descriptors, not even requesting a position within the directory. The only available seek-like operation is equivalent to lseek(dirfd, 0, SEEK_SET). Therefore we have to use a DIR* and the entire operation of position bookkeeping is performed in user space. If the standard strives to allow implementing posix_getdents() using DIR* under the hood, the standard should be clear on the subject that DIR is not dup(2)'able the same way as the dir descriptor given as argument to posix_getdents(). DIR is, by and large, a user-space object while the descriptor is a kernel object. DIR has never been meant as a dup'able object and there's no precedent for such a functionality. As such, there's no way to keep the dup'ed DIR* in sync after such a duplication. The same problem occurs with fork(2), which is just a more thorough dup(2) in terms of descriptors. Bottom line is, with user-space DIR* with enforced user space bookkeeping, there's no way after dup(2)/fork(2) to keep the directory position info in sync. Consequently, there should be no assumption made how posix_getdents() behaves after dup(2) or fork(2). I.e. using the descriptors with posix_getdents() or readdir() in parallel should be undefined behaviour. If you're interested in code, I invite you to take a look into the current, preliminary implementation of posix_getdents() in Cygwin: https://cygwin.com/cgit/newlib-cygwin/commit/?id=62ca95721a14 As the commit outlines, the code does not try to keep track of the hidden DIR at all. Thanks, Corinna |
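To make the point about DIR not being dup(2)'able concrete, here is a minimal sketch of a per-descriptor table of hidden streams. It is not the Cygwin code: the names hidden_dirs, hidden_dir_for_fd(), drop_hidden_dir(), dup_gets_separate_stream() and the table size are all hypothetical, and a real implementation would hook into its own descriptor layer. The sketch only shows that a lazily opened, per-fd DIR* gives a dup()'ed descriptor a second, unrelated stream.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_TRACKED_FDS 256   /* hypothetical table size */

/* One hidden stream per descriptor number; purely user-space state. */
static DIR *hidden_dirs[MAX_TRACKED_FDS];

/* Return the hidden stream for fd, opening it on first use. */
DIR *hidden_dir_for_fd(int fd)
{
    if (fd < 0 || fd >= MAX_TRACKED_FDS)
        return NULL;
    if (hidden_dirs[fd] == NULL) {
        /* fdopendir() takes over the descriptor it is given, so hand it
           a private duplicate and leave the caller's fd alone. */
        int copy = dup(fd);
        if (copy < 0)
            return NULL;
        hidden_dirs[fd] = fdopendir(copy);
        if (hidden_dirs[fd] == NULL)
            close(copy);
    }
    return hidden_dirs[fd];
}

/* Forget the hidden stream for fd, if there is one. */
void drop_hidden_dir(int fd)
{
    if (fd >= 0 && fd < MAX_TRACKED_FDS && hidden_dirs[fd] != NULL) {
        closedir(hidden_dirs[fd]);   /* also closes the private duplicate */
        hidden_dirs[fd] = NULL;
    }
}

/* Returns 1 when a dup()'ed descriptor ends up with its own, separate
   hidden stream, 0 when it shares one, and -1 on setup failure. */
int dup_gets_separate_stream(const char *path)
{
    int fd1 = open(path, O_RDONLY | O_DIRECTORY);
    if (fd1 < 0)
        return -1;
    int fd2 = dup(fd1);
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }
    assert(fd2 != fd1);

    DIR *d1 = hidden_dir_for_fd(fd1);
    DIR *d2 = hidden_dir_for_fd(fd2);
    int result = (d1 != NULL && d2 != NULL) ? (d1 != d2) : -1;

    drop_hidden_dir(fd1);
    drop_hidden_dir(fd2);
    close(fd1);
    close(fd2);
    return result;
}
```

Calling dup_gets_separate_stream("/") is expected to report two distinct streams. Nothing in the table links them, which is the bookkeeping gap described in the note above.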
(0006695) geoffclare (manager) 2024-02-29 17:27 |
Proposed interpretation (review timer to start after approval of issue 8) ... Interpretation response ------------------------ The standard states that the posix_getdents() function starts reading at the current file offset in the open file description associated with fildes, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor. Rationale: ------------- Elsewhere the standard makes allowances for implementations where directory streams are not implemented using a file descriptor, but this was not extended to the new posix_getdents() function when it was added. Notes to the Editor (not part of this interpretation): ------------------------------------------------------- After page 920 line 31407 section fdopendir(), add a new paragraph: If the file descriptor specified by fd is associated with an open file description on which posix_getdents() has previously been used, the behavior is unspecified. After page 1567 line 52616 section posix_getdents(): The behavior is unspecified if lseek() is used to set the file offset to a value other than zero or a value returned by a previous call to lseek() on the same open file description. add these sentences: The behavior is unspecified if calls to posix_getdents() are made on different file descriptors that refer to the same open file description (for example, before and after a file descriptor is inherited across fork() or the exec family of functions, or is duplicated using dup() or fcntl()), unless lseek() is used to set the file offset to zero in between the calls to posix_getdents(). A single exception to this condition is that after a call to fork(), either the parent or child (but not both) can continue processing the directory using posix_getdents(); if both the parent and child processes use the function, the result is unspecified. 
Likewise, the behavior is unspecified if in between two calls to posix_getdents() on one file descriptor, the file offset is altered by a call made on a different file descriptor that refers to the same open file description and the new offset is not zero. After page 1571 line 52771 section posix_getdents(), add a new paragraph to RATIONALE: The restrictions on the use of different file descriptors that refer to the same open file description are needed in order to enable implementations where directory streams are not implemented using a file descriptor to maintain some internal state related to a particular file descriptor. At page 1858 line 61319 section readdir(), change: the result is undefined. to: the result is unspecified. |
(0006703) corinna_vinschen (reporter) 2024-03-04 09:38 |
> Likewise, the behavior is unspecified if in between two calls to
> posix_getdents() on one file descriptor, the file offset is altered
> by a call made on a different file descriptor that refers to the same open
> file description and the new offset is not zero.

While the new clarifications look mostly good to me, this snippet still looks like a problem, in particular the restriction "and the new offset is not zero". The reason is that after dup, the process has to create a new DIR for the dup'ed descriptor. The two DIRs are logically distinct. The above paragraph sounds like the behaviour is not supposed to be undefined in the following situation:

    posix_getdents (fd1, ...);   // file pos != 0 after this call
    fd2 = dup (fd1);
    lseek (fd2, 0, SEEK_SET);    // seeks file to pos 0 via fd2
    posix_getdents (fd1, ...);   // is now supposed to start at pos 0?

If so, I'm not sure how to do this via underlying DIR pointer. The DIR pointer attached to fd1 is either dropped from or duplicated to fd2. It's certainly not the same DIR pointer. The lseek in fd2 can only affect the DIR attached to fd2, not the one from fd1. So, given we don't have a seekable underlying OS file descriptor, how is the second posix_getdents on fd1 supposed to know that it has to restart at pos 0? Thanks, Corinna |
(0006708) geoffclare (manager) 2024-03-07 12:28 |
Re Note: 0006703 I'm not seeing why that case would be any more difficult to handle than this one:

    posix_getdents (fd1, ...);   // file pos != 0 after this call
    lseek (fd1, 0, SEEK_SET);    // seeks file to pos 0
    posix_getdents (fd1, ...);

Since posix_getdents() has to query the file offset on every call, if the offset is zero it should just start reading the directory at the beginning, regardless of how the offset became zero. Did I miss something? |
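The expectation in this note rests on the file offset living in the open file description, so that a seek through one descriptor is visible through every duplicate. A minimal sketch of that kernel-level behaviour follows. It uses a regular scratch file rather than a directory, because seeking on directory descriptors is exactly what is in dispute here; the function name dup_shares_offset() and the /tmp template are illustrative assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if an lseek() made through a dup()'ed descriptor is visible
   through the original descriptor, 0 if not, -1 on setup failure. */
int dup_shares_offset(void)
{
    char path[] = "/tmp/dup_offset_XXXXXX";   /* hypothetical scratch file */
    int fd1 = mkstemp(path);
    if (fd1 < 0)
        return -1;
    unlink(path);   /* the file stays alive until both descriptors close */

    int result = -1;
    int fd2 = -1;
    if (lseek(fd1, 100, SEEK_SET) == 100 && (fd2 = dup(fd1)) >= 0) {
        assert(fd2 != fd1);
        /* The duplicate starts at the offset the original had reached. */
        off_t seen_by_fd2 = lseek(fd2, 0, SEEK_CUR);
        /* Rewind through the duplicate ... */
        off_t rewound = lseek(fd2, 0, SEEK_SET);
        /* ... and the original observes offset 0 as well. */
        off_t seen_by_fd1 = lseek(fd1, 0, SEEK_CUR);
        result = (seen_by_fd2 == 100 && rewound == 0 && seen_by_fd1 == 0);
    }
    if (fd2 >= 0)
        close(fd2);
    close(fd1);
    return result;
}
```

On a system where the kernel tracks the offset, the function reports sharing. An implementation that keeps the position in a per-descriptor user-space object has no equivalent shared slot to consult.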
(0006709) corinna_vinschen (reporter) 2024-03-07 15:00 |
You're missing the fact that the underlying OS does *not* maintain a file position on directory descriptors. The function returning the file position always returns 0 on a directory, independent of the actually read directory entries. Also, there's no way to lseek on a directory. The only operation available is a "restart" flag to the directory read operation, which requests a start at position 0. So, to be able to implement telldir/seekdir, the DIR struct has to maintain a read counter. telldir() simply returns the number of directory entries read so far. seekdir() is implemented as a "restart" and then reading directory entries in a loop until the counter matches the one given as argument. Having said that, as soon as you fork() with a directory descriptor that has been used with posix_getdents(), you not only generate a copy of the underlying OS descriptor, you also duplicate the DIR struct into the new process. Now the DIR structs are independent from each other. If you call posix_getdents on one of them, the DIR struct in the other process is obviously *not* updated accordingly. Thus, any lseek() on the directory descriptor in one process is lost on the one in the other process. I used fork() as an example, but the same goes for dup(), unless you share the same DIR structure for all the directory descriptors in shared memory. Does that clear things up? Thanks, Corinna |
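The counter-based telldir()/seekdir() scheme described in this note can be sketched as follows. This is an illustration built on the standard opendir()/readdir()/rewinddir() calls, with rewinddir() standing in for the OS "restart" flag; the struct counted_dir type and the cd_* function names are hypothetical and are not taken from Cygwin.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <string.h>   /* strcmp(), for callers comparing d_name */

/* Hypothetical wrapper: a stream plus the number of entries read so far. */
struct counted_dir {
    DIR *dir;
    long count;
};

int cd_open(struct counted_dir *cd, const char *path)
{
    cd->dir = opendir(path);
    cd->count = 0;
    return cd->dir != NULL ? 0 : -1;
}

/* Read one entry and bump the counter. */
struct dirent *cd_read(struct counted_dir *cd)
{
    struct dirent *entry = readdir(cd->dir);
    if (entry != NULL)
        cd->count++;
    return entry;
}

/* "telldir": the position is just the read count. */
long cd_tell(const struct counted_dir *cd)
{
    return cd->count;
}

/* "seekdir": restart, then re-read until the counter matches. */
int cd_seek(struct counted_dir *cd, long target)
{
    assert(target >= 0);
    rewinddir(cd->dir);   /* stands in for the OS "restart" flag */
    cd->count = 0;
    while (cd->count < target) {
        if (cd_read(cd) == NULL)
            return -1;    /* directory is shorter than the target */
    }
    return 0;
}

void cd_close(struct counted_dir *cd)
{
    closedir(cd->dir);
    cd->dir = NULL;
}
```

The cost of cd_seek() grows with the target position, and the whole position lives in cd->count. A second wrapper created for a duplicated descriptor, or a copy made by fork(), has its own count that nothing keeps in step.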
(0006710) geoffclare (manager) 2024-03-07 18:14 |
> You're missing the fact that the underlying OS does *not* maintain a file position on directory descriptors. Actually, I think I knew that, but had forgotten it. So the Cygwin lseek() must have to fake an offset for fds associated with a directory stream - presumably returning the read count - and accept those faked offsets as input. To make it work for an lseek() on an fd obtained from dup(), as in Note: 0006703, couldn't you have dup() notice that the fd passed in is associated with a directory stream and create an association between the new fd and the same directory stream? Admittedly the code would be more complicated if a directory stream can be associated with more than one fd, but it seems to me that this could be a promising approach that would provide better compatibility with other systems. |
(0006711) corinna_vinschen (reporter) 2024-03-07 20:24 |
> So the Cygwin lseek() must have to fake an offset for fds associated with a
> directory stream - presumably returning the read count - and accept those
> faked offsets as input

Yes, as I described in my previous note, readdir() keeps count, telldir() returns the count, seekdir() rewinds and calls readdir() until the count equals the seekdir() argument. lseek() on dirs was not implemented at all due to the OS not supporting it. lseek() is now supported for dirs used in posix_getdents() by calling telldir()/seekdir() under the hood, see https://cygwin.com/cgit/newlib-cygwin/commit/?id=62ca95721a14

> To make it work for an lseek() on an fd obtained from dup(), as in
> Note: 0006703, couldn't you have dup() notice that the fd passed in is
> associated with a directory stream and create an association between
> the new fd and the same directory stream? Admittedly the code would be
> more complicated

Actually a *lot* more complicated. You're basically now expecting shared bookkeeping of DIRs. But this was never before required by opendir()/readdir()/... A DIR is an allocated user space structure on the heap. The only way of sharing a DIR was by duplicating it via fork(). The two resulting DIRs in the parent and child processes are disconnected and they function independently of each other. If you now require DIRs to be shared across dup() and fork(), you're basically requiring a rewrite of otherwise conforming implementations of opendir() and friends. It would require storing a DIR in shared memory, adding interprocess locking to readdir(), and whatnot. If the idea was to allow implementing posix_getdents() with existing DIR under the hood, this new requirement breaks this assumption, just for a border case. Corinna |
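The statement that the two DIRs in parent and child "are disconnected" can be illustrated without any directory at all. In the sketch below a plain struct stands in for the read counter inside a heap-allocated DIR; struct fake_dir_state and counter_diverges_after_fork() are hypothetical names used only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for the read counter kept inside a user-space DIR. */
struct fake_dir_state {
    long entries_read;
};

/* Returns 1 if the parent's counter is untouched by the child's reads,
   0 otherwise, -1 if fork() or waitpid() fails. */
int counter_diverges_after_fork(void)
{
    struct fake_dir_state state = { 3 };   /* "three entries read so far" */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: "reads" five more entries in its private copy. */
        state.entries_read += 5;
        _exit(state.entries_read == 8 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    assert(WIFEXITED(status));
    /* Parent: still sees 3, because nothing was shared. */
    return WEXITSTATUS(status) == 0 && state.entries_read == 3;
}
```

Keeping the two copies in step would need shared memory and locking, which is the extra machinery this note objects to.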
(0006712) corinna_vinschen (reporter) 2024-03-07 20:30 |
Btw., in terms of lseek() I pushed an improved patch a week ago: https://cygwin.com/cgit/newlib-cygwin/commit/?id=6d936915477c |
(0006715) geoffclare (manager) 2024-03-08 09:00 |
> If you now require DIRs to be shared across dup() and fork(), ... I was only proposing sharing across dup(), not fork(). The extra complexity you describe (shared memory, interprocess locking) would not be needed. |
(0006716) corinna_vinschen (reporter) 2024-03-08 11:08 |
Even if only with dup() it's still synchronization overhead which was never before required for DIR :( |
(0006721) eblake (manager) 2024-03-21 16:21 edited on: 2024-03-21 16:30 |
On the 21 Mar 2024 call, we reopened this bug, in order to consider replacing the original proposal of Note: 0006695 with the following; line numbers from draft 4.0: Proposed interpretation (review timer to start after approval of issue 8) ... Interpretation response ------------------------ The standard states that the posix_getdents() function starts reading at the current file offset in the open file description associated with fildes, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor. Rationale: ------------- Elsewhere the standard makes allowances for implementations where directory streams are not implemented using a file descriptor, but this was not extended to the new posix_getdents() function when it was added. Notes to the Editor (not part of this interpretation): ------------------------------------------------------- After page 920 line 31407 section fdopendir(), add a new paragraph: If the file descriptor specified by fd is associated with an open file description on which posix_getdents() has previously been used, or for which any associated file descriptor is already associated with a directory stream, the behavior is unspecified. After page 1567 line 52616 section posix_getdents(): The behavior is unspecified if lseek() is used to set the file offset to a value other than zero or a value returned by a previous call to lseek() on the same open file description. add these sentences: The behavior is unspecified if calls to posix_getdents() are made on different file descriptors that refer to the same open file description (for example, before and after a file descriptor is inherited across fork() or the exec family of functions, or is duplicated using dup() or fcntl()), unless lseek() is used to set the file offset to zero in between the calls to posix_getdents(). 
A single exception to this condition is that after a call to fork(), either the parent or child (but not both) can continue processing the directory using posix_getdents(). Likewise, the behavior is unspecified if in between two calls to posix_getdents() on one file descriptor, the file offset is altered by a call made on a different file descriptor that refers to the same open file description. After page 1571 line 52771 section posix_getdents(), add a new paragraph to RATIONALE: The restrictions on the use of different file descriptors that refer to the same open file description are needed in order to enable implementations where directory streams are not implemented using a file descriptor to maintain some internal state related to a particular file descriptor. At page 1858 line 61312, section readdir(), change: If a file is removed from or added to the directory after the most recent call to opendir( ) or rewinddir( ), whether a subsequent call to readdir( ) returns an entry for that file is unspecified. to: If a file is removed from or added to the directory after the most recent call to opendir( ) or rewinddir( ), whether a subsequent call to readdir( ) on that directory stream returns an entry for that file is unspecified. For all other files in the directory that existed at the time the directory stream was opened and which have not been removed, successive calls to readdir( ) on that directory stream shall return an entry for each such file exactly once before reporting that the end of the directory has been reached, provided that there are no intervening calls to seekdir( ) and no unspecified behavior caused by opening a second directory stream on the same file description associated with the directory. For any such file that is renamed within the directory after the directory stream was opened, readdir( ) shall return either an entry for the original name or for the new name, but not both. 
At page 1858 line 61319 section readdir(), change: the result is undefined. to: the result is unspecified. |
(0006722) corinna_vinschen (reporter) 2024-03-21 18:58 |
This sounds pretty good. Thanks, Corinna |
(0006723) kre (reporter) 2024-03-22 03:50 edited on: 2024-03-22 04:48 |
Re Note: 0006721 There it says: For any such file that is renamed within the directory after the directory stream was opened, readdir( ) shall return either an entry for the original name or for the new name, but not both. I have no idea how that is supposed to be implementable, that is, the "but not both". First, I am assuming that "such file" means this applies only to a file that is: removed from or added to the directory after the most recent call to from the opening line of the paragraph, though it doesn't really matter, since a typical implementation of readdir() isn't going to be able to tell which files were added to the directory after any particular event, other than when the event is readdir() reaching EOF. That is, if you consider reading a VERY large directory with readdir(), while reading the first few entries (however many readdir() manages to buffer in memory) a new file is added, way down at the end of the directory where it hasn't reached yet. How could there be a special rule for that file which doesn't also apply to the previously added file, which just happened to be added before the event occurred? To readdir() those two files look to be almost exactly the same, it can't even use the modification time of the directory, or its size, as some kind of hint, neither of those works given the right circumstances. Note that here I am concerned only with the rule about what happens with files that are renamed, not with the "unspecified whether it is shown or not" part, that's simple. And for this first nit, it is just the "such file" (as distinct from "any file") which seems to be wrong. One more assumption, then I'll get to the real problem with the language quoted. 
If I set up a directory using

    mkdir dir
    cd dir
    > file1
    ln file1 file2

and after that is all done, I run an application (maybe ls) which uses readdir() to read the directory (with nothing changing in any way), I assume (hope) that it is intended, and required, that readdir() returns entries for both file1 and file2. If that's not required, then you can stop reading this note now, and we have bigger problems. So, I assume it is true. Now set up a directory as follows

    mkdir dir
    cd dir
    > f
    makemany a b c d e
    > do_it_now
    makemany g h i j k l m n o p q r s t u v w x y z

where "makemany" is a function defined like

    makemany() {
        for c
        do
            for n1 in 0 1 2 3 4 5 6 7 8 9
            do
                for n2 in 0 1 2 3 4 5 6 7 8 9
                do
                    for n3 in 0 1 2 3 4 5 6 7 8 9
                    do
                        >"${c}${n1}${n2}${n3}"
                    done
                done
            done
        done
    }

which is fairly ugly ... the silly nested n1 n2 n3 loops are just because POSIX appears not to have a standard utility like either jot or seq, which is what I'd use in reality. [Aside, I know awk would work, but that is kind of heavyweight.] The point is simply to create thousands of files with relatively short names in the current directory. If your readdir() buffers LOTS in memory, you might need to add n4 n5 ... to make sufficient files for the problem to manifest itself. The method by which that is accomplished is unimportant. Then we start a process that uses readdir() to read this directory. When (or about when) it reaches the file "do_it_now" while reading the directory, either this process (the thread running readdir() or some other thread) or some other process unknown to this one, does:

    mv f the_same_file_as_f_was_but_with_a_much_longer_name

Now because of the way directories are created in typical systems (and certainly a way they're permitted to be created) the entry for "f" right at the start of the directory, which our readdir() call will have already returned an entry for, will be removed, and a new entry for the new name (which I won't type again) will be made. 
Because of the way the directory was created, with many short file names, the only place that new name can be put is right at the end beyond what the readdir() implementation has already read from the filesystem (if it has already read to the end in a particular implementation, simply add more directory entries). The new name cannot simply replace "f", it is far too long to fit there, and if other entries were moved around, trying to get readdir() to do anything reasonable at all (given that none of the files that might be repositioned in the directory have been added, or removed, while readdir() is reading the directory) would be almost impossible. The rule that says "but not both" means that, as an entry for "f" has already been returned, readdir() is not permitted to return an entry for its replacement name. But I have no idea how any reasonable readdir() implementation is supposed to implement that rule. Before you start telling me how it could, or should, be done, consider the similar case, where instead of doing the "mv" command above, the following was performed (at the same point as the mv, instead of it) ln f the_same_file_as_f_was_but_with_a_much_longer_name Now I know in that case it is unspecified whether or not the new name (being one created after readdir() started reading the directory) is returned or not, so not returning it would be legitimate, but in practice, there is no way for readdir() to know that this file was added while it was reading the directory. If necessary, we can postulate that before doing that "ln" command, the same process which would execute that one (or the equivalent link() system call) removed just the right number of the final files in the directory, and the alternative (long) name for "f" is made precisely the correct length so that the size of the directory is not altered by this sequence. The assumption above is that when links to files exist, both must be returned. 
Since there is no way to know for sure that the new file was added after readdir() started, it cannot rely on the "unspecified whether it is returned or not" - the second name simply must be returned. (The "unspecified" is because an implementation is permitted to insert the new name into a section of the directory which had already been read, and we cannot require the implementation to continuously re-read the directory, just in case a new file was added into a segment already processed.) Now how is the readdir() implementation supposed to know that a rename ("mv") happened, rather than a "ln" - or in fact that any of those things happened at all while the directory was being read? Hence I cannot see a way that makes it possible to implement that "but not both" rule. It simply has to go. There cannot be any rule about what must be done in these cases, as there is no way to determine whether one of the special cases happened or not. All the implementation can do is return the entries it sees as it reads the directory. The standard simply needs to make it clear that there are cases where what is returned might be neither what was in the directory when the readdir() started, nor what is there when it completes. What needs to be said about rename() is more like: If a file is renamed within the directory after the directory stream was opened, readdir() may return an entry for the original name, one for the new name, neither, or both. as in reality (as far as the directory is concerned) a rename is simply making a new file, and deleting an old. That the contents, type, and other attributes, of the file are all the same is not relevant at all to readdir(), so the effects need to be the same: unspecified whether the name that was removed is returned, and unspecified whether the replacement name (which was added) is returned. (The suggested paragraph above could be rewritten in terms of "unspecified" rather than "may" if desired.) 
I also notice that posix_getdents() says nothing about the effects of a rename() - and perhaps should. However, were the language changed to refer to file names being added to or removed from the directory, rather than files being added or removed, then what is there now would cover it I think. That does assume that the "effects of the concurrent operation" mean only the effects as applied to any specific entry being returned, and do not extend to other entries that may be modified as a side effect of that concurrent operation. |
(0006724) geoffclare (manager) 2024-03-22 09:48 edited on: 2024-03-22 10:30 |
> If I set up a directory using
>
> mkdir dir
> cd dir
> > file1
> ln file1 file2
>
> and after that is all done, I run an application (maybe ls) which
> uses readdir() to read the directory (with nothing changing in any
> way), I assume (hope) that is intended, and required, that readdir()
> returns entries for both file1 and file2. If that's not required,
> then you can stop reading this note now, and we have bigger problems.

Obviously, returning entries for both is _intended_ to be required, but you have uncovered a major problem with the proposed wording, and as it stands implementations would be required to return either file1 or file2 but not both. This is because the text uses "file" when it means "directory entry". In your example, file1 and file2 are separate directory entries which both refer to the same file. If we reword in terms of directory entries, I think no explicit statement about renaming will be needed. If the rename removes one directory entry and adds another, it will be covered by the add/remove text; if it updates the name within the directory entry, the existing requirement for directory operations to be atomic will be sufficient.

> I also notice that posix_getdents() says nothing about the effects of
> a rename() - and perhaps should. However, were the language changed
> to refer to file names being added to or removed from the directory,
> rather than files being added or removed, then what is there now would
> cover it I think.

In the relevant paragraph (lines 52624-52629 in draft 4) the first sentence uses "directory entry" and the second uses "file". The second should change to use "directory entry".

Update: I have added suggested new wording to the etherpad at https://posix.rhansen.org/p/2024-03-21 (currently at line 330) |
(0006726) eblake (manager) 2024-03-25 15:20 edited on: 2024-03-25 15:22 |
Proposed interpretation (review timer to start after approval of issue 8), using draft 4.0 line numbers ...

Interpretation response
------------------------
The standard states that the posix_getdents() function starts reading at the current file offset in the open file description associated with fildes, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.

Rationale:
-------------
Elsewhere the standard makes allowances for implementations where directory streams are not implemented using a file descriptor, but this was not extended to the new posix_getdents() function when it was added.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
After page 920 line 31407 section fdopendir(), add a new paragraph:

    If the file descriptor specified by fd is associated with an open file description on which posix_getdents() has previously been used, or for which any associated file descriptor is already associated with a directory stream, the behavior is unspecified.

After page 1567 line 52616 section posix_getdents():

    The behavior is unspecified if lseek() is used to set the file offset to a value other than zero or a value returned by a previous call to lseek() on the same open file description.

add these sentences:

    The behavior is unspecified if calls to posix_getdents() are made on different file descriptors that refer to the same open file description (for example, before and after a file descriptor is inherited across fork() or the exec family of functions, or is duplicated using dup() or fcntl()), unless lseek() is used to set the file offset to zero in between the calls to posix_getdents(). A single exception to this condition is that after a call to fork(), either the parent or child (but not both) can continue processing the directory using posix_getdents().

    Likewise, the behavior is unspecified if in between two calls to posix_getdents() on one file descriptor, the file offset is altered by a call made on a different file descriptor that refers to the same open file description.

At page 1568 line 52626 section posix_getdents(), change:

    If a sequence of calls to posix_getdents() is made that reads from offset zero to end-of-file and a file is removed from or added to the directory between the first and last of those calls, whether the sequence of calls returns an entry for that file is unspecified.

to:

    If a sequence of calls to posix_getdents() is made that reads from offset zero to end-of-file and a directory entry is removed from or added to the directory between the first and last of those calls, whether the sequence of calls returns that directory entry is unspecified.

After page 1571 line 52771 section posix_getdents(), add a new paragraph to RATIONALE:

    The restrictions on the use of different file descriptors that refer to the same open file description are needed in order to enable implementations where directory streams are not implemented using a file descriptor to maintain some internal state related to a particular file descriptor.

At page 1858 line 61304, section readdir(), change:

    If a file is removed from or added to the directory after the most recent call to opendir() or rewinddir(), whether a subsequent call to readdir() returns an entry for that file is unspecified.

to:

    If a directory entry is removed from or added to the directory after the most recent call to opendir() or rewinddir(), whether a subsequent call to readdir() on that directory stream returns that directory entry is unspecified.

    For all other directory entries in the directory that existed at the time the directory stream was opened or rewound and which have not been removed, successive calls to readdir() on that directory stream shall return each such directory entry exactly once before reporting that the end of the directory has been reached, provided that there are no intervening calls to seekdir() and no unspecified behavior caused by performing an operation on an open file description associated with the directory.

At page 1858 line 61319 section readdir(), change:

    the result is undefined.

to:

    the result is unspecified.

At page 1859 line 61332 section readdir(), change:

    The readdir_r() function shall not return directory entries containing empty names.

to:

    The readdir_r() function shall not return directory entries containing empty names. If entries for dot or dot-dot exist, one entry shall be returned for dot and one entry shall be returned for dot-dot; otherwise, they shall not be returned. |
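The proposed wording turns on the difference between a file descriptor and an open file description: descriptors produced by dup() (or inherited across fork()) share one description, and therefore one file offset. The following is a minimal C sketch of that sharing, not taken from the issue itself. It assumes a regular scratch file rather than a directory, because directory offsets are opaque and posix_getdents() may not be available on a given system. The helper names offset_seen_after_dup_seek() and make_scratch_file() are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: seek a dup()ed copy of fd to `target`, then report
 * the offset as observed through the original descriptor.
 * Returns (off_t)-1 on any error.
 */
off_t offset_seen_after_dup_seek(int fd, off_t target)
{
    int copy = dup(fd);                 /* refers to the same open file description */
    if (copy < 0)
        return (off_t)-1;

    if (lseek(copy, target, SEEK_SET) == (off_t)-1) {
        close(copy);
        return (off_t)-1;
    }

    /* Query the offset through the ORIGINAL descriptor without moving it. */
    off_t seen = lseek(fd, 0, SEEK_CUR);
    close(copy);
    return seen;
}

/*
 * Hypothetical helper: create an already-unlinked scratch file that holds
 * 16 bytes, so there is something to seek within.
 * Returns the descriptor, or -1 on error.
 */
int make_scratch_file(void)
{
    char path[] = "/tmp/ofd_demoXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                       /* file disappears once fd is closed */
    if (write(fd, "0123456789abcdef", 16) != 16) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Calling offset_seen_after_dup_seek(fd, 5) on a descriptor returned by make_scratch_file() reports 5, even though the seek was issued on the duplicate. This is the situation the added sentences address: an implementation that keeps hidden per-descriptor state cannot see offset changes made through another descriptor.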
(0006819) geoffclare (manager) 2024-06-17 08:51 |
Interpretation response
------------------------
The standard states that the posix_getdents() function starts reading at the current file offset in the open file description associated with fildes, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.

Rationale:
-------------
Elsewhere the standard makes allowances for implementations where directory streams are not implemented using a file descriptor, but this was not extended to the new posix_getdents() function when it was added.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
After page 920 line 31395 section fdopendir(), add a new paragraph:

    If the file descriptor specified by fd is associated with an open file description on which posix_getdents() has previously been used, or for which any associated file descriptor is already associated with a directory stream, the behavior is unspecified.

After page 1567 line 52603 section posix_getdents():

    The behavior is unspecified if lseek() is used to set the file offset to a value other than zero or a value returned by a previous call to lseek() on the same open file description.

add these sentences:

    The behavior is unspecified if calls to posix_getdents() are made on different file descriptors that refer to the same open file description (for example, before and after a file descriptor is inherited across fork() or the exec family of functions, or is duplicated using dup() or fcntl()), unless lseek() is used to set the file offset to zero in between the calls to posix_getdents(). A single exception to this condition is that after a call to fork(), either the parent or child (but not both) can continue processing the directory using posix_getdents().

    Likewise, the behavior is unspecified if in between two calls to posix_getdents() on one file descriptor, the file offset is altered by a call made on a different file descriptor that refers to the same open file description.

At page 1568 line 52613 section posix_getdents(), change:

    If a sequence of calls to posix_getdents() is made that reads from offset zero to end-of-file and a file is removed from or added to the directory between the first and last of those calls, whether the sequence of calls returns an entry for that file is unspecified.

to:

    If a sequence of calls to posix_getdents() is made that reads from offset zero to end-of-file and a directory entry is removed from or added to the directory between the first and last of those calls, whether the sequence of calls returns that directory entry is unspecified.

After page 1571 line 52758 section posix_getdents(), add a new paragraph to RATIONALE:

    The restrictions on the use of different file descriptors that refer to the same open file description are needed in order to enable implementations where directory streams are not implemented using a file descriptor to maintain some internal state related to a particular file descriptor.

At page 1858 line 61299, section readdir(), change:

    If a file is removed from or added to the directory after the most recent call to opendir() or rewinddir(), whether a subsequent call to readdir() returns an entry for that file is unspecified.

to:

    If a directory entry is removed from or added to the directory after the most recent call to opendir() or rewinddir(), whether a subsequent call to readdir() on that directory stream returns that directory entry is unspecified.

    For all other directory entries in the directory that existed at the time the directory stream was opened or rewound and which have not been removed, successive calls to readdir() on that directory stream shall return each such directory entry exactly once before reporting that the end of the directory has been reached, provided that there are no intervening calls to seekdir() and no unspecified behavior caused by performing an operation on an open file description associated with the directory.

At page 1858 line 61306 section readdir(), change:

    the result is undefined.

to:

    the result is unspecified.

At page 1859 line 61319 section readdir(), change:

    The readdir_r() function shall not return directory entries containing empty names.

to:

    The readdir_r() function shall not return directory entries containing empty names. If entries for dot or dot-dot exist, one entry shall be returned for dot and one entry shall be returned for dot-dot; otherwise, they shall not be returned. |
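The revised readdir() wording requires that, on a single directory stream with no intervening seekdir() and no concurrent changes, each existing entry is returned exactly once, and that dot and dot-dot each appear at most once. The following is a rough C sketch of how an application might check that property. It is an illustration only, assuming a private scratch directory that nothing else modifies. The helper names count_entry() and make_scratch_dir() are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: make one full readdir() pass over `path` and count
 * how many times an entry named `name` is returned.
 * Returns the count, or -1 if the directory cannot be opened or read.
 */
int count_entry(const char *path, const char *name)
{
    DIR *dirp = opendir(path);
    if (dirp == NULL)
        return -1;

    int hits = 0;
    for (;;) {
        errno = 0;                      /* distinguish end-of-directory from error */
        struct dirent *ent = readdir(dirp);
        if (ent == NULL)
            break;
        if (strcmp(ent->d_name, name) == 0)
            hits++;
    }
    int failed = (errno != 0);
    closedir(dirp);
    return failed ? -1 : hits;
}

/*
 * Hypothetical helper: turn the mkdtemp() template `tmpl` into a new
 * directory containing two empty files named "a" and "b".
 * Returns 0 on success, -1 on error.
 */
int make_scratch_dir(char *tmpl)
{
    static const char *names[] = { "a", "b" };

    if (mkdtemp(tmpl) == NULL)
        return -1;
    for (size_t i = 0; i < sizeof names / sizeof names[0]; i++) {
        char path[512];
        int n = snprintf(path, sizeof path, "%s/%s", tmpl, names[i]);
        if (n < 0 || (size_t)n >= sizeof path)
            return -1;
        int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);
        if (fd < 0)
            return -1;
        close(fd);
    }
    return 0;
}
```

With a directory prepared by make_scratch_dir(), count_entry(dir, "a") and count_entry(dir, "b") should each yield 1, while count_entry(dir, ".") may be 0 or 1 depending on whether the file system reports an entry for dot.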
(0006825) agadmin (administrator) 2024-06-21 11:47 |
Interpretation proposed: 21 June 2024 |
(0006837) agadmin (administrator) 2024-07-24 14:32 edited on: 2024-07-25 03:57 |
Interpretation approved: 24 July 2024 |
Issue History | |||
Date Modified | Username | Field | Change |
2024-01-22 15:13 | eblake | New Issue | |
2024-01-22 15:13 | eblake | Name | => Eric Blake |
2024-01-22 15:13 | eblake | Organization | => Red Hat |
2024-01-22 15:13 | eblake | User Reference | => ebb.posix_getdents |
2024-01-22 15:13 | eblake | Section | => XSH posix_getdents |
2024-01-22 15:13 | eblake | Page Number | => 1567 |
2024-01-22 15:13 | eblake | Line Number | => 52609 |
2024-01-22 15:30 | eblake | Note Added: 0006632 | |
2024-01-22 15:39 | eblake | Note Edited: 0006632 | |
2024-02-16 10:18 | corinna_vinschen | Note Added: 0006658 | |
2024-02-29 17:27 | geoffclare | Note Added: 0006695 | |
2024-02-29 17:29 | geoffclare | Final Accepted Text | => Note: 0006695 |
2024-02-29 17:29 | geoffclare | Status | New => Resolution Proposed |
2024-02-29 17:29 | geoffclare | Resolution | Open => Accepted As Marked |
2024-02-29 17:30 | geoffclare | Tag Attached: tc1-2024 | |
2024-03-04 09:38 | corinna_vinschen | Note Added: 0006703 | |
2024-03-07 12:28 | geoffclare | Note Added: 0006708 | |
2024-03-07 15:00 | corinna_vinschen | Note Added: 0006709 | |
2024-03-07 18:14 | geoffclare | Note Added: 0006710 | |
2024-03-07 20:24 | corinna_vinschen | Note Added: 0006711 | |
2024-03-07 20:30 | corinna_vinschen | Note Added: 0006712 | |
2024-03-08 09:00 | geoffclare | Note Added: 0006715 | |
2024-03-08 11:08 | corinna_vinschen | Note Added: 0006716 | |
2024-03-21 16:21 | eblake | Note Added: 0006721 | |
2024-03-21 16:21 | eblake | Resolution | Accepted As Marked => Reopened |
2024-03-21 16:24 | eblake | Note Edited: 0006721 | |
2024-03-21 16:30 | eblake | Note Edited: 0006721 | |
2024-03-21 16:31 | eblake | Final Accepted Text | Note: 0006695 => Note: 0006721 |
2024-03-21 18:58 | corinna_vinschen | Note Added: 0006722 | |
2024-03-22 03:50 | kre | Note Added: 0006723 | |
2024-03-22 04:40 | kre | Note Edited: 0006723 | |
2024-03-22 04:48 | kre | Note Edited: 0006723 | |
2024-03-22 09:48 | geoffclare | Note Added: 0006724 | |
2024-03-22 10:30 | geoffclare | Note Edited: 0006724 | |
2024-03-25 15:20 | eblake | Note Added: 0006726 | |
2024-03-25 15:21 | eblake | Final Accepted Text | Note: 0006721 => Note: 0006726 |
2024-03-25 15:21 | eblake | Resolution | Reopened => Accepted As Marked |
2024-03-25 15:22 | eblake | Note Edited: 0006726 | |
2024-06-17 08:19 | geoffclare | Project | Issue 8 drafts => 1003.1(2024)/Issue8 |
2024-06-17 08:37 | geoffclare | Line Number | 52609 => 52601 |
2024-06-17 08:37 | geoffclare | Interp Status | => Pending |
2024-06-17 08:37 | geoffclare | Status | Resolution Proposed => Interpretation Required |
2024-06-17 08:51 | geoffclare | Note Added: 0006819 | |
2024-06-17 08:53 | geoffclare | Final Accepted Text | Note: 0006726 => Note: 0006819 |
2024-06-21 11:47 | agadmin | Interp Status | Pending => Proposed |
2024-06-21 11:47 | agadmin | Note Added: 0006825 | |
2024-07-24 14:32 | agadmin | Interp Status | Proposed => Approved |
2024-07-24 14:32 | agadmin | Note Added: 0006837 | |
2024-07-25 03:57 | agadmin | Note Edited: 0006837 |