Austin Group Defect Tracker

ID: 0000768
Category: [1003.1(2013)/Issue7+TC1] System Interfaces
Severity: Editorial
Type: Enhancement Request
Date Submitted: 2013-10-11 13:24
Last Update: 2014-12-18 20:57
Reporter: jlayton
View Status: public
Assigned To:
Priority: normal
Resolution: Open
Status: New
Name: Jeff Layton
Organization: Red Hat
User Reference:
Section: fcntl
Page Number: 814-815
Line Number:
Interp Status: ---
Final Accepted Text:
Summary: 0000768: add "fd-private" POSIX locks to spec
Description At this year's Linux Storage and Filesystem summit, there was a lively discussion about what implementors of userland fileservers need. One of the things brought up was the problematic behavior of POSIX locks when a file is closed. The existing spec says:

"All locks associated with a file for a given process shall be removed when a
 file descriptor for that file is closed by that process or the process holding
 that file descriptor terminates."

The problem here is that userland programs may need to open more than one file
descriptor on a file, but they have to keep track of them and refrain from
closing *any* of them until they know that it's ok to release any locks held on
them. This behavior is surprising for most people when they first see it, and it greatly complicates writing certain types of userland programs.
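
A minimal sketch of the behavior described above, assuming a POSIX system with a writable /tmp. The helper names (locked_by_other, classic_lock_survives_close) and the temporary file name are made up for illustration and are not part of the report. The process takes a classic F_WRLCK lock through one descriptor, then opens and closes a second, unrelated descriptor on the same file; a forked child is used to check whether the lock is still there.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that asks, via F_GETLK, whether byte 0
 * of `path` is locked by another process. A child is needed because F_GETLK
 * never reports locks held by the calling process itself.
 * Returns 1 if locked, 0 if not, -1 on error. */
static int locked_by_other(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
                            .l_start = 0, .l_len = 1 };
        int fd = open(path, O_RDWR);
        if (fd < 0 || fcntl(fd, F_GETLK, &fl) < 0)
            _exit(2);
        _exit(fl.l_type == F_UNLCK ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 2 ? -1 : WEXITSTATUS(status);
}

/* Returns 1 if the lock taken on fd1 is still held after an unrelated
 * descriptor on the same file is closed, 0 if it was dropped, -1 on error. */
int classic_lock_survives_close(void)
{
    char path[] = "/tmp/posix-lock-demo-XXXXXX";
    int fd1 = mkstemp(path);
    if (fd1 < 0)
        return -1;

    int result = -1;
    struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
                        .l_start = 0, .l_len = 1 };
    if (fcntl(fd1, F_SETLK, &fl) == 0 && locked_by_other(path) == 1) {
        int fd2 = open(path, O_RDONLY);   /* second descriptor, no locks taken */
        if (fd2 >= 0) {
            close(fd2);                   /* this close drops the lock set via fd1 */
            result = locked_by_other(path);
        }
    }
    close(fd1);
    unlink(path);
    return result;
}
```

Under the wording quoted above, classic_lock_survives_close() is expected to return 0: the lock vanishes even though fd1 is still open.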
Desired Action I've posted an initial rough draft of a patchset for Linux that addresses this by adding two new struct flock.l_type values: F_RDLCKP and F_WRLCKP. In addition to adding this to Linux, I'd like to have it considered for adoption in the formal POSIX spec, since I'm fairly sure this is an issue on other POSIX-y OSes as well.

The initial patch posting is here:

    http://marc.info/?l=linux-fsdevel&m=138149440915513&w=2

...and the semantics for the new lock types are laid out there. The basic idea is to have them behave exactly like any other F_RDLCK or F_WRLCK lock, except that they should only be released on close when the file descriptor on which they were acquired is closed. Closing other file descriptors shouldn't affect them.

This changes how lock merging works a bit. The 'P' lock types can't be merged with 'non-P' types since they have different semantics when a fd is closed. Similarly, 'P' locks acquired on different file descriptors can't be merged either, since they need to be released separately on close.

At this point, the draft patchset is quite rough and I expect it'll need changes in response to review and comments. In the meantime, I'd like to also open the floor for other POSIX players to add feedback on this while I'm trying to finalize the interface.
Tags No tags attached.
Attached Files

Relationships
related to 0000824 (Interpretation Required): children should not inherit fcntl file locks from parent calling posix_spawn

Notes
(0001879)
jilles (reporter)
2013-10-11 15:27

Another file locking API available on Linux and BSDs is flock(), originally from 4.2BSD. Although it only allows locking whole files (no byte-ranges), it has much better semantics: locks are tied to open file descriptions instead of processes. As a result, not only can processes open and close a file another time without affecting the lock, but a child process can inherit a lock from its parent and mutual exclusion works between threads in the same process if they each open the file themselves.
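
For comparison, a rough sketch of the flock() behavior described above, assuming a system that provides flock() in <sys/file.h> (Linux and the BSDs do; it is not in POSIX). The helper names and the temporary file name are invented for this example. Because the lock belongs to the open file description, a second open() of the same file in the same process conflicts with it, and closing that second descriptor leaves the original lock alone.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: open `path` again (a new open file description) and
 * try a non-blocking exclusive flock(). Returns 1 if the file is already
 * locked through some other description, 0 if not, -1 on error. */
static int flock_held_elsewhere(const char *path)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;
    int rc = flock(fd, LOCK_EX | LOCK_NB);
    int held = (rc == 0) ? 0 : (errno == EWOULDBLOCK ? 1 : -1);
    close(fd);   /* releases the probe's own lock, if it got one */
    return held;
}

/* Returns 1 if the flock() lock taken on fd1 is still held after another
 * descriptor on the same file was opened and closed, 0 if not, -1 on error. */
int flock_survives_close(void)
{
    char path[] = "/tmp/flock-demo-XXXXXX";
    int fd1 = mkstemp(path);
    if (fd1 < 0)
        return -1;

    int result = -1;
    if (flock(fd1, LOCK_EX | LOCK_NB) == 0) {
        int fd2 = open(path, O_RDONLY);   /* unrelated second descriptor */
        if (fd2 >= 0) {
            close(fd2);                   /* does not touch the lock on fd1 */
            result = flock_held_elsewhere(path);
        }
    }
    close(fd1);
    unlink(path);
    return result;
}
```

Here flock_survives_close() is expected to return 1, the opposite of what happens with fcntl() locks.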

If fcntl(F_GETLK) returns a lock set with flock() (this is possible on FreeBSD but Linux man pages say it is not possible on Linux because fcntl() and flock() locks are independent), then the l_pid member cannot be a process ID because such a lock is not held by a process as such. FreeBSD sets it to -1 in that case.

The proposed F_RDLCKP and F_WRLCKP types seem to address the close issue only.

So I suggest standardizing flock() and/or a fcntl()-based locking mechanism with the same semantics as flock().
(0001880)
jlayton (reporter)
2013-10-11 15:45

The close() issue is the main one I'm trying to address.

Standardizing flock() wouldn't help much since it doesn't allow you to lock byte-ranges. I've no objection to standardizing that, but it's sort of orthogonal to the problem at hand.

Linux could be altered to return a flock() lock on F_GETLK, but it historically hasn't since flock and POSIX locks operate entirely independently. I don't think that would really buy us much. Do flock and fcntl locks conflict with one another on BSD as well?

The question of inheritance of a lock is more interesting however. It's not 100% clear to me what sort of semantics the people asking for this actually want in that respect, so I'll look into that.

I'm inclined to steer away from introducing an entirely new locking mechanism however, since I think we want these new lock types to conflict vs. "legacy" POSIX locks. IOW, programs written to use the older F_RDLCK/F_WRLCK types should continue to work in parallel with ones written for the new ones and provide the same exclusion guarantees. We could still do that with a new interface, but I worry that it'll be more confusing for developers.
(0001975)
jlayton (reporter)
2013-11-08 12:18

Ok, after looking at this, I think you're correct that emulating flock()'s semantics with respect to close and inheritance is the right thing to do. I've got a new draft patchset that does this.

It does mean that I'll need a new F_UNLCKP variant flag if we stick with the F_SETLK interface, but I think that's a reasonable addition anyway.
(0001985)
Don Cragun (manager)
2013-11-14 16:17

This was discussed during the November 14, 2013 conference call.

Although we agree that a new lock type might be a good idea, before we could consider anything, we would need actual changes to be added to the text to define the behavior of this feature. Furthermore, there would have to be an existing implementation that is shipping as part of a product. This bug will be kept open for a while as a placeholder for such a specification.
(0002062)
jlayton (reporter)
2013-12-10 19:59

Thanks for the consideration so far. I'm still pushing forward with proposed patches on the Linux mailing lists. Once I have the semantics a bit more settled, I'll see about writing up the actual changes to the text.
(0002069)
jlayton (reporter)
2013-12-12 12:05

First rough draft of an update to the text in the spec. This just covers the F_SETLK piece. We may need some updates to the F_GETLK piece. As a side note, documenting this makes it clear that the F_SETLK documentation is pretty complex. Would we be better served by doing this with new cmd values instead, i.e. F_GETLKP and F_SETLKP? Then we could just reuse the existing definitions for F_RDLCK and F_WRLCK.

Thoughts?

--------------------------------------

...existing text:

F_SETLK
    Set or clear a file segment lock according to the lock description
pointed to by the third argument, arg, taken as a pointer to type struct
flock, defined in <fcntl.h>. F_SETLK can establish shared (or read)
locks (F_RDLCK) or exclusive (or write) locks (F_WRLCK), as well as to
remove either type of lock (F_UNLCK). F_RDLCK, F_WRLCK, and F_UNLCK are
defined in <fcntl.h>. If a shared or exclusive lock cannot be set,
fcntl() shall return immediately with a return value of -1.

...new text:

F_SETLK

    Set or clear a file segment lock according to the lock description
pointed to by the third argument, arg, taken as a pointer to type struct
flock, defined in <fcntl.h>. F_SETLK can establish shared (or read)
locks (F_RDLCK and F_RDLCKP) or exclusive (or write) locks (F_WRLCK and
F_WRLCKP), as well as to remove either type of lock (F_UNLCK and
F_UNLCKP). F_RDLCK, F_WRLCK, F_UNLCK, F_RDLCKP, F_WRLCKP, and F_UNLCKP
are defined in <fcntl.h>. If a shared or exclusive lock cannot be set,
fcntl() shall return immediately with a return value of -1. Locks set
with F_RDLCK and F_WRLCK can only be unset with F_UNLCK, and locks set
with F_RDLCKP and F_WRLCKP can only be unset with F_UNLCKP.

---------------[part 2]---------------

...existing text:

All locks associated with a file for a given process shall be removed
when a file descriptor for that file is closed by that process or the
process holding that file descriptor terminates. Locks are not inherited
by a child process.

...new text:

All locks associated with a file for a given process that were set with
F_RDLCK or F_WRLCK shall be removed when a file descriptor for that file
is closed by that process or the process holding that file descriptor
terminates. Also those sorts of locks are not inherited by a child process.

Locks set with F_RDLCKP and F_WRLCKP are removed when the last reference
to the open file on which they were set is closed. These locks are
inherited by child processes.
(0002093)
jlayton (reporter)
2013-12-26 12:29

Now that I've looked at the complexity of adding the above text, I think it might be better to implement this with a new set of cmd values instead, à la:

F_GETLKP
F_SETLKP
F_SETLKPW

I've made that change to the Linux implementation of this patchset and am testing it now. I'll plan to write up a documentation update once that's complete.
(0002163)
eblake (manager)
2014-02-27 16:20

During discussion of 0000824, the question was raised whether it might be useful to have read locks be inheritable across fork/posix_spawn, so that a child can be started with ownership of the same sections locked as the parent instead of having to re-grab the lock. This should be considered when adding new lock constructs (and the conclusion may be that dropping all locks, including read locks, is still the best policy).
(0002168)
jlayton (reporter)
2014-02-27 18:44

With the current Linux implementation, I've taken the suggestion of jilles above and adopted BSD (flock()) lock semantics with respect to inheritance and on close(). Thus, file-private locks are associated with the open file so any entity that gets a reference to that open file won't need to reset locks.
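
A sketch of what that inheritance looks like in practice, assuming Linux 3.15 or later with glibc 2.20 or later. It uses the F_OFD_SETLK command name under which the feature was eventually merged (see the final note on this issue), not the names in use at the time of this note. The helper names, the pipe-based handshake, and the temporary file name are invented for the example. The parent takes a lock, forks, and closes its own descriptor; the child, which shares the open file description, then checks that the lock is still held.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: try a non-blocking write lock on byte 0 of `fd`,
 * owned by fd's open file description. 1 = acquired, 0 = conflict, -1 = error. */
static int ofd_try_wrlock(int fd)
{
    struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
                        .l_start = 0, .l_len = 1, .l_pid = 0 };
    if (fcntl(fd, F_OFD_SETLK, &fl) == 0)
        return 1;
    return (errno == EAGAIN || errno == EACCES) ? 0 : -1;
}

/* Returns 1 if the lock is still held after the parent closed its descriptor
 * (because the child still references the open file), 0 if not, -1 on error. */
int ofd_lock_inherited(void)
{
    char path[] = "/tmp/ofd-inherit-demo-XXXXXX";
    int ready[2];
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    if (ofd_try_wrlock(fd) != 1 || pipe(ready) < 0) {
        close(fd);
        unlink(path);
        return -1;
    }

    pid_t pid = fork();
    if (pid == 0) {
        char c;
        close(ready[1]);
        /* Block until the parent has closed its copy of fd (EOF on the pipe). */
        if (read(ready[0], &c, 1) < 0)
            _exit(2);
        /* Probe through a brand-new open file description. */
        int probe = open(path, O_RDWR);
        int rc = (probe < 0) ? -1 : ofd_try_wrlock(probe);
        _exit(rc == 0 ? 0 : (rc == 1 ? 1 : 2));   /* 0 means "still held" */
    }

    close(ready[0]);
    close(fd);          /* parent drops its reference; the child keeps one */
    close(ready[1]);    /* tell the child to go ahead */

    int status, result = -1;
    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status)) {
        if (WEXITSTATUS(status) == 0)
            result = 1;
        else if (WEXITSTATUS(status) == 1)
            result = 0;
    }
    unlink(path);
    return result;
}
```

With the semantics described above, ofd_lock_inherited() should return 1; the lock only goes away once the child exits and the last reference to the open file is gone.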

The article here explains the semantics in more depth:

     https://lwn.net/Articles/586904/

...though this is not yet merged into the Linux kernel, so the semantics are not yet set in stone.
(0002230)
jlayton (reporter)
2014-04-18 00:01

There is currently a discussion running on several Linux-related mailing lists about what we should call these new locks. The original name I gave them was "file-private" locks but that's not as descriptive as it should be.

The current favorite is "file-description locks" since these follow the open file description, with a corresponding change of the macros to make them more visually distinct:

F_FD_GETLK
F_FD_SETLK
F_FD_SETLKW

...it would be nice to have some input on this front from the "powers that be" at the Austin Group. A link to the discussion on LKML is here:

    https://lkml.org/lkml/2014/4/16/583

Please feel free to chime in on the discussion if you can. Unfortunately, the window to rename these is rather short. We only have around 6 weeks before v3.15 of the kernel ships with this feature, and I need to have this fixed well before then.
(0002508)
jlayton (reporter)
2014-12-18 20:57

Sorry for the long delay on this. I had a job change which ended up sidetracking my efforts here. A progress report:

The locks were renamed to "open file description" (OFD) locks, with the constants as:

    #define F_OFD_GETLK 36
    #define F_OFD_SETLK 37
    #define F_OFD_SETLKW 38

The code was merged into v3.15. Both Samba and NFS-Ganesha are looking to use this new facility to simplify their locking code, but both are currently works in progress.
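
For readers who want to try this, a minimal sketch of the merged interface, assuming Linux 3.15 or later and a glibc that exposes F_OFD_SETLK under _GNU_SOURCE (2.20 or later). The function names and the temporary file name are illustrative only. The lock is taken through one open file description; a second description in the same process is refused the lock, closing an unrelated descriptor changes nothing, and closing the owning descriptor releases it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: try a non-blocking OFD write lock on byte 0 of `fd`.
 * l_pid must be 0 for the F_OFD_* commands.
 * Returns 1 if acquired, 0 on conflict, -1 on any other error. */
static int ofd_try_wrlock(int fd)
{
    struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
                        .l_start = 0, .l_len = 1, .l_pid = 0 };
    if (fcntl(fd, F_OFD_SETLK, &fl) == 0)
        return 1;
    return (errno == EAGAIN || errno == EACCES) ? 0 : -1;
}

/* Returns 1 if the lock survived closing an unrelated descriptor and was
 * released when the owning descriptor was closed, 0 if not, -1 on error. */
int ofd_lock_demo(void)
{
    char path[] = "/tmp/ofd-lock-demo-XXXXXX";
    int owner = mkstemp(path);            /* description that will own the lock */
    if (owner < 0)
        return -1;
    int probe = open(path, O_RDWR);       /* second description, used to test */
    int extra = open(path, O_RDONLY);     /* third description, closed early */
    int result = -1;

    if (probe >= 0 && extra >= 0 && ofd_try_wrlock(owner) == 1) {
        close(extra);                     /* would drop a classic POSIX lock */
        extra = -1;
        int held = ofd_try_wrlock(probe);      /* conflict expected */
        close(owner);                     /* last reference to the owner */
        owner = -1;
        int released = ofd_try_wrlock(probe);  /* should now succeed */
        if (held >= 0 && released >= 0)
            result = (held == 0 && released == 1);
    }
    if (extra >= 0)
        close(extra);
    if (owner >= 0)
        close(owner);
    if (probe >= 0)
        close(probe);
    unlink(path);
    return result;
}
```

On a kernel with OFD lock support, ofd_lock_demo() should return 1. Note that the existing F_RDLCK, F_WRLCK and F_UNLCK lock types are reused; only the cmd value differs.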

When I did the (trivial) patch for glibc, I also added a section to its manual that describes the new locks and their semantics:

http://www.gnu.org/software/libc/manual/html_mono/libc.html#Open-File-Description-Locks

I'm still interested in seeing this adopted into the POSIX standard.

Issue History
Date Modified Username Field Change
2013-10-11 13:24 jlayton New Issue
2013-10-11 13:24 jlayton Name => Jeff Layton
2013-10-11 13:24 jlayton Organization => Red Hat
2013-10-11 13:24 jlayton Section => fcntl
2013-10-11 13:24 jlayton Page Number => 814-815
2013-10-11 13:39 jlayton Issue Monitored: jlayton
2013-10-11 15:27 jilles Note Added: 0001879
2013-10-11 15:45 jlayton Note Added: 0001880
2013-11-08 12:18 jlayton Note Added: 0001975
2013-11-14 16:17 Don Cragun Note Added: 0001985
2013-11-27 17:44 grawity Issue Monitored: grawity
2013-12-04 11:06 Florian Weimer Issue Monitored: Florian Weimer
2013-12-10 19:59 jlayton Note Added: 0002062
2013-12-12 12:05 jlayton Note Added: 0002069
2013-12-26 12:29 jlayton Note Added: 0002093
2014-02-27 16:20 eblake Note Added: 0002163
2014-02-27 16:22 eblake Relationship added related to 0000824
2014-02-27 18:44 jlayton Note Added: 0002168
2014-04-18 00:01 jlayton Note Added: 0002230
2014-12-18 20:57 jlayton Note Added: 0002508

