Austin Group Defect Tracker

Aardvark Mark III


ID 0000852
Category [1003.1(2013)/Issue7+TC1] System Interfaces
Severity Editorial
Type Clarification Requested
Date Submitted 2014-06-28 22:42
Last Update 2014-11-27 10:31
Reporter mdempsky
View Status public
Assigned To
Priority normal
Resolution Accepted As Marked
Status Interpretation Required
Name Matthew Dempsky
Organization OpenBSD
User Reference
Section mmap
Page Number 1324
Line Number 43807
Interp Status Approved
Final Accepted Text Note: 0002334
Summary 0000852: Clarify MAP_FIXED semantics when replacing existing locked mappings
Description munmap() explicitly states that for the specified address range, any existing mappings *and* memory locks shall be removed. This would suggest that formally, memory locks are independent of the underlying mapping.

mmap() describes MAP_FIXED as replacing existing mappings, but it doesn't describe what happens if there are memory locks for the associated memory range. Should they be inherited by the new mapping, or should they be reset (according to the prevailing mlockall() status)?

At least on OpenBSD, they're reset.
Desired Action If existing memory locks should be inherited by MAP_FIXED mappings, then in the ERRORS section change both instances of:

    if required by mlockall()

to:

    if required by mlock() or mlockall()


However, if existing memory locks should instead be reset according to mlockall(), then change the paragraph starting with "When MAP_FIXED is set in the flags argument" from:

    If a MAP_FIXED request is successful, the mapping established by mmap() replaces any previous mappings for the pages in the range [pa,pa+len) of the process.

to:

    If a MAP_FIXED request is successful, then any previous mappings [ML|MLR]or memory locks[/] associated with the address range [pa,pa + len) are removed, as if by an appropriate call to munmap().
Tags tc2-2008
Attached Files

- Relationships

-  Notes
(0002293)
mdempsky (reporter)
2014-06-28 22:56

I suppose another option is to just leave it explicitly unspecified as to what happens if MAP_FIXED replaces a locked mapping. Something like:

    If MAP_FIXED is specified and there are memory locks associated with the address range [pa,pa + len), then the effects are unspecified.
(0002294)
mdempsky (reporter)
2014-06-29 00:20

As best I can tell just by looking at kernel source code, it looks like illumos, FreeBSD, NetBSD, and Linux all behave the same as OpenBSD in this regard.

I've also experimentally verified that this seems to be the case on Linux using the test program below. It makes use of the non-standard RLIMIT_MEMLOCK extension (available on Linux, *BSD, and OS X); not sure if there's a suitable replacement for testing SVR derivatives.

#include <sys/mman.h>
#include <sys/resource.h>
#include <assert.h>
#include <unistd.h>

int
main()
{
    long pagesize = sysconf(_SC_PAGESIZE);
    assert(pagesize >= 1);

    // Lower process memory lock limit to exactly one page.
    struct rlimit rlim;
    assert(0 == getrlimit(RLIMIT_MEMLOCK, &rlim));
    if (rlim.rlim_cur != RLIM_INFINITY)
        assert(rlim.rlim_cur >= (unsigned long)pagesize);
    rlim.rlim_cur = pagesize;
    assert(0 == setrlimit(RLIMIT_MEMLOCK, &rlim));

    // Allocate two pages of memory.
    char *p = mmap(NULL, 2 * pagesize, PROT_READ|PROT_WRITE,
        MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
    assert(p != MAP_FAILED);

    // Assert we're able to lock exactly one of them.
    assert(0 == mlock(p, pagesize));
    assert(-1 == mlock(p + pagesize, pagesize));

    // Map over the first page, and assert it allows us to now
    // lock the second page.
    assert(p == mmap(p, pagesize, PROT_READ|PROT_WRITE,
        MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0));
    assert(0 == mlock(p + pagesize, pagesize));

    // Reset.
    assert(0 == munlock(p, 2 * pagesize));

    // Set MCL_FUTURE, remap over first page again, and assert
    // the new mapping is now locked.
    assert(0 == mlockall(MCL_FUTURE));
    assert(p == mmap(p, pagesize, PROT_READ|PROT_WRITE,
        MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0));
    assert(-1 == mlock(p + pagesize, pagesize));

    return (0);
}
(0002301)
mdempsky (reporter)
2014-07-02 18:02

Hm, strictly speaking "pa+len" is ambiguous, because the type for "pa" is never specified. If we infer it's the same as mmap()'s return type ("void *"), then the expression is technically ill-formed, as pointer arithmetic is only valid on pointers to object type (which "void *" is not). So perhaps "address range [pa,pa+len)" should be expanded to "address range starting at address pa and continuing for len bytes", to mimic the wording in mprotect() and munmap().
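
A minimal sketch of the "void *" point, assuming a hypothetical helper named range_end() (it is not part of POSIX or of this report): ISO C gives no meaning to "pa + len" when pa is a "void *", so code that needs the end of the range converts to "char *" first.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (illustration only): return the address one past
 * the end of the range that starts at pa and continues for len bytes.
 */
static void *
range_end(void *pa, size_t len)
{
    assert(pa != NULL);
    /* "pa + len" is not valid ISO C for void *; go through char *. */
    return (char *)pa + len;
}
```

Some compilers accept arithmetic on "void *" as an extension, which is likely why the [pa,pa+len) notation reads naturally even though it is not strictly well-formed.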
(0002302)
markh (reporter)
2014-07-04 17:04

Does that expansion not strictly speaking add a new ambiguity, in that the direction is no longer specified? Also note that the ill-formed [addr,addr+len) is also used in the wording for mmap() [ENOMEM], mprotect(), and munmap().
(0002303)
mdempsky (reporter)
2014-07-05 10:13

markh: I would argue the wording is sufficiently clear as is. C99 in several places refers to the "beginning", "start", or "end" of objects/arrays without belaboring the meaning. And if the address range were to proceed backwards, then pa would no longer be its start.
(0002311)
joerg (reporter)
2014-07-17 16:26
edited on: 2014-07-17 17:03

Could you explain why you believe that illumos does the same as OpenBSD?

The code you posted does not compile on Solaris, as there is no RLIMIT_MEMLOCK.

On Solaris the program then fails with:

Assertion failed: -1 == mlock(p + pagesize, pagesize), file mm.c, line 29

Could you help?

BTW: I have been able to verify in the Solaris source that mmap() with MAP_FIXED
indeed calls the full equivalent of munmap(). What I cannot say is what
happens with the locking state of the new memory.

(0002312)
mdempsky (reporter)
2014-07-17 18:16

joerg: Yes, unfortunately like I said, the test program uses the non-portable RLIMIT_MEMLOCK resource limit which doesn't seem to be available outside of Linux, OS X, and *BSD. I recall seeing Solaris has some memory locking limit that can be associated with jails, tasks, or something, but I'm not familiar enough with Solaris to know how to write a test program to utilize those (or if that's even possible).

As for why I think illumos has the same behavior, I'm not very familiar with its VM internals, but here's what I was able to reason about from browsing src.illumos.org:

In uts/common/os/grow.c, the munmap() system call appears to be implemented as a call to lwpchan_delete_mapping() and as_unmap(). POSIX explicitly requires munmap() to unlock wired memory, so I infer that calling both of these in sequence should ensure the memory is unlocked. (I suspect as_unmap() alone is actually responsible for the unlocking, but the rest of my reasoning below doesn't depend on this detail.)

Also in uts/common/os/grow.c, the smmap32() and smmap64() system call entry points both call into smmap_common(). Considering these two cases separately:

  1. If fp == NULL (i.e., MAP_ANON was specified), then smmap_common() will call lwpchan_delete_mapping() and then call zmap(), which calls choose_addr(), which if MAP_FIXED was specified uses as_unmap() to remove any existing mappings.

  2. If fp != NULL, then smmap_common() will still (but much later) call lwpchan_delete_mapping() and then call VOP_MAP(). I haven't reviewed every vop_map implementation, but at least udf_map(), ufs_map(), and zfs_map() all unconditionally call choose_addr() (as zmap() did above in case #1).

Therefore when MAP_FIXED is specified, illumos's mmap() appears to always call the same two functions used to implement munmap() (lwpchan_delete_mapping() and as_unmap()). So it seems reasonable to conclude that on illumos the MAP_FIXED flag means the old mappings will be removed as if by munmap().

Please advise if I've misinterpreted the code.
(0002313)
martinr (reporter)
2014-07-18 07:48

Solaris mlock(3C) man page says:

     If the mapping through which an mlock() has been performed
     is removed, an munlock() is implicitly performed. An mun-
     lock() is also performed implicitly when a page is deleted
     through file removal or truncation.

Test program:
    char *p = mmap(NULL, 2 * pagesize, PROT_READ|PROT_WRITE,
        MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
    mlock(p, 2 * pagesize);
    munmap(p, 2 * pagesize);

Dtrace:
 CPU FUNCTION
   6 => munmap munmap syscall entry
   6 -> munmap
...
   6 -> as_unmap unmap the mapping from the address space
...
   6 -> segvn_unmap unmap operation of seg_vn driver
   6 -> segvn_lockop call for unlock the mapping

The kernel's seg_vn segment driver is in use when the mapping is allocated using
mmap() with the MAP_PRIVATE|MAP_ANONYMOUS flags.
(0002316)
mdempsky (reporter)
2014-07-18 16:03

martinr: Thanks. Can you repeat your test but replace "munmap(p, 2 * pagesize);" with "mmap(p, 2 * pagesize, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0);" and confirm that segvn_lockop() is still called to unlock the original mapping?
(0002323)
martinr (reporter)
2014-07-24 16:36

When the second mmap() is called, segvn_lockop() is invoked with the UNLOCK operation to release the lock. During the as_map() part, segvn_lockop() is not called with the LOCK operation to restore the underlying lock. That means there is no lock on the overlapping region unless mlock*() is explicitly called.
(0002334)
geoffclare (manager)
2014-08-07 15:37
edited on: 2014-08-07 15:51

Interpretation response
------------------------

The standard does not speak to this issue, and as such no conformance
distinction can be made between alternative implementations based on
this. This is being referred to the sponsor.

Rationale:
-------------

We believe this is a clarification of existing practice; removing locks when remapping existing regions is important to applications but the standard does not clearly specify what is supposed to happen in this case.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------

On page 1324 lines 43807-43808 (mmap() description), change:
If a MAP_FIXED request is successful, the mapping established by mmap() replaces any previous mappings for the pages in the range [pa,pa+len) of the process.

to:
If a MAP_FIXED request is successful, then any previous mappings [ML|MLR]or memory locks[/] for those whole pages containing any part of the address range [pa,pa+len) shall be removed, as if by an appropriate call to munmap(), before the new mapping is established.
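
A minimal sketch of how "those whole pages containing any part of the address range [pa,pa+len)" could be computed, assuming a power-of-two page size and ignoring address overflow. The struct page_span type and the covering_pages() name are hypothetical; neither appears in the standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: the page-aligned span that covers a byte range. */
struct page_span {
    uintptr_t start;   /* address of the first affected page */
    size_t    len;     /* length in bytes, a multiple of the page size */
};

/*
 * Round the start of [pa, pa+len) down and its end up to page boundaries.
 * Assumes pagesize is a power of two and pa + len does not overflow.
 */
static struct page_span
covering_pages(uintptr_t pa, size_t len, size_t pagesize)
{
    struct page_span s;
    uintptr_t mask, end;

    assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);
    mask = (uintptr_t)pagesize - 1;
    end = (pa + len + mask) & ~mask;    /* round the end up */
    s.start = pa & ~mask;               /* round the start down */
    s.len = (size_t)(end - s.start);
    return s;
}
```

For example, with a 4096-byte page size, a request covering 4096 bytes starting 16 bytes into a page touches two whole pages, so previous mappings and locks on both pages would be removed under the new wording.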


On page 1326 lines 43928-43937 (mmap() rationale) change:
If an application requests a mapping that would overlay existing mappings in the process, it might be desirable that an implementation detect this and inform the application. However, the default, portable (not MAP_FIXED) operation does not overlay existing mappings. On the other hand, if the program specifies a fixed address mapping (which requires some implementation knowledge to determine a suitable address, if the function is supported at all), then the program is presumed to be successfully managing its own address space and should be trusted when it asks to map over existing data structures. Furthermore, it is also desirable to make as few system calls as possible, and it might be considered onerous to require an munmap() before an mmap() to the same address range. This volume of POSIX.1-2008 specifies that the new mappings replace any existing mappings, following existing practice in this regard.

to:
If an application requests a mapping that overlaps existing mappings in the process, it might be desirable that an implementation detect this and inform the application. However, if the program specifies a fixed address mapping (which requires some implementation knowledge to determine a suitable address, if the function is supported at all), then the program is presumed to be successfully managing its own address space and should be trusted when it asks to map over existing data structures. Furthermore, it is also desirable to make as few system calls as possible, and it might be considered onerous to require an munmap() before an mmap() to the same address range. This volume of POSIX.1-2008 specifies that the new mapping replaces any existing mappings (implying an automatic munmap() on the address range), following existing practice in this regard. The standard developers also considered whether there should be a way for new mappings to overlay existing mappings, but found no existing practice for this.
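
A minimal sketch of the "automatic munmap()" reading, assuming an anonymous private mapping and the non-standard MAP_ANONYMOUS flag used by the test program in Note 0002294. The helper name remap_fixed_explicit() is hypothetical. It spells out as two calls what a single successful MAP_FIXED request does implicitly under the revised text.

```c
#define _DEFAULT_SOURCE   /* glibc: expose MAP_ANONYMOUS in strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (illustration only): replace whatever is mapped at
 * [pa, pa+len) with a fresh anonymous mapping, doing the unmap explicitly.
 */
static void *
remap_fixed_explicit(void *pa, size_t len, int prot)
{
    assert(pa != NULL && len != 0);

    /* Step 1: remove previous mappings and memory locks for the range. */
    if (munmap(pa, len) == -1)
        return MAP_FAILED;

    /* Step 2: establish the new mapping at the same address. */
    return mmap(pa, len, prot,
        MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
}
```

Unlike a single mmap() call with MAP_FIXED, the two-step form leaves a window between the calls in which another thread could map something into the range, which is one more reason an application would prefer the single call.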


On page 1328 line 43983 (mmap() rationale) change MEMLOCK_FUTURE to MCL_FUTURE.

(0002406)
ajosey (manager)
2014-10-06 07:44

Interpretation proposed 6 October 2014
(0002447)
ajosey (manager)
2014-11-27 10:31

Interpretation approved 27 November 2014

- Issue History
Date Modified Username Field Change
2014-06-28 22:42 mdempsky New Issue
2014-06-28 22:42 mdempsky Name => Matthew Dempsky
2014-06-28 22:42 mdempsky Organization => OpenBSD
2014-06-28 22:42 mdempsky Section => mmap
2014-06-28 22:42 mdempsky Page Number => http://pubs.opengroup.org/onlinepubs/9699919799/functions/mmap.html
2014-06-28 22:56 mdempsky Note Added: 0002293
2014-06-29 00:20 mdempsky Note Added: 0002294
2014-07-02 18:02 mdempsky Note Added: 0002301
2014-07-04 17:04 markh Note Added: 0002302
2014-07-05 10:13 mdempsky Note Added: 0002303
2014-07-17 16:26 joerg Note Added: 0002311
2014-07-17 17:02 joerg Note Edited: 0002311
2014-07-17 17:03 joerg Note Edited: 0002311
2014-07-17 18:16 mdempsky Note Added: 0002312
2014-07-18 07:48 martinr Note Added: 0002313
2014-07-18 16:03 mdempsky Note Added: 0002316
2014-07-24 16:36 martinr Note Added: 0002323
2014-08-07 15:37 geoffclare Note Added: 0002334
2014-08-07 15:38 geoffclare Interp Status => Pending
2014-08-07 15:38 geoffclare Final Accepted Text => Note: 0002334
2014-08-07 15:38 geoffclare Status New => Interpretation Required
2014-08-07 15:38 geoffclare Resolution Open => Accepted As Marked
2014-08-07 15:39 geoffclare Tag Attached: tc2-2008
2014-08-07 15:51 geoffclare Note Edited: 0002334
2014-08-07 15:52 geoffclare Page Number http://pubs.opengroup.org/onlinepubs/9699919799/functions/mmap.html => 1324
2014-08-07 15:52 geoffclare Line Number => 43807
2014-10-06 07:44 ajosey Interp Status Pending => Proposed
2014-10-06 07:44 ajosey Note Added: 0002406
2014-11-27 10:31 ajosey Interp Status Proposed => Approved
2014-11-27 10:31 ajosey Note Added: 0002447

