Notes |
(0002293)
mdempsky (reporter)
2014-06-28 22:56
|
I suppose another option is to just leave it explicitly unspecified as to what happens if MAP_FIXED replaces a locked mapping. Something like:
If MAP_FIXED is specified and there are memory locks associated with the address range [pa,pa + len), then the effects are unspecified. |
|
(0002294)
mdempsky (reporter)
2014-06-29 00:20
|
As best I can tell just by looking at kernel source code, it looks like illumos, FreeBSD, NetBSD, and Linux all behave the same as OpenBSD in this regard.
I've also experimentally verified that this seems to be the case on Linux using the test program below. It makes use of the non-standard RLIMIT_MEMLOCK extension (available on Linux, *BSD, and OS X); not sure if there's a suitable replacement for testing SVR derivatives.
#include <sys/mman.h>
#include <sys/resource.h>
#include <assert.h>
#include <unistd.h>
int
main()
{
    long pagesize = sysconf(_SC_PAGESIZE);
    assert(pagesize >= 1);
    // Lower process memory lock limit to exactly one page.
    struct rlimit rlim;
    assert(0 == getrlimit(RLIMIT_MEMLOCK, &rlim));
    if (rlim.rlim_cur != RLIM_INFINITY)
        assert(rlim.rlim_cur >= (unsigned long)pagesize);
    rlim.rlim_cur = pagesize;
    assert(0 == setrlimit(RLIMIT_MEMLOCK, &rlim));
    // Allocate two pages of memory.
    char *p = mmap(NULL, 2 * pagesize, PROT_READ|PROT_WRITE,
        MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
    assert(p != MAP_FAILED);
    // Assert we're able to lock exactly one of them.
    assert(0 == mlock(p, pagesize));
    assert(-1 == mlock(p + pagesize, pagesize));
    // Map over the first page, and assert it allows us to now
    // lock the second page.
    assert(p == mmap(p, pagesize, PROT_READ|PROT_WRITE,
        MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0));
    assert(0 == mlock(p + pagesize, pagesize));
    // Reset.
    assert(0 == munlock(p, 2 * pagesize));
    // Set MCL_FUTURE, remap over first page again, and assert
    // the new mapping is now locked.
    assert(0 == mlockall(MCL_FUTURE));
    assert(p == mmap(p, pagesize, PROT_READ|PROT_WRITE,
        MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0));
    assert(-1 == mlock(p + pagesize, pagesize));
    return (0);
} |
|
(0002301)
mdempsky (reporter)
2014-07-02 18:02
|
Hm, strictly speaking "pa+len" is ambiguous, because the type for "pa" is never specified. If we infer it's the same as mmap()'s return type ("void *"), then the expression is technically ill-formed, as pointer arithmetic is only valid on pointers to object type (which "void *" is not). So perhaps "address range [pa,pa+len)" should be expanded to "address range starting at address pa and continuing for len bytes", to mimic the wording in mprotect() and munmap(). |
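To illustrate the point in C terms: ISO C does not define "pa + len" when pa has type "void *" (some compilers accept it as an extension), so code that needs the end of the range converts to a character pointer first. A minimal sketch, with range_end() being a hypothetical name used only for illustration:

```c
#include <stddef.h>

/*
 * Hypothetical helper: address one past the end of the len bytes that
 * start at pa. Arithmetic is done on char *, which is well-defined,
 * rather than on void *, which ISO C does not allow.
 */
static void *
range_end(void *pa, size_t len)
{
    return (char *)pa + len;
}
```

The half-open range [pa,pa+len) then corresponds to every address q with pa <= q < range_end(pa, len).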
|
(0002302)
markh (reporter)
2014-07-04 17:04
|
Does that expansion not strictly speaking add a new ambiguity, in that the direction is no longer specified? Also note that the ill-formed [addr,addr+len) is also used in the wording for mmap() [ENOMEM], mprotect(), and munmap(). |
|
(0002303)
mdempsky (reporter)
2014-07-05 10:13
|
markh: I would argue the wording is sufficiently clear as is. C99 in several places refers to the "beginning", "start", or "end" of objects/arrays without belaboring the meaning. And if the address range were to proceed backwards, then pa would no longer be its start. |
|
(0002311)
joerg (reporter)
2014-07-17 16:26
edited on: 2014-07-17 17:03
|
Could you explain why you believe that illumos does the same as OpenBSD?
Your code does not compile on Solaris, as there is no RLIMIT_MEMLOCK.
On Solaris the program then fails with:
Assertion failed: -1 == mlock(p + pagesize, pagesize), file mm.c, line 29
Could you help?
BTW: I have been able to verify in the Solaris source that mmap() with MAP_FIXED
indeed calls the full equivalent of munmap(). What I cannot say is what
happens with the locking state of the new memory.
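One way to keep such a test compiling on systems without RLIMIT_MEMLOCK is to guard the limit handling with the preprocessor. This is only a sketch: lower_memlock_limit() is a hypothetical helper, and where the limit is unavailable the test's "second mlock() must fail" assertions are no longer meaningful and would have to be skipped.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: lower the soft memory-lock limit to `bytes`.
 * Returns 0 on success, -1 on error (including when the current soft
 * limit is already below `bytes`), and 1 when RLIMIT_MEMLOCK is not
 * provided by the system, in which case the caller should skip any
 * check that relies on the limit.
 */
static int
lower_memlock_limit(rlim_t bytes)
{
#ifdef RLIMIT_MEMLOCK
    struct rlimit rlim;

    if (getrlimit(RLIMIT_MEMLOCK, &rlim) == -1)
        return -1;
    /* Only ever lower the limit; refuse if it is already smaller. */
    if (rlim.rlim_cur != RLIM_INFINITY && rlim.rlim_cur < bytes)
        return -1;
    rlim.rlim_cur = bytes;
    return setrlimit(RLIMIT_MEMLOCK, &rlim) == -1 ? -1 : 0;
#else
    (void)bytes;
    return 1;
#endif
}
```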
|
|
(0002312)
mdempsky (reporter)
2014-07-17 18:16
|
joerg: Yes, unfortunately like I said, the test program uses the non-portable RLIMIT_MEMLOCK resource limit which doesn't seem to be available outside of Linux, OS X, and *BSD. I recall seeing Solaris has some memory locking limit that can be associated with jails, tasks, or something, but I'm not familiar enough with Solaris to know how to write a test program to utilize those (or if that's even possible).
As for why I think illumos has the same behavior, I'm not very familiar with its VM internals, but here's what I was able to reason about from browsing src.illumos.org:
In uts/common/os/grow.c, the munmap() system call appears to be implemented as a call to lwpchan_delete_mapping() and as_unmap(). POSIX explicitly requires munmap() to unlock wired memory, so I infer that calling both of these in sequence should ensure the memory is unlocked. (I suspect as_unmap() alone is actually responsible for the unlocking, but the rest of my reasoning below doesn't depend on this detail.)
Also in uts/common/os/grow.c, the smmap32() and smmap64() system call entry points both call into smmap_common(). Considering these two cases separately:
1. If fp == NULL (i.e., MAP_ANON was specified), then smmap_common() will call lwpchan_delete_mapping() and then call zmap(), which calls choose_addr(), which if MAP_FIXED was specified uses as_unmap() to remove any existing mappings.
2. If fp != NULL, then smmap_common() will still (but much later) call lwpchan_delete_mapping() and then call VOP_MAP(). I haven't reviewed every vop_map implementation, but at least udf_map(), ufs_map(), and zfs_map() all unconditionally call choose_addr() (as zmap() did above in case #1).
Therefore when MAP_FIXED is specified, illumos's mmap() appears to always call the same two functions used to implement munmap() (lwpchan_delete_mapping() and as_unmap()). So it seems reasonable to conclude that on illumos the MAP_FIXED flag means the old mappings will be removed as if by munmap().
Please advise if I've misinterpreted the code. |
|
(0002313)
martinr (reporter)
2014-07-18 07:48
|
Solaris mlock(3C) man page says:
If the mapping through which an mlock() has been performed
is removed, an munlock() is implicitly performed. An munlock()
is also performed implicitly when a page is deleted
through file removal or truncation.
Test program:
char *p = mmap(NULL, 2 * pagesize, PROT_READ|PROT_WRITE,
    MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
mlock(p, 2 * pagesize);
munmap(p, 2 * pagesize);
Dtrace:
CPU FUNCTION
6 => munmap munmap syscall entry
6 -> munmap
...
6 -> as_unmap unmap the mapping from the address space
...
6 -> segvn_unmap unmap operation of seg_vn driver
6 -> segvn_lockop call for unlock the mapping
The kernel's seg_vn segment driver is in use when the mapping was allocated using
mmap() with the MAP_PRIVATE|MAP_ANONYMOUS flags. |
|
(0002316)
mdempsky (reporter)
2014-07-18 16:03
|
martinr: Thanks. Can you repeat your test but replace "munmap(p, 2 * pagesize);" with "mmap(p, 2 * pagesize, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0);" and confirm that segvn_lockop() is still called to unlock the original mapping? |
|
(0002323)
martinr (reporter)
2014-07-24 16:36
|
When the second mmap() is called, segvn_lockop() is called with the UNLOCK operation to release the lock. During the as_map() part, segvn_lockop() is not called with the LOCK operation to restore the underlying lock. That means there is no lock on the overlapping region unless mlock*() is explicitly called. |
|
(0002334)
geoffclare (manager)
2014-08-07 15:37
edited on: 2014-08-07 15:51
|
Interpretation response
------------------------
The standard does not speak to this issue, and as such no conformance
distinction can be made between alternative implementations based on
this. This is being referred to the sponsor.
Rationale:
-------------
We believe this is a clarification of existing practice; removing locks when remapping existing regions is important to applications but the standard does not clearly specify what is supposed to happen in this case.
Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
On page 1324 lines 43807-43808 (mmap() description), change:
If a MAP_FIXED request is successful, the mapping established by mmap() replaces any previous mappings for the pages in the range [pa,pa+len) of the process.
to:
If a MAP_FIXED request is successful, then any previous mappings [ML|MLR]or memory locks[/] for those whole pages containing any part of the address range [pa,pa+len) shall be removed, as if by an appropriate call to munmap(), before the new mapping is established.
On page 1326 lines 43928-43937 (mmap() rationale) change:
If an application requests a mapping that would overlay existing mappings in the process, it might be desirable that an implementation detect this and inform the application. However, the default, portable (not MAP_FIXED) operation does not overlay existing mappings. On the other hand, if the program specifies a fixed address mapping (which requires some implementation knowledge to determine a suitable address, if the function is supported at all), then the program is presumed to be successfully managing its own address space and should be trusted when it asks to map over existing data structures. Furthermore, it is also desirable to make as few system calls as possible, and it might be considered onerous to require an munmap() before an mmap() to the same address range. This volume of POSIX.1-2008 specifies that the new mappings replace any existing mappings, following existing practice in this regard.
to:
If an application requests a mapping that overlaps existing mappings in the process, it might be desirable that an implementation detect this and inform the application. However, if the program specifies a fixed address mapping (which requires some implementation knowledge to determine a suitable address, if the function is supported at all), then the program is presumed to be successfully managing its own address space and should be trusted when it asks to map over existing data structures. Furthermore, it is also desirable to make as few system calls as possible, and it might be considered onerous to require an munmap() before an mmap() to the same address range. This volume of POSIX.1-2008 specifies that the new mapping replaces any existing mappings (implying an automatic munmap() on the address range), following existing practice in this regard. The standard developers also considered whether there should be a way for new mappings to overlay existing mappings, but found no existing practice for this.
On page 1328 line 43983 (mmap() rationale) change MEMLOCK_FUTURE to MCL_FUTURE.
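As an illustration of "those whole pages containing any part of the address range [pa,pa+len)", the affected range can be found by rounding the start down and the end up to page boundaries. This is a sketch only: struct page_range and covering_pages() are hypothetical names, addresses are handled as integers, and len > 0 with no address overflow is assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: half-open range [start, end) of page-aligned addresses. */
struct page_range {
    uintptr_t start;
    uintptr_t end;
};

/*
 * Hypothetical helper: smallest page-aligned range that contains every
 * page touched by [pa, pa + len). Assumes len > 0 and no overflow.
 */
static struct page_range
covering_pages(uintptr_t pa, size_t len, size_t pagesize)
{
    struct page_range r;
    uintptr_t last = pa + len;  /* one past the final byte */

    r.start = pa - (pa % pagesize);                           /* round down */
    r.end = last + (pagesize - last % pagesize) % pagesize;   /* round up */
    return r;
}
```

For example, with 4096-byte pages, a request for one byte at address 0x1000 covers the single page [0x1000,0x2000).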
|
|
(0002406)
ajosey (manager)
2014-10-06 07:44
|
Interpretation proposed 6 October 2014 |
|
(0002447)
ajosey (manager)
2014-11-27 10:31
|
Interpretation approved 27 November 2014 |
|