Austin Group Defect Tracker



Viewing Issue Simple Details
ID 0000811
Category [1003.1(2013)/Issue7+TC1] System Interfaces
Severity Objection
Type Clarification Requested
Date Submitted 2013-12-18 20:06
Last Update 2014-06-19 16:13
Reporter torvald
View Status public
Assigned To
Priority normal
Resolution Accepted As Marked
Status Resolved
Name Torvald Riegel
Organization Red Hat
User Reference
Section pthread_mutex_destroy
Page Number 1643, 1647
Line Number 53249-53251, 53423-53444
Interp Status ---
Final Accepted Text Note: 0002267
Summary 0000811: precondition for mutex destruction unclear; example contradicts normative text
Description The precondition for mutex destruction is unclear because the informative example seems to contradict the normative text. The latter states: "Attempting to destroy [...] a mutex that is referenced [...] by another thread results in undefined behavior." It can be argued that a mutex can be considered "referenced" as long as a call to pthread_mutex_unlock has not yet returned to the caller.

This would make the reference-counting example incorrect, because nothing orders the return of unlock() in the threads that do not destroy the mutex (i.e., nothing communicates that those unlock() calls have returned to their callers) before the last lock/unlock by the thread that then destroys the mutex. In other words, the example program does not ensure that the mutex is not referenced during destruction. However, the conceptual unblocking of the mutex is communicated by the last thread being able to acquire it; the standard should clarify what the precise precondition for destruction is.
Desired Action If the intended precondition for destruction is that the mutex (i.e., the pthread_mutex_t structure) is not referenced by other threads concurrently, then the example should be removed or changed.

If the intended precondition for destruction is that the mutex has been conceptually unblocked as observable to the program by being able to acquire it, then "referenced" should be clarified. I guess that it would be sufficient to state that the program is required to not have concurrent or pending lock/unlock calls anymore on this mutex; this would fit the condvar case I suspect because any mutexes 'referenced' by a condvar will also get locked eventually. Note that this choice has significant consequences for how locks can be implemented on some platforms; I can elaborate on this if this would be helpful.
Tags tc2-2008
Attached Files

- Relationships

-  Notes
(0002081)
torvald (reporter)
2013-12-18 20:07

This should be in "System Interfaces", sorry.
(0002082)
Don Cragun (manager)
2013-12-18 20:10

Category changed from Base Definitions to System Interfaces
(0002083)
dalias (reporter)
2013-12-18 20:45

I'd like to speak up in support of the interpretation that supports self-synchronized destruction, i.e. where the release of "reference" to a mutex happens atomically with unlocking it. As I stated on the glibc bug tracker ticket (13690) associated with this request for interpretation, I believe it is difficult for application programmers to safely use mutexes in dynamically allocated memory without this guarantee, and misuses are extremely hard to detect (reproducing the associated crashes may take months or years of cpu time).

Certainly there are some cases where getting by without self-synchronized destruction is easy. For example, if you're freeing an object that's part of a larger data structure, acquiring the lock on the larger data structure before locking the individual object assures that another thread cannot still be in the tail part of the pthread_mutex_unlock call for the individual object when you obtain the mutex. But in general it's not so easy. One particularly bad case is reference-counted objects shared between threads where the thread relinquishing the last reference destroys the internal mutex and frees the object. In this case there is no additional synchronization; you have to rely on self-synchronized destruction.
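As a hedged illustration of the easy case described in this note, the following minimal C sketch nests every use of a per-item mutex inside a lock on the enclosing structure. Under that assumption, once the remover has held the outer lock, no other thread can still be inside pthread_mutex_unlock() for the item. The names struct table, struct item, table_init, table_add and table_remove are hypothetical and do not come from the standard or from this report.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-item object with its own mutex. */
struct item {
    pthread_mutex_t m;
    int value;
};

/* Hypothetical container; a single slot keeps the sketch short. */
struct table {
    pthread_mutex_t lock;   /* outer lock, always taken first */
    struct item *slot;
};

int table_init(struct table *t, int value)
{
    if (pthread_mutex_init(&t->lock, NULL) != 0)
        return -1;
    t->slot = malloc(sizeof *t->slot);
    if (t->slot == NULL || pthread_mutex_init(&t->slot->m, NULL) != 0) {
        free(t->slot);
        t->slot = NULL;
        pthread_mutex_destroy(&t->lock);
        return -1;
    }
    t->slot->value = value;
    return 0;
}

/* Every lock/unlock of the item mutex happens while the outer lock is held. */
int table_add(struct table *t, int delta, int *out)
{
    int rc = -1;
    pthread_mutex_lock(&t->lock);
    if (t->slot != NULL) {
        pthread_mutex_lock(&t->slot->m);
        t->slot->value += delta;
        *out = t->slot->value;
        pthread_mutex_unlock(&t->slot->m);
        rc = 0;
    }
    pthread_mutex_unlock(&t->lock);
    return rc;
}

/* Unlink the item under the outer lock, then destroy it. Any earlier user
   returned from pthread_mutex_unlock(&item->m) before releasing the outer
   lock, so the item mutex is no longer in use by another thread. */
void table_remove(struct table *t)
{
    pthread_mutex_lock(&t->lock);
    struct item *it = t->slot;
    t->slot = NULL;
    pthread_mutex_unlock(&t->lock);
    if (it != NULL) {
        pthread_mutex_destroy(&it->m);
        free(it);
    }
}
```

The outer lock itself is never destroyed in this sketch, so the question of when it may be destroyed is left open here.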
(0002167)
eblake (manager)
2014-02-27 17:35

Here's a mailing list message (along with other messages in the same thread) that demonstrates that the current standard's silence on whether self-synchronized destruction is possible is less than ideal, in that the code ended up having to rely on a pthread_join before destroying a mutex in order to ensure that there was no overlap.
https://lists.gnu.org/archive/html/qemu-devel/2014-02/msg04583.html
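A minimal sketch of the join-before-destroy workaround mentioned in this note might look like the following. It is not the code from the linked thread; struct job, worker and run_job are hypothetical names, and the fixed MAX_THREADS limit is an assumption made only to keep the example short.

```c
#include <pthread.h>

/* Hypothetical shared state protected by a mutex. */
struct job {
    pthread_mutex_t m;
    int counter;
};

static void *worker(void *arg)
{
    struct job *j = arg;
    pthread_mutex_lock(&j->m);
    j->counter++;
    pthread_mutex_unlock(&j->m);
    return NULL;
}

/* Starts nthreads workers and returns how many incremented the counter,
   or -1 if the arguments or the mutex setup are unusable. */
int run_job(int nthreads)
{
    enum { MAX_THREADS = 16 };
    pthread_t tid[MAX_THREADS];
    struct job j;
    int started = 0;

    if (nthreads < 0 || nthreads > MAX_THREADS)
        return -1;
    if (pthread_mutex_init(&j.m, NULL) != 0)
        return -1;
    j.counter = 0;

    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[i], NULL, worker, &j) != 0)
            break;
        started++;
    }

    /* Joining every worker guarantees that each pthread_mutex_unlock()
       call has returned before the mutex is destroyed. */
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    pthread_mutex_destroy(&j.m);
    return j.counter;
}
```

The join is stronger than strictly needed, since it waits for the whole thread to finish rather than just the unlock call, which is the cost that the note describes as less than ideal.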
(0002188)
geoffclare (manager)
2014-03-20 15:38

It would be good if we could make the normative text less vague. Where it says "a mutex that is referenced (for example, while being used in a pthread_cond_timedwait() or pthread_cond_wait()) by another thread" what other types of use besides those mentioned as examples are intended to count as "references"? Could we just change it to: "a mutex that is being used in a pthread_cond_timedwait() or pthread_cond_wait() by another thread"?
(0002191)
wlerch (reporter)
2014-03-20 16:45

What if the mutex is being used in a pthread_mutex_lock() by another thread -- should that not count too?
(0002192)
geoffclare (manager)
2014-03-20 16:58

Good point. How about:

Attempting to destroy a locked mutex, or a mutex that another thread is attempting to lock, or a mutex that is being used in a pthread_cond_timedwait() or pthread_cond_wait() call by another thread, results in undefined behavior.
(0002193)
wlerch (reporter)
2014-03-20 17:21

Is the concern here that pthread_mutex_unlock() may be non-atomic in the sense that another thread might be able to lock (and unlock, and destroy) the mutex before pthread_mutex_unlock() is done accessing the contents of the pthread_mutex_t?
(0002194)
dalias (reporter)
2014-03-20 18:31

Yes, that's exactly the concern, and real world implementations (glibc) have this issue. The most problematic aspect of it is that it precludes freeing reference-counted objects, as in:

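/* Drop one reference while holding the object's mutex. */
pthread_mutex_lock(&obj->m);
int ref = --obj->ref;
pthread_mutex_unlock(&obj->m);
if (!ref) {
    /* Last reference: another thread may still be inside its own
       pthread_mutex_unlock() call on obj->m at this point. */
    pthread_mutex_destroy(&obj->m);
    free(obj);
}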

Of course nobody notices the bug because it's a race condition that gets hit once in an interval somewhere on the order of cpu-years...
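For comparison, here is a hedged sketch of one way an application could avoid depending on self-synchronized destruction. This is a swapped-in technique, not something proposed in this report: the reference count is kept in a C11 atomic outside the mutex, and a thread drops its reference only after all of its own lock/unlock calls have returned. The names struct obj, obj_new, obj_get and obj_put are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical reference-counted object. The mutex protects only `data`;
   the count is atomic and is never touched under the mutex. */
struct obj {
    pthread_mutex_t m;
    atomic_int ref;
    int data;
};

struct obj *obj_new(void)
{
    struct obj *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    if (pthread_mutex_init(&o->m, NULL) != 0) {
        free(o);
        return NULL;
    }
    atomic_init(&o->ref, 1);   /* the creator holds the first reference */
    o->data = 0;
    return o;
}

/* The caller must already hold a reference. */
void obj_get(struct obj *o)
{
    atomic_fetch_add_explicit(&o->ref, 1, memory_order_relaxed);
}

/* Drops one reference; returns 1 if this call destroyed the object.
   The acq_rel decrement orders every earlier pthread_mutex_unlock() made
   by any reference holder before the destroy in the last thread. */
int obj_put(struct obj *o)
{
    if (atomic_fetch_sub_explicit(&o->ref, 1, memory_order_acq_rel) != 1)
        return 0;
    pthread_mutex_destroy(&o->m);
    free(o);
    return 1;
}
```

The trade-off is that the count can no longer be updated together with other state under the mutex, which is exactly the convenience that the pattern in this note relies on.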
(0002195)
wlerch (reporter)
2014-03-20 18:37
edited on: 2014-03-21 02:31

Well, my naive reading of the standard would be that once pthread_mutex_unlock() has allowed another thread to get the mutex, it counts as "unlocked" by this thread, and therefore it must be safe to destroy. In other words, any implementation that has this issue has a bug.

Oh, but after re-reading the original Description I (finally) see that the issue is that it *might* be claimed that a pthread_mutex_unlock() call that has unlocked the mutex but has not returned yet counts as another example of "reference", making destruction undefined; and the request is to clarify that it does not. Got it. Thanks. :)

(0002267)
geoffclare (manager)
2014-06-19 16:12

On page 1643 line 53249 section pthread_mutex_destroy() change from:
Attempting to destroy a locked mutex or a mutex that is referenced (for example, while being used in a pthread_cond_timedwait() or pthread_cond_wait()) by another thread results in undefined behavior.

to:
Attempting to destroy a locked mutex, or a mutex that another thread is attempting to lock, or a mutex that is being used in a pthread_cond_timedwait() or pthread_cond_wait() call by another thread, results in undefined behavior.

On page 1647 lines 53424-53425 change:
A mutex can be destroyed immediately after it is unlocked. For example, consider the following code:

to:
A mutex can be destroyed immediately after it is unlocked. However, since attempting to destroy a locked mutex, or a mutex that another thread is attempting to lock, or a mutex that is being used in a pthread_cond_timedwait() or pthread_cond_wait() call by another thread, results in undefined behavior, care must be taken to ensure that no other thread may be referencing the mutex.

On page 1647 delete lines 53426-53444.

- Issue History
Date Modified Username Field Change
2013-12-18 20:06 torvald New Issue
2013-12-18 20:06 torvald Name => Torvald Riegel
2013-12-18 20:06 torvald Organization => Red Hat
2013-12-18 20:06 torvald Section => pthread_mutex_destroy
2013-12-18 20:06 torvald Page Number => 1643, 1647
2013-12-18 20:06 torvald Line Number => 53249-53251, 53423-53444
2013-12-18 20:07 torvald Note Added: 0002081
2013-12-18 20:10 Don Cragun Interp Status => ---
2013-12-18 20:10 Don Cragun Note Added: 0002082
2013-12-18 20:10 Don Cragun Category Base Definitions and Headers => System Interfaces
2013-12-18 20:45 dalias Note Added: 0002083
2014-02-27 17:35 eblake Note Added: 0002167
2014-03-20 15:38 geoffclare Note Added: 0002188
2014-03-20 16:45 wlerch Note Added: 0002191
2014-03-20 16:58 geoffclare Note Added: 0002192
2014-03-20 17:21 wlerch Note Added: 0002193
2014-03-20 18:31 dalias Note Added: 0002194
2014-03-20 18:37 wlerch Note Added: 0002195
2014-03-21 02:31 wlerch Note Edited: 0002195
2014-06-19 16:12 geoffclare Note Added: 0002267
2014-06-19 16:13 geoffclare Final Accepted Text => Note: 0002267
2014-06-19 16:13 geoffclare Status New => Resolved
2014-06-19 16:13 geoffclare Resolution Open => Accepted As Marked
2014-06-19 16:13 geoffclare Tag Attached: tc2-2008
2015-02-09 21:11 Florian Weimer Issue Monitored: Florian Weimer

