Austin Group Defect Tracker

ID:             0000609
Category:       [1003.1(2004)/Issue 6] System Interfaces
Severity:       Editorial
Type:           Clarification Requested
Date Submitted: 2012-09-20 14:18
Last Update:    2012-10-28 18:33
Reporter:       mmihaylov
View Status:    public
Assigned To:    ajosey
Priority:       normal
Resolution:     Open
Status:         Under Review
Name:           Mihail Mihaylov
Organization:
User Reference:
Section:        pthread_cond_broadcast, pthread_cond_signal
Page Number:    1043
Line Number:    33043 - 33046
Interp Status:  ---
Final Accepted Text:
Summary: 0000609: It is not clear what threads are considered blocked with respect to a call to pthread_cond_signal() or pthread_cond_broadcast()
Description: The spec states that a call to pthread_cond_broadcast() will unblock all threads currently blocked on the condition variable, and a call to pthread_cond_signal() will unblock at least one of them.

However, it is not clear what "currently blocked" means.

It seems clear that if the signaling thread doesn't own the mutex, it is hard to reason about which threads are eligible for wakeup. So I'm only concerned with the case where the signaling thread owns the mutex.

My interpretation is that, in this case, only threads that blocked on the condition variable before the mutex was acquired by the thread calling pthread_cond_signal() or pthread_cond_broadcast() are eligible for wakeup. So my understanding is that a call to pthread_cond_signal() is required to unblock at least one of the threads that blocked before the calling thread acquired the mutex. If a thread that blocked only after the signaling thread called pthread_cond_signal() and released the mutex is woken, that should be considered a spurious wakeup, and it should not prevent the unblocking of at least one of the threads that blocked before the call.
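
To make the scenario concrete, here is a minimal sketch (the thread names and the single-flag predicate are hypothetical) of the case under discussion:

    #include <pthread.h>
    #include <stdbool.h>

    static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  c = PTHREAD_COND_INITIALIZER;
    static bool ready = false;

    /* Thread W: the wait atomically releases m and adds W to the wait
       set of c. */
    static void *waiter(void *arg)
    {
        (void)arg;
        pthread_mutex_lock(&m);
        while (!ready)
            pthread_cond_wait(&c, &m); /* W is blocked on c from here on */
        pthread_mutex_unlock(&m);
        return NULL;
    }

    /* Thread S: once it owns m, W's membership in the wait set is
       settled. Under the interpretation above, the signal must unblock
       W (or some other thread that blocked before S acquired m), not a
       thread that first blocks after S releases m. */
    static void *signaler(void *arg)
    {
        (void)arg;
        pthread_mutex_lock(&m);
        ready = true;
        pthread_cond_signal(&c);
        pthread_mutex_unlock(&m);
        return NULL;
    }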

But there has been an alternative interpretation supported by a maintainer of glibc. He claims that because the spec doesn't state that pthread_cond_signal() will wait for a thread to be unblocked by the signal, the signal can be "delivered" some time later. So it is possible that the calling thread will release the mutex and another thread will acquire it and block on the condition variable before the signal is delivered, thus making it possible for the signal to unblock a thread that has blocked after the call to pthread_cond_signal() even when the calling thread was holding the mutex.

The spec doesn't provide enough explicit information to make his interpretation invalid. Therefore it cannot be claimed that my interpretation is correct either, because the spec allows his interpretation.

Another question which cannot be answered based on the spec is what should happen when a thread calls pthread_cond_signal() and then immediately calls pthread_cond_wait(). Is this thread eligible to consume its own signal?
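
In code, the question is whether the following (hypothetical) sequence may consume the signal it has just sent:

    pthread_mutex_lock(&m);
    pthread_cond_signal(&c);    /* signal sent while holding m       */
    pthread_cond_wait(&c, &m);  /* may this wait be unblocked by the */
                                /* signal sent on the line above?    */
    pthread_mutex_unlock(&m);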

The argument between me and the glibc maintainer is available at http://sourceware.org/bugzilla/show_bug.cgi?id=13165#c16. Please consider taking a look at it, in case my description here doesn't accurately capture the different interpretations that the current text of the spec allows.
Desired Action: Add an explicit explanation as to which threads are eligible to be unblocked by a call to pthread_cond_signal() and pthread_cond_broadcast().

- Notes
(0001371)
torvald (reporter)
2012-09-21 15:37

Speaking here for the interpretation that glibc's condvar implementation relies on:

For waiters Wn, a signaler S, and "->" meaning happens-before, the spec states that if W1 -> S (where W1 here stands for the part of pthread_cond_wait that atomically blocks and unlocks the mutex, not its return), then W1 is considered a blocked thread. However, it does not state that W1 is _only_ considered a blocked thread if it happens before S. Therefore, if we have S -> W2, and W2 blocks on the cond var that S signals, then W2 could also be considered a blocked thread with respect to S's signal. As a result, if we have (W0, W1) -> S -> W2, then S could wake up W0, W1, or W2. So, informally, signals are asynchronous in the sense that they are eventually delivered at some time during or after the call to pthread_cond_signal, and wake up at least one thread from the set of threads blocked at _this_ time.

If this behavior is not allowed (and Mihail's interpretation is the actual intent of the spec), then S would have to wake up either W0 or W1 (which is a stronger guarantee). For implementations, this means that they would need to restrict which waiters a given signal can wake. For example, for W1 -> S1 -> W2 -> S2, S1 could only wake W1, whereas S2 could wake W1 or W2. If that is the intent, then please also answer the follow-up questions that arise: for example, would S2 always have to wake W2 (otherwise, S2 could wake W1, S1 would have no associated waiter, and W2 would remain blocked)? There might be further follow-ups around cancellation or PI cond vars.

Even with the former interpretation (i.e., what glibc's implementation currently assumes), applications can establish the stronger waiter--signal relationship (and thus the stronger guarantees of Mihail's interpretation) by using a separate cond var for each required waiter set (even though there is then no fallback or overflow to a previous set).
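
A rough sketch of that workaround (hypothetical names; reuse/overflow of a set is deliberately ignored, per the caveat above):

    #include <pthread.h>

    #define NSETS 2  /* toy example: two waiter sets, reused cyclically */

    static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  set_cond[NSETS] = {
        PTHREAD_COND_INITIALIZER, PTHREAD_COND_INITIALIZER
    };
    static int cur = 0;        /* set that new waiters join             */
    static int grants[NSETS];  /* wakeups granted to each set           */

    static void wait_in_current_set(void)
    {
        pthread_mutex_lock(&m);
        int s = cur;                          /* join the current set   */
        while (grants[s] == 0)                /* only a signal aimed at */
            pthread_cond_wait(&set_cond[s], &m); /* set s releases us   */
        grants[s]--;                          /* consume one wakeup     */
        pthread_mutex_unlock(&m);
    }

    static void signal_waiters(void)
    {
        pthread_mutex_lock(&m);
        int s = cur;
        cur = (cur + 1) % NSETS; /* threads arriving later join a new set */
        grants[s]++;             /* wake one thread from set s only       */
        pthread_cond_signal(&set_cond[s]);
        pthread_mutex_unlock(&m);
    }

Because a signal only ever targets the cond var of a set that was closed before the signal was sent, a waiter arriving after the signal cannot consume it, which is the stronger guarantee.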
(0001372)
dalias (reporter)
2012-09-21 17:46

I have followed the associated glibc bug report (#13165) for quite some time, and believe that the current behavior is non-conformant for the reasons described in comments #14, 27, 29, and 31 on the bug tracker. I am also the primary author of musl libc, which has an alternate, much simpler implementation of condition variables built on Linux futexes. It is believed to perform at least as well (in particular, we have not been able to measure any spurious wakeups that occur with musl but not glibc/NPTL, despite the trouble NPTL goes to for the sake of avoiding them), and it does not suffer from this issue. So I see no reason the requirements of the standard should be relaxed to accommodate the undesirable glibc/NPTL behavior. There is no reason glibc cannot simply remove the incorrect logic that is causing the problem.
(0001373)
mmihaylov (reporter)
2012-09-21 18:29

Please let us not move the dispute from the glibc bug over here. Let's limit ourselves to clarifying the question.
(0001374)
dalias (reporter)
2012-09-21 22:31

Indeed. I just want it to be clear that giving the implementation this freedom to neglect one aspect of quality (ordering guarantees that seem to be required by the current language of the standard) is not essential to providing other quality guarantees (reasonable performance, etc.) when implementing POSIX threads on top of the primitives the Linux kernel provides, and that the whole issue stems from an attempt at optimizing against spurious wakes in NPTL.

Perhaps it would also be useful to know whether there are any other implementations which suffer from the same issue.
(0001378)
siddhesh (reporter)
2012-09-25 16:36

Torvald, when you talk of the sequence:

(W0, W1) -> S -> W2

resulting in W2 being woken up, have you defined 'entering a wait' as Wn entering the futex_wait system call, or is your definition of having entered a wait something else?

Given the glibc implementation, a waiter could be considered as having 'entered a wait' as soon as it has acquired the internal condition, since that is what the signaller uses to 'enter' its signal. With that definition, I don't think it would be possible for any waiter to enter a wait if S has taken the mutex and entered pthread_cond_signal before it, i.e. W2 will never wake up if (W0, W1) -> S -> W2.

For reference, this is how the implementations look in glibc (wait/release on internal_condition denote acquiring and releasing the internal lock):

pthread_cond_wait:
1: wait(internal_condition)
2: release(mutex)
3: update_condition_state
4: release(internal_condition)
5: futex_syscall(wait) /* blocks in the kernel */
6: wait(internal_condition)
7: goto 4 if we are not to be woken
8: update_condition_state
9: release(internal_condition)
10: wait(mutex)

pthread_cond_signal:
1: wait(internal_condition)
2: if(waiters_present)
3: update_condition_state
4: futex_syscall(wake)
5: release(internal_condition)
(0001379)
torvald (reporter)
2012-09-25 17:05

Siddhesh, it is possible. But as Mihail pointed out, we should not discuss glibc's implementation here. This issue here is a general request for clarification of the specification, even though the glibc discussion triggered it. Thanks.
(0001396)
msbrown (manager)
2012-10-10 16:17

Comments on the topic by David Butenhof <David.Butenhof@hp.com>:

The argument in the glibc "interpretation" appears to hinge on the idea that POSIX use of the terms "block" and "unblock" are not discrete atomic events or transitions, and that "unblock" in fact sends some unsynchronized signal that arrives at a later time to (presumably) cause a blocked thread to be removed from its list and scheduled to run.

** First, I'll take the claims I infer from the post at face value against the wording and intent of the standard, and then I'll comment on the more pragmatic aspects. (If you find yourself objecting to something in the first part, read the second part before replying.) Standards first:

This interpretation seems to clearly contradict the usage within the standard, and certainly the intent at the time the text was composed. (Note, while I glanced through the referenced glibc bug report, I haven't tried to read through the code that purportedly demonstrates the problem. I'd rather deal with this in terms of the abstract POSIX standard implications than get embroiled in the glibc issue; and, frankly, I tend to agree with a contention in the bug discussion that any user code demonstrating a flaw short of simply "no wakeup occurs" is probably trying to exploit some unwarranted assumptions about fairness.)

"Blocked" means that the thread ceases to be runnable; "unblock" means the target thread has been scheduled to run. Both terms are used as discrete atomic events; there's no concept of a "partial" or "pending" unblock -- a thread is either blocked or not blocked. But this is exactly the problem foreseen by those arguing in favor of a concise mathematical description of thread behavior; which was rejected in favor of the more casual prose descriptions. (Ultimately, I think the committee was correct: a dense "mathematical" description as proposed would likely have made the early standard much less approachable and perhaps less quickly implemented. And, considering the number of independent implementations and the length of time, the remarkably low frequency of this sort of problem suggests that greater complication wasn't needed.)

If we assume W1 -> S1 -> (W2, W3) -> B1 -> (W4, W5) -> S2 -> S3, then signal S1 has the set (W1) from which to select and therefore shall unblock W1 removing it from the wait set and making it runnable. Broadcast B1 finds a set of two blocked threads W2 and W3 and must unblock both, removing them from the wait set and making both runnable. Signal S2 finds a set of two waiters W4 and W5 and shall select exactly one of those two to unblock; based on highest scheduling priority if priority scheduling is in effect or else using some unspecified arbitrary criterion but, again, removing that thread from the wait set and making it runnable. The final S3 must then find exactly one thread in the candidate set, whichever was not selected and unblocked by S2, and shall remove it from the wait set and make it runnable.

It is impossible for S1 to unblock W2, W3, W4, or W5 as they cannot have become part of the wait set at the time S1's discrete and atomic "unblock" action takes place. That would imply a "pending wake", a concept that is intentionally absent from POSIX.

The original author's statement of the glibc position is that POSIX doesn't specify that "pthread_cond_signal() will wait for a thread to be unblocked by the signal". This almost sounds like confusion between the "UNIX signal" and "condition signal" concepts, as if the latter was a similar asynchronous and potentially deferred action. But I don't see any reasonable leeway in the descriptions for such a radical interpretation. The POSIX model of an unblock operation removes a thread from the wait set and makes it runnable. There is no way to introduce a concept of "waiting" for that to happen; it's not asynchronous. Granted, POSIX doesn't formally define the operations; but that was the crux of the "prose vs mathematics" argument, that the casual usage, in context, is sufficiently clear. Debatable, certainly; but I think true enough in practice.

** But, with that out of the way, here's the pragmatic side:

In reality we ask (and realistically can test) only that pthread_cond_signal is guaranteed to wake at least one thread in the condition variable's current wait set (if any). (At least, unless you're running the test code entirely SCHED_FIFO on a uniprocessor.) That's why we have spurious wakeups: the determination of which application thread proceeds is entirely the responsibility of the application logic. It becomes impossible for the application to cope only if there are waiting threads and pthread_cond_signal fails to wake at least one.
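
In code terms, this is the conventional pattern being alluded to (a generic sketch, not text from the standard):

    pthread_mutex_lock(&m);
    while (!predicate)              /* re-check after every wakeup, */
        pthread_cond_wait(&c, &m);  /* whether signaled or spurious */
    /* ... act on the state protected by m ... */
    pthread_mutex_unlock(&m);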

There seems to be some controversy within the glibc bug report whether we're talking about the possibility of pthread_cond_signal failing to unblock a waiter, or only whether it's possible to unblock an "undesired" waiter. If S1 may unblock W2 instead of W1, at the implementation level, I think there's an error in the implementation; but while a convoluted program can potentially detect this, it's difficult to see how this can matter to any real application without arguing about "scheduling fairness". I'm presuming that W2 would have to occur very close to S1 unless this presumptive "asynchronous unblock" is absurdly slow; and in any real program, especially on a multiprocessor, that means the order of S1 and W2 might just as easily have been reversed.

We can easily argue at the standards and implementation level about the proper and intended behavior of the "unblock" operation on a condition variable; but correct POSIX application code cannot observe that transition. It can only observe return from the call to pthread_cond_*wait which requires a lot more including contention for the associated mutex and processor scheduling; none of which is itself directly observable or easily controllable (again, without SCHED_FIFO on a uniprocessor).

It also seems possible that such an "asynchronous unblock" could mean that W1 may never be awakened, presuming there's always a "Wx" waiter immediately following any "Sy" signaler. This also seems clearly an error; but again I question whether it can matter to a correct application so long as some thread always returns from pthread_cond_*wait after any pthread_cond_signal. Multiple waiters for a condition variable either are awakened by broadcast or are all waiting for the identical predicate... in which case the program doesn't care which thread wakes. (If all consumer threads are never concurrently unblocked by the producer logic, it only means that the consumer was coded "too wide" for the actual application parallelism; it doesn't matter that the same thread always remains blocked. In fact, this is probably best for application performance.) We can therefore argue quality of implementation, or scheduling fairness: but neither is a part of the standard and is therefore, for our purposes (though not necessarily for the glibc bug discussion), irrelevant.

** And back to the actual POSIX defect report:

The action is "Add an explicit explanation as to which threads are eligible to be unblocked by a call to pthread_cond_signal() and pthread_cond_broadcast()." I would argue there is in fact precisely that explanation in place. And while there seems to be some contention that, at the "subatomic" level, glibc may violate the letter of that "explicit explanation", I see no clear evidence that anyone is claiming it fails in a way that can be detected by a meaningful conforming POSIX application. Or, perhaps more precisely, that POSIX grants to the application any grounds to practically or portably exploit the guarantee claimed to be absent in glibc's implementation, so long as at least one thread returns from pthread_cond_*wait (if any) corresponding to each pthread_cond_signal call on the same condition variable. (If glibc really uses an "asynchronous unblock" as described, I have difficulty seeing how there wouldn't be far worse consequences than this; but that's a different matter.)

In any case, I don't see that the definition of "blocked" is disputed, and (at a simple prose level) XBD 3.7.6 seems to cover it well enough: "waiting for some condition (other than availability of a processor) to be satisfied before it can continue execution". It's a thread that can't run until some external action makes it runnable. And I think that the pthread_cond_broadcast/pthread_cond_signal description neatly treads a valuable line in not actually saying that the unblocked thread is "made runnable" in some fashion; it says instead that the thread(s) "shall contend for the mutex [as if] each had called pthread_mutex_lock".

The pthread_mutex_unlock text is a bit muddier, saying that if threads are blocked "the scheduling policy shall determine which thread shall acquire the mutex". It doesn't say anything about unblocking, although it's pretty clearly implied.
(0001398)
mmihaylov (reporter)
2012-10-10 19:43
edited on: 2012-10-11 06:14

Let me see if I understand David's comment correctly.

The way I understand the first part is that the standard dictates that implementations should guarantee that a call to pthread_cond_signal() (provided the calling thread owns the mutex) will unblock at least one thread that is in the current wait set of the condition variable. The current wait set consists of all threads that called pthread_cond_wait() before the signal was sent, excluding those that have already been unblocked by an earlier call to pthread_cond_signal() or pthread_cond_broadcast() or because of a spurious wakeup. No other threads are included in the wait set.

Based on this, the comment states that implementations that don't provide this guarantee are incorrect - therefore NPTL's implementation is incorrect too.

However, the second part of the comment states that real life correctly written POSIX-based applications cannot rely on this guarantee because they cannot detect when it is violated. This statement is based on the presumption that the predicate is identical for all blocked threads - it is implied that this includes even those threads that blocked after the signal was sent.

In the concluding section of the comment David claims that the definition of "blocked" is not contested and that there is no evidence that a "meaningful" conforming application has failed because this guarantee was violated.

Now let me comment on the comment...

1) There is a contradiction between the first and the second part of the comment. If conforming applications are not allowed to rely on this guarantee, why is it specified by the standard?

2) There is a claim that the predicate should be the same for all waiting threads. Is there text in the standard which requires this? If the libc implementation provides the ordering guarantee in question, this is not necessary. It is enough for the predicate to be true for the threads in the current wait set. And it should be acceptable for the predicate to be false for a wait set for which no signal has been sent yet.

By this I mean that it should be acceptable to have a predicate which involves some thread specific state, as long as this predicate is guaranteed to be true for all threads that blocked before the last signal/broadcast was sent. At the same time it should be acceptable for the predicate to be false for some or all of the threads that blocked after the last signal was sent, as long as it becomes true for them when the next signal is sent. Incidentally that's the kind of predicate that one would need to reliably detect spurious wakeups. And this is the kind of predicate that will detect when a libc implementation breaks the ordering guarantee.
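
As an illustration, here is a minimal sketch (hypothetical names) of such an object: each waiter records the generation in which it blocked, so the predicate is false for threads that blocked after the last signal and becomes true for them only when a later signal is sent:

    #include <pthread.h>

    static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  c = PTHREAD_COND_INITIALIZER;
    static unsigned long generation = 0; /* bumped by every signal     */
    static unsigned long wakeups = 0;    /* granted, unconsumed wakeups */

    static void event_wait(void)
    {
        pthread_mutex_lock(&m);
        unsigned long g = generation;    /* thread-specific state      */
        /* True only for threads that blocked before the last signal: */
        while (generation == g || wakeups == 0)
            pthread_cond_wait(&c, &m);
        wakeups--;                       /* consume exactly one wakeup */
        pthread_mutex_unlock(&m);
    }

    static void event_signal(void)
    {
        pthread_mutex_lock(&m);
        generation++;
        wakeups++;
        /* Relies on the disputed guarantee: the wakeup must reach a
           thread that is already blocked. If it can instead go to a
           thread that blocks later, that thread re-waits (its
           predicate is false) and a past waiter stays blocked
           forever -- which is how the violation becomes observable. */
        pthread_cond_signal(&c);
        pthread_mutex_unlock(&m);
    }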

3) I find that the claim that there is no evidence that a violation of this guarantee can be detected is contradicted by the fact that there exists a bug against glibc about it. Actually there is also one more bug about the same issue, but its reporter was not as persistent as me. I did not file the bug because I wrote a test to check whether the guarantee holds - I discovered the problem because my meaningful conforming application failed because of it. My use case is simple - I needed to write, on top of condition variables, a waitable object which prevents spurious wakeups while preserving the ordering guarantee specified in the POSIX standard - basically a condition variable with no spurious wakeups. I don't know what the other reporter's use case was, but I think it would be unfair to assume it was not meaningful and conforming.

4) As for the claim that the definition of "blocked" is not contested in the glibc bug - that's true in some sense. What is contested is which threads should be included in the current wait set for a call to pthread_cond_signal() or pthread_cond_broadcast(). And that's the clarification that I requested, although maybe my wording wasn't precise enough.

So, now that the comment has confirmed that the guarantee is stated in the standard, the remaining question is - are implementations required to provide this guarantee, or are they allowed to break it because it won't affect 99.9% of the users?


EDIT 1: Added "real life" to my interpretation of the second part of David's comment.

EDIT 2: Fixed a typo in 1) - replaced "implementations" with "applications"

EDIT 3: Added a clarification to 2) describing the kind of predicate I believe should be supported by the condition variable implementation.

EDIT 4: Removed the mention of Win32 events from 3), because they are only similar to what I was implementing in that they have no spurious wakeups.

EDIT 5: Changed slightly the wording at the end of 3) to make it clearer

EDIT 6: Added 4)

(0001403)
geoffclare (manager)
2012-10-17 16:39
edited on: 2012-10-17 16:42

In the Oct 17th conference call we agreed to make the following changes. Further work is needed to produce some interpretation text to precede these changes.

At line 50746 change:

    The pthread_cond_broadcast() function shall unblock all threads
    currently blocked on the specified condition variable cond.

    The pthread_cond_signal() function shall unblock at least one of
    the threads that are blocked on the specified condition variable
    cond (if any threads are blocked on cond).

to:
    
    The pthread_cond_broadcast() and pthread_cond_signal() functions
    shall atomically determine which threads, if any, are blocked
    on the specified condition variable cond. This determination
    shall occur at an unspecified time during the
    pthread_cond_broadcast() or pthread_cond_signal() call.
    The pthread_cond_broadcast() function shall then unblock all of
    these threads. The pthread_cond_signal() function shall unblock at
    least one of these threads.

At line 50762 change:

    The pthread_cond_broadcast() and pthread_cond_signal() functions
    shall have no effect if there are no threads currently blocked on
    cond.

to:

    The pthread_cond_broadcast() and pthread_cond_signal() functions
    shall have no effect if they determine that there are no threads
    blocked on cond.

Add a new paragraph to the RATIONALE section after line 50792:

   There is no guarantee of overall atomicity for the
   pthread_cond_broadcast() and pthread_cond_signal() functions, and
   there is a conceptual point during the execution of these functions
   where the set of blocked threads is (atomically) evaluated. Threads
   could enter this set at any point before this, hence the standard
   speaks of "an unspecified time during the call".

(0001404)
Konrad_Schwarz (reporter)
2012-10-18 07:45

I have very serious reservations on the proposed changes
of Note 1403, which eliminate atomicity requirements from
condition variable operations.

For the condition variable/mutex paradigm to be useful
to applications, that is, to make reasoning about
concurrency tractable, the condition variable
signal/broadcast/wait and mutex enter/exit
operations must be atomic with regards to one another.

Anything else is a recipe for disaster.

See below for individual points.

> At line 50746 change:
>
> The pthread_cond_broadcast() function shall unblock all threads
> currently blocked on the specified condition variable cond.
>
> The pthread_cond_signal() function shall unblock at least one of
> the threads that are blocked on the specified condition variable
> cond (if any threads are blocked on cond).
>
> to:
>
> The pthread_cond_broadcast() and pthread_cond_signal() functions
> shall atomically determine which threads, if any, are blocked
> on the specified condition variable cond. This determination
> shall occur at an unspecified time during the
> pthread_cond_broadcast() or pthread_cond_signal() call.
> The pthread_cond_broadcast() shall then unblock all of these
> threads. The pthread_cond_signal() function shall unblock at
> least one of these threads.

I'm sorry, but this suggestion is a serious mistake.

It seems designed to accommodate the
current -- broken -- Glibc implementation: that is, Glibc
wakes up threads that will block on the condition variable
in the future in lieu of threads that are already blocked,
potentially leaving those threads blocked forever.

Why "determine"? signal() must wake up one thread,
broadcast() must wake up all threads blocked on the condition
variable, no doubt about it.

Why write "at an unspecified time within the [..] call"?
Every action specified for any POSIX function happens at
an unspecified time within a call of the function --
how does pointing this out help here?

On the contrary, condition variable signal, broadcast,
wait and mutex enter and exit must all be atomic with
regards to one another.

Any loosening of this contract inordinately complicates
reasoning about concurrent processes, since in effect
no guarantees apply ("determine the threads blocked"?).

E.g., under the proposed change, n calls to signal() will no
longer necessarily unblock n waiting threads -- an invocation
of signal() could incorrectly determine that no threads
are blocked, when in fact there are some.
(A contract that is easily derived from the Larch specification,
by the way -- not that I advocate usage of formal
specifications in POSIX.)

So, if a change is indeed required, change from line 50746 to:

     The pthread_cond_broadcast() function shall
     _atomically_ unblock all threads
     currently blocked on the specified condition variable cond.
 
     The pthread_cond_signal() function shall _atomically_ unblock
     at least one of the threads that are blocked on the specified
     condition variable cond (if any threads are blocked on cond).

> At line 50762 change:
>
> The pthread_cond_broadcast() and pthread_cond_signal() functions
> shall have no effect if there are no threads currently blocked on
> cond.
>
> to:
>
> The pthread_cond_broadcast() and pthread_cond_signal() functions
> shall have no effect if they determine that there are no threads
> blocked on cond.

An implementation where signal() is a null and void operation would
seem to conform to the suggested changed text: this implementation
never "determines" that there are any threads blocked on any
condition variable.

Please discard this change; it is obviously broken.

> Add a new paragraph to the RATIONALE section after line 50792:
>
> There is no guarantee of overall atomicity for the
> pthread_cond_broadcast() and pthread_cond_signal() functions, and
> there is a conceptual point during the execution of these functions
> where the set of blocked threads is (atomically) evaluated. Threads
> could enter this set at any point before this, hence the standard
> speaks of "an unspecified time during the call".

This is exactly the wrong direction for POSIX to take.

"Overall" atomicity makes these operations (fairly) easy to understand.
This makes reasoning about concurrency and synchronization somewhat
more tractable -- parallel programming is notably difficult.

What advantages are gained from this change? It is very detrimental
to applications; advantages for implementations are yet to be named.

It is also a radical departure from historical practice:
from Hoare monitors, via Lampson's Mesa, through Modula-3
(evidenced by its formal Larch specification), these operations
have always been atomic, and with very good reasons.

Thanks,

Konrad Schwarz
(0001405)
dalias (reporter)
2012-10-18 12:32

I think you're misreading the proposed changes to the text. They do not allow the broken behavior but clarify that it is disallowed. The determination of which threads are blocked is required to be made atomically at some point during the call to pthread_cond_signal or pthread_cond_broadcast. Thus, if the mutex is held during the call, the set of threads blocked is completely determined, and if the mutex was obtained and released before the signal/broadcast operation, then the set contains at least those threads which blocked before the mutex was obtained, but may also include others that arrive at some point (prior to the atomic determination) during the signal/broadcast call.

This is exactly what the original specified behavior was; some folks just did not find the original text clear enough and believed it allowed the current, undesirable behavior glibc is exhibiting.
(0001408)
torvald (reporter)
2012-10-24 17:34

Thanks for working on those changes to the specification.

The changes make signal/broadcast essentially linearizable, which I agree is probably the best approach in the absence of a complete memory model. However, this effectively requires a total order over all those operations on all condition variables in the program, which can require stronger low-level synchronization in implementations than if each cond var had its own total order of these operations that just needs to be consistent with happens-before. In particular, considering C11/C++11, the former requires some sequentially-consistent operation, whereas the latter just needs acquire/release memory orders. It would be nice to know if you see the latter as sufficient too.

I've raised an issue similar to this clarification request with C++11, because C++11 didn't require consistency with happens-before: http://cplusplus.github.com/LWG/lwg-active.html#2190
C++11's proposed resolution is to add this consistency requirement. I haven't had time to review the minutes of the recent C++ meeting yet, so I can't say whether they went for a global total order or just individual total orders for all the operations on each condition variable.

Finally, I think that the planned addition to the RATIONALE probably confuses more than it helps:
> There is no guarantee of overall atomicity for the
> pthread_cond_broadcast() and pthread_cond_signal() functions, and
> there is a conceptual point during the execution of these functions
> where the set of blocked threads is (atomically) evaluated. Threads
> could enter this set at any point before this, hence the standard
> speaks of "an unspecified time during the call".

What is "overall atomicity" meant to refer to? The rest of the paragraph makes sense though. Perhaps it's better to explain it along the same lines as C++11 does?
(That is, split any waiting into three parts, then make those and signal/broadcast atomic with respect to each other, and require them to be totally ordered in a way consistent with happens-before. Paragraph 30.5, sentences 4-5 in the spec, plus the changes of issue 2190 referenced above.)

The core subject of this specification issue was not atomicity (i.e., that something executes as one indivisible step), but ordering (when this indivisible step executes with respect to (1) other things going on in the same thread or (2) communication with other threads or operations in those other threads). Thus, perhaps it's better to state that the ordering requirements got stronger, but the atomicity requirements have stayed all the same. That might help prevent misunderstandings.
(0001409)
nialldouglas (reporter)
2012-10-25 23:31

It seems to me that, portably speaking, pthread conditions can merely make a "best effort" at waking threads. When you signal "wake one thread" or "wake all threads", there is no way for any portable implementation to know how many threads are currently sleeping. Therefore "wake one thread" could mean wake none, wake one, or conceivably wake a small number. Similarly, "wake all threads" really means "wake as many as you can, more or less".

If you want stronger guarantees, then it's your responsibility to wrap the condition variable calls with appropriate atomic CAS logic and to loop the pthread calls appropriately. An example of this for C11 lives at https://github.com/ned14/ISO_POSIX_standards_stuff/tree/master/pthreads%20Notifer%20Object. I would suggest it might be useful for C++11 too.

I would strongly suggest something like a pthreads notifier object be brought into being to avoid naive usage of pthread condition variables. There are too many corner cases in those for most programmers. If your glibc maintainer would like, I'm happy to contribute my notifier implementation for use in glibc.

Niall
(0001410)
dalias (reporter)
2012-10-26 00:02

It is very easy for an application, based on the actions it has performed, to know exactly which threads are included in the set of threads T such that either T is currently blocked on the condition variable, or T has spuriously woken and is waiting to acquire the mutex and return from pthread_cond_wait. Therefore, it is completely reasonable to assume that calling pthread_cond_signal on the condition variable, and subsequently unlocking the mutex, will allow at least one thread in that set to make forward progress.
(0001411)
nialldouglas (reporter)
2012-10-28 00:37
edited on: 2012-10-28 00:38

"It is very easy for an application, based on the actions it has performed, to know exactly which threads are included in the set of threads T such that either T is currently blocked on the condition variable, or T has spuriously woken and is waiting to acquire the mutex and return from pthread_cond_wait. Therefore, it is completely reasonable to assume that that calling pthread_cond_signal on the condition variable, and subsequently unlocking the mutex, will allow at least one thread in that set to make forward progress."

In traditional SMP architectures with a system wide atomic CAS (preferably system wide atomic increment) you are correct that it is *possible* to ensure, relatively cost free, a 1-1 relationship between pthread_cond_signal and a thread wake via its associated mutex. I might add that implementation overheads are a bit heavier than I'd prefer here, but yes it's doable.

In NUMA, you are not correct, because you're relying on messages whose results you don't wait for. You can't know if a wake has any effect. You can't know the state or wait list of a mutex either.

That's why the current POSIX approach is the right one - leave it to the API user to take care of spurious and lost wakeups if that is necessary to the use case. When NUMA systems become much more common, we can then standardise whatever practice has emerged by then as least worst.

If you'd like different behaviour, propose a new API set rather than retrofit POSIX condition variables.

Niall

(0001412)
dalias (reporter)
2012-10-28 01:11

In response to note #0001411, the quoted text from note #0001410 is not specific to any particular architecture. The claim I made, which I stand by, is that making this determination is easy in the abstract machine specified by POSIX, and therefore is easy on any implementation of POSIX, regardless of the hardware architecture. SMP vs NUMA is irrelevant.

Further, "the current POSIX approach" does not allow lost wakeups. It allows spurious wakeups, which do not matter, as they have no bearing on the set of threads described.

For the sake of clarity, here's how an application would go about tracking and making use of knowledge about which threads are waiting on the condition variable or just-woken (possibly spuriously) and waiting for the mutex:

Each waiter performs the following steps:
1. Acquire mutex.
2. Add self to list of waiters protected by the mutex.
3. Call pthread_cond_wait
4. Remove self from list of waiters.
5. Release mutex.

Now, once the signaling thread has acquired the mutex, it can inspect the list of waiters and know exactly which threads are waiting; call this set S. If it calls pthread_cond_signal and pthread_mutex_unlock and then blocks waiting for an event that will only be triggered by forward progress in one of the threads in set S, the standard, both as currently written and with the proposed clarifications, requires that the event occur and that forward progress be made. I don't see any way you can argue against this.
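
For concreteness, a sketch of those steps (list_add, list_remove and self are hypothetical helpers; 'waiters' is any container protected by the mutex):

    /* Waiter side, steps 1-5 above: */
    pthread_mutex_lock(&m);            /* 1 */
    list_add(&waiters, self());        /* 2: membership is visible to  */
                                       /*    the signaler while it     */
                                       /*    holds m                   */
    pthread_cond_wait(&c, &m);         /* 3 */
    list_remove(&waiters, self());     /* 4 */
    pthread_mutex_unlock(&m);          /* 5 */

    /* Signaler side: while m is held, 'waiters' is exactly the set S
       of threads blocked on c, or spuriously woken and contending for
       m. The standard requires that at least one member of S make
       forward progress after the signal and unlock below. */
    pthread_mutex_lock(&m);
    /* ... inspect 'waiters' (set S) here ... */
    pthread_cond_signal(&c);
    pthread_mutex_unlock(&m);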
(0001413)
nialldouglas (reporter)
2012-10-28 18:33

Regarding note #0001412, simply repeating yourself without new information isn't going to convince me. Nor will I simply repeat myself. There is always a gulf between design and practice, and while the POSIX spec's design might suggest no lost wakeups, in practice that's hard to achieve perfectly and efficiently on distributed architectures. It also introduces an amount of overhead for SMP architectures which I am not at all convinced is necessary or desirable.

How about this: get someone with deep domain experience in this field like Hans Boehm to yay or nay this? Whatever he supports I'll accept.

Niall

- Issue History
Date Modified Username Field Change
2012-09-20 14:18 mmihaylov New Issue
2012-09-20 14:18 mmihaylov Status New => Under Review
2012-09-20 14:18 mmihaylov Assigned To => ajosey
2012-09-20 14:18 mmihaylov Name => Mihail Mihaylov
2012-09-20 14:18 mmihaylov Section => pthread_cond_broadcast, pthread_cond_signal
2012-09-20 14:18 mmihaylov Page Number => 1043
2012-09-20 14:18 mmihaylov Line Number => 33043 - 33046
2012-09-21 14:24 torvald Issue Monitored: torvald
2012-09-21 15:37 torvald Note Added: 0001371
2012-09-21 17:46 dalias Note Added: 0001372
2012-09-21 18:29 mmihaylov Note Added: 0001373
2012-09-21 22:31 dalias Note Added: 0001374
2012-09-25 16:36 siddhesh Note Added: 0001378
2012-09-25 17:05 torvald Note Added: 0001379
2012-10-10 16:17 msbrown Note Added: 0001396
2012-10-10 19:43 mmihaylov Note Added: 0001398
2012-10-11 06:14 mmihaylov Note Edited: 0001398
2012-10-17 16:39 geoffclare Note Added: 0001403
2012-10-17 16:42 geoffclare Note Edited: 0001403
2012-10-18 07:45 Konrad_Schwarz Note Added: 0001404
2012-10-18 12:32 dalias Note Added: 0001405
2012-10-24 17:34 torvald Note Added: 0001408
2012-10-25 23:31 nialldouglas Note Added: 0001409
2012-10-26 00:02 dalias Note Added: 0001410
2012-10-28 00:37 nialldouglas Note Added: 0001411
2012-10-28 00:38 nialldouglas Note Edited: 0001411
2012-10-28 01:11 dalias Note Added: 0001412
2012-10-28 18:33 nialldouglas Note Added: 0001413

