Austin Group Defect Tracker

ID 0000863
Category [1003.1(2013)/Issue7+TC1] System Interfaces
Severity Editorial
Type Clarification Requested
Date Submitted 2014-08-05 22:09
Last Update 2019-06-10 08:54
Reporter dalias
View Status public
Assigned To
Priority normal
Resolution Accepted As Marked
Status Closed
Name Rich Felker
Organization musl libc
User Reference
Section pthread_once
Page Number 1684
Line Number 54499
Interp Status ---
Final Accepted Text See Note: 0002619
Summary 0000863: Explicitly disallow longjmp from pthread_once init_routine
Description As written, the specification for pthread_once assumes that init_routine either returns or acts on thread cancellation. However, there is a third possibility which is unaddressed: that execution of init_routine ends with a call to longjmp.

Based on a strict reading of the specification, I think the requirement here is that all subsequent calls to pthread_once (with the same once_control) deadlock if the init_routine ended with longjmp. This is because:

A. "Subsequent calls of pthread_once() with the same once_control shall not call the init_routine." and this requirement is only relaxed when init_routine is cancelled: "if init_routine is a cancellation point and is canceled, the effect on once_control shall be as if pthread_once() was never called." Thus subsequent calls after the first ended by longjmp cannot call the init_routine again.

B. "On return from pthread_once(), init_routine shall have completed." Thus, subsequent calls after the first ended by longjmp cannot return, since init_routine has not completed.

However, I don't think this behavior is useful, and allowing longjmp at all makes it difficult to implement pthread_once, since the natural way of handling cancellation uses pthread_cleanup_push, while longjmp out of a pthread_cleanup_push context is not permitted.

Note that allowing longjmp and having pthread_once behave the same way when init_routine ends with longjmp as it does when init_routine is cancelled does not seem like a viable option. In order for this to work, longjmp must be implemented via some sort of unwinding that can reset the state of the once_control object. I do not think forbidding all traditional implementations of setjmp/longjmp as mere register save/restore is the intent of this standard or within the scope of fixing this issue in pthread_once.
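
For illustration only, here is a minimal sketch of how an application could avoid the problematic pattern: init_routine always returns normally and records failure in a flag, and any longjmp happens in the caller after pthread_once has returned. The names use_resource, init_failed and recover are hypothetical and are not taken from the standard or from any implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <setjmp.h>

static pthread_once_t once = PTHREAD_ONCE_INIT;
static int init_failed;          /* written only inside init_routine */
static jmp_buf recover;          /* hypothetical recovery point owned by the caller */

/* Hypothetical initializer: records failure instead of calling longjmp(),
 * so the call to init_routine always ends with a normal return. */
static void init_routine(void)
{
    init_failed = 1;             /* pretend the setup step failed */
}

/* Returns 0 if initialization succeeded, 1 if the caller had to recover. */
int use_resource(void)
{
    if (setjmp(recover) != 0)
        return 1;                /* reached via longjmp() below */

    int rc = pthread_once(&once, init_routine);
    assert(rc == 0);

    if (init_failed)
        longjmp(recover, 1);     /* outside init_routine: pthread_once() has returned */
    return 0;
}
```

Because the jump is taken only after pthread_once has returned, the state of once_control is never left in the "started but not completed" condition described above.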
Desired Action Add to the description of pthread_once:

"If init_routine calls longjmp[OB XSI][Option Start], _longjmp,[Option End] or siglongjmp, the behavior is undefined."
Tags tc2-2008
Attached Files

- Relationships

-  Notes
(0002333)
dalias (reporter)
2014-08-06 01:24

I realize there are at least two related issues which should also be addressed: recursive calls to pthread_once (where the init_routine calls pthread_once with the same once_control object) and calling pthread_exit from the init_routine.

My opinion is that recursive calls should be forbidden ("If init_routine calls pthread_once with the same value of once_control that was passed to pthread_once, the behavior is undefined.") and pthread_exit should either behave identically to cancellation (this is the natural behavior which would arise anyway with most implementations) or be undefined.
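
As a hedged sketch of the distinction, the following shows a nested call that is not the recursive case, because the inner pthread_once uses a different once_control object. All names (base_once, top_once, init_base, init_top, get_top) are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

static pthread_once_t base_once = PTHREAD_ONCE_INIT;
static pthread_once_t top_once  = PTHREAD_ONCE_INIT;
static int base_ready, top_ready;

static void init_base(void)
{
    base_ready = 1;
}

static void init_top(void)
{
    /* Nested call with a different once_control (base_once), so this is
     * not the recursive case; passing &top_once here would be. */
    int rc = pthread_once(&base_once, init_base);
    assert(rc == 0);
    top_ready = base_ready + 1;
}

/* Runs both initializers at most once and returns the derived value. */
int get_top(void)
{
    int rc = pthread_once(&top_once, init_top);
    assert(rc == 0);
    return top_ready;
}
```

Splitting dependent initialization across separate once_control objects in this way sidesteps the question of what a call with the same once_control from inside init_routine should do.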
(0002339)
jilles (reporter)
2014-08-08 11:01

I think the desired action restricts applications too much. There should only be undefined behavior when the call to longjmp(), _longjmp() or siglongjmp() would terminate the call to the init_routine. This is similar to the restrictions in the exit (atexit functions) and pthread_cleanup_pop pages.

Pages like ftw and nftw mention the perils of terminating calls to a callback function via longjmp() and similar in the Application Usage section (so not normative). Perhaps this can be or is specified with longjmp since terminating calls to any standard function's callback function via longjmp() is not a reasonable thing to support.
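
To illustrate the narrower restriction, here is a minimal sketch in which longjmp is called during init_routine but does not terminate the call: both setjmp and the matching longjmp occur while init_routine is active, and init_routine then returns normally. The helper parse_or_jump and the other names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <setjmp.h>

static pthread_once_t once = PTHREAD_ONCE_INIT;
static jmp_buf parse_error;      /* hypothetical: used only inside init_routine */
static int config_value = -1;

/* Hypothetical helper that bails out with longjmp() on bad input. */
static int parse_or_jump(const char *s)
{
    if (s[0] < '0' || s[0] > '9')
        longjmp(parse_error, 1);
    return s[0] - '0';
}

static void init_routine(void)
{
    /* Both setjmp() and the matching longjmp() happen while this call is
     * active, so the jump does not terminate the call to init_routine. */
    if (setjmp(parse_error) == 0)
        config_value = parse_or_jump("x");   /* jumps back to setjmp() */
    else
        config_value = 0;                    /* fallback after the jump */
}                                            /* normal return to pthread_once() */

int get_config(void)
{
    int rc = pthread_once(&once, init_routine);
    assert(rc == 0);
    return config_value;
}
```

Under the originally proposed wording ("If init_routine calls longjmp ...") this usage would also be undefined, which is the over-restriction being pointed out.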
(0002341)
steffen (reporter)
2014-08-08 13:21

Yes, I too wanted to note that using longjmp(3) from just about any call-out facility cannot be expected to leave state in a sane condition, which possibly should be stated more explicitly for longjmp(3) itself.
(0002342)
dalias (reporter)
2014-08-08 16:05

Indeed, that is exactly what I meant to say; I merely misstated it. I would very much like the approach of specifying this as part of the description of longjmp as jilles suggested; however, I fear that doing so may not align well with ISO C. C11 added several functions which have callbacks, and at least for call_once, it's not clear that leaving the call with longjmp is forbidden. So this would need an interpretation from WG14 I think. As such, I'd be happy with just fixing the description of pthread_once locally for now, and looking for a more "global" solution to this type of problem later, but I don't oppose trying to fix it globally now if everybody is up for the task.
(0002344)
steffen (reporter)
2014-08-08 18:29

Cool.
But I think that if this really can't be closed by changing longjmp(3) itself, just as Jilles suggested, so that the ftw(3), nftw(3) and scandir(3) usage hints can be removed, then changing the undefined "dynamic storage" term used there to "dynamic resources" or something similar should be considered, since file descriptors may also leak, of course. I think the term "storage" isn't related to file descriptors anywhere else.
(0002366)
Don Cragun (manager)
2014-08-28 15:37
edited on: 2014-08-28 16:23

Change "None" on P1684, L54499 in APPLICATION USAGE to:

If init_routine does not return (such as by calling longjmp()) then pthread_once() will not return either. A subsequent call to pthread_once() with the same once_control, however, will not call the specified init_routine.

If init_routine recursively calls pthread_once() with the same once_control, the recursive call will not call the specified init_routine, and thus the specified init_routine will not complete, and thus the recursive call to pthread_once() will not return.


(0002367)
dalias (reporter)
2014-08-28 16:41

I have a few issues with the proposed fix in note #2366:

1. The first paragraph is somewhat contradictory with the requirements for acting on cancellation (and possibly pthread_exit?), since that is a case where init_routine does not return.

2. The second paragraph seems somewhat confusing and does not address the longjmp case directly. Formally, longjmp can "terminate the call" (this wording is used several places in the C standard), and in that case, it would be hard to argue that the call to pthread_once is "recursive".

3. Since it's likely that pthread_once is implemented with cancellation cleanup handlers, and using longjmp to leave the scope of a pthread_cleanup_push results in undefined behavior, actually supporting this usage, which does not seem useful, may be costly and require special behavior in longjmp which would otherwise not be required.

As such, I would really prefer a resolution of:

"If execution of init_routine is terminated by a call to longjmp, the behavior is undefined."
(0002368)
mdempsky (reporter)
2014-08-28 17:13

Re #2: The paragraph about recursive calls was added in response to tangential discussion that came up about this issue on the call, not specifically in response to subsequent calls to pthread_once() after a longjmp().

Re #3: I can somewhat appreciate that pthread_once() commonly being implemented by pthread_cleanup_{push,pop}() makes it desirable to duplicate the explicit undefined behavior warning about longjmp() into pthread_once(), but that wouldn't necessarily be visible to user applications, would it? E.g., it's fine for pthread_once()'s implementation to assume some specific behavior for longjmp() calls that escape the pthread_cleanup context; it's just not okay for applications to make similar assumptions.
(0002369)
dalias (reporter)
2014-08-28 17:55

As stated in the original report of the issue, the problem in #3 there is not that it would give applications permission to bypass this rule about leaving the scope of a cleanup context by longjmp, but that it imposes either a requirement on the implementation of longjmp that precludes traditional implementations (simply restoring the register context with no unwinding), or a requirement that cancellation be done in a way that makes it safe to just jump past a cleanup context. I do not think there was any original intent for pthread_once to impose such requirements, only an oversight. It's obviously not desirable for applications to use longjmp to leave init_routine (doing so simply causes deadlocks if the once_control is used again), so imposing constraints on other aspects of an implementation for the sake of supporting this usage does not seem reasonable.
(0002370)
mdempsky (reporter)
2014-08-28 23:27

Rich and I discussed this a bit further off list and looked at several open source implementations of pthread_once(), and I now believe we were mistaken on the teleconference call that current POSIX implementations provide defined behavior for programs that use longjmp() to abnormally terminate a call to pthread_once().

On OS X, illumos, FreeBSD, and NetBSD, pthread_once()'s thread cancellation is implemented using pthread_cleanup_{push,pop}() (or in the case of FreeBSD, a substantially similar macro):

    OS X: https://www.opensource.apple.com/source/Libc/Libc-825.40.1/pthreads/pthread.c
    illumos: http://src.illumos.org/source/xref/illumos-gate/usr/src/lib/libc/port/threads/pthread.c#158
    FreeBSD: http://svnweb.freebsd.org/base/head/lib/libthr/thread/thr_once.c?revision=220888&view=markup#l62
    NetBSD: http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libpthread/pthread_once.c?rev=1.3&content-type=text/x-cvsweb-markup

Further, each of these cleanup push/pop implementations works by maintaining a per-thread linked list, constructing a new node on the stack in the "push" routine and removing it in the "pop" routine. Thus if an application uses longjmp() to abnormally terminate the pthread_once() call, the "pop" routine won't execute, and the thread's cleanup list will still contain a pointer to the local linked-list node whose lifetime has ended. If the thread subsequently tries to use this node (e.g., by being cancelled or calling pthread_cleanup_pop()), then undefined behavior will occur.

So I believe now that POSIX *should* specify that using longjmp() to abnormally terminate the call to init_routine results in undefined behavior.
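
The mechanism can be sketched with a deliberately simplified model. This is not the code of any of the implementations linked above: the names cleanup_node, cleanup_push, cleanup_pop and model_once are hypothetical, a single global list stands in for the per-thread list, and a depth counter is added only so the effect can be observed without touching the dangling pointer.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical, simplified model; real libcs keep this list per thread. */
struct cleanup_node {
    void (*fn)(void *);
    void *arg;
    struct cleanup_node *prev;
};

static struct cleanup_node *cleanup_head;  /* stands in for the per-thread list */
static int cleanup_depth;                  /* number of nodes still linked */

static void cleanup_push(struct cleanup_node *n, void (*fn)(void *), void *arg)
{
    n->fn = fn;
    n->arg = arg;
    n->prev = cleanup_head;
    cleanup_head = n;
    cleanup_depth++;
}

static void cleanup_pop(void)
{
    assert(cleanup_head != NULL);
    cleanup_head = cleanup_head->prev;
    cleanup_depth--;
}

static void reset_once(void *arg) { (void)arg; }

/* Model of pthread_once(): the node lives in this function's stack frame. */
static void model_once(void (*init_routine)(void))
{
    struct cleanup_node node;
    cleanup_push(&node, reset_once, NULL);
    init_routine();
    cleanup_pop();      /* skipped if init_routine leaves via longjmp() */
}

static jmp_buf escape;
static void init_returns(void) { }
static void init_jumps(void)   { longjmp(escape, 1); }

/* Returns how many nodes remain linked after the call ends. */
int depth_after(int use_longjmp)
{
    cleanup_head = NULL;
    cleanup_depth = 0;
    if (setjmp(escape) == 0)
        model_once(use_longjmp ? init_jumps : init_returns);
    /* After a jump, cleanup_head is a dangling pointer into model_once()'s
     * dead frame; it is deliberately not read here. */
    return cleanup_depth;
}
```

In this model a normal return leaves the list empty, while the longjmp path leaves one node linked whose storage no longer exists, which is the stale entry a later cancellation or pop would walk into.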
(0002371)
Don Cragun (manager)
2014-09-04 15:36
edited on: 2014-09-04 15:49

Add new paragraph after P1684, L54487:
If init_routine() does not return to pthread_once() other than by being cancelled, the results are undefined.


Change "None" on P1684, L54499 in APPLICATION USAGE to:
If init_routine() recursively calls pthread_once() with the same once_control, the recursive call will not call the specified init_routine, and thus the specified init_routine() will not complete, and thus the recursive call to pthread_once() will not return. Use of longjmp(), _longjmp(), or siglongjmp() within an init_routine() to jump to a point outside of init_routine() prevents init_routine() from returning.


(0002409)
dalias (reporter)
2014-10-07 13:35

I found a related issue: in XBD 4.11 Memory Synchronization,

"The pthread_once() function shall synchronize memory for the first call in each thread for a given pthread_once_t object."

The limitation to "the first call" seems to be assuming that init_routine has completed after the first call. However, it's plausible that the first call to init_routine acts on cancellation, and that pthread_once is called again with the same pthread_once_t object via cancellation cleanup handlers. In this case, a second call to pthread_once also needs to synchronize, I think.

Should I open this as a separate issue?
(0002619)
geoffclare (manager)
2015-04-10 15:07

I'm reopening this one as there was an email discussion, after the latest resolution, which produced better wording. We may as well address the additional issue raised in Note: 0002409 at the same time.

New proposed resolution...

On Page: 1684 Line: 54487 Section: pthread_once()

In the DESCRIPTION section, add new paragraph:

If the call to init_routine is terminated by a call to longjmp(), _longjmp(), or siglongjmp(), the behavior is undefined.

On Page: 1684 Line: 54499 Section: pthread_once()

In the APPLICATION USAGE section, change from:

None.

to:

If init_routine recursively calls pthread_once() with the same once_control, the recursive call will not call the specified init_routine, and thus the specified init_routine will not complete, and thus the recursive call to pthread_once() will not return. Use of longjmp(), _longjmp(), or siglongjmp() within an init_routine to jump to a point outside of init_routine prevents init_routine from returning.

Cross-volume change to XBD...

On Page: 110 Line: 3004 Section: 4.11 Memory Synchronization

Change from:

The pthread_once() function shall synchronize memory for the first call in each thread for a given pthread_once_t object.

to:

The pthread_once() function shall synchronize memory for the first call in each thread for a given pthread_once_t object. If the init_routine called by pthread_once() is a cancellation point and is canceled, a call to pthread_once() for the same pthread_once_t object made from a cancellation cleanup handler shall also synchronize memory.
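
The scenario covered by the added sentence can be sketched as follows, assuming an implementation that follows the quoted rule that a canceled init_routine leaves once_control as if pthread_once() was never called. The names worker, retry_init and run_scenario are hypothetical, and the thread cancels itself only to make the example deterministic.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_once_t once = PTHREAD_ONCE_INIT;
static int attempts;     /* how many times init_routine started */
static int completed;    /* set when init_routine runs to completion */

static void init_routine(void)
{
    if (++attempts == 1) {
        /* First attempt: request cancellation of this thread and act on it
         * at an explicit cancellation point. */
        pthread_cancel(pthread_self());
        pthread_testcancel();    /* does not return on the first attempt */
    }
    completed = 1;
}

/* Cleanup handler pushed by the thread itself; it calls pthread_once()
 * again for the same once_control after the first attempt was canceled. */
static void retry_init(void *arg)
{
    (void)arg;
    pthread_once(&once, init_routine);
}

static void *worker(void *arg)
{
    (void)arg;
    pthread_cleanup_push(retry_init, NULL);
    pthread_once(&once, init_routine);      /* canceled inside init_routine */
    pthread_cleanup_pop(0);
    return NULL;
}

/* Runs the scenario and returns the number of init_routine attempts,
 * or -1 if initialization never completed. */
int run_scenario(void)
{
    pthread_t t;
    void *result;
    int rc = pthread_create(&t, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_join(t, &result);
    assert(rc == 0 && result == PTHREAD_CANCELED);
    return completed ? attempts : -1;
}
```

Here the second pthread_once() call in the same thread is the one that actually completes initialization, which is why limiting memory synchronization to "the first call in each thread" would miss it.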

- Issue History
Date Modified Username Field Change
2014-08-05 22:09 dalias New Issue
2014-08-05 22:09 dalias Name => Rich Felker
2014-08-05 22:09 dalias Organization => musl libc
2014-08-05 22:09 dalias Section => pthread_once
2014-08-05 22:09 dalias Page Number => unknown
2014-08-05 22:09 dalias Line Number => unknown
2014-08-06 01:24 dalias Note Added: 0002333
2014-08-08 11:01 jilles Note Added: 0002339
2014-08-08 13:21 steffen Note Added: 0002341
2014-08-08 16:05 dalias Note Added: 0002342
2014-08-08 18:29 steffen Note Added: 0002344
2014-08-28 15:37 Don Cragun Note Added: 0002366
2014-08-28 16:14 Don Cragun Note Edited: 0002366
2014-08-28 16:16 Don Cragun Note Edited: 0002366
2014-08-28 16:17 Don Cragun Page Number unknown => 1684
2014-08-28 16:17 Don Cragun Line Number unknown => 54499
2014-08-28 16:17 Don Cragun Interp Status => ---
2014-08-28 16:17 Don Cragun Final Accepted Text => See Note: 0002366.
2014-08-28 16:17 Don Cragun Status New => Resolved
2014-08-28 16:17 Don Cragun Resolution Open => Accepted As Marked
2014-08-28 16:23 Don Cragun Note Edited: 0002366
2014-08-28 16:23 Don Cragun Tag Attached: tc2-2008
2014-08-28 16:41 dalias Note Added: 0002367
2014-08-28 17:13 mdempsky Note Added: 0002368
2014-08-28 17:55 dalias Note Added: 0002369
2014-08-28 23:27 mdempsky Note Added: 0002370
2014-09-04 15:08 nick Resolution Accepted As Marked => Reopened
2014-09-04 15:36 Don Cragun Note Added: 0002371
2014-09-04 15:36 Don Cragun Note Edited: 0002371
2014-09-04 15:49 Don Cragun Note Edited: 0002371
2014-09-04 15:50 Don Cragun Final Accepted Text See Note: 0002366. => See Note: 0002371.
2014-09-04 15:50 Don Cragun Resolution Reopened => Accepted As Marked
2014-10-07 13:35 dalias Note Added: 0002409
2015-04-10 15:07 geoffclare Note Added: 0002619
2015-04-10 15:07 geoffclare Status Resolved => Under Review
2015-04-10 15:07 geoffclare Resolution Accepted As Marked => Reopened
2015-04-16 15:19 geoffclare Final Accepted Text See Note: 0002371. => See Note: 0002619
2015-04-16 15:19 geoffclare Status Under Review => Resolved
2015-04-16 15:19 geoffclare Resolution Reopened => Accepted As Marked
2015-04-23 23:08 emaste Issue Monitored: emaste
2019-06-10 08:54 agadmin Status Resolved => Closed

