Austin Group Defect Tracker



ID 0000765
Category [1003.1(2013)/Issue7+TC1] System Interfaces
Severity Objection
Type Omission
Date Submitted 2013-10-10 15:39
Last Update 2019-06-10 08:55
Reporter eblake
View Status public
Assigned To
Priority normal
Resolution Accepted As Marked
Status Closed
Name Eric Blake
Organization Red Hat
User Reference ebb.pthread_kill
Section pthread_kill
Page Number 1640
Line Number 53174
Interp Status ---
Final Accepted Text Note: 0001968
Summary 0000765: kill and pthread_kill behavior between termination and lifetime end
Description The standard states, in non-normative text, that kill() should not fail with ESRCH for an inactive process (also known as a zombie), because the process lifetime has not yet ended even though no signal will be delivered [see the RATIONALE at line 40379 page 1213]; it also concedes that historical practice sometimes returned ESRCH in that situation. The standard is silent on whether pthread_kill() fails for an inactive thread (one that has terminated but has not yet been joined or detached), and several existing implementations fail with ESRCH in that situation. The following test program demonstrates this difference in behavior (tested on glibc and Solaris):

$ cat foo.c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
#include <sys/wait.h>

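/* Returns immediately, so the thread has terminated (but is not yet
 * joined) by the time main() signals it. */
static void *thread_fn(void *arg)
{
    (void)arg;
    return NULL;
}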

int main(int argc, char** argv)
{
    pthread_t thread;
    int ret;
    pid_t child;
    /* Null signal (0) by default; SIGUSR1 if any argument is given. */
    int num = argc > 1 ? SIGUSR1 : 0;

    puts("testing processes");

    switch (child = fork()) {
    case -1:
        perror("fork failed");
        exit(1);
    case 0:
        return 0;
    }

    /* Give the child time to exit; it stays a zombie until wait() below. */
    sleep(1);
    errno = 0;
    ret = kill(child, num);
    if (ret != 0) {
        fprintf(stderr, "kill: %s\n", strerror(errno));
    }

    sleep(1);
    errno = 0;
    if (wait(NULL) != child) {
        perror("wait failed");
        exit(1);
    }

    puts("testing threads");

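    ret = pthread_create(&thread, NULL, thread_fn, NULL);
    if (ret != 0) {
        /* pthread functions return an error number and do not set errno. */
        fprintf(stderr, "pthread_create failed: %s\n", strerror(ret));
        exit(1);
    }

    /* Give the thread time to terminate; it remains unjoined (inactive). */
    sleep(1);
    ret = pthread_kill(thread, num);
    if (ret != 0) {
        fprintf(stderr, "pthread_kill: %s\n", strerror(ret));
    }

    sleep(1);
    ret = pthread_join(thread, NULL);
    if (ret != 0) {
        fprintf(stderr, "pthread_join failed: %s\n", strerror(ret));
        exit(1);
    }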

    puts("complete");
    return 0;
}
$ c99 -o foo $(getconf POSIX_V7_THREADS_CFLAGS) foo.c \
    $(getconf POSIX_V7_THREADS_LDFLAGS) -l pthread
$ ./foo
testing processes
testing threads
pthread_kill: No such process
complete
$ ./foo 1
testing processes
testing threads
pthread_kill: No such process
complete


For Issue 7, the best we can do is document the issue. For Issue 8, it might be nice to make the behavior normative and consistent for the two interfaces. This report targets Issue 7.
Desired Action At line 40391 page 1213 [XSH kill FUTURE DIRECTIONS], change "None." to:

A future version of this standard may require that kill() not fail with ESRCH in the case of sending signals to an inactive process (a terminated process not yet waited for by its parent), even though no signal will be delivered because the process is no longer running.

At line 53174 page 1640 [XSH pthread_kill RATIONALE], add a paragraph:

Existing implementations vary on the result of a pthread_kill( ) with thread id indicating an inactive thread (a terminated thread that has not been detached or joined). Some indicate success on such a call, while others give an error of [ESRCH]. Since the definition of thread lifetime in this volume of POSIX.1-2008 covers inactive threads, the [ESRCH] error as described is inappropriate in this case. In particular, this means that an application cannot have one thread check for termination of another with pthread_kill( ).

At line 53176 page 1640 [XSH pthread_kill FUTURE DIRECTIONS], change "None." to:

A future version of this standard may require that pthread_kill() not fail with ESRCH in the case of sending signals to an inactive thread (a terminated thread not yet detached or joined), even though no signal will be delivered because the thread is no longer running.
Tags tc2-2008
Attached Files

- Relationships
related to 0000792 (Applied) better definition of thread lifetime

-  Notes
(0001878)
geoffclare (manager)
2013-10-11 11:14

The proposed change to kill() FUTURE DIRECTIONS is not needed, as the
standard already requires this behaviour.

The relevant paragraph of the rationale is misleading because it has
not been updated since the first edition of POSIX.1. When it says
"existing implementations" it is referring to the implementations that
existed at the time that POSIX.1-1988 was published. It is explaining
the authors' decision to require (in normative text) that kill() on a
zombie process must not give ESRCH.

The paragraph should be updated to make it historical. It should also
use the defined term zombie process. I suggest:

    Historical implementations varied on the result of a kill() with
    pid indicating a zombie process. Some indicated success on such a
    call (subject to permission checking), while others gave an error
    of [ESRCH]. Since the definition of process lifetime in this
    standard covers zombie processes, the [ESRCH] error as described
    is inappropriate in this case and implementations that give
    this error do not conform. This means that an application cannot
    have a parent process check for termination of a particular child
    by sending it the null signal with kill(), but must instead use
    waitpid() or waitid().
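
For illustration only, a minimal sketch of the waitpid() approach that the suggested wording points to. The helper names child_has_terminated() and poll_child_demo() are hypothetical and are not taken from the standard or from this report; the exit code 7 and the 10 ms polling interval are arbitrary choices.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: non-blocking check of one particular child.
 * Returns 1 if the child has terminated (it is reaped and *status is
 * filled in), 0 if it has not changed state yet, -1 on error with
 * errno set (for example ECHILD if 'child' is not our child). */
static int child_has_terminated(pid_t child, int *status)
{
    pid_t r;

    do {
        r = waitpid(child, status, WNOHANG);
    } while (r == -1 && errno == EINTR);

    if (r == -1)
        return -1;
    return r == child ? 1 : 0;
}

/* Hypothetical demo: fork a child that exits with code 7, poll until
 * it has terminated, and return its exit code (or -1 on failure). */
static int poll_child_demo(void)
{
    struct timespec pause = { 0, 10 * 1000 * 1000 }; /* 10 ms */
    int status = 0;
    int done;
    pid_t child = fork();

    if (child == -1)
        return -1;
    if (child == 0)
        _exit(7);

    while ((done = child_has_terminated(child, &status)) == 0)
        nanosleep(&pause, NULL);

    if (done == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Unlike kill() with the null signal, a successful check here also reaps the child, so the pid must not be passed to waitpid() or kill() again afterwards.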
(0001968)
geoffclare (manager)
2013-11-07 16:55

At page 1213 line 40379 section kill() change:

    Existing implementations vary on the result of a kill( ) with pid
    indicating an inactive process (a terminated process that has not
    been waited for by its parent). Some indicate success on such
    a call (subject to permission checking), while others give an
    error of [ESRCH]. Since the definition of process lifetime in this
    volume of POSIX.1-2008 covers inactive processes, the [ESRCH] error
    as described is inappropriate in this case. In particular, this
    means that an application cannot have a parent process check for
    termination of a particular child with kill( ). (Usually this is
    done with the null signal; this can be done reliably with waitpid().)

to:

    Historical implementations varied on the result of a kill() with
    pid indicating a zombie process. Some indicated success on such
    a call (subject to permission checking), while others gave an
    error of [ESRCH]. Since the definition of process lifetime in this
    standard covers zombie processes, the [ESRCH] error as described is
    inappropriate in this case and implementations that give this error
    do not conform. This means that an application cannot have a parent
    process check for termination of a particular child by sending it
    the null signal with kill(), but must instead use waitpid() or
    waitid().

At line 53174 page 1640 [XSH pthread_kill RATIONALE], add a paragraph:

    Existing implementations vary on the result of a pthread_kill( )
    with thread id indicating an inactive thread (a terminated thread
    that has not been detached or joined). Some indicate success on such
    a call, while others give an error of [ESRCH]. Since the definition
    of thread lifetime in this volume of POSIX.1-2008 covers inactive
    threads, the [ESRCH] error as described is inappropriate in this
    case. In particular, this means that an application cannot have one
    thread check for termination of another with pthread_kill( ).

At line 53176 page 1640 [XSH pthread_kill FUTURE DIRECTIONS], change:

    None.

to:

    A future version of this standard may require that pthread_kill()
    not fail with ESRCH in the case of sending signals to an inactive
    thread (a terminated thread not yet detached or joined), even though
    no signal will be delivered because the thread is no longer running.
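
For illustration only, one way an application might track completion of another thread without relying on pthread_kill(). This is a sketch under stated assumptions, not text from the standard: struct worker and the worker_*() functions are hypothetical names, and the flag reports that the start routine has finished its work, not that the thread has terminated in the strict sense; pthread_join() remains the way to wait for actual termination.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one worker thread. */
struct worker {
    pthread_t thread;
    pthread_mutex_t lock;
    bool finished;              /* protected by lock */
};

static void *worker_main(void *arg)
{
    struct worker *w = arg;

    /* ... the real work would go here ... */

    /* Publish completion as the last step before returning. */
    pthread_mutex_lock(&w->lock);
    w->finished = true;
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

/* Returns 0 on success or an error number, as pthread functions do. */
static int worker_start(struct worker *w)
{
    int ret = pthread_mutex_init(&w->lock, NULL);

    if (ret != 0)
        return ret;
    w->finished = false;
    ret = pthread_create(&w->thread, NULL, worker_main, w);
    if (ret != 0)
        pthread_mutex_destroy(&w->lock);
    return ret;
}

/* Non-blocking check; valid only between worker_start() and
 * worker_join(), because worker_join() destroys the mutex. */
static bool worker_is_finished(struct worker *w)
{
    bool done;

    pthread_mutex_lock(&w->lock);
    done = w->finished;
    pthread_mutex_unlock(&w->lock);
    return done;
}

/* Waits for the thread to terminate and releases the mutex. */
static int worker_join(struct worker *w)
{
    int ret = pthread_join(w->thread, NULL);

    if (ret == 0)
        ret = pthread_mutex_destroy(&w->lock);
    return ret;
}
```

A caller would poll worker_is_finished() as needed and call worker_join() exactly once when it wants to end the thread's lifetime.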

- Issue History
Date Modified Username Field Change
2013-10-10 15:39 eblake New Issue
2013-10-10 15:39 eblake Name => Eric Blake
2013-10-10 15:39 eblake Organization => Red Hat
2013-10-10 15:39 eblake User Reference => ebb.pthread_kill
2013-10-10 15:39 eblake Section => (section number or name, can be interface name)
2013-10-10 15:39 eblake Page Number => (page or range of pages)
2013-10-10 15:39 eblake Line Number => (Line or range of lines)
2013-10-10 15:39 eblake Interp Status => ---
2013-10-10 15:52 eblake Section (section number or name, can be interface name) => pthread_kill
2013-10-10 15:52 eblake Page Number (page or range of pages) => 53174
2013-10-10 15:52 eblake Line Number (Line or range of lines) => 1640
2013-10-11 11:14 geoffclare Note Added: 0001878
2013-11-07 16:55 geoffclare Note Added: 0001968
2013-11-07 16:57 geoffclare Page Number 53174 => 1640
2013-11-07 16:57 geoffclare Line Number 1640 => 53174
2013-11-07 16:57 geoffclare Final Accepted Text => Note: 0001968
2013-11-07 16:57 geoffclare Status New => Resolved
2013-11-07 16:57 geoffclare Resolution Open => Accepted As Marked
2013-11-07 16:57 geoffclare Tag Attached: tc2-2008
2013-11-15 05:23 eblake Relationship added related to 0000792
2019-06-10 08:55 agadmin Status Resolved => Closed

