Austin Group Defect Tracker

ID: 0001436
Category: [1003.1(2016/18)/Issue7+TC2] Shell and Utilities
Severity: Editorial
Type: Enhancement Request
Date Submitted: 2020-12-15 21:00
Last Update: 2024-06-11 09:08
Reporter: steffen
View Status: public
Assigned To:
Priority: normal
Resolution: Accepted As Marked
Status: Closed
Name: steffen
Organization:
User Reference:
Section: Vol. 3: Shell and Utilities, Issue 7, make
Page Number: 2969
Line Number: 98473
Interp Status: ---
Final Accepted Text: Note: 0005496
Summary: 0001436: make: add "-j max_jobs" option to support simultaneous rule processing
Description: Parallel, even massively parallel, processing has become the widely supported and
used default, yet the standard make(1) does not document it.

P.S.:
Even though on SunOS/Solaris the first approach taken (long ago) involved
distribution of make jobs to different computers via a special "dmake" program,
it seems even there a default make(1) could follow an adjusted standard by doing
the equivalent of "exec dmake ARGUMENTS".
Desired Action: On page 2969, insert before line 98473

  -j max_jobs
    Specifies the maximum number of rule-processing jobs to run simultaneously.
Tags issue8
Attached Files

- Relationships
related to 0001437 Closed 1003.1(2016/18)/Issue7+TC2 make: (document .NOTPARALLEL and .WAIT special targets) in RATIONALE
related to 0001652 Closed Issue 8 drafts make: missing option argument
related to 0001659 Closed Issue 8 drafts make's -j option missing from synopsis
related to 0001660 Closed Issue 8 drafts Out of date make rationale about -n and $(MAKE)
related to 0001680 Closed Issue 8 drafts Synopsis for make is missing '-j maxjobs'

-  Notes
(0005299)
psmith (developer)
2021-03-22 21:27

Defining -j in a single instance of make is not too hard to understand, and most likely all versions of make that currently support -j do it essentially the same way.

The question is, what happens when a parent make with -j invokes a command which is a sub-make. This is where it gets hard, and where there's probably not too much agreement on what the behavior should be between existing implementations.
(0005327)
rhansen (manager)
2021-04-22 15:34

At (TC2) page 2969 line 98438, make SYNOPSIS, add:
[-j maxjobs]
(SD shaded as the rest)


After page 2969 line 98472, make OPTIONS, add:
-j maxjobs Set the maximum number of targets that can be updated concurrently. If this option is specified multiple times, the last value of maxjobs specified shall take precedence. If this option is not specified, or if maxjobs is 1, only one target shall be updated at a time (no parallelization). If the value of maxjobs is non-positive, the behavior is unspecified. When this option is specified with maxjobs greater than 1 and a rule invokes make (perhaps via $(MAKE)), the parent make shall attempt to establish communication with the child make in an unspecified manner, possibly via an implementation-specific flag in MAKEFLAGS. If the -j option is passed to the child make via the MAKEFLAGS environment variable with the same maxjobs value as the parent and is not overridden by a maxjobs value from another source (even if it has the same value), the child make shall establish communication with the parent make before it attempts to update any targets. In all other cases it is unspecified whether the child make establishes communication with the parent make. The parent make and any children it is communicating with, recursively, shall between them update no more than maxjobs targets in parallel. Implementations are permitted to silently limit maxjobs to an unspecified positive value; if this limit is 1, make need not attempt to establish communication with a child make.


At page 2972 line 98587, make EXTENDED DESCRIPTION, change:
The make utility shall treat all prerequisites as targets themselves and recursively ensure that they are up-to-date, processing them in the order in which they appear in the rule. The make utility shall use the modification times of files to determine whether the corresponding targets are out-of-date.
to:
The make utility shall treat all prerequisites as targets themselves and recursively ensure that they are up-to-date, using the modification times of files to determine whether the corresponding targets are out-of-date. If the -j option is not specified, a rule's prerequisites shall be processed in the order in which they appear in the rule. If targets are being updated in parallel (see -j in OPTIONS), the order of processing of prerequisites is unspecified except the make utility shall ensure that all of a prerequisite's own prerequisites are up-to-date before the prerequisite itself is made up-to-date.
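The ordering constraint in the replacement text can be illustrated with a small sketch. This is not how any make implementation schedules work; it only groups targets into "waves" so that every target appears after all of its own prerequisites. The dependency table (including the .c source names) is a hypothetical example, and targets within one wave have no guaranteed relative order.

```python
from typing import Dict, List, Set


def parallel_waves(rules: Dict[str, List[str]], goal: str) -> List[List[str]]:
    """Group targets into waves; a target's prerequisites are always in earlier waves."""
    # Collect every target reachable from the goal.
    needed: Set[str] = set()
    stack = [goal]
    while stack:
        target = stack.pop()
        if target not in needed:
            needed.add(target)
            stack.extend(rules.get(target, []))

    done: Set[str] = set()
    waves: List[List[str]] = []
    while len(done) < len(needed):
        # A target is ready once all of its prerequisites are up-to-date.
        ready = sorted(
            t for t in needed - done
            if all(p in done for p in rules.get(t, []))
        )
        if not ready:
            raise ValueError("circular dependency")
        waves.append(ready)
        done.update(ready)
    return waves


# Hypothetical dependency table, loosely modelled on the yacc idiom.
rules = {
    "foo": ["y.tab.o", "lex.o", "main.o"],
    "y.tab.o": ["y.tab.c"],
    "lex.o": ["lex.c"],
    "main.o": ["main.c"],
}

for number, wave in enumerate(parallel_waves(rules, "foo"), start=1):
    print(number, wave)
```

In this table lex.o and y.tab.o land in the same wave, so nothing orders one before the other unless the makefile states that dependency explicitly.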


At page 2986 line 99174, make RATIONALE, delete the bullet item:
Syntax supporting parallel execution (such as from various multi-processor vendors, GNU, and others)


After page 2987 line 99244, make RATIONALE, add a new paragraph:
Some implementations of make allow omission of the option-argument for the -j option, although not in the manner described in item 2.b in [xref to XBD 12.1] (where the option-argument, if present, needs to be directly adjacent to the option in the same argument string). To allow the option-argument to follow -j as a separate argument, these implementations check whether the next argument begins with a digit. If it does, it is treated as an option-argument for -j; if it does not, the -j is processed as not having an option-argument and the next argument is processed as if it followed a -j option-argument. This behavior is not suitable for inclusion in this standard as it does not meet the syntax guidelines. However, it is an allowed extension since following -j with an argument that does not begin with a digit would otherwise be a syntax error. At least one implementation of make uses -j 0 to mean "use a sensible value for the maximum number of targets that can be updated in parallel". If an implementation wants to add this feature, the standard developers suggest following this convention.
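The digit check described in this paragraph could look roughly like the following sketch. The function name parse_j and its return shape are illustrative assumptions, not taken from any implementation; it handles only -j and ignores every other option.

```python
from typing import List, Optional, Tuple


def parse_j(args: List[str]) -> Tuple[bool, Optional[int], List[str]]:
    """Return (saw_j, maxjobs, remaining_args); maxjobs is None when -j had no argument."""
    saw_j = False
    maxjobs: Optional[int] = None
    rest: List[str] = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "-j":
            saw_j = True
            nxt = args[i + 1] if i + 1 < len(args) else ""
            if nxt.isdigit():
                # Next argument starts with (and here consists of) digits:
                # treat it as the option-argument.
                maxjobs = int(nxt)
                i += 2
            else:
                # Extension: -j with no option-argument; the next argument
                # is processed normally on the following iteration.
                maxjobs = None
                i += 1
        elif arg.startswith("-j") and arg[2:].isdigit():
            # Adjacent form such as -j4.
            saw_j = True
            maxjobs = int(arg[2:])
            i += 1
        else:
            rest.append(arg)
            i += 1
    return saw_j, maxjobs, rest


print(parse_j(["-j", "4", "all"]))
print(parse_j(["-j", "all"]))
```

Because later occurrences overwrite earlier ones, the sketch also gives the last -j value precedence.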


At page 2988 line 99280, make RATIONALE, change:
The make utilities in most historical implementations process the prerequisites of a target in left-to-right order, and the makefile format requires this. It supports the standard idiom used in many makefiles that produce yacc programs; for example:
foo: y.tab.o lex.o main.o
    $(CC) $(CFLAGS) -o $@ t.tab.o lex.o main.o
In this example, if make chose any arbitrary order, the lex.o might not be made with the correct y.tab.h. Although there may be better ways to express this relationship, it is widely used historically. Implementations that desire to update prerequisites in parallel should require an explicit extension to make or the makefile format to accomplish it, as described previously.
to:
When targets are not being updated in parallel (see -j in OPTIONS), make processes the prerequisites of a target in left-to-right order. This supports a common idiom used in many makefiles that produce yacc programs; for example:
foo: y.tab.o lex.o main.o
    $(CC) $(CFLAGS) -o $@ y.tab.o lex.o main.o
In this example, if make chose any arbitrary order, the lex.o might not be made with the correct y.tab.h. Although there may be better ways to express this relationship (that would be needed if -j is specified), it is widely used historically.
(0005328)
psmith (developer)
2021-04-22 18:36

Thanks for all the work on this! I will make an effort to review this from a GNU make perspective and provide comments, if any, before Monday.
(0005329)
psmith (developer)
2021-04-23 17:40

> When this option is specified with maxjobs greater than 1 and a rule invokes make (perhaps via $(MAKE)),

I think that this isn't sufficient. It's not clear what make is intended to do here: how can make determine if a "rule invokes make"? Make doesn't have a shell parser so it cannot determine whether or not a command will actually invoke make or not.

In GNU make, we require the recipe to contain a reference to the MAKE variable; the command string must contain either $(MAKE) or ${MAKE} somewhere. Or, it must be prefixed with "+". If either of those are true then make assumes that the child may or will invoke make, and it prepares to allow that process to participate in the parallel domain. If neither of those are true then make assumes that the child will not invoke make and it won't configure that child to be available in the parallel domain.
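As a rough illustration of the test described above (not GNU make's actual code), the check can be written against the unexpanded recipe text. The function name and the set of prefix characters handled are assumptions made for this sketch.

```python
import re

# Matches a textual reference to the MAKE variable in either spelling.
_MAKE_REF = re.compile(r"\$\(MAKE\)|\$\{MAKE\}")


def may_invoke_submake(recipe_line: str) -> bool:
    """Heuristic: True if the line is '+'-prefixed or mentions $(MAKE) / ${MAKE}."""
    # Recipe prefixes (@, -, +) may appear in any order before the command.
    stripped = recipe_line.lstrip("@-+ \t")
    prefix = recipe_line[: len(recipe_line) - len(stripped)]
    return "+" in prefix or _MAKE_REF.search(recipe_line) is not None


print(may_invoke_submake("$(MAKE) -C subdir"))      # True
print(may_invoke_submake("+cd subdir && ./build"))  # True
print(may_invoke_submake("cc -c foo.c"))            # False
```

The check is purely textual, so a line such as echo ${MAKE} also matches even though it never starts a sub-make.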
(0005330)
psmith (developer)
2021-04-23 17:47
edited on: 2021-04-23 18:09

> shall between them update no more than maxjobs targets in parallel

It's not clear from the text whether this set of targets includes make itself. In a recursive make scenario, you will be running sub-makes to create targets. In GNU make we do not count make itself as one of the targets to be updated and I recommend this behavior here as well.

If you don't have this, then you can have a situation where you run with -jN and you invoke N instances of sub-makes, and now no sub-make can start a new target because all available jobs are used.

Not counting make is sensible because while all jobs are running, make itself is not actually running: it's waiting for one of the jobs to complete.

In GNU make, we basically say that each invocation of make is given one "free" jobserver token that it can use for starting one job at a time: if it ever wants to start more than one job it needs to get another jobserver token as expected. This way every instance of make can always make progress.

(0005331)
psmith (developer)
2021-04-23 18:28

I think the rest of the content here will work for GNU make. Thanks!
(0005345)
rhansen (manager)
2021-04-29 16:59
edited on: 2021-04-29 17:00

Re: Note: 0005330:
> It's not clear from the text whether this set of targets includes make itself. In a recursive make scenario, you will be running sub-makes to create targets. In GNU make we do not count make itself as one of the targets to be updated and I recommend this behavior here as well.

Out of curiosity, how is this implemented in GNU make? Does the act of establishing a connection to the parent make conceptually return a worker to the pool of available workers? Make doesn't parse shell code, so make can't know whether a rule spawns a submake even if the rule expands the MAKE macro (echo $(MAKE) expands the MAKE macro but it doesn't spawn a submake).

(0005346)
rhansen (manager)
2021-04-29 17:03
edited on: 2021-04-29 17:18

> Out of curiosity, how is this implemented in GNU make?

Never mind, you already answered this in Note: 0005330:
> In GNU make, we basically say that each invocation of make is given one "free" jobserver token that it can use for starting one job at a time: if it ever wants to start more than one job it needs to get another jobserver token as expected.

Apologies for not reading closely.

(0005347)
rhansen (manager)
2021-04-29 17:30
edited on: 2021-04-29 17:34

Re: Note: 0005330:
> In GNU make, we basically say that each invocation of make is given one "free" jobserver token that it can use for starting one job at a time: if it ever wants to start more than one job it needs to get another jobserver token as expected.

If I understand GNU make's behavior correctly, a user could exceed N parallel jobs if they write a rule that starts multiple submakes in parallel. For example:
all:
    i=0; subs=''; while [ $$i -lt 10 ]; do $(MAKE) -C subdir$$i & subs="$$subs $$!"; i=$$((i+1)); done; for sub in $$subs; do wait $$sub || exit 1; done

The above could have up to 11 jobs running in parallel if -j2 is passed to make. Correct? We'll want to make sure the standard permits that behavior, probably by saying that the behavior is unspecified if a rule invokes multiple submakes in parallel.
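One way to arrive at the figure of 11 is sketched below, under the "free token" model quoted from Note: 0005330 and without counting the rule that launches the sub-makes. The function name is hypothetical and the model is a simplification, not a statement about any implementation's internals.

```python
def worst_case_jobs(maxjobs: int, async_submakes: int) -> int:
    """Upper bound on concurrent jobs when one rule starts several sub-makes at once."""
    # Each sub-make may run one job on its own "free" token.
    free_jobs = async_submakes
    # The shared pool holds maxjobs - 1 further tokens.
    pooled_jobs = max(maxjobs - 1, 0)
    return free_jobs + pooled_jobs


print(worst_case_jobs(maxjobs=2, async_submakes=10))  # 11
print(worst_case_jobs(maxjobs=2, async_submakes=1))   # 2
```

With a single sub-make the bound equals maxjobs; it only exceeds maxjobs when several sub-makes run asynchronously.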

(0005348)
psmith (developer)
2021-04-29 20:19

Yes, that's correct: it is possible to circumvent this limit. It's unfortunate but alternative implementations which are completely reliable are proportionately more difficult, and I just figured if people want to write their makefiles that way and break the jobserver behavior, then, they get to keep both pieces :).
(0005353)
psmith (developer)
2021-05-14 14:12

I see this is still "Accepted". I hope it can be reopened and the issues above addressed before the text is applied. I don't think the current text is appropriate for being applied to the standard.
(0005354)
geoffclare (manager)
2021-05-14 14:30

Reopening.

We had left it as resolved in the expectation that we would just need to make a small wording change to the note containing the final accepted text. However, we have since realised that a more extensive rewrite is needed.
(0005355)
psmith (developer)
2021-05-14 14:35

Thanks Geoff! Please let me know if I can help. Cheers!
(0005362)
rhansen (manager)
2021-05-20 17:08

We think we have achieved consensus on a rewrite of the description of the -j option; see "attempt #3" on line 65 of https://posix.rhansen.org/p/2021-05-20. Feedback would be appreciated.
(0005364)
psmith (developer)
2021-05-22 19:19

Thanks for your work on this! I've reviewed the text in "attempt #3" and it seems acceptable to me as-is. I did have this thought which you may accept or ignore as you like :)

> When make is bringing a target with commands up-to-date

It might be more clear, although not meaningfully different, to say something like:

When make is bringing one or more targets with commands up-to-date
(0005496)
nick (manager)
2021-09-09 15:35
edited on: 2021-09-09 15:36

Updated changes after resolving 0001437

After page 2969 line 98472, make OPTIONS, add:
-j maxjobs Set the maximum number of targets that can be updated concurrently. If this option is specified multiple times, the last value of maxjobs specified shall take precedence. If this option is not specified, or if maxjobs is 1, only one target shall be updated at a time (no parallelization). If the value of maxjobs is non-positive, the behavior is unspecified. When maxjobs is greater than 1, make shall create a pool of up to maxjobs - 1 tokens. (Note that implementations are not required to create a pool of exactly maxjobs - 1 tokens. For example, an implementation could limit the pool size based on the number of processors available.) If the size of the token pool would be 0, make need not implement a token pool.
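The arithmetic in this paragraph can be sketched as follows. The function names and the optional cap parameter (standing in for any implementation-chosen limit, such as a processor count) are illustrative assumptions. Rejecting non-positive values is just one choice the sketch makes, since the text leaves that case unspecified.

```python
from typing import List, Optional


def effective_maxjobs(values: List[int]) -> int:
    """Resolve repeated -j options: the last value wins; no -j behaves like -j 1."""
    if not values:
        return 1
    last = values[-1]
    if last < 1:
        # Unspecified by the text above; this sketch chooses to reject it.
        raise ValueError("non-positive maxjobs")
    return last


def token_pool_size(maxjobs: int, cap: Optional[int] = None) -> int:
    """Pool of up to maxjobs - 1 tokens, optionally limited by the implementation."""
    size = maxjobs - 1
    if cap is not None:
        size = min(size, cap)
    return max(size, 0)


print(token_pool_size(effective_maxjobs([2, 8])))  # 7
print(token_pool_size(8, cap=3))                   # 3
print(token_pool_size(effective_maxjobs([])))      # 0
```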

When all of the following are true:

  • there is a target with commands that is not up-to-date

  • the target's prerequisites (if any) are up-to-date

  • make is not waiting to bring the target up-to-date (see .WAIT)

  • make is currently bringing a different target with commands up-to-date

  • make is not currently bringing maxjobs targets up-to-date in parallel

  • the special target .NOTPARALLEL is not specified

  • the token pool is not empty

then make may attempt to remove one token from the pool. If a token is successfully removed, it shall attempt to bring this target up-to-date in parallel, and after this processing completes shall return the token to the pool. When make is bringing a target without commands up-to-date, it need not remove a token from the pool.
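The conditions listed above can be read as a single predicate over make's state. The sketch below models that reading; the class and field names are hypothetical, and under the "as if" rule a real implementation need not be structured this way.

```python
from dataclasses import dataclass


@dataclass
class MakeState:
    running_jobs: int          # targets with commands currently being brought up-to-date
    maxjobs: int
    pool_tokens: int
    notparallel: bool = False  # .NOTPARALLEL was specified


@dataclass
class Target:
    has_commands: bool
    up_to_date: bool
    prerequisites_up_to_date: bool
    waiting: bool = False      # held back by .WAIT


def may_take_token(state: MakeState, target: Target) -> bool:
    """True when every listed condition holds for this target."""
    return (
        target.has_commands
        and not target.up_to_date
        and target.prerequisites_up_to_date
        and not target.waiting
        and state.running_jobs >= 1            # a different target is already in progress
        and state.running_jobs < state.maxjobs
        and not state.notparallel
        and state.pool_tokens > 0
    )


def start_parallel(state: MakeState, target: Target) -> bool:
    """Remove a token and start the target if permitted; report whether it started."""
    if not may_take_token(state, target):
        return False
    state.pool_tokens -= 1
    state.running_jobs += 1
    return True


def finish_parallel(state: MakeState) -> None:
    """Return the token once the target's processing completes."""
    state.running_jobs -= 1
    state.pool_tokens += 1


state = MakeState(running_jobs=1, maxjobs=2, pool_tokens=1)
target = Target(has_commands=True, up_to_date=False, prerequisites_up_to_date=True)
print(start_parallel(state, target), state.pool_tokens)  # True 0
print(start_parallel(state, target))                     # False
```

The first job of a make instance never reaches this path: with running_jobs at 0 the predicate is false, which matches a pool sized maxjobs - 1.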

If a rule invokes a sub-make either via the MAKE macro or via a command line that begins with '+', the sub-make is the same implementation as the make that invoked the sub-make, and the -j option is passed to the sub-make via the MAKEFLAGS environment variable with the same maxjobs value and is not overridden by a maxjobs value from another source (even if it has the same value), the sub-make shall use the same token pool as its invoking make rather than create a new token pool. Otherwise, it is unspecified whether the sub-make uses the same token pool as its invoking make or creates a new token pool. If a rule executes multiple sub-make processes asynchronously the behavior is unspecified.
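The sharing rule in this paragraph reduces to a conjunction of conditions, with every other case left unspecified. A minimal sketch, using hypothetical parameter names:

```python
from typing import Optional


def pool_sharing(
    invoked_via_make_or_plus: bool,
    same_implementation: bool,
    parent_maxjobs: int,
    makeflags_maxjobs: Optional[int],
    overridden_elsewhere: bool,
    multiple_async_submakes: bool = False,
) -> str:
    """Return 'shared' when the text requires a shared pool, else 'unspecified'."""
    if multiple_async_submakes:
        # A rule running several sub-makes asynchronously is unspecified outright.
        return "unspecified"
    required = (
        invoked_via_make_or_plus
        and same_implementation
        and makeflags_maxjobs == parent_maxjobs
        and not overridden_elsewhere
    )
    return "shared" if required else "unspecified"


print(pool_sharing(True, True, 4, 4, False))  # shared
print(pool_sharing(True, True, 4, 2, False))  # unspecified
```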


On page 2969 line 98476 change:
However, lines with a <plus-sign> ('+') prefix shall be executed.

to:
However, lines with a <plus-sign> ('+') prefix and lines that expand the MAKE macro shall be executed.


On page 2986 line 99178, delete the bullet item:
    
Specifying that command lines containing the strings "${MAKE}" and "$(MAKE)" are executed when the -n option is specified (GNU and System V).


On page 2987, delete lines 99229 - 99235 (the paragraph beginning "Early proposals ...").

On page 2990, after line 99387 add the following new paragraphs:
The standard specifies a way for portable applications to request parallel updating of targets with commands by using the -j maxjobs option. This feature is described in terms of a token pool initially containing up to maxjobs - 1 tokens. Note that this is not intended to prescribe a particular implementation design; the usual "as if" rule applies.

Implementations are permitted to silently limit the pool size for a few reasons, including:

  • Implementations that do not support parallelism can support the -j option by simply ignoring the option (other than passing it to sub-make invocations via the MAKEFLAGS environment variable). In effect, such an implementation silently restricts the size of the token pool to zero (and therefore need not create a token pool).

  • Some historical implementations dynamically limit the token pool size based on current system load to avoid overloading the system.

  • Implementations may want to limit the token pool size based on the number of processors available.

  • Implementations may want to limit the token pool size based on resource limits.

Limiting the pool size does not change the value of maxjobs that is passed to sub-make invocations via the MAKEFLAGS environment variable.

When a different maxjobs value is passed to a sub-make, some historical make implementations created a separate pool of tokens while other historical make implementations continued to obtain tokens from the invoking make but limited the number of tokens held at a time to the new value of maxjobs - 1. Both behaviors are believed to have merit in different situations: The former gives a sub-make complete control over the amount of parallelism while the latter allows the user to control overall system load. The standard permits either behavior.

The standard calls for a token pool of size maxjobs - 1, and for removal from that pool only for the second and subsequent tasks in a set of parallel tasks. This design was chosen because this is effectively what existing implementations do, and also because the token consumed by a parallel task that invokes a sub-make is effectively lent to the sub-make. Lending the token to the sub-make has the following advantages:

  • It prevents the sub-make from being completely idle due to token starvation, allowing it to always make some progress regardless of how many tokens other sub-make invocations have consumed.

  • It prevents token pool exhaustion caused by a long chain of sub-make invocations. If the token consumed by the invoking rule was not effectively lent to the sub-make, then the pool would be exhausted by a chain of sub-make invocations that is maxjobs long. Such a chain would never accomplish any work, and would thus never complete.
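The two advantages above can be checked with a toy model of a chain of nested sub-make invocations. It counts job slots only; the function name and the slot accounting are assumptions made for illustration.

```python
def deepest_make_can_work(maxjobs: int, chain_length: int, lend: bool) -> bool:
    """Can the innermost make in a chain of nested sub-makes still start a job?"""
    free_slots = maxjobs
    for _ in range(chain_length):
        if free_slots == 0:
            # No slot left even to start the rule that invokes the next sub-make.
            return False
        free_slots -= 1      # the rule invoking the sub-make occupies a slot
        if lend:
            free_slots += 1  # that slot is effectively lent to the sub-make
    return free_slots > 0


print(deepest_make_can_work(maxjobs=4, chain_length=4, lend=False))   # False
print(deepest_make_can_work(maxjobs=4, chain_length=3, lend=False))   # True
print(deepest_make_can_work(maxjobs=4, chain_length=100, lend=True))  # True
```

Without lending, a chain as long as maxjobs leaves the innermost make with nothing to run on; with lending, chain depth does not matter.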



When a rule invokes multiple sub-make processes asynchronously (for example by using an asynchronous list in the shell), some implementations allow each sub-make to execute at least one rule even though this would cause the total number of parallel rule executions across all make instances to exceed maxjobs (after discounting the rules that execute sub-make processes). This behavior may not be ideal, but it is easier to implement and is unlikely to cause problems in practice because applications typically do not have any rules that invoke multiple sub-make processes asynchronously. For this reason the behavior is unspecified if a rule executes multiple sub-make processes asynchronously.

When multiple sub-make processes are running in parallel there is no requirement placed on the ordering of output from these processes. Some implementations of make attempt to serialize output from each sub-make; others make no such attempt. If diagnostic messages from failed commands are intermixed, the usual way to deal with this is to repeat the make without -j (or with -j 1) so that intermixing will not occur.



- Issue History
Date Modified Username Field Change
2020-12-15 21:00 steffen New Issue
2020-12-15 21:00 steffen Name => steffen
2020-12-15 21:00 steffen Section => Vol. 3: Shell and Utilities, Issue 7, make
2020-12-15 21:00 steffen Page Number => 2969
2020-12-15 21:00 steffen Line Number => 98473
2021-03-22 21:27 psmith Note Added: 0005299
2021-04-22 15:34 rhansen Note Added: 0005327
2021-04-22 15:36 rhansen Interp Status => ---
2021-04-22 15:36 rhansen Final Accepted Text => Note: 0005327
2021-04-22 15:36 rhansen Status New => Resolved
2021-04-22 15:36 rhansen Resolution Open => Accepted As Marked
2021-04-22 15:36 rhansen Tag Attached: issue8
2021-04-22 18:36 psmith Note Added: 0005328
2021-04-23 17:40 psmith Note Added: 0005329
2021-04-23 17:47 psmith Note Added: 0005330
2021-04-23 18:09 psmith Note Edited: 0005330
2021-04-23 18:28 psmith Note Added: 0005331
2021-04-29 16:59 rhansen Note Added: 0005345
2021-04-29 17:00 rhansen Note Edited: 0005345
2021-04-29 17:00 rhansen Note Edited: 0005345
2021-04-29 17:03 rhansen Note Added: 0005346
2021-04-29 17:18 rhansen Note Edited: 0005346
2021-04-29 17:30 rhansen Note Added: 0005347
2021-04-29 17:34 rhansen Note Edited: 0005347
2021-04-29 20:19 psmith Note Added: 0005348
2021-05-14 14:12 psmith Note Added: 0005353
2021-05-14 14:30 geoffclare Note Added: 0005354
2021-05-14 14:30 geoffclare Status Resolved => Under Review
2021-05-14 14:30 geoffclare Resolution Accepted As Marked => Reopened
2021-05-14 14:35 psmith Note Added: 0005355
2021-05-20 17:08 rhansen Note Added: 0005362
2021-05-22 19:19 psmith Note Added: 0005364
2021-08-19 16:59 rhansen Relationship added related to 0001437
2021-09-09 15:35 nick Note Added: 0005496
2021-09-09 15:36 nick Note Edited: 0005496
2021-09-09 15:37 nick Final Accepted Text Note: 0005327 => Note: 0005496
2021-09-09 15:37 nick Status Under Review => Resolved
2021-09-09 15:37 nick Resolution Reopened => Accepted As Marked
2021-11-26 15:00 geoffclare Status Resolved => Applied
2023-04-03 09:27 geoffclare Relationship added related to 0001651
2023-04-03 09:27 geoffclare Relationship deleted related to 0001651
2023-04-03 09:28 geoffclare Relationship added related to 0001652
2023-04-03 14:44 geoffclare Relationship added related to 0001659
2023-04-06 08:44 geoffclare Relationship added related to 0001660
2023-04-21 16:37 geoffclare Relationship added related to 0001680
2024-06-11 09:08 agadmin Status Applied => Closed

