Austin Group Defect Tracker

Aardvark Mark III


ID 0000328
Category [1003.1(2008)/Issue 7] Shell and Utilities
Severity Objection
Type Enhancement Request
Date Submitted 2010-10-09 15:47
Last Update 2011-12-06 17:33
Reporter Love4Boobies
View Status public
Assigned To
Priority normal
Resolution Open
Status New
Name Bogdan Barbu
Organization
User Reference
Section c99 - compile standard C programs
Page Number 2488
Line Number 79531
Interp Status ---
Final Accepted Text
Summary 0000328: Auto-Dependency Generation
Description I think that standardizing auto-dependency generation would be a very helpful thing as portable source code is not the only issue software projects run into - build tools go with that. Auto-dependency generation is something any sane project uses one way or another.
Desired Action I propose using the -M switch, which would tell the C preprocessor to generate make-compatible output. This switch is used by several compilers (e.g., GCC) and doesn't break compatibility with any POSIX-compliant compilers as far as I know.
Tags No tags attached.

-  Notes
(0001010)
ajosey (manager)
2011-11-12 06:13

It was agreed at the 10 November 2011 teleconference that in principle this is acceptable and that a detailed proposal is needed to take this forward.
Action: Andrew to contact submitter.
(0001045)
dwheeler (reporter)
2011-11-24 15:20

Comment #1010 notes that a detailed proposal is needed to take this forward. How about this as a start:

In section "c99", page 2489, after line 79595 (between the description of the "-l" and "-O" options), add:

-M Instead of producing object or executable files, output rules suitable for make describing the dependencies of the main source file. For each given pathname (which is a source file), output one "make" rule with the pathname, a colon, a space, and then the pathnames of all included files (the prerequisites) including those from command line options and system headers. Prerequisites must be separated in a way suitable for make; see the section on make.

-M -M If -M is used twice, do not include as prerequisites any header files in system header directories (included directly or indirectly).


Rationale:

The "-M" option is widely implemented and used. For example, the GNU make "info" documentation section 4.14 ("Generating Prerequisites Automatically") states that "most modern C compilers can write these rules for you, by looking at the `#include' lines in the source files. Usually this is done with the `-M' option to the compiler...".

The gcc compiler includes "-MM" to not include system header directories. This is really useful, especially when distributing pre-generated files to others. However, POSIX generally avoids multi-character option names such as -MM, so I've proposed two -M instead. Other possibilities exist too.

This doesn't add the many variations of "-M" that are implemented. In particular, the gcc compiler has lots of other variations, but most of them can be implemented easily by post-processing the output using sed (etc.).
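
As a rough illustration of the post-processing idea, the sketch below drops system headers from a single rule of -M output, which is the effect the proposed "-M -M" (or gcc's -MM) is meant to have. It is written in Python rather than sed for readability. The function name and the SYSTEM_HEADER_DIRS list are hypothetical; a real compiler knows its own search directories.

```python
import posixpath

# Hypothetical list of system header directories (an assumption for this sketch).
SYSTEM_HEADER_DIRS = ("/usr/include",)


def strip_system_headers(rule, system_dirs=SYSTEM_HEADER_DIRS):
    """Drop prerequisites located under a system header directory.

    `rule` is a single-line make rule such as
    "foo.o: foo.c /usr/include/stdio.h". Pathnames containing spaces and
    backslash-newline continuations are not handled.
    """
    target, _, prereqs = rule.partition(":")
    kept = []
    for path in prereqs.split():
        normalized = posixpath.normpath(path)
        in_system_dir = any(
            normalized.startswith(d.rstrip("/") + "/") for d in system_dirs
        )
        if not in_system_dir:
            kept.append(path)
    return f"{target.strip()}: {' '.join(kept)}"
```

For example, strip_system_headers("foo.o: foo.c defs.h /usr/include/stdio.h") keeps only foo.c and defs.h as prerequisites.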


Issues:

Adding "-M" is a big help, but just adding "-M" is NOT enough to implement widely-used approaches to automated dependency management. For example, the GNU make implementation document says: "we recommend [having one generated] makefile corresponding to each source file. For each source file `NAME.c' there is a makefile `NAME.d'... That way only the source files that have changed need to be rescanned." They then recommend using this:
     %.d: %.c
             @set -e; rm -f $@; \
              $(CC) -M $(CPPFLAGS) $< > $@.$$$$; \
              sed 's,\($*\)\.o[ :]*,\1.o $@ : ,g' < $@.$$$$ > $@; \
              rm -f $@.$$$$
     sources = foo.c bar.c
     include $(sources:.c=.d)

For this approach to work with POSIX make, the "-M" flag would need to be added to c99 (as proposed). In addition, for POSIX to support this common approach, it would also require automatic remaking of makefiles (bug #332), pattern rules (bug #513), and adding the ability to make "include" to include multiple pathnames on one line (no bug number at this time).
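
To make the sed step in the quoted GNU make recipe easier to follow, here is a small Python sketch of the same rewrite: it adds the .d file as a second target so that the dependency file itself is regenerated when any prerequisite changes. The function name is hypothetical, and unlike the sed command the sketch escapes regular-expression metacharacters in the stem.

```python
import re


def add_depfile_target(rule, stem, depfile):
    r"""Mimic sed 's,\($*\)\.o[ :]*,\1.o $@ : ,g' for one line of -M output.

    `stem` plays the role of make's $* and `depfile` the role of $@.
    """
    pattern = re.compile(r"(" + re.escape(stem) + r")\.o[ :]*")
    # A function is used as the replacement so that backslashes in
    # `depfile` are never interpreted as group references.
    return pattern.sub(lambda m: f"{m.group(1)}.o {depfile} : ", rule)
```

With stem "foo" and depfile "foo.d", the rule "foo.o: foo.c defs.h" becomes "foo.o foo.d : foo.c defs.h".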

Other articles you might consider include "Recursive Make Considered Harmful" by Peter Miller (http://miller.emu.id.au/pmiller/books/rmch/ and http://aegis.sourceforge.net/auug97.pdf) and "Advanced Auto-Dependency Generation" by Paul D. Smith (http://make.paulandlesley.org/autodep.html).
(0001046)
dwheeler (reporter)
2011-11-24 17:56

Here's another and hopefully better attempt to define -M. In particular, this one talks about how to handle relative pathnames. Please replace the previous proposal's definition of "-M" in comment 1045 with this:

-M Instead of producing object or executable files, output rules suitable for make describing the dependencies. For each given source file pathname, the following output is produced. First, the generated object pathname is output, consisting of the name of the source file with any suffix replaced with the object file suffix ".o" and with any leading directory parts removed. This is followed by a colon, space, and a list of the pathnames of the prerequisites. The first prerequisite shall be the given source file pathname, followed by all other prerequisites (files included directly and indirectly), including those from command line options and system headers. If #include "..." directives use relative pathnames, then the prerequisite must also be relative, but it must be relative to the current directory. Prerequisites shall be separated in a way suitable for make; see the section on make. No make commands are given. It is an error if no source file pathnames are given, or if the given pathname cannot be read.
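
The output shape described in this note can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, prerequisites are separated by single spaces (the text only says "in a way suitable for make"), and the caller is assumed to have already made the included pathnames relative to the current directory where required.

```python
import posixpath


def format_dependency_rule(source, included):
    """Build one make rule in the shape described in note 1046.

    `source` is the source file pathname as given on the command line;
    `included` lists the other prerequisites (files included directly or
    indirectly), in the order they should appear.
    """
    base = posixpath.basename(source)         # remove leading directory parts
    stem, _suffix = posixpath.splitext(base)  # drop any suffix
    target = stem + ".o"                      # object file suffix ".o"
    prerequisites = [source, *included]       # the source file comes first
    return f"{target}: {' '.join(prerequisites)}"
```

For example, format_dependency_rule("src/foo.c", ["src/foo.h"]) returns "foo.o: src/foo.c src/foo.h". No make commands follow the rule, matching the proposal.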
(0001047)
brkorb (reporter)
2011-11-24 19:17

"relative" as in "../" relative, since the file names are really nearly always relative:

#include <sys/types.h>

referring to a file relative to /usr/include. The idea is really good, but tiny cracks get exploited...
(0001048)
Love4Boobies (reporter)
2011-11-24 19:22
edited on: 2011-11-25 12:02

Alternatively, we could use another mechanism that might be more elegant (because it combines -MM, -MF, and, optionally, -MT into a single solution). Apparently, SunPro introduced the SUNPRO_DEPENDENCIES environment variable back in 1986. In turn, GCC introduced DEPENDENCIES_OUTPUT, which is similar except it also ignores system headers from the dependency list. The latter seems perfect. Less importantly, -M breaks compatibility with MKS.

Add the following before the description of LANG, in the ENVIRONMENT VARIABLES section:

DEPENDENCIES_OUTPUT

If set, this variable names the makefile to which the dependencies of each source file (the files named in its #include directives, with their relative paths) shall be written as make rules. The value may optionally also specify the target name; if the target is not specified, $(basename pathname .c).o shall be used instead. If the makefile already exists, it shall be appended to.

Implementations must be able to output dependencies for at least 15 levels of nesting. The headers described in this standard will be ignored. Additionally, implementations may choose to ignore other headers as well.
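
A minimal sketch of how a compiler driver might interpret this variable is shown below. It assumes GCC's convention for DEPENDENCIES_OUTPUT, where the value is a file name optionally followed by whitespace and a target name; the proposal text above does not spell out the syntax, so that part is an assumption, and both function names are hypothetical.

```python
import posixpath


def parse_dependencies_output(value, source):
    """Split a DEPENDENCIES_OUTPUT value into (makefile, target).

    Assumes the "file target" form used by GCC. When no target is given,
    the default from the proposal, $(basename pathname .c).o, is used.
    """
    parts = value.split(None, 1)
    if not parts:
        raise ValueError("DEPENDENCIES_OUTPUT is set but empty")
    makefile = parts[0]
    if len(parts) == 2:
        target = parts[1].strip()
    else:
        base = posixpath.basename(source)
        # Like basename(1), keep the name if it consists only of the suffix.
        if base.endswith(".c") and len(base) > 2:
            base = base[:-2]
        target = base + ".o"
    return makefile, target


def append_rule(value, source, prerequisites):
    """Append one rule to the makefile named by the variable's value."""
    makefile, target = parse_dependencies_output(value, source)
    with open(makefile, "a") as out:  # append to an existing file, never truncate
        out.write(f"{target}: {' '.join(prerequisites)}\n")
```

Appending rather than truncating is what lets several compiler invocations in one build accumulate their rules in the same makefile.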

(0001049)
dwheeler (reporter)
2011-11-24 21:40

Regarding comment 1047:
I specified (in comment #1046) that relative filenames are ONLY required for #include "...", so the problem example cited (#include <sys/types.h>) wouldn't apply (and thus wouldn't be a problem).
(0001050)
dwheeler (reporter)
2011-11-24 23:13

Regarding 1048:
I don't think it's a disaster that some compilers use an "-M" flag, e.g., MKS (http://www.mkssoftware.com/docs/man1/cc.1.asp). Their compiler name will be different from "c99", which is probably a wrapper for a local compiler.

It'd certainly be possible to set an environment variable. But I believe relatively few preprocessors or compilers implement that (I only know of one, gcc, that supports the environment variable DEPENDENCIES_OUTPUT). The "-M" flag also requires preprocessor support, but I believe it has more widespread support.

A different approach would be to standardize the command "makedepend" or some similar command. This command *just* does dependency analysis for programs that use the C pre-processor, e.g., C/C++/Objective-C. Makedepend is part of X:
  http://www.x.org/archive/X11R6.8.1/doc/makedepend.1.html

An advantage of standardizing makedepend is that implementors wouldn't need to modify their compilers at all. Instead, they could implement *just* makedepend as a separate command. People could even use the X implementation (MIT license), if it met the standard. X's makedepend program *does* make a big assumption, namely, that all include files' dependencies are the same for a given execution of makedepend. This assumption is almost always true in practice, and it provides a major performance boost. In cases where it's not true, the solution is usually easy: perform separate executions for the separate files. The spec would need to include permission to make this assumption, if it were included. Some makedepend use cases require some sed commands to make it useful; adding a few options to make it easier to use might be a big win for everyone.
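
The performance benefit of that assumption can be illustrated with a small Python sketch (not makedepend's actual code): each header's dependency list is computed once and cached, which is only valid if the list does not depend on which file included the header. The names are hypothetical, file contents are passed in as a dictionary instead of being read from disk, only the #include "..." form is recognized, and preprocessor conditionals are not evaluated.

```python
import re

# Matches lines of the form: #include "name"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)


def collect_includes(name, sources, cache):
    """Return the files `name` includes, directly or indirectly.

    `sources` maps a file name to its text (a stand-in for the file system).
    `cache` remembers the result for each file, so a header shared by several
    source files is scanned only once per run.
    """
    if name in cache:
        return cache[name]
    cache[name] = []  # placeholder so that include cycles terminate
    found = []
    for header in INCLUDE_RE.findall(sources.get(name, "")):
        for dep in [header, *collect_includes(header, sources, cache)]:
            if dep not in found:
                found.append(dep)
    cache[name] = found
    return found
```

Passing the same cache dictionary for every source file in a run reuses the cached header results; using a fresh dictionary per file corresponds to the "separate executions" workaround mentioned above.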

Comments?

BTW, other pages about automated dependencies are:
* "Tips and Tricks From the Automatic Dependency Generation Masters" http://www.cmcrossroads.com/ask-mr-make/7172-tips-and-tricks-from-the-automatic-dependency-generation-masters
* "Advanced Auto-Dependency Generation" http://make.paulandlesley.org/autodep.html

A general challenge with automated dependency generation is that (1) lots of people need it, (2) the standard fails to provide it, and therefore (3) lots of people do it different ways. But since it's a widespread need, a standard mechanism to do it would be appropriate.
(0001051)
Love4Boobies (reporter)
2011-11-25 02:08
edited on: 2011-11-25 02:14

The original implementation doesn't seem to ignore system headers (which is also true for the -M you proposed)---I'm not sure that's a good thing for the following reasons:

1. They are not actually supposed to change. If you think this should be allowed in order to help transition projects to newer environment implementations (e.g., the underlying OS gets updated from SUSv3 to SUSv4, or some TC is applied), consider the following:

a) POSIX and C both care a great deal about backwards compatibility. In most scenarios, they simply add new stuff. That by itself isn't a problem.

b) Given a scenario where something is so broken that fixing it requires breaking compatibility with older versions of POSIX and/or C, the maintainer would need to fix the code that relied on the problem anyway, thus preparing the build system to rebuild the necessary targets.

c) Developers can always use _POSIX_VERSION and _POSIX2_VERSION (or __STDC_VERSION__).

2. Consider an implementation of <stdio.h> which also includes <stdio_wide.h>. If we generate the dependencies, the latter would appear in the dependency list, yielding unportable dependency files. I see no reason why portable dependency files should not be allowed as long as two environments implement the same ABI; it might be particularly useful in a distributed environment.

It is my view that we are currently left with three options:

1. A compiler switch other than -M because this one specifies system headers.
2. Use GCC's DEPENDENCIES_OUTPUT.
3. Standardize makedepend, which would be very flexible due to the fact that implementations could easily extend it for other languages; however, the POSIX version would need to be incompatible with the original implementation in that it would ignore system headers.

Last but not least, everywhere I said "system headers" I actually meant "headers defined by POSIX, as well as unspecified ones (e.g., <curses.h>)."

(0001052)
dwheeler (reporter)
2011-11-25 07:10

System include files absolutely *can* change. Even if the public interface doesn't change, the way they are implemented can change in an update, and some of this information may be stored in an include file (it often is). A program that had some object files compiled to the old include file, and some to the new, might not work. And if the old libraries aren't kept, you almost certainly need to recompile any file that depends on them. Some Linux distros, like Fedora, update system include files surprisingly often.

I see two different use cases:
1. Define a standard way to *regenerate* complete dependencies before make uses them to determine what to do. The dependency information would then necessarily include system-specific information, but that's okay; it'll get regenerated before use. The advantage of this approach is that it gives "make" complete and accurate dependency information, leading to more accurate results. This is what makedepend and -M currently do.
2. Define a standard way to report the dependencies inside the project, presuming that it will use "whatever the standard libraries are". This is what -MD and friends do. This might be especially useful if you want to distribute a source package to others, or you just might prefer this.

Since there are two different (related) use cases, I think there should be a way to get either one; option switches are the usual way to do that.

So if it's a compiler switch, a "-M" and a "-MF" could both be standardized. For the compiler, there's the challenge of option letters; typically POSIX doesn't have option names like -MF if they can avoid it, and if you're stuck with one character, it's almost impossible to avoid stepping on someone. Technically, POSIX only defines the flags for "c99", but it'd be nice if the flags were the same as the underlying tool if it's used. It's kind of ugly.

I'm wary of DEPENDENCIES_OUTPUT. It's not widely implemented by other compilers, and frankly, it's not clear there's a lot of experience with it.

I find option #3 (makedepend) compelling. There's lots of experience with this, you don't have to muck with anyone's compiler, and it provides a separate option flag namespace.

If we built on makedepend, we could easily add new switches like one that meant "ignore system headers". There are several common situations that people have to use sed to fix; it'd be nice if common situations were "just built in" too instead of having to use sed to repair the output. If we rename the command slightly, then it wouldn't interfere with the existing makedepend (so that they don't HAVE to implement anything), and we could make the defaults nicer too. I think standards bodies should avoid inventing much, but building on an existing base and just setting better defaults is okay (I think), as long as they don't interfere with existing use and they're an obvious extension from current practice.
(0001076)
eblake (manager)
2011-12-06 16:53
edited on: 2011-12-06 17:05

Discussion on the 6 Dec 2011 teleconference suggested that any solution to this bug needs to meet the following guideline:

In order to allow a makefile to be shared among multiple platforms, it must be possible to conditionally include automatic dependencies as nested files, where the condition on which nested files to include can be determined by the platform. Automatic dependencies should be designed for nested inclusion, rather than directly modifying the makefile with the compilation rules.

(0001077)
eblake (manager)
2011-12-06 17:25

The ability to track included files from an input file to a language interpreter is useful enough that we may want to also standardize a way to do this for awk, lex, m4, sh, and yacc - that is, any standardized interpreters that provide an include operation. Conversely, these languages do not have the automatic suffix dependencies already built into make that c99 does.

For example, this link gives a patch (as yet unapplied) that proposes -M for GNU m4 to output dependency information:
https://lists.gnu.org/archive/html/m4-patches/2011-02/msg00005.html
(0001078)
Love4Boobies (reporter)
2011-12-06 17:33
edited on: 2011-12-06 18:48

Regarding comment 1077, I initially hinted at this idea in comment 1048 but later edited it because I imagined a different bug report might be more appropriate once the mechanism for c99 was agreed upon.

However, this *is* one of the reasons I preferred DEPENDENCIES_OUTPUT over GCC's -M? family of switches. But makedepend would work equally well, if not better.


- Issue History
Date Modified Username Field Change
2010-10-09 15:47 Love4Boobies New Issue
2010-10-09 15:47 Love4Boobies Name => Bogdan Barbu
2010-10-09 15:47 Love4Boobies URL => http://www.opengroup.org/onlinepubs/9699919799/utilities/c99.html
2010-10-09 15:47 Love4Boobies Section => c99 - compile standard C programs
2010-10-14 16:23 msbrown Project Online Pubs => 1003.1(2008)/Issue 7
2010-10-14 16:25 msbrown Page Number => 2488
2010-10-14 16:25 msbrown Line Number => 79531
2010-10-14 16:25 msbrown Interp Status => ---
2010-10-14 16:25 msbrown Type Omission => Enhancement Request
2011-04-19 01:38 msbrown Category Shell & Utilities => Shell and Utilities
2011-11-12 06:13 ajosey Note Added: 0001010
2011-11-24 15:20 dwheeler Note Added: 0001045
2011-11-24 17:34 dwheeler Issue Monitored: dwheeler
2011-11-24 17:56 dwheeler Note Added: 0001046
2011-11-24 19:17 brkorb Note Added: 0001047
2011-11-24 19:22 Love4Boobies Note Added: 0001048
2011-11-24 21:40 dwheeler Note Added: 0001049
2011-11-24 21:58 Love4Boobies Note Edited: 0001048
2011-11-24 23:13 dwheeler Note Added: 0001050
2011-11-25 02:08 Love4Boobies Note Added: 0001051
2011-11-25 02:10 Love4Boobies Note Edited: 0001051
2011-11-25 02:14 Love4Boobies Note Edited: 0001051
2011-11-25 07:10 dwheeler Note Added: 0001052
2011-11-25 12:02 Love4Boobies Note Edited: 0001048
2011-12-06 16:53 eblake Note Added: 0001076
2011-12-06 17:05 eblake Note Edited: 0001076
2011-12-06 17:25 eblake Note Added: 0001077
2011-12-06 17:33 Love4Boobies Note Added: 0001078
2011-12-06 18:48 Love4Boobies Note Edited: 0001078

