0001746: fuser output format clarification - Austin Group Defect Tracker

Notes
(0006330) kre (reporter) 2023-06-13 20:20	Why the change from %d to %1d? The default for %d is %1d according to XBD 5. Putting the '1' there just causes readers to wonder why. Unless explicitly given as %0d, %d (and all the related ones) always outputs at least one character.
You also omitted the "for each process using that file" from the final sentence. But the STDOUT section still says:
    The fuser utility shall write the process ID for each process using each file given
So there can clearly be more than one pid printed for one file, and hence possibly more than one user. It isn't clear if the multiple user names/uids should each appear in their own set of parentheses, or if one set should surround all of them, and in the latter case if any separation is supplied. It also isn't clear, as the -u option mentions nothing about what happens with multiple processes, whether each user name (or uid) is written just once, or once for each pid.
There is nothing, in the current, or proposed new, text which says there should be any correlation between the order in which the pids are printed, and the order the user names (or uids) are printed, nor for that matter, the order in which any 'c' or 'r' characters are written, nor whether there is intended to be any way to relate one of those to a pid, particularly if an implementation chooses not to output other characters for other uses of the file, nor what should be done if the same file is open by the same process for several different uses (we may have the root directory as our current directory (both 'r' and 'c') and also have it open for reading to obtain a directory listing (like "cd /; ls" when the "fuser /" command is given, and that ls process is located as one user of that directory)).
I can't test any of this: NetBSD has no fuser command - and given the description of it, I'm not surprised, it looks foul to use (we do have a utility which provides similar information in a totally different format).
I'd be in favour of obsoleting this trash... If not, I think its description needs a complete rewrite, something I cannot assist with, as I have no idea at all what it really does (I can see what it is intended for, just not the details of how it really operates).

(0006331) kre (reporter) 2023-06-13 20:44	Also, there doesn't really need to be a space before the first pid to allow it to be separated from the pathname - the latter is followed by a ':' according to the spec. And while a ':' may be part of the pathname, so might a space (and in both cases, that might be the trailing character of the pathname).
This might not be an issue, if the pathname written is just the file arg as given to the command - those would usually be able to be compared with the output - particularly if the output were constrained to be in the same order as the file args were present (but nothing actually says that it is). But that's also not what is output (apparently), rather it is "The pathname of each named file" - but there's no clue given as to what that really means. Is it intended to run XSH/realpath on each file name given? If so, what is done if that fails? Any other way of obtaining a pathname of the file is just as likely to fail of course.
It is also not clear what (for the -c option)
    The file is treated as a mount point
actually means. Is it saying that the file must be a mount point (and what is "the file"? There can be many of them, unless a -c only applies to the file name immediately following, as if it were "-c file" as the option) and it would be an error if "the" file is not a mount point, or does it mean to treat the file as if it were a mount point, and then list all open files that would be in the filesystem it would designate (everything in the hierarchy rooted at that point) if it actually were a mount point? Is this intended to work if "file" does not name a directory?
The more I look at the spec of this command, the more I think it useless (as a specification), and the command as specified, almost useless as well.
How is anything supposed to distinguish diagnostic messages that may be written to stderr, from the actual output that also goes there? Are those even expected to start a new line, or do they just appear intermixed with whatever output has already been written to stderr for that file?
Enough ... if I keep re-reading that "spec" I'll be here all night!

(0006332) geoffclare (manager) 2023-06-15 09:35	Re Note: 0006330
> Why the change from %d to %1d ? The default for %d is %1d according to XBD 5.
Read the description of the d conversion more carefully, particularly the last sentence. The diff, stty, and ulimit pages already use %1d for the same reason I'm proposing to use it here.
> You also omitted the "for each process using that file" from the final sentence.
Oops, thanks for spotting that. I removed it because it's in the wrong place, but must have got distracted while I was working out how/where to reintroduce it.
The rest of your comment results from this text being out of place. With -u the output looks similar to this on the four implementations I tested:
    /: 1805cr(root) 1806cr(root) 1807cr(root)
and that is what I was trying to make the proposed new STDERR text match. Note that the paragraph being changed is not stand-alone; it follows a bullet list which describes each component written to standard error, so it just needs to make the ordering between what's written to standard output and what's written to standard error clear.
Re Note: 0006331
> Also, there doesn't really need to be a space before the first pid
True, but all the implementations I tested have one or more blanks there and I'm sure the original intention was to describe the existing practice.
> "The pathname of each named file" - but there's no clue given as to what that really means, is it intended to run XSH/realpath on each file name given?
Of the four implementations I tested, GNU coreutils uses realpath() but the others just write the argument as-is.
> It is also not clear what (for the -c option)
> The file is treated as a mount point
> actually means.
Currently I think the standard requires an error if the specified file is not a mount point (via the default CONSEQUENCES OF ERRORS requirements). However, only two of the four implementations I tested do that. So I think we should say the behaviour is unspecified if it's not a mount point.
> How is anything supposed to distinguish diagnostic messages that may be written to stderr, from the actual output that also goes there?
The format of diagnostic messages is unspecified, so they are always expected to be interpreted by a human, not a "thing". Distinguishing them doesn't seem hard for a human:
    $ fuser foo . bar
    foo: fuser: No such file or directory
    .: 15164c 2545c
    bar: fuser: No such file or directory
    $ fuser foo . bar > /dev/null
    foo: fuser: No such file or directory
    .: cc
    bar: fuser: No such file or directory
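As an illustration of the merged output shape quoted in the note above, here is a minimal Python sketch that splits one such line into its parts. It assumes stdout and stderr have been captured into a single stream and that the line looks like the quoted `/: 1805cr(root) ...` example; the name `parse_fuser_line` is hypothetical, and the sketch splits at the first ':' and so does not handle the colon-in-pathname ambiguity raised in Note 0006331.

```python
import re

# One entry: a pid, optional alphabetic use characters, optional "(user)".
ENTRY = re.compile(r"(\d+)([A-Za-z]*)(?:\(([^)]*)\))?")

def parse_fuser_line(line):
    """Split a merged stdout+stderr line into (pathname, entries).

    Assumption: the pathname contains no ':' of its own, so the first
    colon ends the pathname. Each entry is (pid, use_chars, user_or_None).
    """
    path, _, rest = line.partition(":")
    entries = []
    for pid, uses, user in ENTRY.findall(rest):
        entries.append((int(pid), uses, user or None))
    return path, entries

if __name__ == "__main__":
    print(parse_fuser_line("/: 1805cr(root) 1806cr(root) 1807cr(root)"))
    print(parse_fuser_line(".: 15164c 2545c"))
```

A diagnostic line such as `foo: fuser: No such file or directory` yields an empty entry list here only because it happens to contain no digits; a parser like this cannot reliably tell diagnostics from data.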

(0006334) kre (reporter) 2023-06-15 13:58	Re Note: 0006332
On %1d OK ... normally I wouldn't like the spec over specifying something like that, but here I can see that avoiding trailing space might actually be important (but also perhaps not, see below).
Good luck with getting the rest of it right.
Not that I care a lot what this ends up saying, but I'd start by scrapping the first two paragraphs of the Description. In order to get the output format that Note: 0006332 shows - which is one possibility I considered for what it might look like, but based on the current wording, about the least plausible possibility I could come up with - separating the stdout/stderr outputs that way doesn't work. To do justice to this, the order of the output needs to be the primary factor in the description - which part goes to stdout and which to stderr is a more minor consideration. So I'd write it something like
    For each file given, in order, fuser shall write one line of output, some of it to standard output, and the rest to standard error. The line shall consist of the file argument, or of a pathname that refers to that file, followed immediately by a <colon> character (':'). ("and then a <space> character" - perhaps - see below). That shall be written to standard error.
    Then for each process using that file, either with the file open, or as its current, or root directory, or otherwise, fuser shall write the process id of that process to standard output, in the format given below. That shall be followed by a sequence of zero or more alphabetic characters, each of which indicates a type of use of the file by that process. The characters are specified below, and are written to standard error. Then, if the -u option was given, the name or user-id of the (effective/real/both/saved/all?) user associated with the process shall immediately follow, in the format specified below, and shall be written to standard error.
    The order in which the processes are listed is unspecified (is it, or are they sorted in some fashion?) but each process using the file shall be listed only once, regardless of how many uses it is making of the file (assuming that is true: an implementation might work through the kernel's open file list, then for each which qualifies as representing the file in question, find all processes using that file. In that case, a process which has opened the file twice (separate open() calls, rather than open() dup()) might be found, and listed, twice).
    When this information has been written for each process currently accessing the file, a newline character shall be written to standard error, no newline is written to standard output (?).
And then continue with the rest of the specification, which should just concentrate on the specific format details, and exactly what "using" the file means in the case where the aim is to find all processes using any file on a filesystem or block device.
Also note that some of the above is still guesswork, even with your example. Nothing currently says whether "The user name associated with each process ID" means the real userid, effective userid, or saved userid (or perhaps 2 or all 3 of them, if they're different, and if more than one, which order they appear in, and how that is formatted - I'm guessing probably just one of them, but which one?)
And from the example you gave:
    $ fuser foo . bar > /dev/null
    foo: fuser: No such file or directory
    .: cc
    bar: fuser: No such file or directory
ignoring the errors for the minute, I currently see nothing which justifies that ' ' between the ':' and the first 'c', unless that is to be "Implementations may write other alphabetic characters to indicate other uses of files.", with a rather loose interpretation of what it means to be an alphabetic character. If leading spaces are permitted before the "use type" characters, then there's no longer any need for the '1' in %1d as there's no way to tell the difference between white space added after the pid, and the white space inserted before the first 'c' in this case (and since the second 'c' relates to another process, there could be more white space before it as well).
On the other hand, maybe the format of the pathname output should say "The pathname of each named file is written followed immediately by a <colon> followed by a <space>." and that space comes from there, otherwise where does the space between the first ':' and 'fuser' in the error messages come from, unless you're assuming the diagnostic messages are starting with a space (which seems unlikely, though obviously not forbidden).
wrt:
> The format of diagnostic messages is unspecified, so they are always expected to be interpreted by a human, not a "thing".
Of course, that wasn't the point.
> Distinguishing them doesn't seem hard for a human:
No, but the (combined) stdout+stderr stream must be intended to be processed by "thing"s - or we wouldn't care at all about the precise details of the format. That is, if you expect common usage to be
    kill -s KILL $(fuser -c /path/to/mountpoint 2>/dev/null)
which is the kind of thing I am expecting this is used for, then the format of stderr really doesn't matter. Without the pids (and particularly when -u is not used) the data on stderr seems completely useless.
The point was more to how the "thing" is supposed to process that combined stdout/stderr stream in the more general case of errors beyond the simple ones you showed. E.g., nothing in the spec says that if it fails to map a uid to a user name, it cannot write a diagnostic about that failure (which might be because of an I/O error reading the mapping database (aka /etc/passwd or similar)). That's likely to appear at a rather less appealing point in the output line than the examples given.
Similarly, if realpath() is being used, and fails, there might be a diagnostic from that before the pathname is output. Also note that the volume of output might be large, so waiting for fuser to exit, to get its exit status, is quite likely not practical. Also before I forget, the -c option specification currently says: the utility shall report on any files open in the file system which, if it means what it says, would not include any use of directories which are some process's current (or even root) directory (not because directories aren't files, they are, but because the cwd is not normally open by the process, or often, anything). It might want to say "in use" instead of "open". I also have no idea what the output is supposed to look like when the -c option is given - is there just one long line of all of the pids of any process with anything open (or probably a cwd or root, or other use) [other uses include being mapped as shared text segment, or part thereof]. Or does: "shall report on any files open" mean that it should treat any files open in the file system named as if it were a file arg to the command, and generate an output line for each of them ? The same applies to "fuser /dev/block-dev" type queries, intended to find anything open on the device (which is really the same operation as -c, except specifying the device from which the mount-point was mounted, rather than the mount-point itself - block devices that aren't mounted don't typically contain any files that can be in use.) And once again, I will stop, before I spend all night on this utter garbage utility, which really isn't worth anyone's effort. Good luck coming up with a spec that actually describes how this thing works. 
Oh, you're also definitely going to need an APPLICATION USAGE section for this one (not just the current "None"), if for no other reason than to point out that the system is dynamic: processes are coming and going, and those processes are opening and closing files, all while fuser is trying to work out what is going on. By the time fuser prints something, there can't be any expectation that the something being printed is still valid. For example, in the "kill -s KILL" example I gave above, it is entirely possible that it will (attempt to) kill a process that isn't using anything under the mountpoint given (some now dead process was, and the pid has been reused while fuser has been running, before kill gets to see it). Application writers need to be aware of the limitations of this nonsense.
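The stdout-only usage pattern discussed in this note (stderr discarded, pids consumed by another tool) can be sketched in Python as below. This is an assumption-laden illustration, not a recommended tool: the helper names `pids_from_stdout` and `snapshot_pids` are hypothetical, and the sketch assumes stdout carries only blank-separated decimal pids. As the note points out, the result is only a snapshot, so any pid in it may already have exited or been reused by the time it is acted on.

```python
import subprocess

def pids_from_stdout(text):
    """Parse blank-separated decimal pids, as written to standard output."""
    return [int(tok) for tok in text.split()]

def snapshot_pids(mount_point):
    """Run `fuser -c` on mount_point with stderr discarded.

    Hypothetical wrapper: the returned pids describe a past instant only.
    The exit status is deliberately not checked in this sketch.
    """
    result = subprocess.run(
        ["fuser", "-c", mount_point],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
    )
    return pids_from_stdout(result.stdout)

if __name__ == "__main__":
    # Parsing only; no fuser binary is needed for this line.
    print(pids_from_stdout(" 15164 2545"))
```

Keeping the parser separate from the wrapper means it can be exercised on systems that, like NetBSD in this thread, have no fuser command.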

(0006335) kre (reporter) 2023-06-15 14:36	Oh, one more weird possible requirement, in the case of fuser /dev/block-device if the block device doesn't contain a file system, but instead contains swap space, is fuser intended to report on any processes which have pages swapped out to that device? And in that line, what of systems that allow swapping/paging to files in the filesystem, what's reported about one of those if it is named on the command line? Typically no process will have that file open (though the kernel will) so there would be no process ids to report - but it is still an active file on the filesystem (preventing umount(8) from working).

(0006341) geoffclare (manager) 2023-06-20 15:56	New proposal that (I hope) addresses all the points raised so far...
On page 2816 line 92653 section fuser (NAME), change:
    list process IDs of all processes that have one or more files open
to:
    list process IDs of all processes that are using one or more named files
On page 2816 line 92657 section fuser (DESCRIPTION), change:
    The fuser utility shall write to standard output the process IDs of processes running on the local system that have one or more named files open. For block special devices, all processes using any file on that device are listed.
    The fuser utility shall write to standard error additional information about the named files indicating how the file is being used.
    Any output for processes running on remote systems that have a named file open is unspecified.
    A user may need appropriate privileges to invoke the fuser utility.
to:
    For each file operand, in order, fuser shall write one line of output, some of it to standard output, and the rest to standard error, giving information about processes running on the local system that are using the file. A process shall be considered to be using a file if it has at least one open file descriptor associated with the file or if the file is a directory that is the current working directory or the root directory for the process, and may be considered to be using a file for other implementation-dependent reasons.
    If file names a block special device that contains a mounted file system, and the -f option is not specified, any processes using any file on that mounted file system and any processes that are using the device file itself shall be listed.
    Any output for processes running on remote systems that are using a named file is unspecified.
    A user may need appropriate privileges to invoke the fuser utility.
    When standard output and standard error are directed to the same file, the output for each file operand shall be interleaved so that it is written to the file in the following order:
    1. On standard error, a pathname for the file, immediately followed by a <colon> and zero or more <blank> characters. The pathname shall be either the file operand (unaltered) or the pathname that would result from a successful call to the realpath() function, defined in System Interfaces volume of POSIX.1-202x, with the file operand as its file_name argument.
    2. For each process using the file:
        a. On standard output, the process ID in the format:
            " %1d", <process ID>
        b. On standard error, information about the file's use by the process, in the following format:
            "%s", <use chars>
           if the -u option is not specified, or in the following format:
            "%s(%s)", <use chars>, <user name>
           if the -u option is specified, where <use chars> is a string of zero or more characters indicating the use of the file and <user name> is the user name corresponding to the real user ID of the process or, if the user name cannot be resolved from the real user ID of the process, the real user ID of the process in decimal. The value of <use chars> shall include the character 'c' if the process is using the file as its current directory and the character 'r' if the process is using the file as its root directory; implementations may include other alphabetic characters to indicate other uses of the file.
    3. On standard error, a <newline> character.
    When standard output and standard error are not directed to the same file, the data written to each shall be as described above but the ordering of writes to standard output relative to writes to standard error is unspecified. For example, fuser might first write the information for all file operands to standard error and then write all of the process IDs to standard output.
On page 2816 line 92667 section fuser (OPTIONS, -c), change:
    The file is treated as a mount point and the utility shall report on any files open in the file system.
to:
    If a file operand names a directory that is the mount point of a mounted file system, all processes using any file on that file system shall be listed as if they were using the named directory. The behavior for any file operand that names an existing file that is not the mount point of a mounted file system is unspecified.
On page 2816 line 92674 section fuser (OPERANDS), change:
    A pathname on which the file or file system is to be reported.
to:
    A pathname of a file for which the processes using the file are to be reported.
On page 2817 line 92696-92698 section fuser, replace the STDOUT section with:
    See DESCRIPTION.
On page 2817 line 92700-92716 section fuser, replace the STDERR section with:
    The fuser utility shall write diagnostic messages to standard error.
    The fuser utility also shall write information to standard error as specified in the DESCRIPTION section.
On page 2818 line 92728 section fuser, change APPLICATION USAGE from "None" to:
    Things can change while fuser is running; the snapshot it gives is only true for an instant, and might not be accurate by the time it is displayed.
On page 2818 line 92743 section fuser (EXAMPLES), change:
    fuser <block device> writes to standard output the process IDs of processes that are using any file which is on the device named by <block device> and writes to standard error an indication of how those processes are using the file.
    fuser -f <block device> writes to standard output the process IDs of processes that are using the file <block device> itself and writes to standard error an indication of how those processes are using the file.
to:
    fuser <mounted block device> writes to standard output the process IDs of processes that are using any file on the mounted file system contained by <mounted block device> and of processes that are using the device file <mounted block device> itself, and writes to standard error an indication of how those processes are using the files.
    fuser -f <mounted block device> writes to standard output the process IDs of processes that are using the device file <mounted block device> itself and writes to standard error an indication of how those processes are using the file.
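To make the write ordering in the proposed DESCRIPTION text concrete, the following Python sketch emits one line for one file operand onto two file-like objects standing in for standard output and standard error. It is only a model of the proposed format, not any implementation's code: `write_fuser_line` and its `(pid, use_chars, user_name)` input tuples are hypothetical, and the sketch writes zero blanks after the colon, which the proposal permits.

```python
import io

def write_fuser_line(path, procs, out, err, show_user=False):
    """Emit one file operand's line in the proposed order.

    procs is a list of (pid, use_chars, user_name) tuples (hypothetical
    input); out and err stand in for standard output and standard error.
    """
    err.write(f"{path}:")                 # pathname, then <colon>
    for pid, uses, user in procs:
        out.write(f" {pid:1d}")           # " %1d", <process ID>
        if show_user:
            err.write(f"{uses}({user})")  # "%s(%s)", <use chars>, <user name>
        else:
            err.write(uses)               # "%s", <use chars>
    err.write("\n")                       # <newline> on standard error only

if __name__ == "__main__":
    procs = [(1805, "cr", "root"), (1806, "cr", "root")]

    # Both streams directed to the same file: the writes interleave.
    both = io.StringIO()
    write_fuser_line("/", procs, both, both, show_user=True)
    print(repr(both.getvalue()))

    # Streams directed to different files: only pids reach stdout.
    out, err = io.StringIO(), io.StringIO()
    write_fuser_line("/", procs, out, err, show_user=True)
    print(repr(out.getvalue()), repr(err.getvalue()))
```

With separate streams, the stdout text carries no newline and the stderr text carries no pids, which matches the observation earlier in the thread that stderr alone is of little use to a program.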

(0006348) kre (reporter) 2023-06-25 05:45	Re Note: 0006341 Looks fine to me. I am a little surprised by the stated requirement when a block device is a file arg, but that's just from how I read the text as it was before, not from any knowledge whatever of what the implementations actually do, and if they act as described, then that's fine (the whole utility is still badly designed rubbish, but that's a different issue).

(0006406) Don Cragun (manager) 2023-07-27 16:01	Interpretation response
------------------------
The standard states that the process ID is written using the format "%d", and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.
Rationale:
-------------
Format "%d" allows, but does not require a space or tab before the process ID. The standard should require separation between process IDs in order for the output to be usable.
Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
Make the changes in Note: 0006341

(0006407) agadmin (administrator) 2023-07-27 16:05	Interpretation proposed: 27 July 2023

(0006437) ajosey (manager) 2023-08-31 13:29	Interpretation approved: 31 August 2023

Issue History
Date Modified	Username	Field	Change
2023-06-13 15:58	geoffclare	New Issue
2023-06-13 15:58	geoffclare	Name	=> Geoff Clare
2023-06-13 15:58	geoffclare	Organization	=> The Open Group
2023-06-13 15:58	geoffclare	Section	=> fuser
2023-06-13 15:58	geoffclare	Page Number	=> 2817
2023-06-13 15:58	geoffclare	Line Number	=> 92698
2023-06-13 15:58	geoffclare	Interp Status	=> ---
2023-06-13 20:20	kre	Note Added: 0006330
2023-06-13 20:44	kre	Note Added: 0006331
2023-06-15 09:35	geoffclare	Note Added: 0006332
2023-06-15 13:58	kre	Note Added: 0006334
2023-06-15 14:36	kre	Note Added: 0006335
2023-06-20 15:56	geoffclare	Note Added: 0006341
2023-06-25 05:45	kre	Note Added: 0006348
2023-07-27 16:01	Don Cragun	Note Added: 0006406
2023-07-27 16:03	Don Cragun	Final Accepted Text	=> See Note: 0006406.
2023-07-27 16:03	Don Cragun	Status	New => Interpretation Required
2023-07-27 16:03	Don Cragun	Resolution	Open => Accepted As Marked
2023-07-27 16:03	Don Cragun	Tag Attached: tc3-2008
2023-07-27 16:05	agadmin	Interp Status	--- => Proposed
2023-07-27 16:05	agadmin	Note Added: 0006407
2023-08-31 13:29	ajosey	Interp Status	Proposed => Approved
2023-08-31 13:29	ajosey	Note Added: 0006437
2023-09-05 11:05	geoffclare	Status	Interpretation Required => Applied
2023-09-05 11:06	geoffclare	Tag Attached: applied_after_i8d3
2024-06-11 09:07	agadmin	Status	Applied => Closed
