ID | Category | Severity | Type | Date Submitted | Last Update |
0000876 | [1003.1(2013)/Issue7+TC1] Shell and Utilities | Objection | Omission | 2014-09-11 14:55 | 2019-06-10 08:54 |
Reporter | eblake | View Status | public |
Assigned To | |
Priority | normal | Resolution | Accepted As Marked |
Status | Closed |
Name | Eric Blake |
Organization | Red Hat |
User Reference | eblake.cat |
Section | cat |
Page Number | 2526 |
Line Number | 81474 |
Interp Status | Approved |
Final Accepted Text | See Note: 0002424. |
Summary | 0000876: allow implementations to fail on 'cat a >> a' |
Description |
Several existing implementations of cat explicitly refuse to output to the same file descriptor as any of its inputs, to avoid filling up the disk if the file is non-empty:

Solaris:
    $ cd /tmp/
    $ touch a
    $ /usr/bin/cat a >> a
    cat: input/output files 'a' identical

GNU:
    $ cd /tmp/
    $ touch a
    $ cat a >> a
    cat: a: input file is output file

This behavior doesn't seem to be permitted by the standard, although it is useful. The proposal here only fixes cat, although the group may decide to make the allowance for same input/output rejection have wider scope. |
Desired Action |
At line 81474 [XCU cat STDOUT], add a sentence: If the standard output is a regular file, and is the same file as any of the input file operands, the implementation may treat this as an error without writing anything to the output file. |
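One way to picture the condition in the proposed sentence is a comparison of file identity on the open descriptors. The Python sketch below is illustrative only and is not taken from any cat implementation: `is_same_regular_file` and `checked_cat` are hypothetical names, and it assumes that equal `st_dev` and `st_ino` values reported by `fstat()` identify the same file.

```python
import os
import stat
import sys


def is_same_regular_file(out_fd: int, in_fd: int) -> bool:
    """Return True if out_fd is a regular file and in_fd refers to the same file."""
    out_st = os.fstat(out_fd)
    if not stat.S_ISREG(out_st.st_mode):
        return False  # the proposed wording only covers regular-file output
    in_st = os.fstat(in_fd)
    # Assumption: matching device and inode numbers mean "the same file".
    return (out_st.st_dev, out_st.st_ino) == (in_st.st_dev, in_st.st_ino)


def checked_cat(paths, out_fd: int = 1) -> int:
    """Hypothetical cat-like helper: on a clash, fail before writing anything."""
    fds = [os.open(p, os.O_RDONLY) for p in paths]
    try:
        # Check every operand first so that nothing is written on error.
        for p, fd in zip(paths, fds):
            if is_same_regular_file(out_fd, fd):
                print(f"cat: {p}: input file is output file", file=sys.stderr)
                return 1
        for fd in fds:
            while chunk := os.read(fd, 65536):
                view = memoryview(chunk)
                while view:  # os.write may write fewer bytes than requested
                    view = view[os.write(out_fd, view):]
        return 0
    finally:
        for fd in fds:
            os.close(fd)
```

Checking all operands before copying any of them is what makes "without writing anything to the output file" possible in this sketch; a check performed per operand during the copy would not give that guarantee.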
Tags | tc2-2008 | ||||||
Attached Files | |||||||
Notes | |
(0002377) eblake (manager) 2014-09-11 16:23 |
Paul Eggert pointed out to me that the example at line 81508 may also need updating:

    Because of the shell language mechanism used to perform output redirection, a command such as this:
        cat doc doc.end > doc
    causes the original data in doc to be lost.

On my test system, data was indeed lost

    $ echo 1 > a
    $ echo 2 > b
    $ cat a b > a
    cat: a: input file is output file
    $ cat a
    2

but it also demonstrates that cat merely skipped the processing of 'a', rather than failing up front. So it may be better to document that an input file is skipped if it is the same as the output (causing an overall error), rather than my original proposed wording tied to the output file. |
(0002378) eblake (manager) 2014-09-11 16:27 |
Using the fact that cat'ting a non-empty regular file to itself will eventually fail due to the disk being full, one can argue that POSIX already allows implementations to fail (and doesn't forbid them from failing early, rather than first exhausting the disk). After all, we justified 'rm -rf /' being allowed to fail early rather than late on the grounds that it will eventually fail when rm itself is removed, and failing earlier is nicer if we can prove eventual failure would have happened. But that raises the question of whether cat'ting an empty file to itself is allowed to fail: it would not fill the disk, so we cannot prove that it would fail late, which makes it harder to justify failing early. |
(0002379) shware_systems (reporter) 2014-09-11 16:54 |
Would it be better to make explicit, for those file types where the size is known when the operand is encountered, that cat shall copy only that many bytes, regardless of target? This would have cat a >> a only double the file in size, rather than enter an infinite loop that exhausts the disk. I believe this better matches the intent of the utility: copy this chunk of data, as it is now, to standard output. |
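The size-bounded copy suggested in this note could look roughly like the following Python sketch. It is not how the implementations mentioned in this report behave; `copy_snapshot` is a hypothetical name, and the sketch assumes the operand is a regular file whose `st_size` is meaningful at the moment it is opened.

```python
import os


def copy_snapshot(in_path: str, out_fd: int, bufsize: int = 65536) -> int:
    """Copy at most the number of bytes in_path held when it was opened."""
    in_fd = os.open(in_path, os.O_RDONLY)
    try:
        # Size recorded once, when the operand is encountered.
        remaining = os.fstat(in_fd).st_size
        copied = 0
        while remaining > 0:
            chunk = os.read(in_fd, min(bufsize, remaining))
            if not chunk:
                break  # the file shrank while it was being copied
            view = memoryview(chunk)
            while view:  # os.write may write fewer bytes than requested
                view = view[os.write(out_fd, view):]
            remaining -= len(chunk)
            copied += len(chunk)
        return copied
    finally:
        os.close(in_fd)
```

With the output descriptor opened in append mode on the same file, this copies the original contents once and stops, so the file doubles in size instead of growing without bound.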
(0002380) eblake (manager) 2014-09-11 17:02 |
In response to Note: 0002379: If we were designing from scratch, maybe. But it would render existing implementations non-compliant; my argument here is to relax the standard to allow existing behavior, not to mandate new behavior of only doubling in size. |
(0002381) Don Cragun (manager) 2014-09-11 17:33 edited on: 2014-09-11 17:34 |
Concerning Note: 0002377: The command:

    cat a b > a

does not indicate that cat skipped the contents of a; the contents of a were destroyed by the shell when it performed the requested redirection before cat entered main(). |
(0002382) eblake (manager) 2014-09-11 17:42 |
Ah, but:

    $ echo 1 > a
    $ echo 2 > b
    $ echo 3 > c
    $ cat a b c > b
    cat: b: input file is output file
    $ cat b
    1
    3

this time, the contents of 'b' are NOT empty at the time the error message about b is produced. |
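The sequence of events in this transcript can be modelled with a short Python sketch: the redirection truncates the output first, then each operand is compared with the output just before it would be copied. This is an illustration of the observed behavior, not the GNU source; `cat_skipping` is a hypothetical name, and the same-file test is assumed to be a device/inode comparison.

```python
import os
import sys


def cat_skipping(paths, out_path: str) -> int:
    """Mimic `cat paths... > out_path` with a per-operand same-file check."""
    # The shell performs the redirection first: out_path is truncated
    # before cat has looked at any operand.
    out_fd = os.open(out_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    status = 0
    try:
        out_st = os.fstat(out_fd)
        for p in paths:
            in_fd = os.open(p, os.O_RDONLY)
            try:
                if os.path.samestat(out_st, os.fstat(in_fd)):
                    print(f"cat: {p}: input file is output file", file=sys.stderr)
                    status = 1
                    continue  # skip this operand, keep processing the rest
                while chunk := os.read(in_fd, 65536):
                    os.write(out_fd, chunk)
            finally:
                os.close(in_fd)
    finally:
        os.close(out_fd)
    return status
```

Run on files a, b and c holding 1, 2 and 3, with b as the output, the sketch leaves b containing the lines 1 and 3 and returns a non-zero status. By the time b is reached as an operand it already holds the copy of a, which is why it is not empty when the diagnostic is printed.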
(0002384) geoffclare (manager) 2014-09-12 09:40 edited on: 2014-10-17 09:22 |
I checked a bunch of other utilities on Solaris, HP-UX and Linux and the only other one that produced an error was (GNU) grep on Linux. So we should definitely make this change for grep as well, but I agree we should consider allowing it for some other utilities.

The additional utilities I checked were: awk grep cut dd fold head m4 more nl paste od pr sed tail uniq. The file I used was not empty, in case that would make a difference.

Update: In the Oct 16 teleconference it was decided to allow this only for cat. For cat, redirection to an input file is invariably only done by mistake, but for other utilities there are legitimate cases where this might be done and we do not want to prevent power users from making use of them. For example, a one-to-one substitution can be performed in place (thus not needing space for a second copy of the file) by doing a read-write redirection of stdout, e.g.

    sed s/a/A/ file 1<> file

In the case of grep, a legitimate use case is:

    grep -l 'regexp' * > filelist

which is a perfectly reasonable thing to do if it is known that none of the filenames match the regexp. In shells that support extended globbing the * could be replaced with !(filelist) but a portable method of avoiding filelist being an input file (assuming it has to be in the current directory) would be complicated and hard to get right. |
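The reason the sed example in this note is safe is that a read-write redirection opens the file without truncating it, and a one-to-one substitution never writes past what has already been read. A rough Python analogue, offered as a sketch only: `substitute_in_place` is a hypothetical name, and opening with mode `r+b` is assumed to stand in for the shell's `1<>` redirection.

```python
def substitute_in_place(path: str, old: bytes, new: bytes) -> None:
    """Replace the first `old` on each line with `new`, overwriting path in place."""
    if len(old) != len(new):
        # A length-changing edit could overwrite data not yet read.
        raise ValueError("substitution must not change the length of the data")
    # "rb" is the reader; "r+b" is read-write without truncation, like `1<> file`.
    with open(path, "rb") as src, open(path, "r+b") as dst:
        for line in src:
            # Output is always the same length as the input consumed so far,
            # so the write position never overtakes the read position.
            dst.write(line.replace(old, new, 1))
```

Replacing only the first match per line mirrors `s/a/A/` without the `g` flag. No temporary file or second copy of the data is needed, which is the space saving the note refers to.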
(0002424) Don Cragun (manager) 2014-10-16 17:11 edited on: 2014-10-23 15:41 |
Interpretation response:
------------------------
The standard states that the cat utility is required to copy the files named as operands to standard output even if standard output is redirected to one of those input files, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.

Rationale:
-------------
A cat command which redirects its standard output to a file that is also named as a file operand is likely to run until the output file reaches the maximum output file size allowed for that process or the underlying filesystem runs out of space. This is a common application error that accidentally consumes a lot of space needed by other users on the system. Therefore, many implementations of the cat utility check for this condition and, when it is found, print a diagnostic message and exit with a non-zero exit status. This behavior is not currently allowed by the standard, but should be.

Notes to the Editor (not part of this interpretation):

At page 2526 line 81474 (XCU cat STDOUT), add a sentence:

    If the standard output is a regular file, and is the same file as any of the input file operands, the implementation may treat this as an error.

At page 2526 lines 81506-81509 (XCU cat EXAMPLES), change:

    Because of the shell language mechanism used to perform output redirection, a command such as this:

to:

    Because of the shell language mechanism used to perform output redirection, a command such as this: |
(0002452) ajosey (manager) 2014-11-27 10:33 |
Interpretation Proposed: 27 November 2014 |
(0002515) ajosey (manager) 2015-01-05 14:13 |
Interpretation approved: 5 Jan 2015 |
Issue History | |||
Date Modified | Username | Field | Change |
2014-09-11 14:55 | eblake | New Issue | |
2014-09-11 14:55 | eblake | Name | => Eric Blake |
2014-09-11 14:55 | eblake | Organization | => Red Hat |
2014-09-11 14:55 | eblake | User Reference | => eblake.cat |
2014-09-11 14:55 | eblake | Section | => cat |
2014-09-11 14:55 | eblake | Page Number | => 2526 |
2014-09-11 14:55 | eblake | Line Number | => 81474 |
2014-09-11 14:55 | eblake | Interp Status | => --- |
2014-09-11 16:23 | eblake | Note Added: 0002377 | |
2014-09-11 16:27 | eblake | Note Added: 0002378 | |
2014-09-11 16:54 | shware_systems | Note Added: 0002379 | |
2014-09-11 17:02 | eblake | Note Added: 0002380 | |
2014-09-11 17:33 | Don Cragun | Note Added: 0002381 | |
2014-09-11 17:34 | Don Cragun | Note Edited: 0002381 | |
2014-09-11 17:42 | eblake | Note Added: 0002382 | |
2014-09-12 09:40 | geoffclare | Note Added: 0002384 | |
2014-10-16 17:11 | Don Cragun | Note Added: 0002424 | |
2014-10-16 17:12 | Don Cragun | Note Edited: 0002424 | |
2014-10-16 17:13 | Don Cragun | Interp Status | --- => Pending |
2014-10-16 17:13 | Don Cragun | Final Accepted Text | => See Note: 0002424. |
2014-10-16 17:13 | Don Cragun | Status | New => Interpretation Required |
2014-10-16 17:13 | Don Cragun | Resolution | Open => Accepted |
2014-10-17 08:59 | geoffclare | Note Edited: 0002384 | |
2014-10-17 09:03 | geoffclare | Note Edited: 0002384 | |
2014-10-17 09:22 | geoffclare | Note Edited: 0002384 | |
2014-10-23 15:19 | Don Cragun | Note Edited: 0002424 | |
2014-10-23 15:19 | Don Cragun | Note Edited: 0002424 | |
2014-10-23 15:36 | Don Cragun | Note Edited: 0002424 | |
2014-10-23 15:38 | Don Cragun | Note Edited: 0002424 | |
2014-10-23 15:39 | Don Cragun | Note Edited: 0002424 | |
2014-10-23 15:39 | Don Cragun | Resolution | Accepted => Accepted As Marked |
2014-10-23 15:40 | geoffclare | Tag Attached: tc2-2008 | |
2014-10-23 15:41 | Don Cragun | Note Edited: 0002424 | |
2014-11-27 10:33 | ajosey | Interp Status | Pending => Proposed |
2014-11-27 10:33 | ajosey | Note Added: 0002452 | |
2015-01-05 14:13 | ajosey | Interp Status | Proposed => Approved |
2015-01-05 14:13 | ajosey | Note Added: 0002515 | |
2019-06-10 08:54 | agadmin | Status | Interpretation Required => Closed |