View Issue Details

ID: 0000244
Project: 1003.1(2008)/Issue 7
Category: Shell and Utilities
View Status: public
Last Update: 2023-01-09 16:22
Reporter: dwheeler
Assigned To: ajosey
Priority: normal
Severity: Objection
Type: Enhancement Request
Status: Closed
Resolution: Duplicate
Name: David A. Wheeler
Organization:
User Reference:
Section: xargs
Page Number: 3381
Line Number: 113172
Interp Status: ---
Final Accepted Text:
Summary: 0000244: Add -0 to xargs
Description: As noted in 0000243, the POSIX specification and common implementations permit nearly all bytes to be in pathnames, and yet it is surprisingly difficult to portably and correctly process such pathnames. This is one of the more common reasons for security vulnerabilities (see CERT’s "Secure Coding" item MSC09-C, CWE 78, CWE 73, and CWE 116, and the 2009 CWE/SANS Top 25 Most Dangerous Programming Errors). For more details about this problem, see:
 http://www.dwheeler.com/essays/filenames-in-shell.html
 http://www.dwheeler.com/essays/fixing-unix-linux-filenames.html

The find command's "-exec ... +" was intended to fix this, but it is inadequate: it is only practical for trivial commands. It also fails to acknowledge a very common construct, find ... -print0 | xargs -0, which is technically not portable (it's not in the spec) but is actually in wide use.
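
As a minimal sketch of that construct, the following counts the files in a scratch directory, one of which has a newline in its name. It assumes a find that supports -print0 and an xargs that supports -0 (as in GNU and BSD implementations), plus the non-POSIX mktemp utility; the file names are made up for illustration.

```shell
# Sketch only: relies on find -print0, xargs -0 and mktemp -d,
# none of which the standard under discussion requires.
tmpdir=$(mktemp -d)
touch "$tmpdir/plain.txt" "$tmpdir/with space.txt" "$tmpdir/with
newline.txt"

# Each pathname is NUL-terminated, so xargs passes it on unparsed.
# The inner sh simply reports how many arguments it received.
safe_count=$(find "$tmpdir" -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)
echo "$safe_count"

rm -rf "$tmpdir"
```

With three files created, the inner shell should report three arguments, including the one whose name spans two lines.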

The current situation is that it is too hard to *correctly* process filenames, leading to a number of security vulnerabilities. Expecting users and developers to use complicated constructs to handle filenames is unreasonable and dangerous; they should be given a safer and easy-to-use set of constructs for this common case.

Many of the POSIX examples that use xargs simply hope that filenames do not include \n, with no reasonable way to enforce this. For example, line 99054 says "This example assumes that no pathnames in the archive contain <newline> characters", but there is no way to enforce this. Some examples, such as the one at line 108778, do not even note that they could go horribly wrong, and that a filename like mystuff\n/etc/shadow might cause the script to give away security information. And even that doesn't really work correctly; by default, xargs *parses* its input, in ways many users don't expect, making xargs remarkably hard to use.
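
The default parsing can be seen with a single name that contains a space. This is an illustrative sketch, assuming an xargs that supports -0 and a printf whose format string accepts \0 for a null byte; the name is hypothetical.

```shell
# One logical name containing a space, as plain "find" would print it.
name='my file.txt'

# Default mode: xargs splits the line on the blank, yielding two arguments.
default_count=$(printf '%s\n' "$name" | xargs sh -c 'echo "$#"' sh)

# NUL-terminated mode (assumes xargs -0): the name stays one argument.
nul_count=$(printf '%s\0' "$name" | xargs -0 sh -c 'echo "$#"' sh)

echo "default: $default_count, with -0: $nul_count"
```

Quotes and backslashes in the input are likewise interpreted in the default mode, which is why a name such as it's.txt can make xargs fail outright.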

Instead, please add to the standard xargs the -0 option, which adds support for the widely-used null byte as a "safe" separator or terminator of pathnames.

This is *widely* implemented, and easy to implement where it does not exist.

Note that this is ESPECIALLY useful if 0000243 is also accepted.
Desired Action: After line 113172 (which introduces the option list), add this:

-0
Standard input items are terminated *only* by a null byte or by end-of-file, and not by whitespace. Every character other than the null byte is taken literally; quotes and backslash are not special. There is no "logical" end-of-file string (this option implies -E ""). This option is useful in conjunction with find's "-print0" option.
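
The proposed wording can be checked against an implementation that already has -0. The sketch below assumes such an xargs (for example GNU or BSD); the three items are invented for illustration, and the last one is ended by end-of-file rather than by a null byte.

```shell
# Three items: one with a double quote, one with a backslash, one with a
# space. The last item has no trailing NUL and is terminated by EOF.
out=$(printf '%s\0%s\0%s' 'a"b' 'c\d' 'e f' | xargs -0 printf '[%s]\n')
printf '%s\n' "$out"
```

Each item should come back bracketed on its own line with the quote, backslash, and space intact, showing that nothing other than the null byte is treated specially.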

Tags: No tags attached.

Relationships

duplicate of 0000243 Closed ajosey Add -print0 to "find"
related to 0000245 Closed ajosey Add -0 option to shell's "read"
related to 0000251 Closed ajosey Forbid newline, or even bytes 1 through 31 (inclusive), in filenames

Activities

Don Cragun

2011-07-06 23:55

manager   bugnote:0000883

The current plan is to add a set of byte values (based on single-byte characters in
the C Locale) that will not be allowed in newly created filenames using 0000251
as the bug to make the changes. If consensus is reached on a resolution for bug
251, the plan is to reject and close bugs 243, 244, and 245. These three bugs
will remain open until bug 251 is resolved.

dwheeler

2011-11-27 22:01

reporter   bugnote:0001055

As I noted in bug #243, on further reflection, I recommend that bugs 243, 244, and 245 be accepted, *regardless* of the resolution of bug 251.

Adding these capabilities will make it easier to implement portable applications. Most POSIX systems today permit filenames that include anything except NUL (including newline). Even if a future version of POSIX forbids it, there's no guarantee that implementations will move quickly to implement this change to POSIX. In addition, most application developers will want to develop software that works correctly on both older and newer systems. Technically older POSIX systems need not implement the capabilities proposed in bugs 243, 244, and 245, but they are very widely implemented.

Perhaps most importantly, it will make it easy to write POSIX-compliant programs that can handle files with newlines embedded in their names, perhaps from systems that complied with older versions of POSIX (that allowed such things).

Basically, let's move to systems that can't have nasty filenames - at least embedded newlines - AND provide a few portable tools in POSIX to deal with their legacy.

Don Cragun

2023-01-09 16:22

manager   bugnote:0006101

The changes for this proposal are included in the resolution of 0000243.

Issue History

Date Modified Username Field Change
2010-04-29 19:48 dwheeler New Issue
2010-04-29 19:48 dwheeler Status New => Under Review
2010-04-29 19:48 dwheeler Assigned To => ajosey
2010-04-29 19:48 dwheeler Name => David A. Wheeler
2010-04-29 19:48 dwheeler Section => xargs
2010-04-29 19:48 dwheeler Page Number => 3381
2010-04-29 19:48 dwheeler Line Number => 113172
2011-07-06 23:42 Don Cragun Relationship added related to 0000243
2011-07-06 23:43 Don Cragun Relationship added related to 0000245
2011-07-06 23:55 Don Cragun Note Added: 0000883
2011-11-27 22:01 dwheeler Note Added: 0001055
2023-01-09 16:13 Don Cragun Relationship replaced duplicate of 0000243
2023-01-09 16:15 Don Cragun Interp Status => ---
2023-01-09 16:15 Don Cragun Status Under Review => Closed
2023-01-09 16:15 Don Cragun Resolution Open => Duplicate
2023-01-09 16:22 Don Cragun Note Added: 0006101
2023-08-22 06:29 Don Cragun Relationship added related to 0000251