View Issue Details

ID: 0001973
Project: 1003.1(2024)/Issue8
Category: Shell and Utilities
View Status: public
Last Update: 2026-03-07 11:48
Reporter: stephane
Assigned To:
Priority: normal
Severity: Objection
Type: Clarification Requested
Status: New
Resolution: Open
Name: Stephane Chazelas
Organization:
User Reference:
Section: awk utility
Page Number: (page or range of pages)
Line Number: (Line or range of lines)
Interp Status: ---
Final Accepted Text:
Summary: 0001973: awk "numeric string " origins
Description: The awk specification (https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/utilities/awk.html#tag_20_06_13_02) has:

<<<
 A string value shall be considered a numeric string if it comes from one of the following:

     1. Field variables
     2. Input from the getline() function
     3. FILENAME
     4. ARGV array elements
     5. ENVIRON array elements
     6. Array elements created by the split() function
     7. A command line variable assignment
     8. Variable assignment from another numeric string variable
>>>

It can be interpreted as meaning that

awk 'BEGIN{$1 = "10"; print ($1 > 2)}'

should print 1, for instance. But no implementation that I know of does so. Once a string is assigned to $1, the field loses the special property whereby a value that looks like a number shall be considered a number.

The same applies to ARGV, FILENAME...
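
To make the contrast concrete, here is a minimal sketch comparing a field that comes from input with a field that has been assigned a string constant. The variable names from_input and from_assignment are only illustrative, and the comments describe the behaviour reported above rather than anything the standard guarantees:

```shell
# $1 split from an input record: "10" is a numeric string, so the
# comparison with 2 is numeric.
from_input=$(echo 10 | LC_ALL=C awk '{print ($1 > 2)}')

# $1 assigned a string constant: the comparison with 2 is a string
# comparison, and "10" sorts before "2".
from_assignment=$(LC_ALL=C awk 'BEGIN{$1 = "10"; print ($1 > 2)}')

echo "from input: $from_input, after assignment: $from_assignment"
```

In the implementations described in this report, the first comparison is true and the second is false.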

Typo in rationale section btw:

> also shall have the numeric value of the numeric string" was removed
> from several sections of the ISO POSIX-2:1993 standard because *is*
> specifies an unnecessary implementation detail

is -> it
Desired Action: Make it clear that it is

1. the values resulting from the splitting of $0 into $1, $2... (upon first dereferencing after reading a record (including via getline) or after assigning to $0) that are candidates for numeric strings, not the field variables per se; or change the item to "Field variables unless subsequently assigned a string value".
3. the current input file as initially assigned to FILENAME; or change the item to "FILENAME unless subsequently assigned a string value".

And so on for ARGV and ENVIRON.

Or add some verbiage below that list along the lines of:

> And the corresponding variables have not been subsequently assigned a string value.

That still makes it ambiguous for things like:

$1 = "10"; $0 = "11 12"; print ($1 > 2)

where $1 becomes a numeric string again after the assignment to $0.
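
A runnable form of that fragment, as a sketch: it wraps the statements in a BEGIN block and stores the result in a variable named resplit, which is only illustrative.

```shell
# $1 is first assigned a string constant, then $0 is assigned, which
# causes the record to be split into fields again.
resplit=$(LC_ALL=C awk 'BEGIN{$1 = "10"; $0 = "11 12"; print ($1 > 2)}')

echo "\$1 > 2 after re-splitting: $resplit"
```

If $1 is treated as a numeric string again after the re-split, the comparison is numeric (11 > 2) and the result is 1; a string comparison of "11" with "2" would give 0.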
Tags: No tags attached.

Activities

stephane

2026-03-06 08:01

reporter   bugnote:0007389

Last edited: 2026-03-06 10:25

It may also be worth clarifying (in a separate ticket?) that in sub(ere, repl[, in ]) or gsub(ere, repl[, in ]), if "in" (or $0 if omitted) was a numeric string and at least one substitution has been made, then it becomes a non-numeric string even if it contains the valid representation of a number.

That is for instance:

printf '%s\n' 12 13 | awk '{gsub("2", "2")}; $0 > 2'

should output 13 only: 12 is successfully substituted with 12, making it a string, which is not greater than "2", while 13 remains a numeric string because no substitution took place.

stephane

2026-03-06 09:59

reporter   bugnote:0007391

For context, that came up at https://unix.stackexchange.com/questions/804798/awk-comparing-to-constant-numbers

stephane

2026-03-06 10:20

reporter   bugnote:0007392

Last edited: 2026-03-06 10:21

> 1. the values resulting from the splitting of $0 into $1, $2...

Sorry, that wording is insufficient, as it doesn't cover $0 itself, where it is the assignment from input (the current record or via getline) that makes it a candidate for numeric strings.

For the case where $0 is recomputed when individual fields are modified, I find the behaviour varies between implementations.

echo 10 | LC_ALL=C awk '{$1 = $1}; $0 > 2'

outputs 10 in mawk, but not in busybox, GNU, or bwk's `awk`.

While

echo 10 | LC_ALL=C awk -v OFS=. '{$2 = 3}; $0 > 2'

outputs 10 in none of them.
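
To check a given system, the first test above can be wrapped in a small probe. This is only a sketch: probe_rebuilt_record is a hypothetical helper name, and the command names awk, gawk and mawk are assumptions about what may be installed.

```shell
# probe_rebuilt_record is a hypothetical helper, not part of any awk.
# $1: name of the awk command to test.
probe_rebuilt_record() {
  # {$1 = $1} forces $0 to be rebuilt; the pattern then prints the
  # record only if the rebuilt $0 compares greater than 2.
  out=$(echo 10 | LC_ALL=C "$1" '{$1 = $1}; $0 > 2')
  if [ -n "$out" ]; then
    echo "$1: numeric"   # rebuilt $0 was compared as a number
  else
    echo "$1: string"    # rebuilt $0 was compared as a string
  fi
}

# Probe whichever of the assumed command names exist on this system.
for impl in awk gawk mawk; do
  if command -v "$impl" >/dev/null 2>&1; then
    probe_rebuilt_record "$impl"
  fi
done
```

The probe deliberately reports the result instead of asserting one, since this is the case where implementations differ.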

agadmin

2026-03-07 11:48

administrator   bugnote:0007393

Adjust summary as requested (seq 38910)

Issue History

Date Modified Username Field Change
2026-03-06 07:22 stephane New Issue
2026-03-06 08:01 stephane Note Added: 0007389
2026-03-06 09:59 stephane Note Added: 0007391
2026-03-06 10:20 stephane Note Added: 0007392
2026-03-06 10:21 stephane Note Edited: 0007392
2026-03-06 10:25 stephane Note Edited: 0007389
2026-03-07 11:48 agadmin Summary awk "string variables" origin => awk "numeric string " origins
2026-03-07 11:48 agadmin Interp Status => ---
2026-03-07 11:48 agadmin Note Added: 0007393