Austin Group Defect Tracker

Aardvark Mark IV


ID Category Severity Type Date Submitted Last Update
0001546 [1003.1(2016/18)/Issue7+TC2] Base Definitions and Headers Editorial Enhancement Request 2022-01-08 03:48 2022-01-08 03:48
Reporter calestyo View Status public  
Assigned To
Priority normal Resolution Open  
Status New  
Name Christoph Anton Mitterer
Organization
User Reference
Section 9.3 Basic Regular Expressions
Page Number N/A
Line Number N/A
Interp Status ---
Final Accepted Text
Summary 0001546: BREs: reserve \? \+ and \|
Description At least two implementations treat \? \+ and \| as special in BREs, namely GNU's sed and grep as well as BusyBox's sed and grep.
These extensions are therefore in use on a very large number of systems.

In those implementations, \? \+ and \| behave in BREs like their unescaped counterparts (? + and |) in EREs, bringing the same functionality to BREs.
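
To make the correspondence concrete, here is a minimal sketch that rewrites a BRE using the GNU-style escapes into the equivalent ERE spelling. The function name gnu_bre_to_ere and its helper are hypothetical and not part of POSIX or of any GNU or BusyBox tool. The sketch deliberately ignores some BRE/ERE differences (a leading literal *, anchors that are only special at the ends of a BRE, and back-references such as \1), so it should be read as an illustration rather than a complete converter.

```python
# Hypothetical helper for illustration only; not part of POSIX, GNU or BusyBox.

# Escapes that turn into bare ERE operators. \( \) \{ \} are defined by POSIX;
# \? \+ \| are the extensions discussed in this issue.
BRE_ESCAPED_OPERATORS = set("(){}?+|")

# Characters that are literals when bare in a BRE but need a backslash in an ERE.
# A bare } is left alone here, since it is not special on its own in an ERE.
ERE_NEEDS_ESCAPE = set("(){?+|")


def _bracket_end(pattern, start):
    """Return the index just past the bracket expression that opens at start."""
    i = start + 1
    if i < len(pattern) and pattern[i] == "^":
        i += 1
    if i < len(pattern) and pattern[i] == "]":
        i += 1  # a leading ']' is a literal member of the set
    while i < len(pattern):
        if pattern[i] == "[" and i + 1 < len(pattern) and pattern[i + 1] in ".:=":
            # Skip over [:class:], [.collating.] or [=equivalence=].
            close = pattern.find(pattern[i + 1] + "]", i + 2)
            if close == -1:
                raise ValueError("unterminated bracket expression")
            i = close + 2
        elif pattern[i] == "]":
            return i + 1
        else:
            i += 1
    raise ValueError("unterminated bracket expression")


def gnu_bre_to_ere(pattern):
    """Rewrite a BRE that uses \\? \\+ \\| into an ERE (simplified sketch)."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "[":
            # Bracket expressions have the same syntax in BREs and EREs.
            end = _bracket_end(pattern, i)
            out.append(pattern[i:end])
            i = end
        elif ch == "\\":
            if i + 1 >= len(pattern):
                raise ValueError("trailing backslash")
            nxt = pattern[i + 1]
            # \? becomes ?, \( becomes (, etc.; other escapes are copied as is.
            out.append(nxt if nxt in BRE_ESCAPED_OPERATORS else ch + nxt)
            i += 2
        elif ch in ERE_NEEDS_ESCAPE:
            out.append("\\" + ch)
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

For example, gnu_bre_to_ere(r"\(foo\|bar\)\+") yields the ERE (foo|bar)+, while the BRE a+b, in which + is an ordinary character, yields a\+b.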

POSIX, as of now (https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html), says:
The interpretation of an ordinary character preceded by an unescaped <backslash> ( '\\' ) is undefined, except for:
- The characters ')', '(', '{', and '}'
- The digits 1 to 9 inclusive (see BREs Matching Multiple Characters)
- A character inside a bracket expression

So a conforming implementation either does not support \? \+ and \| at all, or has already defined its own semantics for them, as e.g. GNU did.
Desired Action I would propose that POSIX not standardise \? \+ and \| ... but reserve them for that purpose.

I.e. not make the functionality as implemented by GNU a requirement for conforming implementations, but mandate that *if* an implementation chooses to give \? \+ and \| a special meaning, that meaning has to be the one of the ERE counterparts ? + and | .


The actual effect of this on the world would be small. Implementations that don't already support these escapes wouldn't need to add support for them.
But it would prevent any conforming implementation from using them as special characters for other purposes.


GNU (and I guess others) supports further such extensions beyond POSIX, e.g.:
\b \B \< \> \w \W \s \S (GNU's sed and grep)
and:
\` \' (GNU's sed)

Personally, I would rather not reserve those in POSIX.
POSIX isn't GNU, and the main reason why I propose the reservation of \? \+ and \| is that their unescaped forms are already special in EREs.
The others listed above are not.
Tags No tags attached.
Attached Files

- Relationships

There are no notes attached to this issue.

- Issue History
Date Modified Username Field Change
2022-01-08 03:48 calestyo New Issue
2022-01-08 03:48 calestyo Name => Christoph Anton Mitterer
2022-01-08 03:48 calestyo Section => 9.3 Basic Regular Expressions
2022-01-08 03:48 calestyo Page Number => N/A
2022-01-08 03:48 calestyo Line Number => N/A

