0001077: Recommend support for wide-character regcomp and regexec and/or specify multi-byte behavior

Notes
(0003376) deadpixi (reporter) 2016-09-11 17:53	As an additional note, wide-character regular expressions are described in standard C++ as well, as of the C++11 language.

(0003377) Don Cragun (manager) 2016-09-11 21:03	I see nothing in the description of regular expressions in the standard nor in the description of the regcomp() and regexec() functions that restricts their use to single-byte character strings. I believe the requirements are perfectly clear and that they apply to multi-byte character strings (such as UTF-8) just as much as they do to single-byte character strings (such as ASCII, EBCDIC, and ISO 8859-*).
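
For readers following along, here is a minimal sketch of the existing byte-string interface under discussion. The helper name first_match is hypothetical and not part of any standard; only regcomp(), regexec(), regfree(), and the regmatch_t fields are POSIX. The offsets returned in rm_so and rm_eo are byte offsets into the string, which is why matching multi-byte text depends on the current locale's encoding rather than on a separate wide-character API.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not a standard function): compile `pattern` as an
 * extended regular expression, search `text`, and report the byte offsets
 * of the first match. Returns 0 on a match, REG_NOMATCH if nothing
 * matched, or another nonzero regcomp() error code. */
int first_match(const char *pattern, const char *text,
                size_t *start, size_t *end)
{
    regex_t re;
    regmatch_t m[1];

    int rc = regcomp(&re, pattern, REG_EXTENDED);
    if (rc != 0)
        return rc;              /* compilation failed; nothing to free */

    rc = regexec(&re, text, 1, m, 0);
    if (rc == 0) {
        /* rm_so and rm_eo are byte offsets from the start of `text`. */
        *start = (size_t)m[0].rm_so;
        *end = (size_t)m[0].rm_eo;
    }

    regfree(&re);
    return rc;
}
```

Called as first_match("b.d", "abcd", &s, &e), this is expected to report the match at byte offsets 1 through 4. In a multi-byte locale, a conforming implementation would be expected to let "." consume one whole character, but that depends on the locale having been set with setlocale().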

(0003378) deadpixi (reporter) 2016-09-11 23:54	I appreciate the rapid response, thank you. I agree now that the multi-byte concerns are likely unfounded. I still think that there is some significant merit to specifying the wide-character interfaces. The only way to portably do many things on strings is to first convert them to wide strings. For example, it is impossible to move a character pointer backwards to point at the previous character of a string portably, since in shifted encodings one would have to scan first from the beginning of the string. I appreciate everyone's time in considering this.
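
The conversion step described in this note can be sketched roughly as follows. The helper name to_wide is hypothetical; mbstowcs() is standard C, and calling it with a null destination to obtain the required length is a POSIX extension. This only illustrates why a wide string makes backward stepping trivial; it is not a proposal for the wide-character regex interface itself.

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper (not a standard function): convert a multi-byte
 * string in the current locale's encoding to a newly allocated wide
 * string. Stores the number of wide characters (excluding the
 * terminator) in *len_out when len_out is non-null. Returns NULL on an
 * invalid sequence or allocation failure; the caller frees the result. */
wchar_t *to_wide(const char *s, size_t *len_out)
{
    /* With a null destination, POSIX mbstowcs() returns the length
     * needed to convert the whole string. */
    size_t n = mbstowcs(NULL, s, 0);
    if (n == (size_t)-1)
        return NULL;            /* invalid multi-byte sequence */

    wchar_t *w = malloc((n + 1) * sizeof *w);
    if (w == NULL)
        return NULL;

    mbstowcs(w, s, n + 1);      /* n characters plus the terminator */
    if (len_out != NULL)
        *len_out = n;
    return w;
}
```

Once the text is in a wchar_t array, stepping back one character is a plain pointer decrement (p--), with no need to rescan from the start of the string as a stateful multi-byte encoding would require.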

(0005833) nick (manager) 2022-05-12 15:51 edited on: 2022-05-12 15:51	This issue failed to get a sponsor, and is therefore rejected. If complete wording can be specified and forwarded to a sponsor (IEEE, ISO/IEC JTC 1/SC 22, or The Open Group), it can be reconsidered in the future.

Issue History
Date Modified	Username	Field	Change
2016-09-11 17:47	deadpixi	New Issue
2016-09-11 17:47	deadpixi	Name	=> Rob King
2016-09-11 17:47	deadpixi	Section	=> regcomp
2016-09-11 17:47	deadpixi	Page Number	=> -
2016-09-11 17:47	deadpixi	Line Number	=> -
2016-09-11 17:53	deadpixi	Note Added: 0003376
2016-09-11 17:53	deadpixi	Issue Monitored: deadpixi
2016-09-11 21:03	Don Cragun	Note Added: 0003377
2016-09-11 21:08	Don Cragun	Page Number	- => 1783-1789
2016-09-11 21:08	Don Cragun	Line Number	- => 57399-57703
2016-09-11 21:08	Don Cragun	Interp Status	=> ---
2016-09-11 23:54	deadpixi	Note Added: 0003378
2022-05-12 15:51	nick	Note Added: 0005833
2022-05-12 15:51	nick	Note Edited: 0005833
2022-05-12 15:52	nick	Status	New => Closed
2022-05-12 15:52	nick	Resolution	Open => Rejected
