Austin Group Defect Tracker

ID 0000967
Category [1003.1(2013)/Issue7+TC1] Base Definitions and Headers
Severity Editorial
Type Error
Date Submitted 2015-07-02 19:47
Last Update 2019-06-10 08:54
Reporter rhansen
View Status public
Assigned To
Priority normal
Resolution Accepted As Marked
Status Closed
Name Richard Hansen
Organization BBN
User Reference
Section 6
Page Number 125-127, 130
Line Number 3482-3596, 3692-3698
Interp Status ---
Final Accepted Text Note: 0002752
Summary 0000967: character set confusion
Description While reviewing the Character Set and Locale chapters, I noticed some issues:

  • Pages 125-127 lines 3482-3596, Table 6-1:
    • it isn't clear whether <hyphen-minus> and <hyphen>, <full-stop> and <period>, etc. are intended to be synonyms for the same character or distinct characters (whose closest Unicode equivalent happens to be the same)
    • the table doesn't include <BEL>, <BS>, <HT>, <LF>, <VT>, <FF>, and <CR> as aliases for <alert>, <backspace>, <tab>, <newline>, <vertical-tab>, <form-feed>, and <carriage-return>, respectively
    • <carriage-return> is not sorted in UCS codepoint order
  • Page 130 lines 3692-3698, Table 6-2:
    • the characters aren't defined
    • the relationship between <BEL> and <alert>, <BS> and <backspace>, etc. is not specified, but Table 10-1 on page 198 makes it seem like there is one
    • if there is a relationship, there's no need to repeat the characters

Desired Action On page 125 lines 3479-3481, change:
The first eight entries in [xref to Table 6-1] are defined in the ISO/IEC 6429:1992 standard and the rest of the characters are defined in the ISO/IEC 10646-1:2000 standard.
to:
The first eight entries in [xref to Table 6-1] and all characters in [xref to Table 6-2] are defined in the ISO/IEC 6429:1992 standard. The rest of the characters in [xref to Table 6-1] are defined in the ISO/IEC 10646-1:2000 standard.
On page 125 line 3483, change:
| Symbolic Name | Glyph | UCS | Description |
to:
| Symbolic Name(s) | Glyph | UCS | Description |
On page 125 lines 3485-3491, change:
| <alert>           |  | <U0007> | BELL (BEL)                |
| <backspace>       |  | <U0008> | BACKSPACE (BS)            |
| <tab>             |  | <U0009> | CHARACTER TABULATION (HT) |
| <carriage-return> |  | <U000D> | CARRIAGE RETURN (CR)      |
| <newline>         |  | <U000A> | LINE FEED (LF)            |
| <vertical-tab>    |  | <U000B> | LINE TABULATION (VT)      |
| <form-feed>       |  | <U000C> | FORM FEED (FF)            |
to (note the description column changes to match the Unicode spec, and the change in line order):
| <alert>, <BEL>          |  | <U0007> | BELL                 |
| <backspace>, <BS>       |  | <U0008> | BACKSPACE            |
| <tab>, <HT>             |  | <U0009> | CHARACTER TABULATION |
| <newline>, <LF>         |  | <U000A> | LINE FEED (LF)       |
| <vertical-tab>, <VT>    |  | <U000B> | LINE TABULATION      |
| <form-feed>, <FF>       |  | <U000C> | FORM FEED (FF)       |
| <carriage-return>, <CR> |  | <U000D> | CARRIAGE RETURN (CR) |
On page 125 lines 3505-3510, change:
| <hyphen-minus> | - | <U002D> | HYPHEN-MINUS |
| <hyphen>       | - | <U002D> | HYPHEN-MINUS |
| <full-stop>    | . | <U002E> | FULL STOP    |
| <period>       | . | <U002E> | FULL STOP    |
| <slash>        | / | <U002F> | SOLIDUS      |
| <solidus>      | / | <U002F> | SOLIDUS      |
to:
| <hyphen-minus>, <hyphen> | - | <U002D> | HYPHEN-MINUS |
| <full-stop>, <period>    | . | <U002E> | FULL STOP    |
| <slash>, <solidus>       | / | <U002F> | SOLIDUS      |
On page 126 lines 3556-3557, change:
| <backslash>       | \ | <U005C> | REVERSE SOLIDUS |
| <reverse-solidus> | \ | <U005C> | REVERSE SOLIDUS |
to:
| <backslash>, <reverse-solidus> | \ | <U005C> | REVERSE SOLIDUS |
On page 126 lines 3559-3562, change:
| <circumflex-accent> | ^ | <U005E> | CIRCUMFLEX ACCENT |
| <circumflex>        | ^ | <U005E> | CIRCUMFLEX ACCENT |
| <low-line>          | _ | <U005F> | LOW LINE          |
| <underscore>        | _ | <U005F> | LOW LINE          |
to:
| <circumflex-accent>, <circumflex> | ^ | <U005E> | CIRCUMFLEX ACCENT |
| <low-line>, <underscore>          | _ | <U005F> | LOW LINE          |
On page 127 lines 3591-3592, change:
| <left-brace>         | { | <U007B> | LEFT CURLY BRACKET |
| <left-curly-bracket> | { | <U007B> | LEFT CURLY BRACKET |
to:
| <left-brace>, <left-curly-bracket> | { | <U007B> | LEFT CURLY BRACKET |
On page 127 lines 3594-3595, change:
| <right-brace>         | } | <U007D> | RIGHT CURLY BRACKET |
| <right-curly-bracket> | } | <U007D> | RIGHT CURLY BRACKET |
to:
| <right-brace>, <right-curly-bracket> | } | <U007D> | RIGHT CURLY BRACKET |
On page 128 lines 3623-3625, after applying the changes in 0000663, delete:
that are not included in [xref to Table 6-1]
On page 130 lines 3693-3698, change:
<ACK> <DC2> <ENQ> <FS>  <IS4> <SOH>
<BEL> <DC3> <EOT> <GS>  <LF>  <STX>
<BS>  <DC4> <ESC> <HT>  <NAK> <SUB>
<CAN> <DEL> <ETB> <IS1> <RS>  <SYN>
<CR>  <DLE> <ETX> <IS2> <SI>  <US>
<DC1> [EM]  <FF>  <IS3> <SO>  <VT>
to:
| Symbolic Name(s) |   UCS   | Description                 |
+------------------+---------+-----------------------------+
| <SOH>            | <U0001> | START OF HEADING            |
| <STX>            | <U0002> | START OF TEXT               |
| <ETX>            | <U0003> | END OF TEXT                 |
| <EOT>            | <U0004> | END OF TRANSMISSION         |
| <ENQ>            | <U0005> | ENQUIRY                     |
| <ACK>            | <U0006> | ACKNOWLEDGE                 |
| <SO>             | <U000E> | SHIFT OUT                   |
| <SI>             | <U000F> | SHIFT IN                    |
| <DLE>            | <U0010> | DATA LINK ESCAPE            |
| <DC1>            | <U0011> | DEVICE CONTROL ONE          |
| <DC2>            | <U0012> | DEVICE CONTROL TWO          |
| <DC3>            | <U0013> | DEVICE CONTROL THREE        |
| <DC4>            | <U0014> | DEVICE CONTROL FOUR         |
| <NAK>            | <U0015> | NEGATIVE ACKNOWLEDGE        |
| <SYN>            | <U0016> | SYNCHRONOUS IDLE            |
| <ETB>            | <U0017> | END OF TRANSMISSION BLOCK   |
| <CAN>            | <U0018> | CANCEL                      |
| [EM]             | <U0019> | END OF MEDIUM               |
| <SUB>            | <U001A> | SUBSTITUTE                  |
| <ESC>            | <U001B> | ESCAPE                      |
| <IS4>, <FS>      | <U001C> | INFORMATION SEPARATOR FOUR  |
| <IS3>, <GS>      | <U001D> | INFORMATION SEPARATOR THREE |
| <IS2>, <RS>      | <U001E> | INFORMATION SEPARATOR TWO   |
| <IS1>, <US>      | <U001F> | INFORMATION SEPARATOR ONE   |
| <DEL>            | <U007F> | DELETE                      |
On page 198 lines 6484-6500, delete the Symbolic Name column.

Note that the square brackets in [EM] should be angle brackets (I can't get Mantis to escape the angle brackets, so it thinks I want to italicize something).
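
For illustration only, the merged-row layout proposed above can be modelled as rows of (symbolic names, UCS code point). The Python sketch below uses a hand-copied excerpt of the proposed Table 6-1 rows and checks that they are in code point order. `ROWS`, `is_ucs_sorted`, and `aliases` are hypothetical names chosen for this sketch; nothing here is defined by the standard.

```python
# Excerpt of the proposed Table 6-1 rows: (symbolic names, UCS code point).
# Hand-copied for illustration; this is not the full table.
ROWS = [
    (("alert", "BEL"), 0x0007),
    (("backspace", "BS"), 0x0008),
    (("tab", "HT"), 0x0009),
    (("newline", "LF"), 0x000A),
    (("vertical-tab", "VT"), 0x000B),
    (("form-feed", "FF"), 0x000C),
    (("carriage-return", "CR"), 0x000D),
    (("hyphen-minus", "hyphen"), 0x002D),
    (("full-stop", "period"), 0x002E),
    (("slash", "solidus"), 0x002F),
]


def is_ucs_sorted(rows):
    """Return True if the rows appear in strictly increasing code point order."""
    points = [cp for _names, cp in rows]
    return all(a < b for a, b in zip(points, points[1:]))


def aliases(rows):
    """Map every symbolic name to the full tuple of names sharing its row."""
    return {name: names for names, _cp in rows for name in names}


print(is_ucs_sorted(ROWS))
print(aliases(ROWS)["period"])
```

With one row per character, the question of whether two names are synonyms is answered by the table structure itself: names in the same row denote the same character.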
Tags tc2-2008, UTF-8_Locale
Attached Files

- Relationships
related to 0000663 Closed ajosey 1003.1(2008)/Issue 7 Specification of str[n]casecmp is ambiguous

-  Notes
(0002740)
geoffclare (manager)
2015-07-03 08:26

Table 6-2 is titled "Control Character Set" and is supposed to list all of the standard control characters. We should not remove the ones that are duplicated in Table 6-1. Please reinstate them (in the new format), and remove the page 128 deletion as that text will still be needed.
(0002741)
rhansen (manager)
2015-07-04 00:46

How about we rename the table to something like "Additional Control Characters"? I couldn't find any text that requires that table to contain every control character in the POSIX locale.

Either way, an additional change is required:

On page 129 lines 3689-3691 delete the following sentence:
Some of the encodings associated with the symbolic names in Table 6-2 (on page 130) may be the same as characters found in Table 6-1 (on page 125); both names shall be provided for each encoding.
(0002742)
rhansen (manager)
2015-07-04 01:06

Also:

On page 129 lines 3683-3686, change:
Each symbolic name specified in [xref to Table 6-1] (on page 125) shall be included in the file and shall be mapped to a unique coding value, except as noted below. The glyphs represented by the C character constants '{', '}', '_', '-', '/', '\\', '.', and '^' have more than one symbolic name; all symbolic names for each such glyph shall be included, each with identical encoding.
to:
Each symbolic name specified in [xref to Table 6-1] (on page 125) shall be included in the file. Each character in [xref to Table 6-1] (each row in the table) shall be mapped to a unique coding value.
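
A rough Python sketch of how the reworded requirement could be checked mechanically: every name in a row must be present, all names in a row must share one encoding, and no two rows may share an encoding. The function name `check_unique_rows` and the in-memory representation (a dict from symbolic name to encoded bytes) are assumptions made for this sketch; they do not correspond to any existing tool.

```python
from collections import defaultdict


def check_unique_rows(rows, charmap):
    """Check a charmap against rows of alias names.

    rows    -- iterable of tuples of symbolic names (one tuple per table row)
    charmap -- dict mapping symbolic name -> encoded bytes
    Returns a list of human-readable problems (empty if none were found).
    """
    problems = []
    owners = defaultdict(list)  # encoding -> rows that use it
    for names in rows:
        missing = [n for n in names if n not in charmap]
        if missing:
            problems.append(f"missing names: {missing}")
            continue
        encodings = {charmap[n] for n in names}
        if len(encodings) != 1:
            problems.append(f"aliases disagree: {names}")
            continue
        owners[encodings.pop()].append(names)
    for encoding, users in owners.items():
        if len(users) > 1:
            problems.append(f"encoding {encoding!r} shared by rows: {users}")
    return problems


rows = [("hyphen-minus", "hyphen"), ("full-stop", "period")]
good = {"hyphen-minus": b"-", "hyphen": b"-", "full-stop": b".", "period": b"."}
print(check_unique_rows(rows, good))  # prints []
```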
(0002743)
rhansen (manager)
2015-07-04 01:11

The following sentence on page 129 line 3689 is a bit problematic:
The encoding values shall each be represented in a single byte.

Does that sentence refer to just the characters in Table 6-2? Or also the characters in Table 6-1?

I interpret this sentence as requiring all locales that support the Table 6-2 characters to encode each of those characters as a single byte. And if that sentence also applies to Table 6-1, then each of those must be encoded as a single byte as well. Is that the intended meaning?
(0002744)
geoffclare (manager)
2015-07-06 10:38

(Response to Note: 0002743)

That sentence was added in TC1 in order to require that the characters in Table 6-2 have single-byte encodings.

The equivalent requirement for the characters in Table 6-1 is at the top of page 128 (last bullet item).
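
To make the single-byte requirement concrete, here is a small Python sketch that reports which Table 6-2 code points do not encode to exactly one byte. It uses Python's built-in codecs as a stand-in for a locale's character encoding, which is an assumption of this sketch; `TABLE_6_2` and `multibyte_points` are hypothetical names.

```python
# Code points of the Table 6-2 control characters as listed in this issue
# (U+0001..U+0006, U+000E..U+001F, U+007F).
TABLE_6_2 = [*range(0x01, 0x07), *range(0x0E, 0x20), 0x7F]


def multibyte_points(code_points, codec):
    """Return the code points that do not encode to exactly one byte."""
    bad = []
    for cp in code_points:
        try:
            encoded = chr(cp).encode(codec)
        except UnicodeEncodeError:
            bad.append(cp)  # not representable at all in this codec
            continue
        if len(encoded) != 1:
            bad.append(cp)
    return bad


print(multibyte_points(TABLE_6_2, "utf-8"))           # prints []
print(len(multibyte_points(TABLE_6_2, "utf-16-le")))  # prints 25
```

UTF-8 encodes every code point below U+0080 as one byte, so it passes; UTF-16 uses two bytes per code unit, so every character fails.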
(0002752)
rhansen (manager)
2015-07-10 21:04

On page 125 lines 3479-3481, change:
The first eight entries in [xref to Table 6-1] are defined in the ISO/IEC 6429:1992 standard and the rest of the characters are defined in the ISO/IEC 10646-1:2000 standard.
to:
The first eight entries in [xref to Table 6-1] and all characters in [xref to Table 6-2] are defined in the ISO/IEC 6429:1992 standard. The rest of the characters in [xref to Table 6-1] are defined in the ISO/IEC 10646-1:2000 standard.
On page 125 line 3483, change:
| Symbolic Name | Glyph | UCS | Description |
to:
| Symbolic Name(s) | Glyph | UCS | Description |
On page 125 lines 3485-3491, change:
| <alert>           |  | <U0007> | BELL (BEL)                |
| <backspace>       |  | <U0008> | BACKSPACE (BS)            |
| <tab>             |  | <U0009> | CHARACTER TABULATION (HT) |
| <carriage-return> |  | <U000D> | CARRIAGE RETURN (CR)      |
| <newline>         |  | <U000A> | LINE FEED (LF)            |
| <vertical-tab>    |  | <U000B> | LINE TABULATION (VT)      |
| <form-feed>       |  | <U000C> | FORM FEED (FF)            |
to (note the description column changes to match the Unicode spec, and the change in line order):
| <alert>, <BEL>          |  | <U0007> | BELL                 |
| <backspace>, <BS>       |  | <U0008> | BACKSPACE            |
| <tab>, <HT>             |  | <U0009> | CHARACTER TABULATION |
| <newline>, <LF>         |  | <U000A> | LINE FEED (LF)       |
| <vertical-tab>, <VT>    |  | <U000B> | LINE TABULATION      |
| <form-feed>, <FF>       |  | <U000C> | FORM FEED (FF)       |
| <carriage-return>, <CR> |  | <U000D> | CARRIAGE RETURN (CR) |
On page 125 lines 3505-3510, change:
| <hyphen-minus> | - | <U002D> | HYPHEN-MINUS |
| <hyphen>       | - | <U002D> | HYPHEN-MINUS |
| <full-stop>    | . | <U002E> | FULL STOP    |
| <period>       | . | <U002E> | FULL STOP    |
| <slash>        | / | <U002F> | SOLIDUS      |
| <solidus>      | / | <U002F> | SOLIDUS      |
to:
| <hyphen-minus>, <hyphen> | - | <U002D> | HYPHEN-MINUS |
| <full-stop>, <period>    | . | <U002E> | FULL STOP    |
| <slash>, <solidus>       | / | <U002F> | SOLIDUS      |
On page 126 lines 3556-3557, change:
| <backslash>       | \ | <U005C> | REVERSE SOLIDUS |
| <reverse-solidus> | \ | <U005C> | REVERSE SOLIDUS |
to:
| <backslash>, <reverse-solidus> | \ | <U005C> | REVERSE SOLIDUS |
On page 126 lines 3559-3562, change:
| <circumflex-accent> | ^ | <U005E> | CIRCUMFLEX ACCENT |
| <circumflex>        | ^ | <U005E> | CIRCUMFLEX ACCENT |
| <low-line>          | _ | <U005F> | LOW LINE          |
| <underscore>        | _ | <U005F> | LOW LINE          |
to:
| <circumflex-accent>, <circumflex> | ^ | <U005E> | CIRCUMFLEX ACCENT |
| <low-line>, <underscore>          | _ | <U005F> | LOW LINE          |
On page 127 lines 3591-3592, change:
| <left-brace>         | { | <U007B> | LEFT CURLY BRACKET |
| <left-curly-bracket> | { | <U007B> | LEFT CURLY BRACKET |
to:
| <left-brace>, <left-curly-bracket> | { | <U007B> | LEFT CURLY BRACKET |
On page 127 lines 3594-3595, change:
| <right-brace>         | } | <U007D> | RIGHT CURLY BRACKET |
| <right-curly-bracket> | } | <U007D> | RIGHT CURLY BRACKET |
to:
| <right-brace>, <right-curly-bracket> | } | <U007D> | RIGHT CURLY BRACKET |

On page 127 lines 3601-3603, delete the following sentence:
The table contains more than one symbolic character name for characters whose traditional name differs from the chosen name.

On page 128 lines 3623-3625, after applying the changes in 0000663, delete:
that are not included in [xref to Table 6-1]

On page 129 lines 3683-3691, change:
Each symbolic name specified in [xref to Table 6-1] (on page 125) shall be included in the file and shall be mapped to a unique coding value, except as noted below. The glyphs represented by the C character constants <tt>'{'</tt>, <tt>'}'</tt>, <tt>'_'</tt>, <tt>'-'</tt>, <tt>'/'</tt>, <tt>'\\'</tt>, <tt>'.'</tt>, and <tt>'^'</tt> have more than one symbolic name; all symbolic names for each such glyph shall be included, each with identical encoding. If some or all of the control characters identified in [xref to Table 6-2] (on page 130) are supported by the implementation, the symbolic names and their corresponding encoding values shall be included in the file. The encoding values shall each be represented in a single byte. Some of the encodings associated with the symbolic names in [xref to Table 6-2] (on page 130) may be the same as characters found in [xref to Table 6-1] (on page 125); both names shall be provided for each encoding.
to:
Each symbolic name specified in [xref to Table 6-1] (on page 125) shall be included in the file. Each character in [xref to Table 6-1] (each row in the table) shall be mapped to a unique coding value. For each character in [xref to Table 6-2] that exists in the character set described by the file, the character's symbolic name(s) from [xref to Table 6-2] and the character's single-byte encoding value shall be included in the file.

On page 130 lines 3692-3698, change:
Table 6-2 Control Character Set
<ACK> <DC2> <ENQ> <FS>  <IS4> <SOH>
<BEL> <DC3> <EOT> <GS>  <LF>  <STX>
<BS>  <DC4> <ESC> <HT>  <NAK> <SUB>
<CAN> <DEL> <ETB> <IS1> <RS>  <SYN>
<CR>  <DLE> <ETX> <IS2> <SI>  <US>
<DC1> [EM]  <FF>  <IS3> <SO>  <VT>
to:
Table 6-2 Non-Portable Control Characters
| Symbolic Name(s) |   UCS   | Description                 |
+------------------+---------+-----------------------------+
| <SOH>            | <U0001> | START OF HEADING            |
| <STX>            | <U0002> | START OF TEXT               |
| <ETX>            | <U0003> | END OF TEXT                 |
| <EOT>            | <U0004> | END OF TRANSMISSION         |
| <ENQ>            | <U0005> | ENQUIRY                     |
| <ACK>            | <U0006> | ACKNOWLEDGE                 |
| <SO>             | <U000E> | SHIFT OUT                   |
| <SI>             | <U000F> | SHIFT IN                    |
| <DLE>            | <U0010> | DATA LINK ESCAPE            |
| <DC1>            | <U0011> | DEVICE CONTROL ONE          |
| <DC2>            | <U0012> | DEVICE CONTROL TWO          |
| <DC3>            | <U0013> | DEVICE CONTROL THREE        |
| <DC4>            | <U0014> | DEVICE CONTROL FOUR         |
| <NAK>            | <U0015> | NEGATIVE ACKNOWLEDGE        |
| <SYN>            | <U0016> | SYNCHRONOUS IDLE            |
| <ETB>            | <U0017> | END OF TRANSMISSION BLOCK   |
| <CAN>            | <U0018> | CANCEL                      |
| [EM]             | <U0019> | END OF MEDIUM               |
| <SUB>            | <U001A> | SUBSTITUTE                  |
| <ESC>            | <U001B> | ESCAPE                      |
| <IS4>, <FS>      | <U001C> | INFORMATION SEPARATOR FOUR  |
| <IS3>, <GS>      | <U001D> | INFORMATION SEPARATOR THREE |
| <IS2>, <RS>      | <U001E> | INFORMATION SEPARATOR TWO   |
| <IS1>, <US>      | <U001F> | INFORMATION SEPARATOR ONE   |
| <DEL>            | <U007F> | DELETE                      |
On page 198 lines 6484-6500, delete the Symbolic Name column.

Note that the square brackets in [EM] should be angle brackets (I can't get Mantis to escape the angle brackets, so it thinks I want to italicize something).
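
As a sketch of how the new three-column Table 6-2 layout could be consumed, the Python below parses rows in the ASCII form used in this note into (names, code point, description) tuples. The function `parse_row` and both regular expressions are hypothetical and written only for this illustration; they describe the table text as pasted here, not any normative file format.

```python
import re

# One row of the three-column layout: | names | <Uxxxx> | description |
ROW = re.compile(
    r"^\|\s*(?P<names>[^|]+?)\s*\|\s*<U(?P<ucs>[0-9A-Fa-f]{4})>\s*\|"
    r"\s*(?P<desc>[^|]+?)\s*\|$"
)
# Names are normally written <NAME>; [NAME] is tolerated because the note
# above explains that <EM> had to be typed with square brackets.
NAME = re.compile(r"[<\[]([^<>\[\]]+)[>\]]")


def parse_row(line):
    """Return (names, code_point, description), or None for non-data lines."""
    match = ROW.match(line.strip())
    if match is None:
        return None
    names = tuple(NAME.findall(match["names"]))
    return names, int(match["ucs"], 16), match["desc"]


sample = """\
| Symbolic Name(s) |   UCS   | Description                 |
+------------------+---------+-----------------------------+
| [EM]             | <U0019> | END OF MEDIUM               |
| <IS4>, <FS>      | <U001C> | INFORMATION SEPARATOR FOUR  |
"""
rows = [r for r in map(parse_row, sample.splitlines()) if r is not None]
for names, cp, desc in rows:
    print(names, hex(cp), desc)
```

The header and separator lines do not match the row pattern and are skipped, so only the two data rows are printed.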

- Issue History
Date Modified Username Field Change
2015-07-02 19:47 rhansen New Issue
2015-07-02 19:47 rhansen Name => Richard Hansen
2015-07-02 19:47 rhansen Organization => BBN
2015-07-02 19:47 rhansen Section => 6
2015-07-02 19:47 rhansen Page Number => 125-127, 130
2015-07-02 19:47 rhansen Line Number => 3482-3596, 3692-3698
2015-07-02 19:47 rhansen Interp Status => ---
2015-07-02 19:52 rhansen Desired Action Updated
2015-07-02 19:54 rhansen Desired Action Updated
2015-07-02 19:58 rhansen Desired Action Updated
2015-07-02 19:59 rhansen Desired Action Updated
2015-07-02 20:00 rhansen Desired Action Updated
2015-07-02 20:02 rhansen Relationship added related to 0000663
2015-07-02 20:21 rhansen Tag Attached: UTF-8_Locale
2015-07-02 20:48 rhansen Desired Action Updated
2015-07-03 08:26 geoffclare Note Added: 0002740
2015-07-04 00:46 rhansen Note Added: 0002741
2015-07-04 01:06 rhansen Note Added: 0002742
2015-07-04 01:11 rhansen Note Added: 0002743
2015-07-06 10:38 geoffclare Note Added: 0002744
2015-07-10 21:04 rhansen Note Added: 0002752
2015-07-16 15:37 geoffclare Final Accepted Text => Note: 0002752
2015-07-16 15:37 geoffclare Status New => Resolved
2015-07-16 15:37 geoffclare Resolution Open => Accepted As Marked
2015-07-16 15:38 geoffclare Tag Attached: tc2-2008
2019-06-10 08:54 agadmin Status Resolved => Closed

