ID | Category | Severity | Type | Date Submitted | Last Update
0001100 | [1003.1(2016/18)/Issue7+TC2] Shell and Utilities | Editorial | Clarification Requested | 2016-10-27 12:40 | 2018-05-17 16:03
Reporter | Mark_Galeck | View Status | public
Assigned To |
Priority | normal | Resolution | Rejected
Status | Closed
Name | Mark Galeck
Organization |
User Reference |
Section | 2.10 Shell Grammar
Page Number | 2375-2381
Line Number | 75873-76150
Interp Status | ---
Final Accepted Text |
Summary | 0001100: Rewrite of Section 2.10 Shell Grammar, of the Shell Standard, to fix previous reports, fix new issues, and improve presentation.
Description |
I recently made several reports concerning sections 2.10.1/2, and then I saw at least one more problem of a similar kind. If I continue making incremental reports, then even if the changes are approved, they will result in a bigger and bigger mess. Therefore I decided to cancel some previous reports, add new issues, and make one summary report: a comprehensive rewrite of the whole Shell Grammar section that fixes the issues I found and makes the whole presentation more straightforward and less convoluted.

Here is the list of all the specific bugs this report addresses, including some previous reports. I am not listing changes here that merely improve the presentation; to see all the changes, you should probably use some "diff" program.

1. Previous reports 1096, 1094, 1097, 1099, 1095, 1092 are included here and can be cancelled.

2. Previous reports 1098, 1093, 1091, 1088 can be cancelled. Let's say we classify them as bogus; those changes are not included here.

3. (new issue) In the current standard, cmd_word cannot be a reserved word. It is very convoluted, but if you carefully trace the application of the various rules to each other, you will find that cmd_name and cmd_word follow exactly the same semantics right now: neither allows reserved words. Only cmd_name should disallow reserved words.

4. (new issue) In multiple places in the current standard, rule 1 applies to WORD, and thus reserved words are not allowed, where all words should be allowed. Some of the reports above cover this. Additionally, WORD in the case_clause production currently cannot be a reserved word, but it should be allowed to be one. The same holds for WORD in the cmd_suffix production.

------------------------

This rewrite is intended only to include the changes mentioned above, and should otherwise be equivalent to the current standard. I will be happy to answer any questions, provide clarifications, or fix any bugs you find.

I do not have the time to discuss the merits of the changes. The maintainer of this standard is free to reject any part or all of this report, or to continue to rewrite my Section 2.10 in any way that suits them. I do not mind at all. Yes, the text I provide for the new Section 2.10 is just raw text; it does not have hyperlinks or different fonts. Somebody else would have to add those. Thank you!
Desired Action |
2.10. Shell Grammar

The following grammar defines the Shell Command Language. This formal syntax shall take precedence over the preceding text syntax description.

The rules in Token Recognition delimit operator and word tokens. In order to appear in the grammar as token identifiers, the tokens shall be classified according to the following rules, applied in the following order of precedence:

1. The token identifier for any operator, occurs when the token is that operator.

2. IO_NUMBER is if the string consists solely of digits and the delimiter character is one of '<' or '>'.

3. This rule only applies in function_body production; see below in the grammar. Word expansion and assignment shall never occur, even when required by the rules below, when this production is being parsed. WORD is each token that might either be expanded or have assignment applied to it, consisting only of characters that are exactly described in Token Recognition.

4. The token identifier for any reserved word, occurs when the token is exactly that reserved word.

   Note: Because at this point <quotation-mark> characters are retained in the token, quoted strings cannot be recognized as reserved words. Also note that line joining is done before tokenization, as described in Escape Character (Backslash), so escaped <newline> characters are already removed at this point.

5. This rule only applies in simple_command and cmd_prefix productions; see below in the grammar. For this rule, we define "important" <equal-sign> characters in a token: they are unquoted (as determined while applying rule 4 from Token Recognition), that are not part of an embedded parameter expansion, command substitution, or arithmetic expansion construct (as determined while applying rule 5 from Token Recognition), and do not begin the token. For the definition of a valid "name", see XBD Name.

   5a. If the token does not contain important '=' and is not a reserved word, it is WORD. If there are important '=' and all the characters preceding the first such '=' do not form a valid name, it is unspecified whether it is WORD.

   5b. If the token does not contain important '=', it is WORD. If there are important '=' and all the characters preceding the first such '=' do not form a valid name, it is unspecified whether it is WORD.

   5c. If there are important '=' and all the characters preceding the first such '=' form a valid name, it is ASSIGNMENT_WORD. If they do not form a valid name, it is unspecified whether it is ASSIGNMENT_WORD. Assignment to the name within ASSIGNMENT_WORD token shall occur as specified in Simple Commands.

6. This rule only applies in the function_definition production; see below in the grammar. NAME is any word that is not reserved, and is a valid name.

7. This rule only applies in the for_clause production; see below in the grammar. NAME is any valid name.

8. This rule only applies in pattern_not_esac productions; see below in the grammar. WORD is any word except 'esac'.

9. This rule only applies in here_end production; see below in the grammar. Quote removal shall be applied to the word to determine the delimiter that is used to find the end of the here-document that begins after the next <newline>.

10. This rule only applies in the filename production; see below in the grammar. The expansions specified in Redirection shall occur. WORD occurs, if as specified there, exactly one field results (or the result is unspecified), and there are additional requirements on pathname expansion.

11. WORD is any word.

------------------------------

The WORD tokens shall have the word expansion rules applied to them immediately before the associated command is executed, not at the time the command is parsed.
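As an illustration of rules 5a to 5c, the following is a minimal Python sketch of the classification. It is not part of the proposed text. The helper names (`first_important_equals`, `classify`) are hypothetical, and the scan for "important" '=' is a simplification that only approximates the quoting and nesting determination made by rules 4 and 5 of Token Recognition (for example, it does not track parentheses nested inside a command substitution).

```python
import re

# Valid name per XBD Name: letters, digits, underscores; no leading digit.
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

RESERVED = {"if", "then", "else", "elif", "fi", "do", "done",
            "case", "esac", "while", "until", "for", "{", "}", "!", "in"}


def first_important_equals(token):
    """Return the index of the first "important" '=' in token, or -1.

    Simplified scan (an assumption of this sketch): skips backslash-escaped
    characters, quoted text, and anything inside $(...), ${...} or
    backquotes. An '=' that begins the token does not count.
    """
    depth = 0      # nesting depth of $( and ${ constructs
    quote = None   # active quote character: "'", '"', '`', or None
    i = 0
    while i < len(token):
        ch = token[i]
        if quote == "'":
            if ch == "'":
                quote = None
        elif ch == "\\":
            i += 1  # skip the escaped character
        elif quote is not None:
            if ch == quote:
                quote = None
        elif ch in "'\"`":
            quote = ch
        elif ch == "$" and token[i + 1:i + 2] in ("(", "{"):
            depth += 1
            i += 1
        elif ch in ")}" and depth > 0:
            depth -= 1
        elif ch == "=" and depth == 0 and i > 0:
            return i
        i += 1
    return -1


def classify(token, rule):
    """Classify token under rule '5a', '5b' or '5c'.

    Returns 'WORD', 'ASSIGNMENT_WORD', 'unspecified', or None when the
    rule assigns no identifier to the token.
    """
    eq = first_important_equals(token)
    if eq == -1:
        if rule == "5a":
            return None if token in RESERVED else "WORD"
        if rule == "5b":
            return "WORD"
        return None  # rule 5c only ever yields ASSIGNMENT_WORD
    if NAME_RE.fullmatch(token[:eq]):
        return "ASSIGNMENT_WORD" if rule == "5c" else None
    return "unspecified"
```

In this sketch, `classify("if", "5a")` yields no identifier while `classify("if", "5b")` yields WORD, which is the cmd_name versus cmd_word distinction described in issue 3 of the Description.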
/* -------------------------------------------------------
   The grammar symbols
   ------------------------------------------------------- */
%token  WORD
%token  ASSIGNMENT_WORD
%token  NAME
%token  NEWLINE
%token  IO_NUMBER

/* The following are the operators (see XBD Operator)
   containing more than one character. */

%token  AND_IF    OR_IF    DSEMI
/*      '&&'      '||'     ';;'    */

%token  DLESS  DGREAT  LESSAND  GREATAND  LESSGREAT  DLESSDASH
/*      '<<'   '>>'    '<&'     '>&'      '<>'       '<<-'   */

%token  CLOBBER
/*      '>|'   */

/* The following are the reserved words. */

%token  If    Then    Else    Elif    Fi    Do    Done
/*      'if'  'then'  'else'  'elif'  'fi'  'do'  'done'   */

%token  Case    Esac    While    Until    For
/*      'case'  'esac'  'while'  'until'  'for'   */

/* These are reserved words, not operator tokens, and are
   recognized when reserved words are recognized. */

%token  Lbrace    Rbrace    Bang
/*      '{'       '}'       '!'   */

%token  In
/*      'in'   */

/* -------------------------------------------------------
   The Grammar
   ------------------------------------------------------- */
%start program
%%
program          : linebreak complete_commands linebreak
                 | linebreak
                 ;
complete_commands: complete_commands newline_list complete_command
                 |                                complete_command
                 ;
complete_command : list separator_op
                 | list
                 ;
list             : list separator_op and_or
                 |                   and_or
                 ;
and_or           :                         pipeline
                 | and_or AND_IF linebreak pipeline
                 | and_or OR_IF  linebreak pipeline
                 ;
pipeline         :      pipe_sequence
                 | Bang pipe_sequence
                 ;
pipe_sequence    :                             command
                 | pipe_sequence '|' linebreak command
                 ;
command          : simple_command
                 | compound_command
                 | compound_command redirect_list
                 | function_definition
                 ;
compound_command : brace_group
                 | subshell
                 | for_clause
                 | case_clause
                 | if_clause
                 | while_clause
                 | until_clause
                 ;
subshell         : '(' compound_list ')'
                 ;
compound_list    : linebreak term
                 | linebreak term separator
                 ;
term             : term separator and_or
                 |                and_or
                 ;
/* Apply rule 7: */
for_clause       : For NAME                                      do_group
                 | For NAME                       sequential_sep do_group
                 | For NAME linebreak In          sequential_sep do_group
                 | For NAME linebreak In wordlist sequential_sep do_group
                 ;
wordlist         : wordlist WORD
                 |          WORD
                 ;
case_clause      : Case WORD linebreak In linebreak case_list    Esac
                 | Case WORD linebreak In linebreak case_list_ns Esac
                 | Case WORD linebreak In linebreak              Esac
                 ;
case_list_ns     : case_list case_item_ns
                 |           case_item_ns
                 ;
case_list        : case_list case_item
                 |           case_item
                 ;
case_item_ns     : pattern_not_esac ')' linebreak
                 | pattern_not_esac ')' compound_list
                 | '(' pattern      ')' linebreak
                 | '(' pattern      ')' compound_list
                 ;
case_item        : pattern_not_esac ')' linebreak     DSEMI linebreak
                 | pattern_not_esac ')' compound_list DSEMI linebreak
                 | '(' pattern      ')' linebreak     DSEMI linebreak
                 | '(' pattern      ')' compound_list DSEMI linebreak
                 ;
/* Apply rule 8: */
pattern_not_esac : WORD
                 | WORD '|' pattern
                 ;
pattern          :             WORD
                 | pattern '|' WORD
                 ;
if_clause        : If compound_list Then compound_list else_part Fi
                 | If compound_list Then compound_list           Fi
                 ;
else_part        : Elif compound_list Then compound_list
                 | Elif compound_list Then compound_list else_part
                 | Else compound_list
                 ;
while_clause     : While compound_list do_group
                 ;
until_clause     : Until compound_list do_group
                 ;
/* Apply rule 6: */
function_definition : NAME '(' ')' linebreak function_body
                 ;
/* Apply rule 3: */
function_body    : compound_command
                 | compound_command redirect_list
                 ;
brace_group      : Lbrace compound_list Rbrace
                 ;
do_group         : Do compound_list Done
                 ;
simple_command   : cmd_prefix WORD cmd_suffix     /* Apply rule 5b */
                 | cmd_prefix WORD                /* Apply rule 5b */
                 | cmd_prefix
                 | WORD cmd_suffix                /* Apply rule 5a */
                 | WORD                           /* Apply rule 5a */
                 ;
/* Apply rule 5c: */
cmd_prefix       :            io_redirect
                 | cmd_prefix io_redirect
                 |            ASSIGNMENT_WORD
                 | cmd_prefix ASSIGNMENT_WORD
                 ;
cmd_suffix       :            io_redirect
                 | cmd_suffix io_redirect
                 |            WORD
                 | cmd_suffix WORD
                 ;
redirect_list    :               io_redirect
                 | redirect_list io_redirect
                 ;
io_redirect      :           io_file
                 | IO_NUMBER io_file
                 |           io_here
                 | IO_NUMBER io_here
                 ;
io_file          : '<'       filename
                 | LESSAND   filename
                 | '>'       filename
                 | GREATAND  filename
                 | DGREAT    filename
                 | LESSGREAT filename
                 | CLOBBER   filename
                 ;
filename         : WORD                           /* Apply rule 10 */
                 ;
io_here          : DLESS     here_end
                 | DLESSDASH here_end
                 ;
here_end         : WORD                           /* Apply rule 9 */
                 ;
newline_list     :              NEWLINE
                 | newline_list NEWLINE
                 ;
linebreak        : newline_list
                 | /* empty */
                 ;
separator_op     : '&'
                 | ';'
                 ;
separator        : separator_op linebreak
                 | newline_list
                 ;
sequential_sep   : ';' linebreak
                 | newline_list
                 ;
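To illustrate how rule 8 interacts with the pattern_not_esac and pattern productions, here is a minimal Python sketch. It is not part of the proposed text. The function name `parse_case_pattern` is hypothetical, and the sketch assumes the input has already been split into a list of token strings in which '(', ')' and '|' appear only as operators.

```python
OPERATORS = {"(", ")", "|"}


def parse_case_pattern(tokens):
    """Parse the pattern part of a case item, up to and including ')'.

    Follows the productions
        pattern_not_esac : WORD | WORD '|' pattern      (rule 8)
        pattern          : WORD | pattern '|' WORD      (rule 11)
    Returns (pattern_words, tokens_consumed) or raises SyntaxError.
    """
    pos = 0
    # A leading '(' selects the "'(' pattern ')'" alternatives of case_item.
    parenthesized = tokens[:1] == ["("]
    if parenthesized:
        pos = 1

    words = []
    while True:
        if pos >= len(tokens) or tokens[pos] in OPERATORS:
            raise SyntaxError("expected a pattern word")
        word = tokens[pos]
        # Rule 8: only the first WORD of pattern_not_esac excludes 'esac'.
        if not words and not parenthesized and word == "esac":
            raise SyntaxError("'esac' cannot begin an unparenthesized pattern")
        words.append(word)
        pos += 1

        if pos >= len(tokens):
            raise SyntaxError("expected '|' or ')'")
        if tokens[pos] == ")":
            return words, pos + 1
        if tokens[pos] != "|":
            raise SyntaxError("expected '|' or ')'")
        pos += 1  # consume '|' and continue with the next WORD
```

Under these assumptions a reserved word such as `if` is accepted as a pattern word, and `esac` is accepted after '(' or after '|', but not as the first word of an unparenthesized pattern.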