Austin Group Defect Tracker

Aardvark Mark IV


ID 0000800
Category [1003.1(2013)/Issue7+TC1] System Interfaces
Severity Editorial
Type Enhancement Request
Date Submitted 2013-11-18 10:50
Last Update 2014-04-11 07:01
Reporter joerg
View Status public
Assigned To
Priority normal
Resolution Rejected
Status Closed
Name Jörg Schilling
Organization
User Reference
Section stdarg.h and printf()
Page Number 342, 905
Line Number 11449, 11485, 30393
Interp Status ---
Final Accepted Text
Summary 0000800: Add support for the recursive printf() format %r
Description The recursive printf format "%r" has existed since at least approx. 1980. It is not
in the standard yet because at some time in the mid-1980s people believed that
this feature could not be implemented in a portable way, and the technically
inferior v*printf() was introduced instead.

In contrast to the v*printf() functions, the "%r" format allows not just one
level of indirection but an unlimited number of indirections. If the standard
adds new printf()-like functions, introducing related v*() functions can be
avoided if the "%r" format is present.
 
The "%r" format takes two arguments:
 
- a printf format string that again may contain "%r".
 
- a parameter of the type "va_list" that holds the arguments for the
        format string from above.

Now that the "%r" format has existed in a reference implementation for more than
30 years and has been ported to all known platforms, it is time to add it to
the standard.
 
In order to implement "%r", an enhancement to stdarg.h is needed, as the type
"va_list" is an opaque type that may be based on the only non-orthogonal type
from the C-language, the array.

The enhancement is not needed for users of the feature but for implementing
"%r". The enhancement is:
 
va_list va_arg_list(va_list ap) fetches a va_list argument from a variable
                                argument list. It is not possible to assign
                                the result of a va_arg_list() call to a
                                variable if va_list is an array, but it may
                                be used as an argument of a function call.

The implementation of va_arg_list() is done this way:
 
#ifdef VA_LIST_IS_ARRAY
#define va_arg_list(list) va_arg(list, void *)
#else
#define va_arg_list(list) va_arg(list, va_list)
#endif
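
Why an array-typed va_list is awkward can be illustrated without touching stdarg.h at all. The following is a minimal sketch using two stand-in types, cursor_struct and cursor_array (hypothetical names, not taken from any implementation): the array-typed one decays to a pointer when passed to a function, so the callee advances the caller's object, and it can be neither assigned nor returned by value.

```c
/* Stand-in types (hypothetical): a struct-typed and an array-typed "cursor",
 * mirroring the two shapes a va_list may have on different platforms. */
typedef struct { int next; } cursor_struct;
typedef cursor_struct cursor_array[1];

/* Receives a copy: the caller's cursor is unchanged. */
void advance_struct(cursor_struct c) { c.next++; }

/* The parameter is adjusted to a pointer: the caller's cursor is changed. */
void advance_array(cursor_array c) { c[0].next++; }

int demo_struct(void)
{
    cursor_struct s = { 0 };
    advance_struct(s);
    return s.next;          /* still 0 */
}

int demo_array(void)
{
    cursor_array a = { { 0 } };
    /* cursor_array b = a;   would not compile: arrays are not assignable */
    advance_array(a);
    return a[0].next;       /* now 1 */
}
```

This is the same asymmetry that makes va_arg(ap, va_list) unusable when va_list is an array type.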
This is an example of how to use the feature:
 
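#include <stdarg.h>
#include <stdio.h>

void xprnt(const char *fmt, ...)
{
        va_list lst;

        va_start(lst, fmt);
        /* "%r" is the proposed format: it takes a format string and a va_list */
        printf("Output: %r.\n", fmt, lst);
        va_end(lst);
}

int main(void)
{
        xprnt("%d %s", 123, "test");
        return 0;
}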
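
For comparison, here is a minimal sketch of how the same output can be produced today with the standard v*printf() family only. The function name xprnt_buf and its buffer parameters are assumptions made for illustration (writing to a buffer makes the result easy to check); it is not part of the proposal.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes "Output: <expansion of fmt>.\n" into buf.
 * Returns the number of characters written, or -1 if buf is too small. */
int xprnt_buf(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    size_t used;
    int n;

    n = snprintf(buf, size, "Output: ");
    if (n < 0 || (size_t)n >= size)
        return -1;
    used = (size_t)n;

    /* One level of indirection: hand the caller's arguments to vsnprintf(). */
    va_start(ap, fmt);
    n = vsnprintf(buf + used, size - used, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= size - used)
        return -1;
    used += (size_t)n;

    n = snprintf(buf + used, size - used, ".\n");
    if (n < 0 || (size_t)n >= size - used)
        return -1;
    return (int)(used + (size_t)n);
}
```

The surrounding text has to be emitted in separate calls, which is the extra work that "%r" is meant to remove.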

Desired Action On page 342 after line 11449 add:
 
va_list va_arg_list(va_list ap);
 
After line 11485 add:
 
va_arg_list returns the next va_list type argument in the list pointed to by
ap. The behavior is as if va_arg(ap, va_list) was called, but this would not
work in case va_list is an array type and va_arg_list works even when va_list
is an array type. The result from calling va_arg_list() cannot be assigned to
a variable but va_arg_list() may appear as parameter of a function call.
 
??? Should we require that va_arg_list() may be used as the second argument
of a va_copy() call ???
 
On page 905 after line 30393 add:
 
r Recursive printf(). The %r format takes two parameters, a pointer
        to an array of char being the new format string and a va_list that
        holds the arguments for the format string.
 
Tags No tags attached.
Attached Files

- Relationships

-  Notes
(0001995)
dalias (reporter)
2013-11-18 19:21

Generally I object to adding this to POSIX for lack of precedent. An idea from 33 years ago that was abandoned at the time is not a precedent. If it's to be added, I think it should be added to the core C language, not to POSIX.

I also don't see any reason to couple the addition of this va_arg_list macro with the proposed %r format specifier to printf. The mechanism by which %r itself would be implemented is outside the scope of the standards.
(0001996)
joerg (reporter)
2013-11-18 20:33
edited on: 2013-11-18 20:44

Be open to things you don't know... Just because you don't know of it
doesn't prove that there is no precedent. As mentioned, there is a
heavily used implementation, and this also appeared in a real-time
UNIX clone already in 1982.

POSIX likes to see a reference implementation and this implementation
exists.

BTW: if you take the time to look at a printf() implementation, you may
be able to understand why va_arg_list() is needed to implement the feature.

When SunOS 4 or SVr4 (I am not sure who actually introduced that) came up
with the '$' modifier for printf(), they needed to introduce va_copy()
for the same reason why %r needs va_arg_list().

(0001997)
shware_systems (reporter)
2013-11-19 20:33

The v*printf functions may have been less functional, but they were backwards compatible. This has %r consuming 2 arguments, where all the other format specifiers and modifiers just consume 1. This breaks the requirements of the %n$ type of specifier. Having %r just refer to the va_list argument would fix that, with maybe %R referring to a format string and a * modifier on %r referring to a list size count instead of, or with, a format. A * on %R could be used for max width like with %s too. Using the separate specifier would indicate that the string was to be saved for use with %r, not used directly like %s. Also, while va_arg() is supposed to be a macro, I've seen at least one implementation that uses function calls to implement it, so va_arg_list() would need to guard against an array being a function result as well. Relying on the compiler to implicitly cast void * to &va_list may be problematic. It might be easier to add an optional named parameter to va_copy() with a boolean to advance the arg reference after copying the argument as a list. While I'm not against the idea, I think changes like that would be needed and an implementation incorporating them as proof of concept is also desirable.
(0001998)
joerg (reporter)
2013-11-20 10:09
edited on: 2013-11-20 10:24

I am not sure whether I understand you correctly. Do you mean to say that
having a real-world implementation that has been in use for 31+ years and
verified at least on:

Alpha
amd64
ARM
HP-PA
Itanium
MC68x00
MC88x00
MIPS
PPC
s390
sparc
superH
x86

using 14 different compilers is not sufficient as a "proof of concept"?
If you believe this, we should withdraw v*printf(), as something that
has existed for only 25 years then of course cannot be seen as a proof of
concept either.

If you have a look at the printf implementation, you will see, that
the format %*.*s takes 3 var arg parameters which is even more than
what %r needs.

If you want to implement %n$, you of course need to implement a parser
that understands the format in order to be able to correctly skip
to the right parameter. The effort for supporting %r is the same effort
that you need to add support for e.g. long double parameters.

IIRC, va_list may be either a void *, a structure or an array[1] of
structure. Anyway, the first architecture since 1982 that needed different
handling was PPC, and since va_arg_list() was implemented
as documented 15 years ago (after the PPC problem had been discovered),
all new architectures have worked without a need for another change.

If you know of a real problem, you are of course welcome to discuss it.

(0001999)
shware_systems (reporter)
2013-11-20 20:24
edited on: 2013-11-26 19:00

Re: "If you have a look at the printf implementation, you will see, that
the format %*.*s takes 3 var arg parameters which is even more than
what %r needs."

No, each '*' takes 1 each and the 's' takes 1, so none consumes more than 1; there's a character for each argument. A '%r' takes two arguments for the one character. I'm not debating it isn't doable as proposed as an extension; the actual mechanics when the compiler cooperates are pretty simple, and the list shows many have. I even grant it could be added to the C standard as is as a generic specifier, since C doesn't require %n$ forms of a format string. However, all those that have done it are non-conforming to POSIX by relaxing the restriction from Line 30189-91:
"When numbered argument specifications are used, specifying the Nth argument requires that all the leading arguments, from the first to the (N−1)th, are specified in the format string."

That string could be:
    printf("%1$*3$.*2$s", str, num, width)
where with %r it would be
    printf("%1$*4$.*3$r", str, lst, num, width)
and no %2$ or %n$*2$ reference.
Even with %r taking no modifiers, with a following specifier of any sort, e.g.:
    printf("%1$r %5$*3$.*4$s", str, lst, width, num, str2)
has no %2$ or %n$*2$ reference either, yet the %s reference is how "%*.*s" would be modified for similar argument positioning.

The only case where it's compatible is when '%N$r' is required to be the last format specifier. I've no idea for what historical reason that restriction is there, but it is there, so it needs to be accounted for. My change accounts for it and adds more general purpose functionality to the concept, which is what would probably need the new reference implementation.
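
To make the numbered form concrete, here is a minimal sketch of the existing "%1$*3$.*2$s" case, where every argument position from 1 to 3 is referenced exactly as Lines 30189-91 require. The wrapper name format_numbered is an assumption for illustration, and the %n$ syntax is a POSIX extension that is not part of ISO C.

```c
#include <stdio.h>

/* Hypothetical wrapper around the POSIX numbered-argument form.
 * Argument 1 is the string, argument 2 the precision, argument 3 the width. */
int format_numbered(char *buf, size_t size, const char *str, int prec, int width)
{
    return snprintf(buf, size, "%1$*3$.*2$s", str, prec, width);
}
```

With str "abcdef", precision 3 and width 5, the string is first cut to "abc" and then right-adjusted in a field of five characters. A "%1$r" would additionally occupy position 2 without any "%2$" appearing in the format, which is the gap described above.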

Re: "IIRC, va_list may be either a void *, a structure or an array[1] of structure."

As to the standard, va_list is allowed to be anything, including int * or char * and struct[NL_ARGMAX] arrays. Using those might make an implementation less efficient, but they're allowed.
Per Line 11459: "The <stdarg.h> header shall define the va_list type for variables used to traverse the list."

The C standard, c99tc3 Sect. 7.15p3, uses:
"The type declared is va_list which is an object type suitable for holding information needed by the macros va_start, va_arg, va_end, and va_copy."

The "object type" qualifier still includes arrays of any sort, not just array[1]. Modifying va_copy() instead of adding va_arg_list() avoids the macro having to be usable as all rvalue types, not just rvalue atomic types, including long double, that may be returned by functions. All the standard format specifiers and * are restricted to those atomic types so the current language is adequate. The referenced implementations may have decided to pick just one of those 3 types, but that's an implementation defined limit, not what the standard requires.

So, from a standardization standpoint, these are issues that just haven't bitten anyone yet but might if it was required behavior. Either the addition changes or those existing parts change somehow to resolve the conflicts, and the proposal would have to account for those changes to existing parts too. If any of the referenced compilers are considered conforming it's with the behavior of %r as described disabled, or not tested for as being an extension.

(0002000)
dalias (reporter)
2013-11-20 20:42

Jörg, I think it would be helpful to your argument to cite the specific real-world implementation you're talking about rather than just asserting that it exists with no reference.

I'm quite aware that the fact va_list can have array type makes lots of things problematic (I've dealt with several such issues in the past including in my implementation of printf), but I still think the mechanisms of how %r might be implemented are outside the scope of the request for it to be added.

Regarding the comments by shware_systems about the number of arguments, it's conceivable that the proposed %r specifier could take as its argument a pointer to a structure containing pointers to the format string and va_list object. However if this proposal is really based on existing practice, that would of course deviate from existing practice. I'm interested in hearing from Jörg how the current implementation that provides %r handles combining it with i18n-style %n$ argument specifiers.
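
A minimal sketch of that single-argument variant, using only standard C, might look as follows. The names struct nested_fmt, format_nested and report are hypothetical and chosen for illustration; no existing %r implementation is claimed to work this way.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical bundle: one pointer argument carries both pieces. */
struct nested_fmt {
    const char *fmt;    /* inner format string */
    va_list *args;      /* pointer to the caller's va_list */
};

/* Writes prefix followed by the expansion of nf->fmt into buf.
 * Returns the length written, or -1 if buf is too small. */
static int format_nested(char *buf, size_t size, const char *prefix,
                         const struct nested_fmt *nf)
{
    int n, m;

    n = snprintf(buf, size, "%s", prefix);
    if (n < 0 || (size_t)n >= size)
        return -1;
    /* Dereferencing the pointer works whether or not va_list is an array. */
    m = vsnprintf(buf + n, size - (size_t)n, nf->fmt, *nf->args);
    if (m < 0 || (size_t)m >= size - (size_t)n)
        return -1;
    return n + m;
}

/* Example caller: prefixes every message with "error: ". */
int report(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    struct nested_fmt nf;
    int n;

    va_start(ap, fmt);
    nf.fmt = fmt;
    nf.args = &ap;
    n = format_nested(buf, size, "error: ", &nf);
    va_end(ap);
    return n;
}
```

Passing a pointer to a va_list is permitted by the C standard, so this avoids the array problem, at the cost of the extra structure in every caller.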
(0002002)
Don Cragun (manager)
2013-11-20 21:04

Note: 0002000 was entered twice; I removed the duplicate.
(0002005)
joerg (reporter)
2013-11-21 10:27
edited on: 2013-11-21 16:13

Note: 0002000

The feature appeared in 1980 on UNOS, the first real-time UNIX clone.
UNOS was marketed until the first half of the 1990s and later in long-term
support (maybe as it was also used for rocket control devices).

When we moved from UNOS to SunOS in 1985, I created libschily as a
portability helper that includes support for a printf() implementation
with %r. Libschily is used by various OSS projects, similar to how libast
is used by some AT&T OSS projects.

%r is needed for the error reporting functions:

comerr()
comerrno()
errmsg()
errmsgno()
...

that have existed since 1980 and that write error messages in a standardized
way.

BTW: I don't like the idea with the structure shell as it makes programs
more complex than needed. I could however add support for %r to the printf
in libc from OpenSolaris, to check it with %n$. The current libschily
does not yet support %n$.


I believe that the simplest way to deal with the %n$ feature is to
add a note past line 30189..30191.

After: When numbered argument specifications are used, specifying the Nth
      argument requires that all the leading arguments, from the first
      to the (N−1)th, are specified in the format string.

Add:

      If the format string contains %n$r, this takes two arguments. So if
      %3$r appears in the format string, this refers to arguments 3 and 4
      and %4$ must not appear in the format string in such a case.

It may be that we need to require that va_copy(b, va_arg_list(list)) works,
which looks like requiring that va_copy() does not reference its second
argument more than once.

Note that such enhanced constraints are not specific for this case, we
e.g. added a requirement that dlsym() returning void * must be able to
cast this value for data and text addresses.

(0002007)
joerg (reporter)
2013-11-21 13:07
edited on: 2013-11-22 10:20

Note: 0001999

The current implementation does not support fieldwidth modifiers.

It would be possible to support a maximum fieldwidth by just counting
characters, but in order to implement a minimum right-adjusted
fieldwidth, the implementation would need to create a malloc()ed copy
first, to be able to add blanks to the left. I am not sure whether this
is really needed.

(0002018)
joerg (reporter)
2013-11-23 15:17
edited on: 2013-11-26 18:26

I implemented the %r feature in libc from OpenSolaris and
it turned out that the current description fits nicely even to a
completely different implementation.

Implementing %r without supporting %n$ takes approx. 10 lines of new
code. This is for an implementation that covers printf() and wprintf()
with the same code base.

Implementing support for %n$ for 'n' < MAXARGS takes approx. 5 lines
of code.

Solaris printf supports an unlimited 'n' with %n$, it took another
5 lines of code to support n >= MAXARGS.

As OpenSolaris printf defines a structure around va_list, there is
no need to require va_copy(to, va_arg_list(list)) to work.

Both implementations (libschily & libc from OpenSolaris) currently
ignore fieldwidth and precision for %r.

(0002019)
dalias (reporter)
2013-11-23 18:11

The fact that a "worse is better", inconsistent implementation is trivial is not an argument for standardization. Anything standardized should be consistent, meaning field widths and precisions and i18n %n$ forms all work correctly. Note that making width/precision work correctly does not seem to require unlimited buffering/malloc. Instead the va_list can be copied and a dummy run can be performed first to determine the length, and this length result used to pad or trim the output on the second run. In order for such an implementation to work correctly, the actual run also needs to work on a copy of the va_list; otherwise, recursive use of %r would consume the va_list on the first run. So while not impossible, making this work is also non-trivial. The reason I raise the issues involved in avoiding buffering is that some implementors want dprintf and snprintf to be reentrant/AS-safe. Roland McGrath (inventor of dprintf and the original implementation in glibc) has stated on the libc-alpha list that he intended for it to be AS-safe initially, and there's interest in restoring this property in the future. My implementation in musl libc is also AS-safe and intended to remain safe.
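
The two-pass idea can be sketched with standard functions only. The names vformat_right and format_right are hypothetical, and the sketch handles just a minimum width with right adjustment; it is meant to show the dry run on a va_copy, not how any libc implements it.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: right-adjusts the expansion of fmt in a field of at
 * least width characters without allocating memory.
 * Returns the length written, or -1 on error or if buf is too small. */
int vformat_right(char *buf, size_t size, int width, const char *fmt, va_list ap)
{
    va_list probe;
    int len, pad;

    /* Pass 1: dry run on a copy, only to learn the output length. */
    va_copy(probe, ap);
    len = vsnprintf(NULL, 0, fmt, probe);
    va_end(probe);
    if (len < 0)
        return -1;

    pad = width > len ? width - len : 0;
    if ((size_t)pad + (size_t)len >= size)
        return -1;

    /* Pass 2: emit the padding, then the real output. */
    memset(buf, ' ', (size_t)pad);
    vsnprintf(buf + pad, size - (size_t)pad, fmt, ap);
    return pad + len;
}

int format_right(char *buf, size_t size, int width, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vformat_right(buf, size, width, fmt, ap);
    va_end(ap);
    return n;
}
```

Here the second pass consumes the caller's list directly because nothing uses it afterwards; inside a real %r handler the second pass would also have to run on a copy, as the note says.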
(0002020)
joerg (reporter)
2013-11-26 18:03
edited on: 2014-01-23 16:30

I cannot edit the desired action item, so I add a modified copy here:

On page 342 after line 11449 add:
 
va_list va_arg_list(va_list ap);
 
After line 11485 add:
 
va_arg_list returns the next va_list type argument in the list pointed to by
ap. The behavior is as if va_arg(ap, va_list) was called, but this would not
work in case va_list is an array type and va_arg_list works even when va_list
is an array type. The result from calling va_arg_list() cannot be assigned to
a variable but va_arg_list() may appear as parameter of a function call.
 
On page 905 after line 30393 add:
 
r Recursive printf(). The %r format takes two parameters, a pointer
        to an array of char being the new format string and a va_list that
        holds the arguments for the format string. When the new format string
        has been processed, processing continues with the previous format
        string.

        It is unspecified whether field width or precision influence the
        processing.


On page 900 after line 30191 add:

      If the format string contains %n$r, this takes two arguments. So if
      %3$r appears in the format string, this refers to arguments 3 and 4
      and thus %4$ must not appear in the format string in such a case.

On page 911 at line 30672 replace:

None.

by:

A construct like %n$*n$.*n$d consumes four printf arguments, a construct
like %n$r consumes two printf arguments. This may quickly reach the
NL_ARGMAX limit. Implementors are encouraged to support an unlimited
number of i18n %n$ elements instead of the minimum acceptable value of 9
as defined for limits.h.

(0002021)
joerg (reporter)
2013-11-26 18:24
edited on: 2013-11-27 11:00

Regarding Note: 0002019

I am not sure where you suspect an "inconsistent implementation".

Not implementing field width and precision is not inconsistent but
aligned with the fact that there is no "nvprintf()" that would allow
you to define field width and precision for the overall result of a
printf() call. I have not needed it during the past 30 years, and other
people I asked are not interested in having it implemented with %r.


If you see "inconsistent" in relation to %n$r, the current proposal
is consistent with the previous definitions for %n$. The related code
only needs enhancements for the case that va_list is larger than sizeof
int.

My experience from implementing support for %r and %n$r in the
OpenSolaris libc makes me assume that if you need more than approx.
10 lines of new code to support %r or if you need more than approx.
5 lines of code to implement %n$ in addition, you may have started
with a base implementation that lacks a sufficient amount of consistency.

(0002022)
shware_systems (reporter)
2013-11-26 19:31
edited on: 2013-11-26 19:42

I think there should also be a mention that for the %n$r form the range of n is [1,NL_ARGMAX-1], and use of %r may reduce the effective count of format specifiers and '*' widths an interface call can specify to (NL_ARGMAX+1)/2, when it's used multiple times in the initial format string.

(0002023)
joerg (reporter)
2013-11-26 20:22
edited on: 2013-12-02 10:37

It would be nice to hear about properties of other implementations.

Printf from OpenSolaris caches the first 30 arguments and supports
an arbitrary amount of %n$ arguments.

Are there other implementations that limit n in %n$?
If there is such a limitation, what is the actual max value?

From my understanding, the number of arguments is not halved, but
as %r takes two arguments, it may have a similar effect as when
using constructs like: %n$*n$.*n$d.

I understand your concern and I believe that it may be nice to
add a related rationale section.

(0002024)
shware_systems (reporter)
2013-11-27 02:19

NL_ARGMAX is a required value of <limits.h>, minimum 9, maximum arbitrary but finite. Seems silly, but the minimum allowed by the C standard is 1. One reason it's there is so some of the formatted output lines of the standard utilities don't need to be split up into multiple printf() calls, if a compiler wanted to use a lower limit. All are supposedly limited, in other words, and EINVAL is the code for when an index is used larger than that, that I see. It's rarely an issue as compiles usually fail with "Too many arguments in call" which preempts a run time "Too many specifiers in string". The practical limit is related to the compiler's total named and varargs identifiers limit, which is a minimum of 127.

Halved is worst case, yes, which is why it's "may reduce", not "shall reduce". With an NL_ARGMAX of 9, "%r%r%r%r%r" wouldn't have enough arguments. "%r%r%r%r%d" would so that's the '+1'.
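
The counting in this note can be written down as a small sketch. The function names args_consumed and fits_limit are hypothetical; the only assumption taken from the proposal is that each %r uses two arguments and every other specifier counted here uses one.

```c
/* Arguments used by a format with r_count "%r" specifiers and
 * other_count single-argument specifiers (hypothetical helper). */
int args_consumed(int r_count, int other_count)
{
    return 2 * r_count + other_count;
}

/* Nonzero if that many arguments stay within the given limit. */
int fits_limit(int limit, int r_count, int other_count)
{
    return args_consumed(r_count, other_count) <= limit;
}
```

With a limit of 9, five %r need 10 arguments and do not fit, while four %r plus one %d need exactly 9, which gives the (NL_ARGMAX+1)/2 = 5 specifiers mentioned earlier.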
(0002025)
dalias (reporter)
2013-11-27 02:51

NL_ARGMAX pertains only to the POSIX %n$-style i18n format strings (where it is the maximum value of "n" the implementation promises to support), and has nothing to do with standard C format strings.
(0002026)
joerg (reporter)
2013-11-27 10:59

Using %n$*n$.*n$d reduces the number of output fields by a factor of 3.
This is worse than what you get from using %r.

BTW: this is not a compiler property but a property of libc.

There is another limit, that may be enforced by the compiler: the total
number of parameters to a function. I know of no compiler that actually
limits the number of function parameters as this would prevent my option
parser to work for programs like star or mkisofs with more than 200 options.

And please note that the actual value of NL_ARGMAX may be useless as
OpenSolaris e.g. defines NL_ARGMAX to 9 but supports an unlimited number
of %n$ args.

I am therefore going to add a rationale section to Note: 0002020
(0002028)
shware_systems (reporter)
2013-11-28 00:46

Re: 2026
"There is another limit, that may be enforced by the compiler: the total
number of parameters to a function. I know of no compiler that actually
limits the number of function parameters as this would prevent my option
parser to work for programs like star or mkisofs with more than 200 options.

And please note that the actual value of NL_ARGMAX may be useless as
OpenSolaris e.g. defines NL_ARGMAX to 9 but supports an unlimited number
of %n$ args."

I vaguely remember this being discussed at one time as a side issue of another ERN, so some particulars may be off. Questions were asked, but no change to the language appeared warranted afterward so I can't really provide a confirming URL about this. I forget whether an actual implementation was cited or it was more theoretical as to why there wasn't more precise language about it. IIRC, with that caveat, the discussion had when the constant was added there were a few versions of cc for smaller systems, or it was known to be possible and desirable for some architectures, that set an arbitrary maximum of parameters. Code considered conforming that was compiling and running cleanly on larger systems was getting, or could get, mysterious failures when compiled on or for those smaller systems. It was added more as a debugging aid to point out that some code might have portability issues which the standard couldn't adequately guard against because the C standard wasn't guarding against it enough, and as a compile time hook so one code base might be developed to handle small and larger systems. It didn't guarantee such code would be portable, just that it stood a higher chance of being so. The value of 9 was picked (vague here) as large enough so most standard utilities didn't have to change their code significantly to report things in a locale generic fashion, and I think is why it has an NL_ prefix, and not being so large a significant number of cc implementations would be impacted. The intent was all C forms could use the %n$ form, which is where I get it does apply to both, though the converse might not hold due to argument reuse. No one was happy anything needed to be picked, but it was considered better to have it than not at the time it was added, is what I remember pretty firmly as the opinion cited.

In that vein, what OpenSolaris does is considered a benign extension, in that code written to compile based on a value of NL_ARGMAX where it's smaller than the developer would prefer will still compile and run. It probably won't be as efficient as code written for a large NL_ARGMAX, over 1K, but them's the breaks; it's still more portable. Code developed on there that doesn't take it into account may not be portable, however, as the compiler doesn't inspect format strings and some implementations may defer detection to some routine in libc, i.e. that and variable argument handling issues in general may not be visible until the application is run.

I believe the above fairly represents the discussion, but if someone that was involved in putting the version together where NL_ARGMAX was introduced notes any major inaccuracies in this please share it.
(0002029)
dalias (reporter)
2013-11-28 00:50

A particular value for NL_ARGMAX is not "useless" just because an implementation actually accepts larger values for n in %n$. NL_ARGMAX is a guarantee that any format string using NL_ARGMAX or fewer positional arguments will be accepted by the implementation, not a guarantee that using more will fail. In fact the latter would be mostly useless.

A more accurate statement would be that it's "useless" for the implementation to support more than NL_ARGMAX positional arguments (i.e. more than it advertises supporting) since a correct application cannot make use of them.
(0002030)
joerg (reporter)
2013-11-28 11:09
edited on: 2013-11-28 15:05

The %n$ feature appeared on SunOS-4.1 24 years ago without a limit.

I would guess that NL_ARGMAX was introduced by the POSIX standard
and defined large enough to be able to print the date using this
feature with some reserve added.

(0002115)
eblake (manager)
2014-01-23 21:23

Discussion on the 23 Jan 2014 call mentioned the following points:

printf("%r", newfmt, va_list); // same as vprintf(newfmt, va_list)

In $ notation, %r always consumes a pair of consecutive arguments, with only the format argument listed in the $ notation:
    printf("%1$r %3$d", newfmt, va_list, 7); // "... 7"

Use (or not) of $ modifiers is per-format string (outer string is not aware of whether inner string used $, nor how many arguments were used by inner string):
    if va_list1 contains 2 elements "str", 1, then:
    printf("%r %d", "%2$d %1$s", va_list1, 2); // "1 str 2"

It is probably okay to allow reuse of a format argument as both the format arg of a %r and as a %s via $ notation:
    printf("%1$s %1$r %3$d", "string", va_list1, 3); // "string string 3"
But it is not a good idea to allow reuse of the va_list arg in any other $ specifier:
    printf("%1$r %2$p", "fmt", va_list1); // undefined behavior if sizeof(void*) != sizeof(va_list)

Okay to use %r twice in one format string:
    printf("%r %d %r\n", fmt1, va_list1, 5, fmt2, va_list2); // "fmt1... 5 fmt2..."
But be careful that the same va_list is not fed through two different parameters; use va_copy so that there is no risk of traversing the list in one instance affecting what the second instance sees for the list

How would flags work? If va_list1 contains the int 1, would printf("%-3r", "%d", va_list1) result in "1 ", as if by "%-3s", "1"? Probably best to leave %r as not required to support flags, field width, or precision in the standard (although these can be added as extensions), perhaps with note that if implementations DO support any of these fields, it should be a similar effect as if those fields had appeared to the %s corresponding to the string output of the %r

Narrow (char*) vs. wide (wchar_t*) arrays: we'd definitely want to standardize both approaches, with both usable from printf and wprintf:
%r - format string is char * (as if processed by printf then fed through %s)
%lr or %R format string is a wchar_t * (as if processed by wprintf then fed through %S)
Thus, with va_list1 containing 2 elements "narrow", L"wide", then:
    va_start(va_list1, foo);
    va_copy(va_list2, va_list1);
    va_copy(va_list3, va_list1);
    va_copy(va_list4, va_list1);
    printf("%r %lr", "%s %ls", va_list1, L"%s %ls", va_list2); // "narrow wide narrow wide"
    wprintf(L"%r %lr", "%s %ls", va_list3, L"%s %ls", va_list4); // L"narrow wide narrow wide"
    va_end(va_list1);
    va_end(va_list2);
    va_end(va_list3);
    va_end(va_list4);
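
The va_copy() pattern used above can be tried today without %r. This is a minimal sketch, with the hypothetical name format_twice, that expands the same format and arguments twice, each time from its own list, so that one traversal cannot disturb the other.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes "<expansion> <expansion>" into buf.
 * Returns the length written, or -1 if buf is too small. */
int format_twice(char *buf, size_t size, const char *fmt, ...)
{
    va_list first, second;
    int n, m;

    va_start(first, fmt);
    va_copy(second, first);     /* independent copy for the second pass */

    n = vsnprintf(buf, size, fmt, first);
    va_end(first);
    if (n < 0 || (size_t)n + 1 >= size) {
        va_end(second);
        return -1;
    }

    buf[n] = ' ';               /* separator replaces the terminator */
    m = vsnprintf(buf + n + 1, size - (size_t)n - 1, fmt, second);
    va_end(second);
    if (m < 0 || (size_t)m >= size - (size_t)n - 1)
        return -1;
    return n + 1 + m;
}
```

Every va_copy() is paired with its own va_end(), matching the four va_end() calls in the example from the call.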
(0002116)
joerg (reporter)
2014-01-26 13:33

Before we forget about this, let me write down a rationale related to
fieldwidth and precision.

When printing strings via %.*s, the referenced string is shortened
to the number of bytes specified by the precision.

When printing numbers, this is different and the precision specifies
the minimum number of digits to print.

Assuming %s processing for the result from a %r format thus will
break the assumption that a number will never be shortened in a
way that a different value is shown.

Also, the normal printf() has no way to limit the amount of output
for printf().

This is the reason why it is probably not even desirable to implement
fieldwidth and precision for the %r format. %r has existed for approx. 34
years and so far there has been no need to have fieldwidth or precision.
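
The difference between the two meanings of precision can be checked with standard snprintf(). The wrapper names format_str_prec and format_int_prec are hypothetical and exist only to keep the sketch short.

```c
#include <stdio.h>

/* For %s the precision is a maximum number of bytes taken from the string. */
int format_str_prec(char *buf, size_t size, int prec, const char *s)
{
    return snprintf(buf, size, "%.*s", prec, s);
}

/* For %d the precision is a minimum number of digits; nothing is cut off. */
int format_int_prec(char *buf, size_t size, int prec, int value)
{
    return snprintf(buf, size, "%.*d", prec, value);
}
```

With precision 3, the string "12345" becomes "123" while the number 12345 stays "12345"; with precision 5, the number 42 becomes "00042". Treating %r output like %s would therefore allow a nested number to be shortened.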
(0002117)
joerg (reporter)
2014-01-29 16:21
edited on: 2014-01-29 17:19

Another note with background thoughts.

If a printf() call includes a %R reference, this is for a wide char format string but the output has to be in multi byte mode.

If a wprintf() call includes a %r reference, this is for a multi byte format string but the output has to be in wide char mode.

This can either be implemented by converting the format string referenced by %R/%r into a local copy of the other orientation or by implementing four instead of currently two low level format processors.

The two new ones will be a function like dowmprintf() that imports a wide format string and converts the output into multi byte chars and a function like domwprint() that imports a multi byte format string and converts the output into wide chars on the fly.

The first method would need to allocate space for the converted format string and the latter would need two new copies of the code.

I am not sure whether the first variant will work correctly in case of nesting.
OK, it should work if you pass down the stream orientation of the top level
*printf() call.
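
A minimal sketch of the first method, converting a wide format string into an allocated multi byte copy, could look like this. The function name wide_fmt_to_mb is hypothetical; the conversion uses the standard wcsrtombs() and depends on the current locale.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: returns a malloc()ed multi byte copy of wfmt,
 * or NULL if the conversion or the allocation fails.
 * The caller is responsible for calling free() on the result. */
char *wide_fmt_to_mb(const wchar_t *wfmt)
{
    mbstate_t state;
    const wchar_t *src;
    size_t need;
    char *mb;

    /* First call with a null destination: only compute the length. */
    memset(&state, 0, sizeof state);
    src = wfmt;
    need = wcsrtombs(NULL, &src, 0, &state);
    if (need == (size_t)-1)
        return NULL;

    mb = malloc(need + 1);
    if (mb == NULL)
        return NULL;

    /* Second call: do the actual conversion, including the terminator. */
    memset(&state, 0, sizeof state);
    src = wfmt;
    if (wcsrtombs(mb, &src, need + 1, &state) == (size_t)-1) {
        free(mb);
        return NULL;
    }
    return mb;
}
```

The allocation is the cost mentioned for this method; the second method avoids it by duplicating the format processor instead.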

(0002118)
shware_systems (reporter)
2014-01-29 20:45

Re: 2116, 2117

Another possibility is they're implemented internally as _fprintf(flags, dev, fmt, ...) and _wprintf() versions that communicate extra %r and %R data to the current 2 routines. As a char *flags param it could easily incorporate width modifiers also, but a typedef {int flagbits, wid, prec} record is plausible too. A top level fprintf() would then be a wrapper of _fprintf(NULL, dev, fmt, ...).

This would keep the guts of doing it in only two functions still and target wide or char orientation would be communicated as flag bits, without any effective change visible to applications. I'm glossing over the s*() and v*() versions, but similar treatment should be trivial.

Since width and precision can be incorporated, I'm more for requiring them than leaving them out, with suggested semantics of width being length of passed fmt string and precision being maximum characters to output, wide or single byte; or vice versa. Allowing passing the input length can save a recursion the time of having to do a strlen() call on the passed fmt if a temp malloc() is used. While workarounds do exist that %r hasn't needed modifiers in the past, with an implementation like this it would make applications more efficient.

The argument that numeric values aren't supposed to be truncated when a precision value overrides width I do not consider persuasive, because for various device types with fixed size output buffering that truncation is always a possibility since the first version of printf() was implemented, however the standard is worded. A simple example of that, historically, is watching a Teletype overstrike column 80 into illegibility because the application presumed the 132 columns of most line printers were available. That can be considered operator error, but it's not a situation the standard sufficiently guards against happening either, that I see.
(0002174)
carlos (reporter)
2014-03-04 06:06

This is a sufficiently subtle addition to POSIX that I'd like to see this go through ISO C first and then allow POSIX to harmonize against the language standard. Adding this to POSIX first is, in my opinion, a mistake. This kind of change needs more careful review from the language standards groups.
(0002225)
ajosey (manager)
2014-04-11 07:01

The consensus is that this should be addressed by the C committee rather than by the Austin Group.

If they were to adopt it then it would be considered for a future revision of the standard.




- Issue History
Date Modified Username Field Change
2013-11-18 10:50 joerg New Issue
2013-11-18 10:50 joerg Name => Jörg Schilling
2013-11-18 10:50 joerg Section => stdarg.h and printf()
2013-11-18 10:50 joerg Page Number => 342, 905
2013-11-18 10:50 joerg Line Number => 11449, 11485, 30393
2013-11-18 19:21 dalias Note Added: 0001995
2013-11-18 20:33 joerg Note Added: 0001996
2013-11-18 20:44 joerg Note Edited: 0001996
2013-11-19 20:33 shware_systems Note Added: 0001997
2013-11-20 10:09 joerg Note Added: 0001998
2013-11-20 10:24 joerg Note Edited: 0001998
2013-11-20 20:24 shware_systems Note Added: 0001999
2013-11-20 20:42 dalias Note Added: 0002000
2013-11-20 20:42 dalias Note Added: 0002001
2013-11-20 21:03 Don Cragun Note Deleted: 0002001
2013-11-20 21:04 Don Cragun Note Added: 0002002
2013-11-21 10:27 joerg Note Added: 0002005
2013-11-21 13:03 joerg Note Edited: 0002005
2013-11-21 13:07 joerg Note Added: 0002007
2013-11-21 16:13 joerg Note Edited: 0002007
2013-11-21 16:13 joerg Note Edited: 0002005
2013-11-22 10:20 joerg Note Edited: 0002007
2013-11-23 15:17 joerg Note Added: 0002018
2013-11-23 18:11 dalias Note Added: 0002019
2013-11-26 18:03 joerg Note Added: 0002020
2013-11-26 18:24 joerg Note Added: 0002021
2013-11-26 18:26 joerg Note Edited: 0002018
2013-11-26 19:00 shware_systems Note Edited: 0001999
2013-11-26 19:31 shware_systems Note Added: 0002022
2013-11-26 19:42 shware_systems Note Edited: 0002022
2013-11-26 20:22 joerg Note Added: 0002023
2013-11-27 02:19 shware_systems Note Added: 0002024
2013-11-27 02:51 dalias Note Added: 0002025
2013-11-27 10:59 joerg Note Added: 0002026
2013-11-27 11:00 joerg Note Edited: 0002021
2013-11-27 11:08 joerg Note Edited: 0002020
2013-11-27 14:46 joerg Note Edited: 0002020
2013-11-28 00:46 shware_systems Note Added: 0002028
2013-11-28 00:50 dalias Note Added: 0002029
2013-11-28 11:09 joerg Note Added: 0002030
2013-11-28 15:05 joerg Note Edited: 0002030
2013-12-02 10:37 joerg Note Edited: 0002023
2013-12-02 10:37 joerg Note Edited: 0002020
2014-01-23 16:30 joerg Note Edited: 0002020
2014-01-23 21:23 eblake Note Added: 0002115
2014-01-26 13:33 joerg Note Added: 0002116
2014-01-29 16:21 joerg Note Added: 0002117
2014-01-29 16:22 joerg Note Edited: 0002117
2014-01-29 17:19 joerg Note Edited: 0002117
2014-01-29 20:45 shware_systems Note Added: 0002118
2014-03-04 06:06 carlos Note Added: 0002174
2014-04-11 07:01 ajosey Interp Status => ---
2014-04-11 07:01 ajosey Note Added: 0002225
2014-04-11 07:01 ajosey Status New => Closed
2014-04-11 07:01 ajosey Resolution Open => Rejected

