This is the mail archive of the
libc-alpha@sources.redhat.com
mailing list for the glibc project.
Dangerous glibc byte/wide stream interactions
- To: libc-alpha at sources dot redhat dot com
- Subject: Dangerous glibc byte/wide stream interactions
- From: Markus Kuhn <Markus dot Kuhn at cl dot cam dot ac dot uk>
- Date: Fri, 22 Sep 2000 12:14:56 +0100
- cc: linux-utf8 at nl dot linux dot org
With a recent glibc 2.1.93 CVS snapshot, I get some rather worrying
results when I use byte and wide output functions on the same
stream:
The little test program wtest1.c
---------------------------------------------------------------------
#define _ISOC99_SOURCE /* argl */
#include <stdio.h>
#include <assert.h>
#include <wchar.h>
#include <locale.h>
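int main(void) {
  int r1, r2;
  /* First output is wide, so stderr becomes wide-oriented. */
  fwprintf(stderr, L"Schöne Grüße!\n");
  r1 = fwide(stderr, -1);   /* ask for byte orientation */
  fprintf (stderr, "WARNING: This message will be lost!\n");
  r2 = fwide(stderr, 1);    /* ask for wide orientation again */
  fwprintf(stderr, L"Have a nice day!\n");
  /* fwide() returns > 0 for wide, < 0 for byte orientation. */
  printf("\nr1=%d, r2=%d\n", r1, r2);
  return 0;
}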
---------------------------------------------------------------------
produces the following output:
---------------------------------------------------------------------
SchoHave a nice day!
r1=1, r2=1
---------------------------------------------------------------------
Another test program wtest2.c
---------------------------------------------------------------------
#define _ISOC99_SOURCE /* argl */
#include <stdio.h>
#include <assert.h>
#include <wchar.h>
#include <locale.h>
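int main(void) {
  FILE *r1, *r2;
  /* First output is wide, so stderr becomes wide-oriented. */
  fwprintf(stderr, L"Schöne Grüße!\n");
  /* A successful freopen() would remove the orientation. */
  r1 = freopen(NULL, "w", stderr);
  fprintf (stderr, "WARNING: This message will be lost!\n");
  r2 = freopen(NULL, "w", stderr);
  fwprintf(stderr, L"Have a nice day!\n");
  /* A null pointer here means that freopen() failed. */
  printf("\nr1=%p, r2=%p\n", r1, r2);
  return 0;
}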
---------------------------------------------------------------------
produces the following output:
---------------------------------------------------------------------
Scho
r1=(nil), r2=(nil)
---------------------------------------------------------------------
I admit that this behaviour does not strictly violate the letter of the
ISO C99 standard, which says in section 7.19.2
---------------------------------------------------------------------
[#4] Each stream has an orientation. After a stream is
associated with an external file, but before any operations
are performed on it, the stream is without orientation.
Once a wide character input/output function has been applied
to a stream without orientation, the stream becomes a wide-
oriented stream. Similarly, once a byte input/output
function has been applied to a stream without orientation,
the stream becomes a byte-oriented stream. Only a call to
the freopen function or the fwide function can otherwise
alter the orientation of a stream. (A successful call to
freopen removes any orientation.)
[#5] Byte input/output functions shall not be applied to a
wide-oriented stream and wide character input/output
functions shall not be applied to a byte-oriented stream.
---------------------------------------------------------------------
and which requires neither the fwide() function nor the freopen()
function to succeed. However, I would argue strongly that this behaviour
makes the entire wide I/O system practically useless, even
dangerous!
Why?
A typical program contains library modules from various sources, some of
which will use byte-output (printf) and others will use wide-output
(wprintf). It is *essential* that all modules can independently and
reliably access stderr in order to issue diagnostic, warning and error
messages. The above two test programs show that at the moment, there is
no way to change the orientation of stderr once a first output function
has been applied. Both fwide() and freopen() fail consistently. Calling
fflush(stderr) instead of freopen() does not help either.
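
Until the orientation can be changed reliably, a module can at least
query it and pick the matching function family. The following is only
a minimal sketch of such a workaround, not anything glibc provides:
the helper names stream_orientation() and diag_puts() are hypothetical.
It relies on fwide(fp, 0) being a pure query that never alters the
orientation, and on the %s conversion of fwprintf() converting a
multibyte string according to the current locale.

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: report the orientation of fp without changing it.
   Returns 1 for wide, -1 for byte, 0 if the stream has no orientation yet. */
static int stream_orientation(FILE *fp)
{
    int o;

    assert(fp != NULL);
    o = fwide(fp, 0);           /* mode 0 only queries */
    return (o > 0) - (o < 0);
}

/* Hypothetical helper: write the multibyte string msg to fp with the
   function family that matches the stream's current orientation.
   Returns 0 on success, -1 on failure. */
static int diag_puts(FILE *fp, const char *msg)
{
    assert(msg != NULL);
    if (stream_orientation(fp) > 0) {
        /* Wide-oriented: let fwprintf() convert the multibyte string
           to wide characters using the current locale. */
        return fwprintf(fp, L"%s", msg) < 0 ? -1 : 0;
    }
    /* Byte-oriented or not yet oriented: plain byte output. Note that
       this makes an unoriented stream byte-oriented. */
    return fputs(msg, fp) == EOF ? -1 : 0;
}
```

A module would then call diag_puts(stderr, "something went wrong\n")
instead of fprintf() directly. This only helps code that cooperates;
third-party modules that call printf() or wprintf() unconditionally
still collide.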
Suggested glibc behaviour:
a) fwide(f, -1) and fwide(f, 1) must never fail for a valid stream f
and must always set and return the requested orientation. [At the
moment, fwide() simply refuses to change the orientation once it has
been set.]
b) freopen should also not fail on the standard streams.
c) At least in the stateless 8-bit and UTF-8 locales, if a byte-output
function is applied to a wide stream or a wide-output function is
applied to a byte-stream (something a portable C program should not
try to do), all characters are still written out correctly.
I see absolutely no technical reason why c) can't be implemented
easily and efficiently, at least for stateless multi-byte encodings such
as ISO 8859 and UTF-8. The wide output functions should immediately
convert their output into a byte stream and then commit the result to a
byte block buffer. This way, printf() and wprintf() can be interleaved
in arbitrary ways without any intervening fwide() or freopen() calls.
Then all fprintf(stderr, ...) or fwprintf(stderr, ...) emergency output
will always be guaranteed to reach the terminal, which is definitely
needed.
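
The idea behind c) can be illustrated at user level. The following is
only a sketch under stated assumptions, not glibc's implementation: the
function name put_wide_as_bytes() is hypothetical, and the code assumes
a stateless encoding, so no closing shift sequence is emitted. Each wide
character is converted with wcrtomb() and the resulting bytes go
straight into the ordinary byte buffer of the stream via fwrite().

```c
#include <assert.h>
#include <limits.h>   /* MB_LEN_MAX */
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: write the wide string ws to the byte stream fp
   by converting one wide character at a time to its multibyte form.
   Returns the number of bytes written, or -1 if a character cannot be
   represented in the current locale or the write fails.
   Assumes a stateless encoding (no final shift sequence is written). */
static long put_wide_as_bytes(FILE *fp, const wchar_t *ws)
{
    mbstate_t st;
    char buf[MB_LEN_MAX];
    long total = 0;

    assert(fp != NULL && ws != NULL);
    memset(&st, 0, sizeof st);      /* start in the initial state */

    for (; *ws != L'\0'; ws++) {
        size_t n = wcrtomb(buf, *ws, &st);
        if (n == (size_t)-1)
            return -1;              /* not representable in this locale */
        if (fwrite(buf, 1, n, fp) != n)
            return -1;              /* write error */
        total += (long)n;
    }
    return total;
}
```

Because only byte functions ever touch the stream, calls to this helper
can be mixed freely with fprintf() or fputs() on the same stream, which
is the interleaving property asked for above.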
Markus
--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>