How To Print() A String In Python3 Without Exceptions?
Solution 1:
Is there a clean portable way to fix this?
Set PYTHONIOENCODING=<encoding>:<error_handler>, e.g.:
$ PYTHONIOENCODING=utf-8 python your_script.py >output-in-utf-8.txt
In your case, I'd configure your environment (LANG, LC_CTYPE) to accept non-ASCII input. You can check the character encoding currently in effect with:
$ locale charmap
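To see what Python itself picked up from the environment, something like the following can help. It is only a diagnostic sketch; describe_stdout is a made-up helper name, not part of any library.

```python
import locale
import os
import sys


def describe_stdout():
    """Collect the settings that decide how print() encodes its output."""
    return {
        # None when the variable is not set in the environment
        "PYTHONIOENCODING": os.environ.get("PYTHONIOENCODING"),
        # Encoding derived from the locale (LANG, LC_CTYPE, ...)
        "preferred_encoding": locale.getpreferredencoding(False),
        # getattr() guards against stdout having been replaced by another object
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
        "stdout_errors": getattr(sys.stdout, "errors", None),
    }


settings = describe_stdout()
for name, value in settings.items():
    print(f"{name}: {value}")
```

If stdout_errors shows strict and stdout_encoding is something narrow like ascii, print() can raise UnicodeEncodeError on non-ASCII text.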
Solution 2:
The most practical way to solve this issue seems to be to force the output encoding to utf-8:surrogateescape. This not only forces UTF-8 output, but also ensures that surrogate-escaped strings, such as those returned by os.fsdecode(), can be printed without raising an exception. On the command line it looks like this:
PYTHONIOENCODING=utf-8:surrogateescape python3 -c 'print("\udcff")'
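For anyone unsure what the surrogateescape handler actually does, here is a small round-trip sketch. The filename bytes are invented for illustration, and bytes.decode() is used instead of os.fsdecode() so the result does not depend on the filesystem encoding.

```python
# Hypothetical filename bytes: 0xff is never valid in UTF-8.
raw_name = b"report-\xff.txt"

# Decoding with surrogateescape maps the undecodable byte to a lone surrogate.
text = raw_name.decode("utf-8", errors="surrogateescape")

# The default (strict) handler refuses to encode lone surrogates, which is
# the same failure print() runs into on a strict stdout.
try:
    text.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# With surrogateescape the original bytes come back unchanged.
restored = text.encode("utf-8", errors="surrogateescape")

print(ascii(text), strict_ok, restored == raw_name)
```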
To do this from within the program itself, you have to reassign sys.stdout and sys.stderr. The line_buffering=True argument is important, as otherwise the output won't get flushed properly. It can be done like this:
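import io
import sys

# Rewrap the underlying binary streams so that output is always UTF-8 and
# lone surrogates are written back as the bytes they stand for.
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding="utf-8", errors="surrogateescape", line_buffering=True)
sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding="utf-8", errors="surrogateescape", line_buffering=True)
print("\udcff")  # writes the byte 0xff instead of raising UnicodeEncodeError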
This approach causes characters to be displayed incorrectly on terminals that are not set to UTF-8, but to me that seems strongly preferable to randomly throwing exceptions and making it impossible to print filenames without corrupting them, since on Linux systems filenames might not be in any valid encoding at all.
I read in a few places that utf-8:surrogateescape might become the default in the future, but as of Python 3.6.0b2 that is not the case.
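On Python 3.7 and later, text streams also have a reconfigure() method, which can change the encoding and error handler without rewrapping the stream. This is a different technique from the reassignment shown in this answer, and the sketch below is only one way to use it; relax_stream is a hypothetical helper name, and the demo writes to an in-memory buffer rather than the real stdout.

```python
import io


def relax_stream(stream):
    """Switch a text stream to UTF-8 with surrogateescape, if it supports it."""
    # reconfigure() is missing before Python 3.7 and on replaced streams.
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is None:
        return False
    reconfigure(encoding="utf-8", errors="surrogateescape", line_buffering=True)
    return True


# Demonstration on an in-memory stream that starts out strict and ASCII-only.
buffer = io.BytesIO()
demo = io.TextIOWrapper(buffer, encoding="ascii", errors="strict")
changed = relax_stream(demo)

print("\udcff", file=demo)  # would raise UnicodeEncodeError without the change
demo.flush()
written = buffer.getvalue()
```

In a real script the call would presumably be relax_stream(sys.stdout) and relax_stream(sys.stderr) near the start of the program.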
Solution 3:
The reason it gives you an error is that Python is trying to interpret \u as the start of an escape sequence, just as \r is a carriage return, \n a newline, \t a tab, and so on. A \u escape must be followed by exactly four hexadecimal digits.
If you write:
my_string = '\u112'
print(my_string)
you will get a SyntaxError (truncated \uXXXX escape). To print the '\' without Python trying to interpret it as the start of an escape sequence, escape it with a second backslash:
my_string = '\\u122'
print(my_string)
Output:
\u122
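As a rough illustration of the difference between an escaped backslash, a raw string, and a complete \u escape, the following sketch may help. The broken literal is passed to compile() as source text so that the example file itself still runs.

```python
# A doubled backslash and a raw string both produce a literal backslash.
escaped = '\\u122'
raw = r'\u122'

# A complete escape (four hex digits) is a single character instead.
real_escape = '\u0122'

# The incomplete escape is rejected when the source is compiled.
try:
    compile("my_string = '\\u112'", "<demo>", "exec")
    failed = False
except SyntaxError:
    failed = True

print(escaped)
print(escaped == raw, len(escaped), len(real_escape), failed)
```

A raw string is often the more readable choice when a literal contains many backslashes.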