Explain why \0 escaping is bad.
parent a67b456afe
commit 7b5654f7c0
@@ -149,6 +149,14 @@ A few alternative approaches have been proposed:
|
|||
* use different escape schemes, such as escaping with a NUL
|
||||
character, or mapping to infrequent characters.
|
||||
|
||||
Of these proposals, the approach of escaping each byte XX
|
||||
with the sequence U+0000 U+00XX has the disadvantage that
|
||||
encoding to UTF-8 will introduce a NUL byte in the UTF-8
|
||||
sequence. As a consequence, C libraries may interpret this
|
||||
as a string termination, even though the string continues.
|
||||
In particular, the gtk libraries will truncate text in this
|
||||
case; other libraries may show similar problems.
|
||||
|
||||
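
The truncation problem described above can be illustrated with a short sketch. The helper ``nul_escape`` is hypothetical and not part of any proposal's reference code; as a simplifying assumption it treats every non-ASCII byte as undecodable. The C-side view is approximated with ``ctypes``, whose ``value`` attribute reads a buffer as a NUL-terminated string.

```python
import ctypes


def nul_escape(raw: bytes) -> str:
    """Hypothetical NUL escape scheme: keep ASCII bytes as characters,
    map every other byte XX to the pair U+0000 U+00XX.
    (Assumption: all non-ASCII bytes count as undecodable.)"""
    out = []
    for b in raw:
        if b < 0x80:
            out.append(chr(b))
        else:
            out.append("\x00" + chr(b))
    return "".join(out)


# A Latin-1 encoded file name containing the byte 0xE9.
raw = b"caf\xe9.txt"

escaped = nul_escape(raw)

# Encoding to UTF-8 turns U+0000 into a single NUL byte.
utf8 = escaped.encode("utf-8")

# A C library that treats the buffer as a NUL-terminated string
# sees only the bytes before the first NUL.
c_view = ctypes.create_string_buffer(utf8).value

print(utf8)
print(c_view)
```

Everything after the escaped byte, including the unaffected ``.txt`` suffix, is invisible to code that relies on NUL termination.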

References
==========