Explain why \0 escaping is bad.

Martin v. Löwis 2009-04-30 15:56:46 +00:00
parent a67b456afe
commit 7b5654f7c0
1 changed file with 8 additions and 0 deletions

@@ -149,6 +149,14 @@ A few alternative approaches have been proposed:
* use different escape schemes, such as escaping with a NUL
character, or mapping to infrequent characters.
Of these proposals, the approach of escaping each byte XX
with the sequence U+0000 U+00XX has the disadvantage that
encoding to UTF-8 will introduce a NUL byte in the UTF-8
sequence. As a consequence, C libraries may interpret this
as a string termination, even though the string continues.
In particular, the gtk libraries will truncate text in this
case; other libraries may show similar problems.
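
The truncation described above can be illustrated with a minimal sketch. The helper ``nul_escape`` and the sample file name are hypothetical and exist only for this illustration: the helper passes ASCII bytes through and escapes every other byte XX as U+0000 U+00XX, which is one possible reading of the rejected scheme. ``ctypes`` is used here only to approximate how C code that relies on NUL termination would see the encoded bytes.

```python
import ctypes


def nul_escape(raw: bytes) -> str:
    """Hypothetical helper: escape each non-ASCII byte XX as U+0000 U+00XX."""
    out = []
    for b in raw:
        if b < 0x80:
            out.append(chr(b))           # ASCII bytes map to themselves
        else:
            out.append("\x00" + chr(b))  # NUL escape followed by U+00XX
    return "".join(out)


# A Latin-1 encoded file name that is not valid UTF-8 (illustrative value).
escaped = nul_escape(b"caf\xe9.txt")

# Encoding the escaped string to UTF-8 puts a literal NUL byte in the output.
encoded = escaped.encode("utf-8")

# Reading a ctypes buffer through .value treats it as a NUL-terminated
# C string, so everything from the first NUL byte onward is lost.
seen_by_c = ctypes.create_string_buffer(encoded).value

print(encoded)
print(seen_by_c)
```

In this sketch the encoded form still contains the whole name, but the NUL-terminated view ends at the escape character, which is the kind of silent truncation the paragraph attributes to C libraries.
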
References
==========