diff --git a/src/main/java/org/apache/commons/lang3/StringEscapeUtils.java b/src/main/java/org/apache/commons/lang3/StringEscapeUtils.java
index 90ac154dc..a05c2f249 100644
--- a/src/main/java/org/apache/commons/lang3/StringEscapeUtils.java
+++ b/src/main/java/org/apache/commons/lang3/StringEscapeUtils.java
@@ -502,7 +502,7 @@ public static final String unescapeHtml3(String input) {
      * Note that unicode characters greater than 0x7f are as of 3.0, no longer
      * escaped. If you still wish this functionality, you can achieve it
      * via the following:
-     * {@code StringEscapeUtils.ESCAPE_XML.with( UnicodeEscaper.above(0x7f) );}
+     * {@code StringEscapeUtils.ESCAPE_XML.with( new UnicodeEscaper(Range.between(0x7f, Integer.MAX_VALUE)) );}
      *
      * @param input the {@code String} to escape, may be null
      * @return a new escaped {@code String}, {@code null} if null string input
diff --git a/src/site/xdoc/article3_0.xml b/src/site/xdoc/article3_0.xml
index aca7b0166..6033e2bd2 100644
--- a/src/site/xdoc/article3_0.xml
+++ b/src/site/xdoc/article3_0.xml
@@ -142,11 +142,16 @@ available in the user guide.

Here we see that ESCAPE_XML is a 'CharSequenceTranslator', which in turn is made up of two lookup translators based on the basic XML escapes and another to escape apostrophes. This shows one way to combine translators. Another can be shown by looking at the example to achieve the old XML escaping functionality (escaping non-ASCII):

-          StringEscapeUtils.ESCAPE_XML.with( UnicodeEscaper.above(0x7f) );
+          StringEscapeUtils.ESCAPE_XML.with( new UnicodeEscaper(Range.between(0x7f, Integer.MAX_VALUE) ) );
 

That takes the standard escape functionality provided by Commons Lang and adds another translation layer on top. Another option requested in JIRA was to also escape non-printable ASCII; this is now achievable with a modification of the above:

-          StringEscapeUtils.ESCAPE_XML.with( UnicodeEscaper.outsideOf(32, 0x7f) );
+          StringEscapeUtils.ESCAPE_XML.with(
+              new AggregateTranslator(
+                  new UnicodeEscaper(Range.between(0, 31)),
+                  new UnicodeEscaper(Range.between(0x80, Integer.MAX_VALUE))
+              )
+          )
 

You can also implement your own translators (be they for escaping, unescaping or some aspect of your own). See the CharSequenceTranslator and its CodePointTranslator helper subclass for details - primarily a case of implementing the translate(CharSequence, int, Writer);int method.
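
Review note: the `translate(CharSequence, int, Writer)` contract and the two-range composition in this diff can be illustrated with a standalone sketch. It does not use the Commons Lang classes; `TranslatorSketch`, `Translator`, `escapeBetween`, `firstOf` and `apply` are hypothetical names introduced only for illustration, and the assumption is that a translator returns the number of code points it consumed (0 meaning "not handled"). The bounds are inclusive, as with `Range.between`, so a lower bound of 0x7f also covers DEL (0x7f), while the prose says "greater than 0x7f" and the second example starts at 0x80.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

/** Standalone sketch; hypothetical names, not the Commons Lang API. */
final class TranslatorSketch {

    /** Hypothetical stand-in for the translate(CharSequence, int, Writer) contract. */
    interface Translator {
        /** Writes a replacement and returns the code points consumed; 0 means "not handled". */
        int translate(CharSequence input, int index, Writer out) throws IOException;
    }

    /** Escapes code points in the inclusive range [low, high] as backslash-u sequences. */
    static Translator escapeBetween(final int low, final int high) {
        return (input, index, out) -> {
            int cp = Character.codePointAt(input, index);
            if (cp < low || cp > high) {
                return 0; // outside the range: leave it for another translator
            }
            // One escape per UTF-16 code unit, so supplementary characters become a surrogate pair.
            for (char unit : Character.toChars(cp)) {
                out.write(String.format("\\u%04X", (int) unit));
            }
            return 1;
        };
    }

    /** Tries each translator in order; the first one that consumes input wins. */
    static Translator firstOf(final Translator... translators) {
        return (input, index, out) -> {
            for (Translator t : translators) {
                int consumed = t.translate(input, index, out);
                if (consumed != 0) {
                    return consumed;
                }
            }
            return 0;
        };
    }

    /** Driver loop: code points that no translator handles are copied through unchanged. */
    static String apply(Translator t, CharSequence input) {
        StringWriter out = new StringWriter();
        try {
            int i = 0;
            while (i < input.length()) {
                int consumed = t.translate(input, i, out);
                if (consumed == 0) {
                    int cp = Character.codePointAt(input, i);
                    out.write(Character.toChars(cp));
                    i += Character.charCount(cp);
                } else {
                    // Advance past every consumed code point (surrogate pairs count once).
                    for (int n = 0; n < consumed; n++) {
                        i += Character.charCount(Character.codePointAt(input, i));
                    }
                }
            }
        } catch (IOException e) {
            throw new IllegalStateException(e); // StringWriter does not throw in practice
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Mirrors the second example: escape 0..31 and 0x80 upwards, keep printable ASCII.
        Translator nonPrintableOrNonAscii =
                firstOf(escapeBetween(0, 31), escapeBetween(0x80, Integer.MAX_VALUE));
        System.out.println(apply(nonPrintableOrNonAscii, "tab\there, caf\u00e9"));
    }
}
```

Returning 0 for "not handled" is what lets several translators be chained: each one only claims the code points it knows about, and the driver loop falls back to copying the input verbatim.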