[ML] Account for the possibility of C++ log messages being UTF-16 (elastic/x-pack-elasticsearch#2952)

On Windows, log4cxx always writes to stderr in UTF-16, and we get the
logs from C++ to Java by redirecting stderr to our named pipe.  Hence
the log handler in Java needs to tolerate the log stream it's reading
being either UTF-16 (for Windows) or UTF-8 (for other platforms).

Fixes elastic/machine-learning-cpp#385

Original commit: elastic/x-pack-elasticsearch@89237d7125
David Roberts 2017-11-10 15:14:28 +00:00
parent a90cd81f99
commit 742a052619
1 changed files with 6 additions and 0 deletions

@@ -205,6 +205,12 @@ public class CppLogMessageHandler implements Closeable {
                 parseMessage(xContent, bytesRef.slice(from, nextMarker - from));
             }
             from = nextMarker + 1;
+            if (from < bytesRef.length() && bytesRef.get(from) == (byte) 0) {
+                // This is to work around the problem of log4cxx on Windows
+                // outputting UTF-16 instead of UTF-8. For full details see
+                // https://github.com/elastic/machine-learning-cpp/issues/385
+                ++from;
+            }
         }
         if (from >= bytesRef.length()) {
             return null;
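
The following is a minimal, self-contained sketch of the idea behind the added check, not the actual `CppLogMessageHandler` code. It assumes the UTF-16 output is little-endian, so a newline arrives as the byte pair `0x0A 0x00`; a scanner that splits on the single byte `0x0A` would otherwise leave a stray zero byte at the start of the next message. The class name `Utf16NewlineSkipSketch`, the method `splitRecords`, and the sample input are hypothetical and exist only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of the x-pack code base.
final class Utf16NewlineSkipSketch {

    /**
     * Splits a buffer into records at each '\n' byte. If the byte right after
     * the marker is zero (the high byte of a little-endian UTF-16 newline),
     * it is skipped so that it does not become part of the next record.
     * Bytes after the last marker are treated as an incomplete record and ignored.
     */
    static List<String> splitRecords(byte[] buffer) {
        List<String> records = new ArrayList<>();
        int from = 0;
        for (int i = 0; i < buffer.length; i++) {
            if (buffer[i] != (byte) '\n') {
                continue;
            }
            records.add(new String(buffer, from, i - from, StandardCharsets.UTF_8));
            from = i + 1;
            // Same shape as the check in the diff: tolerate one zero byte after the marker.
            if (from < buffer.length && buffer[from] == (byte) 0) {
                ++from;
            }
        }
        return records;
    }

    public static void main(String[] args) {
        // Assumed sample input: the first record ends with a UTF-16LE newline
        // (0x0A 0x00), the second with a plain UTF-8 newline (0x0A).
        byte[] mixed = {'o', 'n', 'e', '\n', 0, 't', 'w', 'o', '\n'};
        System.out.println(splitRecords(mixed));
    }
}
```

Buffers that contain only single-byte newlines never enter the extra branch, so under these assumptions the UTF-8 path on other platforms behaves as before.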