Add documentation about breaking java time changes (#38886)

In addition remove joda time mentions across the docs, make 
sure links are updated to java time javadocs.

Forward port of #38720
Alexander Reelsen 2019-02-14 10:18:12 +01:00 committed by GitHub
parent 88489a3f3a
commit 8e5e48319e
8 changed files with 245 additions and 52 deletions

View File

@ -110,68 +110,149 @@ bucket, as if they had a date value of "1899-12-31".
==== Date Format/Pattern
NOTE: this information was copied from
http://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html[JodaDate]
https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html[DateTimeFormatter]
All ASCII letters are reserved as format pattern letters, which are defined
as follows:
[options="header"]
|=======
|Symbol |Meaning |Presentation |Examples
|G |era |text |AD
|C |century of era (>=0) |number |20
|Y |year of era (>=0) |year |1996
|Symbol |Meaning |Presentation |Examples
|G |era |text |AD; Anno Domini; A
|u |year |year |2004; 04
|y |year-of-era |year |2004; 04
|D |day-of-year |number |189
|M/L |month-of-year |number/text |7; 07; Jul; July; J
|d |day-of-month |number |10
|x |weekyear |year |1996
|w |week of weekyear |number |27
|e |day of week |number |2
|E |day of week |text |Tuesday; Tue
|Q/q |quarter-of-year |number/text |3; 03; Q3; 3rd quarter
|Y |week-based-year |year |1996; 96
|w |week-of-week-based-year |number |27
|W |week-of-month |number |4
|E |day-of-week |text |Tue; Tuesday; T
|e/c |localized day-of-week |number/text |2; 02; Tue; Tuesday; T
|F |week-of-month |number |3
|y |year |year |1996
|D |day of year |number |189
|M |month of year |month |July; Jul; 07
|d |day of month |number |10
|a |am-pm-of-day |text |PM
|h |clock-hour-of-am-pm (1-12) |number |12
|K |hour-of-am-pm (0-11) |number |0
|k |clock-hour-of-day (1-24) |number |24
|a |halfday of day |text |PM
|K |hour of halfday (0~11) |number |0
|h |clockhour of halfday (1~12) |number |12
|H |hour-of-day (0-23) |number |0
|m |minute-of-hour |number |30
|s |second-of-minute |number |55
|S |fraction-of-second |fraction |978
|A |milli-of-day |number |1234
|n |nano-of-second |number |987654321
|N |nano-of-day |number |1234000000
|H |hour of day (0~23) |number |0
|k |clockhour of day (1~24) |number |24
|m |minute of hour |number |30
|s |second of minute |number |55
|S |fraction of second |number |978
|V |time-zone ID |zone-id |America/Los_Angeles; Z; -08:30
|z |time-zone name |zone-name |Pacific Standard Time; PST
|O |localized zone-offset |offset-O |GMT+8; GMT+08:00; UTC-08:00;
|X |zone-offset 'Z' for zero |offset-X |Z; -08; -0830; -08:30; -083015; -08:30:15;
|x |zone-offset |offset-x |+0000; -08; -0830; -08:30; -083015; -08:30:15;
|Z |zone-offset |offset-Z |+0000; -0800; -08:00;
|z |time zone |text |Pacific Standard Time; PST
|Z |time zone offset/id |zone |-0800; -08:00; America/Los_Angeles
|' |escape for text |delimiter
|'' |single quote |literal |'
|p |pad next |pad modifier |1
|' |escape for text |delimiter
|'' |single quote |literal |'
|[ |optional section start
|] |optional section end
|# |reserved for future use
|{ |reserved for future use
|} |reserved for future use
|=======
The count of pattern letters determine the format.
The count of pattern letters determines the format.
Text:: If the number of pattern letters is 4 or more, the full form is used;
otherwise a short or abbreviated form is used if available.
Text:: The text style is determined based on the number of pattern letters
used. Fewer than 4 pattern letters will use the short form. Exactly 4
pattern letters will use the full form. Exactly 5 pattern letters will use
the narrow form. Pattern letters `L`, `c`, and `q` specify the stand-alone
form of the text styles.
Number:: The minimum number of digits. Shorter numbers are zero-padded to
this amount.
Number:: If the count of letters is one, then the value is output using
the minimum number of digits and without padding. Otherwise, the count of
digits is used as the width of the output field, with the value
zero-padded as necessary. The following pattern letters have constraints
on the count of letters. Only one letter of `c` and `F` can be specified.
Up to two letters of `d`, `H`, `h`, `K`, `k`, `m`, and `s` can be
specified. Up to three letters of `D` can be specified.
Year:: Numeric presentation for year and weekyear fields are handled
specially. For example, if the count of 'y' is 2, the year will be displayed
as the zero-based year of the century, which is two digits.
Number/Text:: If the count of pattern letters is 3 or greater, use the
Text rules above. Otherwise use the Number rules above.
Month:: 3 or over, use text, otherwise use number.
Fraction:: Outputs the nano-of-second field as a fraction-of-second. The
nano-of-second value has nine digits, thus the count of pattern letters is
from 1 to 9. If it is less than 9, then the nano-of-second value is
truncated, with only the most significant digits being output.
Zone:: 'Z' outputs offset without a colon, 'ZZ' outputs the offset with a
colon, 'ZZZ' or more outputs the zone id.
Year:: The count of letters determines the minimum field width below which
padding is used. If the count of letters is two, then a reduced two digit
form is used. For printing, this outputs the rightmost two digits. For
parsing, this will parse using the base value of 2000, resulting in a year
within the range 2000 to 2099 inclusive. If the count of letters is less
than four (but not two), then the sign is only output for negative years
as per `SignStyle.NORMAL`. Otherwise, the sign is output if the pad width is
exceeded, as per `SignStyle.EXCEEDS_PAD`.
Zone names:: Time zone names ('z') cannot be parsed.
ZoneId:: This outputs the time-zone ID, such as `Europe/Paris`. If the
count of letters is two, then the time-zone ID is output. Any other count
of letters throws `IllegalArgumentException`.
Zone names:: This outputs the display name of the time-zone ID. If the
count of letters is one, two or three, then the short name is output. If
the count of letters is four, then the full name is output. Five or more
letters throws `IllegalArgumentException`.
Offset X and x:: This formats the offset based on the number of pattern
letters. One letter outputs just the hour, such as `+01`, unless the
minute is non-zero in which case the minute is also output, such as
`+0130`. Two letters outputs the hour and minute, without a colon, such as
`+0130`. Three letters outputs the hour and minute, with a colon, such as
`+01:30`. Four letters outputs the hour and minute and optional second,
without a colon, such as `+013015`. Five letters outputs the hour and
minute and optional second, with a colon, such as `+01:30:15`. Six or
more letters throws `IllegalArgumentException`. Pattern letter `X` (upper
case) will output `Z` when the offset to be output would be zero,
whereas pattern letter `x` (lower case) will output `+00`, `+0000`, or
`+00:00`.
Offset O:: This formats the localized offset based on the number of
pattern letters. One letter outputs the short form of the localized
offset, which is localized offset text, such as `GMT`, with hour without
leading zero, optional 2-digit minute and second if non-zero, and colon,
for example `GMT+8`. Four letters outputs the full form, which is
localized offset text, such as `GMT`, with 2-digit hour and minute
field, optional second field if non-zero, and colon, for example
`GMT+08:00`. Any other count of letters throws
`IllegalArgumentException`.
Offset Z:: This formats the offset based on the number of pattern letters.
One, two or three letters outputs the hour and minute, without a colon,
such as `+0130`. The output will be `+0000` when the offset is zero.
Four letters outputs the full form of localized offset, equivalent to
four letters of Offset-O. The output will be the corresponding localized
offset text if the offset is zero. Five letters outputs the hour,
minute, with optional second if non-zero, with colon. It outputs `Z` if
the offset is zero. Six or more letters throws `IllegalArgumentException`.
Optional section:: The optional section markers work exactly like calling
`DateTimeFormatterBuilder.optionalStart()` and
`DateTimeFormatterBuilder.optionalEnd()`.
Pad modifier:: Modifies the pattern that immediately follows to be padded
with spaces. The pad width is determined by the number of pattern letters.
This is the same as calling `DateTimeFormatterBuilder.padNext(int)`.
For example, `ppH` outputs the hour-of-day padded on the left with spaces to a width of 2.
Any unrecognized letter is an error. Any non-letter character, other than
`[`, `]`, `{`, `}`, `#` and the single quote will be output directly.
Despite this, it is recommended to use single quotes around all characters
that you want to output directly to ensure that future changes do not
break your application.
Any characters in the pattern that are not in the ranges of ['a'..'z'] and
['A'..'Z'] will be treated as quoted text. For instance, characters like ':',
'.', ' ', '#' and '?' will appear in the resulting time text even they are
not embraced within single quotes.
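The letter-count rules above can be tried out directly against `java.time`. The following is a minimal sketch using only the JDK; the class name `PatternLetterCounts` and the helper `format` are made up for illustration, and `Locale.ENGLISH` is fixed so that text output does not depend on the JVM default locale.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class PatternLetterCounts {

    // Hypothetical helper: formats a date with the given pattern and a fixed locale.
    public static String format(String pattern, ZonedDateTime date) {
        return DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH).format(date);
    }

    public static void main(String[] args) {
        // 2019-02-14T09:18:12+01:30, an offset with a non-zero minute part
        ZonedDateTime date = ZonedDateTime.of(2019, 2, 14, 9, 18, 12, 0,
                ZoneOffset.ofHoursMinutes(1, 30));
        ZonedDateTime utc = date.withZoneSameInstant(ZoneOffset.UTC);

        // Number/Text: one or two letters print a number, three or more print text
        System.out.println(format("M", date));    // 2
        System.out.println(format("MM", date));   // 02
        System.out.println(format("MMM", date));  // Feb
        System.out.println(format("MMMM", date)); // February

        // Year: two letters print the rightmost two digits
        System.out.println(format("yy", date));   // 19

        // Offset X and x: the letter count controls minutes and the colon
        System.out.println(format("X", date));    // +0130
        System.out.println(format("XXX", date));  // +01:30
        System.out.println(format("X", utc));     // Z
        System.out.println(format("x", utc));     // +00

        // Pad modifier: pad the hour with spaces to a width of 2
        System.out.println("[" + format("ppH", date) + "]"); // [ 9]
    }
}
```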
[[time-zones]]
==== Time zone in date range aggregations
@ -180,8 +261,7 @@ Dates can be converted from another time zone to UTC by specifying the
`time_zone` parameter.
Time zones may either be specified as an ISO 8601 UTC offset (e.g. +01:00 or
-08:00) or as one of the http://www.joda.org/joda-time/timezones.html [time
zone ids] from the TZ database.
-08:00) or as one of the time zone ids from the TZ database.
The `time_zone` parameter is also applied to rounding in date math expressions.
As an example, to round to the beginning of the day in the CET time zone, you

View File

@ -137,9 +137,9 @@ understands this to mean `2016-04-01` as is explained in the <<date-math-index-n
| `field` | yes | - | The field to get the date or timestamp from.
| `index_name_prefix` | no | - | A prefix of the index name to be prepended before the printed date. Supports <<accessing-template-fields,template snippets>>.
| `date_rounding` | yes | - | How to round the date when formatting the date into the index name. Valid values are: `y` (year), `M` (month), `w` (week), `d` (day), `h` (hour), `m` (minute) and `s` (second). Supports <<accessing-template-fields,template snippets>>.
| `date_formats` | no | yyyy-MM-dd'T'HH:mm:ss.SSSZ | An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a Joda pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.
| `date_formats` | no | yyyy-MM-dd'T'HH:mm:ss.SSSZ | An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.
| `timezone` | no | UTC | The timezone to use when parsing the date and when date math index support resolves expressions into concrete index names.
| `locale` | no | ENGLISH | The locale to use when parsing the date from the document being preprocessed, relevant when parsing month names or week days.
| `index_name_format` | no | yyyy-MM-dd | The format to be used when printing the parsed date into the index name. An valid Joda pattern is expected here. Supports <<accessing-template-fields,template snippets>>.
| `index_name_format` | no | yyyy-MM-dd | The format to be used when printing the parsed date into the index name. A valid java time pattern is expected here. Supports <<accessing-template-fields,template snippets>>.
include::common-options.asciidoc[]
|======

View File

@ -14,7 +14,7 @@ in the same order they were defined as part of the processor definition.
| Name | Required | Default | Description
| `field` | yes | - | The field to get the date from.
| `target_field` | no | @timestamp | The field that will hold the parsed date.
| `formats` | yes | - | An array of the expected date formats. Can be a Joda pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.
| `formats` | yes | - | An array of the expected date formats. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.
| `timezone` | no | UTC | The timezone to use when parsing the date. Supports <<accessing-template-fields,template snippets>>.
| `locale` | no | ENGLISH | The locale to use when parsing the date, relevant when parsing month names or week days. Supports <<accessing-template-fields,template snippets>>.
include::common-options.asciidoc[]

View File

@ -33,7 +33,7 @@ down to the nearest day.
==== Custom date formats
Completely customizable date formats are supported. The syntax for these is explained
http://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html[in the Joda docs].
https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html[DateTimeFormatter docs].
[[built-in-date-formats]]
==== Built In Formats
@ -69,7 +69,6 @@ The following tables lists all the defaults ISO formats supported:
A generic ISO datetime parser where the date is mandatory and the time is
optional.
http://www.joda.org/joda-time/apidocs/org/joda/time/format/ISODateTimeFormat.html#dateOptionalTimeParser--[Full details here].
`basic_date`::

View File

@ -27,6 +27,7 @@ See also <<release-highlights>> and <<es-release-notes>>.
* <<breaking_70_low_level_restclient_changes>>
* <<breaking_70_logging_changes>>
* <<breaking_70_node_changes>>
* <<breaking_70_java_time_changes>>
[float]
=== Indices created before 7.0
@ -62,3 +63,4 @@ include::migrate_7_0/restclient.asciidoc[]
include::migrate_7_0/low_level_restclient.asciidoc[]
include::migrate_7_0/logging.asciidoc[]
include::migrate_7_0/node.asciidoc[]
include::migrate_7_0/java_time.asciidoc[]

View File

@ -0,0 +1,113 @@
[float]
[[breaking_70_java_time_changes]]
=== Replacing Joda-Time with java time
Since Java 8 there is a dedicated `java.time` package, which is superior to
the Joda-Time library that Elasticsearch has used so far. One of its
biggest advantages is the ability to store dates at a higher resolution
than milliseconds for greater precision. The switch will also allow us
to remove the Joda-Time dependency in the future.
The mappings, aggregations and search code have switched from Joda-Time to
java time.
[float]
==== Joda based date formatters are replaced with java ones
With the release of Elasticsearch 6.7 a backwards compatibility layer was
introduced that checked whether you were using a Joda-Time based formatter
that is supported differently in java time. If so, a log message was emitted,
and you could create the proper java time based formatter by prefixing it with an `8`.
With Elasticsearch 7.0 all formatters are java based, which means you will
get exceptions when using deprecated formatters that were not fixed after
checking the deprecation log in 6.7. In the worst case you may even end up
with different dates.
An example deprecation message, returned when you try to use a date
formatter that includes an upper case `Y`, looks like this:
[source,text]
----------
Use of 'Y' (year-of-era) will change to 'y' in the next major version of
Elasticsearch. Prefix your date format with '8' to use the new specifier.
----------
So, instead of using `YYYY.MM.dd` you should use `8yyyy.MM.dd`.
You can find more information about available formatting strings in the
https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html[DateTimeFormatter javadocs].
[float]
==== Date formats behavioural change
The `epoch_millis` and `epoch_second` formatters no longer support
scientific notation.
The century-of-era format character (`C`) is no longer supported in date
formats.
The year-of-era formatting character is a `Y` in Joda-Time, but a lowercase
`y` in java time.
The week-based-year formatting character is a lowercase `x` in Joda-Time,
but an upper-case `Y` in java time.
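To illustrate why a Joda-style `YYYY` pattern can silently produce different dates under java time, here is a minimal sketch using only the JDK. The class name `YearPatternPitfall` and the helper `format` are hypothetical, and `Locale.ROOT` is an assumption made for the example; Elasticsearch itself may configure its formatters differently.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class YearPatternPitfall {

    // Hypothetical helper: formats a date with the given java time pattern.
    public static String format(String pattern, LocalDate date) {
        return DateTimeFormatter.ofPattern(pattern, Locale.ROOT).format(date);
    }

    public static void main(String[] args) {
        // The last day of 2018 already belongs to the first week of 2019
        LocalDate date = LocalDate.of(2018, 12, 31);

        // 'y' is year-of-era in java time: the expected calendar year
        System.out.println(format("yyyy.MM.dd", date)); // 2018.12.31

        // 'Y' is week-based-year in java time: the year jumps ahead
        System.out.println(format("YYYY.MM.dd", date)); // 2019.12.31
    }
}
```

No exception is thrown in the second case, which is why a date close to a year boundary is a useful test input when migrating patterns.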
[float]
==== Using time zones in the Java client
Time zones have to be specified as java time based zone objects. This means
that instead of an `org.joda.time.DateTimeZone` you have to use a
`java.time.ZoneId`.
Examples of possible uses are the `QueryStringQueryBuilder`, the
`RangeQueryBuilder` and the `DateHistogramAggregationBuilder`, each of which
allows an optional time zone for that part of the search request.
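As a rough sketch of what creating such zone objects looks like, the following uses only the JDK and deliberately does not call any Elasticsearch builder. The class name `ZoneIdExamples` and the helper `offsetAt` are made up for illustration.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class ZoneIdExamples {

    // Hypothetical helper: resolves the offset that applies in a zone at a given instant.
    public static ZoneOffset offsetAt(String zoneId, Instant instant) {
        return ZoneId.of(zoneId).getRules().getOffset(instant);
    }

    public static void main(String[] args) {
        // A fixed UTC offset and a TZ database region id are both valid ZoneIds
        ZoneId fixed = ZoneId.of("+01:00");
        ZoneId region = ZoneId.of("Europe/Paris");

        System.out.println(fixed.getId());  // +01:00
        System.out.println(region.getId()); // Europe/Paris

        // Region ids carry daylight saving rules, so the offset depends on the instant
        Instant winter = Instant.parse("2019-02-14T09:18:12Z");
        System.out.println(offsetAt("Europe/Paris", winter)); // +01:00
    }
}
```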
[float]
==== Parsing aggregation buckets in the Java client
The keys of date based aggregation buckets in responses used to be
Joda-Time `DateTime` objects. Due to the migration to java time, they are now of
type `ZonedDateTime`. As the client returns untyped objects here, you
may run into class cast exceptions that only show up at runtime, not at
compile time. Ensure you have proper test coverage for this in your
own code.
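One way to make such a type change fail loudly is to check the type of the untyped key before casting. The following is a minimal sketch using only the JDK; the class name `BucketKeys` and the helper `toEpochMillis` are hypothetical, and the `Object` in `main` merely stands in for a key returned by the client.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

public class BucketKeys {

    // Hypothetical helper: converts an untyped bucket key to epoch milliseconds
    // and reports unexpected types with a descriptive message instead of a bare cast failure.
    public static long toEpochMillis(Object key) {
        if (key instanceof ZonedDateTime) {
            return ((ZonedDateTime) key).toInstant().toEpochMilli();
        }
        String type = key == null ? "null" : key.getClass().getName();
        throw new IllegalArgumentException("Unexpected bucket key type: " + type);
    }

    public static void main(String[] args) {
        // Stand-in for a key obtained from a date based aggregation bucket
        Object key = ZonedDateTime.of(2019, 2, 14, 0, 0, 0, 0, ZoneOffset.UTC);
        System.out.println(toEpochMillis(key)); // 1550102400000
    }
}
```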
[float]
==== Parsing `GMT0` timezone with JDK8 is not supported
When you are running Elasticsearch 7 with Java 8, you can no longer parse
the timezone `GMT0` properly. The reason for this is a bug in the
JDK, which has not been fixed for JDK8. You can read more in the
https://bugs.openjdk.java.net/browse/JDK-8138664[official issue].
[float]
==== Scripting with dates should use java time based methods
If dates are used in scripting, a backwards compatibility layer
emulates the Joda-Time methods, but logs a deprecation message asking you
to use the java time methods instead.
The following methods will be removed in future versions of Elasticsearch
and should be replaced.
* `getDayOfWeek()` will return an enum instead of an int. If you need
an int, use `getDayOfWeekEnum().getValue()`
* `getMillis()` should be replaced with `toInstant().toEpochMilli()`
* `getCenturyOfEra()` should be replaced with `get(ChronoField.YEAR_OF_ERA) / 100`
* `getEra()` should be replaced with `get(ChronoField.ERA)`
* `getHourOfDay()` should be replaced with `getHour()`
* `getMillisOfDay()` should be replaced with `get(ChronoField.MILLI_OF_DAY)`
* `getMillisOfSecond()` should be replaced with `get(ChronoField.MILLI_OF_SECOND)`
* `getMinuteOfDay()` should be replaced with `get(ChronoField.MINUTE_OF_DAY)`
* `getMinuteOfHour()` should be replaced with `getMinute()`
* `getMonthOfYear()` should be replaced with `getMonthValue()`
* `getSecondOfDay()` should be replaced with `get(ChronoField.SECOND_OF_DAY)`
* `getSecondOfMinute()` should be replaced with `getSecond()`
* `getWeekOfWeekyear()` should be replaced with `get(WeekFields.ISO.weekOfWeekBasedYear())`
* `getWeekyear()` should be replaced with `get(WeekFields.ISO.weekBasedYear())`
* `getYearOfCentury()` should be replaced with `get(ChronoField.YEAR_OF_ERA) % 100`
* `getYearOfEra()` should be replaced with `get(ChronoField.YEAR_OF_ERA)`
* `toString(String)` should be replaced with a `DateTimeFormatter`
* `toString(String,Locale)` should be replaced with a `DateTimeFormatter`
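The replacements listed above can be checked against a plain `java.time.ZonedDateTime`. The following is a minimal sketch in Java rather than in a scripting language; the class name `JodaMethodReplacements` and its helper methods are made up for illustration. Note that `getDayOfWeekEnum()` belongs to the scripting compatibility layer; on a plain `ZonedDateTime` the equivalent call is `getDayOfWeek().getValue()`.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoField;
import java.time.temporal.WeekFields;
import java.util.Locale;

public class JodaMethodReplacements {

    // Replacement for getMillis()
    public static long millis(ZonedDateTime d) {
        return d.toInstant().toEpochMilli();
    }

    // Replacement for getCenturyOfEra()
    public static int centuryOfEra(ZonedDateTime d) {
        return d.get(ChronoField.YEAR_OF_ERA) / 100;
    }

    // Replacement for getYearOfCentury()
    public static int yearOfCentury(ZonedDateTime d) {
        return d.get(ChronoField.YEAR_OF_ERA) % 100;
    }

    // Replacement for getWeekOfWeekyear()
    public static int weekOfWeekyear(ZonedDateTime d) {
        return d.get(WeekFields.ISO.weekOfWeekBasedYear());
    }

    // Replacement for toString(String, Locale)
    public static String toString(ZonedDateTime d, String pattern, Locale locale) {
        return DateTimeFormatter.ofPattern(pattern, locale).format(d);
    }

    public static void main(String[] args) {
        // 2019-02-14T10:18:12.345+01:00, a Thursday
        ZonedDateTime d = ZonedDateTime.of(2019, 2, 14, 10, 18, 12, 345_000_000,
                ZoneOffset.ofHours(1));

        System.out.println(millis(d));                        // 1550135892345
        System.out.println(centuryOfEra(d));                  // 20
        System.out.println(yearOfCentury(d));                 // 19
        System.out.println(weekOfWeekyear(d));                // 7
        System.out.println(d.get(ChronoField.MILLI_OF_DAY));  // 37092345
        System.out.println(d.getDayOfWeek().getValue());      // 4
        System.out.println(toString(d, "yyyy-MM-dd", Locale.ROOT)); // 2019-02-14
    }
}
```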

View File

@ -164,7 +164,7 @@ format corresponds to the primary timestamp, but you do not want to specify the
full `grok_pattern`.
If this parameter is not specified, the structure finder chooses the best format from
the formats it knows, which are these Java time formats and their Joda equivalents:
the formats it knows, which are these Java time formats:
* `dd/MMM/yyyy:HH:mm:ss XX`
* `EEE MMM dd HH:mm zzz yyyy`

View File

@ -118,8 +118,7 @@ both>>.
|`lenient` |If set to `true` will cause format based failures (like
providing text to a numeric field) to be ignored.
|`time_zone` | Time Zone to be applied to any range query related to dates. See also
http://www.joda.org/joda-time/apidocs/org/joda/time/DateTimeZone.html[JODA timezone].
|`time_zone` | Time Zone to be applied to any range query related to dates.
|`quote_field_suffix` | A suffix to append to fields for quoted parts of
the query string. This allows to use a field that has a different analysis chain