Add documentation about breaking java time changes (#38886)

In addition remove joda time mentions across the docs, make sure links are updated to java time javadocs. Forward port of #38720
2025-03-25 09:28:27 +00:00 · 2019-02-14 10:18:12 +01:00 · 2019-02-14 10:18:12 +01:00 · 8e5e48319e
commit 8e5e48319e
parent 88489a3f3a
8 changed files with 245 additions and 52 deletions
--- a/docs/reference/aggregations/bucket/daterange-aggregation.asciidoc
+++ b/docs/reference/aggregations/bucket/daterange-aggregation.asciidoc
@ -110,68 +110,149 @@ bucket, as if they had a date value of "1899-12-31".
 ==== Date Format/Pattern

 NOTE: this information was copied from
-http://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html[JodaDate]
+https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html[DateTimeFormatter]

 All ASCII letters are reserved as format pattern letters, which are defined
 as follows:

 [options="header"]
 |=======
-|Symbol |Meaning                |Presentation       |Examples
-|G      |era                    |text               |AD
-|C      |century of era (>=0)   |number             |20
-|Y      |year of era (>=0)      |year               |1996
+|Symbol |Meaning                    |Presentation |Examples
+|G      |era                        |text         |AD; Anno Domini; A
+|u      |year                       |year         |2004; 04
+|y      |year-of-era                |year         |2004; 04
+|D      |day-of-year                |number       |189
+|M/L    |month-of-year              |number/text  |7; 07; Jul; July; J
+|d      |day-of-month               |number       |10

-|x      |weekyear               |year               |1996
-|w      |week of weekyear       |number             |27
-|e      |day of week            |number             |2
-|E      |day of week            |text               |Tuesday; Tue
+|Q/q    |quarter-of-year            |number/text  |3; 03; Q3; 3rd quarter
+|Y      |week-based-year            |year         |1996; 96
+|w      |week-of-week-based-year    |number       |27
+|W      |week-of-month              |number       |4
+|E      |day-of-week                |text         |Tue; Tuesday; T
+|e/c    |localized day-of-week      |number/text  |2; 02; Tue; Tuesday; T
+|F      |week-of-month              |number       |3

-|y      |year                   |year               |1996
-|D      |day of year            |number             |189
-|M      |month of year          |month              |July; Jul; 07
-|d      |day of month           |number             |10
+|a      |am-pm-of-day               |text         |PM
+|h      |clock-hour-of-am-pm (1-12) |number       |12
+|K      |hour-of-am-pm (0-11)       |number       |0
+|k      |clock-hour-of-day (1-24)   |number       |24

-|a      |halfday of day               |text         |PM
-|K      |hour of halfday (0~11)       |number       |0
-|h      |clockhour of halfday (1~12)  |number       |12
+|H      |hour-of-day (0-23)         |number       |0
+|m      |minute-of-hour             |number       |30
+|s      |second-of-minute           |number       |55
+|S      |fraction-of-second         |fraction     |978
+|A      |milli-of-day               |number       |1234
+|n      |nano-of-second             |number       |987654321
+|N      |nano-of-day                |number       |1234000000

-|H      |hour of day (0~23)           |number       |0
-|k      |clockhour of day (1~24)      |number       |24
-|m      |minute of hour               |number       |30
-|s      |second of minute             |number       |55
-|S      |fraction of second           |number       |978
+|V      |time-zone ID               |zone-id      |America/Los_Angeles; Z; -08:30
+|z      |time-zone name             |zone-name    |Pacific Standard Time; PST
+|O      |localized zone-offset      |offset-O     |GMT+8; GMT+08:00; UTC-08:00;
+|X      |zone-offset 'Z' for zero   |offset-X     |Z; -08; -0830; -08:30; -083015; -08:30:15;
+|x      |zone-offset                |offset-x     |+0000; -08; -0830; -08:30; -083015; -08:30:15;
+|Z      |zone-offset                |offset-Z     |+0000; -0800; -08:00;

-|z      |time zone                    |text         |Pacific Standard Time; PST
-|Z      |time zone offset/id          |zone         |-0800; -08:00; America/Los_Angeles
-
-|'      |escape for text              |delimiter
-|''     |single quote                 |literal      |'
+|p      |pad next                   |pad modifier |1
+|'      |escape for text            |delimiter
+|''     |single quote               |literal      |'
+|[      |optional section start
+|]      |optional section end
+|#      |reserved for future use
+|{      |reserved for future use
+|}      |reserved for future use
 |=======

-The count of pattern letters determine the format.
+The count of pattern letters determines the format.

-Text:: If the number of pattern letters is 4 or more, the full form is used;
-otherwise a short or abbreviated form is used if available.
+Text:: The text style is determined based on the number of pattern letters
+used. Fewer than 4 pattern letters will use the short form. Exactly 4
+pattern letters will use the full form. Exactly 5 pattern letters will use
+the narrow form. Pattern letters `L`, `c`, and `q` specify the stand-alone
+form of the text styles.

-Number:: The minimum number of digits. Shorter numbers are zero-padded to
-this amount.
+Number:: If the count of letters is one, then the value is output using
+the minimum number of digits and without padding. Otherwise, the count of
+digits is used as the width of the output field, with the value
+zero-padded as necessary. The following pattern letters have constraints
+on the count of letters. Only one letter of `c` and `F` can be specified.
+Up to two letters of `d`, `H`, `h`, `K`, `k`, `m`, and `s` can be
+specified. Up to three letters of `D` can be specified.

-Year:: Numeric presentation for year and weekyear fields are handled
-specially. For example, if the count of 'y' is 2, the year will be displayed
-as the zero-based year of the century, which is two digits.
+Number/Text:: If the count of pattern letters is 3 or greater, use the
+Text rules above. Otherwise use the Number rules above.

-Month:: 3 or over, use text, otherwise use number.
+Fraction:: Outputs the nano-of-second field as a fraction-of-second. The
+nano-of-second value has nine digits, thus the count of pattern letters is
+from 1 to 9. If it is less than 9, then the nano-of-second value is
+truncated, with only the most significant digits being output.

-Zone:: 'Z' outputs offset without a colon, 'ZZ' outputs the offset with a
-colon, 'ZZZ' or more outputs the zone id.
+Year:: The count of letters determines the minimum field width below which
+padding is used. If the count of letters is two, then a reduced two digit
+form is used. For printing, this outputs the rightmost two digits. For
+parsing, this will parse using the base value of 2000, resulting in a year
+within the range 2000 to 2099 inclusive. If the count of letters is less
+than four (but not two), then the sign is only output for negative years
+as per `SignStyle.NORMAL`. Otherwise, the sign is output if the pad width is
+exceeded, as per `SignStyle.EXCEEDS_PAD`.

-Zone names:: Time zone names ('z') cannot be parsed.
+ZoneId:: This outputs the time-zone ID, such as `Europe/Paris`. If the
+count of letters is two, then the time-zone ID is output. Any other count
+of letters throws `IllegalArgumentException`.
+
+Zone names:: This outputs the display name of the time-zone ID. If the
+count of letters is one, two or three, then the short name is output. If
+the count of letters is four, then the full name is output. Five or more
+letters throws `IllegalArgumentException`.
+
+Offset X and x:: This formats the offset based on the number of pattern
+letters. One letter outputs just the hour, such as `+01`, unless the
+minute is non-zero in which case the minute is also output, such as
+`+0130`. Two letters outputs the hour and minute, without a colon, such as
+`+0130`. Three letters outputs the hour and minute, with a colon, such as
+`+01:30`. Four letters outputs the hour and minute and optional second,
+without a colon, such as `+013015`. Five letters outputs the hour and
+minute and optional second, with a colon, such as `+01:30:15`. Six or
+more letters throws `IllegalArgumentException`. Pattern letter `X` (upper
+case) will output `Z` when the offset to be output would be zero,
+whereas pattern letter `x` (lower case) will output `+00`, `+0000`, or
+`+00:00`.
+
+Offset O:: This formats the localized offset based on the number of
+pattern letters. One letter outputs the short form of the localized
+offset, which is localized offset text, such as `GMT`, with hour without
+leading zero, optional 2-digit minute and second if non-zero, and colon,
+for example `GMT+8`. Four letters outputs the full form, which is
+localized offset text, such as `GMT`, with 2-digit hour and minute
+field, optional second field if non-zero, and colon, for example
+`GMT+08:00`. Any other count of letters throws
+`IllegalArgumentException`.
+
+Offset Z:: This formats the offset based on the number of pattern letters.
+One, two or three letters outputs the hour and minute, without a colon,
+such as `+0130`. The output will be `+0000` when the offset is zero.
+Four letters outputs the full form of localized offset, equivalent to
+four letters of Offset-O. The output will be the corresponding localized
+offset text if the offset is zero. Five letters outputs the hour,
+minute, with optional second if non-zero, with colon. It outputs `Z` if
+the offset is zero. Six or more letters throws `IllegalArgumentException`.
+
+Optional section:: The optional section markers work exactly like calling
+`DateTimeFormatterBuilder.optionalStart()` and
+`DateTimeFormatterBuilder.optionalEnd()`.
+
+Pad modifier:: Modifies the pattern that immediately follows to be padded
+with spaces. The pad width is determined by the number of pattern letters.
+This is the same as calling `DateTimeFormatterBuilder.padNext(int)`.
+
+For example, `ppH` outputs the hour-of-day padded on the left with spaces to a width of 2.
+
+Any unrecognized letter is an error. Any non-letter character, other than
+`[`, `]`, `{`, `}`, `#` and the single quote will be output directly.
+Despite this, it is recommended to use single quotes around all characters
+that you want to output directly to ensure that future changes do not
+break your application.

-Any characters in the pattern that are not in the ranges of ['a'..'z'] and
-['A'..'Z'] will be treated as quoted text. For instance, characters like ':',
- '.', ' ', '#' and '?' will appear in the resulting time text even they are
- not embraced within single quotes.

 [[time-zones]]
 ==== Time zone in date range aggregations
@ -180,8 +261,7 @@ Dates can be converted from another time zone to UTC by specifying the
 `time_zone` parameter.

 Time zones may either be specified as an ISO 8601 UTC offset (e.g. +01:00 or
--08:00) or as one of the http://www.joda.org/joda-time/timezones.html [time
-zone ids] from the TZ database.
+-08:00) or as one of the time zone ids from the TZ database.

 The `time_zone` parameter is also applied to rounding in date math expressions.
 As an example, to round to the beginning of the day in the CET time zone, you
--- a/docs/reference/ingest/processors/date-index-name.asciidoc
+++ b/docs/reference/ingest/processors/date-index-name.asciidoc
@ -137,9 +137,9 @@ understands this to mean `2016-04-01` as is explained in the <<date-math-index-n
 | `field`                | yes       | -                            | The field to get the date or timestamp from.
 | `index_name_prefix`    | no        | -                            | A prefix of the index name to be prepended before the printed date. Supports <<accessing-template-fields,template snippets>>.
 | `date_rounding`        | yes       | -                            | How to round the date when formatting the date into the index name. Valid values are: `y` (year), `M` (month), `w` (week), `d` (day), `h` (hour), `m` (minute) and `s` (second). Supports <<accessing-template-fields,template snippets>>.
-| `date_formats`         | no        | yyyy-MM-dd'T'HH:mm:ss.SSSZ   | An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a Joda pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.
+| `date_formats`         | no        | yyyy-MM-dd'T'HH:mm:ss.SSSZ   | An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.
 | `timezone`             | no        | UTC                          | The timezone to use when parsing the date and when date math index supports resolves expressions into concrete index names.
 | `locale`               | no        | ENGLISH                      | The locale to use when parsing the date from the document being preprocessed, relevant when parsing month names or week days.
-| `index_name_format`    | no        | yyyy-MM-dd                   | The format to be used when printing the parsed date into the index name. An valid Joda pattern is expected here. Supports <<accessing-template-fields,template snippets>>.
+| `index_name_format`    | no        | yyyy-MM-dd                   | The format to be used when printing the parsed date into the index name. A valid java time pattern is expected here. Supports <<accessing-template-fields,template snippets>>.
 include::common-options.asciidoc[]
 |======
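To check the `date_formats` and `index_name_format` defaults above against `java.time`, a minimal sketch in plain Java may help. It is not the processor's implementation: the class `IndexNameSketch`, the method `monthlyIndexName`, and the sample values are hypothetical, and it assumes the offset is written as `+0000`, which is the form a single `Z` pattern letter accepts.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.Locale;

// Hypothetical helper for illustration only, not part of Elasticsearch.
final class IndexNameSketch {

    // The documented defaults of `date_formats` and `index_name_format`.
    static final DateTimeFormatter PARSER =
        DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSZ", Locale.ENGLISH);
    static final DateTimeFormatter PRINTER =
        DateTimeFormatter.ofPattern("yyyy-MM-dd", Locale.ENGLISH);

    // Parses the timestamp, moves it to UTC, rounds down to the start of the
    // month (similar in spirit to `date_rounding: M`) and prints the index name.
    static String monthlyIndexName(String prefix, String timestamp) {
        ZonedDateTime utc = OffsetDateTime.parse(timestamp, PARSER)
            .atZoneSameInstant(ZoneOffset.UTC);
        ZonedDateTime rounded = utc.withDayOfMonth(1).truncatedTo(ChronoUnit.DAYS);
        return prefix + PRINTER.format(rounded);
    }

    public static void main(String[] args) {
        // prints myindex-2016-04-01
        System.out.println(monthlyIndexName("myindex-", "2016-04-25T12:02:01.789+0000"));
    }
}
```

The rounding is done on the UTC value, so a timestamp such as `2019-01-01T00:30:00.000+0100` lands in the December 2018 index in this sketch.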
--- a/docs/reference/ingest/processors/date.asciidoc
+++ b/docs/reference/ingest/processors/date.asciidoc
@ -14,7 +14,7 @@ in the same order they were defined as part of the processor definition.
 | Name                   | Required  | Default             | Description
 | `field`                | yes       | -                   | The field to get the date from.
 | `target_field`         | no        | @timestamp          | The field that will hold the parsed date.
-| `formats`              | yes       | -                   | An array of the expected date formats. Can be a Joda pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.
+| `formats`              | yes       | -                   | An array of the expected date formats. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.
 | `timezone`        | no        | UTC                 | The timezone to use when parsing the date. Supports <<accessing-template-fields,template snippets>>.
 | `locale`          | no        | ENGLISH             | The locale to use when parsing the date, relevant when parsing month names or week days. Supports <<accessing-template-fields,template snippets>>.
 include::common-options.asciidoc[]
--- a/docs/reference/mapping/params/format.asciidoc
+++ b/docs/reference/mapping/params/format.asciidoc
@ -33,7 +33,7 @@ down to the nearest day.
 ==== Custom date formats

 Completely customizable date formats are supported.  The syntax for these is explained
-http://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html[in the Joda docs].
+https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html[in the DateTimeFormatter docs].

 [[built-in-date-formats]]
 ==== Built In Formats
@ -69,7 +69,6 @@ The following tables lists all the defaults ISO formats supported:

    A generic ISO datetime parser where the date is mandatory and the time is
    optional.
-    http://www.joda.org/joda-time/apidocs/org/joda/time/format/ISODateTimeFormat.html#dateOptionalTimeParser--[Full details here].

 `basic_date`::

--- a/docs/reference/migration/migrate_7_0.asciidoc
+++ b/docs/reference/migration/migrate_7_0.asciidoc
@ -27,6 +27,7 @@ See also <<release-highlights>> and <<es-release-notes>>.
 * <<breaking_70_low_level_restclient_changes>>
 * <<breaking_70_logging_changes>>
 * <<breaking_70_node_changes>>
+* <<breaking_70_java_time_changes>>

 [float]
 === Indices created before 7.0
@ -62,3 +63,4 @@ include::migrate_7_0/restclient.asciidoc[]
 include::migrate_7_0/low_level_restclient.asciidoc[]
 include::migrate_7_0/logging.asciidoc[]
 include::migrate_7_0/node.asciidoc[]
+include::migrate_7_0/java_time.asciidoc[]
--- a/docs/reference/migration/migrate_7_0/java_time.asciidoc
+++ b/docs/reference/migration/migrate_7_0/java_time.asciidoc
@ -0,0 +1,113 @@
+[float]
+[[breaking_70_java_time_changes]]
+=== Replacing Joda-Time with java time
+
+Since Java 8 there is a dedicated `java.time` package, which is superior to
+the Joda-Time library that has been used so far in Elasticsearch. One of
+the biggest advantages is the ability to store dates in a higher
+resolution than milliseconds for greater precision. This will also allow us
+to remove the Joda-Time dependency in the future.
+
+The mappings, aggregations and search code have switched from Joda-Time to
+java time.
+
+[float]
+==== Joda based date formatters are replaced with java ones
+
+With the release of Elasticsearch 6.7 a backwards compatibility layer was
+introduced that checked whether you were using a Joda-Time based formatter
+that is supported differently in java time. If so, a log message was emitted,
+and you could create the proper java time based format by prefixing it with an `8`.
+
+With Elasticsearch 7.0 all formatters are java based, which means you will
+get exceptions when using deprecated formatters if you did not check the
+deprecation log in 6.7. In the worst case you may even end up with different
+dates.
+
+An example deprecation message, which is returned when you try to use a
+date formatter that includes an upper case `Y`, looks like this:
+
+[source,text]
+----------
+Use of 'Y' (year-of-era) will change to 'y' in the next major version of
+Elasticsearch. Prefix your date format with '8' to use the new specifier.
+----------
+
+So, instead of using `YYYY.MM.dd` you should use `8yyyy.MM.dd`.
+
+You can find more information about available formatting strings in the
+https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html[DateTimeFormatter javadocs].
+
+[float]
+==== Date formats behavioural change
+
+The `epoch_millis` and `epoch_second` formatters no longer support
+scientific notation.
+
+The century-of-era formatter (`C`) is no longer supported in date
+formats.
+
+The year-of-era formatting character is a `Y` in Joda-Time, but a lowercase
+`y` in java time.
+
+The week-based-year formatting character is a lowercase `x` in Joda-Time,
+but an upper-case `Y` in java time.
+
+[float]
+==== Using time zones in the Java client
+
+Timezones have to be specified as java time based zone objects. This means
+that instead of an `org.joda.time.DateTimeZone`, a `java.time.ZoneId` is
+required.
+
+Examples of possible uses are the `QueryStringQueryBuilder`, the
+`RangeQueryBuilder` or the `DateHistogramAggregationBuilder`, each of which
+allows an optional timezone for that part of the search request.
+
+[float]
+==== Parsing aggregation buckets in the Java client
+
+The date based aggregation buckets in responses used to be of the
+Joda-Time type `DateTime`. Due to migrating to java time, the buckets are now
+of type `ZonedDateTime`. As the client is returning untyped objects here, you
+may run into class cast exceptions only when running the code, but not at
+compile time. Ensure you have proper test coverage for this in your
+own code.
+
+[float]
+==== Parsing `GMT0` timezone with JDK8 is not supported
+
+When you are running Elasticsearch 7 with Java 8, you are no longer able to
+parse the timezone `GMT0` properly. The reason for this is a bug in the
+JDK, which has not been fixed for JDK8. You can read more in the
+https://bugs.openjdk.java.net/browse/JDK-8138664[official issue].
+
+[float]
+==== Scripting with dates should use java time based methods
+
+If dates are used in scripting, a backwards compatibility layer has been added
+that emulates the Joda-Time methods, but also logs a deprecation message
+advising you to use the java time methods.
+
+The following methods will be removed in future versions of Elasticsearch
+and should be replaced.
+
+* `getDayOfWeek()` will be an enum instead of an int. If you need to use
+  an int, use `getDayOfWeekEnum().getValue()`
+* `getMillis()` should be replaced with `toInstant().toEpochMilli()`
+* `getCenturyOfEra()` should be replaced with `get(ChronoField.YEAR_OF_ERA) / 100`
+* `getEra()` should be replaced with `get(ChronoField.ERA)`
+* `getHourOfDay()` should be replaced with `getHour()`
+* `getMillisOfDay()` should be replaced with `get(ChronoField.MILLI_OF_DAY)`
+* `getMillisOfSecond()` should be replaced with `get(ChronoField.MILLI_OF_SECOND)`
+* `getMinuteOfDay()` should be replaced with `get(ChronoField.MINUTE_OF_DAY)`
+* `getMinuteOfHour()` should be replaced with `getMinute()`
+* `getMonthOfYear()` should be replaced with `getMonthValue()`
+* `getSecondOfDay()` should be replaced with `get(ChronoField.SECOND_OF_DAY)`
+* `getSecondOfMinute()` should be replaced with `getSecond()`
+* `getWeekOfWeekyear()` should be replaced with `get(WeekFields.ISO.weekOfWeekBasedYear())`
+* `getWeekyear()` should be replaced with `get(WeekFields.ISO.weekBasedYear())`
+* `getYearOfCentury()` should be replaced with `get(ChronoField.YEAR_OF_ERA) % 100`
+* `getYearOfEra()` should be replaced with `get(ChronoField.YEAR_OF_ERA)`
+* `toString(String)` should be replaced with a `DateTimeFormatter`
+* `toString(String,Locale)` should be replaced with a `DateTimeFormatter`
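The replacements listed above can be tried outside of Elasticsearch against a plain `java.time.ZonedDateTime`. The following is a minimal sketch under that assumption: the class `JodaGetterReplacements` and its method names are hypothetical, and the script-side date object in Elasticsearch is a compatibility wrapper rather than a bare `ZonedDateTime` (for example, `getDayOfWeekEnum()` exists only on that wrapper, while plain `getDayOfWeek()` already returns the enum here).

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoField;
import java.time.temporal.WeekFields;
import java.util.Locale;

// Hypothetical helper for illustration only, not part of Elasticsearch.
final class JodaGetterReplacements {

    // Joda getMillis()
    static long millis(ZonedDateTime dt) {
        return dt.toInstant().toEpochMilli();
    }

    // Joda getCenturyOfEra() and getYearOfCentury()
    static int centuryOfEra(ZonedDateTime dt) {
        return dt.get(ChronoField.YEAR_OF_ERA) / 100;
    }

    static int yearOfCentury(ZonedDateTime dt) {
        return dt.get(ChronoField.YEAR_OF_ERA) % 100;
    }

    // Joda getWeekOfWeekyear() and getWeekyear(), using ISO week rules
    static int weekOfWeekyear(ZonedDateTime dt) {
        return dt.get(WeekFields.ISO.weekOfWeekBasedYear());
    }

    static int weekyear(ZonedDateTime dt) {
        return dt.get(WeekFields.ISO.weekBasedYear());
    }

    // Joda getDayOfWeek() as an int (1 = Monday ... 7 = Sunday)
    static int dayOfWeek(ZonedDateTime dt) {
        return dt.getDayOfWeek().getValue();
    }

    // Joda toString(String, Locale)
    static String format(ZonedDateTime dt, String pattern, Locale locale) {
        return DateTimeFormatter.ofPattern(pattern, locale).format(dt);
    }

    public static void main(String[] args) {
        ZonedDateTime dt = ZonedDateTime.of(2018, 12, 31, 13, 45, 30, 0, ZoneOffset.UTC);
        System.out.println(centuryOfEra(dt));   // 20
        System.out.println(yearOfCentury(dt));  // 18
        System.out.println(weekyear(dt));       // 2019
        System.out.println(weekOfWeekyear(dt)); // 1
        System.out.println(dayOfWeek(dt));      // 1
        System.out.println(format(dt, "yyyy-MM-dd", Locale.ENGLISH)); // 2018-12-31
    }
}
```

The sample date is chosen on purpose: 31 December 2018 belongs to ISO week 1 of the week-based year 2019, which is why confusing `y` (year-of-era) with `Y` (week-based-year) in a pattern can silently produce a different year.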
--- a/docs/reference/ml/apis/find-file-structure.asciidoc
+++ b/docs/reference/ml/apis/find-file-structure.asciidoc
@ -164,7 +164,7 @@ format corresponds to the primary timestamp, but you do not want to specify the
 full `grok_pattern`.

 If this parameter is not specified, the structure finder chooses the best format from
-the formats it knows, which are these Java time formats and their Joda equivalents:
+the formats it knows, which are these Java time formats:

 * `dd/MMM/yyyy:HH:mm:ss XX`
 * `EEE MMM dd HH:mm zzz yyyy`
--- a/docs/reference/query-dsl/query-string-query.asciidoc
+++ b/docs/reference/query-dsl/query-string-query.asciidoc
@ -118,8 +118,7 @@ both>>.
 |`lenient` |If set to `true` will cause format based failures (like
 providing text to a numeric field) to be ignored.

-|`time_zone` | Time Zone to be applied to any range query related to dates. See also
-http://www.joda.org/joda-time/apidocs/org/joda/time/DateTimeZone.html[JODA timezone].
+|`time_zone` | Time Zone to be applied to any range query related to dates.

 |`quote_field_suffix` | A suffix to append to fields for quoted parts of
 the query string. This allows to use a field that has a different analysis chain