From 8e5e48319e04e359d8881fcf1400aa66c9d09b7b Mon Sep 17 00:00:00 2001 From: Alexander Reelsen Date: Thu, 14 Feb 2019 10:18:12 +0100 Subject: [PATCH] Add documentation about breaking java time changes (#38886) In addition remove joda time mentions across the docs, make sure links are updated to java time javadocs. Forward port of #38720 --- .../bucket/daterange-aggregation.asciidoc | 168 +++++++++++++----- .../processors/date-index-name.asciidoc | 4 +- .../reference/ingest/processors/date.asciidoc | 2 +- docs/reference/mapping/params/format.asciidoc | 3 +- docs/reference/migration/migrate_7_0.asciidoc | 2 + .../migration/migrate_7_0/java_time.asciidoc | 113 ++++++++++++ .../ml/apis/find-file-structure.asciidoc | 2 +- .../query-dsl/query-string-query.asciidoc | 3 +- 8 files changed, 245 insertions(+), 52 deletions(-) create mode 100644 docs/reference/migration/migrate_7_0/java_time.asciidoc diff --git a/docs/reference/aggregations/bucket/daterange-aggregation.asciidoc b/docs/reference/aggregations/bucket/daterange-aggregation.asciidoc index 4b172402da9..5578ecba886 100644 --- a/docs/reference/aggregations/bucket/daterange-aggregation.asciidoc +++ b/docs/reference/aggregations/bucket/daterange-aggregation.asciidoc @@ -110,68 +110,149 @@ bucket, as if they had a date value of "1899-12-31". 
==== Date Format/Pattern NOTE: this information was copied from -http://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html[JodaDate] +https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html[DateTimeFormatter] All ASCII letters are reserved as format pattern letters, which are defined as follows: [options="header"] |======= -|Symbol |Meaning |Presentation |Examples -|G |era |text |AD -|C |century of era (>=0) |number |20 -|Y |year of era (>=0) |year |1996 +|Symbol |Meaning |Presentation |Examples +|G |era |text |AD; Anno Domini; A +|u |year |year |2004; 04 +|y |year-of-era |year |2004; 04 +|D |day-of-year |number |189 +|M/L |month-of-year |number/text |7; 07; Jul; July; J +|d |day-of-month |number |10 -|x |weekyear |year |1996 -|w |week of weekyear |number |27 -|e |day of week |number |2 -|E |day of week |text |Tuesday; Tue +|Q/q |quarter-of-year |number/text |3; 03; Q3; 3rd quarter +|Y |week-based-year |year |1996; 96 +|w |week-of-week-based-year |number |27 +|W |week-of-month |number |4 +|E |day-of-week |text |Tue; Tuesday; T +|e/c |localized day-of-week |number/text |2; 02; Tue; Tuesday; T +|F |week-of-month |number |3 -|y |year |year |1996 -|D |day of year |number |189 -|M |month of year |month |July; Jul; 07 -|d |day of month |number |10 +|a |am-pm-of-day |text |PM +|h |clock-hour-of-am-pm (1-12) |number |12 +|K |hour-of-am-pm (0-11) |number |0 +|k |clock-hour-of-day (1-24) |number |24 -|a |halfday of day |text |PM -|K |hour of halfday (0~11) |number |0 -|h |clockhour of halfday (1~12) |number |12 +|H |hour-of-day (0-23) |number |0 +|m |minute-of-hour |number |30 +|s |second-of-minute |number |55 +|S |fraction-of-second |fraction |978 +|A |milli-of-day |number |1234 +|n |nano-of-second |number |987654321 +|N |nano-of-day |number |1234000000 -|H |hour of day (0~23) |number |0 -|k |clockhour of day (1~24) |number |24 -|m |minute of hour |number |30 -|s |second of minute |number |55 -|S |fraction of second |number 
|978 +|V |time-zone ID |zone-id |America/Los_Angeles; Z; -08:30 +|z |time-zone name |zone-name |Pacific Standard Time; PST +|O |localized zone-offset |offset-O |GMT+8; GMT+08:00; UTC-08:00; +|X |zone-offset 'Z' for zero |offset-X |Z; -08; -0830; -08:30; -083015; -08:30:15; +|x |zone-offset |offset-x |+0000; -08; -0830; -08:30; -083015; -08:30:15; +|Z |zone-offset |offset-Z |+0000; -0800; -08:00; -|z |time zone |text |Pacific Standard Time; PST -|Z |time zone offset/id |zone |-0800; -08:00; America/Los_Angeles - -|' |escape for text |delimiter -|'' |single quote |literal |' +|p |pad next |pad modifier |1 +|' |escape for text |delimiter +|'' |single quote |literal |' +|[ |optional section start +|] |optional section end +|# |reserved for future use +|{ |reserved for future use +|} |reserved for future use |======= -The count of pattern letters determine the format. +The count of pattern letters determines the format. -Text:: If the number of pattern letters is 4 or more, the full form is used; -otherwise a short or abbreviated form is used if available. +Text:: The text style is determined based on the number of pattern letters +used. Less than 4 pattern letters will use the short form. Exactly 4 +pattern letters will use the full form. Exactly 5 pattern letters will use +the narrow form. Pattern letters `L`, `c`, and `q` specify the stand-alone +form of the text styles. -Number:: The minimum number of digits. Shorter numbers are zero-padded to -this amount. +Number:: If the count of letters is one, then the value is output using +the minimum number of digits and without padding. Otherwise, the count of +digits is used as the width of the output field, with the value +zero-padded as necessary. The following pattern letters have constraints +on the count of letters. Only one letter of `c` and `F` can be specified. +Up to two letters of `d`, `H`, `h`, `K`, `k`, `m`, and `s` can be +specified. Up to three letters of `D` can be specified. 
-Year:: Numeric presentation for year and weekyear fields are handled -specially. For example, if the count of 'y' is 2, the year will be displayed -as the zero-based year of the century, which is two digits. +Number/Text:: If the count of pattern letters is 3 or greater, use the +Text rules above. Otherwise use the Number rules above. -Month:: 3 or over, use text, otherwise use number. +Fraction:: Outputs the nano-of-second field as a fraction-of-second. The +nano-of-second value has nine digits, thus the count of pattern letters is +from 1 to 9. If it is less than 9, then the nano-of-second value is +truncated, with only the most significant digits being output. -Zone:: 'Z' outputs offset without a colon, 'ZZ' outputs the offset with a -colon, 'ZZZ' or more outputs the zone id. +Year:: The count of letters determines the minimum field width below which +padding is used. If the count of letters is two, then a reduced two digit +form is used. For printing, this outputs the rightmost two digits. For +parsing, this will parse using the base value of 2000, resulting in a year +within the range 2000 to 2099 inclusive. If the count of letters is less +than four (but not two), then the sign is only output for negative years +as per `SignStyle.NORMAL`. Otherwise, the sign is output if the pad width is +exceeded, as per `SignStyle.EXCEEDS_PAD`. -Zone names:: Time zone names ('z') cannot be parsed. +ZoneId:: This outputs the time-zone ID, such as `Europe/Paris`. If the +count of letters is two, then the time-zone ID is output. Any other count +of letters throws `IllegalArgumentException`. + +Zone names:: This outputs the display name of the time-zone ID. If the +count of letters is one, two or three, then the short name is output. If +the count of letters is four, then the full name is output. Five or more +letters throws `IllegalArgumentException`. + +Offset X and x:: This formats the offset based on the number of pattern +letters. 
One letter outputs just the hour, such as `+01`, unless the +minute is non-zero in which case the minute is also output, such as +`+0130`. Two letters outputs the hour and minute, without a colon, such as +`+0130`. Three letters outputs the hour and minute, with a colon, such as +`+01:30`. Four letters outputs the hour and minute and optional second, +without a colon, such as `+013015`. Five letters outputs the hour and +minute and optional second, with a colon, such as `+01:30:15`. Six or +more letters throws `IllegalArgumentException`. Pattern letter `X` (upper +case) will output `Z` when the offset to be output would be zero, +whereas pattern letter `x` (lower case) will output `+00`, `+0000`, or +`+00:00`. + +Offset O:: This formats the localized offset based on the number of +pattern letters. One letter outputs the short form of the localized +offset, which is localized offset text, such as `GMT`, with hour without +leading zero, optional 2-digit minute and second if non-zero, and colon, +for example `GMT+8`. Four letters outputs the full form, which is +localized offset text, such as `GMT`, with 2-digit hour and minute +field, optional second field if non-zero, and colon, for example +`GMT+08:00`. Any other count of letters throws +`IllegalArgumentException`. + +Offset Z:: This formats the offset based on the number of pattern letters. +One, two or three letters outputs the hour and minute, without a colon, +such as `+0130`. The output will be `+0000` when the offset is zero. +Four letters outputs the full form of localized offset, equivalent to +four letters of Offset-O. The output will be the corresponding localized +offset text if the offset is zero. Five letters outputs the hour, +minute, with optional second if non-zero, with colon. It outputs `Z` if +the offset is zero. Six or more letters throws `IllegalArgumentException`. 
+ +Optional section:: The optional section markers work exactly like calling +`DateTimeFormatterBuilder.optionalStart()` and +`DateTimeFormatterBuilder.optionalEnd()`. + +Pad modifier:: Modifies the pattern that immediately follows to be padded +with spaces. The pad width is determined by the number of pattern letters. +This is the same as calling `DateTimeFormatterBuilder.padNext(int)`. + +For example, `ppH` outputs the hour-of-day padded on the left with spaces to a width of 2. + +Any unrecognized letter is an error. Any non-letter character, other than +`[`, `]`, `{`, `}`, `#` and the single quote will be output directly. +Despite this, it is recommended to use single quotes around all characters +that you want to output directly to ensure that future changes do not +break your application. -Any characters in the pattern that are not in the ranges of ['a'..'z'] and -['A'..'Z'] will be treated as quoted text. For instance, characters like ':', - '.', ' ', '#' and '?' will appear in the resulting time text even they are - not embraced within single quotes. [[time-zones]] ==== Time zone in date range aggregations @@ -180,8 +261,7 @@ Dates can be converted from another time zone to UTC by specifying the `time_zone` parameter. Time zones may either be specified as an ISO 8601 UTC offset (e.g. +01:00 or --08:00) or as one of the http://www.joda.org/joda-time/timezones.html [time -zone ids] from the TZ database. +-08:00) or as one of the time zone ids from the TZ database. The `time_zone` parameter is also applied to rounding in date math expressions. 
As an example, to round to the beginning of the day in the CET time zone, you diff --git a/docs/reference/ingest/processors/date-index-name.asciidoc b/docs/reference/ingest/processors/date-index-name.asciidoc index 6dd54dab056..e2f28425758 100644 --- a/docs/reference/ingest/processors/date-index-name.asciidoc +++ b/docs/reference/ingest/processors/date-index-name.asciidoc @@ -137,9 +137,9 @@ understands this to mean `2016-04-01` as is explained in the <>. | `date_rounding` | yes | - | How to round the date when formatting the date into the index name. Valid values are: `y` (year), `M` (month), `w` (week), `d` (day), `h` (hour), `m` (minute) and `s` (second). Supports <>. -| `date_formats` | no | yyyy-MM-dd'T'HH:mm:ss.SSSZ | An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a Joda pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N. +| `date_formats` | no | yyyy-MM-dd'T'HH:mm:ss.SSSZ | An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N. | `timezone` | no | UTC | The timezone to use when parsing the date and when date math index supports resolves expressions into concrete index names. | `locale` | no | ENGLISH | The locale to use when parsing the date from the document being preprocessed, relevant when parsing month names or week days. -| `index_name_format` | no | yyyy-MM-dd | The format to be used when printing the parsed date into the index name. An valid Joda pattern is expected here. Supports <>. +| `index_name_format` | no | yyyy-MM-dd | The format to be used when printing the parsed date into the index name. A valid java time pattern is expected here. Supports <>. 
include::common-options.asciidoc[] |====== diff --git a/docs/reference/ingest/processors/date.asciidoc b/docs/reference/ingest/processors/date.asciidoc index 17cb367afad..d797dffd8d4 100644 --- a/docs/reference/ingest/processors/date.asciidoc +++ b/docs/reference/ingest/processors/date.asciidoc @@ -14,7 +14,7 @@ in the same order they were defined as part of the processor definition. | Name | Required | Default | Description | `field` | yes | - | The field to get the date from. | `target_field` | no | @timestamp | The field that will hold the parsed date. -| `formats` | yes | - | An array of the expected date formats. Can be a Joda pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N. +| `formats` | yes | - | An array of the expected date formats. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N. | `timezone` | no | UTC | The timezone to use when parsing the date. Supports <>. | `locale` | no | ENGLISH | The locale to use when parsing the date, relevant when parsing month names or week days. Supports <>. include::common-options.asciidoc[] diff --git a/docs/reference/mapping/params/format.asciidoc b/docs/reference/mapping/params/format.asciidoc index 2be1bdf12d8..8e79a217a1a 100644 --- a/docs/reference/mapping/params/format.asciidoc +++ b/docs/reference/mapping/params/format.asciidoc @@ -33,7 +33,7 @@ down to the nearest day. ==== Custom date formats Completely customizable date formats are supported. The syntax for these is explained -http://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html[in the Joda docs]. +https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html[DateTimeFormatter docs]. [[built-in-date-formats]] ==== Built In Formats @@ -69,7 +69,6 @@ The following tables lists all the defaults ISO formats supported: A generic ISO datetime parser where the date is mandatory and the time is optional. 
- http://www.joda.org/joda-time/apidocs/org/joda/time/format/ISODateTimeFormat.html#dateOptionalTimeParser--[Full details here]. `basic_date`:: diff --git a/docs/reference/migration/migrate_7_0.asciidoc b/docs/reference/migration/migrate_7_0.asciidoc index 25c2e3eef44..b040941b3a9 100644 --- a/docs/reference/migration/migrate_7_0.asciidoc +++ b/docs/reference/migration/migrate_7_0.asciidoc @@ -27,6 +27,7 @@ See also <> and <>. * <> * <> * <> +* <> [float] === Indices created before 7.0 @@ -62,3 +63,4 @@ include::migrate_7_0/restclient.asciidoc[] include::migrate_7_0/low_level_restclient.asciidoc[] include::migrate_7_0/logging.asciidoc[] include::migrate_7_0/node.asciidoc[] +include::migrate_7_0/java_time.asciidoc[] diff --git a/docs/reference/migration/migrate_7_0/java_time.asciidoc b/docs/reference/migration/migrate_7_0/java_time.asciidoc new file mode 100644 index 00000000000..a79bda40dd5 --- /dev/null +++ b/docs/reference/migration/migrate_7_0/java_time.asciidoc @@ -0,0 +1,113 @@ +[float] +[[breaking_70_java_time_changes]] +=== Replacing Joda-Time with java time + +Since Java 8 there is a dedicated `java.time` package, which is superior to +the Joda-Time library that has been used so far in Elasticsearch. One of +the biggest advantages is the ability to store dates in a higher +resolution than milliseconds for greater precision. This will also allow us +to remove the Joda-Time dependency in the future. + +The mappings, aggregations and search code switched from Joda-Time to +java time. + +[float] +==== Joda based date formatters are replaced with java ones + +With the release of Elasticsearch 6.7 a backwards compatibility layer was +introduced that checked whether you are using a Joda-Time based formatter that is +supported differently in java time. A log message was emitted, and you could +create the proper java time based formatter prefixed with an `8`. 
+ +With Elasticsearch 7.0 all formatters are now java based, which means you will +get exceptions when using deprecated formatters without checking the +deprecation log in 6.7. In the worst case you may even end up with different +dates. + +An example deprecation message looks like this. It is returned when you +try to use a date formatter that includes an upper case `Y`: + +[source,text] +---------- +Use of 'Y' (year-of-era) will change to 'y' in the next major version of +Elasticsearch. Prefix your date format with '8' to use the new specifier. +---------- + +So, instead of using `YYYY.MM.dd` you should use `8yyyy.MM.dd`. + +You can find more information about available formatting strings in the +https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html[DateTimeFormatter javadocs]. + +[float] +==== Date formats behavioural change + +The `epoch_millis` and `epoch_second` formatters no longer support +scientific notation. + +If you are using the century of era formatter (`C`) in a date format, it is no +longer supported. + +The year-of-era formatting character is a `Y` in Joda-Time, but a lowercase +`y` in java time. + +The week-based-year formatting character is a lowercase `x` in Joda-Time, +but an upper-case `Y` in java time. + +[float] +==== Using time zones in the Java client + +Timezones have to be specified as java time based zone objects. This means +that instead of an `org.joda.time.DateTimeZone`, a +`java.time.ZoneId` is required. + +Examples of possible uses are the `QueryStringQueryBuilder`, the +`RangeQueryBuilder` or the `DateHistogramAggregationBuilder`, each of which +allows for an optional timezone for that part of the search request. + +[float] +==== Parsing aggregation buckets in the Java client + +The date based aggregation buckets in responses used to be of +type `JodaTime`. Due to migrating to java-time, the buckets are now of +type `ZonedDateTime`. 
As the client is returning untyped objects here, you +may run into class cast exceptions only when running the code, but not at +compile time. Ensure you have proper test coverage for this in your +own code. + +[float] +==== Parsing `GMT0` timezone with JDK8 is not supported + +When you are running Elasticsearch 7 with Java 8, you are not able to parse +the timezone `GMT0` properly anymore. The reason for this is a bug in the +JDK, which has not been fixed for JDK8. You can read more in the +https://bugs.openjdk.java.net/browse/JDK-8138664[official issue]. + +[float] +==== Scripting with dates should use java time based methods + +If dates are used in scripting, a backwards compatibility layer has been added +that emulates the Joda-Time methods, but also logs a deprecation message asking you +to use the java time methods. + +The following methods will be removed in future versions of Elasticsearch +and should be replaced. + +* `getDayOfWeek()` will be an enum instead of an int; if you need to use + an int, use `getDayOfWeekEnum().getValue()` +* `getMillis()` should be replaced with `toInstant().toEpochMilli()` +* `getCenturyOfEra()` should be replaced with `get(ChronoField.YEAR_OF_ERA) / 100` +* `getEra()` should be replaced with `get(ChronoField.ERA)` +* `getHourOfDay()` should be replaced with `getHour()` +* `getMillisOfDay()` should be replaced with `get(ChronoField.MILLI_OF_DAY)` +* `getMillisOfSecond()` should be replaced with `get(ChronoField.MILLI_OF_SECOND)` +* `getMinuteOfDay()` should be replaced with `get(ChronoField.MINUTE_OF_DAY)` +* `getMinuteOfHour()` should be replaced with `getMinute()` +* `getMonthOfYear()` should be replaced with `getMonthValue()` +* `getSecondOfDay()` should be replaced with `get(ChronoField.SECOND_OF_DAY)` +* `getSecondOfMinute()` should be replaced with `getSecond()` +* `getWeekOfWeekyear()` should be replaced with `get(WeekFields.ISO.weekOfWeekBasedYear())` +* `getWeekyear()` should be replaced with 
`get(WeekFields.ISO.weekBasedYear())` +* `getYearOfCentury()` should be replaced with `get(ChronoField.YEAR_OF_ERA) % 100` +* `getYearOfEra()` should be replaced with `get(ChronoField.YEAR_OF_ERA)` +* `toString(String)` should be replaced with a `DateTimeFormatter` +* `toString(String,Locale)` should be replaced with a `DateTimeFormatter` diff --git a/docs/reference/ml/apis/find-file-structure.asciidoc b/docs/reference/ml/apis/find-file-structure.asciidoc index caed632bda0..9c21d2a88b4 100644 --- a/docs/reference/ml/apis/find-file-structure.asciidoc +++ b/docs/reference/ml/apis/find-file-structure.asciidoc @@ -164,7 +164,7 @@ format corresponds to the primary timestamp, but you do not want to specify the full `grok_pattern`. If this parameter is not specified, the structure finder chooses the best format from -the formats it knows, which are these Java time formats and their Joda equivalents: +the formats it knows, which are these Java time formats: * `dd/MMM/yyyy:HH:mm:ss XX` * `EEE MMM dd HH:mm zzz yyyy` diff --git a/docs/reference/query-dsl/query-string-query.asciidoc b/docs/reference/query-dsl/query-string-query.asciidoc index ce7690868ec..c293bf5457b 100644 --- a/docs/reference/query-dsl/query-string-query.asciidoc +++ b/docs/reference/query-dsl/query-string-query.asciidoc @@ -118,8 +118,7 @@ both>>. |`lenient` |If set to `true` will cause format based failures (like providing text to a numeric field) to be ignored. -|`time_zone` | Time Zone to be applied to any range query related to dates. See also -http://www.joda.org/joda-time/apidocs/org/joda/time/DateTimeZone.html[JODA timezone]. +|`time_zone` | Time Zone to be applied to any range query related to dates. |`quote_field_suffix` | A suffix to append to fields for quoted parts of the query string. This allows to use a field that has a different analysis chain