Commit Graph

24 Commits

Author SHA1 Message Date
Mayya Sharipova aa6248d4d7
Move dense_vector and sparse_vector to module (#43280) (#43333) 2019-06-18 11:56:04 -04:00
Jim Ferenczi 67820f9da1 Fix search_as_you_type's sub-fields to pick their names from the full path of the root field (#41541)
The subfields of the search_as_you_type are prefixed with the name of their root field.
However they should used the full path of the root field rather than just the name since
these fields can appear in a multi-`fields` definition or under an object field.
Since this field type is not released yet, this should be considered as a non-issue.
2019-04-26 10:19:20 +02:00
Jim Ferenczi e256eb361a Fix merging of search_as_you_type field mapper (#40593)
The merge of the `search_as_you_type` field mapper uses the wrong prefix field
and does not update the underlying field types.
2019-03-29 09:02:40 +01:00
Jeff Hajewski 6c13ed7db8 Update max dims for vectors to 1024. (#40597) 2019-03-28 17:08:14 -04:00
Andy Bristol 23395a9b9f
search as you type fieldmapper (#35600)
Adds the search_as_you_type field type that acts like a text field optimized
for as-you-type search completion. It creates a couple subfields that analyze
the indexed terms as shingles, against which full terms are queried, and a
prefix subfield that analyze terms as the largest shingle size used and
edge-ngrams, against which partial terms are queried

Adds a match_bool_prefix query type that creates a boolean clause of a term
query for each term except the last, for which a boolean clause with a prefix
query is created.

The match_bool_prefix query is the recommended way of querying a search as you
type field, which will boil down to term queries for each shingle of the input
text on the appropriate shingle field, and the final (possibly partial) term
as a term query on the prefix field. This field type also supports phrase and
phrase prefix queries however
2019-03-27 13:29:13 -07:00
Julie Tibshirani 419cf1c02f Fix an off-by-one error in the vector field dimension limit. (#40489)
Previously only vectors up to 499 dimensions were accepted, whereas the stated
limit is 500.
2019-03-27 11:17:58 -07:00
Mayya Sharipova e80284231d
Backport distance functions vectors (#39330)
Distance functions for dense and sparse vectors

Backport for #37947, #39313
2019-02-23 11:52:43 -05:00
Mayya Sharipova a30ce6a00a
Rename feature, feature_vector and feature_query (#37794)
Ranaming as follows:
feature -> rank_feature
feature_vector -> rank_features
feature query -> rank_feature query

Ranaming is done to distinguish from other vector types.

Closes #36723
2019-01-24 19:18:48 -05:00
Alexander Reelsen daa2ec8a60
Switch mapping/aggregations over to java time (#36363)
This commit moves the aggregation and mapping code from joda time to
java time. This includes field mappers, root object mappers, aggregations with date
histograms, query builders and a lot of changes within tests.

The cut-over to java time is a requirement so that we can support nanoseconds
properly in a future field mapper.

Relates #27330
2019-01-23 10:40:05 +01:00
Armin Braun 860a8a7b23
Improve Precision for scaled_float (#37169)
* Use `toString` and `Bigdecimal` parsing to get intuitive behaviour for `scaled_float` as discussed in #32570
* Closes #32570
2019-01-11 08:07:55 +01:00
Mayya Sharipova b5d532f9e3
Vector field (#33022)
1. Dense vector

PUT dindex
{
  "mappings": {
    "_doc": {
      "properties": {
        "my_vector": {
          "type": "dense_vector"
        },
        "my_text" : {
          "type" : "keyword"
        }
      }
    }
  }
}

PUT dinex/_doc/1
{
  "my_text" : "text1",
  "my_vector" : [ 0.5, 10, 6 ]
}

2. Sparse vector

PUT sindex
{
  "mappings": {
    "_doc": {
      "properties": {
        "my_vector": {
          "type": "sparse_vector"
        },
        "my_text" : {
          "type" : "keyword"
        }
      }
    }
  }
}

PUT sindex/_doc/1
{
  "my_text" : "text1",
  "my_vector" : {"1": 0.5, "99": -0.5,  "5": 1}
}
2018-12-12 21:20:53 -05:00
Julie Tibshirani 78df00ff24
Simplify the return type of FieldMapper#parse. (#32654) 2018-09-04 01:15:19 +00:00
Tanguy Leroux bf58660482
Remove all unused imports and fix CRLF (#31207)
The X-Pack opening and the recent other refactorings left a lot of 
unused imports in the codebase. This commit removes them all.
2018-06-11 15:12:12 +02:00
Adrien Grand 458bca11bc
Add a `feature_vector` field. (#31102)
This field is similar to the `feature` field but is better suited to index
sparse feature vectors. A use-case for this field could be to record topics
associated with every documents alongside a metric that quantifies how well
the topic is connected to this document, and then boost queries based on the
topics that the logged user is interested in.

Relates #27552
2018-06-07 10:05:37 +02:00
Julie Tibshirani cd0a375414
Remove unused query methods from MappedFieldType. (#30987)
* Remove MappedFieldType#nullValueQuery, as it is now unused.
* Remove MappedFieldType#queryStringTermQuery, as it is never overridden.
2018-05-31 12:47:52 -07:00
Adrien Grand 886db84ad2
Expose Lucene's FeatureField. (#30618)
Lucene has a new `FeatureField` which gives the ability to record numeric
features as term frequencies. Its main benefit is that it allows to boost
queries with the values of these features and efficiently skip non-competitive
documents at the same time using block-max WAND and indexed impacts.
2018-05-23 08:55:21 +02:00
Adrien Grand 4918924fae
Remove legacy mapping code. (#29224)
Some features have been deprecated since `6.0` like the `_parent` field or the
ability to have multiple types per index. This allows to remove quite some
code, which in-turn will hopefully make it easier to proceed with the removal
of types.
2018-04-11 09:41:37 +02:00
Adrien Grand 700d9ecc95
Remove the `update_all_types` option. (#28288)
This option is not useful in 7.x since no indices may have more than one type
anymore.
2018-01-22 12:03:07 +01:00
Jason Tedor 75c0cd0672
Move range field mapper back to core
This commit moves the range field mapper back to core so that we can
remove the compile-time dependency of percolator on mapper-extras which
compilcates dependency management for the percolator client JAR, and
modules should not be intertwined like this anyway.

Relates #27854
2017-12-17 14:27:10 -05:00
Armin Braun 3deba0ed1f #26260 Allow ip_range to accept CIDR notation (#27192)
*  #26260 Allow ip_range to accept CIDR notation

*  #26260 added non-byte-alligned cidr test cases
2017-11-03 13:34:48 -06:00
Armin Braun 8f0f024507 #27189 Fixed rounding of bounds in scaled float comparison (#27207)
*  #27189 Fixed rounding of bounds in scaled float comparison

*  #27189 more assertions from CR
2017-11-03 13:23:07 -06:00
Colin Goodheart-Smithe 99aca9cdfc
Enhances exists queries to reduce need for `_field_names` (#26930)
* Enhances exists queries to reduce need for `_field_names`

Before this change we wrote the name all the fields in a document to a `_field_names` field and then implemented exists queries as a term query on this field. The problem with this approach is that it bloats the index and also affects indexing performance.

This change adds a new method `existsQuery()` to `MappedFieldType` which is implemented by each sub-class. For most field types if doc values are available a `DocValuesFieldExistsQuery` is used, falling back to using `_field_names` if doc values are disabled. Note that only fields where no doc values are available are written to `_field_names`.

Closes #26770

* Addresses review comments

* Addresses more review comments

* implements existsQuery explicitly on every mapper

* Reinstates ability to perform term query on `_field_names`

* Added bwc depending on index created version

* Review Comments

* Skips tests that are not supported in 6.1.0

These values will need to be changed after backporting this PR to 6.x
2017-11-01 10:46:59 +00:00
Christoph Büscher 6189c54c84 Reject the `index_options` parameter for numeric fields (#26668)
Numeric fields no longer support the index_options parameter. This changes the parameter
to be rejected in numeric field types after it was deprecated in 6.0.

Closes #21475
2017-09-25 23:43:14 +02:00
Adrien Grand 93da7720ff Move non-core mappers to a module. (#26549)
Today we have all non-plugin mappers in core. I'd like to start moving those
that neither map to json datatypes nor are very frequently used like `date` or
`ip` to a module.

This commit creates a new module called `mappers-extra` and moves the
`scaled_float` and `token_count` mappers to it. I'd like to eventually move
`range` fields there but it's more complicated due to their intimate
relationship with range queries.

Relates #10368
2017-09-13 17:58:53 +02:00