Commit Graph

1581 Commits

Author SHA1 Message Date
Igor Motov e32efba3d8 Improve RecoverAfterNodes tests 2013-01-31 20:05:55 -05:00
Martijn van Groningen 5e811e5382 Another small TopChildrenQuery cleanup. 2013-01-31 23:49:32 +01:00
Martijn van Groningen 7ef65688cd - TopChildrenQuery cleanup.
- Added class level jdocs for TopChildrenQuery and ChildrenQuery.
2013-01-31 23:38:09 +01:00
Simon Willnauer 1a1df06411 Move OrdsBuilding into a dedicated class and abstract integer pools used to build sparse ordinals 2013-01-31 19:02:31 +01:00
Martijn van Groningen 1f50b07406 Initial parent/child queries cleanup. 2013-01-31 18:39:31 +01:00
Martijn van Groningen 371b071fb7 Added notion of Rewrite that replaces ScopePhase 2013-01-31 17:24:46 +01:00
Martijn van Groningen d4ef4697d5 Also remove scope from facet builders. Fixes build. 2013-01-31 16:34:45 +01:00
Martijn van Groningen 46dd42920c Remove scope support in query and facet dsl.
Remove support for the `scope` field in facets and the `_scope` field in the nested and parent/child queries. The scope support for nested queries will be replaced by the `nested` facet option and a facet filter with a nested filter. The nested filters will now support a `join` option, which controls whether to perform the block join. By default this is enabled, but when disabled it returns the nested documents as hits instead of the joined root document.

Search request with the current scope support:
```
curl -s -XPOST 'localhost:9200/products/_search' -d '{
    "query" : {
		"nested" : {
			"path" : "offers",
			"query" : {
				"match" : {
					"offers.color" : "blue"
				}
			},
			"_scope" : "my_scope"
		}
	},
	"facets" : {
		"size" : {
			"terms" : {
				"field" : "offers.size"
			},
			"scope" : "my_scope"
		}
	}
}'
```

The following is the functional equivalent of using the scope support:
```
curl -s -XPOST 'localhost:9200/products/_search?search_type=count' -d '{
    "query" : {
		"nested" : {
			"path" : "offers",
			"query" : {
				"match" : {
					"offers.color" : "blue"
				}
			}
		}
	},
	"facets" : {
		"size" : {
			"terms" : {
				"field" : "offers.size"
			},
			"facet_filter" : {
				"nested" : {
					"path" : "offers",
					"query" : {
						"match" : {
							"offers.color" : "blue"
						}
					},
					"join" : false
				}
			},
			"nested" : "offers"
		}
	}
}'
```

The scope support for parent/child queries will be replaced by running the child query as a filter in a global facet.

Search request with the current scope support:
```
curl -s -XPOST 'localhost:9200/products/_search' -d '{
	"query" : {
		"has_child" : {
			"type" : "offer",
			"query" : {
				"match" : {
					"color" : "blue"
				}
			},
			"_scope" : "my_scope"
		}
	},
	"facets" : {
		"size" : {
			"terms" : {
				"field" : "size"
			},
			"scope" : "my_scope"
		}
	}
}'
```

The following is the functional equivalent of using the scope support with parent/child queries:
```
curl -s -XPOST 'localhost:9200/products/_search' -d '{
	"query" : {
		"has_child" : {
			"type" : "offer",
			"query" : {
				"match" : {
					"color" : "blue"
				}
			}
		}
	},
	"facets" : {
		"size" : {
			"terms" : {
				"field" : "size"
			},
			"global" : true,
			"facet_filter" : {
				"term" : {
					"color" : "blue"
				}
			}
		}
	}
}'
```

Closes #2606
2013-01-31 15:09:57 +01:00
Martijn van Groningen 355381962b Use only the 'test' index, instead of all indices for child search benchmark. 2013-01-31 13:12:33 +01:00
Shay Banon 6cec73c201 remove fuzzy factor from mapping (internally implemented)
We want to support the ~ notation in the query parser for types other than strings, and we are getting there: one can now do age:10~5. We would love to support it for dates, as in timestamp:2012-10-10~5d, but that requires changes in the query parser to support strings after the ~ sign.
2013-01-31 12:23:03 +01:00
Igor Motov 8df7f2af0d Improve testReusePeerRecovery test 2013-01-30 19:51:41 -05:00
Igor Motov 29f4274213 Add index cleanup if index creation fails
Fixes #2590
2013-01-30 10:40:01 -05:00
Shay Banon 5c40c97e6e Id Cache: Allow to configure if ids should be reused (memory wise) or not, default to false
closes #2605
2013-01-30 14:42:07 +01:00
Martijn van Groningen bc20f068c9 Made `search_analyzer` updateable via put mapping api.
Closes #2604
2013-01-30 11:49:20 +01:00
Martijn van Groningen e074e00f76 Fielddata: Moved the growing logic to IntArrayRef 2013-01-30 11:20:41 +01:00
Martijn van Groningen f7692aeef2 Fielddata: IntArrayRef is initialized with small array and grows if needed 2013-01-30 10:57:52 +01:00
Simon Willnauer 5df37eaf75 add more advanced tests for phrase_prefix 2013-01-30 10:51:05 +01:00
Shay Banon f5e55b7cb9 properly print JVM version 2013-01-29 20:25:13 +01:00
Shay Banon 0568284147 reduce the memory needed while building the sparse array ordinals 2013-01-29 20:23:54 +01:00
Shay Banon 716f2aebbb add 0.20.5 2013-01-29 10:14:25 +01:00
Simon Willnauer 0697e2f23e use index prefix in tests to prevent misconfiguration 2013-01-28 15:51:06 +01:00
Simon Willnauer 72a2416a8c Support MultiPhrasePrefixQuery and MultiPhraseQuery in highlighters
Closes #2596
2013-01-28 15:41:25 +01:00
Martijn van Groningen 2e68207d6d Updated suggest api.
# Suggest feature
The suggest feature suggests similar-looking terms based on a provided text by using a suggester. At the moment the only supported suggester is `fuzzy`. The suggest feature is available from version `0.21.0`.

# Fuzzy suggester
The `fuzzy` suggester suggests terms based on edit distance. The provided suggest text is analyzed before terms are suggested. The suggested terms are provided per analyzed suggest text token. The `fuzzy` suggester doesn't take the query that is part of the request into account.

# Suggest API
The suggest request part is defined alongside the query part as a top-level field in the JSON request.

```
curl -s -XPOST 'localhost:9200/_search' -d '{
  "query" : {
    ...
  },
  "suggest" : {
    ...
  }
}'
```

Several suggestions can be specified per request. Each suggestion is identified with an arbitrary name. In the example below two suggestions are requested. Both the `my-suggest-1` and `my-suggest-2` suggestions use the `fuzzy` suggester, but have a different `text`.

```
"suggest" : {
  "my-suggest-1" : {
    "text" : "the amsterdma meetpu",
    "fuzzy" : {
      "field" : "body"
    }
  },
  "my-suggest-2" : {
    "text" : "the rottredam meetpu",
    "fuzzy" : {
      "field" : "title"
    }
  }
}
```

The suggest response example below includes the suggestion response for `my-suggest-1` and `my-suggest-2`. Each suggestion part contains entries. Each entry is effectively a token from the suggest text and contains the suggestion entry text, the original start offset and length in the suggest text and, if found, an arbitrary number of options.

```
{
  ...
  "suggest": {
    "my-suggest-1": [
      {
        "text" : "amsterdma",
        "offset": 4,
        "length": 9,
        "options": [
           ...
        ]
      },
      ...
    ],
    "my-suggest-2" : [
      ...
    ]
  }
  ...
}
```

Each options array contains option objects that include the suggested text, its document frequency and its score compared to the suggest entry text. The meaning of the score depends on the suggester used. The fuzzy suggester's score is based on the edit distance.

```
"options": [
  {
    "text": "amsterdam",
    "freq": 77,
    "score": 0.8888889
  },
  ...
]
```
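
The scores in the sample responses of this message are consistent with `1 - distance / length of the shorter string`, where a swap of two adjacent characters counts as a single edit. The Python sketch below reproduces them under that assumption; the function names are hypothetical and this is not the suggester's actual implementation.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance: insertion, deletion, substitution
    and transposition of two adjacent characters each cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


def similarity(token: str, candidate: str) -> float:
    # Assumed formula: 1 - distance / length of the shorter string.
    shortest = min(len(token), len(candidate))
    if shortest == 0:
        return 0.0
    return 1.0 - osa_distance(token, candidate) / shortest


print(round(similarity("amsterdma", "amsterdam"), 7))  # prints 0.8888889
```

Here `amsterdma` and `amsterdam` differ by one transposition over nine characters, which gives the `0.8888889` shown in the options example.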

# Global suggest text

To avoid repetition of the suggest text, it is possible to define a global text. In the example below the suggest text is defined globally and applies to the `my-suggest-1` and `my-suggest-2` suggestions.

```
"suggest" : {
  "text" : "the amsterdma meetpu",
  "my-suggest-1" : {
    "fuzzy" : {
      "field" : "title"
    }
  },
  "my-suggest-2" : {
    "fuzzy" : {
      "field" : "body"
    }
  }
}
```

In the above example the suggest text can also be specified as a suggestion-specific option. The suggest text specified on the suggestion level overrides the suggest text on the global level.
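
The override rule can be sketched in a few lines of Python. This only illustrates the precedence described above; `effective_texts` is a hypothetical helper, not part of any client library.

```python
def effective_texts(suggest: dict) -> dict:
    """Map each suggestion name to the text it would be run against."""
    global_text = suggest.get("text")
    resolved = {}
    for name, body in suggest.items():
        if name == "text":
            continue  # the global text is not itself a suggestion
        # A suggestion-level text wins over the global one.
        text = body.get("text", global_text)
        if text is None:
            raise ValueError(f"suggestion {name!r} has no suggest text")
        resolved[name] = text
    return resolved


request = {
    "text": "the amsterdma meetpu",
    "my-suggest-1": {"fuzzy": {"field": "title"}},
    "my-suggest-2": {"text": "the rottredam meetpu", "fuzzy": {"field": "body"}},
}
print(effective_texts(request))
```

With this input, `my-suggest-1` falls back to the global text while `my-suggest-2` keeps its own.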

# Other suggest example

In the example below we request suggestions for the suggest text `devloping distibutd saerch engies` on the `title` field, with a maximum of 3 suggestions per term inside the suggest text. Note that in this example we use the `count` search type. This isn't required, but it is a nice optimization. The suggestions are gathered in the `query` phase, and when we only care about suggestions (so no hits) we don't need to execute the `fetch` phase.

```
curl -s -XPOST 'localhost:9200/_search?search_type=count' -d '{
  "suggest" : {
    "my-title-suggestions-1" : {
      "text" : "devloping distibutd saerch engies",
      "fuzzy" : {
        "size" : 3,
        "field" : "title"
      }
    }
  }
}'
```

The above request could yield the response shown in the code example below. As you can see, if we take the first suggested option of each suggestion entry we get `developing distributed search engines` as the result.

```
{
  ...
  "suggest": {
    "my-title-suggestions-1": [
      {
        "text": "devloping",
        "offset": 0,
        "length": 9,
        "options": [
          {
            "text": "developing",
            "freq": 77,
            "score": 0.8888889
          },
          {
            "text": "deloping",
            "freq": 1,
            "score": 0.875
          },
          {
            "text": "deploying",
            "freq": 2,
            "score": 0.7777778
          }
        ]
      },
      {
        "text": "distibutd",
        "offset": 10,
        "length": 9,
        "options": [
          {
            "text": "distributed",
            "freq": 217,
            "score": 0.7777778
          },
          {
            "text": "disributed",
            "freq": 1,
            "score": 0.7777778
          },
          {
            "text": "distribute",
            "freq": 1,
            "score": 0.7777778
          }
        ]
      },
      {
        "text": "saerch",
        "offset": 20,
        "length": 6,
        "options": [
          {
            "text": "search",
            "freq": 1038,
            "score": 0.8333333
          },
          {
            "text": "smerch",
            "freq": 3,
            "score": 0.8333333
          },
          {
            "text": "serch",
            "freq": 2,
            "score": 0.8
          }
        ]
      },
      {
        "text": "engies",
        "offset": 27,
        "length": 6,
        "options": [
          {
            "text": "engines",
            "freq": 568,
            "score": 0.8333333
          },
          {
            "text": "engles",
            "freq": 3,
            "score": 0.8333333
          },
          {
            "text": "eggies",
            "freq": 1,
            "score": 0.8333333
          }
        ]
      }
    ]
  }
  ...
}
```
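
A client could build the corrected text from such a response by using each entry's `offset` and `length`. The Python sketch below is one possible way to do that; `apply_top_options` is a hypothetical helper and the entries are abbreviated to the first option of the sample response.

```python
def apply_top_options(text: str, entries: list) -> str:
    """Replace each suggest text token with its first option, if it has one."""
    corrected = text
    # Work from right to left so earlier offsets stay valid after a replacement.
    for entry in sorted(entries, key=lambda e: e["offset"], reverse=True):
        if not entry["options"]:
            continue  # no suggestion found, keep the original token
        start = entry["offset"]
        end = start + entry["length"]
        corrected = corrected[:start] + entry["options"][0]["text"] + corrected[end:]
    return corrected


entries = [
    {"text": "devloping", "offset": 0, "length": 9,
     "options": [{"text": "developing", "freq": 77, "score": 0.8888889}]},
    {"text": "distibutd", "offset": 10, "length": 9,
     "options": [{"text": "distributed", "freq": 217, "score": 0.7777778}]},
    {"text": "saerch", "offset": 20, "length": 6,
     "options": [{"text": "search", "freq": 1038, "score": 0.8333333}]},
    {"text": "engies", "offset": 27, "length": 6,
     "options": [{"text": "engines", "freq": 568, "score": 0.8333333}]},
]
print(apply_top_options("devloping distibutd saerch engies", entries))
# prints: developing distributed search engines
```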

# Common suggest options:
* `text` - The suggest text. The suggest text is a required option that needs to be set globally or per suggestion.

# Common fuzzy suggest options
* `field` - The field to fetch the candidate suggestions from. This is a required option that either needs to be set globally or per suggestion.
* `analyzer` - The analyzer to analyse the suggest text with. Defaults to the search analyzer of the suggest field.
* `size` - The maximum corrections to be returned per suggest text token.
* `sort` - Defines how suggestions should be sorted per suggest text term. Two possible values:
** `score` - Sort by score first, then document frequency and then the term itself.
** `frequency` - Sort by document frequency first, then similarity score and then the term itself.
* `suggest_mode` - Controls which suggestions are included, or for which suggest text terms suggestions should be suggested. Three possible values can be specified:
** `missing` - Only provide suggestions for suggest text terms that aren't in the index. This is the default.
** `popular` - Only suggest suggestions that occur in more docs than the original suggest text term.
** `always` - Suggest any matching suggestions based on terms in the suggest text.
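
The two `sort` values can be pictured as two sort keys over the options of one entry. The Python sketch below assumes that higher scores and higher frequencies come first and that terms are compared alphabetically as the last tie-breaker, which matches the order in the sample response but is otherwise an assumption; `sort_options` is a hypothetical helper.

```python
def sort_options(options: list, sort: str = "score") -> list:
    """Order the options of one suggest entry by the given sort mode."""
    if sort == "score":
        key = lambda o: (-o["score"], -o["freq"], o["text"])
    elif sort == "frequency":
        key = lambda o: (-o["freq"], -o["score"], o["text"])
    else:
        raise ValueError(f"unknown sort: {sort!r}")
    return sorted(options, key=key)


options = [
    {"text": "deploying", "freq": 2, "score": 0.7777778},
    {"text": "developing", "freq": 77, "score": 0.8888889},
    {"text": "deloping", "freq": 1, "score": 0.875},
]
print([o["text"] for o in sort_options(options, "score")])
# prints: ['developing', 'deloping', 'deploying']
print([o["text"] for o in sort_options(options, "frequency")])
# prints: ['developing', 'deploying', 'deloping']
```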

# Other fuzzy suggest options:
* `lowercase_terms` - Lowercases the suggest text terms after text analysis.
* `max_edits` - The maximum edit distance candidate suggestions can have in order to be considered as a suggestion. Can only be 1 or 2. Any other value results in a bad request error being thrown. Defaults to 2.
* `min_prefix` - The minimum number of prefix characters that must match in order to be a candidate suggestion. Defaults to 1. Increasing this number improves spellcheck performance. Usually misspellings don't occur at the beginning of terms.
* `min_query_length` - The minimum length a suggest text term must have in order to be included. Defaults to 4.
* `shard_size` - Sets the maximum number of suggestions to be retrieved from each individual shard. During the reduce phase only the top N suggestions are returned based on the `size` option. Defaults to the `size` option. Setting this to a value higher than the `size` can be useful in order to get a more accurate document frequency for spelling corrections at the cost of performance. Due to the fact that terms are partitioned amongst shards, the shard level document frequencies of spelling corrections may not be precise. Increasing this will make these document frequencies more precise.
* `max_inspections` - A factor that is multiplied with the `shard_size` in order to inspect more candidate spell corrections on the shard level. Can improve accuracy at the cost of performance. Defaults to 5.
* `threshold_frequency` - The minimal threshold in number of documents a suggestion should appear in. This can be specified as an absolute number or as a relative percentage of the number of documents. This can improve quality by only suggesting high-frequency terms. Defaults to 0f, which means it is not enabled. If a value higher than 1 is specified then the number cannot be fractional. The shard-level document frequencies are used for this option.
* `max_query_frequency` - The maximum threshold in number of documents a suggest text token can exist in, in order to be included. Can be a relative percentage (e.g. 0.4) or an absolute number to represent document frequencies. If a value higher than 1 is specified then it cannot be fractional. Defaults to 0.01f. This can be used to exclude high-frequency terms from being spellchecked. High-frequency terms are usually spelled correctly; on top of this, it also improves the spellcheck performance. The shard-level document frequencies are used for this option.
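
The absolute-or-relative convention shared by `threshold_frequency` and `max_query_frequency` can be sketched as follows. This Python snippet is an illustration only: `resolve_frequency` is a hypothetical helper, `num_docs` stands for the shard-level document count, and treating a value of exactly 1 as an absolute count is an assumption.

```python
def resolve_frequency(value: float, num_docs: int) -> float:
    """Turn a frequency option into a document count for one shard."""
    if value < 0:
        raise ValueError("frequency options cannot be negative")
    if value >= 1:
        # Absolute number of documents: fractional values are rejected.
        if value != int(value):
            raise ValueError("values above 1 must be whole numbers")
        return int(value)
    # Relative value: a fraction of the documents in the shard.
    return value * num_docs


print(resolve_frequency(3, 10000))     # absolute: 3 documents
print(resolve_frequency(0.01, 10000))  # relative: about 100 documents
```

Under this reading, the default `max_query_frequency` of 0.01f skips suggest text tokens that appear in more than roughly 1% of a shard's documents.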
2013-01-28 15:18:18 +01:00
Simon Willnauer 48488f707f Expose CommonTermsQuery in Match & MultiMatch and enable highlighting
Closes #2591
2013-01-28 11:57:05 +01:00
Shay Banon bfdf8fe590 Indexes created from an index request might not replicate the initial doc to the replica
fixes #2594
2013-01-28 11:29:32 +01:00
Shay Banon 9539661d40 move facet reduce from facet process to the actual facet
this will simplify execution, and actually let the process just be a parser (rename will probably happen)
2013-01-27 13:45:38 +01:00
Shay Banon 360d7d9425 default for paged_bytes for string type
less memory overhead, though a bit slower on the execution side for facets, and might require more memory per facet execution
2013-01-26 15:11:14 +01:00
Simon Willnauer 5c89d66216 move ShardsAllocatorModuleTests to o.e.t.integration 2013-01-25 22:26:30 +01:00
Shay Banon 41cfe9cc27 add 0.20.4 2013-01-25 22:02:34 +01:00
Shay Banon 042a5d02d9 Primary shard failure with initializing replica shards can cause the replica shard to cause allocation failures
fixes #2592
2013-01-25 17:59:01 +01:00
Simon Willnauer a7bb3c29f2 Propagate exception during recovery if segment info cannot be opened but should be 2013-01-25 15:25:48 +01:00
Shay Banon 1be84c273b eagerly reroute when a node leaves the cluster 2013-01-25 15:23:05 +01:00
Martijn van Groningen a1ef1f02cc Exposed IndexOptions#DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS setting. 2013-01-25 00:02:43 +01:00
Shay Banon 45ed9ddba7 cleanup ordinals in field data 2013-01-24 22:31:52 +01:00
Shay Banon 990acff4f7 make sure we wait for yellow status in suggest API when searching on clean index 2013-01-24 22:31:51 +01:00
Martijn van Groningen f974a17229 Removed AbstractFragmentsBuilder. Lucene's BaseFragmentsBuilder has now discrete multivalued highlighting and better support for requesting large number of fragments. 2013-01-24 22:15:07 +01:00
Martijn van Groningen e56b279624 Made BlockJoinScorer#freq() method handle freqs correctly (as is done in ToParentBlockJoinQuery) 2013-01-24 21:52:56 +01:00
Martijn van Groningen 9013eeae8a Added filter support in the `has_child` and `has_parent` filters.
Example:
```
curl -XPOST 'localhost:9200/_search' -d '{
  "query": {
    "filtered_query": {
      "query": {
        "match": {
          "title": "distributed systems"
        }
      },
      "filter": {
        "has_child": {
          "type": "tag",
          "filter": {
            "term": {
              "name": "book"
            }
          }
        }
      }
    }
  }
}'
```

Closes #2585
2013-01-24 21:32:38 +01:00
Shay Banon a39469a252 gather the field data that are changed
(we will make use of that later)
2013-01-24 15:55:23 +01:00
Martijn van Groningen 98a674fc6e Added suggest api.
# Suggest feature
The suggest feature suggests similar-looking terms based on a provided text by using a suggester. At the moment the only supported suggester is `fuzzy`. The suggest feature is available since version `0.21.0`.

# Fuzzy suggester
The `fuzzy` suggester suggests terms based on edit distance. The provided suggest text is analyzed before terms are suggested. The suggested terms are provided per analyzed suggest text token. The `fuzzy` suggester doesn't take the query that is part of the request into account.

# Suggest API
The suggest request part is defined alongside the query part as a top-level field in the JSON request.

```
curl -s -XPOST 'localhost:9200/_search' -d '{
    "query" : {
        ...
    },
    "suggest" : {
        ...
    }
}'
```

Several suggestions can be specified per request. Each suggestion is identified with an arbitrary name. In the example below two suggestions are requested. The `my-suggest-1` suggestion uses the `body` field and `my-suggest-2` uses the `title` field. The `type` field is a required field and defines what suggester to use for a suggestion.

```
"suggest" : {
    "suggestions" : {
        "my-suggest-1" : {
            "type" : "fuzzy",
            "field" : "body",
            "text" : "the amsterdma meetpu"
        },
        "my-suggest-2" : {
            "type" : "fuzzy",
            "field" : "title",
            "text" : "the rottredam meetpu"
        }
    }
}
```

The suggest response example below includes the suggestions part for `my-suggest-1` and `my-suggest-2`. Each suggestion part contains a terms array that contains all terms output by analyzing the suggest text. Each term object includes the term itself, the original start and end offset in the suggest text and, if found, an arbitrary number of suggestions.

```
{
    ...
    "suggest": {
        "my-suggest-1": {
            "terms" : [
              {
                "term" : "amsterdma",
                "start_offset": 5,
                "end_offset": 14,
                "suggestions": [
                   ...
                ]
              }
              ...
            ]
        },
        "my-suggest-2" : {
          "terms" : [
            ...
          ]
        }
    }
    ...
}
```

Each suggestions array contains suggestion objects that include the suggested term, its document frequency and its score compared to the suggest text term. The meaning of the score depends on the suggester used. The fuzzy suggester's score is based on the edit distance.

```
"suggestions": [
    {
        "term": "amsterdam",
        "frequency": 77,
        "score": 0.8888889
    },
    ...
]
```

# Global suggest text

To avoid repetition of the suggest text, it is possible to define a global text. In the example below the suggest text is a global option and applies to the `my-suggest-1` and `my-suggest-2` suggestions.

```
"suggest" : {
    "suggestions" : {
        "text" : "the amsterdma meetpu",
        "my-suggest-1" : {
            "type" : "fuzzy",
            "field" : "title"
        },
        "my-suggest-2" : {
            "type" : "fuzzy",
            "field" : "body"
        }
    }
}
```

The suggest text can be specified as a global option or as a suggestion-specific option. The suggest text specified on the suggestion level overrides the suggest text on the global level.

# Other suggest example

In the example below we request suggestions for the suggest text `devloping distibutd saerch engies` on the `title` field, with a maximum of 3 suggestions per term inside the suggest text. Note that in this example we use the `count` search type. This isn't required, but it is a nice optimization. The suggestions are gathered in the `query` phase, and when we only care about suggestions (so no hits) we don't need to execute the `fetch` phase.

```
curl -s -XPOST 'localhost:9200/_search?search_type=count' -d '{
  "suggest" : {
      "suggestions" : {
        "my-title-suggestions" : {
          "suggester" : "fuzzy",
          "field" : "title",
          "text" : "devloping distibutd saerch engies",
          "size" : 3
        }
      }
  }
}'
```

The above request could yield the response shown in the code example below. As you can see, if we take the first suggested term of each suggest text term we get `developing distributed search engines` as the result.

```
{
  ...
  "suggest": {
    "my-title-suggestions": {
      "terms": [
        {
          "term": "devloping",
          "start_offset": 0,
          "end_offset": 9,
          "suggestions": [
            {
              "term": "developing",
              "frequency": 77,
              "score": 0.8888889
            },
            {
              "term": "deloping",
              "frequency": 1,
              "score": 0.875
            },
            {
              "term": "deploying",
              "frequency": 2,
              "score": 0.7777778
            }
          ]
        },
        {
          "term": "distibutd",
          "start_offset": 10,
          "end_offset": 19,
          "suggestions": [
            {
              "term": "distributed",
              "frequency": 217,
              "score": 0.7777778
            },
            {
              "term": "disributed",
              "frequency": 1,
              "score": 0.7777778
            },
            {
              "term": "distribute",
              "frequency": 1,
              "score": 0.7777778
            }
          ]
        },
        {
          "term": "saerch",
          "start_offset": 20,
          "end_offset": 26,
          "suggestions": [
            {
              "term": "search",
              "frequency": 1038,
              "score": 0.8333333
            },
            {
              "term": "smerch",
              "frequency": 3,
              "score": 0.8333333
            },
            {
              "term": "serch",
              "frequency": 2,
              "score": 0.8
            }
          ]
        },
        {
          "term": "engies",
          "start_offset": 27,
          "end_offset": 33,
          "suggestions": [
            {
              "term": "engines",
              "frequency": 568,
              "score": 0.8333333
            },
            {
              "term": "engles",
              "frequency": 3,
              "score": 0.8333333
            },
            {
              "term": "eggies",
              "frequency": 1,
              "score": 0.8333333
            }
          ]
        }
      ]
    }
  }
  ...
}
```

# Common suggest options:
* `suggester` - The suggester implementation type. The only supported value is 'fuzzy'. This is a required option.
* `text` - The suggest text. The suggest text is a required option that needs to be set globally or per suggestion.

# Common fuzzy suggest options
* `field` - The field to fetch the candidate suggestions from. This is a required option that either needs to be set globally or per suggestion.
* `analyzer` - The analyzer to analyse the suggest text with. Defaults to the search analyzer of the suggest field.
* `size` - The maximum corrections to be returned per suggest text token.
* `sort` - Defines how suggestions should be sorted per suggest text term. Two possible values:
** `score` - Sort by score first, then document frequency and then the term itself.
** `frequency` - Sort by document frequency first, then similarity score and then the term itself.
* `suggest_mode` - Controls which suggestions are included, or for which suggest text terms suggestions should be suggested. Three possible values can be specified:
** `missing` - Only provide suggestions for suggest text terms that aren't in the index. This is the default.
** `popular` - Only suggest suggestions that occur in more docs than the original suggest text term.
** `always` - Suggest any matching suggestions based on terms in the suggest text.

# Other fuzzy suggest options:
* `lowercase_terms` - Lowercases the suggest text terms after text analysis.
* `max_edits` - The maximum edit distance candidate suggestions can have in order to be considered as a suggestion. Can only be 1 or 2. Any other value results in a bad request error being thrown. Defaults to 2.
* `min_prefix` - The minimum number of prefix characters that must match in order to be a candidate suggestion. Defaults to 1. Increasing this number improves spellcheck performance. Usually misspellings don't occur at the beginning of terms.
* `min_query_length` - The minimum length a suggest text term must have in order to be included. Defaults to 4.
* `shard_size` - Sets the maximum number of suggestions to be retrieved from each individual shard. During the reduce phase only the top N suggestions are returned based on the `size` option. Defaults to the `size` option. Setting this to a value higher than the `size` can be useful in order to get a more accurate document frequency for spelling corrections at the cost of performance. Due to the fact that terms are partitioned amongst shards, the shard level document frequencies of spelling corrections may not be precise. Increasing this will make these document frequencies more precise.
* `max_inspections` - A factor that is multiplied with the `shard_size` in order to inspect more candidate spell corrections on the shard level. Can improve accuracy at the cost of performance. Defaults to 5.
* `threshold_frequency` - The minimal threshold in number of documents a suggestion should appear in. This can be specified as an absolute number or as a relative percentage of the number of documents. This can improve quality by only suggesting high-frequency terms. Defaults to 0f, which means it is not enabled. If a value higher than 1 is specified then the number cannot be fractional. The shard-level document frequencies are used for this option.
* `max_query_frequency` - The maximum threshold in number of documents a suggest text token can exist in, in order to be included. Can be a relative percentage (e.g. 0.4) or an absolute number to represent document frequencies. If a value higher than 1 is specified then it cannot be fractional. Defaults to 0.01f. This can be used to exclude high-frequency terms from being spellchecked. High-frequency terms are usually spelled correctly; on top of this, it also improves the spellcheck performance. The shard-level document frequencies are used for this option.

Closes #2585
2013-01-24 15:41:06 +01:00
Shay Banon 9673a1c366 expose field data settings in mapping, they can be updated using merge mapping 2013-01-24 15:33:24 +01:00
Simon Willnauer 4eefcb9c82 Expose CommonTermsQuery
Closes #2583
2013-01-24 14:18:01 +01:00
Simon Willnauer c4eab90b2e Cleanup MatchQuery 2013-01-24 14:11:56 +01:00
Shay Banon c2f35621f6 allow to get settings as delimited string 2013-01-24 12:03:16 +01:00
Shay Banon b143822bac allow to load settings from delimited string 2013-01-24 12:00:14 +01:00
Simon Willnauer 88f68264c7 Reuse MemoryIndex instances across Percolator requests.
* added configurable MemoryIndexPool that pools MemoryIndex instance across Threads
* Pool can be configured based on the number of pooled instances as well as the maximum number of bytes that is reused across the pooled instances

Closes #2581
2013-01-24 11:53:21 +01:00
Shay Banon e8c1180ede add field data stats 2013-01-24 11:38:18 +01:00
Shay Banon 613b746299 move field data type to simply be type and settings 2013-01-24 09:33:16 +01:00
Martijn van Groningen 50ac477d92 Fixed small bug. Index name should be used to lookup entry. 2013-01-23 23:53:20 +01:00
Shay Banon 4967a97faf don't use private since its accessed from inner class, remove $$ need 2013-01-23 22:17:27 +01:00
Martijn van Groningen 346422b747 Added sparse multi ordinals implementation for field data. 2013-01-23 22:11:31 +01:00
Daniel Muller 9e79f54cb1 Check for java-6-openjdk-amd64 2013-01-23 18:34:37 +01:00
synhershko e0f711a94a Updating Lucene version 2013-01-23 16:18:18 +02:00
Shay Banon a74e7f8099 refactor geo to extract common classes 2013-01-23 14:14:21 +01:00
Simon Willnauer 9c729fad2c remove flush check IW#commit always adds a commit point now even if nothing has changed ie. docs are added, updated or deleted. 2013-01-23 14:06:01 +01:00
Shay Banon 22f0e79a84 use merge trigger to control when to do merges
now with merge trigger, we can simply decide when to do merges based on it
2013-01-23 13:24:20 +01:00
Shay Banon d969e61999 Remove settings option for index store compression, compression is always enabled
closes #2577
2013-01-23 13:11:48 +01:00
Simon Willnauer 2880cd0172 Upgrade to Lucene 4.1
* Removed CustomMemoryIndex in favor of MemoryIndex, which as of 4.1 supports adding the same field twice
* Replaced duplicated logic in X[*]FSDirectory for rate limiting with a RateLimitedFSDirectory wrapper
* Remove hacks to find out merge context in rate limiting in favor of IOContext
* replaced Scorer#freq() return type (from float to int)
* Upgraded FVHighlighter to new 'centered' highlighting
* Fixed RobinEngine to use separate setCommitData
2013-01-23 11:54:11 +01:00
Shay Banon 20f43bf54c add hasSingleArrayBackingStorage
allow for optimization only when there really is a single array, and not when there is a multi dimensional one
2013-01-23 10:24:43 +01:00
Igor Motov bbfd3957eb Improve stability of the testNodesInfos test 2013-01-22 12:29:38 -05:00
Igor Motov 9becdb814a Improve stability of the shardsCleanup test 2013-01-22 10:20:18 -05:00
Shay Banon c295211a85 final move to new field data 2013-01-22 16:16:33 +01:00
Shay Banon 27bfb341ff better logging on missing format, and allow to configure format on a type on the index level 2013-01-22 16:16:33 +01:00
uboness 09cc70b8c9 added predefined empty implementation for all atomic field datas 2013-01-22 16:16:33 +01:00
Shay Banon 6b92b592b4 allow to clear by reader the new field data cache 2013-01-22 16:16:32 +01:00
Shay Banon c67386f644 properly invalidate on core closed reader 2013-01-22 16:16:32 +01:00
Shay Banon af757fd821 more usage of field data
note, removed field data from cache stats, it will have its own stats later on (cache part is really misleading)
2013-01-22 16:16:32 +01:00
Shay Banon de013babf8 move geo filters and numeric range to use new field data 2013-01-22 16:16:32 +01:00
Shay Banon be1e5becbb move scripts to use new field data 2013-01-22 16:16:32 +01:00
Shay Banon 772ee9db54 move terms to use new field data 2013-01-22 16:16:32 +01:00
Shay Banon e5b651321f remove some safe methods because of the new makeSafe method usage 2013-01-22 16:16:32 +01:00
Shay Banon f189a832c5 grr pages -> paged 2013-01-22 16:16:32 +01:00
Shay Banon 5b7173fc35 move sorting to work with new field data 2013-01-22 16:16:32 +01:00
uboness b739bf97d4 added missing dedicated value comparators for the different indices field data 2013-01-22 16:16:32 +01:00
Shay Banon 45f27fe96a add packed bytes variant for strings/bytes 2013-01-22 16:16:32 +01:00
uboness 855b64a8a7 byte field data implementation 2013-01-22 16:16:31 +01:00
uboness f1f3c241fd short field data implementation 2013-01-22 16:16:31 +01:00
uboness 3840439365 float field data implementation 2013-01-22 16:16:31 +01:00
Shay Banon 9137fcc6fc move geo distance sorting to use new field data 2013-01-22 16:16:31 +01:00
Shay Banon d5e70a27df integer type to support int field data type 2013-01-22 16:16:31 +01:00
uboness fc09ce7ac9 Implemented int field data 2013-01-22 16:16:31 +01:00
Shay Banon d82859c82b geo point new field mapper with geo distance facet based impl 2013-01-22 16:16:31 +01:00
Shay Banon 2e86081f7b use smartNameMapper on context 2013-01-22 16:16:31 +01:00
Shay Banon d88e3f73ac add specific makeSafe method to make an unsafe (shared) bytes based value to a "safe" one 2013-01-22 16:16:31 +01:00
Shay Banon 1765b0b813 date histogram to use new field data 2013-01-22 16:16:31 +01:00
Shay Banon 37acba1b57 terms stats to use new field data 2013-01-22 16:16:31 +01:00
Shay Banon f1f86efed5 move statistical facet to use new field data 2013-01-22 16:16:30 +01:00
Shay Banon 699ff2782e move histogram facet to use new field data 2013-01-22 16:16:30 +01:00
Shay Banon 8c7e0f5ca1 fix getOrds on single array ords 2013-01-22 16:16:30 +01:00
Shay Banon fa363b2dca move range facet to use new field data abstraction 2013-01-22 16:16:30 +01:00
Shay Banon 692413862a add clear when deleting an index for the field data service 2013-01-22 16:16:30 +01:00
Shay Banon a39ca58de9 add field data service to index level services 2013-01-22 16:16:30 +01:00
Shay Banon 2d91939253 add initial field data type support to mappers
hardwired and still happily lives with current field data impl
2013-01-22 16:16:30 +01:00
Shay Banon e0b280f9b3 use FieldMapper.Names for fieldNames, and not just fieldName as string 2013-01-22 16:16:30 +01:00
Shay Banon 7dc5cf9799 add long field support 2013-01-22 16:16:30 +01:00
Shay Banon 7397007e05 initial commit 2013-01-22 16:16:30 +01:00
Clinton Gormley 7cfdd9ef59 Corrected filter strategy option in FilteredQueryParser
Changed from 'query_filter' to 'query_first'
2013-01-22 12:54:00 +01:00
Simon Willnauer 0b730aae81 Pass on filterStrategy in XFilteredQuery if query is rewritten 2013-01-22 12:40:21 +01:00
Martijn van Groningen a5bd57ed6c Added trace log statement, to catch stacktraces 2013-01-20 23:17:18 +01:00
Simon Willnauer 35cf9ee11d wait for cluster to be formed in SimpleNodesInfoTests 2013-01-19 15:44:26 +01:00
Simon Willnauer d6b613ac8c Respect lowercase_expanded_terms in MappingQueryParser
Fixes #2566
2013-01-19 13:57:45 +01:00
Simon Willnauer 31fd521fd1 provide more information if a null DocumentMapper is returned 2013-01-18 16:43:56 +01:00
Simon Willnauer c563248f76 testMoreLikeThisIssue2197 should create index mapping first to prevent races 2013-01-18 16:41:37 +01:00
Simon Willnauer 6f38a3a8a8 create index and mapping first to ensure all relevant nodes see the mapping 2013-01-18 16:09:24 +01:00
Simon Willnauer 393de984bd Remove deprecated StreamInput/Output#read/writeUTF 2013-01-17 22:38:42 +01:00
Simon Willnauer d37c844da0 use camelcase for getters 2013-01-17 22:27:44 +01:00
Simon Willnauer 3d80c53192 Allow ShardsAllocator to be configured via node level settings.
* Default ShardsAllocator is set to BalancedShardsAllocator
* Core ShardsAllocator implementations can be defined via 'cluster.routing.allocation.type'
* Core ShardsAllocator implementations are exposed via short keys 'balanced' (BalancedShardsAllocator) and 'even_shards' (EvenShardsCountAllocator)
* Third party allocators can be loaded via fully-qualified class names.
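
The lookup described in the list above can be sketched roughly as follows. This is an illustrative Python model, not the actual Java implementation: `resolve_allocator` and the plain dict of settings are hypothetical, and only the setting name, the short keys and the class names come from this message.

```python
# Hypothetical model of the allocator lookup; only the setting key, the
# short keys and the class names are taken from the commit message.
SHORT_KEYS = {
    "balanced": "BalancedShardsAllocator",
    "even_shards": "EvenShardsCountAllocator",
}
DEFAULT_KEY = "balanced"  # BalancedShardsAllocator is the stated default


def resolve_allocator(settings):
    """Return the allocator class name selected by node-level settings."""
    value = settings.get("cluster.routing.allocation.type", DEFAULT_KEY)
    # Short keys map to core implementations; any other value is treated
    # as a fully-qualified class name supplied by a third party.
    return SHORT_KEYS.get(value, value)


print(resolve_allocator({"cluster.routing.allocation.type": "even_shards"}))
```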

Closes #2557
2013-01-17 16:23:52 +01:00
Simon Willnauer 2eb09e6b1a Added BalancedShardsAllocator that balances shards based on a weight function.
* Weights are calculated per index and incorporate index level, global and primary related parameters
* Balance operations are executed based on a win maximization strategy that first tries to relocate the shards
  that offer the biggest gain towards the weight function's optimum
* The WeightFunction allows settings to prefer index based balance over global balance and vice versa
* Balance operations can be throttled by raising a threshold, resulting in less aggressive balance operations
* WeightFunction ships with defaults to achieve evenly distributed indexes while maintaining a global balance
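
As a rough illustration of the idea (not the actual formula used by BalancedShardsAllocator), a weight function that trades off per-index balance against global balance, together with a threshold check, might look like the Python sketch below. The factor names, the 0.5 defaults and the threshold of 1.0 are assumptions made for the example, and primary related parameters are left out.

```python
from dataclasses import dataclass


@dataclass
class WeightFunction:
    # Illustrative factors; the real defaults are not stated in this message.
    index_balance: float = 0.5  # preference for per-index balance
    shard_balance: float = 0.5  # preference for global balance

    def weight(self, node_total, avg_total, node_index, avg_index):
        """Weight of a node for one index: positive means overloaded."""
        total = self.index_balance + self.shard_balance
        theta_index = self.index_balance / total
        theta_shard = self.shard_balance / total
        return (theta_shard * (node_total - avg_total)
                + theta_index * (node_index - avg_index))


def should_relocate(fn, heavy, light, avg_total, avg_index, threshold=1.0):
    """heavy and light are (total shards on node, shards of this index on node).

    A shard is only moved when the weight difference exceeds the threshold,
    so raising the threshold makes balancing less aggressive.
    """
    delta = (fn.weight(heavy[0], avg_total, heavy[1], avg_index)
             - fn.weight(light[0], avg_total, light[1], avg_index))
    return delta > threshold


fn = WeightFunction()
print(should_relocate(fn, heavy=(10, 4), light=(2, 0), avg_total=6, avg_index=2))
```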

Closes #2555
2013-01-17 12:02:42 +01:00
Igor Motov d97839b8a8 Fix char filter issues introduced during lucene 4 migration
Fixes #2543
2013-01-14 12:43:02 -05:00
Igor Motov e82f96f1e5 Make script cache configurable and bounded
Fixes #2539
2013-01-14 06:57:13 -05:00
Igor Motov 6243f8e64d Disallow unknown custom indexing parameters
Fixes #2354
2013-01-11 10:14:25 -05:00
Martijn van Groningen 1ce10dfb06 Fixed issue where parent & child queries can fail if a segment doesn't have documents with the targeted type or associated parent type
Closes #2537
2013-01-11 16:06:14 +01:00
Martijn van Groningen 43aabe88e8 Fixed document already exists error when concurrently sending update request with upsert using the same id.
Closes #2530
2013-01-10 14:25:44 +01:00
Shay Banon 6f7253c524 Comments are not allowed in mapping
checked jackson, there won't be an overhead in enabling comments. Added, with the caveat that when used with mappings, and calling "get mapping", the comments will not be returned
closes #1394
2013-01-07 04:21:41 +01:00
Shay Banon 2c4b9d9ba2 cleanup queryHint since it was never used
preference ended up as the way to control routing
2013-01-07 04:02:45 +01:00
Shay Banon bcdda811ef add read/write optional text 2013-01-07 02:54:22 +01:00
Shay Banon 0e5287f1f2 Binary Mapped Fields: Allow to not store them by default, and return BytesReference
fixes #2523
also, fix another point of normalization of the result for get API
2013-01-05 01:50:46 +01:00
Shay Banon 4b9fcdb900 normalize the value even when getting it from source
we need to in order to properly handle bytes, and to normalize Integer to Long, for example, for consistency; the fact that mappers now handle different Objects helps here
2013-01-04 23:55:34 +01:00
Shay Banon fe38cecabd return the string for date types if passed for search 2013-01-04 23:31:49 +01:00
Shay Banon bf4c442509 add refresh before calling count 2013-01-04 08:38:00 +01:00
Shay Banon 70f1e2c987 remove problematic timeout test
the timeout feature, even if set to 0, might still mean we get an ack back...
2013-01-04 08:19:01 +01:00
Shay Banon e70cf1849b check on sourceRef, so it won't convert to bytes without needing to 2013-01-04 07:47:26 +01:00
Igor Motov d73a6663b7 Changing non-nested object mapping to nested should fail
Fixes #2518
2013-01-03 18:18:40 -05:00
uboness dc25939b7c fixed hunspell test to clean up properly, this time, for realz 2013-01-03 22:54:37 +01:00
uboness 86f55b3a45 fixed hunspell test to clean up properly 2013-01-03 22:12:36 +01:00
Martijn van Groningen 7cf80aca99 Changed how the stored values of numeric fields are stored in the index. Previously numeric values were stored in binary representation; now they are stored in numeric representation. 2013-01-03 21:34:53 +01:00
Shay Banon 0032b334c5 add simple way to get the index creation version when building a mapper 2013-01-03 19:06:19 +01:00
uboness 9f60dc7578 Support for root logger level runtime update
Closes: #2517
2013-01-03 01:23:40 +01:00
uboness 6c4108b38a Support for hunspell token filter
Closes: #646

- Introduced HunspellService which holds a repository of hunspell dictionaries
- It is possible to register a dictionary via a plugin or by placing the dictionary files on the file system
2013-01-02 03:51:26 +01:00
Shay Banon 720feca3c5 optimize search hit to use Text for type and id
this will reduce string serialization overhead and speed up xcontent (json) generation
2012-12-31 00:13:15 -08:00
Shay Banon 120b766f0a cleanup 2012-12-30 17:39:19 -08:00
Shay Banon 4a5b147634 remove unused method 2012-12-30 17:37:46 -08:00
Shay Banon 22077d1c5f move regex to use Object as parameter as well 2012-12-30 17:36:48 -08:00
Shay Banon f8a08a46ac cleanup more calls to Term with String value 2012-12-30 17:31:13 -08:00
Shay Banon 1c93c8dfb8 cleanup unused term factory 2012-12-30 16:55:22 -08:00
Shay Banon 7f0034d42f fix thread pool stats largest 2012-12-30 01:57:18 -08:00
Shay Banon b6f766af3f backport lucene 4.1 terms filter and use it where applicable 2012-12-29 10:39:53 -08:00
Shay Banon b08e8fb76c add explicit termsFilter in mapper, and use that in terms filter
This also enabled support for terms filter on _id field for example
2012-12-29 01:06:32 -08:00
Shay Banon 2655c2aa58 fix index xcontent flag with id 2012-12-29 00:46:35 -08:00
Shay Banon 9a8d558e51 use object parser value for queries that support it 2012-12-29 00:14:46 -08:00
Shay Banon fd5719b232 support multiple values when mappers construct queries
this will allow us to optimize parsing using actual values (numbers/bytes)
2012-12-28 23:38:43 -08:00
Shay Banon 01ba287164 more mapper simplification, reduce the value methods 2012-12-28 20:33:10 -08:00
Shay Banon e02015c641 Use FieldType and not deprecated Field construction 2012-12-28 14:27:09 -08:00
Shay Banon 64a01c28c3 rename fieldQuery/fieldFilter to termQuery/termFilter in mappers 2012-12-28 13:48:48 -08:00
Shay Banon 12239169b1 add 0.20.3 2012-12-27 14:30:40 -08:00
Shay Banon 7fb98769a6 add a sleep to make sure settings are applied 2012-12-27 14:16:30 -08:00
Igor Motov b7ff23ff93 Update settings: Allow to dynamically update thread pool settings
Closes #2509
2012-12-27 09:39:27 -05:00
Shay Banon 6ef0e4ddda fix type to types 2012-12-26 16:31:55 -08:00
Shay Banon 660d0ceba9 cleanup 2012-12-26 16:18:57 -08:00
Simon Willnauer 750c30f0b8 allow index and type to be specified as arrays in MultiSearchRequest 2012-12-26 16:18:17 -08:00
Dave Brosius 6342beeeb0 fix copy/paste bug where null stopWords is passed causing an NPE 2012-12-26 16:16:37 -08:00
Shay Banon ef55e4feec fix failed tests due to wrapping failures with mapping parsing exception 2012-12-26 15:59:07 -08:00
Shay Banon 7a0404ac35 optimize filtered query with match_all filter
simply just use the query in that case, and don't add the filter overhead
2012-12-26 15:47:53 -08:00
Shay Banon 8a17222ff2 match_all filter with empty array (instead of obj) fires exception when used with facets
fixes #2493
2012-12-26 15:35:30 -08:00
Shay Banon 2a57e7cd4b ShardSearchFailure handling of exception does not take actual into account for status
fixes #2495
2012-12-26 15:00:49 -08:00
Shay Banon 2f4b759df7 Allow highlighting on wildcard fields.. ie, comment_*
closes #2396
2012-12-26 15:00:31 -08:00
Shay Banon 4b69846ba2 cleanup 2012-12-26 14:21:04 -08:00
Simon Willnauer 90bd82ac50 Pass topScorer=false to sub-scorers if a scorer is wrapped. Wrapped BooleanQuery can return collect-only scorers. See #2505 2012-12-26 14:20:14 -08:00
Shay Banon 2449049a84 Plugins Installer: Allow to download plugins from download.elasticsearch.org
closes #2507.
2012-12-26 14:16:35 -08:00
Martijn van Groningen c93babed42 Minor changes to the parent / child benchmark. 2012-12-24 22:12:10 +01:00
Martijn van Groningen c6aaefa27f Improved explain support for nested query.
Closes #2503
2012-12-24 13:40:20 +01:00
Martijn van Groningen d57d89937f Added scoring support to `has_child` and `has_parent` queries.
Added score support to `has_child` and `has_parent` queries. Both queries support a score_type option. The has_child query supports the same options as the top_children query, plus the none option, which is the default and yields the current behaviour. The has_parent query supports the score type options score and none; the latter is the default and yields the current behaviour.

If the score_type is set to a value other than none, the has_parent query maps the matched parent score onto the related child documents. The has_child query, in turn, maps the scores of the matched child documents onto the related parent document. The score_type on both queries defines how the child document scores are mapped onto the parent documents. Both queries are executed in two phases. The first phase collects the parent uid values of matching documents, with an aggregated score per parent uid value. In the second phase, either child or parent typed documents that have the same parent uid value as found during the first phase are emitted as hits. The score computed in the first phase is used as the score.
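
The two phases can be pictured with a small Python model of the has_child case. This is only a sketch: the function, the tuple-based hits, the aggregation names `max`, `sum` and `avg`, and the constant score of 1.0 for `none` are assumptions for illustration, not the actual query implementation.

```python
from collections import defaultdict


def has_child(child_hits, parent_uids, score_type="none"):
    """Toy two-phase has_child.

    child_hits: list of (parent_uid, score) for matching child documents.
    parent_uids: iterable of candidate parent document uids.
    """
    # Phase 1: collect parent uids of matching children, with their scores.
    per_parent = defaultdict(list)
    for parent_uid, score in child_hits:
        per_parent[parent_uid].append(score)

    # Assumed aggregation modes, used only for this illustration.
    aggregate = {
        "max": max,
        "sum": sum,
        "avg": lambda scores: sum(scores) / len(scores),
    }

    # Phase 2: emit parent documents whose uid was seen in phase 1.
    results = []
    for uid in parent_uids:
        if uid in per_parent:
            if score_type == "none":
                score = 1.0  # assumed constant score when scoring is off
            else:
                score = aggregate[score_type](per_parent[uid])
            results.append((uid, score))
    return results


hits = [("p1", 1.0), ("p1", 3.0), ("p2", 2.0)]
print(has_child(hits, ["p1", "p2", "p3"], score_type="sum"))
```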

Closes #2502
2012-12-24 11:39:43 +01:00
Shay Banon bb9c7172b0 add mappings even on failure to parse
since we add them internally to the compound mappers, we need to publish the fact, otherwise, for example, the codec won't find the relevant one based on the global mapper service
2012-12-24 00:04:29 -08:00
Igor Motov fcdc36977c Fix failure message serialization in MultiSearchResponse
Fixes #2498
2012-12-21 19:26:48 -05:00
Martijn van Groningen 08b026d060 Fixed `top_children` query failure with dfs_query* search types.
Fixed error with the top_children query when `DFS_QUERY_*` is used as search_type and it wraps a query that gets rewritten (e.g. a wildcard query).

Closes #2501
2012-12-21 18:08:44 +01:00
Martijn van Groningen c17755d164 Removed unused code. 2012-12-21 16:54:14 +01:00
Martijn van Groningen 694989141b Fixed AOBE when using `top_children` in a must not clause.
Closes #2500
2012-12-21 16:47:03 +01:00
Martijn van Groningen 826a6ab02a Improved XBooleanFilter by adding drive logic for bit based filter impl and adding unit test, which tests all possible XBooleanFilter options. 2012-12-19 22:43:47 +01:00
Shay Banon 14678a91ab nested path to be represented as bytes as well as string 2012-12-18 15:39:13 -08:00
Shay Banon 2950799243 fix creating uid to bytes 2012-12-18 14:39:25 -08:00
Shay Banon ac253178bd more cleanup in mappings 2012-12-18 13:28:36 -08:00
Shay Banon 1867ef5084 simplify toXContent generation of field mappers 2012-12-18 13:00:47 -08:00
Shay Banon f3dbe9224a rename analyzed to tokenized to match field type 2012-12-18 11:34:41 -08:00
Shay Banon b9c5f0472c indexedTerm to return bytes
part of the effort to reduce conversion from string to types
2012-12-18 10:58:03 -08:00
Shay Banon 1cb531f000 remove unused code 2012-12-18 10:26:52 -08:00
Shay Banon e41def7a59 remove unused code 2012-12-18 10:22:37 -08:00
Shay Banon f01ce61f71 parse to bytes 2012-12-18 10:20:55 -08:00
Martijn van Groningen afd998c482 Improved the size computation in StringFieldData#computeSizeInBytes() 2012-12-18 09:51:01 +01:00
Martijn van Groningen ddea22771e Fixed mlt api bug related to custom routing value.
If the routing value isn't id based, the get part of the mlt request couldn't retrieve the document for the second part of the mlt request, and a 500 code was returned instead. This fix addresses this issue.

Closes #2489
2012-12-17 11:00:30 +01:00
uboness 8b74c42099 Support for RegexpQuery & RegexpFilter
- Added "regexp" query type (based on Lucene 4 RegexpQuery)
- Added "regexp" filter type
- Fixed a bug in IdFieldMapper where prefixQuery on a single type would be redundantly wrapped in a boolean query
2012-12-16 23:24:18 +01:00
Shay Banon 5a6004a168 First indexing of a dynamic boolean field can cause it not to be indexed correctly
fixes #2487
2012-12-15 19:04:10 -08:00
Igor Motov c8285739d2 Correctly parse *:* query into matchAllDocsQuery
Fixes #2486
2012-12-14 14:36:20 -08:00
Martijn van Groningen 148dc3c013 Added Lucene 4.1 todo 2012-12-14 16:08:37 +01:00
Shay Banon c65d5a77c4 reuse non analyzed token stream for string types
so heavyweight token stream won't be created each time
2012-12-12 22:53:48 -08:00
Shay Banon fc35fd8a29 improve fields iteration trying to find customer valued analyzer 2012-12-12 22:08:54 -08:00
Shay Banon 36fd76b826 don't call toLowerCase on each bulk item 2012-12-12 21:56:23 -08:00
Shay Banon 32bf7607c7 optimize boolean filter to use bits driven by result bitset 2012-12-12 20:12:51 -08:00
Shay Banon 4778d5c2eb optimize boolean filter for one clause case 2012-12-12 16:32:22 -08:00
Shay Banon 8d0d288a1c add 0.20 versions 2012-12-08 01:37:21 +01:00
Alex Lambert 635438e7d1 restore deleted plugin path modification 2012-12-08 01:15:17 +01:00
Shay Banon e021904250 use the 0.20.0 version 2012-12-07 23:15:44 +01:00
Shay Banon 4dec14d5da Wildcard query on non existent field matches all documents
fixes #2461
2012-12-07 19:37:26 +01:00
Martijn van Groningen ea9a4d70cf lucene 4: Removed the usage of Document & Field when retrieving stored fields. 2012-12-06 18:18:52 +01:00
Igor Motov d947dfde2b Add support for ignoring settings in system properties.
An elasticsearch node can be instructed to ignore settings specified in system properties by setting config.ignore_system_properties to true.
2012-12-06 09:37:36 -05:00
Martijn van Groningen 591a76bd88 Changed es version from string to class Version. 2012-12-06 15:37:03 +01:00
Martijn van Groningen 966fdfdfb8 Changed es version from string to class Version 2012-12-06 15:35:30 +01:00
Martijn van Groningen 22f99e848f Expose es version in node info api.
Closes #2466
2012-12-06 15:22:44 +01:00
Martijn van Groningen f72d5c1907 Expose fragmenter option for plain / normal highlighter.
Closes #2465
2012-12-06 14:59:42 +01:00
Shay Banon c2f8ee105b add a marker CachedFilter
this allows to easily and globally check if we cache a filter or not, all filter caching uses this marker interface
2012-12-06 10:13:47 +01:00
Shay Banon 2786e29a10 expose filter strategy in filtered query 2012-12-06 02:20:09 +01:00
Shay Banon c22b521800 fix properly handling acceptDocs in filters
our idea is to apply it on the "filtered/constant" level, and not on compound filters, so we won't apply it multiple times. The solution is a bit conservative for now; we can further optimize it in the future, for example, by not wrapping it when no caching is done within the filter chain
2012-12-06 01:55:16 +01:00
Shay Banon 5a226cde8e add 0.19.13 2012-12-04 15:41:38 +01:00
Shay Banon c36638d159 not delete filter improvements
- don't check for null on liveDocs, since we know they are not null with the check for hasDeletion
- improve iteration over liveDocs vs. innerSet, prefer to iterate over the faster one
2012-12-04 02:00:36 +01:00
Martijn van Groningen 6cfd938dce Fixed unable to highlight on all multi-valued field values.
Closes #2384
2012-12-03 12:39:18 +01:00
Shay Banon f17ad829ac remove snappy support
relates to #2459
2012-12-03 12:30:13 +01:00
Shay Banon 677e6ce4ef Deprecate Shared Gateway
closes #2458
2012-12-03 11:44:17 +01:00
Shay Banon a2a8553faf Indexing Slow Log
closes #2457
2012-12-03 10:21:59 +01:00
Shay Banon b4f85ee422 no need to check for log levels
we already do that when we log, and those are set to TRACE most times for slow log (since logging is based on thresholds)
2012-12-02 22:03:51 +01:00
Shay Banon a274d9386f Add types and stats to search slow log
closes #2455
2012-12-02 22:01:17 +01:00
Igor Motov 6021515567 The relevancy score in explanation should match the actual score in custom_filters_query
Fixes #2441
2012-11-27 10:13:16 -08:00
Shay Banon 69ef822da6 cleanup docsets
- remove the DocSet abstraction, and use Bits where we can by getting it from DocIdSet
- better handling of acceptDocs, though still need to properly apply them when caching is involved
2012-11-27 10:04:21 -08:00
Igor Motov fb9143aac1 fix sporadically disappearing fields during concurrent dynamic mapping updates 2012-11-24 14:02:58 +01:00
Simon Willnauer 4ab78bc537 Add basic javadocs for o.e.cluster.routing package and related classes 2012-11-23 15:14:30 +01:00
Simon Willnauer 32a0772821 #2436 expose KeepWordTokenFilter by default 2012-11-23 10:11:30 +01:00
Igor Motov 65a43d3ad4 Fix handling of stop word _lang_ notation
Fixes #2412
2012-11-23 09:54:02 +01:00
Shay Banon 2094207bf1 add completed count to thread pools 2012-11-22 15:55:25 +01:00
Shay Banon e1679b89bb fix failed test that were using the wrong form match query 2012-11-22 15:14:02 +01:00
Shay Banon 192cf5298a fix failed test that were using the wrong form match query 2012-11-22 14:44:03 +01:00
Shay Banon f4d6d8139d Match query should fail when trying to provide several fields in its simplified form
fixes #2432
2012-11-22 10:23:48 +01:00
Chris Male 2541847945 Added control over Query used by MatchQuery when there are zero terms after analysis 2012-11-22 22:13:29 +13:00
Shay Banon 9a90c1c3b5 conservative timeouts on internal recovery actions
safeguards against cases where internal recovery actions take too long (possibly due to a bug)
2012-11-22 00:31:57 +01:00
Shay Banon f5a3261e15 only log that we delete unused shard if it exists 2012-11-21 20:45:31 +01:00
Shay Banon d9b78000b1 Setting logger levels using cluster update settings does not work
fixes #2428
2012-11-21 13:44:54 +01:00
Chris Male 9e2469e04f Add per-field Similarity support 2012-11-21 12:44:59 +13:00
Shay Banon 4e8a9008b7 second phase at optimizing merging/parsing large new mappings
apply the new mappings only after the parsing/merging of a full doc/mapping is done
2012-11-19 17:40:13 +01:00
Shay Banon 303752d78a first phase at optimizing merging large mappings
bulk the same level ones together when traversing and introduce them
2012-11-19 17:13:45 +01:00
Shay Banon 6e597ffccb allow to associate a payload with bulk requests 2012-11-19 16:16:35 +01:00
David Pilato 83257c8af8 Add constructor IndexRequest(String index, String type) and fix javadoc 2012-11-19 13:52:55 +01:00
David Pilato b2597b5316 Add a toString() method to MultiSearchResponse 2012-11-19 13:50:58 +01:00
Simon Willnauer 840eaf983d Add JavaDocs for Codecs, PostingsFormat and related services/modules 2012-11-19 10:25:26 +01:00
Shay Banon c09ee82ef5 keep the uidField around so we don't have to look it up 2012-11-15 12:52:58 +01:00
Shay Banon e2e25ffea3 uid to use bloom filter posting by default 2012-11-15 11:57:48 +01:00
Martijn van Groningen 3577d826f2 Removed old file. 2012-11-15 11:54:51 +01:00
Martijn van Groningen be70722de7 Renamed pulsing40 and Lucene40 postings format providers to pulsing and default respectively for more consistent naming in settings. 2012-11-15 09:54:00 +01:00
Martijn van Groningen 20c6085852 changed test method names. 2012-11-15 09:40:24 +01:00
Martijn van Groningen e80f74584b Added licence header. 2012-11-15 00:18:49 +01:00
Martijn van Groningen fd5bd102aa lucene 4: Exposed Lucene's codec api
This feature adds the option to configure a `PostingsFormat` and assign it to a field in the mapping. This is an expert feature, and in almost all cases Elasticsearch's defaults will suit your needs.

## Configuring a postingsformat per field

Several postings formats are configured by default and can be used in your mapping:
* `direct` - A codec that wraps the default postings format during write time, but loads the terms and postinglists directly into memory during read time as raw arrays. This postings format is exceptionally memory intensive, but can give a substantial increase in search performance.
* `memory` - A codec that loads and stores terms and postinglists in memory using a FST. Acts like a cached postingslist.
* `bloom_default` - Maintains a bloom filter for the indexed terms, which is stored to disk and builds on top of the `default` postings format. This postings format is useful for low document frequency terms and offers a fail fast for seeks to terms that don't exist.
* `bloom_pulsing` - Similar to the `bloom_default` postings format, but builds on top of the `pulsing` postings format.
* `default` - The default postings format. The default if none is specified.

On all fields it is possible to configure a `postings_format` attribute. Example mapping:
```
{
  "person" : {
     "properties" : {
         "second_person_id" : {"type" : "string", "postings_format" : "pulsing"}
     }
  }
}
```

## Configuring a custom postingsformat
It is possible to instantiate custom postingsformats. This can be specified via the index settings.
```
{
   "codec" : {
      "postings_format" : {
         "my_format" : {
            "type" : "pulsing40",
            "freq_cut_off" : "5"
         }
      }
   }
}
```
In the above example the `freq_cut_off` is set to 5 (defaults to 1). This tells the pulsing postings format to inline the postinglist of terms with a document frequency lower than or equal to 5 in the term dictionary.
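
To make the cut-off concrete, here is a toy Python model of the inlining decision. It is not how the Lucene pulsing format is implemented: `build_term_dictionary`, the tuple layout and the list standing in for the postings data are all hypothetical; only the `freq_cut_off` rule comes from the text above.

```python
def build_term_dictionary(postings, freq_cut_off=1):
    """Toy model of pulsing.

    postings: dict mapping term -> sorted list of doc ids.
    Returns (term_dict, postings_file), where postings_file stands in for
    the separate postings data that non-inlined terms point into.
    """
    term_dict = {}
    postings_file = []
    for term, doc_ids in sorted(postings.items()):
        if len(doc_ids) <= freq_cut_off:
            # Low document frequency: keep the postings next to the term.
            term_dict[term] = ("inline", list(doc_ids))
        else:
            # Otherwise store only a pointer into the postings data.
            term_dict[term] = ("pointer", len(postings_file))
            postings_file.append(list(doc_ids))
    return term_dict, postings_file


term_dict, postings_file = build_term_dictionary(
    {"rare": [7], "common": [1, 2, 3, 4, 5, 6]}, freq_cut_off=5)
print(term_dict)
```

With a cut-off of 5, a lookup of `rare` needs no second read, while `common` still goes through the pointer.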

Closes #2411
2012-11-14 23:54:29 +01:00
Igor Motov 120560bd0a Using non-mapped fields in prefix queries shouldn't cause NullPointerException
Fixes #2408
2012-11-14 18:34:54 +01:00
Igor Motov f47d62cc30 Date fields shouldn't be returned as longs by Get API 2012-11-13 21:36:28 +01:00
Igor Motov d1281d283b Add `index.routing.allocation.require....` and `cluster.routing.allocation.require....` settings
Fixes #2404
2012-11-13 19:29:20 +01:00
Igor Motov ea2732a967 lucene 4: field visitors shouldn't return fields that were not present in the visited document 2012-11-13 07:35:54 -05:00
Shay Banon 258244ef37 Deriving the REST status code from a failure can, very rarely, cause an infinite loop
fixes #2402
2012-11-12 17:09:34 +01:00
Nicholas Tung 46e1886975 Allow both .yml and .yaml as valid YAML configuration file extensions 2012-11-12 14:06:04 +01:00
Igor Motov 3ff54c0b5c Add logging for environment paths on startup 2012-11-12 14:04:13 +01:00
Martijn van Groningen 978c95649e lucene 4: Fixed SimpleQueryTests 2012-11-12 13:44:42 +01:00
Martijn van Groningen 05746adeb2 lucene 4: Set number of replicas to 0. Makes the test run faster. 2012-11-12 13:44:42 +01:00
Martijn van Groningen e2c33ed659 lucene 4: Fixed BitsetExecutionChildQuerySearchTests class. 2012-11-12 13:44:42 +01:00
Shay Banon 9a79fb40bf lucene 4: sort values on hit are Text, not BytesRef 2012-11-12 13:44:42 +01:00
Igor Motov c46228254d lucene 4: fix TTL 2012-11-12 13:44:42 +01:00
Igor Motov c2f3eab7d3 lucene 4: fix sorting 2012-11-12 13:44:42 +01:00
Shay Banon 2b58c2dfff lucene 4: optimize read/write BytesRef handling 2012-11-12 13:44:42 +01:00
Igor Motov c8cf72d657 lucene 4: fix handling of deleted docs in TermFilter 2012-11-12 13:44:42 +01:00
uboness d069212ce4 * fixed the type check for short 2012-11-12 13:44:42 +01:00
uboness 46223c117a * removed unused Streamables class 2012-11-12 13:44:42 +01:00
uboness ed2b009f07 * changed instanceof to be consistent with other type checks 2012-11-12 13:44:41 +01:00
uboness cae66fb636 * lucene 4: added missing short support in stream input/output
* lucene 4: added more extensive test for stored fields
2012-11-12 13:44:41 +01:00
Igor Motov f8842d5a4f lucene 4: fix TokenFilterTests 2012-11-12 13:44:41 +01:00
Igor Motov 98eb97a1ff lucene 4: fix NoopCollector 2012-11-12 13:44:41 +01:00
Shay Banon 9d5cae23fa lucene 4: fix general mapping test
no need to test for boost, we already have specific boost tests, in general, we should get rid of this test, and use more specialized tests if we are missing some
2012-11-12 13:44:41 +01:00
Shay Banon 5c45aad260 lucene 4: fix boost mapping tests 2012-11-12 13:44:41 +01:00
Igor Motov ffd262e96f lucene 4: rollback optimization in SingleFieldVisitor for now to make it work 2012-11-12 13:44:41 +01:00
Igor Motov cfbd17992a lucene 4: convert script term to string 2012-11-12 13:44:41 +01:00
Igor Motov 74464f9f99 lucene 4: fix possible NPE in range queries and filters if one of the bounds is not specified 2012-11-12 13:44:41 +01:00
Igor Motov 6d40770200 lucene 4: fixed facets and filtering aliases
I am not completely sure about this one, but it reduces the number of failing tests from 98 to 31 so I am going to check it in. Please, review and fix it, if there is a better solution.

Because of change in Lucene 4.0, ContextIndexSearcher was bypassed and elasticsearch filters and collectors were ignored.

In lucene 3.6 the stack of Searcher search calls looked like this:
search(Query query, int n)
search(Query query, Filter filter, int n)
search(Weight weight, Filter filter, int nDocs)
search(Weight weight, Filter filter, ScoreDoc after, int nDocs)
search(Weight weight, Filter filter, Collector collector) <-- this is where ContextIndexSearcher was injecting combined filter and collector
search(Weight weight, Filter filter, Collector collector)

In Lucene 4.0 the stack looks like this:
search(Query query, int n)
search(Query query, Filter filter, int n) <-- here lucene wraps Query and Filter into Weight
search(Weight weight, ScoreDoc after, int nDocs)
search(List<AtomicReaderContext> leaves, Weight weight, ScoreDoc after, int nDocs)
search(List<AtomicReaderContext> leaves, Weight weight, Collector collector)
...

In other words, when we have a Filter, we don't have a Collector yet, but when we have a Collector, the Filter is already wrapped inside the Weight. The only fix for the problem that I could think of is to introduce two injection points: one for Filters and another one for Collectors:

search(Query query, int n)
search(Query query, Filter filter, int n) <-- here combined Filters are injected
search(Weight weight, ScoreDoc after, int nDocs)
search(List<AtomicReaderContext> leaves, Weight weight, ScoreDoc after, int nDocs)
search(List<AtomicReaderContext> leaves, Weight weight, Collector collector) <-- here Collectors are injected

Similar problem existed for count(), so I had to override search(Query query, Collector results) as well.
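
A simplified Python model of the two injection points, for readers who want to see the shape of the fix. The class and method names are stand-ins invented for this sketch (they are not the Lucene or ContextIndexSearcher signatures), and queries, filters and collectors are reduced to plain callables.

```python
class BaseSearcher:
    """Toy stand-in for the Lucene 4.0 call chain described above."""

    def __init__(self, docs):
        self.docs = docs

    def search(self, query, filter_=None):
        # Query and filter are wrapped into one weight before a collector exists.
        weight = lambda doc: query(doc) and (filter_ is None or filter_(doc))
        hits = []
        self.search_leaves(weight, hits.append)
        return hits

    def search_leaves(self, weight, collect):
        for doc in self.docs:
            if weight(doc):
                collect(doc)


class ContextSearcher(BaseSearcher):
    """Hypothetical subclass with one override per injection point."""

    def __init__(self, docs, alias_filter=None, facet_collectors=()):
        super().__init__(docs)
        self.alias_filter = alias_filter
        self.facet_collectors = list(facet_collectors)

    def search(self, query, filter_=None):
        # Injection point 1: combine filters before they vanish into the weight.
        filters = [f for f in (filter_, self.alias_filter) if f is not None]
        combined = (lambda doc: all(f(doc) for f in filters)) if filters else None
        return super().search(query, combined)

    def search_leaves(self, weight, collect):
        # Injection point 2: wrap the collector once it finally exists.
        def wrapped(doc):
            collect(doc)
            for extra in self.facet_collectors:
                extra(doc)
        super().search_leaves(weight, wrapped)


facet_seen = []
searcher = ContextSearcher([1, 2, 3, 4, 5, 6],
                           alias_filter=lambda d: d % 2 == 0,
                           facet_collectors=[facet_seen.append])
print(searcher.search(lambda d: d > 2), facet_seen)
```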
2012-11-12 13:44:41 +01:00
Igor Motov 2eaad61a9e lucene4: make SimpleIdCache more resilient to missing fields
Not sure if we can get a segment without the _uid field, but segments without the _parent field definitely happen.
2012-11-12 13:44:41 +01:00
Igor Motov 9ad05ecdea lucene 4: make FieldVisitors behave similar to FieldSelectors
Added back reset() method for now to make things work. Will refactor it out when we have tests passing.
2012-11-12 13:44:41 +01:00
Igor Motov 7aac88cf5c lucene4: check liveDocs and acceptedDocs for null before trying to call get() on them 2012-11-12 13:44:40 +01:00
Igor Motov 3f3a95668b lucene4: add support for omit_norm setting to numeric types and don't omit norms if boost is not 1.0
This commit enables setting boost for numeric fields. However, there is still no way to take advantage of boosted numeric fields during searching because all queries against numeric fields are translated into range queries wrapped in ConstantScore. Boost for numeric fields is broken on master as well https://gist.github.com/7ecedea4f6a5219efb89
2012-11-12 13:44:40 +01:00
Igor Motov 2fb3591792 lucene4: fixed default values tests to refer to correct default FieldType constants 2012-11-12 13:44:40 +01:00
Igor Motov a5bef30be9 lucene4: fixed CompressIndexInputOutputTests 2012-11-12 13:44:40 +01:00
Igor Motov 3816366780 lucene4: fixed SimpleAllMapperTests 2012-11-12 13:44:40 +01:00
Shay Banon 25717ab253 lucene 4: only omit_norms on non analyzed field if boost is not set 2012-11-12 13:44:40 +01:00
Shay Banon 72f41111c9 lucene 4: calling tokenStream is enough, verified to return a stream to analyze content 2012-11-12 13:44:40 +01:00
Shay Banon cb5df26bf7 lucene 4: use the proper token stream to return 2012-11-12 13:44:40 +01:00
Shay Banon a10f60873c lucene 4: fix numeric types to properly return numeric streams 2012-11-12 13:44:40 +01:00
Shay Banon a38064913f lucene 4: fix engine tests 2012-11-12 13:44:40 +01:00
Shay Banon 53d9b13e2f lucene 4: fix optimization check to set docs_only+omit_norms 2012-11-12 13:44:40 +01:00
Igor Motov 8a34ea1223 lucene4: fixed FloatFieldDataTests 2012-11-12 13:44:40 +01:00
Igor Motov bf13f3f81e lucene4: fixed SimpleIndexQueryParserTests 2012-11-12 13:44:39 +01:00
Martijn van Groningen db639e5c2e lucene 4: Upgraded SimpleLuceneTests class. Test actually passes now. 2012-11-12 13:44:39 +01:00
Martijn van Groningen 2a8161d096 lucene 4: Upgraded SimpleLuceneTests class.
The complete codebase compiles now!
2012-11-12 13:44:39 +01:00
Martijn van Groningen aa2a8c66cc lucene 4: Upgraded UidFieldTests class. 2012-11-12 13:44:39 +01:00
Shay Banon f796fe8d5e lucene 4: fix cases where number values are not stored 2012-11-12 13:44:39 +01:00
Martijn van Groningen 5c0ef796e8 lucene 4: Upgraded BoostMappingTests + SimpleMapperTests 2012-11-12 13:44:39 +01:00
Shay Banon cefe2ba870 lucene 4: fix fuzzy query test 2012-11-12 13:44:39 +01:00
Shay Banon bec0ffa623 lucene 4: make sure to apply doc boost only once per field name 2012-11-12 13:44:39 +01:00
Shay Banon 7ecfa9c35f lucene 4: caching should pass acceptDocs
still work left on streamlining filters
2012-11-12 13:44:39 +01:00
Shay Banon c60f20413b lucene 4: support doc level boost 2012-11-12 13:44:39 +01:00
Shay Banon b492320e2f lucene 4: switch directory not used 2012-11-12 13:44:39 +01:00
Shay Banon dca88a9b7c lucene 4: use field type in UidField 2012-11-12 13:44:39 +01:00
Shay Banon faf3e0e857 lucene 4: comment on adding DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS 2012-11-12 13:44:38 +01:00
Shay Banon e9f8d0c722 lucene 4: extract Lucene#readSegmentsInfo, and use it where applicable 2012-11-12 13:44:38 +01:00
Shay Banon 0660e20c47 lucene 4: cleanup terms/uid filter 2012-11-12 13:44:38 +01:00
Shay Banon 79368bb221 lucene 4: fix visitors to use constants for field names 2012-11-12 13:44:38 +01:00
Martijn van Groningen 6ca6407468 lucene 4: Re-fixed issue in SourceScoreOrderFragmentsBuilder and SourceSimpleFragmentsBuilder. 2012-11-12 13:44:38 +01:00
Simon Willnauer a3de9e521d lucene 4: replaced TrimFilter and WordDelimiterFilter with lucene versions 2012-11-12 13:44:38 +01:00
Martijn van Groningen e33ae96b38 lucene 4: added overloaded method. To fix issue in SourceScoreOrderFragmentsBuilder and SourceSimpleFragmentsBuilder. 2012-11-12 13:44:38 +01:00
Martijn van Groningen 38dc19d8bc lucene 4: Fixed compile error. 2012-11-12 13:44:38 +01:00
Martijn van Groningen 673712c0b2 lucene 4: Upgraded IndexedGeoBoundingBoxFilter & InMemoryGeoBoundingBoxFilter. 2012-11-12 13:44:38 +01:00
Martijn van Groningen d42d153c48 lucene 4: Upgraded GeoDistanceRangeFilter, GeoPolygonFilter. 2012-11-12 13:44:38 +01:00
Martijn van Groningen 415cfa2e89 lucene 4: Upgraded GeoDistanceFilter, MatchedFiltersFetchSubPhase. 2012-11-12 13:44:38 +01:00
Martijn van Groningen ba1b870580 lucene 4: Upgraded CacheKeyFilter. 2012-11-12 13:44:38 +01:00
Martijn van Groningen 3298ad2235 lucene 4: Upgraded UidField. (version can be stored later as doc values) 2012-11-12 13:44:37 +01:00
Martijn van Groningen 968b012911 lucene 4: Upgraded *ValueGeoPointFieldData and GeoDistanceDataComparator. 2012-11-12 13:44:37 +01:00
Martijn van Groningen 09fe15488d lucene 4: Upgraded ScanContext. 2012-11-12 13:44:37 +01:00
Igor Motov 41325113f0 lucene4: switched from Field.Index to boolean indexed in ParseContext.includeInAll() 2012-11-12 13:44:37 +01:00
Igor Motov daf347e67e lucene4: replace IndexCommit.getVersion() with IndexCommit.getGeneration() 2012-11-12 13:44:37 +01:00
Igor Motov 787b7a3900 lucene4: more unit test cleanup 2012-11-12 13:44:37 +01:00
Igor Motov 5ad40205c2 lucene4: remove DocumentBuilder and FieldBuilder 2012-11-12 13:44:37 +01:00
Shay Banon 594598f493 close the index input in any case when computing length 2012-11-12 13:44:37 +01:00
Igor Motov bb76542068 lucene4: unit tests cleanup 2012-11-12 13:44:37 +01:00
Martijn van Groningen 5a553a1924 lucene 4: Upgraded AndFilter, NotDeletedFilter, NotFilter, OrFilter, TermFilter, XBooleanFilter. Left live docs and accepted docs unhandled (null is used) for now. I added a note in all those places. 2012-11-12 13:44:37 +01:00
Igor Motov 6b4e483f55 lucene4: fixed unit.index.mapper, unit.index.query and unit.index.store test (with exception of document boost and similarity issues) 2012-11-12 13:44:37 +01:00
Igor Motov 5d7ef8f585 lucene4: fixed SortParseElement 2012-11-12 13:44:37 +01:00
Martijn van Groningen 1a46179c4e lucene 4: Upgraded AndFilter, FilteredCollector, LimitFilter, MatchAllDocsFilter and MatchNoDocsFilter. 2012-11-12 13:44:36 +01:00
Martijn van Groningen e75c732bdd lucene 4: Upgraded MatchNoDocsQuery. 2012-11-12 13:44:36 +01:00
Martijn van Groningen ddc3eb3415 lucene 4: Upgraded MultiPhrasePrefixQuery. 2012-11-12 13:44:36 +01:00
Martijn van Groningen da551e8847 lucene 4: Upgraded o.e.common.lucene.search.function package. 2012-11-12 13:44:36 +01:00
Igor Motov 6bbe37f876 lucene4: fixed integration tests that got broken by switch from String to Text in Facet terms 2012-11-12 13:44:36 +01:00
Igor Motov edb4fe18e0 lucene4: fixed index.merge.policy 2012-11-12 13:44:36 +01:00
Igor Motov f57efcf6c8 lucene4: finish org.elasticsearch.common.compress cleanup 2012-11-12 13:44:36 +01:00
Igor Motov 93906903b6 lucene4: switched setNextReader from IndexReader to AtomicReaderContext 2012-11-12 13:44:36 +01:00
Igor Motov 25d03a6a7d lucene4: upgraded ScoreDocQueue 2012-11-12 13:44:36 +01:00
Igor Motov 5cd9da4565 lucene4: fixed TransportNodesListShardStoreMetaData 2012-11-12 13:44:36 +01:00
Chris Male 27481800bc lucene 4: Upgraded FieldMapper.fuzzyQuery to use new FuzzyQuery API 2012-11-12 13:44:36 +01:00
Igor Motov be424c4564 lucene4: fixed SwitchDirectory and CompressedDirectory (except fileLength method) 2012-11-12 13:44:36 +01:00
Igor Motov e8092fe290 lucene4: org.apache.lucene.search.vectorhighlight package cleanup 2012-11-12 13:44:35 +01:00
Igor Motov 639b1323b8 lucene4: upgrade CustomMemoryIndex to Lucene 4 2012-11-12 13:44:35 +01:00
Martijn van Groningen 4178d48470 lucene 4: Fixed import issue. 2012-11-12 13:44:35 +01:00
Chris Male 724fadd2cd lucene 4: Converted Analyzers in MapperService 2012-11-12 13:44:35 +01:00
Martijn van Groningen 9f45b683d6 lucene 4: Fixed TERM_FACTORY usage in VersionFetchSubPhase class. 2012-11-12 13:44:35 +01:00
Martijn van Groningen 03a16ac7d8 lucene 4: Upgraded ContentIndexSearcher 2012-11-12 13:44:35 +01:00
Chris Male b3e59d58e4 lucene 4: Fixed TermFactory usage in MapperService 2012-11-12 13:44:35 +01:00
Martijn van Groningen 0354825914 lucene 4: Fixed compile error 2012-11-12 13:44:35 +01:00
Martijn van Groningen 083df0a86c lucene 4: Upgraded o.e.search.dfs package. #2 2012-11-12 13:44:35 +01:00
Martijn van Groningen 5f942ef63d lucene 4: Upgraded o.e.search.dfs package. (Distributed idf) 2012-11-12 13:44:35 +01:00
Igor Motov fd2cf776d8 lucene4: action package cleanup 2012-11-12 13:44:35 +01:00
Martijn van Groningen 3269e0c88e lucene 4: Fixed compile error 2012-11-12 13:44:35 +01:00
Martijn van Groningen fcc4fe263e lucene 4: Upgraded PercolatorExecutor 2012-11-12 13:44:34 +01:00
Simon Willnauer 22c14c7354 lucene 4: lucene package cleanups 2012-11-12 13:44:34 +01:00
Simon Willnauer 595acd695e lucene 4: s/reusableTokenStream/tokenStream 2012-11-12 13:44:34 +01:00
Martijn van Groningen d531fa7a46 lucene 4: Fixed compile error in FieldLookup 2012-11-12 13:44:34 +01:00
Martijn van Groningen 42a1d25064 lucene 4: Fixed last compile errors in HighlightPhase 2012-11-12 13:44:34 +01:00
Martijn van Groningen b928e74904 lucene 4: Moved from FieldSelectors to FieldVisitors. Removed BaseFieldVisitor#reset and changed SourceFieldVisitor and UidFieldVisitor from singleton to prototype. 2012-11-12 13:44:34 +01:00
Martijn van Groningen d8d7498292 lucene 4: Moved from FieldSelectors to FieldVisitors. 2012-11-12 13:44:34 +01:00
Simon Willnauer 77cbe0a26b lucene 4: s/getFieldable/getField 2012-11-12 13:44:34 +01:00
Simon Willnauer 0c1778a033 lucene 4: don't restrict RAM buffer to 2GB; this Lucene restriction was removed with DWPT 2012-11-12 13:44:34 +01:00
Simon Willnauer d4e4b5d9f4 lucene 4: read commit user data from directory without a reader 2012-11-12 13:44:34 +01:00
Shay Banon 7b8ab2d685 lucene 4: cleanup unused class 2012-11-12 13:44:34 +01:00
Martijn van Groningen cdf1fc8981 lucene 4: upgraded o.e.index.search.nested package. Also fixed issue with liveDocs in child package. 2012-11-12 13:44:34 +01:00
Igor Motov a49078dfc1 lucene 4: replace UnicodeUtil.UTF8Result with BytesRef 2012-11-12 13:44:33 +01:00
Chris Male f444ed4dff lucene 4: Converted remaining Mappers to FieldType API 2012-11-12 13:44:33 +01:00
Chris Male 549900a082 lucene 4: Converted most Mappers over to FieldType API 2012-11-12 13:44:33 +01:00
Shay Banon e75301b781 lucene 4: optimize bytes on XContentParser
Also, it does not seem like we need to reuse the bytes buffer; if we need to, we can always add it later
2012-11-12 13:44:33 +01:00
Martijn van Groningen 19ab1d0548 lucene 4: upgraded o.e.index.search.child package 2012-11-12 13:44:33 +01:00
Martijn van Groningen 71c3bd7c64 lucene 4: SearchContext#setNextReader accepts an AtomicReaderContext instead of an AtomicReader 2012-11-12 13:44:33 +01:00
Igor Motov 4e5e4869a6 lucene 4: add custom analyzer wrapper that supports overriding of getOffsetGap 2012-11-12 13:44:33 +01:00
Martijn van Groningen 24ef987624 lucene 4: Upgraded the simple id cache. 2012-11-12 13:44:33 +01:00
Simon Willnauer 683be6fc64 lucene 4: converted QueryParser/Builders to Lucene 4 2012-11-12 13:44:33 +01:00
Simon Willnauer 5bd8e1b337 lucene 4: fixed MLT query 2012-11-12 13:44:33 +01:00
Simon Willnauer ad84186509 lucene 4: fixed fuzzy like this queryparser/builder 2012-11-12 13:44:33 +01:00
Simon Willnauer c1a9c802f1 lucene 4: XContentParser now has bytesOrNull and returns bytesref directly 2012-11-12 13:44:33 +01:00
Simon Willnauer 479f1784e8 lucene 4: converted queryparser to lucene classic query parser 2012-11-12 13:44:32 +01:00
Simon Willnauer 5d47ad4648 lucene 4: upgraded FuzzyQueryParser + Builder to use integer edit distance rather
than floats (bw compatible)
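As a rough illustration of what "bw compatible" could mean here, the sketch below maps a legacy float similarity onto an integer edit distance. The class name `FuzzyEdits`, the method `toEdits`, and the cap of 2 edits are assumptions made for this example; the conversion rule is modeled on the usual "(1 - similarity) * term length" convention and is not taken from this commit.

```java
// Hypothetical helper, not the code from this commit.
final class FuzzyEdits {
    // Assumed upper bound on supported edits (Levenshtein automata are commonly limited to 2).
    static final int MAX_EDITS = 2;

    /**
     * Converts a legacy fuzziness value into a whole number of edits.
     * Values >= 1 are treated as an edit count already; values in (0, 1)
     * are treated as a minimum similarity relative to the term length.
     */
    static int toEdits(float fuzziness, int termLength) {
        if (fuzziness >= 1f) {
            return (int) Math.min(fuzziness, MAX_EDITS);
        }
        if (fuzziness == 0f) {
            return 0; // exact match only
        }
        return Math.min((int) ((1d - fuzziness) * termLength), MAX_EDITS);
    }
}
```

Under these assumptions an old request with similarity 0.5 on a four-letter term allows two edits, while a request that already passes an integer such as 1 is used as-is.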
2012-11-12 13:44:32 +01:00
Igor Motov b1eaec6c6a lucene 4: change Unicode utils to use BytesRef instead of UTF8Result 2012-11-12 13:44:32 +01:00
uboness c3633ab99f lucene 4: changed InternalIndexShard#checkIndex to use the new fixIndex and indexExists apis 2012-11-12 13:44:32 +01:00
Igor Motov 8009b80481 lucene 4: fix access to segment name due to SegmentInfo refactoring 2012-11-12 13:44:32 +01:00
Shay Banon 4b84078f91 lucene 4: text comparator should always work on bytes 2012-11-12 13:44:32 +01:00
Martijn van Groningen 65ce3aea57 lucene 4: Upgraded the function/sort classes. 2012-11-12 13:44:32 +01:00
Martijn van Groningen 48b8d0544f lucene 4: Moved SearchScript from IndexReader to AtomicReader. This also touches the search/lookup classes 2012-11-12 13:44:32 +01:00
Martijn van Groningen d820bfe11b lucene 4: Changed from BytesReference to Text as internal term representation for facet keys. Text now also implements comparable. 2012-11-12 13:44:32 +01:00
Igor Motov b128b7a750 lucene 4: use CharArraySet for stem exclusions, stop words and articles and fix analyzer namespaces 2012-11-12 13:44:32 +01:00
Igor Motov 1cc5ee7ad9 lucene 4: implement createComponents in Analyzers 2012-11-12 13:44:32 +01:00
Igor Motov 6fad75df82 lucene 4: remove Pattern tokenizer and filter 2012-11-12 13:44:32 +01:00
Igor Motov 097cb2dac7 lucene 4: migrate char filter from CharStream to Reader 2012-11-12 13:44:31 +01:00
Shay Banon f572a7bcf7 lucene 4: no close on searcher anymore 2012-11-12 13:44:31 +01:00
Shay Banon ed03741353 lucene 4: hashCode and equals for Text and BytesReference
now that we are going to use those more in places like facets, they need to implement equals and hashCode to be used in hashes
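A minimal sketch of why this matters, using a hypothetical `BytesKey` wrapper rather than the real Text or BytesReference classes: without content-based `equals` and `hashCode`, two keys holding the same bytes would land in different hash buckets and facet counts would not be merged.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a bytes-backed term; not an Elasticsearch class.
final class BytesKey {
    private final byte[] bytes;

    BytesKey(byte[] bytes) {
        this.bytes = bytes.clone(); // defensive copy so the key stays immutable
    }

    @Override
    public boolean equals(Object other) {
        // Compare by content, not by array identity.
        return other instanceof BytesKey && Arrays.equals(bytes, ((BytesKey) other).bytes);
    }

    @Override
    public int hashCode() {
        // Must agree with equals: equal content gives an equal hash.
        return Arrays.hashCode(bytes);
    }
}

class BytesKeyDemo {
    // Counts the sample terms in a HashMap keyed by their bytes, then looks one up.
    static int countFor(String term) {
        Map<BytesKey, Integer> counts = new HashMap<>();
        for (String t : new String[] {"blue", "red", "blue"}) {
            counts.merge(new BytesKey(t.getBytes(StandardCharsets.UTF_8)), 1, Integer::sum);
        }
        Integer count = counts.get(new BytesKey(term.getBytes(StandardCharsets.UTF_8)));
        return count == null ? 0 : count;
    }
}
```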
2012-11-12 13:44:31 +01:00
Martijn van Groningen 15c9cd5142 lucene 4: Field name no longer interned when loading field data cache and return empty field data cache for fields that don't exist. 2012-11-12 13:44:31 +01:00
Martijn van Groningen 454954e7be lucene 4: Fix field data, facets and field comparators 2012-11-12 13:44:31 +01:00
Shay Banon 81d148b4e4 lucene 4: fix warmup process
also removed ExtendedIndexSearcher, we should do what's needed with the new context and leaves methods
2012-11-12 13:44:31 +01:00
Shay Banon 0c24928ef4 lucene 4: fix similarity packaging 2012-11-12 13:44:31 +01:00
Shay Banon f4418fb181 lucene 4: fix segments info usage 2012-11-12 13:44:31 +01:00
Shay Banon 7972f6f959 lucene 4: fix call to expungeDeletes 2012-11-12 13:44:31 +01:00
Shay Banon 386c2ebdb9 lucene 4: remove bloom cache
we can add bloom cache, if we need it, as a codec on the uid field
we still need to rewrite the UidFilter to not use bloom, but that will be the regular one
2012-11-12 13:44:31 +01:00
Igor Motov 05138bb2fb lucene 4: upgrade analyzers 2012-11-12 13:44:30 +01:00
Shay Banon 7aacc8d448 lucene 4: upgrade store/dir 2012-11-12 13:44:30 +01:00
Shay Banon 3d4ca81c29 remove XIndexWriter
removing the buffered deletes bloom filter no longer requires setting the bloom filter on it
2012-11-12 13:44:30 +01:00
Shay Banon f9b0fcb3a3 remove BufferedDeletesStream
by default, we will put a bloom filter codec on the _uid field, so no need for the optimization of using bloom filters when trying to delete a doc by _uid term per segment
2012-11-12 13:44:30 +01:00
Shay Banon edaa65dba2 Multi Match: Wrongly defaults to dis_max instead of bool
fixes #2397
2012-11-10 14:59:45 +01:00
Njal Karevoll f33e353259 The index of the next RestFilter must be incremented before the current filter starts processing.
Otherwise, synchronous filters will not work. For example, the following filter would cause a StackOverflowError:

public class SimpleRestFilter extends RestFilter {
    @Override
    public void process(RestRequest request, RestChannel channel, RestFilterChain filterChain) {
        filterChain.continueProcessing(request, channel);
    }
}
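
The ordering issue can be sketched with simplified, hypothetical stand-ins (`Filter`, `Chain`, and a `String` request are assumptions for this example, not the real RestFilter API). The chain advances its index before handing control to the filter, so a nested synchronous call to `continueProcessing` moves on to the next filter instead of re-entering the same one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified filter interface.
interface Filter {
    void process(String request, Chain chain);
}

// Hypothetical, simplified filter chain.
class Chain {
    private final List<Filter> filters;
    private final List<String> log;
    private int index = 0;

    Chain(List<Filter> filters, List<String> log) {
        this.filters = filters;
        this.log = log;
    }

    void continueProcessing(String request) {
        if (index < filters.size()) {
            // Advance BEFORE invoking the filter. If the index were incremented only
            // after process() returned, a filter that calls continueProcessing()
            // synchronously would be selected again and recurse without end.
            Filter next = filters.get(index++);
            next.process(request, this);
        } else {
            log.add("handled " + request); // end of chain: the real handler would run here
        }
    }
}

class FilterChainDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();
        List<Filter> filters = new ArrayList<>();
        // Both filters continue the chain synchronously, like SimpleRestFilter above.
        filters.add((request, chain) -> { log.add("filter-1"); chain.continueProcessing(request); });
        filters.add((request, chain) -> { log.add("filter-2"); chain.continueProcessing(request); });
        new Chain(filters, log).continueProcessing("GET /");
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [filter-1, filter-2, handled GET /]
    }
}
```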
2012-11-09 22:04:09 +01:00
Shay Banon a8e43578a2 Adding a type with _source or _all enabled fails, when these are disabled in index
fixes #2394
2012-11-09 17:21:25 +01:00
Martijn van Groningen 4a9faac470 Added must/should/mustNot method variants that accepts vararg FilterBuilder instances. 2012-11-06 14:23:03 +01:00
Shay Banon 31a8e92b8e Node Stats: add max content length to http info 2012-11-06 11:37:23 +01:00
Shay Banon 0d5530e55f Node Stats: add available processors to OS info 2012-11-06 11:31:23 +01:00
Shay Banon 33900476f4 Node Stats: Add largest thread pool count per thread pool stats
closes #2382
2012-11-06 11:25:38 +01:00
Igor Motov af1e8c0eb1 Add auto index creation on update request
Fixes #2375
2012-11-02 10:18:51 -04:00
Martijn van Groningen ef25ac2414 Fixed that the `ignore_indices` option isn't set in multi search api. Closes #2380 2012-11-02 10:58:02 +01:00
Aaron Dixon bd9a5bfa0c fixed issue #2371 (incorrect behavior of path_match) 2012-11-01 22:11:33 +01:00
Chris Male 768b8b4d2b Changed SpatialRelation contains to within 2012-11-01 22:03:47 +01:00
Igor Motov 23f7b0002a Deleting a non-existent warmer shouldn't cause request to hang
Fixes #2363
2012-11-01 21:49:54 +01:00