Commit Graph

156 Commits

Author SHA1 Message Date
Michael McCandless 68d6427944 add missing units to index settings if index was created before 2.0 2015-05-30 04:39:03 -04:00
Robert Muir b462fd712a factor out static analysis 2015-05-23 01:33:37 -04:00
Robert Muir 5330e3423f remove build duplication 2015-05-22 23:23:59 -04:00
Igor Motov 21ed6bb90c Core: Don't allow indices containing too-old segments to be opened
When an index is introduced into the cluster via a cluster upgrade, a restore, or as a dangling index, the MetaDataIndexUpgradeService checks whether the index can be upgraded to the current version. If the upgrade is not possible, startup of the newly upgraded cluster and the restore process are aborted, and a dangling index is imported as a closed index that cannot be opened.

Closes #10215
2015-05-19 23:37:05 -04:00
Adrien Grand 4131bcbec7 Search: Make FilteredQuery a forbidden API.
This commit makes FilteredQuery a forbidden API and also removes some more usage
of the Filter API. There is some remaining code using filters for parent/child
queries but I'm not touching it as it is already being refactored in #6511.
2015-05-19 15:33:43 +02:00
Robert Muir 38cccfb057 cleanup and ban temp files going to jvm default location 2015-05-08 15:08:13 -04:00
Robert Muir 51c71c235b Ban PathUtils.get (for now, until we fix the two remaining issues) 2015-05-08 14:42:27 -04:00
Adrien Grand b72f27a410 Core: Cut over to the Lucene filter cache.
This removes Elasticsearch's filter cache and uses Lucene's instead. It has some
implications:
 - custom cache keys (`_cache_key`) are unsupported
 - decisions are made internally and can't be overridden by users (`_cache`)
 - not only filters can be cached but also all queries that do not need scores
 - parent/child queries can now be cached, however cached entries are only
   valid for the current top-level reader so in practice it will likely only
   be used on read-only indices
 - the cache deduplicates filters, which plays nicer with large keys (eg. `terms`)
 - better stats: we already had ram usage and evictions, but now also hit count,
   miss count, lookup count, number of cached doc id sets and current number of
   doc id sets in the cache
 - dynamically changing the filter cache size is not supported anymore

Internally, an important change is that it removes the NoCacheFilter infrastructure
in favour of making Query.rewrite specialize the query for the current reader so
that it will only be cached on this reader (look for IndexCacheableQuery).

Note that consuming filters with the query API (createWeight/scorer) instead of
the filter API (getDocIdSet) is important for parent/child queries because
otherwise a QueryWrapperFilter(ParentQuery) would run the wrapped query per
segment while relations might be cross segments.
2015-05-04 09:02:15 +02:00
Alexander Reelsen b25259532e Release: Fix build repositories script
Minor issue with specifying the correct version when starting the package release script.
Another issue fixed to make sure that the S3 bucket parameters act the same.
2015-04-28 10:04:30 +02:00
Alexander Reelsen 924479369f Release script: Fix wrong argument for string formatting 2015-04-27 11:09:02 +02:00
Alexander Reelsen f64739788b Build: Update package repositories when creating a release
In order to automatically sign and upload our debian and RPM
packages, this commit incorporates signing into the build process
and adds the necessary steps to the release process. In order to do this
the pom.xml has been adapted and the RPM and jdeb maven plugins have been
updated, so the packages are signed on build. However, the repositories
need to be signed as well.

Syncing the repos requires downloading the current repo, adding
the new packages and syncing it back.

The following environment variables are now required as part of the build

* GPG_KEY_ID - the key ID of the key used for signing
* GPG_PASSPHRASE - your GPG passphrase
* S3_BUCKET_SYNC_TO: S3 bucket to sync new repo into

The following environment variables are optional

* S3_BUCKET_SYNC_FROM: S3 bucket to get existing packages from
* GPG_KEYRING - home of gnupg, defaults to ~/.gnupg

The following command line tools are needed

* createrepo (creates RPM repositories)
* expect (used by the maven rpm plugin)
* apt-ftparchive (creates DEB repositories)
* gpg (signs packages and repo files)
* s3cmd (syncing between the different S3 buckets)
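
As a rough illustration of the requirements listed above, a pre-flight check could look like the sketch below. The variable and tool names come from this message; the function names are hypothetical and this is not the actual release script. No default is stated for S3_BUCKET_SYNC_FROM, so it is left unset here.

```python
import os
import shutil

# Names taken from the lists in this commit message.
REQUIRED_ENV = ["GPG_KEY_ID", "GPG_PASSPHRASE", "S3_BUCKET_SYNC_TO"]
# Only GPG_KEYRING has a documented default; None means "not set".
OPTIONAL_ENV = {"S3_BUCKET_SYNC_FROM": None, "GPG_KEYRING": "~/.gnupg"}
REQUIRED_TOOLS = ["createrepo", "expect", "apt-ftparchive", "gpg", "s3cmd"]


def check_release_prerequisites(env=None, which=shutil.which):
    """Return a list of problems; an empty list means everything was found."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_ENV:
        if not env.get(name):
            problems.append("missing environment variable: " + name)
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append("missing command line tool: " + tool)
    return problems


def resolve_optional(env=None):
    """Return the optional variables, falling back to the defaults above."""
    env = os.environ if env is None else env
    return {name: env.get(name, default) for name, default in OPTIONAL_ENV.items()}
```

Calling `check_release_prerequisites()` with no arguments inspects the real environment and PATH; the `env` and `which` parameters exist only so the check can be exercised without touching either.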

The current approach would also work for users who want to run their
own repositories; all they need to change is a couple of environment
variables.

Minor implementation detail: Right now the branch name is used as version
for the repositories (like 1.4/1.5/1.6) - if we ever change our branch naming
scheme, the script needs to be fixed.
2015-04-26 19:05:47 +02:00
Robert Muir 270cb9f349 enable securitymanager 2015-04-22 03:04:50 -04:00
Robert Muir 69718916df actually remove this line rather than comment it out. tests pass 2015-04-21 19:04:56 -04:00
Robert Muir 9d6b1382e7 Fix JVM isolation in tests.
Currently security manager would allow for one JVM to muck
with the files (read, write, AND delete) of another JVM.

This is unnecessary.
2015-04-21 19:02:14 -04:00
Adrien Grand d7abb12100 Replace deprecated filters with equivalent queries.
In Lucene 5.1 lots of filters got deprecated in favour of equivalent queries.
Additionally, random-access to filters is now replaced with approximations on
scorers. This commit
 - replaces the deprecated NumericRangeFilter, PrefixFilter, TermFilter and
   TermsFilter with NumericRangeQuery, PrefixQuery, TermQuery and TermsQuery,
   wrapped in a QueryWrapperFilter
 - replaces XBooleanFilter, AndFilter and OrFilter with a BooleanQuery in a
   QueryWrapperFilter
 - removes DocIdSets.isBroken: the new two-phase iteration API will now help
   execute slow filters efficiently
 - replaces FilterCachingPolicy with QueryCachingPolicy

Close #8960
2015-04-21 15:32:43 +02:00
Simon Willnauer 7ad138e17b [TEST] allow to read from lib/sigar 2015-04-20 18:15:51 +02:00
Robert Muir b09d236fc0 run tests with AssertingCodec to find bugs 2015-04-19 13:56:12 -04:00
Robert Muir 370819a98a Merge branch 'master' into mockfilesystem 2015-04-16 18:26:12 -04:00
Robert Muir 68267f4bb6 these leaks are plugged 2015-04-16 09:42:13 -04:00
Michael McCandless 399f0ccce9 Core: add only_ancient_segments to upgrade API, so only segments with an old Lucene version are upgraded
This option defaults to false, because it is also important to upgrade
the "merely old" segments since many Lucene improvements happen within
minor releases.

But you can pass true to do the minimal work necessary to upgrade to
the next major Elasticsearch release.

The HTTP GET upgrade request now also breaks out how many bytes of
ancient segments need upgrading.
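
A minimal sketch of how a client might build the request URL with the new option. The `_upgrade` path is an assumption inferred from the `indices.upgrade` spec name; the host, index name and helper function are hypothetical.

```python
from urllib.parse import urlencode


def upgrade_url(base, index=None, only_ancient_segments=False):
    """Build an upgrade URL; the endpoint path is an assumption, see above."""
    path = "/_upgrade" if index is None else "/" + index + "/_upgrade"
    # The option defaults to false, so it is only sent when set to true.
    query = urlencode({"only_ancient_segments": "true"}) if only_ancient_segments else ""
    return base.rstrip("/") + path + ("?" + query if query else "")


# Hypothetical host and index name, for illustration only.
print(upgrade_url("http://localhost:9200", "my_index", only_ancient_segments=True))
```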

Closes #10213

Closes #10540

Conflicts:
	dev-tools/create_bwc_index.py
	rest-api-spec/api/indices.upgrade.json
	src/main/java/org/elasticsearch/action/admin/indices/optimize/OptimizeRequest.java
	src/main/java/org/elasticsearch/action/admin/indices/optimize/ShardOptimizeRequest.java
	src/main/java/org/elasticsearch/action/admin/indices/optimize/TransportOptimizeAction.java
	src/main/java/org/elasticsearch/index/engine/InternalEngine.java
	src/test/java/org/elasticsearch/bwcompat/StaticIndexBackwardCompatibilityTest.java
	src/test/java/org/elasticsearch/index/engine/InternalEngineTests.java
	src/test/java/org/elasticsearch/rest/action/admin/indices/upgrade/UpgradeReallyOldIndexTest.java
2015-04-16 05:24:33 -04:00
Adrien Grand 563e704881 Mappings: Same code path for dynamic mappings updates and updates coming from the API.
We have two completely different code paths for mappings updates, depending on
whether they come from the API or are guessed based on the parsed documents.
This commit makes dynamic mappings updates execute like updates from the API.

The only change in behaviour is that a document that fails parsing can not
modify mappings anymore (useful to prevent issues such as #9851). Other than
that, this change should be fairly transparent to users but working this way
opens doors to other changes such as validating dynamic mappings updates on the
master node (#8688).

The way it works internally is that Mapper.parse now returns a Mapper instead
of being void. The returned Mapper represents a mapping update that has been
performed in order to parse the document. Mappings updates are propagated
recursively back to the root mapper, and once parsing is finished, we check
that the mappings update can be applied, and either fail the parsing if the
update cannot be merged (eg. because of a concurrent mapping update from the
API) or merge the update into the mappings.
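
The idea of returning an update instead of mutating mappings during parsing can be sketched with a toy model. This is not the Elasticsearch code: the class below, its type names and its conflict rule are simplified assumptions made only to illustrate the flow.

```python
class ObjectMapper:
    """Toy mapper: maps field names to a type name or a nested ObjectMapper."""

    def __init__(self, name, fields=None):
        self.name = name
        self.fields = dict(fields or {})

    def parse(self, doc):
        """Parse doc without mutating self; return a mapping update or None."""
        update = {}
        for key, value in doc.items():
            known = self.fields.get(key)
            if isinstance(value, dict):
                child = known if isinstance(known, ObjectMapper) else ObjectMapper(key)
                child_update = child.parse(value)
                if known is None:
                    # The whole sub-object is new to the mappings.
                    update[key] = child_update or ObjectMapper(key)
                elif child_update is not None:
                    # Propagate the child's update back towards the root.
                    update[key] = child_update
            elif known is None:
                update[key] = type(value).__name__
        return ObjectMapper(self.name, update) if update else None

    def merge(self, update):
        """Apply an update; raise if it conflicts with the current mappings."""
        for key, value in update.fields.items():
            existing = self.fields.get(key)
            if existing is None:
                self.fields[key] = value
            elif isinstance(existing, ObjectMapper) and isinstance(value, ObjectMapper):
                existing.merge(value)
            elif existing != value:
                raise ValueError("conflicting mapping for field " + key)
```

In this model a document that fails parsing never reaches `merge`, so it cannot change the mappings, and a conflicting update surfaces as an error at merge time.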

However not all mappings updates can be applied recursively, `copy_to` for
instance can add mappings at totally different places in the tree. Because of
it I added ParseContext.rootMapperUpdates which `copy_to` fills when the
field to copy data to does not exist in the mappings yet. These mappings
updates are merged from the ones generated by regular parsing.

One particular mapping update was the `auto_boost` setting on the `all` root
mapper. Being tricky to work on, I removed it in favour of search-time checks
that payloads have been indexed.

One interesting side-effect of the change is that concurrency on ObjectMapper
is greatly simplified since we do not have to care anymore about having
concurrent dynamic mappings and API updates.
2015-04-16 10:16:59 +02:00
Robert Muir e5a699fa05 cutover to lucenetestcase 2015-04-16 00:58:02 -04:00
Robert Muir 6ac4d6daef contain filesystem access 2015-04-15 18:23:30 -04:00
Ryan Ernst a3f078985b Tests: Forbid tests from writing to CWD
Allowing tests to write to the working directory can mask problems.
For example, multiple tests running in the same jvm, and using the
same relative path, may cause issues if the first test to run
leaves data in the directory, and the second test does not remember
to cleanup the path before using it.

This change adds security manager rules to disallow tests writing
to the working directory. Instead, tests create a temp dir with
the existing test framework.

closes #10605
2015-04-15 12:45:20 -07:00
Simon Willnauer 67b48da15f [BUILD] Fix m2.repository path permission in tests.policy 2015-04-14 10:40:31 +02:00
Simon Willnauer fe411a9295 [BUILD] Restrict read permission to project.basedir/target if security manager is used 2015-04-14 09:35:40 +02:00
Simon Willnauer c13e604697 [BUILD] Restrict read permission to project.basedir
This prevents reads from anywhere outside of the elasticsearch
clone when running tests with security manager enabled.
2015-04-13 16:44:31 +02:00
Robert Muir b936ec9a25 allow reflection of MXBean for file descriptor stats 2015-04-10 11:28:30 -04:00
Ryan Ernst e575c4ce53 Tests: Add --all flag to create-bwc script to regenerate all indexes
It is currently a pain to regenerate all the bwc indexes when making
a change to the generation script.  This makes it a single command.
2015-04-07 08:32:34 -07:00
Ryan Ernst c3011cead4 Tests: Revamp static bwc test framework to use dangling indexes
The static old index tests currently take a long time to run because
each index version essentially recreates the cluster, and spins up
new nodes.  This PR instead loads each old version into the existing
cluster as a dangling index. It also removes the intermediate
"StaticIndexBackwardCompatibilityTest" which was an extra layer
with no purpose, and moves a shared version of a commonly found
function to get an http client.

The test now takes between 40 and 60 seconds for me. I also ran it
"under stress" by running all ES tests in one shell, while
simultaneously running 10 iterations of the old index tests. Each
iteration took on average about 90 seconds, which is much better
than the 20+ minutes we see in master on jenkins.

closes #10247
2015-04-03 23:21:55 -07:00
Adrien Grand 1401075070 Tests: Speed up backward-compatibility tests for 1.1.0
1.1.0 is affected by #5817 which prevents merges from keeping up with the
indexing rate. As a consequence it generates lots of segments and makes bw
compat tests slow. So I added a special case for this version to index fewer
documents.
2015-04-02 19:11:12 +02:00
Adrien Grand 08f93cf33f Add doc values support to boolean fields.
This pull request makes boolean handled like dates and ipv4 addresses: things
are stored as numerics under the hood and aggregations add some special
formatting logic in order to return true/false in addition to 1/0.

For example, here is an output of a terms aggregation on a boolean field:
```
   "aggregations": {
      "top_f": {
         "doc_count_error_upper_bound": 0,
         "buckets": [
            {
               "key": 0,
               "key_as_string": "false",
               "doc_count": 2
            },
            {
               "key": 1,
               "key_as_string": "true",
               "doc_count": 1
            }
         ]
      }
   }
```

Sorted numeric doc values are used under the hood.
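
The formatting step can be pictured with a small sketch that turns numeric counts into buckets shaped like the output above. The function is hypothetical, and the ordering by descending count is an assumption chosen to match the sample.

```python
def format_boolean_buckets(counts):
    """counts maps the stored numeric value (0 or 1) to a doc count."""
    buckets = []
    # Assumed ordering: highest doc count first, ties broken by key.
    for key in sorted(counts, key=lambda k: (-counts[k], k)):
        buckets.append({
            "key": key,
            "key_as_string": "true" if key == 1 else "false",
            "doc_count": counts[key],
        })
    return buckets


print(format_boolean_buckets({1: 1, 0: 2}))
```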

Close #4678
Close #7851
2015-04-02 15:40:46 +02:00
javanna d9628649a2 [TEST] remove needless script settings from create-bwc-index script
dynamic scripts are not needed here, can be disabled.
2015-03-26 19:56:55 +01:00
javanna f4592a17e3 [TEST] remove needless script settings from upgrade tests script
dynamic scripts are not needed in our upgrade tests, can be removed.
2015-03-26 19:56:55 +01:00
javanna d9d1e6a67a Scripting: add support for fine-grained settings
Allow scripts to be turned on or off based on their source (where they get loaded from), the operation that executes them, and their language.

The settings cover the following combinations:

- mode: on, off, sandbox
- source: indexed, dynamic, file
- engine: groovy, expressions, mustache, etc
- operation: update, search, aggs, mapping

The following settings are supported for every engine:

script.engine.groovy.indexed.update:    sandbox/on/off
script.engine.groovy.indexed.search:    sandbox/on/off
script.engine.groovy.indexed.aggs:      sandbox/on/off
script.engine.groovy.indexed.mapping:   sandbox/on/off
script.engine.groovy.dynamic.update:    sandbox/on/off
script.engine.groovy.dynamic.search:    sandbox/on/off
script.engine.groovy.dynamic.aggs:      sandbox/on/off
script.engine.groovy.dynamic.mapping:   sandbox/on/off
script.engine.groovy.file.update:       sandbox/on/off
script.engine.groovy.file.search:       sandbox/on/off
script.engine.groovy.file.aggs:         sandbox/on/off
script.engine.groovy.file.mapping:      sandbox/on/off

For ease of use, the following more generic settings are supported too:

script.indexed: sandbox/on/off
script.dynamic: sandbox/on/off
script.file:    sandbox/on/off

script.update:  sandbox/on/off
script.search:  sandbox/on/off
script.aggs:    sandbox/on/off
script.mapping: sandbox/on/off

These will be used to calculate the more specific settings, using the stricter setting of each combination. Operation based settings have precedence over conflicting source based ones.

Note that the `mustache` engine is affected by generic settings applied to any language, while native scripts aren't as they are static by definition.

Also, the previous `script.disable_dynamic` setting can now be deprecated.
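
To make the key layout concrete, here is a sketch that enumerates the engine-specific keys and resolves one combination. The lookup order (explicit engine-specific key first, then the operation-based generic key, then the source-based one) is one possible reading of the rules above and is an assumption, as are the function names; the "stricter setting" rule is not modeled.

```python
from itertools import product

SOURCES = ("indexed", "dynamic", "file")
OPERATIONS = ("update", "search", "aggs", "mapping")


def specific_keys(engine):
    """List every engine-specific setting key for one engine."""
    return ["script.engine.%s.%s.%s" % (engine, source, operation)
            for source, operation in product(SOURCES, OPERATIONS)]


def resolve(settings, engine, source, operation):
    """Return the configured mode for one combination, or None if unset.

    Assumed precedence: engine-specific key, then operation-based generic
    key, then source-based generic key.
    """
    candidates = (
        "script.engine.%s.%s.%s" % (engine, source, operation),
        "script." + operation,
        "script." + source,
    )
    for key in candidates:
        if key in settings:
            return settings[key]
    return None
```

For example, with `{"script.dynamic": "off", "script.search": "on"}` the sketch resolves a dynamic groovy search script to `on`, because the operation-based setting takes precedence over the source-based one.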

Closes #6418
Closes #10116
Closes #10274
2015-03-26 19:56:55 +01:00
Michael McCandless 442f539802 Tests: improve back compat tests by adding delete-by-query in the transaction log on upgrade
Closes #10266
2015-03-26 10:12:22 -04:00
Ryan Ernst 66669c337b Tests: Add static bwc index for 1.5.0 2015-03-23 08:04:05 -07:00
Martijn van Groningen 6a2a4bf28d change elasticsearch.org into elastic.co 2015-03-23 15:13:28 +01:00
Martijn van Groningen 5b3c0143c8 Disable marvel as it may fail the tests, because it creates indices. 2015-03-23 15:13:28 +01:00
Martijn van Groningen 911f522a0e change url to use elastic organization 2015-03-23 13:13:26 +01:00
javanna 65b9a94baa Benchmark api: removed leftovers
Closes #10182
Closes #10184
2015-03-21 10:36:05 +01:00
Clinton Gormley ef5bf9c21b Changed the release notes script to use the category labels instead of title prefix 2015-03-19 16:52:47 +01:00
Robert Muir 3a0700862a regenerate 0.90.x indexes without completion suggester 2015-03-18 14:29:41 -04:00
Robert Muir 087b107dd2 Create bigger back compat indexes. 2015-03-18 14:16:16 -04:00
Britta Weber fc5dcf189c [release script] Check for //NORELEASE in code before release
Lines in the code that should be removed before a release can be annotated with
//NORELEASE . This can be useful when debugging test failures. For example,
one might want to add additional logging that would be too verbose for production
and therefore should be removed before releasing.
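
A minimal sketch of such a check, not the actual release script. The function name and the restriction to `.java` files are assumptions, and it matches the bare word NORELEASE so that `// NORELEASE` with a space is also caught.

```python
import os

MARKER = "NORELEASE"


def find_norelease(root, extensions=(".java",)):
    """Return (path, line number, line) for every marked line under root."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if MARKER in line:
                        hits.append((path, lineno, line.strip()))
    return hits
```

A release script could abort when the returned list is not empty and print each hit so the annotated lines can be removed.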

closes #10141
2015-03-18 09:59:28 -07:00
Simon Willnauer 8e58c0b7a8 Add license header to signature files 2015-03-16 14:19:08 -07:00
Simon Willnauer 1f712842f7 [TEST] Ban @Seed from test 2015-03-16 14:17:51 -07:00
Tanguy Leroux c457499cb2 [Native] Use direct mapping call in Kernel32Library
This commit modifies the Kernel32Library to use direct mapping instead of a proxy class when doing native calls on Windows platforms. It also adds the "createSecurityManager" permission to the tests.policy file, and adds unit tests that should have failed when the Java security manager is enabled.

Closes #9802
2015-03-02 09:48:18 +01:00
Ryan Ernst 7181bbde26 Mappings: Remove _boost field
This has been deprecated since 1.0.0.RC1. It is finally removed here.

closes #8875
2015-02-26 15:07:07 -08:00
Robert Muir b7e49f11ed fix comment 2015-02-25 15:32:45 -05:00