Commit Graph

174 Commits

Author SHA1 Message Date
Michael McCandless 275fdcc08d fix silly smoke_test_plugin bugs when ES actually succeeds in starting with the installed plugins 2015-07-03 05:59:16 -04:00
Michael McCandless 48b85421ec fix smoke_test_plugins.py: install the .zip under releases, not the .jar 2015-06-30 15:03:54 -04:00
Michael McCandless 189ef91e3c first cut 2015-06-30 14:14:45 -04:00
Simon Willnauer e7eb9cf4de Ban java serialization
We had several problems with Java Serializatin in the past. At some point
in the Java 1.7.x series JDKs where not compatible anymore when java
serialization (ObjectStream) was used to exchange objects. In elasticsearch
we used this to serialize exceptions across the wire which caused several problems
with incompatible JDKs. While causing lot of trouble this essentially prevented
users from moving forward and upgrade their JVMs. To prevent these kind of issues
this commit removes the dependency on java serialization entirely and bans the
usage of ObjectOutputStream and ObjectInputStream entirely.

Yet, we can't fully serialize all exception anymore such that this commit
is best effort and adds hand written serialization to all elasticsearch exceptions
as well to a selected set of JDK and Lucene exceptions. (see StreamOutput#writeThrowable /
StreamInput.readThrowable). Stacktraces should be preserved for all exceptions while
several names might be replaced with ElasticsearchException if there is no mapping for
the given exception.
2015-06-30 14:51:43 +02:00
Alexander Reelsen f26672c184 Release: Build two RPMS, signed and unsigned
In order to support older RPM based distributions like CentOS5,
we should have one RPM available, which is not signed.

This commit creates an unsigned RPM first, then moves it over to
target/releases during the build, then builds a signed RPM.

The unsigned one is uploaded via S3, where as the signed one is
used for the repositories.

In addition, you can now build an RPM without having to specify
any gpg credentials due to offloading this into a maven profile
that is only activated when specifying `rpm.sign` property.

Closes #11587
2015-06-30 14:22:20 +02:00
Clinton Gormley fa40680736 Build: If SHA files have changed, explain how to update them in the license check exception 2015-06-30 11:29:35 +02:00
Simon Willnauer 91ab808ce2 Revert accidential modification of release script in this files previous commit
commit 05db5dc2c8
Author: Simon Willnauer <simonw@apache.org>
Date:   Fri Jun 5 13:12:05 2015 +0200

    create parent pom project from its original location
2015-06-25 17:18:34 +02:00
Clinton Gormley a3d1a50865 Build: tar on linux needs the --wildcard option, but not supported on OSX
Removing '*.jar'  filter when untarring during the license check
2015-06-23 14:07:10 +02:00
Clinton Gormley 9fb3bf06c5 Changed the license checker to use the ZIP file as the source of JARs to check.
Also checks that the tar.gz file (if present) contains the same JARs as the
ZIP file.
2015-06-23 12:50:31 +02:00
Simon Willnauer 1b2a3d0af6 Add @Repeat to forbidden APIs
@Repeat should not be committed just like @Seed.
Use -Pdev to run annotated methods.
2015-06-18 20:34:02 +02:00
Ryan Ernst 9157a11047 Build: Add Iterators.emptyIterator to forbidden apis
As a follow up to #11741, this forbids Iterators.emptyIterator in
favor the of builtin Collections.emptyIterator.
2015-06-18 10:12:58 -07:00
Simon Willnauer 2a63249441 Add DateTime ctors without timezone to forbidden APIs
Using DateTime with default timezone is asking for trouble and should
be added to forbidden APIs
2015-06-18 10:43:45 +02:00
Clinton Gormley 05d512f417 Packaging: Add LICENSE, NOTICE, and sha1 files and tests for all core dependencies
Added a licenses/ directory to core which contains a sha1 file for each JAR
dependency, and one or more LICENSE files and one NOTICE file for each
project.

Also adds dev-tools/src/main/resources/license-check/check_license_and_sha.pl
which checks that the licenses/ dir is up to date during a mvn verify,
and which can be used to update the sha1 files when upgrading dependencies.

Closes #2794
Closes #10684
Closes #11705
2015-06-17 18:06:00 +02:00
Alexander Reelsen a54d4e4aa8 Versioning: Adding 1.6.1 development version & 1.6.0 bwc index 2015-06-09 16:30:02 +02:00
Simon Willnauer 05db5dc2c8 create parent pom project from its original location 2015-06-05 13:12:05 +02:00
Michael McCandless e1197dfea9 Merge branch 'master' into require_units
Conflicts:
	src/main/java/org/elasticsearch/action/bulk/BulkRequest.java
	src/main/java/org/elasticsearch/cluster/metadata/MetaDataIndexUpgradeService.java
	src/main/java/org/elasticsearch/node/internal/InternalSettingsPreparer.java
	src/test/java/org/elasticsearch/snapshots/DedicatedClusterSnapshotRestoreTests.java
2015-06-04 13:59:10 -04:00
Michael McCandless 68d6427944 add missing units to index settings if index was created before 2.0 2015-05-30 04:39:03 -04:00
Clinton Gormley bd381b86ca Release notes: Add HTML header with UTF-8 encoding 2015-05-29 21:02:38 +02:00
Alexander Reelsen b465e19e5f Release script: Always check for valid environment
In order to be sure that a release can be executed on the local machine,
the build_release script now checks for environment variables and tries
to execute a  couple of commands.

In order to easily check for a correctly setup environment, you can
run the following commands, which exits early and does not trigger a
release process.

```
python3 dev-tools/build_release.py --check-only
```
2015-05-26 14:43:29 +02:00
Robert Muir b462fd712a factor out static analysis 2015-05-23 01:33:37 -04:00
Robert Muir 5330e3423f remove build duplication 2015-05-22 23:23:59 -04:00
Igor Motov 21ed6bb90c Core: Don't allow indices containing too-old segments to be opened
When index is introduced into the cluster via cluster upgrade, restore or as a dangled index the MetaDataIndexUpgradeService checks if this index can be upgraded to the current version. If upgrade is not possible, the newly upgraded cluster startup and restore process are aborted, the dangled index is imported as a closed index that cannot be open.

Closes #10215
2015-05-19 23:37:05 -04:00
Adrien Grand 4131bcbec7 Search: Make FilteredQuery a forbidden API.
This commit makes FilteredQuery a forbidden API and also removes some more usage
of the Filter API. There are some remaining code using filters for parent/child
queries but I'm not touching this as they are already being refactored in #6511.
2015-05-19 15:33:43 +02:00
Robert Muir 38cccfb057 cleanup and ban temp files going to jvm default location 2015-05-08 15:08:13 -04:00
Robert Muir 51c71c235b Ban PathUtils.get (for now, until we fix the two remaining issues) 2015-05-08 14:42:27 -04:00
Adrien Grand b72f27a410 Core: Cut over to the Lucene filter cache.
This removes Elasticsearch's filter cache and uses Lucene's instead. It has some
implications:
 - custom cache keys (`_cache_key`) are unsupported
 - decisions are made internally and can't be overridden by users ('_cache`)
 - not only filters can be cached but also all queries that do not need scores
 - parent/child queries can now be cached, however cached entries are only
   valid for the current top-level reader so in practice it will likely only
   be used on read-only indices
 - the cache deduplicates filters, which plays nicer with large keys (eg. `terms`)
 - better stats: we already had ram usage and evictions, but now also hit count,
   miss count, lookup count, number of cached doc id sets and current number of
   doc id sets in the cache
 - dynamically changing the filter cache size is not supported anymore

Internally, an important change is that it removes the NoCacheFilter infrastructure
in favour of making Query.rewrite specializing the query for the current reader so
that it will only be cached on this reader (look for IndexCacheableQuery).

Note that consuming filters with the query API (createWeight/scorer) instead of
the filter API (getDocIdSet) is important for parent/child queries because
otherwise a QueryWrapperFilter(ParentQuery) would run the wrapped query per
segment while relations might be cross segments.
2015-05-04 09:02:15 +02:00
Alexander Reelsen b25259532e Release: Fix build repositories script
Minor issue with specifying the correct version when starting the package release script.
Another issue fixed to make sure that the S3 bucket parameters act the same.
2015-04-28 10:04:30 +02:00
Alexander Reelsen 924479369f Release script: Fix wrong argument for string formatting 2015-04-27 11:09:02 +02:00
Alexander Reelsen f64739788b Build: Update package repositories when creating a release
In order to automatically sign and and upload our debian and RPM
packages, this commit incorporates signing into the build process
and adds the necessary steps to the release process. In order to do this
the pom.xml has been adapted and the RPM and jdeb maven plugins have been
updated, so the packages are signed on build. However the repositories
need to signed as well.

Syncing the repos requires downloading the current repo, adding
the new packages and syncing it back.

The following environment variables are now required as part of the build

* GPG_KEY_ID - the key ID of the key used for signing
* GPG_PASSPHRASE - your GPG passphrase
* S3_BUCKET_SYNC_TO: S3 bucket to sync new repo into

The following environment variables are optional

* S3_BUCKET_SYNC_FROM: S3 bucket to get existing packages from
* GPG_KEYRING - home of gnupg, defaults to ~/.gnupg

The following command line tools are needed

* createrepo (creates RPM repositories)
* expect (used by the maven rpm plugin)
* apt-ftparchive (creates DEB repositories)
* gpg (signs packages and repo files)
* s3cmd (syncing between the different S3 buckets)

The current approach would also work for users who want to run their
own repositories, all they need to change are a couple of environment
variables.

Minor implementation detail: Right now the branch name is used as version
for the repositories (like 1.4/1.5/1.6) - if we ever change our branch naming
scheme, the script needs to be fixed.
2015-04-26 19:05:47 +02:00
Robert Muir 270cb9f349 enable securitymanager 2015-04-22 03:04:50 -04:00
Robert Muir 69718916df actually remove this line rather than comment it out. tsts pass 2015-04-21 19:04:56 -04:00
Robert Muir 9d6b1382e7 Fix JVM isolation in tests.
Currently security manager would allow for one JVM to muck
with the files (read, write, AND delete) of another JVM.

This is unnecessary.
2015-04-21 19:02:14 -04:00
Adrien Grand d7abb12100 Replace deprecated filters with equivalent queries.
In Lucene 5.1 lots of filters got deprecated in favour of equivalent queries.
Additionally, random-access to filters is now replaced with approximations on
scorers. This commit
 - replaces the deprecated NumericRangeFilter, PrefixFilter, TermFilter and
   TermsFilter with NumericRangeQuery, PrefixQuery, TermQuery and TermsQuery,
   wrapped in a QueryWrapperFilter
 - replaces XBooleanFilter, AndFilter and OrFilter with a BooleanQuery in a
   QueryWrapperFilter
 - removes DocIdSets.isBroken: the new two-phase iteration API will now help
   execute slow filters efficiently
 - replaces FilterCachingPolicy with QueryCachingPolicy

Close #8960
2015-04-21 15:32:43 +02:00
Simon Willnauer 7ad138e17b [TEST] allow to read from lig/sigar 2015-04-20 18:15:51 +02:00
Robert Muir b09d236fc0 run tests with AssertingCodec to find bugs 2015-04-19 13:56:12 -04:00
Robert Muir 370819a98a Merge branch 'master' into mockfilesystem 2015-04-16 18:26:12 -04:00
Robert Muir 68267f4bb6 these leaks are plugged 2015-04-16 09:42:13 -04:00
Michael McCandless 399f0ccce9 Core: add only_ancient_segments to upgrade API, so only segments with an old Lucene version are upgraded
This option defaults to false, because it is also important to upgrade
the "merely old" segments since many Lucene improvements happen within
minor releases.

But you can pass true to do the minimal work necessary to upgrade to
the next major Elasticsearch release.

The HTTP GET upgrade request now also breaks out how many bytes of
ancient segments need upgrading.

Closes #10213

Closes #10540

Conflicts:
	dev-tools/create_bwc_index.py
	rest-api-spec/api/indices.upgrade.json
	src/main/java/org/elasticsearch/action/admin/indices/optimize/OptimizeRequest.java
	src/main/java/org/elasticsearch/action/admin/indices/optimize/ShardOptimizeRequest.java
	src/main/java/org/elasticsearch/action/admin/indices/optimize/TransportOptimizeAction.java
	src/main/java/org/elasticsearch/index/engine/InternalEngine.java
	src/test/java/org/elasticsearch/bwcompat/StaticIndexBackwardCompatibilityTest.java
	src/test/java/org/elasticsearch/index/engine/InternalEngineTests.java
	src/test/java/org/elasticsearch/rest/action/admin/indices/upgrade/UpgradeReallyOldIndexTest.java
2015-04-16 05:24:33 -04:00
Adrien Grand 563e704881 Mappings: Same code path for dynamic mappings updates and updates coming from the API.
We have two completely different code paths for mappings updates, depending on
whether they come from the API or are guessed based on the parsed documents.
This commit makes dynamic mappings updates execute like updates from the API.

The only change in behaviour is that a document that fails parsing can not
modify mappings anymore (useful to prevent issues such as #9851). Other than
that, this change should be fairly transparent to users but working this way
opens doors to other changes such as validating dynamic mappings updates on the
master node (#8688).

The way it works internally is that Mapper.parse now returns a Mapper instead
of being void. The returned Mapper represents a mapping update that has been
performed in order to parse the document. Mappings updates are propagated
recursively back to the root mapper, and once parsing is finished, we check
that the mappings update can be applied, and either fail the parsing if the
update cannot be merged (eg. because of a concurrent mapping update from the
API) or merge the update into the mappings.

However not all mappings updates can be applied recursively, `copy_to` for
instance can add mappings at totally different places in the tree. Because of
it I added ParseContext.rootMapperUpdates which `copy_to` fills when the
field to copy data to does not exist in the mappings yet. These mappings
updates are merged from the ones generated by regular parsing.

One particular mapping update was the `auto_boost` setting on the `all` root
mapper. Being tricky to work on, I removed it in favour of search-time checks
that payloads have been indexed.

One interesting side-effect of the change is that concurrency on ObjectMapper
is greatly simplified since we do not have to care anymore about having
concurrent dynamic mappings and API updates.
2015-04-16 10:16:59 +02:00
Robert Muir e5a699fa05 cutover to lucenetestcase 2015-04-16 00:58:02 -04:00
Robert Muir 6ac4d6daef contain filesystem access 2015-04-15 18:23:30 -04:00
Ryan Ernst a3f078985b Tests: Forbid tests from writing to CWD
Allowing tests writing to the working directory can mask problems.
For example, multiple tests running in the same jvm, and using the
same relative path, may cause issues if the first test to run
leaves data in the directory, and the second test does not remember
to cleanup the path before using it.

This change adds security manager rules to disallow tests writing
to the working directory. Instead, tests create a temp dir with
the existing test framework.

closes #10605
2015-04-15 12:45:20 -07:00
Simon Willnauer 67b48da15f [BUILD] Fix m2.repository path permission in tests.policy 2015-04-14 10:40:31 +02:00
Simon Willnauer fe411a9295 [BUILD] Restrict read permission to project.basedir/target if security manager is used 2015-04-14 09:35:40 +02:00
Simon Willnauer c13e604697 [BUILD] Restrict read permission to project.basedir
This prevents reads from anywhere outside of the elasticsearch
clone when running tests with security manager enabled.
2015-04-13 16:44:31 +02:00
Robert Muir b936ec9a25 allow reflection of MXBean for file descriptor stats 2015-04-10 11:28:30 -04:00
Ryan Ernst e575c4ce53 Tests: Add --all flag to create-bwc script to regenerate all indexes
It is currently a pain to regenerate all the bwc indexes when making
a change to the generation script.  This makes it a single command.
2015-04-07 08:32:34 -07:00
Ryan Ernst c3011cead4 Tests: Revamp static bwc test framework to use dangling indexes
The static old index tests currently take a long time to run because
each index version essentially recreates the cluster, and spins up
new nodes.  This PR instead loads each old version into the existing
cluster as a dangling index. It also removes the intermediate
"StaticIndexBackwardCompatibilityTest" which was an extra layer
with no purpose, and moves a shared version of a commonly found
function to get an http client.

The test now takes between 40 and 60 seconds for me. I also ran it
"under stress" by running all ES tests in one shell, while
simultaneously running 10 iterations of the old index tests. Each
iteration took on average about 90 seconds, which is much better
than the 20+ minutes we see in master on jenkins.

closes #10247
2015-04-03 23:21:55 -07:00
Adrien Grand 1401075070 Tests: Speed up backward-compatibility tests for 1.1.0
1.1.0 is affected by #5817 which prevents merges from keeping up with the
indexing rate. As a consequence it generates lots of segments and makes bw
compat tests slow. So I added a special case for this version to index fewer
documents.
2015-04-02 19:11:12 +02:00
Adrien Grand 08f93cf33f Add doc values support to boolean fields.
This pull request makes boolean handled like dates and ipv4 addresses: things
are stored as as numerics under the hood and aggregations add some special
formatting logic in order to return true/false in addition to 1/0.

For example, here is an output of a terms aggregation on a boolean field:
```
   "aggregations": {
      "top_f": {
         "doc_count_error_upper_bound": 0,
         "buckets": [
            {
               "key": 0,
               "key_as_string": "false",
               "doc_count": 2
            },
            {
               "key": 1,
               "key_as_string": "true",
               "doc_count": 1
            }
         ]
      }
   }
```

Sorted numeric doc values are used under the hood.

Close #4678
Close #7851
2015-04-02 15:40:46 +02:00