OpenSearch

Commit Graph

Author	SHA1	Message	Date
Tim Brooks	21838d73b5	Extract message serialization from `TcpTransport` (#37034 ) This commit introduces a NetworkMessage class. This class has two subclasses - InboundMessage and OutboundMessage. These messages can be serialized and deserialized independent of the transport. This allows more granular testing. Additionally, the serialization mechanism is now a simple Supplier. This builds the framework to eventually move the serialization of transport messages to the network thread. This is the one serialization component that is not currently performed on the network thread (transport deserialization and http serialization and deserialization are all on the network thread).	2019-01-21 14:14:18 -07:00
Tim Brooks	f516d68fb2	Share `NioGroup` between http and transport impls (#37396 ) Currently we create dedicated network threads for both the http and transport implementations. Since these threads should never perform blocking operations, these threads could be shared. This commit modifies the nio-transport to have 0 http workers by default. If the default configs are used, this will cause the http transport to be run on the transport worker threads. The http worker setting will still exist in case the user would like to configure dedicated workers. Additionally, this commit deletes dedicated acceptor threads. We have never had these for the netty transport and they can be added back if a need is determined in the future.	2019-01-21 13:50:56 -07:00
Tim Vernum	6d99e790b3	Add SSL Configuration Library (#37287 ) This introduces a new ssl-config library that can parse and validate SSL/TLS settings and files. It supports the standard configuration settings as used in the Elastic Stack such as "ssl.verification_mode" and "ssl.certificate_authorities" as well as all file formats used in other parts of Elasticsearch security (such as PEM, JKS, PKCS#12, PKCS#8, et al).	2019-01-16 21:52:17 +11:00
Igor Motov	6f91f06d86	Geo: Adds a set of no dependency geo classes for JDBC driver (#36477 ) Adds a set of geo classes to represent geo data in the JDBC driver and to be used as an intermediate format to pass geo shapes for indexing and query generation in #35320. Relates to #35767 and #35320	2019-01-15 10:52:46 -05:00
Tim Brooks	9de62f1262	Increase IO direct byte buffers to 256KB (#37283 ) Currently we read and write 64KB at a time in the nio libraries. As a single byte buffer per event loop thread does not consume much memory, there is little reason to not increase it further. This commit increases the buffer to 256KB but still limits a single write to 64KB. The write limit could be increased, but too high of a write limit will lead to copying more data (if all the data is not flushed and needs to be copied on the next call). This is something to explore in the future.	2019-01-10 09:17:20 -07:00
Tim Brooks	cfa58a51af	Add TLS/SSL channel close timeouts (#37246 ) Closing a channel using TLS/SSL requires reading and writing a CLOSE_NOTIFY message (for pre-1.3 TLS versions). Many implementations do not actually send the CLOSE_NOTIFY message, which means we are depending on the TCP close from the other side to ensure channels are closed. In case there is an issue with this, we need a timeout. This commit adds a timeout to the channel close process for TLS secured channels. As part of this change, we need a timer service. We could use the generic Elasticsearch timeout threadpool. However, it would be nice to have a local to the nio event loop timer service dedicated to network needs. In the future this service could support read timeouts, connect timeouts, request timeouts, etc. This commit adds a basic priority queue backed service. Since our timeout volume (channel closes) is very low, this should be fine. However, this can be updated to something more efficient in the future if needed (timer wheel). Everything being local to the event loop thread makes the logic simple as no locking or synchronization is necessary.	2019-01-09 11:46:24 -07:00
Alpar Torok	6344e9a3ce	Testing conventions: add support for checking base classes (#36650 )	2019-01-08 13:39:03 +02:00
Alpar Torok	a7c3d5842a	Split third party audit exclusions by type (#36763 )	2019-01-07 17:24:19 +02:00
Alpar Torok	e9ef5bdce8	Converting randomized testing to create a separate unitTest task instead of replacing the builtin test task (#36311 ) - Create a separate unitTest task instead of Gradle's built in - convert all configuration to use the new task - the built in task is now disabled	2018-12-19 08:25:20 +02:00
Tim Brooks	e63d52af63	Move page size constants to PageCacheRecycler (#36524 ) `PageCacheRecycler` is the class that creates and holds pages of arrays for various uses. `BigArrays` is just one user of these pages. This commit moves the constants that define the page sizes for the recycler to be on the recycler class.	2018-12-12 07:00:50 -07:00
Tim Brooks	373c67dd7a	Add DirectByteBuffer strategy for transport-nio (#36289 ) This is related to #27260. In Elasticsearch all of the messages that we serialize to write to the network are composed of heap bytes. When you read or write to a nio socket in java, the heap memory you passed down must be copied to/from direct memory. The JVM internally does some buffering of the direct memory, however it is essentially unbounded. This commit introduces a simple mechanism of buffering and copying the memory in transport-nio. Each network event loop is given a 64KB DirectByteBuffer. When we go to read we use this buffer and copy the data after the read. Additionally, when we go to write, we copy the data to the direct memory before calling write. 64KB is chosen as this is the default receive buffer size we use for transport-netty4 (NETTY_RECEIVE_PREDICTOR_SIZE). Since we only have one buffer per thread, we could afford larger. However, if the buffer is large and not all of the data is flushed in a write call, we will do excess copies. This is something we can explore in the future.	2018-12-06 18:09:07 -07:00
Jim Ferenczi	18866c4c0b	Make hits.total an object in the search response (#35849 ) This commit changes the format of the `hits.total` in the search response to be an object with a `value` and a `relation`. The `value` indicates the number of hits that match the query and the `relation` indicates whether the number is accurate (in which case the relation is equal to `eq`) or a lower bound of the total (in which case it is equal to `gte`). This change also adds a parameter called `rest_total_hits_as_int` that can be used in the search APIs to opt out from this change (retrieve the total hits as a number in the rest response). Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain `hits.total` (`track_total_hits: false`). We'll add a way to get a lower bound of the total hits in a follow up (to allow numbers to be passed to `track_total_hits`). Relates #33028	2018-12-05 19:49:06 +01:00
Tim Brooks	b6ed6ef189	Add sni name to SSLEngine in nio transport (#35920 ) This commit is related to #32517. It allows an "sni_server_name" attribute on a DiscoveryNode to be propagated to the server using the TLS SNI extension. Prior to this commit, this functionality was only supported for the netty transport. This commit adds this functionality to the security nio transport.	2018-11-27 09:06:52 -07:00
John	0baffda390	ingest: grok remove duplicated patterns (#35886 ) This commit removes the redundant (and incorrect) JAVACLASS and JAVAFILE grok patterns. This helps to keep parity with Logstash's patterns. See also: https://github.com/logstash-plugins/logstash-patterns-core/pull/237 closes #35699	2018-11-26 11:13:46 -06:00
Igor Motov	39789d0a10	GEO: More robust handling of ignore_malformed in geoshape parsing (#35603 ) Adds an XContent sub parser class that can wrap another XContent parser at the beginning of an object and allow skipping all children in case of a parsing failure. It also uses this subparser to ignore the rest of the GeoJson shape if the parsing fails and we need to ignore the geoshape due to the ignore_malformed flag. Supersedes #34498 Closes #34047	2018-11-21 11:04:01 -10:00
Simon Willnauer	0cc0fd2d15	Add a frozen engine implementation (#34357 ) This change adds a `frozen` engine that can lazily open a directory reader on a read-only shard. The engine wraps general purpose searchers in a LazyDirectoryReader that also allows releasing and resetting the underlying index readers after any and before secondary search phases. Relates to #34352	2018-11-07 20:23:35 +01:00
Alan Woodward	e2af849f70	Move ObjectPath and XContentUtils to libs/x-content (#34803 ) These are generally useful utility classes that do not need to live in the Watcher code	2018-11-02 15:12:09 +00:00
Nik Everett	3cde1356c1	XContent: Check for bad parsers (#34561 ) Adds checks for misbehaving parsers. The checks aren't perfect at all but they are simple and fast enough that we can do them all the time so they'll catch most badly behaving parsers. Closes #34351	2018-10-25 17:03:42 -04:00
Jay Modi	d824cbe992	Test: ensure char[] doesn't begin with prefix (#34816 ) The testCharsBeginsWith test has a check that a random prefix of length 2 is not the prefix of a char[]. However, there is no check that the char[] is not randomly generated with the same two characters as the prefix. This change ensures that the char[] does not begin with the prefix. Closes #34765	2018-10-25 08:58:21 -06:00
Julie Tibshirani	5a4866f67d	Mute CharArraysTests#testCharsBeginsWith while we await a fix.	2018-10-23 11:37:54 -07:00
Alpar Torok	0536635c44	Upgrade forbiddenapis to 2.6 (#33809 ) * Upgrade forbiddenapis to 2.6 Closes #33759 * Switch forbiddenApis back to official plugin * Remove CLI based task * Fix forbiddenApisJava9	2018-10-23 12:06:46 +03:00
Daniel Mitterdorfer	dbb6fe58fa	Remove hand-coded XContent duplicate checks With this commit we cleanup hand-coded duplicate checks in XContent parsing. They were necessary previously but since we reconfigured the underlying parser in #22073 and #22225, these checks are obsolete and were also ineffective unless an undocumented system property has been set. As we also remove this escape hatch, we can remove the additional checks as well. Closes #22253 Relates #34588	2018-10-19 10:13:13 +02:00
Daniel Mitterdorfer	92b2e1a209	Remove lenient boolean handling With this commit we remove some leftovers from #26389 which cleaned up lenient boolean handling. Relates #26389 Relates #22298 Relates #34467	2018-10-16 06:30:00 +02:00
Mayya Sharipova	80c5d30f30	XContentBuilder to handle BigInteger and BigDecimal (#32888 ) Although we allow indexing BigInteger and BigDecimal into a keyword field, source filtering on these fields would fail as XContentBuilder was not able to serialize BigInteger and BigDecimal to json. This modifies XContentBuilder to handle BigInteger and BigDecimal. Closes #32395	2018-09-26 14:24:31 -04:00
Christoph Büscher	ba3ceeaccf	Clean up "unused variable" warnings (#31876 ) This change cleans up "unused variable" warnings. There are several cases where we most likely want to suppress the warnings (especially in the client documentation test where the snippets contain many unused variables). In a lot of cases the unused variables can just be deleted though.	2018-09-26 14:09:32 +02:00
Vladimir Dolzhenko	a3e8b831ee	add elasticsearch-shard tool (#32281 ) Relates #31389	2018-09-19 10:28:22 +02:00
Simon Willnauer	c783488e97	Add `_source`-only snapshot repository (#32844 ) This change adds a `_source` only snapshot repository that allows wrapping any existing repository as a _backend_ to snapshot only the `_source` part including live docs markers. Snapshots taken with the `source` repository won't include any indices, doc-values or points. The snapshot will be reduced in size and functionality such that it requires full re-indexing after it's successfully restored. The restore process will copy the `_source` data locally and start a special shard and engine to allow `match_all` scrolls and searches. Any other query, or get call will fail with an unsupported operation exception. The restored index is also marked as read-only. This feature aims mainly for disaster recovery use-cases where snapshot size is a concern or where time to restore is less of an issue. NOTE: The snapshot produced by this repository is still a valid lucene index. This change doesn't allow for any longer retention policies which is out of scope for this change.	2018-09-12 17:47:10 +02:00
Alpar Torok	44ed5f6306	Enable forbiddenapis server java9 (#33245 )	2018-08-31 09:31:55 +03:00
Alpar Torok	5cf6e0d4bc	Ignore module-info in jar hell checks (#33011 ) * Ignore module-info in JarHell checks * Add unit test * integration test to test that jarhell is run with precommit	2018-08-30 11:41:39 +03:00
Alpar Torok	82d10b484a	Run forbidden api checks with runtimeJavaVersion (#32947 ) Run forbidden APIs checks with runtime Java version	2018-08-22 09:05:22 +03:00
Adrien Grand	039babddf5	CharArraysTests: Fix test bug.	2018-08-16 11:54:39 +02:00
Jay Modi	1a45b27d8b	Move CharArrays to core lib (#32851 ) This change cleans up some methods in the CharArrays class from x-pack, which includes the unification of char[] to utf8 and utf8 to char[] conversions that intentionally do not use strings. There was previously an implementation in x-pack and in the reloading of secure settings. The method from the reloading of secure settings was adopted as it handled more scenarios related to the backing byte and char buffers that were used to perform the conversions. The cleaned up class is moved into libs/core to allow it to be used by requests that will be migrated to the high level rest client. Relates #32332	2018-08-15 15:26:00 -06:00
Jake Landis	be62092060	Introduce the dissect library (#32297 ) The dissect library will be used for the ingest node as an alternative to Grok to split a string based on a pattern. Dissect differs from Grok such that regular expressions are not used to split the string. Note - Regular expressions are used during construction of the objects, but not in the hot path. A dissect pattern takes the form of: '%{a} %{b},%{c}' which is composed of 3 keys (a,b,c) and two delimiters (space and comma). This dissect pattern will match a string of the form: 'foo bar,baz' and will result in a key/value pairing of 'a=foo, b=bar, and c=baz'. See the comments in DissectParser for a full explanation. This commit does not include the ingest node processor that will consume it. However, the consumption should be a trivial mapping between the key/value pairing returned by the parser and the key/value pairing needed for the IngestDocument.	2018-08-14 17:08:55 -07:00
Armin Braun	580d59e2d7	CORE: Upgrade to Jackson 2.8.11 (#32670 ) * closes #30352	2018-08-08 12:04:25 +02:00
Jason Tedor	3fb0923182	Fix content type detection with leading whitespace (#32632 ) Today content type detection on an input stream works by peeking up to twenty bytes into the stream. If the stream is headed by more whitespace than twenty bytes, we might fail to detect the content type. We should be ignoring this whitespace before attempting to detect the content type. This commit does that by ignoring all leading whitespace in an input stream before attempting to guess the content type.	2018-08-06 18:07:46 -04:00
Armin Braun	4dda5a990b	INGEST: Fix ThreadWatchDog Throwing on Shutdown (#32578 ) * INGEST: Fix ThreadWatchDog Throwing on Shutdown * #32539 is caused by the fact that ThreadWatchDog.Default could throw on shutdown if the ThreadPool is interrupted while `interruptLongRunningExecutions` is in progress. This is a result of the watchdog not having a lifecycle of its own (normally it terminates when the threadpool terminates). * We can't easily use `org.elasticsearch.common.util.concurrent.EsRejectedExecutionException#isExecutorShutdown` to catch this state the same way other components do since that would require adding the core lib to Grok as a dependency * Since we have no knowledge of the lifecycle in this component, as we're only passed the scheduler `BiFunction`, I fixed this by only scheduling the watchdog when there are actually registered threads in it. * I think using the pattern of locking via two `Atomic` values should not be much of a performance concern here under load since either the integer will likely be > 0 in this case (because we have multiple Grok in parallel) or the running state will be true because there likely was at least one thread registered when the watchdog ran and so the enqueuing of the watchdog task during `register` will happen very rarely here (in the worst case scenario of only a single Grok thread it will happen less frequently than once every `ingest.grok.watchdog.interval`). The atomic update on the count should not be relevant relative to the cost of adding a new node to the CHM either. Fixes #32539 * Also fixes the watchdog to not run if it doesn't have to in general.	2018-08-06 22:46:26 +02:00
Christoph Büscher	ff87b7aba4	Remove unnecessary warning suppressions (#32250 )	2018-07-23 11:31:04 +02:00
Alpar Torok	38e2e1d553	Detect and prevent configuration that triggers a Gradle bug (#31912 ) * Detect and prevent configuration that triggers a Gradle bug As we found in #31862, this can lead to a lot of wasted time as it's not immediately obvious what's going on. Given how many projects we have it's getting increasingly easier to run into gradle/gradle#847.	2018-07-19 06:46:58 +00:00
Tim Brooks	c375d5ab23	Add nio transport to security plugin (#31942 ) This is related to #27260. It adds the SecurityNioTransport to the security plugin. Additionally, it adds support for ip filtering. And it randomly uses the nio transport in security integration tests.	2018-07-12 11:55:38 -06:00
Christoph Büscher	4ae4ac08d5	Add Expected Reciprocal Rank metric (#31891 ) This change adds Expected Reciprocal Rank (ERR) as a ranking evaluation metric as described in: Chapelle, O., Metzler, D., Zhang, Y., & Grinspan, P. (2009). Expected reciprocal rank for graded relevance. Proceedings of the 18th ACM Conference on Information and Knowledge Management. https://doi.org/10.1145/1645953.1646033 ERR is an extension of the classical reciprocal rank to the graded relevance case and assumes a cascade browsing model. It quantifies the usefulness of a document at rank `i` conditioned on the degree of relevance of the items at ranks less than `i`. ERR seems to be gaining traction as an alternative to (n)DCG, so it seems like a good metric to support. Also ERR seems to be the default optimization metric used for training in RankLib, a widely used learning to rank library. Relates to #29653	2018-07-12 15:50:58 +02:00
Nik Everett	fb27f3e7f0	HLREST: Add x-pack-info API (#31870 ) This is the first x-pack API we're adding to the high level REST client so there is a lot to talk about here! = Open source The client for these APIs is open source. We're taking the previously Elastic licensed files used for the `Request` and `Response` objects and relicensing them under the Apache 2 license. The implementation of these features is staying under the Elastic license. This lines up with how the rest of the Elasticsearch language clients work. = Location of the new files We're moving all of the `Request` and `Response` objects that we're relicensing to the `x-pack/protocol` directory. We're adding a copy of the Apache 2 license to the root of the `x-pack/protocol` directory to line up with the language in the root `LICENSE.txt` file. All files in this directory will have the Apache 2 license header as well. We don't want there to be any confusion. Even though the files are under the `x-pack` directory, they are Apache 2 licensed. We chose this particular directory layout because it keeps the X-Pack stuff together and easier to think about. = Location of the API in the REST client We've been following the layout of the rest-api-spec files for other APIs and we plan to do this for the X-Pack APIs with one exception: we're dropping the `xpack` from the name of most of the APIs. So `xpack.graph.explore` will become `graph().explore()` and `xpack.license.get` will become `license().get()`. `xpack.info` and `xpack.usage` are special here though because they don't belong to any proper category. For now I'm just calling `xpack.info` `xPackInfo()` and intend to call usage `xPackUsage` though I'm not convinced that this is the final name for them. But it does get us started. = Jars, jars everywhere! This change makes the `xpack:protocol` project a `compile` scoped dependency of the `x-pack:plugin:core` and `client:rest-high-level` projects. I intend to keep it a compile scoped dependency of `x-pack:plugin:core` but I intend to bundle the contents of the protocol jar into the `client:rest-high-level` jar in a follow up. This change has grown large enough at this point. In that followup I'll address javadoc issues as well. = Breaking-Java This breaks the transport client by moving a few classes around. We've traditionally been ok with doing this to the transport client.	2018-07-08 11:03:56 -04:00
Armin Braun	b7b413e55e	Extend allowed characters for grok field names (#21745 ) (#31653 )	2018-06-29 09:12:47 +02:00
Tim Brooks	86423f9563	Ensure local addresses aren't null (#31440 ) Currently we set local addresses on the creation time of a NioChannel. However, this may return null as the local address may not have been set yet. An example is the local address has not been set on a client channel as the connection process is not yet complete. This PR modifies the getter to set the local field if it is currently null.	2018-06-20 19:50:14 -06:00
Tim Brooks	ffba20b748	Do not preallocate bytes for channel buffer (#31400 ) Currently, when we open a new channel, we pass it an InboundChannelBuffer. The channel buffer is preallocated a single 16kb page. However, there is no guarantee that this channel will be read from anytime soon. Instead, this commit does not preallocate that page. That page will be allocated when we receive a read event.	2018-06-19 09:36:12 -06:00
Tim Brooks	a705e1a9e3	Add byte array pooling to nio http transport (#31349 ) This is related to #28898. This PR implements pooling of byte arrays when reading from the wire in the http server transport. In order to do this, we must integrate with netty reference counting. The manner in which this PR implements this is making Pages in InboundChannelBuffer reference counted. When we access the underlying page to pass to netty, we retain the page. When netty releases its bytebuf, it releases the underlying pages we have passed to it.	2018-06-15 14:01:03 -06:00
Tim Brooks	700357d04e	Immediately flush channel after writing to buffer (#31301 ) This is related to #27260. Currently when we queue a write with a channel we set OP_WRITE and wait until the next selection loop to flush the write. However, if the channel does not have a pending write, it is probably ready to flush. This PR implements an optimistic flush logic that will attempt this flush.	2018-06-13 15:32:13 -06:00
Martijn van Groningen	6030d4be1e	[INGEST] Interrupt the current thread if evaluating grok expressions takes too long (#31024 ) This adds a thread interrupter that allows us to encapsulate calls to org.joni.Matcher#search(). This method can hang forever if the regex expression is too complex. The thread interrupter in the background checks every 3 seconds whether there are threads executing the org.joni.Matcher#search() method for longer than 5 seconds and if so interrupts these threads. Joni checks every 30k iterations whether the current thread is interrupted and if so returns org.joni.Matcher#INTERRUPTED. Closes #28731	2018-06-12 07:49:03 +02:00
Tanguy Leroux	bf58660482	Remove all unused imports and fix CRLF (#31207 ) The X-Pack opening and the recent other refactorings left a lot of unused imports in the codebase. This commit removes them all.	2018-06-11 15:12:12 +02:00
Jason Tedor	5296c11e4f	Rename elasticsearch-nio to nio (#31186 ) This commit renames :libs:elasticsearch-nio to :libs:nio.	2018-06-07 17:00:00 -04:00
Jason Tedor	94be9b471f	Rename elasticsearch-core to core (#31185 ) This commit renames :libs:elasticsearch-core to :libs:core.	2018-06-07 16:50:21 -04:00
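
The dissect pattern described in commit be62092060 can be illustrated with a minimal sketch. This is not the `DissectParser` from that commit: `dissect` and `KEY` are hypothetical names, and the sketch assumes only plain `%{key}` fields separated by non-empty literal delimiters. As the commit message describes, a regular expression is used only to take the pattern apart, while the input string itself is split with plain substring searches.

```python
import re

# Hypothetical helper for illustration only; not the library's DissectParser.
# The regex is used to break up the *pattern*, never the input text.
KEY = re.compile(r"%\{([^}]*)\}")


def dissect(pattern, text):
    """Return a dict of key -> value, or None if the text does not match."""
    parts = KEY.split(pattern)
    literals, keys = parts[0::2], parts[1::2]  # literals surround the keys

    if not text.startswith(literals[0]):
        return None
    pos = len(literals[0])

    result = {}
    for i, key in enumerate(keys):
        delimiter = literals[i + 1]
        if delimiter == "":
            if i != len(keys) - 1:
                # Assumption of this sketch: adjacent keys are not supported.
                raise ValueError("keys must be separated by a delimiter")
            end = len(text)  # the last key takes the rest of the input
        else:
            end = text.find(delimiter, pos)  # plain substring search
            if end == -1:
                return None
        result[key] = text[pos:end]
        pos = end + len(delimiter)

    return result if pos == len(text) else None


print(dissect("%{a} %{b},%{c}", "foo bar,baz"))
# prints {'a': 'foo', 'b': 'bar', 'c': 'baz'}
```

The comments in `DissectParser` remain the reference for the full behavior; this sketch only mirrors the example given in the commit message.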

93 Commits