OpenSearch/CONTRIBUTING.md

Contributing to elasticsearch
=============================

Elasticsearch is an open source project and we love to receive contributions from our community — you! There are many ways to contribute, from writing tutorials or blog posts, improving the documentation, submitting bug reports and feature requests or writing code which can be incorporated into Elasticsearch itself.

Bug reports
-----------

If you think you have found a bug in Elasticsearch, first make sure that you are testing against the [latest version of Elasticsearch](https://www.elastic.co/downloads/elasticsearch) - your issue may already have been fixed. If not, search our [issues list](https://github.com/elastic/elasticsearch/issues) on GitHub in case a similar issue has already been opened.

It is very helpful if you can prepare a reproduction of the bug. In other words, provide a small test case which we can run to confirm your bug. It makes it easier to find the problem and to fix it. Test cases should be provided as `curl` commands which we can copy and paste into a terminal to run it locally, for example:

```sh
# delete the index
curl -XDELETE localhost:9200/test

# insert a document
curl -XPUT localhost:9200/test/test/1 -d '{
 "title": "test document"
}'

# this should return XXXX but instead returns YYY
curl ....
```

Provide as much information as you can. You may think that the problem lies with your query, when actually it depends on how your data is indexed. The easier it is for us to recreate your problem, the faster it is likely to be fixed.

Feature requests
----------------

If you find yourself wishing for a feature that doesn't exist in Elasticsearch, you are probably not alone. There are bound to be others out there with similar needs. Many of the features that Elasticsearch has today have been added because our users saw the need.
Open an issue on our [issues list](https://github.com/elastic/elasticsearch/issues) on GitHub which describes the feature you would like to see, why you need it, and how it should work.

Contributing code and documentation changes
-------------------------------------------

If you have a bugfix or new feature that you would like to contribute to Elasticsearch, please find or open an issue about it first. Talk about what you would like to do. It may be that somebody is already working on it, or that there are particular issues that you should know about before implementing the change.

We enjoy working with contributors to get their code accepted. There are many approaches to fixing a problem and it is important to find the best approach before writing too much code.

Note that it is unlikely the project will merge refactors for the sake of refactoring. These
types of pull requests have a high cost to maintainers in reviewing and testing with little
to no tangible benefit. This especially includes changes generated by tools. For example,
converting all generic interface instances to use the diamond operator.

The process for contributing to any of the [Elastic repositories](https://github.com/elastic/) is similar. Details for individual projects can be found below.

### Fork and clone the repository

You will need to fork the main Elasticsearch code or documentation repository and clone it to your local machine. See
[github help page](https://help.github.com/articles/fork-a-repo) for help.

Further instructions for specific projects are given below.

### Submitting your changes

Once your changes and tests are ready to submit for review:

1. Test your changes

    Run the test suite to make sure that nothing is broken. See the
[TESTING](TESTING.asciidoc) file for help running tests.

2. Sign the Contributor License Agreement

    Please make sure you have signed our [Contributor License Agreement](https://www.elastic.co/contributor-agreement/). We are not asking you to assign copyright to us, but to give us the right to distribute your code without restriction. We ask this of all contributors in order to assure our users of the origin and continuing existence of the code. You only need to sign the CLA once.

3. Rebase your changes

    Update your local repository with the most recent code from the main Elasticsearch repository, and rebase your branch on top of the latest master branch. We prefer your initial changes to be squashed into a single commit. Later, if we ask you to make changes, add them as separate commits.  This makes them easier to review.  As a final step before merging we will either ask you to squash all commits yourself or we'll do it for you.


4. Submit a pull request

    Push your local changes to your forked copy of the repository and [submit a pull request](https://help.github.com/articles/using-pull-requests). In the pull request, choose a title which sums up the changes that you have made, and in the body provide more details about what your changes do. Also mention the number of the issue where discussion has taken place, eg "Closes #123".

Then sit back and wait. There will probably be discussion about the pull request and, if any changes are needed, we would love to work with you to get your pull request merged into Elasticsearch.

Please adhere to the general guideline that you should never force push
to a publicly shared branch. Once you have opened your pull request, you
should consider your branch publicly shared. Instead of force pushing
you can just add incremental commits; this is generally easier on your
reviewers. If you need to pick up changes from master, you can merge
master into your branch. A reviewer might ask you to rebase a
long-running pull request in which case force pushing is okay for that
request. Note that squashing at the end of the review process should
also not be done, that can be done when the pull request is [integrated
via GitHub](https://github.com/blog/2141-squash-your-commits).

Contributing to the Elasticsearch codebase
------------------------------------------

**Repository:** [https://github.com/elastic/elasticsearch](https://github.com/elastic/elasticsearch)

JDK 11 is required to build Elasticsearch. You must have a JDK 11 installation
with the environment variable `JAVA_HOME` referencing the path to Java home for
your JDK 11 installation. By default, tests use the same runtime as `JAVA_HOME`.
However, since Elasticsearch supports JDK 8, the build supports compiling with
JDK 11 and testing on a JDK 8 runtime; to do this, set `RUNTIME_JAVA_HOME`
pointing to the Java home of a JDK 8 installation. Note that this mechanism can
be used to test against other JDKs as well, this is not only limited to JDK 8.

> Note: It is also required to have `JAVA8_HOME`, `JAVA9_HOME`, and
`JAVA10_HOME` are available so that the tests can pass.

> Warning: do not use `sdkman` for Java installations which do not have proper
`jrunscript` for jdk distributions.

Elasticsearch uses the Gradle wrapper for its build. You can execute Gradle
using the wrapper via the `gradlew` script in the root of the repository.

We support development in the Eclipse and IntelliJ IDEs. For Eclipse, the
minimum version that we support is [Eclipse Oxygen][eclipse] (version 4.7). For
IntelliJ, the minimum version that we support is [IntelliJ 2017.2][intellij].

### Configuring IDEs And Running Tests

Eclipse users can automatically configure their IDE: `./gradlew eclipse`
then `File: Import: Existing Projects into Workspace`. Select the
option `Search for nested projects`. Additionally you will want to
ensure that Eclipse is using 2048m of heap by modifying `eclipse.ini`
accordingly to avoid GC overhead errors.

IntelliJ users can automatically configure their IDE: `./gradlew idea`
then `File->New Project From Existing Sources`. Point to the root of
the source directory, select
`Import project from external model->Gradle`, enable
`Use auto-import`. In order to run tests directly from
IDEA 2017.2 and above, it is required to disable the IDEA run launcher in order to avoid
`idea_rt.jar` causing "jar hell". This can be achieved by adding the
`-Didea.no.launcher=true` [JVM
option](https://intellij-support.jetbrains.com/hc/en-us/articles/206544869-Configuring-JVM-options-and-platform-properties).
Alternatively, `idea.no.launcher=true` can be set in the
[`idea.properties`](https://www.jetbrains.com/help/idea/file-idea-properties.html)
file which can be accessed under Help > Edit Custom Properties (this will require a
restart of IDEA). For IDEA 2017.3 and above, in addition to the JVM option, you will need to go to
`Run->Edit Configurations->...->Defaults->JUnit` and verify that the `Shorten command line` setting is set to
`user-local default: none`. You may also need to [remove `ant-javafx.jar` from your
classpath](https://github.com/elastic/elasticsearch/issues/14348) if that is
reported as a source of jar hell.

To run an instance of elasticsearch from the source code run `./gradlew run`

The Elasticsearch codebase makes heavy use of Java `assert`s and the
test runner requires that assertions be enabled within the JVM. This
can be accomplished by passing the flag `-ea` to the JVM on startup.

For IntelliJ, go to
`Run->Edit Configurations...->Defaults->JUnit->VM options` and input
`-ea`.

For Eclipse, go to `Preferences->Java->Installed JREs` and add `-ea` to
`VM Arguments`.


### Java Language Formatting Guidelines

Please follow these formatting guidelines:

* Java indent is 4 spaces
* Line width is 140 characters
* The rest is left to Java coding standards
* Disable “auto-format on save” to prevent unnecessary format changes. This makes reviews much harder as it generates unnecessary formatting changes. If your IDE supports formatting only modified chunks that is fine to do.
* Wildcard imports (`import foo.bar.baz.*`) are forbidden and will cause the build to fail. This can be done automatically by your IDE:
 * Eclipse: `Preferences->Java->Code Style->Organize Imports`. There are two boxes labeled "`Number of (static )? imports needed for .*`". Set their values to 99999 or some other absurdly high value.
 * IntelliJ: `Preferences/Settings->Editor->Code Style->Java->Imports`. There are two configuration options: `Class count to use import with '*'` and `Names count to use static import with '*'`. Set their values to 99999 or some other absurdly high value.
* Don't worry too much about import order. Try not to change it but don't worry about fighting your IDE to stop it from doing so.

### License Headers

We require license headers on all Java files. You will notice that all the Java files in
the top-level `x-pack` directory contain a separate license from the rest of the repository. This
directory contains commercial code that is associated with a separate license. It can be helpful
to have the IDE automatically insert the appropriate license header depending which part of the project
contributions are made to.

#### IntelliJ: Copyright & Scope Profiles

To have IntelliJ insert the correct license, it is necessary to create to copyright profiles.
These may potentially be called `apache2` and `commercial`. These can be created in
`Preferences/Settings->Editor->Copyright->Copyright Profiles`. To associate these profiles to
their respective directories, two "Scopes" will need to be created. These can be created in
`Preferences/Settings->Appearances & Behavior->Scopes`. When creating scopes, be sure to choose
the `shared` scope type. Create a scope, `apache2`, with
the associated pattern of `!file[group:x-pack]:*/`. This pattern will exclude all the files contained in
the `x-pack` directory. The other scope, `commercial`, will have the inverse pattern of `file[group:x-pack]:*/`.
The two scopes, together, should account for all the files in the project. To associate the scopes
with their copyright-profiles, go into `Preferences/Settings->Editor>Copyright` and use the `+` to add
the associations `apache2/apache2` and `commercial/commercial`.

Configuring these options in IntelliJ can be quite buggy, so do not be alarmed if you have to open/close
the settings window and/or restart IntelliJ to see your changes take effect.

### Creating A Distribution

To create a distribution from the source, simply run:

```sh
cd elasticsearch/
./gradlew assemble
```

The package distributions (Debian and RPM) can be found under:
`./distribution/packages/(deb|rpm)/build/distributions/`

The archive distributions (tar and zip) can be found under:
`./distribution/archives/(tar|zip)/build/distributions/`


### Running The Full Test Suite

Before submitting your changes, run the test suite to make sure that nothing is broken, with:

```sh
./gradlew check
```

If your changes affect only the documentation, run:

```sh
./gradlew -p docs check
```
For more information about testing code examples in the documentation, see
https://github.com/elastic/elasticsearch/blob/master/docs/README.asciidoc

### Project layout

This repository is split into many top level directories. The most important
ones are:

#### `docs`
Documentation for the project.

#### `distribution`
Builds our tar and zip archives and our rpm and deb packages.

#### `libs`
Libraries used to build other parts of the project. These are meant to be
internal rather than general purpose. We have no plans to
[semver](https://semver.org/) their APIs or accept feature requests for them.
We publish them to maven central because they are dependencies of our plugin
test framework, high level rest client, and jdbc driver but they really aren't
general purpose enough to *belong* in maven central. We're still working out
what to do here.

#### `modules`
Features that are shipped with Elasticsearch by default but are not built in to
the server. We typically separate features from the server because they require
permissions that we don't believe *all* of Elasticsearch should have or because
they depend on libraries that we don't believe *all* of Elasticsearch should
depend on.

For example, reindex requires the `connect` permission so it can perform
reindex-from-remote but we don't believe that the *all* of Elasticsearch should
have the "connect". For another example, Painless is implemented using antlr4
and asm and we don't believe that *all* of Elasticsearch should have access to
them.

#### `plugins`
Officially supported plugins to Elasticsearch. We decide that a feature should
be a plugin rather than shipped as a module because we feel that it is only
important to a subset of users, especially if it requires extra dependencies.

The canonical example of this is the ICU analysis plugin. It is important for
folks who want the fairly language neutral ICU analyzer but the library to
implement the analyzer is 11MB so we don't ship it with Elasticsearch by
default.

Another example is the `discovery-gce` plugin. It is *vital* to folks running
in [GCP](https://cloud.google.com/) but useless otherwise and it depends on a
dozen extra jars.

#### `qa`
Honestly this is kind of in flux and we're not 100% sure where we'll end up.
Right now the directory contains
* Tests that require multiple modules or plugins to work
* Tests that form a cluster made up of multiple versions of Elasticsearch like
full cluster restart, rolling restarts, and mixed version tests
* Tests that test the Elasticsearch clients in "interesting" places like the
`wildfly` project.
* Tests that test Elasticsearch in funny configurations like with ingest
disabled
* Tests that need to do strange things like install plugins that thrown
uncaught `Throwable`s or add a shutdown hook
But we're not convinced that all of these things *belong* in the qa directory.
We're fairly sure that tests that require multiple modules or plugins to work
should just pick a "home" plugin. We're fairly sure that the multi-version
tests *do* belong in qa. Beyond that, we're not sure. If you want to add a new
qa project, open a PR and be ready to discuss options.

#### `server`
The server component of Elasticsearch that contains all of the modules and
plugins. Right now things like the high level rest client depend on the server
but we'd like to fix that in the future.

#### `test`
Our test framework and test fixtures. We use the test framework for testing the
server, the plugins, and modules, and pretty much everything else. We publish
the test framework so folks who develop Elasticsearch plugins can use it to
test the plugins. The test fixtures are external processes that we start before
running specific tests that rely on them.

For example, we have an hdfs test that uses mini-hdfs to test our
repository-hdfs plugin.

#### `x-pack`
Commercially licensed code that integrates with the rest of Elasticsearch. The
`docs` subdirectory functions just like the top level `docs` subdirectory and
the `qa` subdirectory functions just like the top level `qa` subdirectory. The
`plugin` subdirectory contains the x-pack module which runs inside the
Elasticsearch process. The `transport-client` subdirectory contains extensions
to Elasticsearch's standard transport client to work properly with x-pack.

### Gradle Build

We use Gradle to build Elasticsearch because it is flexible enough to not only
build and package Elasticsearch, but also orchestrate all of the ways that we
have to test Elasticsearch.

#### Configurations

Gradle organizes dependencies and build artifacts into "configurations" and
allows you to use these configurations arbitrarily. Here are some of the most
common configurations in our build and how we use them:

<dl>
<dt>`compile`</dt><dd>Code that is on the classpath at both compile and
runtime.</dd>
<dt>`runtime`</dt><dd>Code that is not on the classpath at compile time but is
on the classpath at runtime. We mostly use this configuration to make sure that
we do not accidentally compile against dependencies of our dependencies also
known as "transitive" dependencies".</dd>
<dt>`compileOnly`</dt><dd>Code that is on the classpath at compile time but that
should not be shipped with the project because it is "provided" by the runtime
somehow. Elasticsearch plugins use this configuration to include dependencies
that are bundled with Elasticsearch's server.</dd>
<dt>`bundle`</dt><dd>Only available in projects with the shadow plugin,
dependencies with this configuration are bundled into the jar produced by the
build. Since IDEs do not understand this configuration we rig them to treat
dependencies in this configuration as `compile` dependencies.</dd>
<dt>`testCompile`</dt><dd>Code that is on the classpath for compiling tests
that are part of this project but not production code. The canonical example
of this is `junit`.</dd>
</dl>

Contributing as part of a class
-------------------------------
In general Elasticsearch is happy to accept contributions that were created as
part of a class but strongly advise against making the contribution as part of
the class. So if you have code you wrote for a class feel free to submit it.

Please, please, please do not assign contributing to Elasticsearch as part of a
class. If you really want to assign writing code for Elasticsearch as an
assignment then the code contributions should be made to your private clone and
opening PRs against the primary Elasticsearch clone must be optional, fully
voluntary, not for a grade, and without any deadlines.

Because:

* While the code review process is likely very educational, it can take wildly
varying amounts of time depending on who is available, where the change is, and
how deep the change is. There is no way to predict how long it will take unless
we rush.
* We do not rush reviews without a very, very good reason. Class deadlines
aren't a good enough reason for us to rush reviews.
* We deeply discourage opening a PR you don't intend to work through the entire
code review process because it wastes our time.
* We don't have the capacity to absorb an entire class full of new contributors,
especially when they are unlikely to become long time contributors.

Finally, we require that you run `./gradlew check` before submitting a
non-documentation contribution. This is mentioned above, but it is worth
repeating in this section because it has come up in this context.

[eclipse]: http://www.eclipse.org/community/eclipse_newsletter/2017/june/
[intellij]: https://blog.jetbrains.com/idea/2017/07/intellij-idea-2017-2-is-here-smart-sleek-and-snappy/
[shadow-plugin]: https://github.com/johnrengelman/shadow