OpenSearch

Test Suite:
===========

[NOTE]
.Required settings
=======================================
Certain tests require specific settings to be applied to the
Elasticsearch instance in order to pass.  You should run
Elasticsearch as follows:

[source,sh]
---------------------
bin/elasticsearch -Enode.attr.testattr=test -Epath.repo=/tmp -Erepositories.url.allowed_urls='http://snapshot.*'
---------------------

=======================================

Test file structure
--------------------

A YAML test file consists of:

- an optional `setup` section, followed by
- an optional `teardown` section, followed by
- one or more test sections

For instance:

    setup:
      - do: ....
      - do: ....

    ---
    teardown:
      - do: ....

    ---
    "First test":
      - do: ...
      - match: ...

    ---
    "Second test":
      - do: ...
      - match: ...


A `setup` section contains a list of commands to run before each test
section in order to setup the same environment for each test section.

A `teardown` section contains a list of commands to run after each test
section in order to setup the same environment for each test section. This
may be needed for modifications made by the testthat are not cleared by the
deletion of indices and templates.

A test section represents an independent test, containing multiple `do`
statements and assertions. The contents of a test section must be run in
order, but individual test sections may be run in any order, as follows:

1. run `setup` (if any)
2. reset the `response` var and the `stash` (see below)
2. run test contents
3. run `teardown` (if any)
4. delete all indices and all templates

Dot notation:
-------------
Dot notation is used for (1) method calls and (2) hierarchical data structures.  For
instance, a method call like `cluster.health` would do the equivalent of:

    client.cluster.health(...params...)

A test against `_tokens.1.token` would examine the `token` key, in the second element
of the `tokens` array, inside the `response` var (see below):

    $val = $response->{tokens}[1]{token}  # Perl syntax roolz!

If one of the levels (eg `tokens`) does not exist, it should return an undefined value.
If no field name is given (ie the empty string) then return the current
$val -- used for testing the whole response body.

Use \. to specify paths that actually contain '.' in the key name, for example
in the `indices.get_settings` API.

Skipping tests:
---------------
If a test section should only be run on certain versions of Elasticsearch,
then the first entry in the section (after the title) should be called
`skip`, and should contain the range of versions to be
skipped, and the reason why the tests are skipped.  For instance:

....
    "Parent":
     - skip:
          version:     "0.20.1 - 0.90.2"
          reason:      Delete ignores the parent param

     - do:
       ... test definitions ...
....

All tests in the file following the skip statement should be skipped if:
`min <= current <= max`.

The `version` range can leave either bound empty, which means "open ended".
For instance:
....
    "Parent":
     - skip:
          version:     "1.0.0.Beta1 - "
          reason:      Delete ignores the parent param

     - do:
       ... test definitions ...
....

The skip section can also be used to list new features that need to be
supported in order to run a test. This way the up-to-date runners will
run the test, while the ones that don't support the feature yet can
temporarily skip it, and avoid having lots of test failures in the meantime.
Once all runners have implemented the feature, it can be declared supported
by default, thus the related skip sections can be removed from the tests.

....
    "Parent":
     - skip:
          features:    regex

     - do:
       ... test definitions ...
....

The `features` field can either be a string or an array of strings.
The skip section requires to specify either a `version` or a `features` list.

Required operators:
-------------------

=== `do`

The `do` operator calls a method on the client. For instance:

....
    - do:
        cluster.health:
            level: shards
....

The response from the `do` operator should be stored in the `response` var, which
is reset (1) at the beginning of a file or (2) on the next `do`.

If the arguments to `do` include `catch`, then we are expecting an error, which should
be caught and tested.  For instance:

....
    - do:
        catch:        missing
        get:
            index:    test
            type:    test
            id:        1
....

The argument to `catch` can be any of:

[horizontal]
`bad_request`::     a 400 response from ES
`unauthorized`::    a 401 response from ES
`forbidden`::       a 403 response from ES
`missing`::         a 404 response from ES
`request_timeout`:: a 408 response from ES
`conflict`::        a 409 response from ES
`request`::         a 4xx-5xx error response from ES, not equal to any named response
                    above
`unavailable`::     a 503 response from ES
`param`::           a client-side error indicating an unknown parameter has been passed
                    to the method
`/foo bar/`::       the text of the error message matches this regular expression

If `catch` is specified, then the `response` var must be cleared, and the test
should fail if no error is thrown.

If the arguments to `do` include `warnings` then we are expecting a `Warning`
header to come back from the request. If the arguments *don't* include a
`warnings` argument then we *don't* expect the response to include a `Warning`
header. The warnings must match exactly. Using it looks like this:

....
    - do:
        warnings:
            - '[index] is deprecated'
            - quotes are not required because yaml
            - but this argument is always a list, never a single string
            - no matter how many warnings you expect
        get:
            index:    test
            type:    test
            id:        1
....

If the arguments to `do` include `node_selector` then the request is only
sent to nodes that match the `node_selector`. It looks like this:

....
"test id":
 - skip:
      features: node_selector
 - do:
      node_selector:
          version: " - 6.9.99"
      index:
          index:  test-weird-index-中文
          type:   weird.type
          id:     1
          body:   { foo: bar }
....

If you list multiple selectors then the request will only go to nodes that
match all of those selectors. The following selectors are supported:

- `version`: Only nodes who's version is within the range will receive the
request. The syntax for the pattern is the same as when `version` is within
`skip`.
- `attribute`: Only nodes that have an attribute matching the name and value
of the provided attribute match.
Looks like:
....
      node_selector:
          attribute:
              name: value
....

=== `set`

For some tests, it is necessary to extract a value from the previous `response`, in
order to reuse it in a subsequent `do` and other tests.  For instance, when
testing indexing a document without a specified ID:

....
    - do:
        index:
            index: test
            type:  test
    - set:  { _id: id }   # stash the value of `response._id` as `id`
    - do:
        get:
            index: test
            type:  test
            id:    $id    # replace `$id` with the stashed value
    - match: { _id: $id } # the returned `response._id` matches the stashed `id`
....

The last response obtained gets always stashed automatically as a string, called `body`.
This is useful when needing to test apis that return text rather than json (e.g. cat api),
as it allows to treat the whole body as an ordinary string field.

Stashed values can be used in property names, eg:

....
  - do:
      cluster.state: {}

  - set: {master_node: master}

  - do:
      nodes.info:
        metric: [ transport ]

  - is_true: nodes.$master.transport.profiles
....


Note that not only expected values can be retrieved from the stashed values (as in the
example above), but the same goes for actual values:

....
    - match: { $body: /^.+$/ } # the returned `body` matches the provided regex if the body is text
    - match: { $body: {} } # the returned `body` matches the JSON object if the body is JSON
....

The stash should be reset at the beginning of each test file.

=== `transform_and_set`

For some tests, it is necessary to extract a value and transform it from the previous `response`, in
order to reuse it in a subsequent `do` and other tests.
Currently, it only has support for `base64EncodeCredentials`, for unknown transformations it will not
do anything and stash the value as is.
For instance, when testing you may want to base64 encode username and password for
`Basic` authorization header:

....
    - do:
        index:
            index: test
            type:  test
    - transform_and_set:  { login_creds: "#base64EncodeCredentials(user,password)" }   # stash the base64 encoded credentials of `response.user` and `response.password` as `login_creds`
    - do:
        headers:
            Authorization: Basic ${login_creds} # replace `$login_creds` with the stashed value
        get:
            index: test
            type:  test
....

Stashed values can be used as described in the `set` section

=== `is_true`

The specified key exists and has a true value (ie not `0`, `false`, `undefined`, `null`
or the empty string), eg:

....
    - is_true:  fields.foo  # the foo key exists in the fields hash and is "true"
....

=== `is_false`

The specified key doesn't exist or has a false value (ie `0`, `false`, `undefined`,
`null` or the empty string), eg:

....
    - is_false:  fields._source  # the _source key doesn't exist in the fields hash or is "false"
....

=== `match`

Used to compare two variables (could be scalars, arrays or hashes).  The two variables
should be identical, eg:

....
    - match: { _source: { foo: bar }}
....

Supports also regular expressions with flag X for more readability (accepts whitespaces and comments):

....
  - match:
      $body: >
               /^  epoch  \s+  timestamp          \s+  count  \s+  \n
                   \d+    \s+  \d{2}:\d{2}:\d{2}  \s+  \d+    \s+  \n  $/
....

**Note:** `$body` is used to refer to the last obtained response body as a string, while `''` refers to the parsed representation (parsed into a Map by the Java runner for instance). Having the raw string response is for example useful when testing cat APIs.

=== `lt` and `gt`

Compares two numeric values, eg:

....
    - lt: { foo: 10000 }  # the `foo` value is less than 10,000
....

=== `lte` and `gte`

Compares two numeric values, eg:

....
    - lte: { foo: 10000 }  # the `foo` value is less than or equal to 10,000
....

=== `length`

This depends on the datatype of the value being examined, eg:

....
    - length: { _id: 22    }   # the `_id` string is 22 chars long
    - length: { _tokens: 3 }   # the `_tokens` array has 3 elements
    - length: { _source: 5 }   # the `_source` hash has 5 keys
....