Merge branch 'main' into docs_cn

# Conflicts:
#	Gemfile
#	README.md
#	_data/footer.yml
#	_security-plugin/index.md
#	index.md
YuCheng Hu 2024-03-18 23:46:13 -04:00
commit 66cb515f13
1464 changed files with 129936 additions and 15254 deletions

.github/CODEOWNERS

@ -0,0 +1 @@
* @hdhalter @kolchfa-aws @Naarcha-AWS @vagimeli @AMoo-Miki @natebower @dlvenable @scrawfor99


@ -0,0 +1,7 @@
---
title: '[AUTOCUT] Broken links'
labels: 'bug'
---
The link checker has failed on the push of your commit.
Please examine the workflow log {{ env.WORKFLOW_URL }}.


@ -0,0 +1,20 @@
---
name: 📃 Documentation issue
about: Need docs? Create an issue to request or add new documentation.
title: '[DOC]'
labels: 'untriaged'
assignees: ''
---
**What do you want to do?**
- [ ] Request a change to existing documentation
- [ ] Add new documentation
- [ ] Report a technical problem with the documentation
- [ ] Other
**Tell us about your request.** Provide a summary of the request and all versions that are affected.
**What other resources are available?** Provide links to related issues, POCs, steps for testing, etc.

.github/PULL_REQUEST_TEMPLATE.md

@ -0,0 +1,10 @@
### Description
_Describe what this change achieves._
### Issues Resolved
_List any issues this PR will resolve, e.g. Closes [...]._
### Checklist
- [ ] By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the [Developers Certificate of Origin](https://github.com/opensearch-project/OpenSearch/blob/main/CONTRIBUTING.md#developer-certificate-of-origin).
For more information on the Developer Certificate of Origin and signing off your commits, see [here](https://github.com/opensearch-project/OpenSearch/blob/main/CONTRIBUTING.md#developer-certificate-of-origin).

.github/dco.yml

@ -0,0 +1,2 @@
require:
members: false


@ -0,0 +1,75 @@
extends: conditional
message: "'%s': Spell out acronyms the first time that you use them on a page and follow them with the acronym in parentheses. Subsequently, use the acronym alone."
link: 'https://github.com/opensearch-project/documentation-website/blob/main/STYLE_GUIDE.md#acronyms'
level: warning
scope: summary
ignorecase: false
# Ensures that the existence of 'first' implies the existence of 'second'.
first: '\b((?<!k-)[A-Z]{1,3}\/?[A-Z]{1,3}\d{0,2}\b(?!\sCommons))'
second: '(?:\b[A-Za-z-]+ )+\(([A-Z]{1,3}\/?[A-Z]{1,3}\d{0,2})\)'
# ... with the exception of these:
exceptions:
- API
- ASCII
- AWS
- BASIC
- BM25
- CSV
- CPU
- CRUD
- DNS
- DOS
- FAQ
- FTP
- GIF
- HTML
- HTTP
- HTTPS
- I/O
- ID
- IP
- JPEG
- JSON
- NAT
- NGINX
- PDF
- RAM
- REST
- RGB
- ROM
- SAML
- SDK
- SSL
- TCP
- TIFF
- TLS
- UI
- URI
- URL
- UTC
- UTF
- XML
- YAML
- CAT
- GET
- PUT
- POST
- DELETE
- AND
- OR
- KB
- MB
- GB
- TB
- PB
- US
- PNG
- JVM
- N/A
- GROUP
- BY
- SELECT
- HAVING
- SQL
- TOC
- 'NULL'
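The `first`/`second` pair above can be exercised directly in Python, since both are plain regular expressions. This is a hedged illustration only (Vale's actual matching semantics for `conditional` rules are handled by the engine); the test strings are made up:

```python
import re

# "first" flags a bare 1-3 letter acronym (skipping k-NN and "<X> Commons");
# "second" matches a spelled-out term followed by the acronym in parentheses,
# which is the form the rule asks for.
first = re.compile(r'\b((?<!k-)[A-Z]{1,3}/?[A-Z]{1,3}\d{0,2}\b(?!\sCommons))')
second = re.compile(r'(?:\b[A-Za-z-]+ )+\(([A-Z]{1,3}/?[A-Z]{1,3}\d{0,2})\)')

assert first.search("This sentence tests AP.").group(1) == "AP"   # undefined acronym: flagged
assert second.search("This sentence tests Advanced Placement (AP).").group(1) == "AP"
assert first.search("k-NN is fine.") is None                      # the k- lookbehind exempts k-NN
```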


@ -0,0 +1,6 @@
extends: existence
message: "Don't use an ampersand in place of 'and' in documentation."
nonword: true
level: warning
tokens:
- '\w +& +\w'
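The single token in this rule is easy to sanity-check in Python (an illustrative sketch, not how Vale runs it):

```python
import re

# An ampersand joining two words is flagged; "and" is the suggested wording.
token = re.compile(r'\w +& +\w')

assert token.search("search & replace")           # flagged
assert token.search("search and replace") is None  # clean
```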


@ -0,0 +1,5 @@
extends: existence
message: "Use 'cyber' as a prefix. Remove spaces or hyphens in '%s'."
level: error
tokens:
- '[Cc]yber[- ]+[a-z]*'


@ -0,0 +1,7 @@
extends: existence
message: "There should be no spaces around the dash in '%s'."
ignorecase: true
nonword: true
level: error
tokens:
- '\w+ +-{2,3} +\w+'


@ -0,0 +1,22 @@
extends: substitution
message: "Use '%s' instead of '%s' for versions or orientation within a document. Use 'above' and 'below' only for physical space or screen descriptions."
link: 'https://github.com/opensearch-project/documentation-website/blob/main/TERMS.md'
level: warning
ignorecase: true
swap:
- 'image below': 'following image'
- 'example below': 'following example'
- 'steps below': 'following steps'
- 'section below': 'following section'
- 'table below': 'following table'
- 'image above': 'following image'
- 'example above': 'preceding example'
- 'section above': 'preceding section'
- 'table above': 'preceding table'
- 'above image': 'preceding image'
- 'above section': 'preceding section'
- 'above table': 'preceding table'
- '\d+\.\d+\s+(?:and|or)\s+above': 'later'
- '\d+\.\d+\s+(?:and|or)\s+below': 'earlier'
- 'below(?!\s+(?:the|this|\d))': 'following or later'
- 'above(?!\s+(?:the|this|\d))': 'previous, preceding, or earlier'
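A `substitution` rule only *suggests* the right-hand side of each `swap` entry; Vale never rewrites text. As a rough sketch (with an invented two-entry table), applying the swaps mechanically would look like this:

```python
import re

# Hypothetical subset of the swap table above, applied case-insensitively.
swap = {
    r'\d+\.\d+\s+(?:and|or)\s+below': 'earlier',
    'table below': 'following table',
}

text = "See the table below for versions 1.2 and below."
for pattern, repl in swap.items():
    text = re.sub(pattern, repl, text, flags=re.IGNORECASE)
print(text)  # -> "See the following table for versions earlier."
```

Note that the digit-range swap replaces the whole matched phrase, so a human still has to restore the version number ("1.2 and earlier") when accepting the suggestion.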


@ -0,0 +1,16 @@
extends: substitution
message: "Use '%s' instead of '%s' for window, page, or pane references to features or controls. Use 'top' and 'bottom' only as a general screen reference."
link: 'https://github.com/opensearch-project/documentation-website/blob/main/TERMS.md'
level: warning
ignorecase: true
action:
name: replace
swap:
- top left: upper left
- bottom left: lower left
- top right: upper right
- bottom right: lower right
- top-left: upper-left
- bottom-left: lower-left
- top-right: upper-right
- bottom-right: lower-right


@ -0,0 +1,6 @@
extends: existence
message: "Don't use exclamation points in documentation."
nonword: true
level: error
tokens:
- '\w+!(?:\s|$)'


@ -0,0 +1,7 @@
extends: sequence
message: "Use 'failover' as an adjective or noun instead of '%s'."
level: error
ignorecase: true
tokens:
- tag: NN|JJ
pattern: '(?:fail over|fail-over)'


@ -0,0 +1,7 @@
extends: sequence
message: "Use 'fail over' as a verb instead of '%s'."
level: error
ignorecase: true
tokens:
- tag: VB|VBD|VBG|VBN|VBP|VBZ
pattern: '(?:failover|fail-over)'


@ -0,0 +1,7 @@
extends: existence
message: "'%s' is in future tense. Use present tense in documentation."
ignorecase: true
level: suggestion
scope: raw
tokens:
- '(?:will|is going to|won''t|[A-za-z]+''ll)\s+[a-z]+'


@ -0,0 +1,11 @@
extends: existence
message: "'%s': Don't define acronyms in headings."
link: 'https://github.com/opensearch-project/documentation-website/blob/main/STYLE_GUIDE.md#acronyms'
level: error
ignorecase: false
scope: heading
nonword: true
action:
name: remove
tokens:
- '\([A-Z]{2,5}\)'


@ -0,0 +1,13 @@
extends: capitalization
message: "'%s' is a heading and should be in sentence case."
level: error
scope: heading
match: $sentence
indicators:
- ":"
- "."
- ")"
exceptions:
- k # ignores lowercase k-NN
- '[A-Z]{2,}' # ignores all acronyms
- '([A-Z][a-z0-9]+){2,}' # ignores all camel case words


@ -0,0 +1,7 @@
extends: existence
message: "Capitalize the word after a colon in '%s'."
nonword: true
level: error
scope: heading
tokens:
- '(?::\s)[a-z]+'
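The fix this rule points at can be sketched with a small helper (the function name is ours, not part of the rule):

```python
import re

# Capitalize the first letter following a colon in a heading, which is
# exactly the pattern the existence rule above flags.
def capitalize_after_colon(heading: str) -> str:
    return re.sub(r'(:\s+)([a-z])',
                  lambda m: m.group(1) + m.group(2).upper(),
                  heading)

print(capitalize_after_colon("## Example: should be flagged"))
# -> "## Example: Should be flagged"
```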


@ -0,0 +1,9 @@
extends: existence
message: "Don't use punctuation at the end of a heading."
nonword: true
level: error
scope: heading
action:
name: remove
tokens:
- '[.?!]$'


@ -0,0 +1,15 @@
extends: substitution
message: "Use '%s' instead of '%s' because the latter is an offensive term."
link: https://github.com/opensearch-project/documentation-website/blob/main/STYLE_GUIDE.md#offensive-terms
ignorecase: true
level: error
swap:
'abort': stop, end, or cancel
'black day': blocked day
'blacklist': deny list
'kill': stop, end, clear, remove, or cancel
'master account': 'management account'
'master': cluster manager
'slave': replica, secondary, standby
'white day': open day
'whitelist': allow list


@ -0,0 +1,10 @@
extends: existence
message: "Using '%s' is unnecessary. Remove."
link: https://github.com/opensearch-project/documentation-website/blob/main/STYLE_GUIDE.md#basic-guidelines
ignorecase: true
nonword: true
level: warning
action:
name: remove
tokens:
- '\b(?:etc\.|etc)'


@ -0,0 +1,15 @@
extends: substitution
message: "Use '%s' instead of '%s'."
link: https://github.com/opensearch-project/documentation-website/blob/main/STYLE_GUIDE.md#basic-guidelines
ignorecase: false
level: warning
nonword: true
action:
name: replace
swap:
'\b(?:eg|e\.g\.)[\s,]': for example or such as
'\b(?:ie|i\.e\.)[\s,]': that is or specifically
'\bad hoc[\s,.]': one-time
'\b(?:v\.|vs\.|vs|versus)\s': compared to or compared with
'\bvia\s': using, through, by accessing, or by choosing
'\bvice versa': the other way around


@ -0,0 +1,8 @@
extends: existence
message: "Remove double parentheses from the link '%s'."
level: error
nonword: true
scope: raw
tokens:
- '\]\({2,}[^)]*?\){1,}'


@ -0,0 +1,7 @@
extends: existence
message: "Remove double slashes from the link '%s'."
level: error
nonword: true
scope: raw
tokens:
- '\(\{\{site.url\}\}\{\{site.baseurl\}\}[^)]*?\/{2,}[^)]*?\)'


@ -0,0 +1,7 @@
extends: existence
message: "Add a trailing slash to the link '%s'."
level: error
nonword: true
scope: raw
tokens:
- '\(\{\{site.url\}\}\{\{site.baseurl\}\}(\/[A-Za-z0-9-_]+)+\s*\)'


@ -0,0 +1,7 @@
extends: existence
message: "Add a slash after '{{site.url}}/{{site.baseurl}}' in '%s'."
level: error
nonword: true
scope: raw
tokens:
- '\(\{\{site.url\}\}\{\{site.baseurl\}\}([^\/])(?:(.*))?\)'


@ -0,0 +1,7 @@
extends: sequence
message: "Use 'login' as an adjective or noun instead of '%s'."
level: error
ignorecase: true
tokens:
- tag: NN|JJ
pattern: '(?:log in|log-in)'


@ -0,0 +1,7 @@
extends: sequence
message: "Use 'log in' as a verb instead of '%s'."
level: error
ignorecase: true
tokens:
- tag: VB|VBD|VBG|VBN|VBP|VBZ
pattern: '(?:login|log into|log on|log onto)'


@ -0,0 +1,7 @@
extends: sequence
message: "Use 'logout' as an adjective or noun instead of '%s'."
level: error
ignorecase: true
tokens:
- tag: NN|JJ
pattern: '(?:log out)'


@ -0,0 +1,7 @@
extends: sequence
message: "Use 'log out' as a verb instead of '%s'."
level: error
ignorecase: true
tokens:
- tag: VB|VBD|VBG|VBN|VBP|VBZ
pattern: '(?:logout)'


@ -0,0 +1,6 @@
extends: existence
message: "Resolve lingering merge conflicts."
nonword: true
level: error
tokens:
- '<<<<<<< HEAD'


@ -0,0 +1,7 @@
extends: existence
message: "Add an Oxford comma in '%s'."
scope: sentence
level: warning
nonword: true
tokens:
- '(?:[^\s,]+,){1,} \w+ (?:and|or) \w+[.?!]'
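The token above is subtle: because `[^\s,]+,` cannot cross a space, a list that already uses the Oxford comma fails the match. A quick Python check (illustrative only):

```python
import re

# A serial list whose final item lacks the comma before "and"/"or" is
# flagged; the Oxford-comma form is not, because the repeated group
# "[^\s,]+," can never span the space after the second-to-last comma.
token = re.compile(r'(?:[^\s,]+,){1,} \w+ (?:and|or) \w+[.?!]')

assert token.search("We test apples, bananas and cherries.")            # flagged
assert token.search("We test apples, bananas, and cherries.") is None   # clean
```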


@ -0,0 +1,45 @@
extends: existence
message: "'%s': Whenever possible, use the active voice instead of the passive voice."
ignorecase: true
level: suggestion
raw:
- \b(am|are|were|being|is|been|was|be)\b\s*
tokens:
- '[\w]+ed'
- become
- been
- begun
- brought
- built
- cast
- caught
- chosen
- come
- cut
- dealt
- done
- drawn
- forbidden
- found
- given
- gone
- gotten
- held
- hidden
- kept
- known
- led
- let
- made
- put
- quit
- read
- seen
- sent
- sped
- spent
- stuck
- swept
- taken
- understood
- written


@ -0,0 +1,9 @@
extends: existence
message: "Using '%s' is unnecessary. Remove."
link: https://github.com/opensearch-project/documentation-website/blob/main/TERMS.md
ignorecase: true
level: warning
action:
name: remove
tokens:
- 'please'


@ -0,0 +1,7 @@
extends: existence
message: "Use an en dash (--) with no space on either side in a range of numbers."
link: https://github.com/opensearch-project/documentation-website/blob/main/STYLE_GUIDE.md#numbers-and-measurement
nonword: true
level: error
tokens:
- '\b\d+ *[-] *\d+\b'
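A sketch of the fix this rule requests, collapsing a hyphenated range into the `--` source convention used in these docs (the helper name is ours):

```python
import re

# Rewrite "2 - 5" or "2-5" as "2--5" (rendered as an en dash).
def en_dash_range(text: str) -> str:
    return re.sub(r'\b(\d+) *- *(\d+)\b', r'\1--\2', text)

print(en_dash_range("test the range 2 - 5"))  # -> "test the range 2--5"
```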


@ -0,0 +1,6 @@
extends: repetition
message: "'%s' is repeated."
level: error
alpha: true
tokens:
- '[^\s]+'


@ -0,0 +1,7 @@
extends: sequence
message: "Use 'rollover' as an adjective or noun instead of '%s'."
level: error
ignorecase: true
tokens:
- tag: NN|JJ
pattern: '(?:roll over|roll-over)'


@ -0,0 +1,7 @@
extends: sequence
message: "Use 'roll over' as a verb instead of '%s'."
level: error
ignorecase: true
tokens:
- tag: VB|VBD|VBG|VBN|VBP|VBZ
pattern: '(?:rollover|roll-over)'


@ -0,0 +1,7 @@
extends: sequence
message: "Use 'setup' as an adjective or noun instead of '%s'."
level: error
ignorecase: true
tokens:
- tag: NN|JJ
pattern: '(?:set up|set-up)'


@ -0,0 +1,7 @@
extends: sequence
message: "Use 'set up' as a verb instead of '%s'."
level: error
ignorecase: true
tokens:
- tag: VB|VBD|VBG|VBN|VBP|VBZ
pattern: '(?:setup|set-up)'


@ -0,0 +1,11 @@
extends: substitution
message: "'%s': Use 'AWS Signature Version 4' instead of '%s' on first appearance. Then, Signature Version 4 may be used. Only use SigV4 when space is limited."
ignorecase: true
link: 'https://github.com/opensearch-project/documentation-website/blob/main/TERMS.md'
level: warning
action:
name: replace
swap:
sigv4: Signature Version 4
AWS sigv4: AWS Signature Version 4


@ -0,0 +1,10 @@
extends: existence
message: "Don't use '%s' because it's not neutral in tone. If you mean 'only', use 'only' instead."
link: https://github.com/opensearch-project/documentation-website/blob/main/TERMS.md
ignorecase: true
level: warning
action:
name: remove
tokens:
- simply
- just


@ -0,0 +1,9 @@
extends: existence
message: "There should be no space before and one space after the punctuation mark in '%s'."
level: error
nonword: true
tokens:
- '[A-Za-z]+[.?] {2,}[A-Za-z]+'
- '[A-Za-z]+[.?][A-Za-z]+'
- '[A-Za-z]+[,;] {2,}[A-Za-z]+'
- '[A-Za-z]+[,;][A-Za-z]+'


@ -0,0 +1,6 @@
extends: existence
message: "When using '/' between words, do not insert space on either side of it."
ignorecase: true
level: error
tokens:
- '[A-Za-z]+ +\/ +[A-Za-z]+'


@ -0,0 +1,6 @@
extends: existence
message: "There should be one space between words in '%s'."
level: error
nonword: true
tokens:
- '[A-Za-z]+ {2,}[A-Za-z]+'


@ -0,0 +1,5 @@
extends: spelling
message: "Error: %s. If you are referencing a setting, variable, format, function, or repository, surround it with tick marks."
level: error
action:
name: suggest


@ -0,0 +1,24 @@
extends: script
message: "Do not stack headings. Insert an introductory sentence between headings."
level: error
link: https://github.com/opensearch-project/documentation-website/blob/main/STYLE_GUIDE.md#formatting-and-organization
scope: raw
script: |
text := import("text")
matches := []
  // Replace code blocks with dummy text so that '#' comment lines inside code aren't treated as headings
document := text.re_replace("(?s) *(```.*?```)", scope, "text")
isHeading := false
for line in text.split(document, "\n") {
if text.trim_space(line) != "" {
if text.has_prefix(line, "#") {
if isHeading == true {
start := text.index(scope, line)
matches = append(matches, {begin: start, end: start + len(line)})
}
isHeading = true // new section; reset count
} else {
isHeading = false
}
}
}
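The Tengo script above can be sketched in Python to make the logic easier to follow (assumptions: blank lines are skipped without resetting the heading state, exactly as `text.trim_space` does above; returning the offending lines rather than byte offsets):

```python
import re

# Flag any heading that directly follows another heading with no
# intervening prose, ignoring fenced code blocks.
def stacked_headings(document: str):
    text = re.sub(r'(?s)```.*?```', '', document)  # drop code blocks
    matches = []
    is_heading = False
    for line in text.split('\n'):
        if line.strip() == '':
            continue  # blank lines don't reset the heading state
        if line.startswith('#'):
            if is_heading:
                matches.append(line)
            is_heading = True  # new section; reset
        else:
            is_heading = False
    return matches

doc = "## First heading\n\n### Stacked heading\nSome prose.\n## Fine heading\n"
print(stacked_headings(doc))  # -> ['### Stacked heading']
```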


@ -0,0 +1,55 @@
extends: substitution
message: "Use '%s' instead of '%s'."
ignorecase: true
level: error
action:
name: replace
swap:
'command-line interface': command line interface
'data are': data is
'data set': dataset
'for information on': for information about
'for more information on': for more information about
'geo distance': geodistance
'geo hash': geohash
'geo hex': geohex
'geo point': geopoint
'geo shape': geoshape
'geo tile': geotile
'geospacial': geospatial
'hard code': hardcode
'high-performance computing': high performance computing
'host name': hostname
'Huggingface': Hugging Face
'indices': indexes
'ingestion pipeline': ingest pipeline
'key store': keystore
'key/value': key-value
'kmeans': k-means
'kNN': k-NN
'machine-learning': machine learning
'nonproduction': non-production
'postmigration': post-migration
'pre-configure': preconfigure
'pre-configured': preconfigured
'pre-defined': predefined
'pre-train': pretrain
'pre-trained': pretrained
'premigration': pre-migration
're-enable': reenable
'screen shot': screenshot
'sample request': example request
'sample response': example response
'stacktrace': stack trace
'stand-alone': standalone
'timeframe': time frame
'time series data': time-series data
'time stamp': timestamp
'timezone': time zone
'tradeoff': trade-off
'trust store': truststore
'U.S.': US
'web page': webpage
'web site': website
'whitespace': white space
'user interface \(UI\)': UI


@ -0,0 +1,16 @@
extends: substitution
message: "Use '%s' instead of '%s'."
ignorecase: true
level: suggestion
action:
name: replace
swap:
'app server': application server
'as well as': and
'due to': because of
'it is recommended': we recommend
'leverage': use
'life cycle': lifecycle
'navigate in': navigate to
'wish|desire': want


@ -0,0 +1,7 @@
extends: capitalization
message: "'%s' is a table heading and should be in sentence case."
level: error
scope: table.header
match: $sentence
exceptions:
- k # ignores lowercase k-NN


@ -0,0 +1,7 @@
extends: sequence
message: "Use 'timeout' as an adjective or noun instead of '%s'."
level: error
ignorecase: true
tokens:
- tag: NN|JJ
pattern: '(?:time out|time-out)'


@ -0,0 +1,7 @@
extends: sequence
message: "Use 'time out' as a verb instead of '%s'."
level: error
ignorecase: true
tokens:
- tag: VB|VBD|VBG|VBN|VBP|VBZ
pattern: '(?:timeout|time-out)'


@ -0,0 +1,12 @@
extends: substitution
message: "Use '%s' instead of '%s'."
level: error
ignorecase: false
action:
name: replace
swap:
kb: KB
mb: MB
gb: GB
tb: TB
pb: PB


@ -0,0 +1,8 @@
extends: existence
message: "Put a space between the number and the units in '%s'."
nonword: true
ignorecase: true
level: warning
tokens:
- \d+(?:B|KB|MB|GB|TB|PB)(?:\s|\.|,)
- \d+(?:ns|ms|s|min|h|d)(?:\s|\.|,)
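The fix this rule asks for can be sketched as a small substitution (the helper name is ours; note that `B` is placed after the multi-letter units so `KB` and friends match first):

```python
import re

# Insert a space between a number and its unit, e.g. "2KB" -> "2 KB".
def space_units(text: str) -> str:
    return re.sub(r'(\d+)(KB|MB|GB|TB|PB|B|ns|ms|s|min|h|d)\b', r'\1 \2', text)

print(space_units("allocate 2KB within 500ms"))
# -> "allocate 2 KB within 500 ms"
```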


@ -0,0 +1,6 @@
extends: existence
message: "In '%s', spell out 'version'."
ignorecase: true
level: warning
tokens:
- '[vV][0-9]+\.[0-9]+'


@ -0,0 +1 @@
(?-i)[A-Z]{2,}s


@ -0,0 +1 @@


@ -0,0 +1,29 @@
Alerting plugin
Anomaly Detection plugin
Asynchronous Search plugin
Crypto plugin
Cross-Cluster Replication plugin
Custom Codecs plugin
Flow Framework plugin
Maps plugin
Notebooks plugin
Notifications plugin
Reports plugin
Reports Scheduler plugin
Geospatial plugin
Index Management plugin
Job Scheduler
Job Scheduler plugin
k-NN plugin
ML Commons
ML Commons plugin
Neural Search plugin
Observability plugin
Performance Analyzer plugin
Query Insights plugin
Query Workbench plugin
Search Relevance plugin
Security plugin
Security Analytics plugin
SQL plugin
Trace Analytics plugin


@ -0,0 +1 @@


@ -0,0 +1,93 @@
Active Directory
Adoptium
Amazon
Amazon OpenSearch Serverless
Amazon OpenSearch Service
Amazon Bedrock
Amazon SageMaker
Ansible
Auditbeat
AWS Cloud
Cognito
Dashboards Query Language
Data Prepper
Debian
Dev Tools
Docker
Docker Compose
Dockerfile
DoS
Elasticsearch
GeoJSON
GitHub
Gradle
Grafana
Faiss
Filebeat
Flickr
Fluent Bit
Fluentd
Helm
Hugging Face
IdP
Inferentia
IP2Geo
IPv4
IPv6
Iris
JavaScript
Jaeger
Jaeger HotROD
JSON
JSON Web Token
Keycloak
Kerberos
Kibana
Kubernetes
Lambda
Linux
Log4j
Logstash
Lucene
macOS
Metricbeat
Microsoft Power BI
Minikube
Nagios
Okta
Open Distro
OpenAI
OpenID Connect
OpenSearch
OpenSearch Assistant
OpenSearch Assistant Toolkit
OpenSearch Benchmark
OpenSearch Dashboards
OpenSearch Playground
OpenSearch Project
OpenSearch Service
OpenSSL
OpenTelemetry
OpenTelemetry Collector
OTel
Packetbeat
Painless
Peer Forwarder
Performance Analyzer
Piped Processing Language
Point in Time
PowerShell
Python
PyTorch
Querqy
Query Workbench
RCF Summarize
RPM Package Manager
Ruby
Simple Schema for Observability
Tableau
TorchScript
Tribuo
VisBuilder
Winlogbeat
Zstandard


@ -0,0 +1 @@


@ -0,0 +1,143 @@
[Aa]nonymization
[Aa]utomapping
[Aa]utopopulate
[Bb]ackoff
[Bb]ackporting
[Bb]ackpressure
[Bb]asemap
[Bb]enchmarked
[Bb]igram
Boolean
[Cc]allout
[Cc]hatbots?
[Cc]odec
[Cc]omposable
[Cc]onfig
[Cc]ron
[Cc]ybersecurity
[Dd]ashboarding
[Dd]atagram
[Dd]eallocate
[Dd]eduplicates?
[Dd]eduplication
[Dd]eprovision(s|ed|ing)?
[Dd]eserialize
[Dd]eserialization
Dev
[Dd]iscoverability
Distro
[Dd]uplicative
[Ee]gress
[Ee]num
[Ff]ailover
[Ff]lyout
[Ff]sync
Gantt
[Gg]eobounds
[Gg]eodistance
[Gg]eohash
[Gg]eohex
[Gg]eopoint
[Gg]eopolygon
[Gg]eoshape
[Gg]eospatial
[Gg]eotile
gibibyte
[Hh]ashmap
[Hh]ostname
[Hh]yperparameters
[Ii]mpactful
[Ii]ngress
[Ii]nitializer
[Ii]nline
[Ii]nstrumentations?
[Ii]ntracluster
[Jj]avadoc
k-NN
[Kk]eystore
kibibyte
mebibyte
[Ll]earnings
[Ll]emmatization
Levenshtein
[Ll]inestring
[Ll]ookups?
[Ll]ossy
[Ll]owercases?d?
[Mm]acrobenchmarks?
[Mm]etaphone
[Mm]isorder
[Mm]ultifield
[Mm]ultiline
[Mm]ultimodal
[Mm]ultipoint
[Mm]ultipolygon
[Mm]ultithreaded
[Mm]ultivalued
[Mm]ultiword
[Nn]amespace
[Oo]versamples?
[Oo]nboarding
pebibyte
[Pp]erformant
[Pp]luggable
[Pp]reconfigure
[Pp]refetch
[Pp]refilter
[Pp]reload
[Pp]repackaged?
[Pp]repend
[Pp]repper
[Pp]reprocess
[Pp]retrain
[Pp]seudocode
[Rr]ebalance
[Rr]ebalancing
[Rr]edownload
[Rr]eenable
[Rr]eindex
[Rr]eingest
[Rr]erank(er|ed|ing)?
[Rr]epo
[Rr]ewriter
[Rr]ollout
[Rr]ollup
[Rr]unbooks?
[Ss]erverless
[Ss]harding
[Ss]ignificand
[Ss]napshott(ed|ing)
stdout
[Ss]temmers?
[Ss]ubaggregation
[Ss]ubcalculation
[Ss]ubcollectors?
[Ss]ubcommands?
[Ss]ubfield
[Ss]ubquer(y|ies)
[Ss]ubstrings?
[Ss]ubtag
[Ss]ubtree
[Ss]ubvector
[Ss]ubwords?
[Ss]uperset
tebibyte
[Tt]emplated
[Tt]okenization
[Tt]okenizer?
[Tt]ooltip
[Tt]ranslog
[Uu]nary
[Uu]ncheck
[Uu]ncomment
[Uu]ndeploy
[Uu]nigram
[Uu]nnesting
[Uu]nrecovered
[Uu]nregister(s|ed|ing)?
[Uu]pdatable
[Uu]psert
[Ww]alkthrough
[Ww]ebpage
xy


@ -0,0 +1 @@

.github/vale/tests/test-headings.md

@ -0,0 +1,65 @@
# This should not be flagged
---
# Example with OpenSearch should not be flagged
---
# Example with Canada should not be flagged
---
# Example with: OpenSearch should not be flagged
---
# This: Should not be flagged
---
# Step 2: (Optional) Test OpenSearch should not be flagged
---
# Example with something else: Should not be flagged
---
# Example 2: Should not be flagged
---
# 1. This should not be flagged
---
# Example: should be flagged
---
# Example 3: should be flagged
---
# Example with something else: should be Flagged twice
---
# Example With something else: should be flagged twice
---
# Example: with OpenSearch should be flagged
---
# this should be flagged
---
# This Should Be Flagged
---
The King of England should not be flagged.

.github/vale/tests/test-style-neg.md

@ -0,0 +1,91 @@
# Test file
This sentence tests Advanced Placement (AP). We should define AP before using.
This sentence tests cybersecurity.
This sentence tests dash---spacing.
This sentence tests numbers above 1.2 in versions 1.2 and earlier.
This sentence tests upper-right and lower left.
This sentence tests exclamation points.
This sentence tests failover. To fail over, we test this as a verb.
This sentence is not testing the future tense.
## This heading tests heading acronyms Security plugin
This sentence tests inclusive terminology by not using offensive terms.
## This heading tests capitalization
This sentence tests Latin elimination.
## This heading tests ending punctuation
This sentence tests Latin substitution through using Latin.
## This heading: Tests colons but fails capitalization
This sentence tests [links double parentheses]({{site.url}}{{site.baseurl}}/opensearch/).
This sentence tests [links double slash]({{site.url}}{{site.baseurl}}/opensearch/).
This sentence tests [links end slash]({{site.url}}{{site.baseurl}}/opensearch/).
This sentence tests [links mid slash]({{site.url}}{{site.baseurl}}/opensearch/).
This sentence tests using login as a noun. To log in, we test this as a verb.
To test merge conflicts, remove tick marks in `<<<<<<< HEAD`.
This sentence tests periods, colons, and commas.
This sentence tests passive voice.
We are pleased that this sentence tests pleading.
This sentence tests the range 2--5.
This sentence tests repetition of words in the middle of a sentence.
This sentence tests rollover as a noun. To roll over, we test this as a verb.
This sentence tests setup as a noun. To set up, we test this as a verb.
This sentence tests AWS Signature Version 4.
This sentence tests the "simple" rule by doing it.
These two sentences. Test the spacing punctuation.
This sentence tests the correct/incorrect slash spacing.
This sentence tests word spacing.
This sentence tests spelling.
## This and the next
This sentence tests substitution error by using the word indexes.
### Headings test stacked headings
This sentence tests substitution suggestion because of its nature.
This table | Tests capitalization
:--- | :---
of table | headings
This sentence tests timeout as a noun. To time out, we test this as a verb.
This sentence tests 2 KB units capitalization.
This sentence tests 2 KB units spacing.
This sentence tests version 1.2 not spelled out.
This sentence tests terms by using Security plugin.

.github/vale/tests/test-style-pos.md

@ -0,0 +1,93 @@
# Test file
This sentence tests AP. AP should be defined before using.
This sentence tests cyber security.
This sentence tests dash --- spacing.
This sentence tests the table above in versions 1.2 and below.
This sentence tests top-right and bottom left.
This sentence tests exclamation points!
This sentence tests fail-over. To failover, we test this as a verb.
This sentence will test the future tense.
## This heading tests heading acronyms (HA)
This sentence tests inclusive terminology by using an offensive term kill.
## This heading tests Capitalization
This sentence tests Latin elimination, etc.
## This heading tests ending punctuation.
This sentence tests Latin substitution via using Latin.
## This heading: tests colons
This sentence tests [links double parentheses](({{site.url}}{{site.baseurl}}/opensearch/)).
This sentence tests [links double slash]({{site.url}}{{site.baseurl}}/opensearch//double-slash/).
This sentence tests [links end slash]({{site.url}}{{site.baseurl}}/opensearch).
This sentence tests [links mid slash]({{site.url}}{{site.baseurl}}opensearch).
This sentence tests log-in as a noun. To login, we test this as a verb.
To test merge conflicts, remove tick marks in `<<<<<<< HEAD`.
This sentence tests periods, colons and commas.
This sentence is tested with passive voice.
Please test this sentence.
This sentence tests the range 2 -- 5.
This sentence tests repetition repetition of words in the middle of a sentence.
This sentence tests roll-over as a noun. To rollover, we test this as a verb.
This sentence tests set-up as a noun. To setup, we test this as a verb.
This sentence tests AWS SigV4 and sigv4.
This sentence simply tests the "simple" rule by just doing it.
These two sentences. Test the spacing punctuation.
This sentence tests the correct / incorrect slash spacing.
This sentence tests word spacing.
This sentence tests splling.
## This and the next
### Headings test stacked headings
This sentence tests substitution error by using the word indices.
This sentence tests substitution suggestion due to its nature.
This Table | tests capitalization
:--- | :---
of table | headings
This sentence tests time-out as a noun. To timeout, we test this as a verb.
This sentence tests 2 kb units capitalization.
This sentence tests 2KB units spacing.
This sentence tests v1.2 not spelled out.
This sentence tests terms by using security plugin.
This sentence tests using the word repo.

.github/workflows/add-untriaged.yml

@ -0,0 +1,19 @@
name: Apply 'untriaged' label during issue lifecycle
on:
issues:
types: [opened, reopened, transferred]
jobs:
apply-label:
runs-on: ubuntu-latest
steps:
- uses: actions/github-script@v6
with:
script: |
github.rest.issues.addLabels({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
labels: ['untriaged']
})

.github/workflows/backport.yml

@ -0,0 +1,40 @@
name: Backport
on:
pull_request_target:
types:
- closed
- labeled
jobs:
backport:
runs-on: ubuntu-latest
permissions:
contents: write
pull-requests: write
name: Backport
# Only react to merged PRs for security reasons.
# See https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#pull_request_target.
if: >
github.event.pull_request.merged
&& (
github.event.action == 'closed'
|| (
github.event.action == 'labeled'
&& contains(github.event.label.name, 'backport')
)
)
steps:
- name: GitHub App token
id: github_app_token
uses: tibdex/github-app-token@v1.5.0
with:
app_id: ${{ secrets.APP_ID }}
private_key: ${{ secrets.APP_PRIVATE_KEY }}
# opensearch-trigger-bot installation ID
installation_id: 22958780
- name: Backport
uses: VachaShah/backport@v2.1.0
with:
github_token: ${{ steps.github_app_token.outputs.token }}
head_template: backport/backport-<%= number %>-to-<%= base %>


@ -0,0 +1,15 @@
name: Delete merged branch of the backport PRs
on:
pull_request:
types:
- closed
jobs:
delete-branch:
runs-on: ubuntu-latest
if: startsWith(github.event.pull_request.head.ref,'backport/')
steps:
- name: Delete merged branch
uses: SvanBoxel/delete-merged-branch@main
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

.github/workflows/jekyll-build.yml

@ -0,0 +1,16 @@
name: Jekyll Build Verification
on: [pull_request]
jobs:
check:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: ruby/setup-ruby@v1
with:
ruby-version: '3.2'
bundler-cache: true
- run: |
JEKYLL_FATAL_LINK_CHECKER=internal bundle exec jekyll build --future

.github/workflows/link-checker.yml

@ -0,0 +1,26 @@
name: Check Links
on:
workflow_dispatch:
schedule:
- cron: "30 11 * * 0"
jobs:
check:
if: github.repository == 'opensearch-project/documentation-website'
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: ruby/setup-ruby@v1
with:
ruby-version: '3.0'
bundler-cache: true
- run: |
JEKYLL_FATAL_LINK_CHECKER=all bundle exec jekyll build --future
- name: Create Issue On Build Failure
if: ${{ failure() }}
uses: dblock/create-a-github-issue@v3
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
WORKFLOW_URL: "${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
with:
update_existing: true
filename: .github/ISSUE_TEMPLATE/broken_links.md

.github/workflows/vale.yml

@ -0,0 +1,23 @@
name: Style check
on:
pull_request:
workflow_dispatch:
jobs:
style-job:
runs-on: ubuntu-latest
steps:
- name: Check out
uses: actions/checkout@v3
- name: Run Vale
uses: errata-ai/vale-action@reviewdog
env:
GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
with:
fail_on_error: false
reporter: github-pr-check
filter_mode: added
vale_flags: "--no-exit"
version: 2.28.0

.vale.ini

@ -0,0 +1,70 @@
StylesPath = ".github/vale/styles"
Vocab = "OpenSearch"
MinAlertLevel = warning
SkippedScopes = code, style
[*.md]
BasedOnStyles = Vale, OpenSearch
BlockIgnores = {%-?\s*comment[.|\s|\S]*?endcomment\s*-?%}, \
{%\s*raw[.|\s|\S]*?endraw\s*%}, \
{:+\s*[\.\w-\s]*\s*}, \
{%\s+[^%]*%}
# ignore variables
TokenIgnores = [a-zA-Z_]+((?:_|\.)[a-zA-Z]+)+
# override Vale spelling
Vale.Spelling = NO
Vale.Repetition = NO
Vale.Terms = YES
OpenSearch.AcronymParentheses = YES
OpenSearch.Ampersand = YES
OpenSearch.Cyber = YES
OpenSearch.DashSpacing = YES
OpenSearch.DirectionAboveBelow = YES
OpenSearch.DirectionTopBottom = YES
OpenSearch.Exclamation = YES
OpenSearch.FailoverNoun = YES
OpenSearch.FailoverVerb = YES
OpenSearch.FutureTense = NO
OpenSearch.HeadingAcronyms = YES
OpenSearch.HeadingCapitalization = YES
OpenSearch.HeadingColon = YES
OpenSearch.HeadingPunctuation = YES
OpenSearch.Inclusive = YES
OpenSearch.LatinismsElimination = YES
OpenSearch.LatinismsSubstitution = YES
OpenSearch.LinksDoubleParentheses = YES
OpenSearch.LinksDoubleSlash = YES
OpenSearch.LinksEndSlash = YES
OpenSearch.LinksMidSlash = YES
OpenSearch.LoginNoun = YES
OpenSearch.LoginVerb = YES
OpenSearch.LogoutNoun = YES
OpenSearch.LogoutVerb = YES
OpenSearch.MergeConflicts = YES
OpenSearch.OxfordComma = YES
OpenSearch.PassiveVoice = NO
OpenSearch.Please = YES
OpenSearch.Range = YES
OpenSearch.Repetition = YES
OpenSearch.RolloverNoun = YES
OpenSearch.RolloverVerb = YES
OpenSearch.SetupNoun = YES
OpenSearch.SetupVerb = YES
OpenSearch.SignatureV4 = YES
OpenSearch.Simple = YES
OpenSearch.SpacingPunctuation = YES
OpenSearch.SpacingSlash = YES
OpenSearch.SpacingWords = YES
OpenSearch.Spelling = YES
OpenSearch.StackedHeadings = YES
OpenSearch.SubstitutionsError = YES
OpenSearch.SubstitutionsSuggestion = YES
OpenSearch.TableHeadings = YES
OpenSearch.TimeoutNoun = YES
OpenSearch.TimeoutVerb = YES
OpenSearch.UnitsNames = YES
OpenSearch.UnitsSpacing = YES
OpenSearch.Version = YES

404.md
@@ -1,3 +1,19 @@
---
permalink: /404.html
title: 404
layout: default
heading_anchors: false
nav_exclude: true
---
## Oops, this isn't the page you're looking for.
Maybe our [home page](https://opensearch.org/docs/latest) or one of the commonly visited pages below will help. If you need further support, please use the feedback feature on the right side of the screen to get in touch.
- [Quickstart]({{site.url}}{{site.baseurl}}/quickstart/)
- [Installing OpenSearch]({{site.url}}{{site.baseurl}}/install-and-configure/install-opensearch/index/)
- [OpenSearch Dashboards]({{site.url}}{{site.baseurl}}/dashboards/index/)
- [Query DSL]({{site.url}}{{site.baseurl}}/query-dsl/)
- [API Reference]({{site.url}}{{site.baseurl}}/api-reference/index/)

API_STYLE_GUIDE.md Normal file
@@ -0,0 +1,197 @@
# API Style Guide
This guide provides the basic structure for creating OpenSearch API documentation. It includes the elements that we feel are most important to creating complete and useful API documentation, as well as descriptions and examples where appropriate.
Depending on the intended purpose of the API, *some sections will be required while others may not be applicable*.
Use the [API_TEMPLATE](templates/API_TEMPLATE.md) to create an API documentation page.
### A note on terminology
Terminology for API parameters varies in the software industry, where two or even three names may be used to label the same type of parameter. For consistency, we use the following nomenclature for parameters in our API documentation:
* *Path parameter*: "path parameter" and "URL parameter" are sometimes used synonymously. To avoid confusion, we use "path parameter" in this documentation.
* *Query parameter*: This parameter name is often used synonymously with "request parameter." We use "query parameter" to be consistent.
### General usage for code elements
When you describe any code element in a sentence, such as an API, a parameter, or a field, you can use the noun name.
*Example usage*:
The time field provides a timestamp for job completion.
When you provide an exact example with a value, you can use the code element in code font.
*Example usage*:
The response provides a value for `time_field`, such as “timestamp.”
Provide a REST API call example in `json` format. Optionally, also include the `curl` command if the call can only be executed in a command line.
## Basic elements for documentation
The following sections describe the basic API documentation structure. Each section is discussed under its respective heading. Include only those elements appropriate to the API.
Depending on where the documentation appears within a section or subsection, heading levels may be adjusted to fit with other content.
1. Name of API (heading level 2)
1. (Optional) Path and HTTP methods (heading level 3)
1. Path parameters (heading level 3)
1. Query parameters (heading level 3)
1. Request fields (heading level 3)
1. Example request (heading level 4)
1. Example response (heading level 4)
1. Response fields (heading level 3)
## API name
Provide an API name that describes its function, followed by a description of its top use case and any usage recommendations.
*Example function*: "Autocomplete queries"
Use sentence capitalization for the heading (for example, "Create or update mappings"). When you refer to the API operation, you can use lowercase with code font.
If there is a corresponding OpenSearch Dashboards feature, provide a “See also” link that references it.
*Example*: “To learn more about monitor findings, see [Document findings](https://opensearch.org/docs/latest/monitoring-plugins/alerting/monitors/#document-findings)."
If applicable, provide any caveats to its usage with a note or tip, as in the following example:
"If you use the Security plugin, make sure you have the appropriate permissions."
(To set this point in note-style format, follow the text on the next line with {: .note})
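For example, the caveat above set in note-style format looks like the following in Markdown:

```
If you use the Security plugin, make sure you have the appropriate permissions.
{: .note}
```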
### Path and HTTP methods
For relatively complex API calls that include path parameters, it's sometimes a good idea to provide an example so that users can visualize how the request is properly formed. This section is optional and includes examples that illustrate how the endpoint and path parameters fit together in the request. The following is an example of this section for the nodes stats API:
```json
GET /_nodes/stats
GET /_nodes/<node_id>/stats
GET /_nodes/stats/<metric>
GET /_nodes/<node_id>/stats/<metric>
GET /_nodes/stats/<metric>/<index_metric>
GET /_nodes/<node_id>/stats/<metric>/<index_metric>
```
### Path parameters
While the API endpoint states a point of entry to a resource, the path parameter acts on the resource that precedes it. Path parameters come after the resource name in the URL.
In the following example, the resource is `scroll` and its path parameter is `<scroll_id>`:
```json
GET _search/scroll/<scroll_id>
```
Introduce what the path parameters can do at a high level. Provide a table with parameter names and descriptions. Include a table with the following columns:
*Parameter*: Parameter name in plain font.
*Data type*: Data type capitalized (such as Boolean, String, or Integer).
*Description*: Sentence to describe the parameter's function, default values or range of values, and any usage examples.
Parameter | Data type | Description
:--- | :--- | :---
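For example, a completed path parameter table for the scroll request shown above might look like the following (the description is illustrative, not authoritative API documentation):

Parameter | Data type | Description
:--- | :--- | :---
scroll_id | String | The ID of the scroll context returned by a previous scroll request.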
### Query parameters
In terms of placement, query parameters are always appended to the end of the URL and located to the right of the operator "?". Query parameters serve the purpose of modifying information to be retrieved from the resource.
In the following example, the endpoint is `aliases` and its query parameter is `v` (provides verbose output):
```json
GET _cat/aliases?v
```
Include a paragraph that describes how to use the query parameters with an example in code font. Include the query parameter operator "?" to delineate query parameters from path parameters.
For GET and DELETE APIs: Introduce what you can do with the optional parameters. Include a table with the same columns as the path parameter table.
Parameter | Data type | Description
:--- | :--- | :---
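For example, a completed row for the `v` query parameter shown in the `_cat/aliases` request might look like the following (the description is illustrative, not authoritative API documentation):

Parameter | Data type | Description
:--- | :--- | :---
v | Boolean | Whether to return verbose output, including column headers. Default is `false`.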
### Request fields
For PUT and POST APIs: Introduce the request fields that can be provided in the body of the request.
Include a table with these columns:
*Field*: Field name in plain font.
*Data type*: Data type capitalized (such as Boolean, String, or Integer).
*Description*: Sentence to describe the field's function, default values or range of values, and any usage examples.
Field | Data type | Description
:--- | :--- | :---
#### Example request
Provide a sentence that describes what is shown in the example, followed by a cut-and-paste-ready API request in JSON format. Test the request yourself in the Dashboards Dev Tools console to make sure it works. See the following examples.
The following request gets all the settings in your index:
```json
GET /sample-index1/_settings
```
The following request copies all of your field mappings and settings from a source index to a destination index:
```json
POST _reindex
{
  "source": {
    "index": "sample-index-1"
  },
  "dest": {
    "index": "sample-index-2"
  }
}
```
#### Example response
Include a JSON example response to show what the API returns. See the following examples.
The `GET /sample-index1/_settings` request returns the following response fields:
```json
{
  "sample-index1": {
    "settings": {
      "index": {
        "creation_date": "1622672553417",
        "number_of_shards": "1",
        "number_of_replicas": "1",
        "uuid": "GMEA0_TkSaamrnJSzNLzwg",
        "version": {
          "created": "135217827",
          "upgraded": "135238227"
        },
        "provided_name": "sample-index1"
      }
    }
  }
}
```
The `POST _reindex` request returns the following response fields:
```json
{
  "took" : 4,
  "timed_out" : false,
  "total" : 0,
  "updated" : 0,
  "created" : 0,
  "deleted" : 0,
  "batches" : 0,
  "version_conflicts" : 0,
  "noops" : 0,
  "retries" : {
    "bulk" : 0,
    "search" : 0
  },
  "throttled_millis" : 0,
  "requests_per_second" : -1.0,
  "throttled_until_millis" : 0,
  "failures" : [ ]
}
```
### Response fields
For PUT and POST APIs: Define all allowable response fields that can be returned in the body of the response.
Field | Data type | Description
:--- | :--- | :---

CONTRIBUTING.md
@@ -1,59 +1,144 @@
- [Creating an issue](#creating-an-issue)
- [Contributing content](#contributing-content)
- [Contribution workflow](#contribution-workflow)
- [Before you start](#before-you-start)
- [Making minor changes](#making-minor-changes)
- [Making major changes](#making-major-changes)
- [Setting up your local copy of the repository](#setting-up-your-local-copy-of-the-repository)
- [Making, viewing, and submitting changes](#making-viewing-and-submitting-changes)
- [Review process](#review-process)
- [Style linting](#style-linting)
- [Getting help](#getting-help)
# Contributing guidelines
Thank you for your interest in improving the OpenSearch documentation! We value and appreciate all feedback and contributions from our community, including requests for additional documentation, corrections to existing content, and reports of technical issues with the documentation site.
You can [create an issue](#creating-an-issue) asking us to change the documentation or [contribute content](#contributing-content) yourself.
NOTE: If you'd like to contribute but don't know where to start, try browsing existing [issues](https://github.com/opensearch-project/documentation-website/issues). Our projects use custom GitHub issue labels for status, version, type of request, and so on. We recommend starting with any issue labeled "good first issue" if you're a beginner or "help wanted" if you're a more experienced user.
## Creating an issue
Use the documentation issue template to describe the change you'd like to make:
1. Go to https://github.com/opensearch-project/documentation-website/issues and select **New issue**.
1. Enter the requested information, including as much detail as possible, especially which version or versions the request affects.
1. Select **Submit new issue**.
The `untriaged` label is assigned automatically. During the triage process, the Documentation team will add the appropriate labels, assign the issue to a technical writer, and prioritize the request. We may follow up with you for additional information.
## Contributing content
There are two ways to contribute content, depending on the magnitude of the change:
- [Minor changes](#making-minor-changes): For small changes to existing files, like fixing typos or adding parameters, you can edit files in GitHub directly. This approach does not require cloning the repository and does not allow you to test the documentation.
- [Major changes](#making-major-changes): For changes you want to test first, like adding new or reorganizing pages or adding a table or section, you can edit files locally and push the changes to GitHub. This approach requires setting up a local version of the repository and allows you to test the documentation.
### Contribution workflow
The workflow for contributing documentation is the same as the one for contributing code:
- Make your changes.
- Build the documentation website to check your work (only possible if you are making changes locally).
- Submit a [pull request](https://github.com/opensearch-project/documentation-website/pulls) (PR).
- A maintainer reviews and merges your PR.
### Before you start
Before contributing content, make sure to read the following resources:
- [README](README.md)
- [OpenSearch Project Style Guidelines](STYLE_GUIDE.md)
- [API Style Guide](API_STYLE_GUIDE.md)
- [Formatting Guide](FORMATTING_GUIDE.md)
NOTE: Make sure that any documentation you submit is your own work or work that you have the right to submit. We respect the intellectual property rights of others, and as part of contributing, we'll ask you to sign your contribution with a [Developer Certificate of Origin (DCO)](https://github.com/opensearch-project/.github/blob/main/CONTRIBUTING.md#developer-certificate-of-origin) stating that you have the right to submit your contribution and that you understand that we will use your contribution.
### Making minor changes
If you want to make minor changes to an existing file, you can use this approach:
1. [Fork this repository](https://docs.github.com/en/get-started/quickstart/fork-a-repo).
1. In your fork on GitHub, navigate to the file that you want to change.
1. In the upper-right corner, select the pencil icon and edit the file.
1. In the upper-right corner, select **Commit changes...**. Enter the commit message and optional description and select **Create a new branch for this commit and start a pull request**.
### Making major changes
If you're adding a new page or making major changes to the documentation, such as adding new images, sections, or styling, we recommend that you work in a local copy of the repository and test the rendered HTML before submitting a PR.
#### Setting up your local copy of the repository
Follow these steps to set up your local copy of the repository:
1. [Fork this repository](https://docs.github.com/en/get-started/quickstart/fork-a-repo) and clone your fork.
1. Navigate to your cloned repository.
1. Install [Ruby](https://www.ruby-lang.org/en/) if you don't already have it. We recommend [RVM](https://rvm.io/), but you can use any method you prefer:
```
curl -sSL https://get.rvm.io | bash -s stable
rvm install 3.2
ruby -v
```
1. Install [Bundler](https://bundler.io/) if you don't already have it:
```
gem install bundler
```
1. Install Jekyll and all the dependencies:
```
bundle install
```
#### Making, viewing, and submitting changes
Here's how to build the website, make changes, and view them locally:
1. Build the website:
```
sh build.sh
```
The build script should automatically open your web browser, but if it doesn't, open [http://localhost:4000/docs/](http://localhost:4000/docs/).
1. Create a new branch against the latest source on the main branch.
1. Edit the Markdown files that you want to change.
1. When you save a file, Jekyll automatically rebuilds the site and refreshes your web browser. This process can take 60--90 seconds.
1. When you're happy with how everything looks, commit, [sign off](https://github.com/src-d/guide/blob/9171d013c648236c39faabcad8598be3c0cf8f56/developer-community/fix-DCO.md#how-to-prevent-missing-sign-offs-in-the-future), push your changes to your fork, and submit a PR.
Note that a PR requires DCO sign-off before we can merge it. You can use the `-s` command line option to append this automatically to your commit message, for example, `git commit -s -m 'This is my commit message'`. For more information, see https://github.com/apps/dco.
## Review process
We greatly appreciate all contributions to the documentation and will review them as quickly as possible.
During the PR process, expect that there will be some back-and-forth. If you want your contribution to be merged quickly, try to respond to comments in a timely fashion, and let us know if you don't want to continue with the PR.
We use the [Vale](https://github.com/errata-ai/vale) linter to ensure that our documentation adheres to the [OpenSearch Project Style Guidelines](STYLE_GUIDE.md). Addressing Vale comments on the PR expedites the review process. You can also install Vale locally so you can address the comments before creating a PR. For more information, see [Style linting](#style-linting).
If we accept the PR, we will merge it and will backport it to the appropriate branches.
### Style linting
To ensure that our documentation adheres to the [OpenSearch Project Style Guidelines](STYLE_GUIDE.md), we use the [Vale](https://github.com/errata-ai/vale) linter. Addressing Vale comments on the PR expedites the review process. You can also install Vale locally as follows so you can address the comments before creating a PR:
1. Run `brew install vale`.
2. Run `vale *` from the documentation site root directory to lint all Markdown files. To lint a specific file, run `vale /path/to/file`.
Optionally, you can install the [Vale VSCode](https://github.com/chrischinchilla/vale-vscode) extension, which integrates Vale with Visual Studio Code. By default, only _errors_ and _warnings_ are underlined. To change the minimum alert level to include _suggestions_, go to **Vale VSCode** > **Extension Settings** and select **suggestion** in the **Vale > Vale CLI: Min Alert Level** dropdown list.
## Getting help
For help with the contribution process, reach out to one of the [points of contact](README.md#points-of-contact).

FORMATTING_GUIDE.md Normal file
@@ -0,0 +1,470 @@
# Formatting Guide
This guide provides an overview of the formatted elements commonly used in the OpenSearch documentation.
* * *
### Table of contents
* [Adding pages or sections](#adding-pages-or-sections)
* [Buttons](#buttons)
* [Callouts](#callouts)
* [Collapsible blocks](#collapsible-blocks)
* [Dashes](#dashes)
* [Horizontal rule](#horizontal-rule)
* [Images](#images)
* [Images in line with text](#images-in-line-with-text)
* [Labels](#labels)
* [Links](#links)
* [Lists](#lists)
* [Unordered lists](#unordered-lists)
* [Ordered lists](#ordered-lists)
* [Nested lists](#nested-lists)
* [Lists with code snippets or images](#lists-with-code-snippets-or-images)
* [Math](#math)
* [Tables](#tables)
* [Text style](#text-style)
* [Variables in curly braces](#variables-in-curly-braces)
* [Videos](#videos)
* * *
## Adding pages or sections
This repository contains [Markdown](https://guides.github.com/features/mastering-markdown/) files organized into Jekyll _collections_ (for example, `_api-reference` or `_dashboards`). Each Markdown file corresponds to one page on the website.
In addition to the content for a given page, each Markdown file contains some Jekyll [front matter](https://jekyllrb.com/docs/front-matter/) similar to the following:
```
---
layout: default
title: Date
nav_order: 25
has_children: false
parent: Date field types
grand_parent: Supported field types
---
```
If you want to reorganize content or add a new page, make sure to set the appropriate `has_children`, `parent`, `grand_parent`, and `nav_order` variables, which define the hierarchy of pages in the left navigation.
When adding a page or a section, make the `nav_order` of the child pages multiples of 10. For example, if you have a parent page `Clients`, make child pages `Java`, `Python`, and `JavaScript` have a `nav_order` of 10, 20, and 30, respectively. Doing so makes inserting additional child pages easier because it does not require you to renumber existing pages.
Each collection must have an `index.md` file that corresponds to the collection's index page. In the `index.md` file's front matter, specify `nav_exclude: true` so that the page does not appear separately under the collection.
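For example, a collection's `index.md` front matter might look like the following sketch (the title and `nav_order` values are illustrative):

```
---
layout: default
title: Query DSL
nav_order: 1
has_children: true
nav_exclude: true
---
```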
## Buttons
You can use either `copy` or `copy-curl` includes for code snippets. The `copy` include places a **Copy** button on the code snippet, while the `copy-curl` include places both **Copy** and **Copy as cURL** buttons. Use the `copy-curl` include for API requests. If an API request is already in the cURL format, use the `copy` include.
**Example of a `copy` include**
````
```bash
curl -XGET "localhost:9200/_tasks?actions=*search&detailed"
```
{% include copy.html %}
````
**Example of a `copy-curl` include**
````
```json
PUT /sample-index1/_clone/cloned-index1
{
  "aliases": {
    "sample-alias1": {}
  }
}
```
{% include copy-curl.html %}
````
## Callouts
You can use four levels of callouts:
* `{: .note}` blue
* `{: .tip }` green
* `{: .important}` yellow
* `{: .warning}` red
Place a callout directly under the paragraph to which you want to apply the callout style.
**Example**
```
In case of a cluster or node failure, all PIT data is lost.
{: .note}
```
For a callout with multiple paragraphs or lists, use `>`:
```
> **PREREQUISITE**
>
> To use a custom vector map with GeoJSON, install these two required plugins:
> * OpenSearch Dashboards Maps [`dashboards-maps`](https://github.com/opensearch-project/dashboards-maps) front-end plugin
> * OpenSearch [`geospatial`](https://github.com/opensearch-project/geospatial) backend plugin
{: .note}
```
## Collapsible blocks
To insert an open collapsible block, use the `<details>` element as follows:
````html
<details open markdown="block">
<summary>
Response
</summary>
{: .text-delta}
```json
{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  }
}
```
</details>
````
To insert a closed collapsible block, omit the `open` state:
````html
<details markdown="block">
<summary>
Response
</summary>
{: .text-delta}
```json
{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  }
}
```
</details>
````
Collapsible blocks are useful for long responses and for the Table of Contents at the beginning of a page.
## Dashes
Use one dash for hyphens, two for en dashes, and three for em dashes:
```
upper-right
10--12 nodes per cluster
There is one candidate generator available---`direct_generator`.
```
## Horizontal rule
A horizontal rule is used to separate text sections. Use three asterisks separated by spaces for a horizontal rule:
```
## Why use OpenSearch?
* * *
```
## Images
Place images in the `images` directory of the documentation website. To refer to images, use relative links (see [Internal links](#internal-links) for more information).
Markdown images are responsive by default. To insert a Markdown image, use the `![<alternate text>](link)` syntax:
```
![OS branding]({{site.url}}{{site.baseurl}}/images/brand.png)
```
Markdown uses the image's actual width to render it. It sets the maximum image width to the width of the main body panel.
If you want to specify the image width or another style, use HTML syntax:
```
<img src="{{site.url}}{{site.baseurl}}/images/brand.png" alt="OS branding" width="700"/>
```
You can specify width as a hard-coded number of pixels, as in the preceding example, or as a percentage of the parent width:
```
<img src="{{site.url}}{{site.baseurl}}/images/brand.png" alt="OS branding" width="70%"/>
```
To stretch the image to fit the width of the main body panel, use `width="100%"`.
To take high-resolution screenshots, in Firefox, right-click on the page and choose **Take Screenshot**.
Image borders are automatic; do not manually add a border to an image.
Always **separate an image from the text with a blank line**:
```
To send a query to OpenSearch, select the query by placing the cursor anywhere in the query text. Then choose the triangle on the top right of the request or press `Ctrl/Cmd+Enter`:
<img src="{{site.url}}{{site.baseurl}}/images/dev-tools/dev-tools-send.png" alt="Send request">
```
Do not place an image next to text or insert artificial line breaks using `<br>`. Otherwise, the text might render as aligned to the bottom of the image, with the image on the right.
If the image is under a list item, place it on a new line with a tab. For more examples, see [Lists with code snippets or images](#lists-with-code-snippets-or-images).
### Images in line with text
When describing an icon, use the icon's name followed by an inline image in parentheses. Insert the image in line with text using the `nomarkdown` extension and an HTML image:
```
Choose the play icon ({::nomarkdown}<img src="{{site.url}}{{site.baseurl}}/images/dev-tools/play-icon.png" class="inline-icon" alt="play icon"/>{:/}) on the upper right of the request.
```
## Labels
You can use the following labels:
* label-blue
* label-green
* label-purple
* label-red
* label-yellow
Use a purple label to specify the version in which an API was introduced:
```
# Alias
Introduced 1.0
{: .label .label-purple }
```
If we introduce a breaking change to an operation, add an additional label with a link to the release note for that breaking change:
```
## Get roles
Introduced 1.0
{: .label .label-purple }
[Last breaking change 2.0](https://example.com)
{: .label .label-red }
```
## Links
To add a link to a document, section, or image, use the `[name](link)` syntax, for example:
```
## Looking for the Javadoc?
See [opensearch.org/javadocs/](https://opensearch.org/javadocs/).
```
### Section links
**Section links** are links to headings in your document. Markdown lowercases the headings for links, drops back ticks, and replaces spaces with hyphens:
```
## The `minimum_should_match` parameter
For more information, see [the `minimum_should_match` parameter](#the-minimum_should_match-parameter).
```
### Internal links
**Internal links** are links to another document or image within the documentation website. Because the documentation website is versioned, do not hard code the version number in the link. Use the relative path, where `{{site.url}}{{site.baseurl}}` refers to the main directory, instead:
```
If you need to use a field for exact-value search, map it as a [`keyword`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/keyword/).
```
### GitHub links
When linking to a GitHub issue or PR, refer to the issue or PR number in the following format:
```
For more details, see issue [#1940](https://github.com/opensearch-project/opensearch/issues/1940).
```
## Lists
Markdown supports unordered and ordered lists, nested lists, and lists with code snippets or images.
### Unordered lists
Use asterisks or dashes for unordered lists:
```
* One
* Two
```
or
```
- One
- Two
```
Lists with dashes render the list items closer to each other vertically, while lists with asterisks have more space between the lines.
Don't mix and match asterisks and dashes.
### Ordered lists
Use all 1s for ordered lists:
```
1. One
1. Two
```
Jekyll automatically numbers the items correctly, making it much easier for you to insert and delete items without renumbering.
If there is a paragraph in the middle of a list, the list will restart with 1 after the paragraph. If you want to continue the list after the paragraph, use `counter-reset: none`:
```
1. One
Paragraph that breaks the numbering
{:style="counter-reset: none"}
1. Two
```
### Nested lists
Use tabs to nest lists:
```
1. Parent 1
- Child 1
- Child 2
- Grandchild 1
```
Markdown automatically adjusts numbered lists so that they use numbers and letters, so always use 1s for nested numbered lists.
### Lists with code snippets or images
If you need to position an image or a code snippet within a list, use tabs to signal to Markdown that the image or code snippet is part of the list item.
**Example with code snippets**
```
1. Run the demo batch script.
   There are two ways of running the batch script:
   1. Run the batch script using the Windows UI:
      1. Navigate to the top directory of your OpenSearch installation and open the `opensearch-{{site.opensearch_version}}` folder.
      1. Run the batch script by double-clicking the `opensearch-windows-install.bat` file. This opens a command prompt with an OpenSearch instance running.
   1. Run the batch script from Command Prompt or PowerShell:
      1. Open Command Prompt by entering `cmd`, or PowerShell by entering `powershell`, in the search box next to **Start** on the taskbar.
      1. Change to the top directory of your OpenSearch installation.
         ```bat
         cd \path\to\opensearch-{{site.opensearch_version}}
         ```
      1. Run the batch script.
         ```bat
         .\opensearch-windows-install.bat
         ```
```
**Example with images**
```
1. To begin, select the rule in the **Rule name** column. The rule details pane opens, as shown in the following image.
   <img src="{{site.url}}{{site.baseurl}}/images/Security/rule-dup2.png" alt="Opening the rule details pane" width="50%">
1. Select the **Duplicate** button in the upper-right corner of the pane. The **Duplicate rule** window opens in Visual Editor view, and all of the fields are automatically populated with the rule's details. Details are also populated in YAML Editor view, as shown in the following image.
   <img src="{{site.url}}{{site.baseurl}}/images/Security/dupe-rule.png" alt="Selecting the duplicate button opens the Duplicate rule window" width="50%">
```
## Math
To add mathematical expressions to a page, add `has_math: true` to the page's front matter. Then insert LaTeX math into HTML tags with the rest of your Markdown content, as shown in the following example:
```
## Math
Some Markdown paragraph. Here's a formula:
<p>
When \(a \ne 0\), there are two solutions to \(ax^2 + bx + c = 0\) and they are
\[x = {-b \pm \sqrt{b^2-4ac} \over 2a}.\]
</p>
And back to Markdown.
```
## Tables
Markdown table columns are automatically sized, and there is no need to specify a different number of dashes in the formatting.
**Example**
```
Header 1 | Header 2
:--- | :---
Body 1 | Body 2, which is extremely lengthy, but there is no need to specify its width.
```
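For reference, the preceding source renders as a table similar to the following:

Header 1 | Header 2
:--- | :---
Body 1 | Body 2, which is extremely lengthy, but there is no need to specify its width.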
To insert line breaks within tables, use `<br>`:
```
Header 1 | Header 2
:--- | :---
Body 1 | Body paragraph 1 <br> Body paragraph 2
```
To use lists within a table, use `<br>` and `-`:
```
Header 1 | Header 2
:--- | :---
Body 1 | List:<br>- One<br>- Two
```
You can also use `&nbsp;` to insert one space, `&ensp;` to insert two spaces, and `&emsp;` to insert four spaces in table cells.
If you need a list with real bullet points, use the bullet point HTML code:
```
Header 1 | Header 2
:--- | :---
Body 1 | List:<br>&ensp;&#x2022; One<br>&ensp;&#x2022; Two
```
## Text style
You can style text in the following ways:
* ```**bold**```
* ```_italic_``` or ```*italic*```
For guidance on using code examples and when to use code font, see [Code examples](https://github.com/opensearch-project/documentation-website/blob/main/STYLE_GUIDE.md#code-examples).
## Variables in curly braces
To correctly display variables that are in curly braces, escape the curly braces with the `{% raw %}{% endraw %}` tags:
````
"message_template": {
"source": "the index is {% raw %}{{ctx.index}}{% endraw %}"
}
````
The variable `ctx.index` is rendered in double curly braces.
## Videos
To insert a video, add a YouTube player include similar to the following:
```
{% include youtube-player.html id='_g46WiGPhFs' %}
```
Note that the `id` variable refers to the YouTube video ID at the end of the URL. For example, the YouTube video at the URL `https://youtu.be/_g46WiGPhFs` has the ID `_g46WiGPhFs`. The ID must be surrounded with single quotation marks.

Gemfile
@@ -8,7 +8,7 @@ source "https://rubygems.org"
#
# This will help ensure the proper Jekyll version is running.
# Happy Jekylling!
gem "jekyll", "~> 4.2.0"
gem "jekyll", "~> 4.3.2"
# This is the default theme for new Jekyll sites. You may change this to anything you like.
gem "just-the-docs", "~> 0.3.3"
@@ -22,6 +22,7 @@ gem "jekyll-redirect-from", "~> 0.16"
# If you have any plugins, put them here!
group :jekyll_plugins do
gem "jekyll-last-modified-at"
gem "jekyll-sitemap"
end
@@ -31,4 +32,11 @@ gem "tzinfo-data", platforms: [:mingw, :mswin, :x64_mingw, :jruby]
# Performance-booster for watching directories on Windows
gem "wdm", "~> 0.1.0" if Gem.win_platform?
# Installs webrick dependency for building locally
gem "webrick", "~> 1.7"
# Link checker
gem "typhoeus"
gem "ruby-link-checker"
gem "ruby-enum"

MAINTAINERS.md
@@ -0,0 +1,16 @@
## Overview
This document contains a list of the maintainers of this repo. See [opensearch-project/.github/RESPONSIBILITIES.md](https://github.com/opensearch-project/.github/blob/main/RESPONSIBILITIES.md#maintainer-responsibilities), which explains what the role of maintainer means, what maintainers do in this and other repos, and how they should do it. If you're interested in contributing and becoming a maintainer, see [CONTRIBUTING](CONTRIBUTING.md).
## Current Maintainers
| Maintainer | GitHub ID | Affiliation |
| ---------------- | ----------------------------------------------- | ----------- |
| Heather Halter | [hdhalter](https://github.com/hdhalter) | Amazon |
| Fanit Kolchina | [kolchfa-aws](https://github.com/kolchfa-aws) | Amazon |
| Nate Archer | [Naarcha-AWS](https://github.com/Naarcha-AWS) | Amazon |
| Nate Bower | [natebower](https://github.com/natebower) | Amazon |
| Melissa Vagi | [vagimeli](https://github.com/vagimeli) | Amazon |
| Miki Barahmand | [AMoo-Miki](https://github.com/AMoo-Miki) | Amazon |
| David Venable | [dlvenable](https://github.com/dlvenable) | Amazon |
| Stephen Crawford | [scrawfor99](https://github.com/scrawfor99) | Amazon |

@@ -19,6 +19,7 @@
Contributions from community members play a very important role in keeping this documentation complete, accurate, well organized, and up to date.
- Do you work on one of the various OpenSearch plugins? Take a look at the documentation for the plugin. Is everything accurate? Will anything change in the near future?
## How you can help
@@ -297,12 +298,12 @@ And back to Markdown.
## Security
See the [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) page for more information.
If you discover a potential security issue in this project, we ask that you notify AWS/Amazon Security using our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Do **not** create a public GitHub issue.
## License
This project is licensed under the Apache-2.0 License
This project is licensed under the Apache-2.0 License.
## Copyright

STYLE_GUIDE.md
@@ -0,0 +1,510 @@
# OpenSearch Project Style Guidelines
Welcome to the content style guide for the OpenSearch Project. This guide covers the style standards to be observed when creating OpenSearch content and will evolve as we implement best practices and lessons learned in order to best serve the community.
In addition to this guide and [TERMS.md](https://github.com/opensearch-project/documentation-website/blob/main/TERMS.md), our content is generally edited in accordance with the [Microsoft Writing Style Guide](https://docs.microsoft.com/en-us/style-guide/welcome/), [The Chicago Manual of Style](https://www.chicagomanualofstyle.org/home.html), and [Merriam-Webster](https://www.merriam-webster.com/) (listed in order of precedence); however, we may deviate from these style guides in order to maintain consistency and accommodate the unique needs of the community. This is by no means an exhaustive list of style standards, and we value transparency, so we welcome contributions to our style standards and guidelines. If you have a question regarding our standards or adherence/non-adherence to the style guides or would like to make a contribution, please tag @natebower on GitHub.
## Naming conventions, voice, tone, and brand personality traits
The following sections provide guidance on OpenSearch Project naming conventions, voice, tone, and brand personality traits.
### Naming conventions
The following naming conventions should be observed in OpenSearch Project content:
* Capitalize both words when referring to the *OpenSearch Project*.
* *OpenSearch* is the name for the distributed search and analytics engine used by Amazon OpenSearch Service.
* Amazon OpenSearch Service is a managed service that makes it easy to deploy, operate, and scale OpenSearch. Use the full name *Amazon OpenSearch Service* on first appearance. The abbreviated service name, *OpenSearch Service*, can be used for subsequent appearances.
* Amazon OpenSearch Serverless is an on-demand serverless configuration for Amazon OpenSearch Service. Use the full name *Amazon OpenSearch Serverless* on first appearance. The abbreviated service name, *OpenSearch Serverless*, can be used for subsequent appearances.
* OpenSearch Dashboards is the UI for OpenSearch. On first appearance, use the full name *OpenSearch Dashboards*. *Dashboards* can be used for subsequent appearances.
* *Security Analytics* is a security information and event management (SIEM) solution for OpenSearch. Capitalize both words when referring to the name of the solution.
* Observability is a collection of plugins and applications that let you visualize data-driven events by using Piped Processing Language (PPL). Capitalize *Observability* when referring to the name of the solution.
* Refer to OpenSearch Project customers as *users*, and refer to the larger group of users as *the community*. Do not refer to the OpenSearch Project or to the AWS personnel working on the project as a *team*, as this implies differentiation within the community.
#### Product names
Capitalize product names. The OpenSearch Project has three products: OpenSearch, OpenSearch Dashboards, and Data Prepper. For example:
* “To install *OpenSearch*, download the Docker image.”
* “To access *OpenSearch Dashboards*, open your browser and navigate to http://localhost:5601/app/home.”
* “*Data Prepper* contains the following components:”
Capitalize the names of clients and tools. For example:
* “The OpenSearch *Python* client provides a more natural syntax for interacting with your cluster.”
* “The *Go* client retries requests for a maximum of three times by default.”
* “The *OpenSearch Kubernetes Operator* is an open-source Kubernetes operator that helps automate the deployment and provisioning of OpenSearch and OpenSearch Dashboards in a containerized environment.”
* “You can send events to *Logstash* from many different sources.”
#### Features
Features are the individual building blocks of user experiences, reflect the functionality of a product, and are shared across different experiences. For example, the SQL/PPL, reporting, notifications, alerting, and anomaly detection used for observability are the same SQL/PPL, reporting, notifications, alerting, and anomaly detection used for general analytics, security analytics, and search analytics. Components of the user experience, such as navigation, credentials management, and theming, are also considered features.
Use lowercase when referring to features, unless you are referring to a formally named feature that is specific to OpenSearch. For example:
* “The Notifications plugin provides a central location for all of your *notifications* from OpenSearch plugins.”
* “*Remote-backed storage* is an experimental feature. Therefore, we do not recommend the use of *remote-backed storage* in a production environment.”
* “You can take and restore *snapshots* using the snapshot API.”
* “You can use the *VisBuilder* visualization type in OpenSearch Dashboards to create data visualizations by using a drag-and-drop gesture” (You can refer to VisBuilder alone or qualify the term with “visualization type”).
#### Plugin names
A plugin is a feature or distinct component that extends the functionality of OpenSearch. For now, capitalize plugin names, but use *plugin* sparingly. The concept of plugins will become obsolete once we re-architect the product. For example:
* “Interaction with the *ML Commons* plugin occurs through either the REST API or [ad](https://opensearch.org/docs/latest/search-plugins/sql/ppl/functions#ad) and [kmeans](https://opensearch.org/docs/latest/search-plugins/sql/ppl/functions#kmeans) Piped Processing Language (PPL) commands.”
* “Use the *Neural Search* plugin to integrate ML language models into your search workloads.”
### Voice and tone
Voice is the point of view or style of a writer. Voice can refer to active or passive but may also refer to verb tense (past, present, future, and so on). Tone is the emotional undercurrent (such as calm or angry) of the voice. We strive to speak to the community with a consistent voice and tone, as if a single writer writes all content. Writing with a common voice also helps to establish the OpenSearch Project identity and brand.
#### Voice
The voice of the OpenSearch Project is people oriented and focused on empowering the user directly. We use language that emphasizes what the user can do with OpenSearch rather than what tasks OpenSearch can perform.
Whenever possible, use the active voice instead of the passive voice. The passive form is typically wordier and can often cause writers to obscure the details of the action. For example, change the agentless passive _it is recommended_ to the more direct _we recommend_.
Refer to the reader as _you_ (second person), and refer to the OpenSearch Project as _we_ (first person). If there are multiple authors for a blog post, you can use _we_ to refer to the authors as individuals. Do not refer to the OpenSearch Project or to the AWS personnel working on the project as a *team*, as this implies differentiation within the community.
In most cases, try to describe the actions that the user takes rather than contextualizing from the feature perspective. For example, use phrases such as “With this feature, you can...” or “Use this feature to...” instead of saying a feature *allows*, *enables*, or *lets* the user do something.
For procedures or instructions, ensure that action is taken by the user (“Then you can stop the container...”) rather than the writer (“We also have to stop the container...”). Reserve the first-person plural for speaking as the OpenSearch Project, with recommendations, warnings, or explanations.
In general, use the present tense. Use the future tense only when an event happens later than, not immediately after, the action under discussion.
#### Tone
The tone of the OpenSearch Project is conversational, welcoming, engaging, and open. The overall tone is knowledgeable but humble, informal but authoritative, informative but not dry, and friendly without being overly familiar.
We talk to readers in their own words, never assuming that they understand how OpenSearch works. We use precise technical terms where appropriate, but we avoid technical jargon and insider lingo. We speak to readers in simple, plain, everyday language.
Avoid excessive words, such as please. Be courteous but not wordy. Extra detail can often be moved elsewhere. Use humor with caution because it is subjective, can be easily misunderstood, and can potentially alienate your audience.
### Brand personality traits
| Personality trait | Description | Guidance |
| :--------- | :------- | :------ |
| **Clear and precise** | The OpenSearch Project understands that our community works, develops, and builds in roles and organizations that require precise thinking and thorough documentation. We strive to use precise language—to clearly say what we mean without leaving ideas open to interpretation, to support our assertions with facts and figures, and to provide credible and current (third-party) references where called for. <br> <br> We communicate in plain, direct language that is easily understood. Complex concepts are introduced in a concise, unambiguous way. High-level content is supported by links to more in-depth or technical content that users can engage with at their convenience. | - Write with clarity and choose words carefully. Think about the audience and how they might interpret your assertions. <br> - Be specific. Avoid estimates or general claims when exact data can be provided. <br> - Support claims with data. If something is “faster” or “more accurate,” say how much. <br> - When citing third-party references, include direct links. |
| **Transparent and open** | As an open-source project, we exchange information with the community in an accessible and transparent manner. We publish our product plans in the open on GitHub, share relevant and timely information related to the project through our forum and/or our blog, and engage in open dialogues related to product and feature development in the public sphere. Anyone can view our roadmap, raise a question or an issue, or participate in our community meetings. | - Tell a complete story. If you're walking the reader through a solution or sharing news, don't skip important information. <br> - Be forthcoming. Communicate time-sensitive news and information in a thorough and timely manner. <br> - If there's something the reader needs to know, say it up front. Don't "bury the lede." |
| **Collaborative and supportive** | We're part of a community that is here to help. We aim to be resourceful on behalf of the community and encourage others to do the same. To facilitate an open exchange of ideas, we provide forums through which the community can ask and answer one another's questions. | - Use conversational language that welcomes and engages the audience. Have a dialogue. <br> - Invite discussion and feedback. We have several mechanisms for open discussion, including requests for comment (RFCs), a [community forum](https://forum.opensearch.org/), and [community meetings](https://www.meetup.com/OpenSearch/). |
| **Trustworthy and personable** | We stay grounded in the facts and the data. We do not overstate what our products are capable of. We demonstrate our knowledge in a humble but authoritative way and reliably deliver what we promise. We provide mechanisms and support that allow the audience to explore our products for themselves, demonstrating that our actions consistently match our words. <br> <br> We speak to the community in a friendly, welcoming, judgment-free way so that our audience perceives us as being approachable. Our content is people oriented and focused on empowering the user directly. | - Claims and assertions should be grounded in facts and data and supported accordingly. <br> - Do not exaggerate or overstate. Let the facts and results speak for themselves. <br> - Encourage the audience to explore our products for themselves. Offer guidance to help them do so. <br> - Write directly and conversationally. Have a dialogue with your audience. Imagine writing as if you're speaking directly to the person for whom you're creating content. <br> - Write from the community, for the community. Anyone creating or consuming content about OpenSearch is a member of the same group, with shared interest in learning about and building better search and analytics solutions. |
| **Inclusive and accessible** | As an open-source project, the OpenSearch Project is for everyone, and we are inclusive. We value the diversity of backgrounds and perspectives in the OpenSearch community and welcome feedback from any contributor, regardless of their experience level. <br> <br> We design and create content so that people with disabilities can perceive, navigate, and interact with it. This ensures that our documentation is available and useful for everyone and helps improve the general usability of content. <br> <br> We understand our community is international and our writing takes that into account. We use plain language that avoids idioms and metaphors that may not be clear to the broader community. | - Use inclusive language to connect with the diverse and global OpenSearch Project audience. <br> - Be careful with our word choices. <br> - Avoid [sensitive terms](https://github.com/opensearch-project/documentation-website/blob/main/STYLE_GUIDE.md#sensitive-terms). <br> - Don't use [offensive terms](https://github.com/opensearch-project/documentation-website/blob/main/STYLE_GUIDE.md#offensive-terms). <br> - Don't use ableist or sexist language or language that perpetuates racist structures or stereotypes. <br> - Links: Use link text that adequately describes the target page. For example, use the title of the target page instead of “here” or “this link.” In most cases, a formal cross-reference (the title of the page you're linking to) is the preferred style because it provides context and helps readers understand where they're going when they choose the link. <br> - Images: <br> &nbsp;&nbsp;- Add introductory text that provides sufficient context for each image. <br> &nbsp;&nbsp;- Add ALT text that describes the image for screen readers. <br> - Procedures: Not everyone uses a mouse, so use device-independent verbs; for example, use “choose” instead of “click.” <br> - Location: When you're describing the location of something else in your content, such as an image or another section, use words such as “preceding,” “previous,” or “following” instead of “above” and “below.” |
## Style guidelines
The following guidelines should be observed in OpenSearch Project content.
### Acronyms
Spell out acronyms the first time that you use them on a page and follow them with the acronym in parentheses. Use the format `spelled-out term (acronym)`. On subsequent use, use the acronym alone.
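For example, a first use followed by a subsequent use might look like the following (a hypothetical sentence, not taken from the documentation):

```
The pipeline writes failed events to a dead-letter queue (DLQ). You can reprocess documents from the DLQ later.
```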
Do not capitalize the spelled-out form of an acronym unless the spelled-out form is a proper noun or the community generally capitalizes it. In all cases, our usage should reflect the community's usage.
In general, spell out acronyms once on a page. However, you can spell them out more often for clarity.
Make an acronym plural by adding an *s* to the end of it. Do not add an apostrophe.
How an acronym is pronounced determines whether you use the article *an* or *a* before it. If it's pronounced with an initial vowel sound, use *an*. Otherwise, use *a*.
If the first use of an acronym is in a heading, retain the acronym in the heading, and then write out the term in the following body text, followed by the acronym in parentheses. Don't spell out the term in the heading with the acronym included in parentheses. If the first use of the service name is in a title or heading, use the short form of the name in the heading, and then use the long form followed by the short form in parentheses in the following body text.
In general, spell out abbreviations that end with *-bit* or *-byte*. Use abbreviations only with numbers in specific measurements. Always include a space between the number and unit. Abbreviations that are well known and don't need to be spelled out are *KB*, *MB*, *GB*, and *TB*.
Some acronyms are better known than their spelled-out counterparts or might be used almost exclusively. These include industry-standard protocols, markdown and programming languages, and common file formats. You don't need to spell out these acronyms.
The following table lists acronyms that you don't need to spell out.
| Acronym | Spelled-out term |
| :--------- | :------- |
| 3D | three-dimensional |
| AI | artificial intelligence |
| API | application programming interface |
| ASCII | American Standard Code for Information Interchange |
| BASIC | Beginner's All-Purpose Symbolic Instruction Code |
| BM25 | Best Match 25 |
| CLI | command-line interface |
| CPU | central processing unit |
| CRUD | create, read, update, and delete |
| CSV | comma-separated values |
| DNS | Domain Name System |
| DOS | disk operating system |
| FAQ | frequently asked questions |
| FTP | File Transfer Protocol |
| GIF | Graphics Interchange Format |
| HTML | hypertext markup language |
| HTTP | hypertext transfer protocol |
| HTTPS | hypertext transfer protocol secure |
| HTTP(s) | Use to refer to both protocols, HTTP and HTTPS. |
| I/O | input/output |
| ID | identifier |
| IP | Internet protocol |
| JPEG | Joint Photographic Experts Group |
| JSON | JavaScript Object Notation |
| k-NN | k-nearest neighbors |
| NAT | network address translation |
| NGINX | engine x |
| PDF | Portable Document Format |
| RAM | random access memory |
| REST | Representational State Transfer |
| RGB | red-green-blue |
| ROM | read-only memory |
| SAML | Security Assertion Markup Language |
| SDK | software development kit |
| SSL | Secure Sockets Layer |
| TCP | Transmission Control Protocol |
| TIFF | Tagged Image File Format |
| TLS | Transport Layer Security |
| UI | user interface |
| URI | uniform resource identifier |
| URL | uniform resource locator |
| UTC | Coordinated Universal Time |
| UTF | Unicode Transformation Format |
| XML | Extensible Markup Language |
| YAML | YAML Ain't Markup Language |
### Code examples
Calling out code within a sentence or code block makes it clear to readers which items are code specific. The following is general guidance about using code examples and when to use `code font`:
* In Markdown, use single backticks (`` ` ``) for inline code formatting and triple backticks (```` ``` ````) for code blocks. For example, writing `` `discovery.type` `` in Markdown will render as `discovery.type`. A line containing three backticks should be included both before and after an example code block.
* In sentences, use code font for things relating to code, for example, “The `from` and `size` parameters are stateless, so the results are based on the latest available data.”
* Use lead-in sentences to clarify the example. Exception: API examples, for which a caption-style lead-in (heading 4) is sufficient.
* Use the phrase *such as* for brief examples within a sentence.
* Use language-specific indentation in code examples.
* Make code blocks as copy-and-paste friendly as possible. Use either the [`copy` or `copy-curl` buttons](https://github.com/opensearch-project/documentation-website/blob/main/FORMATTING_GUIDE.md#buttons).
#### Code formatting checklist
The following items should be in `code font`:
* Field names, variables (including environment variables), and settings (`discovery.type`, `@timestamp`, `PATH`). Use code font for variable and setting values if it improves readability (`false`, `1h`, `5`, or 5).
* Placeholder variables. Use angle brackets for placeholder variables (`docker exec -it <container-id> /bin/bash`).
* Commands, command-line utilities, and options (`docker container ls -a`, `curl`, `-v`).
* File names, file paths, and directory names (`docker-compose.yml`, `/var/www/simplesamlphp/config/`).
* URLs and URL components (`localhost`, `http://localhost:5601`).
* Index names (`logs-000001`, `.opendistro-ism-config`), endpoints (`_cluster/settings`), and query parameters (`timeout`).
* Language keywords (`if`, `for`, `SELECT`, `AND`, `FROM`).
* Operators and symbols (`/`, `<`, `*`).
* Regular expression, date, or other patterns (`^.*-\d+$`, `yyyy-MM-dd`).
* Class names (`SettingsModule`) and interface names (*`RestHandler`*). Use italics for interface names.
* Text field inputs (Enter the password `admin`).
* Email addresses (`example@example.org`).
#### Caption-style examples
If you use a caption-style example, use the heading **Example**, with a colon, as appropriate. The following are caption-style examples:
**Example: Retrieve a specified document from an index**
The following example shows a request that retrieves a specific document and its information from an index:
`GET sample-index1/_doc/1`
**Example request**
`GET sample-index1/_doc/1`
Sometimes, you might not want to break up the flow of the text with a new heading. In these cases, you can use an example with no heading.
The following command maps ports 9200 and 9600, sets the discovery type to single-node, and requests the newest image of OpenSearch:
`docker run -d -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" opensearchproject/opensearch:latest`
#### Lead-in sentences
When using lead-in sentences, summarize, clarify, or refer to the example that follows. A lead-in sentence is a complete sentence that ends in a colon.
For example, the following query requests statistics for `docs` and `search`:
`GET _nodes/stats/indices/docs,search`
#### Referring to a variable or placeholder
When introducing a code or command line example that refers to a variable or placeholder in the example, be direct by including the variable or placeholder name in the text. Surround the variable or placeholder name with angle brackets (`<` and `>`), for example, `<port>`. Don't refer to the variable or placeholder by its color or format because these can change. If variable or placeholder texts have a lot in common and there are several for the user to complete, be direct by including a “template” for the input in the replaceable text.
In the following example, replace `<component-x>` with your own information:
`~/workspace/project-name$ eb init --modules <component-a> <component-b>`
### Formatting and organization
- Use a colon to introduce example blocks (for example, code and scripts) and most lists. Do not use a colon to introduce tables or images.
- Use bold text for all UI elements, including pages, panes, and dialog boxes. In all cases, emphasize what the user must do as opposed to talking about the UI element itself.
- Stacked headings should never appear in our content. Stacked headings are any two consecutive headings without intervening text. Even if it is just an introductory sentence, there should always be text under any heading.
- Use italics for the titles of books, periodicals, and reference guides. However, do not use italics when the title of a work is also a hyperlink.
- You can refer to APIs in three ways:
1. When referring to API names, capitalize all words in the name (example: "Field Capabilities API").
2. When referring to API operations by the exact name of the endpoint, use lowercase with code format (example: "`_field_caps` API").
3. When describing API operations but not using the exact name of the endpoint, use lowercase (example: "field capabilities API operations" or "field capabilities operations").
### Images
- Add introductory text that provides sufficient context for each image.
- Add ALT text that describes the image for screen readers.
- When you're describing the location of an image, use words such as *preceding*, *previous*, or *following* instead of *above* and *below*.
- Text that introduces an image should be a complete sentence and end with a period, not a colon.
### Links
- **Formal cross-references**: In most cases, a formal cross-reference (the title of the page you're linking to) is the preferred style because it provides context and helps readers understand where they're going when they choose the link. Follow these guidelines for formal cross-references:
- Introduce links with formal introductory text:
- Use "For information *about*" or "For more information *about*." Don't use "For information *on*."
- If you are linking to procedures, you can use either "For instructions *on*" or "instructions *for*." Don't use "instructions *about*."
- Where space is limited (for example, in a table), you can use "*See* [link text]."
- Ensure that the link text matches the section title text. <br> <br> Example: "To get involved, see [Contributing](https://opensearch.org/source.html) on the OpenSearch website." <br>
- **Embedded links**: Embedded links are woven into a sentence without formal introductory text. They're especially useful in tables or other elements where space is tight. The text around the embedded link must relate to the information in the link so that the reader understands the context. Do not use *here* or *click here* for link text because it creates accessibility problems. <br> <br> Example: "Finally, [delete the index](https://opensearch.org/docs/latest/api-reference/index-apis/delete-index)."
### Lists
The following guidelines apply to all list types:
- Make lists parallel in content and structure. Don't mix single words with phrases, don't start some phrases with a noun and others with a verb, and don't mix verb forms.
- Present the items in alphabetical order if the order of items is arbitrary.
- Capitalize the first letter of the first word of each list item.
- If the list is simple, you don't need end punctuation for the list items.
- If the list has a mixture of phrases and sentences, punctuate each list item.
- Punctuate each list item with a period if a list item has more than one sentence.
- Punctuate list items consistently. If at least one item in a list requires a period, use a period for all items in that list.
- Introductory sentences are required for lists.
- Introductory sentences should be complete sentences.
- Introductory sentences should end with a colon.
- Don't use semicolons, commas, or conjunctions (like *and* or *or*) at the end of list items.
### Numbers and measurement
- Spell out cardinal numbers from 1 to 9. For example, one NAT instance. Use numerals for cardinal numbers 10 and higher. Spell out ordinal numbers: first, second, and so on. In a series that includes numbers 10 or higher, use numerals for all. Use a comma separator for numbers of four digits or more—for example, 1,000.
- For descriptions that include time ranges, separate the numbers with an en dash. Avoid extra words such as *between* or *from n to n*.
    - Correct: It can take 5–10 minutes before logs are available.
    - Incorrect: It can take between 5 and 10 minutes before logs are available.
- Use numerals for all measurement-based references, including time. Include a space between the number and the abbreviation for the unit of measure.
    - Correct:
        - 100 GB
        - 1 TB
        - 3 minutes
        - 12 subnets (8 public and 4 private)
    - Incorrect:
        - One hundred GB
        - 1TB
### Procedures
A procedure is a series of numbered steps that a user follows to complete a specific task. Users should be able to scan for and recognize procedures easily. Make procedures recognizable by using the following:
- Predictable content parts
- Parallel language constructions
- Consistent formatting
Use *example*, not *sample*, to introduce example blocks (for example, code, scripts, and API requests and responses).
#### Describing interactions with the UI
Replace pointer-specific verbs with device-agnostic/generic verbs to accommodate readers with disabilities and users of various input methods and devices, including the pointer, keyboard, and touch screens. Don't use device-specific verbs such as _click_ or _swipe_. However, when the generic language makes it difficult to understand the instructions, you can include pointer-specific hints in parentheses. Use your judgment. If you have a question, ask your editor.
We follow a slightly modified version of the _Microsoft Writing Style Guide_ guidance on describing interactions with a UI, provided here.
| Verb | Use for | Examples |
| :--------- | :------- | :------- |
| **Open** | - Apps and programs <br> - Files and folders <br> - Shortcut menus <br> Use for websites and webpages only when necessary to match the UI. Otherwise, use _go to_. <br> - Don't use for commands and menus. | - Open Photos. <br> - Open the Reader app. <br> - Open the Filename file. <br> - To open the document in Outline view, select **View** > **Outline**. <br> - In WindowName, open the shortcut menu for ItemName. |
| **Close** | - Apps and programs <br> - Dialog boxes <br> - Files and folders <br> - Notifications and alerts <br> - Tabs <br> - The action a program or app takes when it encounters a problem and can't continue. (Don't confuse with _stop responding_). | - Close the Alarms app. <br> - Close Excel. <br> - Save and close the document. <br> - Closing Excel also closes all open worksheets. |
| **Leave** | Websites and webpages | Select **Submit** to complete the survey and leave this page. |
| **Go to** | - Opening a menu. <br> - Going to a tab or another particular place in the UI. <br> - Going to a website or webpage. <br> - It's OK to use _On the **XXX** tab_ if the instruction is brief and continues immediately. | - Go to Search, enter the word **settings**, and then select **Settings**. <br> - Go to **File**, and then select **Close**. <br> - On the ribbon, go to the **Design** tab. <br> - Go to the **Deploy** tab. <br> - On the **Deploy** tab, in the **Configuration** list ... <br> - Go to Example.com to register. |
| **Select** | Instructing the user to select a specific item, including: <br> - Selecting an option, such as a button. <br> - Selecting a checkbox. <br> - Selecting a value from a list box. <br> - Selecting link text to go to a link. <br> - Selecting an item on a menu or shortcut menu. <br> - Selecting an item from a gallery. | - Select the **Modify** button. <br> - For **Alignment**, select **Left**. <br> - Select the text, open the shortcut menu, and then select **Font**. <br> - Select **Open in new tab**. <br> - Select the **LinkName** link. |
| **Select and hold, select and hold (or right-click)** | Use to describe pressing and holding an element in the UI. It's OK to use _right-click_ with _select and hold_ when the instruction isn't specific to touch devices. | - To flag a message that you want to deal with later, select and hold it, and then select **Set flag**. <br> - Select and hold (or right-click) the Windows taskbar, and then select **Cascade windows**. <br> - Select and hold (or right-click) the **Start** button, and then select **Device Manager**. |
| **>** | Use a greater-than symbol (>) to separate sequential steps. <br> Only use this approach when there's a clear and obvious path through the UI and the selection method is the same for each step. For example, don't mix things that require opening, selecting, and choosing. <br> Don't bold the greater-than symbol. Include a space before and after the symbol. | Select **Accounts** > **Other accounts** > **Add an account**. |
| **Clear** | Clearing the selection from a checkbox. | Clear the **Header row** checkbox. |
| **Choose** | Choosing an option, based on the customer's preference or desired outcome. | On the **Font** tab, choose the effects you want. |
| **Switch, turn on, turn off** | Turning a toggle key or toggle switch on or off. | - Use the **Caps lock** key to switch from typing capital letters to typing lowercase letters. <br> - To keep all applied filters, turn on the **Pass all filters** toggle. |
| **Enter** | Instructing the customer to type or otherwise insert a value, or to type or select a value in a combo box. | - In the search box, enter... <br> - In the **Tab stop position** box, enter the location where you want to set the new tab. <br> - In the **Deployment script name** box, enter a name for this script. |
| **Move, drag** | Moving anything from one place to another by dragging, cutting and pasting, or another method. Use for tiles and any open window (including apps, dialog boxes, and files). <br> Use _move through_ to describe moving around on a page, moving through screens or pages in an app, or moving up, down, right, and left in a UI. | - Drag the Filename file to the Foldername folder. <br> - Move the tile to the new section. <br> - Drag the Snipping Tool out of the way, if necessary, and then select the area you want to capture. <br> - If the **Apply Styles** task pane is in your way, just move it. |
| **Press** | Use _press_ to describe single key or key combination entries that users would perform on a keyboard, such as keyboard shortcuts. | - Press **F5**. <br> - Press **Shift+Enter**. <br> - Press **Ctrl+Alt+Delete**. |
| **Zoom, zoom in, zoom out** | Use _zoom_, _zoom in_, and _zoom out_ to refer to changing the magnification of the screen or window. | - Zoom in to see more details on the map. <br> - Zoom out to see a larger geographic area on the map. <br> - Zoom in or out to see more or less detail. |
### Punctuation and capitalization
- Use only one space after a period.
- Use contractions carefully for a more casual tone. Use common contractions. Avoid future-tense (_I'll_), archaic (_'twas_), colloquial (_ain't_), or compound (_couldn't've_) contractions.
- Use sentence case for titles, headings, and table headers. Titles of standalone documents may use title case.
- Use lowercase for nouns and noun phrases that are not proper nouns; for example, *big data*. This style follows the standard rules of American English grammar.
- For plural forms of nouns that end in “s”, form the possessive case by adding only an apostrophe.
- When a colon introduces a list of words, a phrase, or other sentence fragment, the first word following the colon is lowercased unless it is a proper name. When a colon introduces one or more complete sentences, the first word following it is capitalized. When text introduces a table or image, it should be a complete sentence and end with a period, not a colon.
- Use commas to separate the following:
  - Independent clauses separated by coordinating conjunctions (but, or, yet, for, and, nor, so).
  - Introductory clauses, phrases, or words that precede the main clause.
  - Words, clauses, and phrases listed in a series (the comma before the final item is known as the serial, or Oxford, comma).
- An em dash (—) is the width of an uppercase M. Do not include spacing on either side. Use an em dash to set off parenthetical phrases within a sentence or set off phrases or clauses at the end of a sentence for restatement or emphasis.
- An en dash (–) is the width of an uppercase N. In ranges, do not include spacing on either side. Use an en dash to indicate ranges in values and dates, separate a bullet heading from the following text in a list, or separate an open compound adjective (two compounds, only one of which is hyphenated) from the word that it modifies.
- Words with prefixes are normally closed (no hyphen), whether they are nouns, verbs, adjectives, or adverbs. Note that some industry terms don't follow this hyphenation guidance. For example, *Command Line Interface* and *high performance computing* aren't hyphenated, and *machine learning* isn't hyphenated when used as an adjective. Other terms are hyphenated to improve readability. Examples include *non-production*, *post-migration*, and *pre-migration*.
- In general, comparative or superlative modifiers with “more,” “most,” “less,” or “least” don't require hyphens. Use one only if it's needed to avoid ambiguity.
- The ampersand (&) should never be used in a sentence as a replacement for the word *and*. An exception to this is in acronyms where the ampersand is commonly used, such as in Operations & Maintenance (O&M).
- When using a forward slash between words, do not insert space on either side of the slash. For example, *AI/ML* is correct whereas *AI / ML* is incorrect.
- When referring to API parameters, capitalize *Boolean*. Otherwise, primitive Java data types (*byte*, *short*, *int*, *long*, *float*, *double*, and *char*) start with a lowercase letter, while non-primitive types start with an uppercase letter.
### Topic titles
Here are two styles you can use for topic titles:
* *Present participle phrase* + *noun-based phrase* or *present participle phrase* + *preposition* + *noun-based phrase*, used most often for concept or task topics. For example:
* Configuring security
* Visualizing your data
* Running queries in the console
* *Noun-based phrase*, used most often for reference topics. For example:
* REST API reference
* OpenSearch CLI
* Field types
* Security analytics
Use *example*, not *sample*, in headings that introduce example blocks (for example, code, scripts, and API requests and responses).
## UI text
Consistent, succinct, and clear text is a critical component of a good UI. We help our users complete their tasks by providing simple instructions that follow a logical flow.
### UI best practices
* Follow the OpenSearch Project [naming conventions, voice, tone, and brand personality traits](#naming-conventions-voice-tone-and-brand-personality-traits) guidelines.
* Be consistent with other elements on the page and on the rest of the site.
* Use sentence case in the UI, except for product names and other proper nouns.
### UI voice and tone
Our UI text is people oriented and focused on empowering the user directly. We use language that is conversational, welcoming, engaging, and open and that emphasizes what the user can do with OpenSearch rather than what tasks OpenSearch can perform. The overall tone is knowledgeable but humble, informal but authoritative, informative but not dry, and friendly without being overly familiar.
We talk to readers in their own words, never assuming that they understand how OpenSearch works. We use precise technical terms where appropriate, but we avoid technical jargon and insider lingo. We speak to readers in simple, plain, everyday language.
For more information, see [Voice and tone](#voice-and-tone) and [Brand personality traits](#brand-personality-traits).
### Writing guidelines
UI text is a critical component of a user interface. We help users complete tasks by explaining concepts and providing simple instructions that follow a logical flow. We strive to use language that is consistent, succinct, and clear.
#### What's the purpose of UI text?
UI text includes all words, phrases, and sentences on a screen, and it has the following purposes:
* Describes a concept or defines a term
* Explains how to complete a task
* Describes the purpose of a page, section, table, graph, or dialog box
* Walks users through tutorials and first-run experiences
* Provides context and explanation for individual UI elements that might be unfamiliar to users
* Helps users make a choice or decide if settings are relevant or required for their particular deployment scenario or environment
* Explains an alert or error
#### Basic guidelines
Follow these basic guidelines when writing UI text.
##### Style
* Keep it short. Users don't want to read dense text. Remember that UI text can expand by 30% when it's translated into other languages.
* Keep it simple. Try to use simple sentences (one subject, one verb, one main clause and idea) rather than compound or complex sentences.
* Prefer active voice over passive voice. For example, "You can attach up to 10 policies" is active voice, and "Up to 10 policies can be attached" is passive voice.
* Use device-agnostic language rather than mouse-specific language. For example, use _choose_ instead of _click_ (exception: use _select_ for checkboxes).
##### Tone
* Use a tone that is knowledgeable but humble, informal but authoritative, informative but not dry, and friendly without being overly familiar.
* Use everyday language that most users will understand.
* Use second person (you, your) when you address the user.
* Use _we_ if you need to refer to the OpenSearch Project as an organization; for example, "We recommend…."
##### Mechanics
* Use sentence case for all UI text. (Capitalize only the first word in a sentence or phrase as well as any proper nouns, such as service names. All other words are lowercase.)
* Use parallel construction (use phrases and sentences that are grammatically similar). For example, items in a list should start with either all verbs or all nouns.
**Correct**
Snapshots have two main uses:
* Recovering from failure
* Migrating from one cluster to another
**Incorrect**
Snapshots have two main uses:
* Failure recovery
* Migrating from one cluster to another
* Use the serial (Oxford) comma. For example, “issues, bug fixes, and features”, not “issues, bug fixes and features”.
* Don't use the ampersand (&).
* Avoid Latinisms, such as _e.g._, _i.e._, or _etc._ Instead of _e.g._, use _for example_ or _such as_. Instead of _i.e._, use _that is_ or _specifically_. Generally speaking, _etc._ and its equivalents (such as _and more_ or _and so on_) aren't necessary.
## Special considerations for blog posts
Blog posts provide an informal approach to educating or inspiring readers through the personal perspective of the authors. Brief posts generally accompany service or feature releases, and longer posts may note best practices or provide creative solutions. Each post must provide a clear community benefit.
To enhance the strengths of the blogging platform, follow these post guidelines:
**Be conversational and informal.**
Posts tend to be more personable, unlike technical documentation. Ask questions, include relevant anecdotes, add recommendations, and generally try to make the post as approachable as possible. However, be careful of slang, jargon, and phrases that a global audience might not understand.
**Keep it short.**
Deep topics don't necessarily require long posts. Shorter, more focused posts are easier for readers to digest. Consider breaking a long post into a series, which can also encourage repeat visitors to the blog channel.
**Avoid redundancy.**
Posts should add to the conversation. Instead of repeating content that is already available elsewhere, link to detail pages and technical documentation. Keep only the information that is specific to the post solution or recommendations.
**Connect with other content.**
All posts should contain one or more calls to action that give readers the opportunity to create resources, learn more about services or features, or connect with other community members. Posts should also include metadata tags such as services, solutions, or learning levels to help readers navigate to related content.
## Inclusive content
When developing OpenSearch Project documentation, we strive to create content that is inclusive and free of bias. We use inclusive language to connect with the diverse and global OpenSearch Project audience, and we are careful in our word choices. Inclusive and bias-free content improves clarity and accessibility of our content for all audiences, so we avoid ableist and sexist language and language that perpetuates racist structures or stereotypes. In practical terms, this means that we do not allow certain terms to appear in our content, and we avoid using others, *depending on the context*.
Our philosophy is that we positively impact users and our industry as we proactively reduce our use of terms that are problematic in some contexts. Instead, we use more technically precise language and terms that are inclusive of all audiences.
### Offensive terms
The following terms may be associated with unconscious racial bias, violence, or politically sensitive topics and should not appear in OpenSearch Project content, if possible. Note that many of these terms still appear in software but are being phased out. For example, `slave` was removed from the Python programming language in 2018, and the open-source community continues to work toward replacing such terms.
| Don't use | Guidance/Use instead |
|----------------|-----------------------------|
| abort | Don't use because it has unpleasant associations and is unnecessarily harsh sounding. Use *stop*, *end*, or *cancel* instead. |
| black day | blocked day |
| blacklist | deny list |
| kill | Don't use. Replace with *stop*, *end*, *clear*, *remove*, or *cancel*. <br><br> Exception: *Kill* is unavoidable when referring to Linux kill commands. |
| master | primary, main, leader |
| master account | management account |
| slave | replica, secondary, standby |
| white day | open day |
| whitelist | allow list |
### Sensitive terms
The following terms may be problematic *in some contexts*. This doesn't mean that you can't use these terms—just be mindful of their potential associations when using them, and avoid using them to refer to people.
| Avoid using | Guidance/Use instead |
|--------------------------|-------------------------------------|
| blackout | service outage, blocked |
| demilitarized zone (DMZ) | perimeter network, perimeter zone |
## Trademark policy
The “OpenSearch” word mark should be used in its exact form and not abbreviated or combined with any other word or words (e.g., “OpenSearch” software rather than “OPNSRCH” or “OpenSearch-ified”). See the [OpenSearch Trademark Policy](https://opensearch.org/trademark-usage.html) for more information. Also refer to the policy and to the [OpenSearch Brand Guidelines](https://opensearch.org/brand.html) for guidance regarding the use of the OpenSearch logo. When using another party's logo, refer to that party's trademark guidelines.
807
TERMS.md Normal file
@@ -0,0 +1,807 @@
# OpenSearch terms
This is how we use our terms, but we're always open to hearing your suggestions.
## A
**abort**
Do not use because it has unpleasant associations and is unnecessarily harsh sounding. Use *stop*, *end*, or *cancel* instead.
**above**
Use only for physical space or screen descriptions, for example, "the outlet above the floor" or "the button above the bar pane."
For orientation within a document, use *previous*, *preceding*, or *earlier*.
**ad hoc**
Avoid. Use *one-time* instead.
**affect**
_Affect_ as a noun refers to emotion as expressed in face or body language. _Affect_ as a verb means to influence. Do not confuse with _effect_.
**AI**
No need to define as _artificial intelligence (AI)_.
**AI/ML**
On first mention, use artificial intelligence and machine learning (AI/ML).
**Alerting**
A plugin that notifies you when data from one or more OpenSearch indexes meets certain conditions.
**allow**
Use _allow_ when the user must have security permissions in order to complete the task.
Avoid using _allow_ to refer to making something possible for the user. Instead, rewrite to focus on what's important from the user's point of view.
**allow list**
Use to describe a list of items that are allowed (not blocked). Do not use as a verb. Do not use whitelist.
**Amazon OpenSearch Service**
Amazon OpenSearch Service is a managed service that makes it easy to deploy, operate, and scale OpenSearch clusters in the AWS Cloud. Amazon OpenSearch Service is the successor to Amazon Elasticsearch Service (Amazon ES) and supports OpenSearch and legacy Elasticsearch OSS (up to 7.10, the final open-source version of the software).
**Anomaly Detection**
A plugin that automatically detects anomalies in your OpenSearch data in near real time.
**API operation**
Use instead of _action_, _method_, or _function_.
OpenSearch style:
- Use the CopySnapshot operation to...
- The following API operations…
Not OpenSearch style:
- Use the CopySnapshot action to...
- Use the CopySnapshot method to...
- Use the CopySnapshot function to...
**app or application**
Use _app_ for mobile software and _application_ for all other uses.
**appear, display, and open**
Messages and pop-up boxes _appear_. Windows, pages, and applications _open_. The verb _display_ requires a direct object. For example: The system displays the error message.
**application server**
Do not abbreviate as app server.
**as well as**
Avoid. Replace with _in addition to_ or _and_, as appropriate.
**Asynchronous Search**
A plugin that lets the user send search requests in the background so that the results can be used later.
**auto scaling**
Lowercase _scaling_, _auto scaling_, and _automatic scaling_ (but not _autoscaling_) are the preferred descriptive terms when generically describing auto scaling functionality.
Do not use the hyphenated _auto-scaling_ as a compound modifier. Instead, use _scaling_ (for example, scaling policy) or _scalable_ (for example, scalable target or scalable, load-balanced environment).
**AWS Signature Version 4**
Use on first appearance. On subsequent appearances, *Signature Version 4* may be used. Only use *SigV4* when space is limited.
## B
**below**
Use only for physical space or screen descriptions, such as “the outlet below the vent,” or “the button below the bar pane.”
For orientation within a document, use *following* or *later*.
**big data**
**black day**
Do not use. Use *blocked day* instead.
**blacklist**
Do not use. Use *deny list* instead.
**blackout**
Avoid using. Use *service outage* or *blocked* instead.
**BM25**
A ranking function used to estimate the relevance of documents to a given search query. BM25 extends [TF–IDF](#t) by normalizing document length.
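As an illustrative sketch (the standard Okapi BM25 form, not a statement of OpenSearch's exact implementation), the score of a document $D$ for a query $Q$ with terms $q_1, \dots, q_n$ is:

```latex
\mathrm{score}(D, Q) = \sum_{i=1}^{n} \mathrm{IDF}(q_i) \cdot
  \frac{f(q_i, D)\,(k_1 + 1)}
       {f(q_i, D) + k_1 \left(1 - b + b \cdot \frac{|D|}{\mathrm{avgdl}}\right)}
```

where $f(q_i, D)$ is the frequency of $q_i$ in $D$, $|D|$ is the document length, $\mathrm{avgdl}$ is the average document length in the index, and $k_1$ and $b$ are free parameters (commonly $k_1 \approx 1.2$ and $b = 0.75$). The $b \cdot |D|/\mathrm{avgdl}$ factor is the length normalization that distinguishes BM25 from plain TF–IDF.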
**Boolean**
Avoid using the name of a Boolean value at the beginning of a sentence or sentence fragment. In general, capitalize the word Boolean. For specific programming languages, follow the usage in that language.
OpenSearch style:
- You can use the Boolean functions with Boolean expressions or integer expressions.
- IsTruncated(): A Boolean value that specifies whether the resolved target list is truncated.
**bottom**
Use only as a general screen reference, such as “scroll to the bottom of the page.” Don't use for window, page, or pane references to features or controls; use *lower* instead. For example, you can use the following wording: “Choose the button on the lower left.”
**browse**
Use when referring to scanning information or browsing the web. Don't use when describing how to navigate to a particular item on our site or a computer. Instead, use *see* or *navigate to*.
**build (n., v.)**
Use as a verb to refer to compiling and linking code. Use as a noun only to refer to a compiled version of a program (for example, *Use the current build of Amazon Linux 2*...) in a programming reference.
## C
**CA**
certificate authority
**certs, certificates**
Use _certificates_ on first mention. It's OK to use _certs_ thereafter.
**checkbox, checkboxes**
**CI/CD**
Use _continuous integration_ and _continuous delivery (CI/CD)_ or _continuous integration and delivery (CI/CD)_ on first mention.
**CLI**
No need to define as _command-line interface (CLI)_.
**cluster**
A collection of one or more nodes.
**cluster manager**
A single node that routes requests for the cluster and makes changes to other nodes. Each cluster contains a single cluster manager.
**command line, command-line**
Two words as a noun. Hyphenate as an adjective.
**console**
A tool inside OpenSearch Dashboards used to interact with the OpenSearch REST API.
**Cross-Cluster Replication**
A plugin that replicates indexes, mappings, and metadata from one OpenSearch cluster to another. Follows an active-passive model where the follower index pulls data from a leader index.
**cyber**
Except when dictated by open standards, use as a prefix in a closed compound: dont use spaces or hyphens between _cyber_ and the rest of the word.
## D
**data**
Use _data is_, not _data are_. Don't use _datas_. Use _pieces of data_ or equivalent to describe individual items within a set of data.
**data center**
**dataset**
**data source**
**data store, datastore**
Two words when used generically, but one word when referring to the VMware product.
**data type**
**dates**
Use one of the following date formats:
- When a human-readable date format is preferred, spell out the date using the Month D, YYYY format (for example, _October 1, 2022_). Do not use an ordinal number for the day (use _1_, not _1st_). If the context is clear, you can omit the year on subsequent mention. If the specific day isn't known, use the Month YYYY format (for example, _October 2022_).
- When a numeric, lexicographically sortable date is required, use the YYYY-MM-DD format (for example, _2022-10-01_). Make sure to add a zero (0) in front of a single-digit month and day. This is the ISO 8601 standard date format. Make sure also that you use a hyphen (-) and avoid omitting the year. Doing so avoids the ambiguity that's caused by the common, locally used formats of MM/DD and DD/MM.
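The sortability of the YYYY-MM-DD format can be demonstrated with a short sketch (the sample dates are arbitrary):

```python
from datetime import date

# Zero-padded YYYY-MM-DD strings sort lexicographically into chronological
# order because the fields run from most to least significant.
dates = ["2022-10-01", "2021-12-31", "2022-02-09"]
assert sorted(dates) == ["2021-12-31", "2022-02-09", "2022-10-01"]

# The same strings parse directly with the Python standard library.
parsed = [date.fromisoformat(d) for d in dates]
assert min(parsed) == date(2021, 12, 31)

# Without zero padding, string order and date order diverge.
assert "9/5/2022" > "10/1/2022"  # lexicographic, not chronological
```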
**demilitarized zone (DMZ)**
Avoid using. Use *perimeter network* or *perimeter zone* instead.
**deny list**
Use to describe a list of items that arent allowed (blocked). Do not use _blacklist_.
**disable**
Use *disable* to describe making a feature or command unavailable. For example:
- Clear the checkbox to disable automatic monitoring.
- The feature is disabled by default.
Note that alternatives to *disable*—such as *deactivate*, *turn off*, or *stop*—are acceptable usage where appropriate and may be found in existing documentation. In all cases, use language that corresponds to the language used in the UI, if applicable.
Do not use *disable* to refer to users.
**double-click**
Always hyphenated. Don't use _double click_.
**dropdown list**
**due to**
Don't use. Use _because of_ instead.
## E
**easy, easier, easily**
Avoid the use of *easy*, *easier*, or *easily* if possible when describing or comparing an OpenSearch Project product, feature, or procedure in technical content. Use of these terms is audience dependent. These terms are potentially misleading or inaccurate and might be perceived as condescending by some technical users. Instead, describe what the user can do.
On documentation landing pages, it's acceptable to use *easy*, *easier*, or *easily* within the service description only.
**effect**
_Effect_ as a noun refers to something that's caused by something else. _Effect_ as a verb means to bring about. Do not confuse with _affect_.
**e.g.**
Avoid. Use _for example_ or _such as_ instead.
**Elastic IP address**
**email**
Use as a singular noun or adjective to refer to the collective concept, and use _message_ or _mail_ for individual items. Use _send email_ as the verb form. Don't use the plural form because it's a collective noun.
**enable**
Use *enable* to describe making a feature or command available. For example:
- Select the checkbox to enable automatic monitoring.
- The feature is enabled by default.
Note that alternatives to *enable*—such as *activate*, *turn on*, or *start*—are acceptable usage where appropriate and may be found in existing documentation. In all cases, use language that corresponds to the language used in the UI, if applicable.
Avoid using *enable* to refer to making something possible for the user. Instead, rewrite to focus on what's important from the user's point of view. For example, “With ABC, you can do XYZ” is a stronger statement than “ABC enables you to XYZ.” Additionally, using a task-based statement is usually more clear than the vague “…enables you to….”
**enter**
In general, use in preference to _type_ when a user adds text or other input (such as numbers or symbols).
**etc., et cetera**
Do not use.
Generally speaking, _etc._ and its equivalents (such as _and more_ or _and so on_) aren't necessary.
**execute**
Replace with a more specific verb. In the sense of carrying out an action, use *run*, *process*, or *apply*. In the sense of initiating an operation, use *start*, *launch*, or *initiate*.
Exception: *Execution* is unavoidable for third-party terms for which no alternative was determined, such as SQL execution plans. *Executable* is also unavoidable.
## F
**fail over (v.), failover (n.)**
**Faiss**
**file name**
**frontend (n., adj.)**
Use _frontend_ as an adjective and a noun. Do not use _front end_ or _front-end_. Do not make _frontend_ possessive except as part of a compound noun, such as _frontend system_.
## G
**generative AI**
Do not use _GenAI_, _Gen AI_, _gen AI_, or _genAI_. To avoid the overuse of *generative AI*, *AI/ML-powered applications* may also be used.
**geodistance**
**geohash**
**geohex**
**geopoint**
**geopolygon**
**geoshape**
**geospatial**
**geotile**
## H
**hang**
Do not use. This term is unnecessarily violent for technical documentation. Use *stop responding* instead.
**hardcode**
**hard disk drive (HDD)**
**high availability (HA)**
**high performance computing (HPC)**
**hostname**
**Hugging Face**
## I
**i.e.**
Do not use. Use _that is_ or _specifically_ instead.
**if, whether**
Do not use *if* to mean *whether*. It is best to use *whether* in reference to a choice or alternatives ("we're going whether it rains or not") and *if* when establishing a condition ("we will go if it doesn't rain").
**in, on**
Use _in Windows_ or _in Linux_ in reference to components of the OS or work in the OS. Use _on Windows_ in reference to Windows applications. Examples:
- Use the Devices and Printers Control Panel in Windows to install a new printer.
- In Windows, run the setup command.
- Select an application that runs on Windows.
Run applications and instances _in the cloud_, but extend services to the cloud.
Use *on the forum*. Whatever is on the internet (the various websites and so on), you are *on* because you cannot be *in* it.
**index, indexes**
In technical documentation and the UI, use *indexes* as the plural form of *index*. Use *indices* only in the context of mathematical expressions. Variable and setting names should not be changed.
In blog posts, use the plural *indexes* unless there is a domain-specific reason (for example, a mathematical or financial context) to use *indices*.
**Index Management (IM)**
**Index State Management (ISM)**
**ingest pipeline**
Not _ingestion pipeline_.
**inline**
**install in, on**
Install in a folder, directory, or path; install on a disk, drive, or instance.
**internet**
Do not capitalize.
**invalid**
Avoid using. Use *not valid* instead.
**IP address**
Don't abbreviate as just _IP_.
## J
**just**
Use *just* in the sense of *just now* (as in "the resources that you just created"). Otherwise, use *only* in all other contexts (to mean "limited to; nothing more than").
## K
**keystore**
**key-value**
Not _key/value_.
**kill**
Do not use. Replace with *stop*, *end*, *clear*, *remove*, or *cancel*.
Exception: *Kill* is unavoidable when referring to Linux kill commands.
**k-means**
A simple and popular unsupervised clustering ML algorithm, built on top of the Tribuo library, that chooses random centroids and iteratively recalculates them until each observation belongs to the cluster with the nearest mean.
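The iterative procedure can be illustrated with a minimal one-dimensional sketch (this is an illustration of the idea only, not the Tribuo-based ML Commons implementation):

```python
import random

def k_means(points, k, iterations=10, seed=0):
    """Minimal 1-D k-means: pick random initial centroids, then repeatedly
    assign each point to its nearest centroid and move each centroid to the
    mean of its assigned points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

# Two well-separated groups converge to centroids near 1.0 and 10.0.
print(k_means([1.0, 1.1, 0.9, 10.0, 10.2, 9.8], k=2))
```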
**k-NN**
Short for _k-nearest neighbors_, the k-NN plugin enables users to search for the k-nearest neighbors to a query point across an index of vectors. No need to define.
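A brute-force sketch of the underlying idea (the sample vectors are made up, and the plugin itself uses approximate methods for large vector indexes rather than this exhaustive scan):

```python
import heapq

def knn(index, query, k):
    """Brute-force k-NN: rank every vector in the index by Euclidean
    distance to the query and keep the k closest."""
    def dist(v):
        return sum((a - b) ** 2 for a, b in zip(v, query)) ** 0.5
    return heapq.nsmallest(k, index, key=dist)

vectors = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.9, 1.2)]
print(knn(vectors, query=(1.0, 1.0), k=2))  # [(1.0, 1.0), (0.9, 1.2)]
```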
## L
**launch, start**
You _start_ an application but _launch_ an instance, environment, or cluster.
**let**
Avoid using _let_ to refer to making something in a service or feature possible for the user. Instead, rewrite to focus on what's important from the user's point of view.
**leverage**
Replace with _use_.
**lifecycle**
One word in reference to software.
**like (prep.)**
OK to use to call out something for comparison.
As a general rule, if you can replace _like_ with _similar to_, it's OK to use _like_. But if you can replace _like_ with _such as_, use _such as_.
**LLM**
Define on first appearance as _large language model (LLM)_.
**locate in, on**
Located _in_ (a folder, directory, path), located on a disk drive or instance.
**log in (v.), login (adj., n.)**
Use with technologies with interfaces that use this verb. Note that you log in _to_ an instance, not log _into_ an instance. Likewise, use _log out_ and _logout_.
**Logstash**
A lightweight, open-source, server-side data processing pipeline that allows you to collect data from a variety of sources, transform it on the fly, and send it to your desired destination.
**lower left, lower right**
Hyphenate as adjectives. Use instead of *bottom left* and *bottom right*, unless the field name uses *bottom*. For example, "The lower-right corner."
**LTS**
Long-Term Support
**Lucene**
Apache Lucene™ is a high-performance, full-featured search engine library written entirely in Java. OpenSearch uses a modified version of Lucene as the basis for search operations within OpenSearch.
## M
**machine learning**
When *machine learning* is used multiple times in a document, use *machine learning (ML)* on first mention and *ML* thereafter. There is no need to redefine *ML* when *AI/ML* has already been defined. If spelled out, write *machine learning* as two words (no hyphen) in all cases, including when used as an adjective before a noun.
**Machine Learning (ML) Commons**
A plugin that makes it easy to develop new ML features. It allows engineers to use existing open-source ML algorithms and reduces the effort required to build them from scratch.
**master**
Do not use. Use *primary*, *main*, or *leader* instead.
**master account**
Do not use. Use *management account* instead.
**may**
Avoid. Use _can_ or _might_ instead.
**multilayer, multilayered**
**must, shall, should**
_Must_ and _shall_ refer to requirements. If the reader doesn't follow the instruction, something won't work right.
_Should_ is used with recommendations. If the reader doesn't follow the instruction, it might be harder or slower, but it'll work.
## N
**navigate to**
Not navigate _in_.
**near real time (n.), near real-time (adj.) (NRT)**
Use _near real time_ as a noun; use _near real-time_ as an adjective. Don't add a hyphen between _near_ and _real time_ or _real-time_.
Spell out _near real time_ on first mention; _NRT_ can be used on subsequent mentions.
**node**
A server that stores your data and processes search requests with OpenSearch, usually as part of a cluster. Do not use _master node_ and avoid using _worker node_.
**non-production**
Hyphenate to make the term easier to scan and read.
## O
**onsite**
**OpenSearch**
OpenSearch is a community-driven, open-source search and analytics suite derived from Apache 2.0 licensed Elasticsearch 7.10.2 and Kibana 7.10.2. It consists of a search engine daemon, OpenSearch, and a visualization and user interface, OpenSearch Dashboards.
**OpenSearch Dashboards**
The default visualization tool for data in OpenSearch. On first appearance, use the full name. *Dashboards* may be used on subsequent appearances.
**open source (n.), open-source (adj.)**
Use _open source_ as a noun (for example, “The code used throughout this tutorial is open source and can be freely modified”). Use _open-source_ as an adjective _(open-source software)_.
**OpenSearch Playground**
Do not precede with _the_. OpenSearch Playground provides a central location for existing users and those evaluating OpenSearch to explore features in OpenSearch and OpenSearch Dashboards without downloading or installing any OpenSearch components locally.
**operating system**
When referencing operating systems in documentation, follow these guidelines:
- In general, if your docs or procedures apply to both Linux and macOS, you can also include Unix.
- Unix and UNIX aren't the same. UNIX is a trademarked name that's owned by The Open Group. In most cases, you should use Unix.
- When referring to the Mac operating system, use macOS. Don't say Mac, Mac OS, or OS X.
- When referring to Windows, it's not necessary to prefix with Microsoft.
- If you need to reference multiple Unix-like operating systems, you should separate by commas and use the following order: Linux, macOS, or Unix.
**or earlier, or later**
OK to use with software versions.
## P
**Painless**
The default scripting language for OpenSearch, used either inline or stored for repeat use. Its syntax is similar to Java's.
**per**
- Do not use to mean _according to_ (for example, per the agreement).
- OK to use in meaning of _to_, _in_, _for_, or _by each_ (one per account) where space is limited and in set terms and phrases, such as any of the following:
- queries per second (QPS)
- bits per second (bps)
- megabytes per second (MBps)
- Consider writing around _per_ elsewhere. _Per_ can sound stuffy and confusing to some global users.
**percent**
Spell out in blog posts (for example, _30 percent_).
Use % in headlines, quotations, and tables or in technical copy.
**Performance Analyzer**
An agent and REST API that allows you to query numerous performance metrics for your cluster, including aggregations of those metrics, independent of the Java Virtual Machine (JVM).
**please**
Avoid using except in quoted text.
**plugin**
Tools inside OpenSearch that can be customized to enhance OpenSearch's functionality. For a list of core plugins, see the [OpenSearch plugin installation]({{site.url}}{{site.baseurl}}/opensearch/install/plugins/) page. Capitalize if it appears as part of the product name in the UI.
**pop-up**
**premise, premises**
With reference to property and buildings, always form as plural.
Correct: an on-premises solution
Incorrect: an on-premise solution, an on-prem solution
**pretrain**
**primary shard**
A Lucene instance that contains data for some or all of an index.
**purge**
Use only in reference to specific programming methods. Otherwise, use *delete*, *clear*, or *remove* instead.
## Q
**query**
A call used to request information about your data.
## R
**real time (n.) real-time (adj.)**
Use with caution; this term can imply a degree of responsiveness or speed that might not be accurate. When needed, use _real time_ as a noun (for example, “The request is sent in real time”). Use _real-time_ as an adjective (“A real-time feed is displayed...”).
**recall**
The proportion of relevant documents that are successfully returned by a query.
**replica shard**
Copy of a primary shard. Helps improve performance when using indexes across multiple nodes.
**repo**
Use as a synonym for _repository_ on second and subsequent use.
**RPM Package Manager (RPM)**
Formerly known as Red Hat Package Manager. An open-source package management system for use with Linux distributions.
**rule**
A set of conditions, intervals, and actions that create notifications.
## S
**screenshot**
**segregate**
Avoid using. Use *separate* or *isolate* instead.
**setting**
A key-value pair that creates a mapping in one of the many YAML configuration files used throughout OpenSearch. Sometimes alternatively called a parameter; the name of this mapping in a YAML file is usually dictated by the programming language that manipulates the key-value pair. In OpenSearch documentation (Java), settings are properly `Setting` objects.
The following examples of settings illustrate key-value pairs with a colon separating the two elements:
`Settings.index.number_of_shards: 4`
`plugins.security.audit.enable_rest: true`
**set up (v.), setup (n., adj.)**
Use _set up_ as a verb (“To set up a new user...”). Use _setup_ as a noun or adjective (“To begin setup...”).
**shard**
A piece of an index that consumes CPU and memory. Operates as a full Lucene index.
**simple, simply**
Don't use. Both *simple* and *simply* are not neutral in tone and might sound condescending to some users. If you mean *only*, use *only* instead.
**since**
Use only to describe time events. Don't use in place of _because_.
**slave**
Do not use. Use *replica*, *secondary*, or *standby* instead.
**Snapshot Management (SM)**
**solid state drive (SSD)**
**standalone**
**start, launch**
You _start_ an application but _launch_ an instance, environment, or cluster.
**startup (n.), start up (v.)**
Never hyphenated. Use _startup_ as a noun (for example, “The following startup procedure guides you through...”). Use _start up_ as a verb (“You can start up the instances by...”).
**Stochastic Gradient Descent (SGD)**
## T
**term frequencyinverse document frequency (TFIDF)**
A numerical statistic that is intended to reflect how important a word is to a document in a collection or corpus.
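As a sketch of one common formulation (many variants exist, with different smoothing and normalization), the statistic multiplies a term's frequency in a document by the log inverse of its document frequency across the corpus:

```python
import math

def tf_idf(term, doc, corpus):
    # Term frequency: how often the term appears in this document,
    # normalized by document length.
    tf = doc.count(term) / len(doc)
    # Inverse document frequency: terms that are rare across the corpus
    # score higher. Assumes the term appears in at least one document.
    df = sum(1 for d in corpus if term in d)
    idf = math.log(len(corpus) / df)
    return tf * idf

corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "cat", "ran"],
]
score_cat = tf_idf("cat", corpus[0], corpus)  # distinctive term
score_the = tf_idf("the", corpus[0], corpus)  # appears in every document
```

A term that appears in every document ("the") scores zero, while a more distinctive term ("cat") scores higher.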
**time out (v.), timeout (n., adj.)**
Never hyphenate. Use _time out_ as a verb (“The request will time out if the server doesn't respond”). Use _timeout_ as a noun or adjective (“You can set the timeout interval by entering a number into...”).
**time frame**
**time-series data**
Data that's provided as part of a metric. The time value is assumed to be when the value occurred.
**timestamp**
**time zone**
**trade-off**
**trigger**
Avoid using as a verb to refer to an action that precipitates a subsequent action. It is OK to use when referring to a feature name, such as a *trigger function* or *time-triggered architecture*. As a verb, use an alternative, such as *initiate*, *invoke*, *launch*, or *start*.
**truststore**
**turn on, turn off**
Use *turn on* and *turn off* in reference to a toggle to describe switching a setting or mode on or off.
Don't use *choose*, *select*, *clear*, *slide*, *enable*, or *disable* for a toggle.
For making a feature available or unavailable, use *enable*.
## U
**UltraWarm**
A performance-optimized storage tier that you can use to store and analyze your data with Elasticsearch and Kibana. To learn more about the service, see the introductory [blog post](https://aws.amazon.com/about-aws/whats-new/2020/05/aws-announces-amazon-elasticsearch-service-ultrawarm-general-availability/).
**upper left, upper right**
Hyphenate as adjectives. Use instead of *top left* and *top right*, unless the field name uses *top*. For example, "The upper-right corner."
**US**
No periods, as specified in the Chicago Manual of Style.
**user**
In most cases, replace with the more direct form _you_. Reserve _user_ for cases where you are referring to a third party (not the audience you are writing for).
**username**
## V
**version**
**v., vs., versus**
Do not use. Use _compared to_ or _compared with_ instead.
**via**
Do not use. Replace with _by using_, _through_, or _with_, or use a more specific phrase such as _by accessing_ or _by choosing_.
## W
**web**
**webpage**
Never _web page_.
**website**
Never _web site_.
**while, although, whereas**
Only use _while_ to mean “during an interval of time.” Don't use it to mean _although_ because it is often ambiguous. _Whereas_ is a better alternative to _although_ in many cases, but it can sound overly formal.
**white day**
Do not use. Use *open day* instead.
**whitelist**
Do not use. Use *allow list* instead.
**white space**
**wish, want, desire, need**
_Wish_ and _desire_ are indirect and nuanced versions of _want_. Don't use them. Be direct.
Do not confuse _wants_ with _needs_. Use the term that's appropriate to the situation. _Need_ connotes a requirement or obligation, whereas _want_ indicates that you have an intent but still a choice of valid actions.
## Y
**Yellowdog Updater, Modified (YUM)**
An open-source tool for command-line and graphical-based package management for RPM (Red Hat Package Manager)-based Linux systems.
---
layout: default
title: Breaking changes
nav_order: 5
permalink: /breaking-changes/
---
## 1.x
### Migrating to OpenSearch and limits on the number of nested JSON objects
Migrating from Elasticsearch OSS version 6.8 to OpenSearch version 1.x will fail when a cluster contains any document that includes more than 10,000 nested JSON objects across all fields. Elasticsearch version 7.0 introduced the `index.mapping.nested_objects.limit` setting to guard against out-of-memory errors and assigned the setting a default of `10000`. OpenSearch adopted this setting at its inception and enforces the limitation on nested JSON objects. However, because the setting is not present in Elasticsearch 6.8 and not recognized by this version, migration to OpenSearch 1.x can result in incompatibility issues that block shard relocation between Elasticsearch 6.8 and OpenSearch versions 1.x when the number of nested JSON objects in any document surpasses the default limit.
Therefore, we recommend evaluating your data for these limits before attempting to migrate from Elasticsearch 6.8.
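If you need to raise or lower the limit on an OpenSearch index, the setting can be specified at index creation (the index name here is hypothetical):

```json
PUT /my-index
{
  "settings": {
    "index.mapping.nested_objects.limit": 10000
  }
}
```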
## 2.0.0
### Remove mapping types parameter
The `type` parameter has been removed from all OpenSearch API endpoints. Instead, indexes can be categorized by document type. For more details, see issue [#1940](https://github.com/opensearch-project/opensearch/issues/1940).
### Deprecate non-inclusive terms
Non-inclusive terms are deprecated in version 2.x and will be permanently removed in OpenSearch 3.0. We are using the following replacements:
- "Whitelist" is now "Allow list"
- "Blacklist" is now "Deny list"
- "Master" is now "Cluster Manager"
### Add OpenSearch Notifications plugins
In OpenSearch 2.0, the Alerting plugin is now integrated with new plugins for Notifications. If you want to continue to use the notification action in the Alerting plugin, install the new backend plugins `notifications-core` and `notifications`. If you want to manage notifications in OpenSearch Dashboards, use the new `notificationsDashboards` plugin. For more information, see [Notifications]({{site.url}}{{site.baseurl}}/observing-your-data/notifications/index/) on the OpenSearch documentation page.
### Drop support for JDK 8
A Lucene upgrade forced OpenSearch to drop support for JDK 8. As a consequence, the [Java high-level REST client]({{site.url}}{{site.baseurl}}/clients/java-rest-high-level/) no longer supports JDK 8. Restoring JDK 8 support is currently an `opensearch-java` proposal [#156](https://github.com/opensearch-project/opensearch-java/issues/156) and will require removing OpenSearch core as a dependency from the Java client (issue [#262](https://github.com/opensearch-project/opensearch-java/issues/262)).
## 2.5.0
### Wildcard query behavior for text fields
OpenSearch 2.5 contains a bug fix to correct the behavior of the `case_insensitive` parameter for the `wildcard` query on text fields. As a result, a wildcard query on text fields that ignored case sensitivity and erroneously returned results prior to the bug fix will not return the same results. For more information, see issue [#8711](https://github.com/opensearch-project/OpenSearch/issues/8711).
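For reference, a wildcard query that opts into case-insensitive matching looks like the following (the index and field names here are hypothetical):

```json
GET /my-index/_search
{
  "query": {
    "wildcard": {
      "my_text_field": {
        "value": "open*",
        "case_insensitive": true
      }
    }
  }
}
```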
---
layout: default
title: Getting started
nav_order: 1
has_children: false
has_toc: false
nav_exclude: true
permalink: /about/
redirect_from:
- /docs/opensearch/
- /opensearch/
- /opensearch/index/
---
{%- comment -%}The `/docs/opensearch/` redirect is specifically to support the UI links in OpenSearch Dashboards 1.0.0.{%- endcomment -%}
# OpenSearch and OpenSearch Dashboards
**Version {{site.opensearch_major_minor_version}}**
{: .label .label-blue }
This section contains documentation for OpenSearch and OpenSearch Dashboards.
## Getting started
- [Intro to OpenSearch]({{site.url}}{{site.baseurl}}/intro/)
- [Quickstart]({{site.url}}{{site.baseurl}}/quickstart/)
- [Install OpenSearch]({{site.url}}{{site.baseurl}}/install-and-configure/install-opensearch/index/)
- [Install OpenSearch Dashboards]({{site.url}}{{site.baseurl}}/install-and-configure/install-dashboards/index/)
- [See the FAQ](https://opensearch.org/faq)
## Why use OpenSearch?
OpenSearch supports the following use cases:
<table style="table-layout: auto ; width: 100%;">
<tbody>
<tr style="text-align: center; vertical-align:center;">
<td><img src="{{site.url}}{{site.baseurl}}/images/1_search.png" class="no-border" alt="Fast, scalable full-text search" height="100"/></td>
<td><img src="{{site.url}}{{site.baseurl}}/images/2_monitoring.png" class="no-border" alt="Application and infrastructure monitoring" height="100"/></td>
<td><img src="{{site.url}}{{site.baseurl}}/images/3_security.png" class="no-border" alt="Security and event information management" height="100"/></td>
<td><img src="{{site.url}}{{site.baseurl}}/images/4_tracking.png" class="no-border" alt="Operational health tracking" height="100"/></td>
</tr>
<tr style="text-align: left; vertical-align:top; font-weight: bold; color: rgb(0,59,92)">
<td>Fast, Scalable Full-text Search</td>
<td>Application and Infrastructure Monitoring</td>
<td>Security and Event Information Management</td>
<td>Operational Health Tracking</td>
</tr>
<tr style="text-align: left; vertical-align:top;">
<td>Help users find the right information within your application, website, or data lake catalog. </td>
<td>Easily store and analyze log data, and set automated alerts for underperformance.</td>
<td>Centralize logs to enable real-time security monitoring and forensic analysis.</td>
<td>Use observability logs, metrics, and traces to monitor your applications and business in real time.</td>
</tr>
</tbody>
</table>
**Additional features and plugins:**
OpenSearch has several features and plugins to help index, secure, monitor, and analyze your data. Most OpenSearch plugins have corresponding OpenSearch Dashboards plugins that provide a convenient, unified user interface.
- [Anomaly detection]({{site.url}}{{site.baseurl}}/monitoring-plugins/ad/) - Identify atypical data and receive automatic notifications
- [k-NN]({{site.url}}{{site.baseurl}}/search-plugins/knn/) - Find “nearest neighbors” in your vector data
- [Performance Analyzer]({{site.url}}{{site.baseurl}}/monitoring-plugins/pa/) - Monitor and optimize your cluster
- [SQL]({{site.url}}{{site.baseurl}}/search-plugins/sql/index/) - Use SQL or a piped processing language to query your data
- [Index State Management]({{site.url}}{{site.baseurl}}/im-plugin/) - Automate index operations
- [ML Commons plugin]({{site.url}}{{site.baseurl}}/ml-commons-plugin/index/) - Train and execute machine learning models
- [Asynchronous search]({{site.url}}{{site.baseurl}}/search-plugins/async/) - Run search requests in the background
- [Cross-cluster replication]({{site.url}}{{site.baseurl}}/replication-plugin/index/) - Replicate your data across multiple OpenSearch clusters
## The secure path forward
OpenSearch includes a demo configuration so that you can get up and running quickly, but before using OpenSearch in a production environment, you must [configure the Security plugin manually]({{site.url}}{{site.baseurl}}/security/configuration/index/) with your own certificates, authentication method, users, and passwords.
## Looking for the Javadoc?
See [opensearch.org/javadocs/](https://opensearch.org/javadocs/).
## Get involved
[OpenSearch](https://opensearch.org) is supported by Amazon Web Services. All components are available under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0.html) on [GitHub](https://github.com/opensearch-project/).
The project welcomes GitHub issues, bug fixes, features, plugins, documentation---anything at all. To get involved, see [Contributing](https://opensearch.org/source.html) on the OpenSearch website.
---
<small>OpenSearch includes certain Apache-licensed Elasticsearch code from Elasticsearch B.V. and other source code. Elasticsearch B.V. is not the source of that other source code. ELASTICSEARCH is a registered trademark of Elasticsearch B.V.</small>
---
layout: default
title: Intro to OpenSearch
nav_order: 2
permalink: /intro/
---
# Introduction to OpenSearch
OpenSearch is a distributed search and analytics engine based on [Apache Lucene](https://lucene.apache.org/). After adding your data to OpenSearch, you can perform full-text searches on it with all of the features you might expect: search by field, search multiple indexes, boost fields, rank results by score, sort results by field, and aggregate results.
Unsurprisingly, people often use search engines such as OpenSearch as the backend for a search application---think [Wikipedia](https://en.wikipedia.org/wiki/Wikipedia:FAQ/Technical#What_software_is_used_to_run_Wikipedia?) or an online store. It offers excellent performance and can scale up and down as the needs of the application grow or shrink.
An equally popular, but less obvious use case is log analytics, in which you take the logs from an application, feed them into OpenSearch, and use the rich search and visualization functionality to identify issues. For example, a malfunctioning web server might throw a 500 error 0.5% of the time, which can be hard to notice unless you have a real-time graph of all HTTP status codes that the server has thrown in the past four hours. You can use [OpenSearch Dashboards]({{site.url}}{{site.baseurl}}/dashboards/index/) to build these sorts of visualizations from data in OpenSearch.
## Clusters and nodes
In a single node cluster, such as a laptop, one machine has to do everything: manage the state of the cluster, index and search data, and perform any preprocessing of data prior to indexing it. As a cluster grows, however, you can subdivide responsibilities. Nodes with fast disks and plenty of RAM might be great at indexing and searching data, whereas a node with plenty of CPU power and a tiny disk could manage cluster state. For more information on setting node types, see [Cluster formation]({{site.url}}{{site.baseurl}}/opensearch/cluster/).
## Indexes and documents
OpenSearch organizes data into *indexes*. Each index is a collection of JSON *documents*. If you have a set of raw encyclopedia articles or log lines that you want to add to OpenSearch, you must first convert them to [JSON](https://www.json.org/). A simple JSON document for a movie might look like this:
```json
{
  "title": "The Wind Rises",
  "release_date": "2013-07-20"
}
```
Indexes also contain mappings and settings:
- A *mapping* is the collection of *fields* that documents in the index have. In this case, those fields are `title` and `release_date`.
- Settings include data like the index name, creation date, and number of shards.
## Primary and replica shards
OpenSearch splits indexes into *shards* for even distribution across nodes in a cluster. For example, a 400 GB index might be too large for any single node in your cluster to handle, but split into ten shards, each one 40 GB, OpenSearch can distribute the shards across ten nodes and work with each shard individually.
By default, OpenSearch creates a *replica* shard for each *primary* shard. If you split your index into ten shards, for example, OpenSearch also creates ten replica shards. These replica shards act as backups in the event of a node failure---OpenSearch distributes replica shards to different nodes than their corresponding primary shards---but they also improve the speed and rate at which the cluster can process search requests. You might specify more than one replica per index for a search-heavy workload.
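Shard and replica counts are index settings. As a sketch, they can be specified when an index is created (the index name here is hypothetical):

```json
PUT /my-index
{
  "settings": {
    "index.number_of_shards": 10,
    "index.number_of_replicas": 1
  }
}
```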
Despite being a piece of an OpenSearch index, each shard is actually a full Lucene index.
## REST API
You interact with OpenSearch clusters using the REST API, which offers a lot of flexibility. You can use clients like [curl](https://curl.se/) or any programming language that can send HTTP requests. To add a JSON document to an OpenSearch index (that is, index a document), you send an HTTP request:
```json
PUT https://<host>:<port>/<index-name>/_doc/<document-id>
```
To run a search for the document:
```json
GET https://<host>:<port>/<index-name>/_search?q=wind
```
To delete the document:
```json
DELETE https://<host>:<port>/<index-name>/_doc/<document-id>
```
You can change most OpenSearch settings using the REST API, modify indexes, check the health of the cluster, get statistics---almost everything.
## Advanced concepts
The following section describes more advanced OpenSearch concepts.
### Translog
Any index changes, such as document indexing or deletion, are written to disk during a Lucene commit. However, Lucene commits are expensive operations, so they cannot be performed after every change to the index. Instead, each shard records every indexing operation in a transaction log called _translog_. When a document is indexed, it is added to the memory buffer and recorded in the translog. After a process or host restart, any data in the in-memory buffer is lost. Recording the document in the translog ensures durability because the translog is written to disk.
Frequent refresh operations write the documents in the memory buffer to a segment and then clear the memory buffer. Periodically, a [flush](#flush) performs a Lucene commit, which includes writing the segments to disk using `fsync`, purging the old translog, and starting a new translog. Thus, a translog contains all operations that have not yet been flushed.
### Refresh
Periodically, OpenSearch performs a _refresh_ operation, which writes the documents from the in-memory Lucene index to files. These files are not guaranteed to be durable because an `fsync` is not performed. A refresh makes documents available for search.
### Flush
A _flush_ operation persists the files to disk using `fsync`, ensuring durability. Flushing ensures that the data stored only in the translog is recorded in the Lucene index. OpenSearch performs a flush as needed to ensure that the translog does not grow too large.
### Merge
In OpenSearch, a shard is a Lucene index, which consists of _segments_ (or segment files). Segments store the indexed data and are immutable. Periodically, smaller segments are merged into larger ones. Merging reduces the overall number of segments on each shard, frees up disk space, and improves search performance. Eventually, segments reach a maximum size specified in the merge policy and are no longer merged into larger segments. The merge policy also specifies how often merges are performed.
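The maintenance operations described above normally run automatically, but each can also be invoked manually through the REST API, which is useful for testing (the index name here is hypothetical):

```json
POST /my-index/_refresh
POST /my-index/_flush
POST /my-index/_forcemerge?max_num_segments=1
```

The last call merges a shard's segments down to a target count, which is sometimes done on indexes that are no longer written to.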
---
layout: default
title: Quickstart
nav_order: 3
permalink: /quickstart/
redirect_from:
- /opensearch/install/quickstart/
---
# Quickstart
Get started using OpenSearch and OpenSearch Dashboards by deploying your containers with [Docker](https://www.docker.com/). Before proceeding, you need to [get Docker](https://docs.docker.com/get-docker/) and [Docker Compose](https://github.com/docker/compose) installed on your local machine.
The Docker Compose commands used in this guide are written with a hyphen (for example, `docker-compose`). If you installed Docker Desktop on your machine, which automatically installs a bundled version of Docker Compose, then you should remove the hyphen. For example, change `docker-compose` to `docker compose`.
{: .note}
## Starting your cluster
You'll need a special file, called a Compose file, that Docker Compose uses to define and create the containers in your cluster. The OpenSearch Project provides a sample Compose file that you can use to get started. Learn more about working with Compose files by reviewing the official [Compose specification](https://docs.docker.com/compose/compose-file/).
1. Before running OpenSearch on your machine, you should disable memory paging and swapping on the host to improve performance and increase the number of memory maps available to OpenSearch. See [important system settings]({{site.url}}{{site.baseurl}}/opensearch/install/important-settings/) for more information.
```bash
# Disable memory paging and swapping.
sudo swapoff -a
# Edit the sysctl config file that defines the host's max map count.
sudo vi /etc/sysctl.conf
# Set max map count to the recommended value of 262144.
vm.max_map_count=262144
# Reload the kernel parameters.
sudo sysctl -p
```
1. Download the sample Compose file to your host. You can download the file with command line utilities like `curl` and `wget`, or you can manually copy [docker-compose.yml](https://github.com/opensearch-project/documentation-website/blob/{{site.opensearch_major_minor_version}}/assets/examples/docker-compose.yml) from the OpenSearch Project documentation-website repository using a web browser.
```bash
# Using cURL:
curl -O https://raw.githubusercontent.com/opensearch-project/documentation-website/{{site.opensearch_major_minor_version}}/assets/examples/docker-compose.yml
# Using wget:
wget https://raw.githubusercontent.com/opensearch-project/documentation-website/{{site.opensearch_major_minor_version}}/assets/examples/docker-compose.yml
```
1. In your terminal application, navigate to the directory containing the `docker-compose.yml` file you just downloaded, and run the following command to create and start the cluster as a background process.
```bash
docker-compose up -d
```
1. Confirm that the containers are running with the command `docker-compose ps`. You should see an output like the following:
```bash
$ docker-compose ps
NAME COMMAND SERVICE STATUS PORTS
opensearch-dashboards "./opensearch-dashbo…" opensearch-dashboards running 0.0.0.0:5601->5601/tcp
opensearch-node1 "./opensearch-docker…" opensearch-node1 running 0.0.0.0:9200->9200/tcp, 9300/tcp, 0.0.0.0:9600->9600/tcp, 9650/tcp
opensearch-node2 "./opensearch-docker…" opensearch-node2 running 9200/tcp, 9300/tcp, 9600/tcp, 9650/tcp
```
1. Query the OpenSearch REST API to verify that the service is running. You should use `-k` (also written as `--insecure`) to disable certificate verification because the default security configuration uses demo certificates. Use `-u` to pass the default username and password (`admin:<custom-admin-password>`).
```bash
curl https://localhost:9200 -ku admin:<custom-admin-password>
```
Sample response:
```json
{
"name" : "opensearch-node1",
"cluster_name" : "opensearch-cluster",
"cluster_uuid" : "W0B8gPotTAajhMPbC9D4ww",
"version" : {
"distribution" : "opensearch",
"number" : "2.6.0",
"build_type" : "tar",
"build_hash" : "7203a5af21a8a009aece1474446b437a3c674db6",
"build_date" : "2023-02-24T18:58:37.352296474Z",
"build_snapshot" : false,
"lucene_version" : "9.5.0",
"minimum_wire_compatibility_version" : "7.10.0",
"minimum_index_compatibility_version" : "7.0.0"
},
"tagline" : "The OpenSearch Project: https://opensearch.org/"
}
```
1. Explore OpenSearch Dashboards by opening `http://localhost:5601/` in a web browser on the same host that is running your OpenSearch cluster. The default username is `admin` and the default password is set in your `docker-compose.yml` file in the `OPENSEARCH_INITIAL_ADMIN_PASSWORD=<custom-admin-password>` setting.
## Create an index and field mappings using sample data
Create an index and define field mappings using a dataset provided by the OpenSearch Project. The same fictitious e-commerce data is also used for sample visualizations in OpenSearch Dashboards. To learn more, see [Getting started with OpenSearch Dashboards]({{site.url}}{{site.baseurl}}/dashboards/index/).
1. Download [ecommerce-field_mappings.json](https://github.com/opensearch-project/documentation-website/blob/{{site.opensearch_major_minor_version}}/assets/examples/ecommerce-field_mappings.json). This file defines a [mapping]({{site.url}}{{site.baseurl}}/opensearch/mappings/) for the sample data you will use.
```bash
# Using cURL:
curl -O https://raw.githubusercontent.com/opensearch-project/documentation-website/{{site.opensearch_major_minor_version}}/assets/examples/ecommerce-field_mappings.json
# Using wget:
wget https://raw.githubusercontent.com/opensearch-project/documentation-website/{{site.opensearch_major_minor_version}}/assets/examples/ecommerce-field_mappings.json
```
1. Download [ecommerce.json](https://github.com/opensearch-project/documentation-website/blob/{{site.opensearch_major_minor_version}}/assets/examples/ecommerce.json). This file contains the index data formatted so that it can be ingested by the bulk API. To learn more, see [index data]({{site.url}}{{site.baseurl}}/opensearch/index-data/) and [Bulk]({{site.url}}{{site.baseurl}}/api-reference/document-apis/bulk/).
```bash
# Using cURL:
curl -O https://raw.githubusercontent.com/opensearch-project/documentation-website/{{site.opensearch_major_minor_version}}/assets/examples/ecommerce.json
# Using wget:
wget https://raw.githubusercontent.com/opensearch-project/documentation-website/{{site.opensearch_major_minor_version}}/assets/examples/ecommerce.json
```
1. Define the field mappings with the mapping file.
```bash
curl -H "Content-Type: application/x-ndjson" -X PUT "https://localhost:9200/ecommerce" -ku admin:<custom-admin-password> --data-binary "@ecommerce-field_mappings.json"
```
1. Upload the sample documents to the index using the Bulk API.
```bash
curl -H "Content-Type: application/x-ndjson" -X PUT "https://localhost:9200/ecommerce/_bulk" -ku admin:<custom-admin-password> --data-binary "@ecommerce.json"
```
1. Query the data using the search API. The following command submits a query that will return documents where `customer_first_name` is `Sonya`.
```bash
curl -H 'Content-Type: application/json' -X GET "https://localhost:9200/ecommerce/_search?pretty=true" -ku admin:<custom-admin-password> -d' {"query":{"match":{"customer_first_name":"Sonya"}}}'
```
Queries submitted to the OpenSearch REST API generally return minified JSON by default. For a human-readable response body, use the query parameter `pretty=true`. For more information about `pretty` and other useful query parameters, see [Common REST parameters]({{site.url}}{{site.baseurl}}/opensearch/common-parameters/).
1. Access OpenSearch Dashboards by opening `http://localhost:5601/` in a web browser on the same host that is running your OpenSearch cluster. The default username is `admin` and the password is set in your `docker-compose.yml` file in the `OPENSEARCH_INITIAL_ADMIN_PASSWORD=<custom-admin-password>` setting.
1. On the top menu bar, go to **Management > Dev Tools**.
1. In the left pane of the console, enter the following:
```json
GET ecommerce/_search
{
"query": {
"match": {
"customer_first_name": "Sonya"
}
}
}
```
1. Choose the triangle icon at the top right of the request to submit the query. You can also submit the request by pressing `Ctrl+Enter` (or `Cmd+Enter` for Mac users). To learn more about using the OpenSearch Dashboards console for submitting queries, see [Running queries in the console]({{site.url}}{{site.baseurl}}/dashboards/run-queries/).
## Next steps
You successfully deployed your own OpenSearch cluster with OpenSearch Dashboards and added some sample data. Now you're ready to learn about configuration and functionality in more detail. Here are a few recommendations on where to begin:
- [About the Security plugin]({{site.url}}{{site.baseurl}}/security/index/)
- [OpenSearch configuration]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/)
- [OpenSearch plugin installation]({{site.url}}{{site.baseurl}}/opensearch/install/plugins/)
- [Getting started with OpenSearch Dashboards]({{site.url}}{{site.baseurl}}/dashboards/index/)
- [OpenSearch tools]({{site.url}}{{site.baseurl}}/tools/index/)
- [Index APIs]({{site.url}}{{site.baseurl}}/api-reference/index-apis/index/)
## Common issues
Review these common issues and suggested solutions if your containers fail to start or exit unexpectedly.
### Docker commands require elevated permissions
Eliminate the need for running your Docker commands with `sudo` by adding your user to the `docker` user group. See Docker's [Post-installation steps for Linux](https://docs.docker.com/engine/install/linux-postinstall/) for more information.
```bash
sudo usermod -aG docker $USER
```
### Error message: "-bash: docker-compose: command not found"
If you installed Docker Desktop, then Docker Compose is already installed on your machine. Try `docker compose` (without the hyphen) instead of `docker-compose`. See [Use Docker Compose](https://docs.docker.com/get-started/08_using_compose/).
### Error message: "docker: 'compose' is not a docker command."
If you installed Docker Engine, then you must install Docker Compose separately, and you will use the command `docker-compose` (with a hyphen). See [Docker Compose](https://github.com/docker/compose).
### Error message: "max virtual memory areas vm.max_map_count [65530] is too low"
OpenSearch will fail to start if your host's `vm.max_map_count` is too low. Review the [important system settings]({{site.url}}{{site.baseurl}}/opensearch/install/important-settings/) if you see the following errors in the service log, and set `vm.max_map_count` appropriately.
```bash
opensearch-node1 | ERROR: [1] bootstrap checks failed
opensearch-node1 | [1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
opensearch-node1 | ERROR: OpenSearch did not exit normally - check the logs at /usr/share/opensearch/logs/opensearch-cluster.log
```
---
layout: default
title: Version history
nav_order: 4
permalink: /version-history/
---
# Version history
OpenSearch version | Release highlights | Release date
:--- | :--- | :---
[2.12.0](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.12.0.md) | Makes concurrent segment search and conversational search generally available. Provides an experimental OpenSearch Assistant Toolkit, including agents and tools, workflow automation, and OpenSearch Assistant for OpenSearch Dashboards UI. Adds a new match-only text field, query insights to monitor top N queries, and k-NN search on nested fields. For a full list of release highlights, see the Release Notes. | 20 February 2024
[2.11.1](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.11.1.md) | Includes maintenance changes and bug fixes for cross-cluster replication, alerting, observability, OpenSearch Dashboards, index management, machine learning, security, and security analytics. For a full list of release highlights, see the Release Notes. | 30 November 2023
[2.11.0](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.11.0.md) | Adds multimodal and sparse neural search capability and the ability to take shallow snapshots that refer to data stored in remote-backed storage. Makes the search comparison tool generally available. Includes a simplified workflow to create threat detectors in Security Analytics and improved security in OpenSearch Dashboards. Experimental features include a new framework and toolset for distributed tracing and updates to conversational search. For a full list of release highlights, see the Release Notes. | 16 October 2023
[2.10.0](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.10.0.md) | Makes remote-backed storage generally available. Adds hybrid search capability, custom log types for Security Analytics, IP2Geo ingest processor, and delimited term frequency token filter. Includes a new look and feel for OpenSearch Dashboards and updates the Discover tool. Adds Microsoft Teams webhook support for notifications. Experimental features include concurrent segment search and conversational search. For a full list of release highlights, see the Release Notes. | 25 September 2023
[2.9.0](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.9.0.md) | Makes search pipelines and the Neural Search plugin generally available. Adds ML model access control and integration with external ML tools. Implements k-NN byte vectors and efficient filtering with the Faiss engine. Integrates alerting and anomaly detection with OpenSearch Dashboards and adds composite monitors. Adds two new index codec algorithm options. Includes a new ingestion schema for Security Analytics, geoshape aggregations, and extensions---a new mechanism for extending OpenSearch functionality. For a full list of release highlights, see the Release Notes. | 24 July 2023
[2.8.0](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.8.0.md) | Adds cross-cluster query with PPL, search pipelines, an option to turn on segment replication as the default replication type, improved searchable snapshot performance, and Amazon OpenSearch Serverless support with SigV4 authentication for multiple data sources. Includes the UI for the flush, refresh, and clear cache operations in OpenSearch Dashboards. For a full list of release highlights, see the Release Notes. | 06 June 2023
[2.7.0](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.7.0.md) | Includes searchable snapshots and segment replication, which are now generally available. Adds multiple data sources, observability features, dynamic tenant management, component templates, and shape-based map filters in OpenSearch Dashboards. Includes the flat object field type, hot shard identification, and a new automatic reloading mechanism for ML models. For a full list of release highlights, see the Release Notes. | 02 May 2023
[2.6.0](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.6.0.md) | Includes simple schema for observability, index management UI enhancements, Security Analytics enhancements, search backpressure at the coordinator node level, and the ability to add maps to dashboards. Experimental features include a new ML model health dashboard, new text embedding models in ML, and SigV4 authentication in Dashboards. For a full list of release highlights, see the Release Notes. | 28 February 2023
[2.5.0](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.5.0.md) | Includes index management UI enhancements, multi-layer maps, Jaeger support for observability, Debian distributions, returning cluster health by awareness attribute, cluster manager task throttling, weighted zonal search request routing policy, and query string support in index rollups. Experimental features include request-level durability in remote-backed storage and GPU acceleration for ML nodes. For a full list of release highlights, see the Release Notes. | 24 January 2023
[2.4.1](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.4.1.md) | Includes maintenance changes and bug fixes for gradle check and indexing pressure tests. Adds support for skipping changelog. | 13 December 2022
[2.4.0](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.4.0.md) | Includes Windows support, Point-in-time search, custom k-NN filtering, xy_point and xy_shape field types for Cartesian coordinates, GeoHex grid aggregation, and resilience enhancements, including search backpressure. In OpenSearch Dashboards, this release adds snapshot restore functionality, multiple authentication, and aggregate view of saved objects. This release includes the following experimental features: searchable snapshots, Compare Search Results, multiple data sources in OpenSearch Dashboards, a new Model Serving Framework in ML Commons, a new Neural Search plugin that supports semantic search, and a new Security Analytics plugin to analyze security logs. For a full list of release highlights, see the Release Notes. | 15 November 2022
[2.3.0](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.3.0.md) | This release includes the following experimental features: segment replication, remote-backed storage, and drag and drop for OpenSearch Dashboards. Experimental features allow you to test new functionality in OpenSearch. Because these features are still being developed, your testing and feedback can help shape the development of the feature before it's officially released. We do not recommend use of experimental features in production. Additionally, this release adds maketime and makedate datetime functions for the SQL plugin. Creates a new [OpenSearch Playground](https://playground.opensearch.org) demo site for OpenSearch Dashboards. For a full list of release highlights, see the Release Notes. | 14 September 2022
[2.2.1](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.2.1.md) | Includes gradle updates and bug fixes for gradle check. | 01 September 2022
[2.2.0](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.2.0.md) | Includes support for Logistic Regression and RCF Summarize machine learning algorithms in ML Commons, Lucene or C-based Nmslib and Faiss libraries for approximate k-NN search, search by relevance using SQL and PPL queries, custom region maps for visualizations, and rollup enhancements. For a full list of release highlights, see the Release Notes. | 11 August 2022
[2.1.0](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.1.0.md) | Includes support for dedicated ML node in the ML Commons plugin, relevance search and other features in SQL, multi-terms aggregation, and Snapshot Management. For a full list of release highlights, see the Release Notes. | 07 July 2022
[2.0.1](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.0.1.md) | Includes bug fixes and maintenance updates for Alerting and Anomaly Detection. | 16 June 2022
[2.0.0](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.0.0.md) | Includes document-level monitors for alerting, OpenSearch Notifications plugins, and Geo Map Tiles in OpenSearch Dashboards. Also adds support for Lucene 9 and bug fixes for all OpenSearch plugins. For a full list of release highlights, see the Release Notes. | 26 May 2022
[2.0.0-rc1](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-2.0.0-rc1.md) | The Release Candidate for 2.0.0. This version allows you to preview the upcoming 2.0.0 release before the GA release. The preview release adds document-level alerting, support for Lucene 9, and the ability to use term lookup queries in document level security. | 03 May 2022
[1.3.15](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.15.md) | Includes bug fixes and maintenance updates for cross-cluster replication, SQL, OpenSearch Dashboards reporting, and alerting. | 05 March 2024
[1.3.14](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.14.md) | Includes bug fixes and maintenance updates for OpenSearch security and OpenSearch Dashboards security. | 12 December 2023
[1.3.13](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.13.md) | Includes bug fixes for Anomaly Detection, adds maintenance updates and infrastructure enhancements. | 21 September 2023
[1.3.12](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.12.md) | Adds maintenance updates for OpenSearch security and OpenSearch Dashboards observability. Includes bug fixes for observability, OpenSearch Dashboards visualizations, and OpenSearch security. | 10 August 2023
[1.3.11](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.11.md) | Adds maintenance updates for OpenSearch security, OpenSearch Dashboards security, and ML Commons. | 29 June 2023
[1.3.10](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.10.md) | Adds infrastructure enhancements and maintenance updates for anomaly detection, observability, and security. Includes bug fixes for index management and OpenSearch security. | 18 May 2023
[1.3.9](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.9.md) | Adds Debian support. Includes upgrades, enhancements, and maintenance updates for OpenSearch core, k-NN, and OpenSearch security. | 16 March 2023
[1.3.8](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.8.md) | Adds OpenSearch security enhancements. Updates tool scripts to run on Windows. Includes maintenance updates and bug fixes for Anomaly Detection and OpenSearch security. | 02 February 2023
[1.3.7](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.7.md) | Adds Windows support. Includes maintenance updates and bug fixes for error handling. | 13 December 2022
[1.3.6](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.6.md) | Includes maintenance updates and bug fixes for tenancy in the OpenSearch Security Dashboards plugin. | 06 October 2022
[1.3.5](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.5.md) | Includes maintenance updates and bug fixes for gradle check and OpenSearch security. | 01 September 2022
[1.3.4](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.4.md) | Includes maintenance updates and bug fixes for OpenSearch and OpenSearch Dashboards. | 14 July 2022
[1.3.3](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.3.md) | Adds enhancements to Anomaly Detection and ML Commons. Bug fixes for Anomaly Detection, Observability, and k-NN. | 09 June 2022
[1.3.2](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.2.md) | Bug fixes for Anomaly Detection and the Security Dashboards Plugin, adds the option to install OpenSearch using RPM, as well as enhancements to the ML Commons execute task, and the removal of the job-scheduler zip in Anomaly Detection. | 05 May 2022
[1.3.1](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.1.md) | Bug fixes when using document-level security, and adjusted ML Commons to use the latest RCF jar and protostuff for RCF model serialization. | 30 March 2022
[1.3.0](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.3.0.md) | Adds Model Type Validation to Validate Detector API, continuous transforms, custom actions, applied policy parameter to Explain API, default action retries, and new rollover and transition conditions to Index Management, new ML Commons plugin, parse command to SQL, Application Analytics, Live Tail, Correlation, and Events Flyout to Observability, and auto backport and support for OPENSEARCH_JAVA_HOME to Performance Analyzer. Bug fixes. | 17 March 2022
[1.2.4](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.2.4.md) | Updates Performance Analyzer, SQL, and Security plugins to Log4j 2.17.1, Alerting and Job Scheduler to cron-utils 9.1.6, and gson in Anomaly Detection and SQL. | 18 January 2022
[1.2.3](https://github.com/opensearch-project/opensearch-build/blob/main/release-notes/opensearch-release-notes-1.2.3.md) | Updates the version of Log4j used in OpenSearch to Log4j 2.17.0 as recommended by the advisory in [CVE-2021-45105](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-45105). | 22 December 2021
[1.2.0](https://github.com/opensearch-project/OpenSearch/blob/main/release-notes/opensearch.release-notes-1.2.0.md) | Adds observability, new validation API for Anomaly Detection, shard-level indexing back-pressure, new "match" query type for SQL and PPL, support for Faiss libraries in k-NN, and custom Dashboards branding. | 23 November 2021
[1.1.0](https://github.com/opensearch-project/opensearch-build/tree/main/release-notes/opensearch-release-notes-1.1.0.md) | Adds cross-cluster replication, security for Index Management, bucket-level alerting, a CLI to help with upgrading from Elasticsearch OSS to OpenSearch, and enhancements to high cardinality data in the anomaly detection plugin. | 05 October 2021
[1.0.1](https://github.com/opensearch-project/opensearch-build/tree/main/release-notes/opensearch-release-notes-1.0.1.md) | Bug fixes. | 01 September 2021
[1.0.0](https://github.com/opensearch-project/opensearch-build/tree/main/release-notes/opensearch-release-notes-1.0.0.md) | General availability release. Adds compatibility setting for clients that require a version check before connecting. | 12 July 2021
[1.0.0-rc1](https://github.com/opensearch-project/opensearch-build/tree/main/release-notes/opensearch-release-notes-1.0.0-rc1.md) | First release candidate. | 07 June 2021
[1.0.0-beta1](https://github.com/opensearch-project/opensearch-build/tree/main/release-notes/opensearch-release-notes-1.0.0-beta1.md) | Initial beta release. Refactors plugins to work with OpenSearch. | 13 May 2021
---
layout: default
title: Adjacency matrix
parent: Bucket aggregations
grand_parent: Aggregations
nav_order: 10
redirect_from:
- /query-dsl/aggregations/bucket/adjacency-matrix/
---
# Adjacency matrix aggregations
The `adjacency_matrix` aggregation lets you define filter expressions and returns a matrix of the intersecting filters where each non-empty cell in the matrix represents a bucket. You can find how many documents fall within any combination of filters.
Use the `adjacency_matrix` aggregation to discover how concepts are related by visualizing the data as graphs.
For example, in the sample eCommerce dataset, to analyze how the different manufacturing companies are related:
```json
GET opensearch_dashboards_sample_data_ecommerce/_search
{
"size": 0,
"aggs": {
"interactions": {
"adjacency_matrix": {
"filters": {
"grpA": {
"match": {
"manufacturer.keyword": "Low Tide Media"
}
},
"grpB": {
"match": {
"manufacturer.keyword": "Elitelligence"
}
},
"grpC": {
"match": {
"manufacturer.keyword": "Oceanavigations"
}
}
}
}
}
}
}
```
{% include copy-curl.html %}
#### Example response
```json
{
...
"aggregations" : {
"interactions" : {
"buckets" : [
{
"key" : "grpA",
"doc_count" : 1553
},
{
"key" : "grpA&grpB",
"doc_count" : 590
},
{
"key" : "grpA&grpC",
"doc_count" : 329
},
{
"key" : "grpB",
"doc_count" : 1370
},
{
"key" : "grpB&grpC",
"doc_count" : 299
},
{
"key" : "grpC",
"doc_count" : 1218
}
]
}
}
}
```
Let's take a closer look at the result:
```json
{
"key" : "grpA&grpB",
"doc_count" : 590
}
```
- `grpA`: Products manufactured by Low Tide Media.
- `grpB`: Products manufactured by Elitelligence.
- `590`: Number of products that are manufactured by both.
You can use OpenSearch Dashboards to represent this data with a network graph.
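To build intuition for how those intersection counts arise, here is a minimal Python sketch of the same matrix computation over in-memory sets of document IDs. The IDs and counts here are invented for illustration, not taken from the sample data:

```python
from itertools import combinations

def adjacency_matrix(filters):
    """Compute per-filter and pairwise-intersection doc counts,
    mirroring the buckets returned by the adjacency_matrix aggregation."""
    buckets = {}
    # One bucket per filter: all documents matching that filter.
    for name, docs in filters.items():
        buckets[name] = len(docs)
    # One bucket per pair of filters: documents matching both.
    for (a, docs_a), (b, docs_b) in combinations(sorted(filters.items()), 2):
        overlap = docs_a & docs_b
        if overlap:  # empty cells are omitted, as in OpenSearch
            buckets[f"{a}&{b}"] = len(overlap)
    return buckets

# Hypothetical doc-ID sets standing in for the filter matches.
filters = {
    "grpA": {1, 2, 3, 4},
    "grpB": {3, 4, 5},
    "grpC": {4, 6},
}
print(adjacency_matrix(filters))
```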
---
layout: default
title: Date histogram
parent: Bucket aggregations
grand_parent: Aggregations
nav_order: 20
redirect_from:
- /query-dsl/aggregations/bucket/date-histogram/
---
# Date histogram aggregations
The `date_histogram` aggregation uses [date math]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/date/#date-math) to generate histograms for time-series data.
For example, you can find how many hits your website gets per month:
```json
GET opensearch_dashboards_sample_data_logs/_search
{
"size": 0,
"aggs": {
"logs_per_month": {
"date_histogram": {
"field": "@timestamp",
"interval": "month"
}
}
}
}
```
{% include copy-curl.html %}
#### Example response
```json
...
"aggregations" : {
"logs_per_month" : {
"buckets" : [
{
"key_as_string" : "2020-10-01T00:00:00.000Z",
"key" : 1601510400000,
"doc_count" : 1635
},
{
"key_as_string" : "2020-11-01T00:00:00.000Z",
"key" : 1604188800000,
"doc_count" : 6844
},
{
"key_as_string" : "2020-12-01T00:00:00.000Z",
"key" : 1606780800000,
"doc_count" : 5595
}
]
}
}
}
```
The response has three months' worth of logs. If you graph these values, you can see the peaks and valleys of request traffic to your website month over month.
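Each bucket `key` is the epoch-millisecond timestamp of the start of its interval. The following Python sketch reproduces that monthly bucketing over a few invented timestamps; the October and November keys it produces match the `key` values in the example response:

```python
from collections import Counter
from datetime import datetime, timezone

def month_key(ts: datetime) -> int:
    """Truncate a timestamp to the start of its month and return
    epoch milliseconds, like a date_histogram bucket key."""
    start = ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    return int(start.timestamp() * 1000)

# Hypothetical log timestamps.
hits = [
    datetime(2020, 10, 5, tzinfo=timezone.utc),
    datetime(2020, 10, 20, tzinfo=timezone.utc),
    datetime(2020, 11, 2, tzinfo=timezone.utc),
]
buckets = Counter(month_key(t) for t in hits)
for key, doc_count in sorted(buckets.items()):
    print({"key": key, "doc_count": doc_count})
```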
---
layout: default
title: Date range
parent: Bucket aggregations
grand_parent: Aggregations
nav_order: 30
redirect_from:
- /query-dsl/aggregations/bucket/date-range/
---
# Date range aggregations
The `date_range` aggregation is conceptually the same as the `range` aggregation, except that it lets you perform date math.
For example, you can get all documents from the last 10 days. To make the date more readable, include the format with a `format` parameter:
```json
GET opensearch_dashboards_sample_data_logs/_search
{
"size": 0,
"aggs": {
"number_of_bytes": {
"date_range": {
"field": "@timestamp",
"format": "MM-yyyy",
"ranges": [
{
"from": "now-10d/d",
"to": "now"
}
]
}
}
}
}
```
{% include copy-curl.html %}
#### Example response
```json
...
"aggregations" : {
"number_of_bytes" : {
"buckets" : [
{
"key" : "03-2021-03-2021",
"from" : 1.6145568E12,
"from_as_string" : "03-2021",
"to" : 1.615451329043E12,
"to_as_string" : "03-2021",
"doc_count" : 0
}
]
}
}
}
```
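The `now-10d/d` expression subtracts 10 days from the current time and then rounds down to the start of the day (`/d`). The following Python sketch resolves just that small subset of OpenSearch date math; it is an illustration, not the full grammar:

```python
from datetime import datetime, timedelta, timezone
import re

def resolve_date_math(expr: str, now: datetime) -> datetime:
    """Resolve a tiny subset of date math: 'now', an optional
    '-<n>d' offset, and an optional '/d' day rounding."""
    m = re.fullmatch(r"now(?:-(\d+)d)?(/d)?", expr)
    if not m:
        raise ValueError(f"unsupported expression: {expr}")
    result = now
    if m.group(1):
        result -= timedelta(days=int(m.group(1)))
    if m.group(2):  # round down to the start of the day
        result = result.replace(hour=0, minute=0, second=0, microsecond=0)
    return result

now = datetime(2021, 3, 11, 8, 30, tzinfo=timezone.utc)
print(resolve_date_math("now-10d/d", now))  # 2021-03-01 00:00:00+00:00
```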
---
layout: default
title: Diversified sampler
parent: Bucket aggregations
grand_parent: Aggregations
nav_order: 40
redirect_from:
- /query-dsl/aggregations/bucket/diversified-sampler/
---
# Diversified sampler
The `diversified_sampler` aggregation reduces bias in the distribution of the sample pool by deduplicating documents that contain the same `field` value. The optional `max_docs_per_value` setting limits the maximum number of documents collected per `field` value on each shard; it defaults to `1`.
As with the [`sampler` aggregation]({{site.url}}{{site.baseurl}}/aggregations/bucket/sampler/), you can use the `shard_size` setting to control the maximum number of documents collected on any one shard, as shown in the following example:
```json
GET opensearch_dashboards_sample_data_logs/_search
{
"size": 0,
"aggs": {
"sample": {
"diversified_sampler": {
"shard_size": 1000,
"field": "response.keyword"
},
"aggs": {
"terms": {
"terms": {
"field": "agent.keyword"
}
}
}
}
}
}
```
{% include copy-curl.html %}
#### Example response
```json
...
"aggregations" : {
"sample" : {
"doc_count" : 3,
"terms" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "Mozilla/5.0 (X11; Linux x86_64; rv:6.0a1) Gecko/20110421 Firefox/6.0a1",
"doc_count" : 2
},
{
"key" : "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)",
"doc_count" : 1
}
]
}
}
}
}
```
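Conceptually, the deduplication is a first-come pass that admits at most `max_docs_per_value` documents per distinct `field` value, up to `shard_size` in total. A Python sketch with invented documents:

```python
def diversified_sample(docs, field, shard_size=100, max_docs_per_value=1):
    """Keep at most max_docs_per_value docs per distinct field value,
    up to shard_size docs total -- a sketch of diversified_sampler."""
    seen = {}
    sample = []
    for doc in docs:
        value = doc[field]
        if seen.get(value, 0) < max_docs_per_value:
            seen[value] = seen.get(value, 0) + 1
            sample.append(doc)
        if len(sample) >= shard_size:
            break
    return sample

# Invented log documents; only the first doc per response code is kept.
docs = [
    {"response": "200", "agent": "Firefox"},
    {"response": "200", "agent": "MSIE"},
    {"response": "404", "agent": "Firefox"},
    {"response": "503", "agent": "Firefox"},
]
print(diversified_sample(docs, field="response"))
```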
---
layout: default
title: Filter
parent: Bucket aggregations
grand_parent: Aggregations
nav_order: 50
redirect_from:
- /query-dsl/aggregations/bucket/filter/
---
# Filter aggregations
A `filter` aggregation is a query clause, exactly like a search query such as `match`, `term`, or `range`. You can use the `filter` aggregation to narrow down the entire set of documents to a specific set before creating buckets.
The following example shows the `avg` aggregation running within the context of a filter. The `avg` aggregation only aggregates the documents that match the `range` query:
```json
GET opensearch_dashboards_sample_data_ecommerce/_search
{
"size": 0,
"aggs": {
"low_value": {
"filter": {
"range": {
"taxful_total_price": {
"lte": 50
}
}
},
"aggs": {
"avg_amount": {
"avg": {
"field": "taxful_total_price"
}
}
}
}
}
}
```
{% include copy-curl.html %}
#### Example response
```json
...
"aggregations" : {
"low_value" : {
"doc_count" : 1633,
"avg_amount" : {
"value" : 38.363175998928355
}
}
}
}
```
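The aggregation narrows the document set first and then runs the sub-aggregation on whatever remains. A Python sketch of the same computation over invented order prices:

```python
def filtered_avg(docs, field, lte):
    """Average `field` over only the docs where field <= lte,
    mirroring an avg sub-aggregation inside a range filter."""
    matching = [d[field] for d in docs if d[field] <= lte]
    return {
        "doc_count": len(matching),
        "avg_amount": {"value": sum(matching) / len(matching) if matching else None},
    }

# Hypothetical order totals; only those <= 50 enter the average.
orders = [{"taxful_total_price": p} for p in (12.5, 48.0, 75.0, 30.0)]
print(filtered_avg(orders, "taxful_total_price", lte=50))
```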
---
layout: default
title: Filters
parent: Bucket aggregations
grand_parent: Aggregations
nav_order: 60
redirect_from:
- /query-dsl/aggregations/bucket/filters/
---
# Filters aggregations
A `filters` aggregation is the same as the `filter` aggregation, except that it lets you use multiple filter aggregations.
While the `filter` aggregation results in a single bucket, the `filters` aggregation returns multiple buckets, one for each of the defined filters.
To create a bucket for all the documents that didn't match any of the filter queries, set the `other_bucket` property to `true`:
```json
GET opensearch_dashboards_sample_data_logs/_search
{
"size": 0,
"aggs": {
"200_os": {
"filters": {
"other_bucket": true,
"filters": [
{
"term": {
"response.keyword": "200"
}
},
{
"term": {
"machine.os.keyword": "osx"
}
}
]
},
"aggs": {
"avg_amount": {
"avg": {
"field": "bytes"
}
}
}
}
}
}
```
{% include copy-curl.html %}
#### Example response
```json
...
"aggregations" : {
"200_os" : {
"buckets" : [
{
"doc_count" : 12832,
"avg_amount" : {
"value" : 5897.852711970075
}
},
{
"doc_count" : 2825,
"avg_amount" : {
"value" : 5620.347256637168
}
},
{
"doc_count" : 1017,
"avg_amount" : {
"value" : 3247.0963618485744
}
}
]
}
}
}
```
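A Python sketch of the bucketing logic, including the extra bucket produced by `other_bucket` for documents that match none of the filters (the sample documents are invented):

```python
def filters_agg(docs, filters, other_bucket=False):
    """Return one doc-count bucket per filter predicate, plus an
    'other' bucket for docs matching none, if other_bucket is set."""
    buckets = [{"doc_count": sum(1 for d in docs if f(d))} for f in filters]
    if other_bucket:
        unmatched = sum(1 for d in docs if not any(f(d) for f in filters))
        buckets.append({"doc_count": unmatched})
    return buckets

logs = [
    {"response": "200", "os": "osx"},   # matches both filters, counted in both buckets
    {"response": "200", "os": "win"},
    {"response": "404", "os": "osx"},
    {"response": "503", "os": "win"},   # matches neither, lands in the other bucket
]
print(filters_agg(
    logs,
    [lambda d: d["response"] == "200", lambda d: d["os"] == "osx"],
    other_bucket=True,
))
```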
---
layout: default
title: Geodistance
parent: Bucket aggregations
grand_parent: Aggregations
nav_order: 70
redirect_from:
- /query-dsl/aggregations/bucket/geo-distance/
---
# Geodistance aggregations
The `geo_distance` aggregation groups documents into concentric circles based on distances from an origin `geo_point` field.
It's the same as the `range` aggregation, except that it works on geo locations.
For example, you can use the `geo_distance` aggregation to find all pizza places within 1 km of you. The results are limited to the 1 km radius you specify, but you can add another range to also capture results within 2 km.
You can only use the `geo_distance` aggregation on fields mapped as `geo_point`.
A point is a single geographical coordinate, such as the current location shown by your smartphone. A point in OpenSearch is represented as follows:
```json
{
"location": {
"type": "point",
"coordinates": {
"lat": 83.76,
"lon": -81.2
}
}
}
```
You can also specify the latitude and longitude as an array `[-81.20, 83.76]` or as a string `"83.76, -81.20"`.
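Shown side by side for illustration, the following field values all describe the same point. Note that the array format lists longitude first, while the string format lists latitude first:
```json
{ "location": { "lat": 83.76, "lon": -81.2 } }
{ "location": [-81.20, 83.76] }
{ "location": "83.76, -81.20" }
```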
This table lists the relevant fields of a `geo_distance` aggregation:
Field | Description | Required
:--- | :--- |:---
`field` | Specify the geopoint field that you want to work on. | Yes
`origin` | Specify the geopoint from which the distances are computed. | Yes
`ranges` | Specify a list of ranges to collect documents based on their distance from the target point. | Yes
`unit` | Define the units used in the `ranges` array. The `unit` defaults to `m` (meters), but you can switch to other units like `km` (kilometers), `mi` (miles), `in` (inches), `yd` (yards), `cm` (centimeters), and `mm` (millimeters). | No
`distance_type` | Specify how OpenSearch calculates the distance. The default is `sloppy_arc` (faster but less accurate), but it can also be set to `arc` (slower but most accurate) or `plane` (fastest but least accurate). Because of high error margins, use `plane` only for small geographic areas. | No
The syntax is as follows:
```json
{
"aggs": {
"aggregation_name": {
"geo_distance": {
"field": "field_1",
"origin": "x, y",
"ranges": [
{
"to": "value_1"
},
{
"from": "value_2",
"to": "value_3"
},
{
"from": "value_4"
}
]
}
}
}
}
```
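For example, the following request, a sketch based on the sample web logs data, uses the `unit` and `distance_type` fields described in the preceding table to measure distances in kilometers using the slower but more accurate `arc` calculation:
```json
GET opensearch_dashboards_sample_data_logs/_search
{
  "size": 0,
  "aggs": {
    "position": {
      "geo_distance": {
        "field": "geo.coordinates",
        "origin": "83.76, -81.2",
        "unit": "km",
        "distance_type": "arc",
        "ranges": [
          {
            "to": 10
          },
          {
            "from": 10
          }
        ]
      }
    }
  }
}
```
{% include copy-curl.html %}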
This example forms buckets for the following distances from a `geo_point` field:
- Less than 10 km
- From 10 to 20 km
- From 20 to 50 km
- From 50 to 100 km
- Above 100 km
```json
GET opensearch_dashboards_sample_data_logs/_search
{
"size": 0,
"aggs": {
"position": {
"geo_distance": {
"field": "geo.coordinates",
"origin": {
"lat": 83.76,
"lon": -81.2
},
"ranges": [
{
"to": 10
},
{
"from": 10,
"to": 20
},
{
"from": 20,
"to": 50
},
{
"from": 50,
"to": 100
},
{
"from": 100
}
]
}
}
}
}
```
{% include copy-curl.html %}
#### Example response
```json
...
"aggregations" : {
"position" : {
"buckets" : [
{
"key" : "*-10.0",
"from" : 0.0,
"to" : 10.0,
"doc_count" : 0
},
{
"key" : "10.0-20.0",
"from" : 10.0,
"to" : 20.0,
"doc_count" : 0
},
{
"key" : "20.0-50.0",
"from" : 20.0,
"to" : 50.0,
"doc_count" : 0
},
{
"key" : "50.0-100.0",
"from" : 50.0,
"to" : 100.0,
"doc_count" : 0
},
{
"key" : "100.0-*",
"from" : 100.0,
"doc_count" : 14074
}
]
}
}
}
```

---
layout: default
title: Geohash grid
parent: Bucket aggregations
grand_parent: Aggregations
nav_order: 80
redirect_from:
- /query-dsl/aggregations/bucket/geohash-grid/
---
# Geohash grid aggregations
The `geohash_grid` aggregation buckets documents for geographical analysis. It organizes a geographical region into a grid of smaller regions of different sizes or precisions. Lower values of precision represent larger geographical areas, and higher values represent smaller, more precise geographical areas. You can aggregate documents on [geopoint]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-point/) or [geoshape]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-shape/) fields using a geohash grid aggregation. One notable difference is that a geopoint is only present in one bucket, but a geoshape is counted in all geohash grid cells with which it intersects.
The number of results returned by a query might be far too many to display each geopoint individually on a map. The `geohash_grid` aggregation buckets nearby geopoints together by calculating the geohash for each point, at the level of precision that you define (between 1 and 12; the default is 5). To learn more about geohashes, see [Wikipedia](https://en.wikipedia.org/wiki/Geohash).
The web logs example data is spread over a large geographical area, so you can use a lower precision value. You can zoom in on this map by increasing the precision value:
```json
GET opensearch_dashboards_sample_data_logs/_search
{
"size": 0,
"aggs": {
"geo_hash": {
"geohash_grid": {
"field": "geo.coordinates",
"precision": 4
}
}
}
}
```
{% include copy-curl.html %}
#### Example response
```json
...
"aggregations" : {
"geo_hash" : {
"buckets" : [
{
"key" : "c1cg",
"doc_count" : 104
},
{
"key" : "dr5r",
"doc_count" : 26
},
{
"key" : "9q5b",
"doc_count" : 20
},
{
"key" : "c20g",
"doc_count" : 19
},
{
"key" : "dr70",
"doc_count" : 18
}
...
]
}
}
}
```
You can visualize the aggregated response on a map using OpenSearch Dashboards.
The more accurate you want the aggregation to be, the more resources OpenSearch consumes because of the number of buckets that the aggregation has to calculate. By default, OpenSearch does not generate more than 10,000 buckets. You can change this behavior by using the `size` attribute, but keep in mind that the performance might suffer for very wide queries consisting of thousands of buckets.
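For example, the following request, a sketch of the same web logs query, uses the `size` parameter to return only the three grid cells containing the most documents:
```json
GET opensearch_dashboards_sample_data_logs/_search
{
  "size": 0,
  "aggs": {
    "geo_hash": {
      "geohash_grid": {
        "field": "geo.coordinates",
        "precision": 4,
        "size": 3
      }
    }
  }
}
```
{% include copy-curl.html %}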
## Aggregating geoshapes
To run an aggregation on a geoshape field, first create an index and map the `location` field as a `geo_shape`:
```json
PUT national_parks
{
"mappings": {
"properties": {
"location": {
"type": "geo_shape"
}
}
}
}
```
{% include copy-curl.html %}
Next, index some documents into the `national_parks` index:
```json
PUT national_parks/_doc/1
{
"name": "Yellowstone National Park",
"location":
{"type": "envelope","coordinates": [ [-111.15, 45.12], [-109.83, 44.12] ]}
}
```
{% include copy-curl.html %}
```json
PUT national_parks/_doc/2
{
"name": "Yosemite National Park",
"location":
{"type": "envelope","coordinates": [ [-120.23, 38.16], [-119.05, 37.45] ]}
}
```
{% include copy-curl.html %}
```json
PUT national_parks/_doc/3
{
"name": "Death Valley National Park",
"location":
{"type": "envelope","coordinates": [ [-117.34, 37.01], [-116.38, 36.25] ]}
}
```
{% include copy-curl.html %}
You can run an aggregation on the `location` field as follows:
```json
GET national_parks/_search
{
"aggregations": {
"grouped": {
"geohash_grid": {
"field": "location",
"precision": 1
}
}
}
}
```
{% include copy-curl.html %}
When aggregating geoshapes, one geoshape can be counted for multiple buckets because it overlaps multiple grid cells:
<details open markdown="block">
<summary>
Response
</summary>
{: .text-delta}
```json
{
"took" : 24,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "national_parks",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"name" : "Yellowstone National Park",
"location" : {
"type" : "envelope",
"coordinates" : [
[
-111.15,
45.12
],
[
-109.83,
44.12
]
]
}
}
},
{
"_index" : "national_parks",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"name" : "Yosemite National Park",
"location" : {
"type" : "envelope",
"coordinates" : [
[
-120.23,
38.16
],
[
-119.05,
37.45
]
]
}
}
},
{
"_index" : "national_parks",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"name" : "Death Valley National Park",
"location" : {
"type" : "envelope",
"coordinates" : [
[
-117.34,
37.01
],
[
-116.38,
36.25
]
]
}
}
}
]
},
"aggregations" : {
"grouped" : {
"buckets" : [
{
"key" : "9",
"doc_count" : 3
},
{
"key" : "c",
"doc_count" : 1
}
]
}
}
}
```
</details>
Currently, OpenSearch supports geoshape aggregation through the API but not in OpenSearch Dashboards visualizations. If you'd like to see geoshape aggregation implemented for visualizations, upvote the related [GitHub issue](https://github.com/opensearch-project/dashboards-maps/issues/250).
{: .note}
## Supported parameters
Geohash grid aggregation requests support the following parameters.
Parameter | Data type | Description
:--- | :--- | :---
field | String | The field on which aggregation is performed. This field must be mapped as a `geo_point` or `geo_shape` field. If the field contains an array, all array values are aggregated. Required.
precision | Integer | The granularity level used to determine grid cells for bucketing results. Cells cannot exceed the specified size (diagonal) of the required precision. Valid values are in the [0, 12] range. Optional. Default is 5.
bounds | Object | The bounding box for filtering geopoints and geoshapes. The bounding box is defined by the upper-left and lower-right vertices. Only shapes that intersect with this bounding box or are completely enclosed by this bounding box are included in the aggregation output. The vertices are specified as geopoints in one of the following formats: <br>- An object with a latitude and longitude<br>- An array in the [`longitude`, `latitude`] format<br>- A string in the "`latitude`,`longitude`" format<br>- A geohash <br>- WKT<br> See the [geopoint formats]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-point#formats) for formatting examples. Optional.
size | Integer | The maximum number of buckets to return. When there are more buckets than `size`, OpenSearch returns buckets with more documents. Optional. Default is 10,000.
shard_size | Integer | The maximum number of buckets to return from each shard. Optional. Default is max (10, `size` &middot; number of shards), which provides a more accurate count of more highly prioritized buckets.
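For example, the following sketch reuses the `national_parks` index created earlier in this section, combining `precision` and `bounds` so that only shapes intersecting the bounding box are bucketed:
```json
GET national_parks/_search
{
  "size": 0,
  "aggregations": {
    "grouped": {
      "geohash_grid": {
        "field": "location",
        "precision": 6,
        "bounds": {
          "top_left": "38, -120",
          "bottom_right": "36, -116"
        }
      }
    }
  }
}
```
{% include copy-curl.html %}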
## Geohash precision
The relationship between geohash precision and the approximate grid cell dimensions is described in the following table.
Precision /<br>geohash length | Latitude bits | Longitude bits | Latitude error | Longitude error | Cell height | Cell width
:---:|:-------------:|:--------------:|:--------------:|:---------------:|:-----------:|:----------:
1 | 2 | 3 | ±23 | ±23 | 4992.6 km | 5009.4 km
2 | 5 | 5 | ±2.8 | ±5.6 | 624.1 km | 1252.3 km
3 | 7 | 8 | ±0.70 | ±0.70 | 156 km | 156.5 km
4 | 10 | 10 | ±0.087 | ±0.18 | 19.5 km | 39.1 km
5 | 12 | 13 | ±0.022 | ±0.022 | 4.9 km | 4.9 km
6 | 15 | 15 | ±0.0027 | ±0.0055 | 609.4 m | 1.2 km
7 | 17 | 18 | ±0.00068 | ±0.00068 | 152.5 m | 152.9 m
8 | 20 | 20 | ±0.000085 | ±0.000172 | 19 m | 38.2 m
9 | 22 | 23 | ±0.000021 | ±0.000021 | 4.8 m | 4.8 m
10 | 25 | 25 | ±0.00000268 | ±0.00000536 | 59.5 cm | 1.2 m
11 | 27 | 28 | ±0.00000067 | ±0.00000067 | 14.9 cm | 14.9 cm
12 | 30 | 30 | ±0.00000008 | ±0.00000017 | 1.9 cm | 3.7 cm

---
layout: default
title: Geohex grid
parent: Bucket aggregations
grand_parent: Aggregations
nav_order: 85
redirect_from:
- /aggregations/geohexgrid/
- /query-dsl/aggregations/geohexgrid/
- /query-dsl/aggregations/bucket/geohex-grid/
---
# Geohex grid aggregations
The Hexagonal Hierarchical Geospatial Indexing System (H3) partitions the Earth's areas into identifiable hexagon-shaped cells.
The H3 grid system works well for proximity applications because it overcomes the limitations of Geohash's non-uniform partitions. Geohash encodes latitude and longitude pairs, so its cells are significantly smaller near the poles than near the equator, where a degree of longitude spans a greater distance. The H3 grid system's distortions, by contrast, are low and limited to 12 of its 122 base cells. These pentagonal cells are placed in low-use areas (for example, in the middle of the ocean), leaving the essential areas error free. Thus, grouping documents based on the H3 grid system provides a better aggregation than the Geohash grid.
The geohex grid aggregation groups [geopoints]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-point/) into grid cells for geographical analysis. Each grid cell corresponds to an [H3 cell](https://h3geo.org/docs/core-library/h3Indexing/#h3-cell-indexp) and is identified using the [H3Index representation](https://h3geo.org/docs/core-library/h3Indexing/#h3index-representation).
## Precision
The `precision` parameter controls the level of granularity that determines the grid cell size. The lower the precision, the larger the grid cells.
The following example illustrates low-precision and high-precision aggregation requests.
To start, create an index and map the `location` field as a `geo_point`:
```json
PUT national_parks
{
"mappings": {
"properties": {
"location": {
"type": "geo_point"
}
}
}
}
```
{% include copy-curl.html %}
Index the following documents into the sample index:
```json
PUT national_parks/_doc/1
{
"name": "Yellowstone National Park",
"location": "44.42, -110.59"
}
```
{% include copy-curl.html %}
```json
PUT national_parks/_doc/2
{
"name": "Yosemite National Park",
"location": "37.87, -119.53"
}
```
{% include copy-curl.html %}
```json
PUT national_parks/_doc/3
{
"name": "Death Valley National Park",
"location": "36.53, -116.93"
}
```
{% include copy-curl.html %}
You can index geopoints in several formats. For a list of all supported formats, see the [geopoint documentation]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-point#formats).
{: .note}
## Low-precision requests
Run a low-precision request that buckets all three documents together:
```json
GET national_parks/_search
{
"aggregations": {
"grouped": {
"geohex_grid": {
"field": "location",
"precision": 1
}
}
}
}
```
{% include copy-curl.html %}
You can use either the `GET` or `POST` HTTP method for geohex grid aggregation queries.
{: .note}
The response groups documents 2 and 3 together because they are close enough to be bucketed in one grid cell:
```json
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "national_parks",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"name" : "Yellowstone National Park",
"location" : "44.42, -110.59"
}
},
{
"_index" : "national_parks",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"name" : "Yosemite National Park",
"location" : "37.87, -119.53"
}
},
{
"_index" : "national_parks",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"name" : "Death Valley National Park",
"location" : "36.53, -116.93"
}
}
]
},
"aggregations" : {
"grouped" : {
"buckets" : [
{
"key" : "8129bffffffffff",
"doc_count" : 2
},
{
"key" : "8128bffffffffff",
"doc_count" : 1
}
]
}
}
}
```
## High-precision requests
Now run a high-precision request:
```json
GET national_parks/_search
{
"aggregations": {
"grouped": {
"geohex_grid": {
"field": "location",
"precision": 6
}
}
}
}
```
{% include copy-curl.html %}
All three documents are bucketed separately because of higher granularity:
```json
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "national_parks",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"name" : "Yellowstone National Park",
"location" : "44.42, -110.59"
}
},
{
"_index" : "national_parks",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"name" : "Yosemite National Park",
"location" : "37.87, -119.53"
}
},
{
"_index" : "national_parks",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"name" : "Death Valley National Park",
"location" : "36.53, -116.93"
}
}
]
},
"aggregations" : {
"grouped" : {
"buckets" : [
{
"key" : "8629ab6dfffffff",
"doc_count" : 1
},
{
"key" : "8629857a7ffffff",
"doc_count" : 1
},
{
"key" : "862896017ffffff",
"doc_count" : 1
}
]
}
}
}
```
## Filtering requests
High-precision requests are resource intensive, so we recommend using a filter like `geo_bounding_box` to limit the geographical area. For example, the following query applies a filter to limit the search area:
```json
GET national_parks/_search
{
"size" : 0,
"aggregations": {
"filtered": {
"filter": {
"geo_bounding_box": {
"location": {
"top_left": "38, -120",
"bottom_right": "36, -116"
}
}
},
"aggregations": {
"grouped": {
"geohex_grid": {
"field": "location",
"precision": 6
}
}
}
}
}
}
```
{% include copy-curl.html %}
The response contains the two documents that are within the `geo_bounding_box` bounds:
```json
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"filtered" : {
"doc_count" : 2,
"grouped" : {
"buckets" : [
{
"key" : "8629ab6dfffffff",
"doc_count" : 1
},
{
"key" : "8629857a7ffffff",
"doc_count" : 1
}
]
}
}
}
}
```
You can also restrict the geographical area by providing the coordinates of the bounding envelope in the `bounds` parameter. Both `bounds` and `geo_bounding_box` coordinates can be specified in any of the [geopoint formats]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-point#formats). The following query uses the well-known text (WKT) "POINT(`longitude` `latitude`)" format for the `bounds` parameter:
```json
GET national_parks/_search
{
"size": 0,
"aggregations": {
"grouped": {
"geohex_grid": {
"field": "location",
"precision": 6,
"bounds": {
"top_left": "POINT (-120 38)",
"bottom_right": "POINT (-116 36)"
}
}
}
}
}
```
{% include copy-curl.html %}
The response contains only the two results that are within the specified bounds:
```json
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"grouped" : {
"buckets" : [
{
"key" : "8629ab6dfffffff",
"doc_count" : 1
},
{
"key" : "8629857a7ffffff",
"doc_count" : 1
}
]
}
}
}
```
The `bounds` parameter can be used with or without the `geo_bounding_box` filter; these two parameters are independent and can have any spatial relationship to each other.
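For example, the following sketch combines both: the `geo_bounding_box` filter restricts the documents that enter the aggregation, while `bounds` independently restricts the grid cells that are produced:
```json
GET national_parks/_search
{
  "size": 0,
  "aggregations": {
    "filtered": {
      "filter": {
        "geo_bounding_box": {
          "location": {
            "top_left": "38, -120",
            "bottom_right": "36, -116"
          }
        }
      },
      "aggregations": {
        "grouped": {
          "geohex_grid": {
            "field": "location",
            "precision": 6,
            "bounds": {
              "top_left": "POINT (-120 38)",
              "bottom_right": "POINT (-116 36)"
            }
          }
        }
      }
    }
  }
}
```
{% include copy-curl.html %}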
## Supported parameters
Geohex grid aggregation requests support the following parameters.
Parameter | Data type | Description
:--- | :--- | :---
field | String | The field that contains the geopoints. This field must be mapped as a `geo_point` field. If the field contains an array, all array values are aggregated. Required.
precision | Integer | The granularity level used to determine grid cells for bucketing results. Cells cannot exceed the specified size (diagonal) of the required precision. Valid values are in the [0, 15] range. Optional. Default is 5.
bounds | Object | The bounding box for filtering geopoints. The bounding box is defined by the upper-left and lower-right vertices. The vertices are specified as geopoints in one of the following formats: <br>- An object with a latitude and longitude<br>- An array in the [`longitude`, `latitude`] format<br>- A string in the "`latitude`,`longitude`" format<br>- A geohash <br>- WKT<br> See the [geopoint formats]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-point#formats) for formatting examples. Optional.
size | Integer | The maximum number of buckets to return. When there are more buckets than `size`, OpenSearch returns buckets with more documents. Optional. Default is 10,000.
shard_size | Integer | The maximum number of buckets to return from each shard. Optional. Default is max (10, `size` &middot; number of shards), which provides a more accurate count of more highly prioritized buckets.

---
layout: default
title: Geotile grid
parent: Bucket aggregations
grand_parent: Aggregations
nav_order: 87
redirect_from:
- /query-dsl/aggregations/bucket/geotile-grid/
---
# Geotile grid aggregations
The geotile grid aggregation groups documents into grid cells for geographical analysis. Each grid cell corresponds to a [map tile](https://en.wikipedia.org/wiki/Tiled_web_map) and is identified using the `{zoom}/{x}/{y}` format. You can aggregate documents on [geopoint]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-point/) or [geoshape]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-shape/) fields using a geotile grid aggregation. One notable difference is that a geopoint is only present in one bucket, but a geoshape is counted in all geotile grid cells with which it intersects.
## Precision
The `precision` parameter controls the level of granularity that determines the grid cell size. The lower the precision, the larger the grid cells.
The following example illustrates low-precision and high-precision aggregation requests.
To start, create an index and map the `location` field as a `geo_point`:
```json
PUT national_parks
{
"mappings": {
"properties": {
"location": {
"type": "geo_point"
}
}
}
}
```
{% include copy-curl.html %}
Index the following documents into the sample index:
```json
PUT national_parks/_doc/1
{
"name": "Yellowstone National Park",
"location": "44.42, -110.59"
}
```
{% include copy-curl.html %}
```json
PUT national_parks/_doc/2
{
"name": "Yosemite National Park",
"location": "37.87, -119.53"
}
```
{% include copy-curl.html %}
```json
PUT national_parks/_doc/3
{
"name": "Death Valley National Park",
"location": "36.53, -116.93"
}
```
{% include copy-curl.html %}
You can index geopoints in several formats. For a list of all supported formats, see the [geopoint documentation]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-point#formats).
{: .note}
## Low-precision requests
Run a low-precision request that buckets all three documents together:
```json
GET national_parks/_search
{
"aggregations": {
"grouped": {
"geotile_grid": {
"field": "location",
"precision": 1
}
}
}
}
```
{% include copy-curl.html %}
You can use either the `GET` or `POST` HTTP method for geotile grid aggregation queries.
{: .note}
The response groups all documents together because they are close enough to be bucketed in one grid cell:
<details open markdown="block">
<summary>
Response
</summary>
{: .text-delta}
```json
{
"took": 51,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 1,
"hits": [
{
"_index": "national_parks",
"_id": "1",
"_score": 1,
"_source": {
"name": "Yellowstone National Park",
"location": "44.42, -110.59"
}
},
{
"_index": "national_parks",
"_id": "2",
"_score": 1,
"_source": {
"name": "Yosemite National Park",
"location": "37.87, -119.53"
}
},
{
"_index": "national_parks",
"_id": "3",
"_score": 1,
"_source": {
"name": "Death Valley National Park",
"location": "36.53, -116.93"
}
}
]
},
"aggregations": {
"grouped": {
"buckets": [
{
"key": "1/0/0",
"doc_count": 3
}
]
}
}
}
```
</details>
## High-precision requests
Now run a high-precision request:
```json
GET national_parks/_search
{
"aggregations": {
"grouped": {
"geotile_grid": {
"field": "location",
"precision": 6
}
}
}
}
```
{% include copy-curl.html %}
All three documents are bucketed separately because of higher granularity:
<details open markdown="block">
<summary>
Response
</summary>
{: .text-delta}
```json
{
"took": 15,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 1,
"hits": [
{
"_index": "national_parks",
"_id": "1",
"_score": 1,
"_source": {
"name": "Yellowstone National Park",
"location": "44.42, -110.59"
}
},
{
"_index": "national_parks",
"_id": "2",
"_score": 1,
"_source": {
"name": "Yosemite National Park",
"location": "37.87, -119.53"
}
},
{
"_index": "national_parks",
"_id": "3",
"_score": 1,
"_source": {
"name": "Death Valley National Park",
"location": "36.53, -116.93"
}
}
]
},
"aggregations": {
"grouped": {
"buckets": [
{
"key": "6/12/23",
"doc_count": 1
},
{
"key": "6/11/25",
"doc_count": 1
},
{
"key": "6/10/24",
"doc_count": 1
}
]
}
}
}
```
</details>
You can also restrict the geographical area by providing the coordinates of the bounding envelope in the `bounds` parameter. Both `bounds` and `geo_bounding_box` coordinates can be specified in any of the [geopoint formats]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-point#formats). The following query uses the well-known text (WKT) "POINT(`longitude` `latitude`)" format for the `bounds` parameter:
```json
GET national_parks/_search
{
"size": 0,
"aggregations": {
"grouped": {
"geotile_grid": {
"field": "location",
"precision": 6,
"bounds": {
"top_left": "POINT (-120 38)",
"bottom_right": "POINT (-116 36)"
}
}
}
}
}
```
{% include copy-curl.html %}
The response contains only the two results that are within the specified bounds:
<details open markdown="block">
<summary>
Response
</summary>
{: .text-delta}
```json
{
"took": 48,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 1,
"hits": [
{
"_index": "national_parks",
"_id": "1",
"_score": 1,
"_source": {
"name": "Yellowstone National Park",
"location": "44.42, -110.59"
}
},
{
"_index": "national_parks",
"_id": "2",
"_score": 1,
"_source": {
"name": "Yosemite National Park",
"location": "37.87, -119.53"
}
},
{
"_index": "national_parks",
"_id": "3",
"_score": 1,
"_source": {
"name": "Death Valley National Park",
"location": "36.53, -116.93"
}
}
]
},
"aggregations": {
"grouped": {
"buckets": [
{
"key": "6/11/25",
"doc_count": 1
},
{
"key": "6/10/24",
"doc_count": 1
}
]
}
}
}
```
</details>
The `bounds` parameter can be used with or without the `geo_bounding_box` filter; these two parameters are independent and can have any spatial relationship to each other.
## Aggregating geoshapes
To run an aggregation on a geoshape field, first create an index and map the `location` field as a `geo_shape`:
```json
PUT national_parks
{
"mappings": {
"properties": {
"location": {
"type": "geo_shape"
}
}
}
}
```
{% include copy-curl.html %}
Next, index some documents into the `national_parks` index:
```json
PUT national_parks/_doc/1
{
"name": "Yellowstone National Park",
"location":
{"type": "envelope","coordinates": [ [-111.15, 45.12], [-109.83, 44.12] ]}
}
```
{% include copy-curl.html %}
```json
PUT national_parks/_doc/2
{
"name": "Yosemite National Park",
"location":
{"type": "envelope","coordinates": [ [-120.23, 38.16], [-119.05, 37.45] ]}
}
```
{% include copy-curl.html %}
```json
PUT national_parks/_doc/3
{
"name": "Death Valley National Park",
"location":
{"type": "envelope","coordinates": [ [-117.34, 37.01], [-116.38, 36.25] ]}
}
```
{% include copy-curl.html %}
You can run an aggregation on the `location` field as follows:
```json
GET national_parks/_search
{
"aggregations": {
"grouped": {
"geotile_grid": {
"field": "location",
"precision": 6
}
}
}
}
```
{% include copy-curl.html %}
When aggregating geoshapes, one geoshape can be counted for multiple buckets because it overlaps with multiple grid cells:
<details open markdown="block">
<summary>
Response
</summary>
{: .text-delta}
```json
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "national_parks",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"name" : "Yellowstone National Park",
"location" : {
"type" : "envelope",
"coordinates" : [
[
-111.15,
45.12
],
[
-109.83,
44.12
]
]
}
}
},
{
"_index" : "national_parks",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"name" : "Yosemite National Park",
"location" : {
"type" : "envelope",
"coordinates" : [
[
-120.23,
38.16
],
[
-119.05,
37.45
]
]
}
}
},
{
"_index" : "national_parks",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"name" : "Death Valley National Park",
"location" : {
"type" : "envelope",
"coordinates" : [
[
-117.34,
37.01
],
[
-116.38,
36.25
]
]
}
}
}
]
},
"aggregations" : {
"grouped" : {
"buckets" : [
{
"key" : "6/12/23",
"doc_count" : 1
},
{
"key" : "6/12/22",
"doc_count" : 1
},
{
"key" : "6/11/25",
"doc_count" : 1
},
{
"key" : "6/11/24",
"doc_count" : 1
},
{
"key" : "6/10/24",
"doc_count" : 1
}
]
}
}
}
```
</details>
Currently, OpenSearch supports geoshape aggregation through the API but not in OpenSearch Dashboards visualizations. If you'd like to see geoshape aggregation implemented for visualizations, upvote the related [GitHub issue](https://github.com/opensearch-project/dashboards-maps/issues/250).
{: .note}
## Supported parameters
Geotile grid aggregation requests support the following parameters.
Parameter | Data type | Description
:--- | :--- | :---
field | String | The field that contains the geopoints. This field must be mapped as a `geo_point` field. If the field contains an array, all array values are aggregated. Required.
precision | Integer | The granularity level used to determine grid cells for bucketing results. Cells cannot exceed the specified size (diagonal) of the required precision. Valid values are in the [0, 29] range. Optional. Default is 7.
bounds | Object | The bounding box for filtering geopoints. The bounding box is defined by the upper-left and lower-right vertices. The vertices are specified as geopoints in one of the following formats: <br>- An object with a latitude and longitude<br>- An array in the [`longitude`, `latitude`] format<br>- A string in the "`latitude`,`longitude`" format<br>- A geohash <br>- WKT<br> See the [geopoint formats]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-point#formats) for formatting examples. Optional.
size | Integer | The maximum number of buckets to return. When there are more buckets than `size`, OpenSearch returns buckets with more documents. Optional. Default is 10,000.
shard_size | Integer | The maximum number of buckets to return from each shard. Optional. Default is max (10, `size` &middot; number of shards), which provides a more accurate count of more highly prioritized buckets.

---
layout: default
title: Global
parent: Bucket aggregations
grand_parent: Aggregations
nav_order: 90
redirect_from:
- /query-dsl/aggregations/bucket/global/
---
# Global aggregations
The `global` aggregation lets you break out of the aggregation context of a filter query. Even if you have included a filter query that narrows down a set of documents, the `global` aggregation aggregates on all documents as if the filter query weren't there. It ignores the filter and implicitly assumes the `match_all` query.
The following example returns the `avg` value of the `taxful_total_price` field from all documents in the index:
```json
GET opensearch_dashboards_sample_data_ecommerce/_search
{
"size": 0,
"query": {
"range": {
"taxful_total_price": {
"lte": 50
}
}
},
"aggs": {
"total_avg_amount": {
"global": {},
"aggs": {
"avg_price": {
"avg": {
"field": "taxful_total_price"
}
}
}
}
}
}
```
{% include copy-curl.html %}
#### Example response
```json
...
"aggregations" : {
"total_avg_amount" : {
"doc_count" : 4675,
"avg_price" : {
"value" : 75.05542864304813
}
}
}
}
```
You can see that the average value for the `taxful_total_price` field is 75.05 and not 38.36, as seen in the `filter` example, where the query restricted the set of documents being aggregated.

---
layout: default
title: Histogram
parent: Bucket aggregations
grand_parent: Aggregations
nav_order: 100
redirect_from:
- /query-dsl/aggregations/bucket/histogram/
---
# Histogram aggregations
The `histogram` aggregation buckets documents based on a specified interval.
With `histogram` aggregations, you can easily visualize the distribution of values in a given range of documents. OpenSearch doesn't return an actual graph, of course; that's what OpenSearch Dashboards is for. But it returns a JSON response that you can use to construct your own graph.
The following example buckets the `bytes` field into intervals of 10,000:
```json
GET opensearch_dashboards_sample_data_logs/_search
{
"size": 0,
"aggs": {
"number_of_bytes": {
"histogram": {
"field": "bytes",
"interval": 10000
}
}
}
}
```
{% include copy-curl.html %}
#### Example response
```json
...
"aggregations" : {
"number_of_bytes" : {
"buckets" : [
{
"key" : 0.0,
"doc_count" : 13372
},
{
"key" : 10000.0,
"doc_count" : 702
}
]
}
}
}
```
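As with other bucket aggregations, you can nest a metric sub-aggregation inside each histogram bucket. The following sketch computes the average number of bytes within each 10,000-byte bucket:
```json
GET opensearch_dashboards_sample_data_logs/_search
{
  "size": 0,
  "aggs": {
    "number_of_bytes": {
      "histogram": {
        "field": "bytes",
        "interval": 10000
      },
      "aggs": {
        "avg_bytes": {
          "avg": {
            "field": "bytes"
          }
        }
      }
    }
  }
}
```
{% include copy-curl.html %}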

---
layout: default
title: Bucket aggregations
has_children: true
has_toc: false
nav_order: 3
redirect_from:
- /opensearch/bucket-agg/
- /query-dsl/aggregations/bucket-agg/
- /query-dsl/aggregations/bucket/
- /aggregations/bucket-agg/
---
# Bucket aggregations
Bucket aggregations categorize sets of documents as buckets. The type of bucket aggregation determines the bucket for a given document.
You can use bucket aggregations to implement faceted navigation (usually placed as a sidebar on a search result landing page) to help your users filter the results.
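The facet counts behind such a sidebar are conceptually just documents grouped into buckets by a field value and counted per bucket. The following Python sketch illustrates that idea with a hypothetical `category` field; a terms bucket aggregation computes the same thing server side.

```python
from collections import Counter

def facet_counts(docs, field):
    # Each distinct field value becomes a bucket; count the documents in each,
    # sorted by count as a terms aggregation returns them by default.
    return Counter(d[field] for d in docs).most_common()

products = [{"category": c} for c in ("shoes", "shirts", "shoes", "hats", "shoes")]
facet_counts(products, "category")  # [('shoes', 3), ('shirts', 1), ('hats', 1)]
```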
## Supported bucket aggregations
OpenSearch supports the following bucket aggregations:
- [Adjacency matrix]({{site.url}}{{site.baseurl}}/aggregations/bucket/adjacency-matrix/)
- [Date histogram]({{site.url}}{{site.baseurl}}/aggregations/bucket/date-histogram/)
- [Date range]({{site.url}}{{site.baseurl}}/aggregations/bucket/date-range/)
- [Diversified sampler]({{site.url}}{{site.baseurl}}/aggregations/bucket/diversified-sampler/)
- [Filter]({{site.url}}{{site.baseurl}}/aggregations/bucket/filter/)
- [Filters]({{site.url}}{{site.baseurl}}/aggregations/bucket/filters/)
- [Geodistance]({{site.url}}{{site.baseurl}}/aggregations/bucket/geo-distance/)
- [Geohash grid]({{site.url}}{{site.baseurl}}/aggregations/bucket/geohash-grid/)
- [Geohex grid]({{site.url}}{{site.baseurl}}/aggregations/bucket/geohex-grid/)
- [Geotile grid]({{site.url}}{{site.baseurl}}/aggregations/bucket/geotile-grid/)
- [Global]({{site.url}}{{site.baseurl}}/aggregations/bucket/global/)
- [Histogram]({{site.url}}{{site.baseurl}}/aggregations/bucket/histogram/)
- [IP range]({{site.url}}{{site.baseurl}}/aggregations/bucket/ip-range/)
- [Missing]({{site.url}}{{site.baseurl}}/aggregations/bucket/missing/)
- [Multi-terms]({{site.url}}{{site.baseurl}}/aggregations/bucket/multi-terms/)
- [Nested]({{site.url}}{{site.baseurl}}/aggregations/bucket/nested/)
- [Range]({{site.url}}{{site.baseurl}}/aggregations/bucket/range/)
- [Reverse nested]({{site.url}}{{site.baseurl}}/aggregations/bucket/reverse-nested/)
- [Sampler]({{site.url}}{{site.baseurl}}/aggregations/bucket/sampler/)
- [Significant terms]({{site.url}}{{site.baseurl}}/aggregations/bucket/significant-terms/)
- [Significant text]({{site.url}}{{site.baseurl}}/aggregations/bucket/significant-text/)
- [Terms]({{site.url}}{{site.baseurl}}/aggregations/bucket/terms/)