[[analysis-standard-tokenizer]] === Standard Tokenizer A tokenizer of type `standard` providing grammar based tokenizer that is a good tokenizer for most European language documents. The tokenizer implements the Unicode Text Segmentation algorithm, as specified in http://unicode.org/reports/tr29/[Unicode Standard Annex #29]. The following are settings that can be set for a `standard` tokenizer type: [cols="<,<",options="header",] |======================================================================= |Setting |Description |`max_token_length` |The maximum token length. If a token is seen that exceeds this length then it is discarded. Defaults to `255`. |=======================================================================