[[analysis-keyword-tokenizer]]
=== Keyword tokenizer

The `keyword` tokenizer is a ``noop'' tokenizer that accepts whatever text it
is given and outputs the exact same text as a single term. It can be combined
with token filters to normalise output, e.g. lower-casing email addresses.

[discrete]
=== Example output

[source,console]
---------------------------
POST _analyze
{
  "tokenizer": "keyword",
  "text": "New York"
}
---------------------------

/////////////////////

[source,console-result]
----------------------------
{
  "tokens": [
    {
      "token": "New York",
      "start_offset": 0,
      "end_offset": 8,
      "type": "word",
      "position": 0
    }
  ]
}
----------------------------

/////////////////////

The above sentence would produce the following term:

[source,text]
---------------------------
[ New York ]
---------------------------

[discrete]
[[analysis-keyword-tokenizer-token-filters]]
=== Combine with token filters

You can combine the `keyword` tokenizer with token filters to normalise
structured data, such as product IDs or email addresses.

For example, the following <<indices-analyze,analyze API>> request uses the
`keyword` tokenizer and <<analysis-lowercase-tokenfilter,`lowercase`>> filter
to convert an email address to lowercase.

[source,console]
---------------------------
POST _analyze
{
  "tokenizer": "keyword",
  "filter": [ "lowercase" ],
  "text": "john.SMITH@example.COM"
}
---------------------------

/////////////////////

[source,console-result]
----------------------------
{
  "tokens": [
    {
      "token": "john.smith@example.com",
      "start_offset": 0,
      "end_offset": 22,
      "type": "word",
      "position": 0
    }
  ]
}
----------------------------

/////////////////////

The request produces the following token:

[source,text]
---------------------------
[ john.smith@example.com ]
---------------------------

[discrete]
=== Configuration

The `keyword` tokenizer accepts the following parameters:

[horizontal]
`buffer_size`::

    The number of characters read into the term buffer in a single pass.
    Defaults to `256`. The term buffer will grow by this size until all the
    text has been consumed. It is advisable not to change this setting.
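As a sketch of how these pieces fit together, an index-creation request along
the following lines could register a custom tokenizer of type `keyword` and
pair it with the `lowercase` filter in a custom analyzer. The index, analyzer,
and tokenizer names are illustrative placeholders, and `buffer_size` is shown
only to indicate where the parameter goes.

[source,console]
---------------------------
PUT my-keyword-index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_keyword_tokenizer",
          "filter": [ "lowercase" ]
        }
      },
      "tokenizer": {
        "my_keyword_tokenizer": {
          "type": "keyword",
          "buffer_size": 256 <1>
        }
      }
    }
  }
}
---------------------------

<1> `buffer_size` is set to its default of `256` purely for illustration; as
noted above, there is normally no reason to change it.

Text analysed with `my_analyzer` would be indexed as a single lower-cased
term, just as in the `_analyze` example above.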