mirror of
https://github.com/honeymoose/OpenSearch.git
synced 2025-03-09 14:34:43 +00:00
Data frame rows with missing values for analyzed fields are skipped, so we can be more efficient by adding a query that selects only documents that have values for all analyzed fields. Besides reducing the number of documents we go through, this also gives a more accurate count of how many rows we need, which reduces the memory requirements. This change also adds an integration test that runs outlier detection on data with missing fields.
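
For illustration only, the kind of query described above could look like the following sketch. It is not the code from this change: the helper name `build_required_fields_query` and the sample field names are hypothetical, and it assumes the standard Query DSL `bool` and `exists` clauses, with one `exists` filter per analyzed field wrapped around an optional user-supplied query.

```python
from typing import Any, Dict, List, Optional


def build_required_fields_query(
    analyzed_fields: List[str],
    user_query: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a query matching only documents with a value for every analyzed field.

    Hypothetical helper for illustration; not taken from the actual change.
    """
    # One `exists` clause per analyzed field; filter context skips scoring.
    filters: List[Dict[str, Any]] = [
        {"exists": {"field": field}} for field in analyzed_fields
    ]
    # Keep the caller's own query, if any, as an additional filter clause.
    if user_query is not None:
        filters.insert(0, user_query)
    # With nothing to filter on, fall back to matching every document.
    if not filters:
        return {"match_all": {}}
    return {"bool": {"filter": filters}}


# Example with made-up field names.
query = build_required_fields_query(["cpu", "memory"], {"term": {"host": "a"}})
```

Because every clause sits in filter context, a document missing any one of the fields is excluded, so a count over this query would reflect the rows that are actually used.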