Commit Graph

25 Commits

Author SHA1 Message Date
Régis Hanol aa1138ff71
FIX: reindex_search job should work on model with no search data (#11819)
Lots of changes but it's mostly a refactoring.

The interesting part that was fix are the 'load_problem_<model>_ids' methods.
They will now return records with no search data associated so they can be properly indexed for the search.
This "bad" state usually happens after a migration.
2021-01-25 11:23:36 +01:00
Guo Xiang Tan 069a109cbb
DEV: Require scheduled job in development to avoid loading file twice.
This removes the need to memoize constant in order to avoid the "warning: already initialized constant".
2020-09-01 10:14:40 +08:00
Guo Xiang Tan 3766122a82
DEV: Allow developmental post search index versions. 2020-07-23 15:19:46 +08:00
Guo Xiang Tan 609ba50fe8
DEV: Add more granularity to `SearchIndexer` versions.
Sometimes, we just want to reindex a specific model and not all the
things.
2020-07-23 14:24:06 +08:00
Sam Saffron 9ffc022cf4
DEV: improve verbose mode for reindexer
This makes the verbose mode provide a bit of progress notification
while reindexing as it can take many hours to do a giant site
2020-06-24 17:29:45 +10:00
Sam Saffron dcad720a4c
DEV: add optional verbose logging to re-index job
This verbose logging can be useful when executing the job by hand
for debugging purposes

In general people will not use this
2020-06-24 15:37:08 +10:00
Sam Saffron 451e9c4bb9
DEV: minor SQL formatting change
Moved join prior to left join to make query less confusing.

Has no material impact on performance.
2020-05-12 16:55:42 +10:00
Joffrey JAFFEUX b6ef473a31
DEV: prevents redefinition of various constants (#8199)
CLEANUP_GRACE_PERIOD
MAX_AWARDED
POLL_MAILBOX_TIMEOUT_ERROR_KEY
2019-10-17 09:18:07 +02:00
Krzysztof Kotlarek 427d54b2b0 DEV: Upgrading Discourse to Zeitwerk (#8098)
Zeitwerk simplifies working with dependencies in dev and makes it easier reloading class chains. 

We no longer need to use Rails "require_dependency" anywhere and instead can just use standard 
Ruby patterns to require files.

This is a far reaching change and we expect some followups here.
2019-10-02 14:01:53 +10:00
Sam Saffron 22abad4151 PERF: stop reindexing and skipping deleted posts 2019-06-04 17:53:35 +10:00
Sam Saffron 74c4f926fc FIX: drop deleted posts from search index
This does two things

1. Our "index grace period" has been wound down to 1 day, there is no point
keeping a bloated index for a week, usually when people delete stuff they
mean for it to be removed

2. We were never dropping deleted posts from the index, only posts from
deleted topics

These changes speed up search a tiny bit and reduce background work.
2019-06-04 17:19:59 +10:00
Sam Saffron 77300c1d8d DEV: reindex old data in a more consistent way
Previously we were grabbing arbitrary rows in many cases which makes
diagnosing issues in the indexer more complex
2019-06-04 11:47:24 +10:00
Sam Saffron 30990006a9 DEV: enable frozen string literal on all files
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.

Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
2019-05-13 09:31:32 +08:00
Guo Xiang Tan 108c231d1c FIX: Clean up `topic_search_data` of trashed topics.
This keeps the index and table smaller.
2019-04-08 16:53:39 +08:00
Guo Xiang Tan c4997ce85f FIX: Don't try and reindex posts that have been trashed. 2019-04-08 16:38:43 +08:00
Guo Xiang Tan d151425353
PERF: Delete search data of posts from trashed topics periodically. (#7302)
This keeps both the index and table smaller.
2019-04-03 10:10:41 +08:00
Guo Xiang Tan d8704c11ca PERF: Better use of index when queueing a topci for search reindex.
Also move `Search::INDEX_VERSION` to `SearchIndexer` which is where the
version is actually being used.
2019-04-02 09:53:37 +08:00
Guo Xiang Tan aa2311a7b0 FIX: Don't reindex posts belonging to a deleted topic for search.
Posts belonging to a deleted topic can't be index for search so we need
to avoid loading those post ids.
2019-04-02 07:36:53 +08:00
Guo Xiang Tan 3fc5dbb045 FIX: Don't attempt to reindex posts that have an empty raw.
If the post ids keep loading, we might end up in a situations where
we're always loading the same post ids over and over again without
indexing anything new.

Follow up to daeda80ada.
2019-04-02 07:13:33 +08:00
Guo Xiang Tan daeda80ada
FIX: Don't index posts with empty `Post#raw` for search. (#7263)
* DEV: Remove unnecessary join in `Jobs::ReindexSearch`.

* FIX: Don't index posts with empty `Post#raw` for search.
2019-04-01 10:06:27 +08:00
Sam 86d12bd44b FEATURE: search within title using in:title
Also

- Significantly improved search ranking, title is treated most strongly
- Adds tag names to the index
- Run search re-indexer more aggressively
- Re-index topic and all posts on category change
2018-02-20 14:41:21 +11:00
Neil Lalonde 2c56f8df7c FEATURE: show tags in search results 2017-08-25 11:52:59 -04:00
Sam Saffron 56f7b4e01e PERF: reindex search data without loading large post counts 2017-08-16 08:18:59 -04:00
Erick Guan 6e59149a77 FIX: rebuild index when engine replaced (#5021) 2017-08-16 07:38:34 -04:00
Sam 760e9a756d PERF: push reindex job to daily 2014-07-01 10:09:55 +10:00