Having a large number of post-deploy migrations running out-of-numerical-sequence with pre-deploy migrations can be problematic. For example, if we have the sequence
- db/migrate/2017... - add column
- db/post_migrate/2018... - drop the column
- db/migrate/2021... - add the same column again
It will work fine in numerical order. But if you run the pre-deploy migrations **followed by** the post-deploy migrations, you will not get the same result.
Our post-deploy system is designed to allow for seamless upgrades of Discourse. However, it is reasonable for us to only support this totally seamless experience for a limited period of time. This commit moves all post_deploy migrations which are more than 1 year old (i.e. more than 2 major Discourse versions ago) into the regular pre-deploy migrations directory. This limits the impact of any edge cases caused by out-of-numerical-sequence migrations.
The first thing we needed here was an enum rather than a boolean to determine how a directory_column was created. Now we have `automatic`, `user_field` and `plugin` directory columns.
This plugin API is assuming that the plugin has added a migration to a column to the `directory_items` table.
This was created to be initially used by discourse-solved. PR with API usage - https://github.com/discourse/discourse-solved/pull/137/
Users who use encoded slugs on their sites sometimes run into 500 error when pasting a link to another topic in a post. The problem happens when generating a backward "reflection" link that would appear in a linked topic. Link URL restricted on the database level to 500 chars in length. At first glance, it should work since we have a restriction on topic title length.
But it doesn't work when a site uses encoded slugs, like here (take a look at the URL). The link to a topic, in this case, can be much longer than 500 characters.
By the way, an error happens only when generating a "reflection" link and doesn't happen with a direct link, we truncate that link. It works because, in this case, the original long link is still present in the post body and can be used for navigation. But we can't do the same for backward "reflection" links (without rewriting their implementation), the whole link must be saved to the database.
The simplest and cleanest solution will be just to remove the restriction on the database level. Abuse is impossible here since we are already protected by the restriction on topic title length. There aren’t performance benefits in using length-constrained columns in Postgres, in fact, length-constrained columns need a few extra CPU cycles to check the length when storing data.
The excerpt field in the database is constrained to 1000 chars in length. To support this constraint we added a restriction that the topic_excerpt_maxlength setting must be between 0 and 999.
Unfortunately, sometimes it doesn’t work because:
- topic_excerpt_maxlength restricts the length of a visible to user excerpt. But we HTML-escape text before saving. If an excerpt contains & it’ll be &. One character for the user but 5 characters to save to the database. So if topic_excerpt_maxlength is set to 999 it’s not so hard to have an excerpt of for example 1003 characters in length and run into this issue.
- It’s possible to define a custom excerpt for a topic. Such excerpts bypass check for a length. So if the user defines a too long custom excerpt he will run into this issue.
Removing the constraint on the database level solves the problem. But we still need the constraint for topic_excerpt_maxlength on the setting page, because too long excerpts would make UI wonky.
Because bookmarks have both topic and post ID, when the post was moved into another topic the bookmark was still attached to the post but did not show in the UI. This PR makes it so the all topic IDs for bookmarks attached to a post are updated when a post is moved.
Also included is a migration to fix affected records (e.g. on Meta there are 20 affected records).
See: https://meta.discourse.org/t/improved-bookmarks-with-reminders/144542/203
I was adding specs to ensure that post actions and uploads are removed for permanently deleted posts.
I noticed that post revisions were not permanently destroyed. I added a migration to fix old data.
* FIX: optimise MoveNewSinceToTable
Avoids shuffling all ids around to the app (only use min / max)
Ensure the query for boundaries is ordered by user_id
The bug was mentioned on meta: https://meta.discourse.org/t/pressing-dismiss-new-doesnt-clear-new-topics/179858
Problem is that sometimes the user has TopicUser records with `last_read_post_number` set as NULL. In that case, the topic is still "new" to them and should be dismissed when they click dismiss button.
In addition, I added that condition to post_migration and bumped the number to fix existing records. Migration is written to be idempotent so it will make no harm to already deployed instances.
Original PR was reverted because of broken migration https://github.com/discourse/discourse/pull/12058
I fixed it by adding this line
```
AND topics.id IN(SELECT id FROM topics ORDER BY created_at DESC LIMIT :max_new_topics)
```
This time it is left joining a limited amount of topics. I tested it on few databases and it worked quite smooth
Because of a where clause of duration_minutes != duration, where
duration_minutes was NULL, the previous migration to fill the new
duration_minutes column failed. This corrects the failed migration
by just running the update where duration_minutes is NULL and duration
IS NOT NULL.
Previous commit is 4af77f1
Run the 'MigrateSearchDataAfterDefaultLocaleRename' post migration in batches of 500k records.
This will hopefully prevent any potential deadlocks on large tables.
The problem with this index is that on sites with a high non-pm to pm
posts ratio, the index is esstentially duplicating the existing index on
`PostSearchData#search_data`. If the site is huge, the index ends up
taking up more diskspace.
Very large batches can take an enormous amount of time due to churn
Limiting to 200k changes at a time gives us a far larger chance of finishing
the job without timing out or deadlocking.
A giant transaction in a post migration can be very risky.
This splits the large amount of work this migration needs to do into 2 parts:
1. A re-runnable cleanup job prior to transaction
2. A minimally sized transaction to add the database constraint
This avoids large amounts of churn on the table
Because previous migration was already deployed and some databases were already migrated, I needed to add some conditions to the migration.
Previous migration - https://github.com/discourse/discourse/blob/master/db/post_migrate/20200629232159_rename_path_whitelist_to_allowed_paths.rb
What will happen in a scenario when previous migration was not run.
1. column allowed_paths will be created
2. allowed_path will be populated with data from path_whitelist
3. path_whitelist column will be dropped
What will happen in a scenario when previous migration was already run.
1. column allowed_paths will not be created because already exists - `unless column_exists?(:embeddable_hosts, :allowed_paths)`
2. Data will not be copied because path_whitelist is missing - `if column_exists?(:embeddable_hosts, :path_whitelist) && column_exists?(:embeddable_hosts, :allowed_paths)`
3. path_whitelist column deletion will be skipped - `if column_exists?(:embeddable_hosts, :path_whitelist)`
This adds an option to "delete on owner reply" to bookmarks. If you select this option in the modal, then reply to the topic the bookmark is in, the bookmark will be deleted on reply.
This PR also changes the checkboxes for these additional bookmark options to an Integer column in the DB with a combobox to select the option you want.
The use cases are:
* Sometimes I will bookmark the topics to read it later. In this case we definitely don’t need to keep the bookmark after I replied to it.
* Sometimes I will read the topic in mobile and I will prefer to reply in PC later. Or I may have to do some research before reply. So I will bookmark it for reply later.
This is a no-op for modern installations. It only affects older sites.
Follow-up to f7d676dce1
This needs to be done in a transaction, because we need to drop and recreate the badge_posts view.
This reverts commit 20780a1eee.
* SECURITY: re-adds accidentally reverted commit:
03d26cd6: ensure embed_url contains valid http(s) uri
* when the merge commit e62a85cf was reverted, git chose the 2660c2e2 parent to land on
instead of the 03d26cd6 parent (which contains security fixes)
* PERF: Dematerialize topic_reply_count
It's only ever used for trust level promotions that run daily, or compared to 0. We don't need to track it on every post creation.
* UX: Add symbol in TL3 report if topic reply count is capped
* DEV: Drop user_stats.topic_reply_count column
We need to recreate the badge_posts view each time we change
column on the posts table.
The p.* is auto expanded on view create, cascade should not have
been used
avg_time on posts and topics have not been used in a year.
This uses a re-runnable ddl transaction diasabled migration to
drop the column, cause it touchs very high traffic table and may
deadlock
This reverts commit 6f9177e2ed.
We decided on a completely different approach to the problem.
Instead we will let blocked emails be treated as canonical.
Introduce the concept of "high priority notifications" which include PM and bookmark reminder notifications. Now bookmark reminder notifications act in the same way as PM notifications (float to top of recent list, show in the green bubble) and most instances of unread_private_messages in the UI have been replaced with unread_high_priority_notifications.
The user email digest is changed to just have a section about unread high priority notifications, the unread PM section has been removed.
A high_priority boolean column has been added to the Notification table and relevant indices added to account for it.
unread_private_messages has been kept on the User model purely for backwards compat, but now just returns unread_high_priority_notifications count so this may cause some inconsistencies in the UI.
This syncs the value of the `reply_id` column into the `reply_post_id` column until all servers have been deployed and the post migrations ran.
Follow-up to ab07b945c2
Temporarily recreate already dropped functions in the discourse_functions schema in order to allow restoring of backups which still reference dropped functions.
DEV: deprecate `invite.via_email` in favor of `invite.emailed_status`
This commit adds a new column `emailed_status` in `invites` table for
tracking email sending status.
0 - not required
1 - pending
2 - bulk pending
3 - sending
4 - sent
For normal email invites, invite record is created with emailed_status
set to 'pending'.
When bulk invites are sent invite record is created with emailed_status
set to 'bulk pending'.
For invites that generates link, invite record is created with
emailed_status set to 'not required'.
When invite email is in queue emailed_status is updated to 'sending'
Once the email is sent via `InviteEmail` job the invite emailed_status
is updated to 'sent'.
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.
Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
This is a feature that used to be present in discourse-assign but is
much easier to implement in core. It also allows a topic to be assigned
without it claiming for review and vice versa and allows it to work with
category group reviewers.
Includes support for flags, reviewable users and queued posts, with REST API
backwards compatibility.
Co-Authored-By: romanrizzi <romanalejandro@gmail.com>
Co-Authored-By: jjaffeux <j.jaffeux@gmail.com>
Migrates email user options to a new data structure, where `email_always`, `email_direct` and `email_private_messages` are replace by
* `email_messages_level`, with options: `always`, `only_when_away` and `never` (defaults to `always`)
* `email_level`, with options: `always`, `only_when_away` and `never` (defaults to `only_when_away`)
* Doing it in a post migration was a bad idea
because the migration will fail if the site
is down while trying to download uploads
which points to the instance. This mainly
affects self-hosters using `discourse_docker`
where `./launcher rebuild` will take the
existing container down.
This moves us away from the delayed drops pattern which
was problematic on two counts. First, it uses a hardcoded "delay for"
duration which may be too short for certain deployment strategies.
Second, delayed drop doesn't ensure that it only runs after
the latest application code has been deployed. If the migration runs
and the application code fails to deploy, running the migration after
"delay for" has been met will cause the application to blow up.
The new strategy allows post deployment migrations to be skipped if the
env `SKIP_POST_DEPLOYMENT_MIGRATIONS` is provided.
```
SKIP_POST_DEPLOYMENT_MIGRATIONS=1 rake db:migrate
-> deploy app servers
SKIP_POST_DEPLOYMENT_MIGRATIONS=0 rake db:migrate
```
To aid with the generation of a post deployment migration, a generator
has been added. Simply run `rails generate post_migration`.