Commit Graph

126 Commits

Author SHA1 Message Date
cocococosti 4264fe0d67
Adding primary group id 2024-10-22 13:01:18 -04:00
cocococosti 93d6a8f17e
Adding titles import 2024-10-22 12:48:18 -04:00
cocococosti b5178dbab7
Removing more chat stuff 2024-10-22 12:15:18 -04:00
cocococosti 2b5f8990e3
Removing chats import 2024-10-22 11:09:58 -04:00
cocococosti 7197c4ce6d
Minor fix 2024-10-21 21:30:36 -04:00
cocococosti 59bbbcc961
Adding signatures import 2024-10-21 00:43:01 -04:00
cocococosti a41b04eb17
Adding missing user column to user import 2024-10-17 15:15:07 -04:00
cocococosti b430e00a0c
fix 2024-10-17 15:15:07 -04:00
cocococosti 0f20356538
Adding flair group id to user import 2024-10-17 15:15:07 -04:00
cocococosti cf76e56b50
Adding trust levels to user import. Adding existing groups. 2024-10-17 15:15:06 -04:00
cocococosti 2fe04ccdec
Adding existing groups to group members import 2024-10-17 15:15:06 -04:00
Gerhard Schlager 8d6ab2a098
Import more data about badges 2024-10-17 15:15:05 -04:00
Gerhard Schlager b9f4f51d9d
Fix ACL during check of uploads 2024-10-17 15:15:05 -04:00
Gerhard Schlager 1acdb1d046
Handle missing usernames in mentions
The warning about missing users/groups in mentions can be disabled by setting the NO_MENTION_WARNINGS env variable.
2024-10-17 15:15:05 -04:00
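The opt-out described in the commit above could be sketched roughly as follows. This is an illustration only, not the importer's code: the method name `warn_missing_mention`, its parameters, and the rule that any set value disables the warning are assumptions. Only the variable name `NO_MENTION_WARNINGS` comes from the commit message.

```ruby
# Hypothetical helper, not taken from the import script.
# Assumption: setting NO_MENTION_WARNINGS to any value silences the warning.
def warn_missing_mention(name, env: ENV, out: $stderr)
  return false if env.key?("NO_MENTION_WARNINGS")

  out.puts "Warning: mention of unknown user or group '#{name}'"
  true
end

# With the variable set, nothing is printed and the method returns false.
warn_missing_mention("missing_user", env: { "NO_MENTION_WARNINGS" => "1" })
```

Passing `env` and `out` as arguments keeps the sketch easy to exercise without touching the real process environment.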
cocococosti 08521f59ca
Adding import of topic custom fields 2024-10-17 15:15:05 -04:00
cocococosti 1008a064ee
fix so the generic importer works with already existing categories 2024-10-17 15:15:04 -04:00
cocococosti 0572d752be
Add like count and support for uppercase tags 2024-10-17 15:15:04 -04:00
cocococosti 7f9649b467
Add acting user to suspended users 2024-10-17 15:14:56 -04:00
Alan Guo Xiang Tan 322a3be2db
DEV: Remove logical OR assignment of constants (#29201)
Constants should always be only assigned once. The logical OR assignment
of a constant is a relic of the past before we used zeitwerk for
autoloading and had bugs where a file could be loaded twice resulting in
constant redefinition warnings.
2024-10-16 10:09:07 +08:00
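The difference between the two assignment styles mentioned in the commit above can be sketched like this. The module and constant names are hypothetical and chosen only for illustration.

```ruby
# Hypothetical module and constants, for illustration only.
module ImporterDefaults
  # Plain assignment: the constant is defined exactly once when the file loads.
  BATCH_SIZE = 1000

  # The older defensive pattern: `||=` only assigns when the constant is not
  # yet defined, so loading the file twice would not trigger a redefinition
  # warning. It would also silently ignore a changed value.
  LEGACY_BATCH_SIZE ||= 1000
  LEGACY_BATCH_SIZE ||= 5000 # no-op: the constant is already defined
end

puts ImporterDefaults::BATCH_SIZE        # 1000
puts ImporterDefaults::LEGACY_BATCH_SIZE # 1000
```

With an autoloader that loads each file once, the plain form is sufficient and makes an accidental second definition visible as a warning.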
Selase Krakani dd34f1927b
FIX: Imports of upload-only chat messages (#29162)
The current implementation adds a "note" for chat messages with empty
messages, however chat messages with only uploads are allowed. This change
allows such messages to be imported.
2024-10-10 15:18:10 +00:00
Selase Krakani 9825bde811
DEV: Generic bulk chat import support (#28697)
* DEV: WIP generic bulk chat import support

This first iteration implements bulk import for:

* direct_messages
* chat_channels
* user_chat_channel_memberships
* chat_threads
* user_chat_thread_memberships
* chat_messages
* chat_reactions
* chat_mentions

* DEV: Refactor raw placeholder interpolation to support chat messages

This change adds support for chat message placeholder interpolation
and switches to using `Chat::Message.cook` for cooking in the interim
instead of hand-cooking chat messages like we currently do for posts

* DEV: Extend upload references import to support chat message uploads

* DEV: Explicitly set chat retention

- Set both channel and dm chats to 0
- Add temporary workaround for testing only chat imports

* DEV: Compute channel and thread membership metadata

Compute and set various user channel/thread membership stats and
remove hardcoded test index seed data

* FIX: Fix chat reactions import

Allow multiple reactions on a message by a user
2024-10-08 11:55:30 +00:00
Gerhard Schlager d4379af7f2
FIX: Import script didn't set `public` attribute of polls (#28864) 2024-10-02 20:02:13 +02:00
Selase Krakani d896f5cb70
DEV: Include post and topic attributes in imported quotes (#27851)
Currently, quotes imported via generic bulk import script do not include
references to the quoted post. This change includes both topic and post attributes
in a quote if the placeholder metadata includes a `post_id`
2024-07-11 16:47:21 +00:00
Gerhard Schlager 7c26d5d084
FIX: Import script was broken after upgrade of sqlite3 gem (#27648) 2024-07-02 17:38:15 +10:00
Selase Krakani f2c4474c1e
DEV: Improve user generic bulk importer anonymization (#27307)
* DEV: Improve user generic bulk importer anonymization

Add support for properly anonymizing:
 - email
 - date_of_birth
 - location
 - website
 - bio

* DEV: Remove unneeded anon username check in `import_user_emails`
2024-06-05 11:25:17 +00:00
Loïc Guitaut 2a28cda15c DEV: Update to latest rubocop-discourse 2024-05-27 18:06:14 +02:00
Selase Krakani 949c70372c
DEV: Add support for various fields in generic bulk importer (#27114)
* user_profiles - `location`
* users - `date_of_birth`
* topics - `pinned_at`, `pinned_until`, `pinned_globally`

This also includes changes to correctly import PMs. Currently PM topics
are skipped because of a check in `import_users` step which requires `category_id`
to be present.
2024-05-24 13:46:06 +02:00
Selase Krakani 61e12aaebe
FEATURE: Extend PM recipient bulk imports (#27063)
* FIX: Support multiple topic allowed user imports

* FEATURE: Add topic allowed groups import support
2024-05-17 13:45:20 +02:00
Ítalo Alves 73481e8f45
FIX: Add check for existing provider_uids to generic import (#26914)
Co-authored-by: Gerhard Schlager <gerhard.schlager@discourse.org>
2024-05-17 11:36:31 +02:00
Gerhard Schlager 1872047053
DEV: Uploads import script can download files (#26816)
2024-05-04 22:48:16 +02:00
Gerhard Schlager e3882a0c48
DEV: Add support for `user_associated_accounts` to import script (#26779) 2024-04-29 19:48:32 +02:00
Gerhard Schlager a538e2f153
DEV: Import script should use case-insensitive check for tag names (#26699) 2024-04-29 19:27:28 +02:00
Gerhard Schlager 4d045bfc61
DEV: Import script should insert more data into `user_stats` table (#26551)
This SQL tries to insert as much data as possible into the `user_stats` table by either calculating or by approximating stats based on existing data. It also fixes an error in the calculation of `reply_count` which mistakenly contained all posts, not just replies.

This change also disables some steps in the `import:ensure_consistency` rake task by setting the `SKIP_USER_STATS` env variable. Otherwise, the rake task will overwrite the calculated data in the `user_stats` table with inaccurate data. I'm not changing or removing the logic from the rake task yet because other bulk import scripts seem to depend on it.
2024-04-11 14:05:21 +02:00
Gerhard Schlager bc98740205
DEV: Improve generic import script (#25972)
* FEATURE: Import into `category_users` table
* FIX: Failed to import `user_options` unless `timezone` was set
* FIX: Prevent reusing original `id` from intermediate DB in `user_fields`
* FEATURE: Order posts by `post_number` if available
* FEATURE: Allow `[mention]` placeholder to reference users by "id" or "name" (username)
* FEATURE: Support `[quote]` placeholders in posts
* FEATURE: Support `[link]` placeholders in posts
* FEATURE: Support all kinds of permalinks and remove support for `old_relative_url`
* PERF: Speed up pre-cooking by removing DB lookups
2024-03-05 22:23:36 +01:00
Gerhard Schlager 38ff1a38bd
DEV: Improve uploads_importer script (#25971)
* Print instructions when the `sqlite3` gem can't be loaded
* Use `display_filename` instead of `filename` if available
* Support uploading for a multisite
2024-03-05 16:27:45 +01:00
Gerhard Schlager 241bf48497 DEV: Allow rebakes to generate optimized images at the same time
Previously only Sidekiq was allowed to generate more than one optimized image at the same time per machine. This adds an easy mechanism to allow the same in rake tasks and other tools.
2024-01-16 14:33:16 +01:00
Gerhard Schlager dc8c6b8958 DEV: Lots of improvements to the generic_bulk import script
Notable changes:
* Imports a lot more tables from core and plugins
  * site settings
  * uploads with necessary upload references
  * groups and group members
  * user profiles
  * user options
  * user fields & values
  * muted users
  * user notes (plugin)
  * user followers (plugin)
  * user avatars
  * tag groups and tags
  * tag users (notification settings for tags / user)
  * category permissions
  * polls with options and votes
  * post votes (plugin)
  * solutions (plugin)
  * gamification scores (plugin)
  * events (plugin)
  * badges and badge groupings
  * user badges
  * optimized images
  * topic users (notification settings for topics)
  * post custom fields
  * permalinks and permalink normalizations

* It creates the `migration_mappings` table which is used to store the mapping for a handful of imported tables

* Detects duplicate group names and renames them

* Pre-cooking for attachments, images and mentions

* Outputs instructions when gems are missing

* Supports importing uploads from a DB generated by `uploads_importer.rb`

* Checks that all required plugins exist and enables them if needed

* A couple of optimizations and additions in `import.rake`
2023-12-11 16:23:07 +01:00
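The "detects duplicate group names and renames them" step in the commit above might look roughly like the sketch below. The helper name, the case-insensitive comparison, and the numeric-suffix scheme are all assumptions made for illustration; the actual script may resolve collisions differently.

```ruby
# Hypothetical helper, not taken from the import script.
# Assumption: names collide case-insensitively and duplicates receive a
# numeric suffix such as "_2", "_3", and so on.
def dedupe_group_names(names)
  seen = {}
  names.map do |name|
    candidate = name
    suffix = 1
    while seen.key?(candidate.downcase)
      suffix += 1
      candidate = "#{name}_#{suffix}"
    end
    seen[candidate.downcase] = true
    candidate
  end
end

p dedupe_group_names(%w[staff Staff moderators staff])
# => ["staff", "Staff_2", "moderators", "staff_3"]
```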
Gerhard Schlager d725b3ca9e DEV: Add script for preprocessing uploads as part of a migration
This script preprocesses all uploads within an intermediate DB (output of converters) and uploads those files to S3. It does the same for optimized images. This speeds up migrations when you have to run them multiple times, because you only have to preprocess and upload the files once.

This script is very hacky and mostly undocumented for now. That will change in the future.
2023-12-11 16:23:07 +01:00
Jarek Radosz 694b5f108b
DEV: Fix various rubocop lints (#24749)
These (21 + 3 from previous PRs) are soon to be enabled in rubocop-discourse:

Capybara/VisibilityMatcher
Lint/DeprecatedOpenSSLConstant
Lint/DisjunctiveAssignmentInConstructor
Lint/EmptyConditionalBody
Lint/EmptyEnsure
Lint/LiteralInInterpolation
Lint/NonLocalExitFromIterator
Lint/ParenthesesAsGroupedExpression
Lint/RedundantCopDisableDirective
Lint/RedundantRequireStatement
Lint/RedundantSafeNavigation
Lint/RedundantStringCoercion
Lint/RedundantWithIndex
Lint/RedundantWithObject
Lint/SafeNavigationChain
Lint/SafeNavigationConsistency
Lint/SelfAssignment
Lint/UnreachableCode
Lint/UselessMethodDefinition
Lint/Void

Previous PRs:
Lint/ShadowedArgument
Lint/DuplicateMethods
Lint/BooleanSymbol
RSpec/SpecFilePathSuffix
2023-12-06 23:25:00 +01:00
Constanza 28f27b2490
DEV: Adding polls, solutions, upload references and other improvements to the Discourse merger script (#23689) 2023-11-16 14:32:53 +01:00
David Taylor 8a5d97ef3f
DEV: Update importers from PostUpload to UploadReference (#23681)
Discourse stopped using PostUpload in 9db8f00b3d. Since then, these importers have been writing to the table, but any data was totally unused. This commit updates the easy cases to use UploadReference, and adds an error to the discourse_merger import script, which needs more significant work.
2023-09-27 15:01:04 +01:00
Gerhard Schlager 0b29dc5d38 DEV: Add experimental generic bulk import script 2023-08-09 20:56:14 +02:00
David Taylor 436b3b392b
DEV: Apply syntax_tree formatting to `script/*` 2023-01-09 11:13:22 +00:00
Leonardo Mosquera bfecbde837
Fixes for vBulletin bulk importer (#17618)
* Allow taking table prefix from env var

* FIX: remove unused column references

The columns `filedata` and `extension` are not present in a v4.2.4
database, and they aren't used in the method anyways.

* FIX: report progress for tables without imported_id

* FIX: effectively check for AR validation errors

NOTE: other migration scripts also have this problem; see /t/58202

* FIX: properly count Posts when importing attachments

* FIX: improve logging

* Remove leftover comment

* FIX: show progress when exporting Permalink file

* PERF: stream Permalink file

The current way results in tons of memory usage; write once per line instead

* Document fixes needed

* WIP - deduplicate category names

* Ignore non alphanumeric chars for grouping

* FIX: properly deduplicate user emails by merging accounts

* FIX: don't merge empty UserEmails

* Improve logging

* Merge users AFTER fixing primary key sequences

* Parallelize user merging

* Save duplicated users structure for debugging purposes

* Add progress logging for the (multiple hour) user merging step
2022-11-28 16:30:19 -03:00
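The "PERF: stream Permalink file" item in the commit above describes writing once per line instead of building the whole file in memory. A minimal sketch of that idea follows. The helper name, the `old_path,new_path` line format, and the sample paths are assumptions for illustration and are not taken from the vBulletin importer.

```ruby
require "tempfile"

# Hypothetical helper. Each mapping is written as soon as it is produced, so
# the full file contents never have to be held in memory as one string.
# Assumption: a simple "old_path,new_path" line format.
def export_permalinks(io, mappings)
  count = 0
  mappings.each do |old_path, new_path|
    io.puts("#{old_path},#{new_path}")
    count += 1
  end
  count
end

Tempfile.create("permalinks") do |file|
  # Made-up sample data.
  mappings = [["/old/thread-1", "/t/1"], ["/old/thread-2", "/t/2"]]
  written = export_permalinks(file, mappings.each)
  puts "wrote #{written} lines"
end
```

Because `mappings` only needs to respond to `each`, it can be an enumerator backed by a database cursor rather than a fully loaded array.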
Loïc Guitaut ab6ca78486 FIX: Use proper ActiveRecord method in import scripts
`ActiveRecord::Base.connection_config` has been deprecated since Rails
6.1 and was completely removed from Rails 7.
Instead we need to use
`ActiveRecord::Base.connection_db_config.configuration_hash`.

Import scripts were forgotten when we did the Rails 7 upgrade; this
patch fixes them.
2022-05-09 11:09:27 +02:00
Michael Brown 3bf3b9a4a5 DEV: pull email address validation out to a new EmailAddressValidator
We validate the *format* of email addresses in many places with a match against
a regex, often with very slightly different syntax.

Adding a separate EmailAddressValidator simplifies the code in a few spots and
feels cleaner.

Deprecated the old location in case someone is using it in a plugin.

No functionality change is in this commit.

Note: the regex used at the moment does not support using address literals, e.g.:
* localpart@[192.168.0.1]
* localpart@[2001:db8::1]
2022-02-17 21:49:22 -05:00
Gerhard Schlager 33d6ed60a4
DEV: Don't import year of birth (#15937)
The cakeday plugin doesn't use the year.
2022-02-14 18:10:35 +01:00
Peter Zhu c5fd8c42db
DEV: Fix methods removed in Ruby 3.2 (#15459)
* File.exists? is deprecated and removed in Ruby 3.2 in favor of
File.exist?
* Dir.exists? is deprecated and removed in Ruby 3.2 in favor of
Dir.exist?
2022-01-05 18:45:08 +01:00
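The replacement described in the commit above amounts to using the singular method names. A small sketch, with a hypothetical helper and made-up file names:

```ruby
require "tmpdir"

# Hypothetical helper using the spellings that remain available in Ruby 3.2:
# `Dir.exist?` and `File.exist?` instead of the removed `exists?` variants.
def upload_present?(dir, filename)
  Dir.exist?(dir) && File.exist?(File.join(dir, filename))
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, "avatar.png"), "")
  puts upload_present?(dir, "avatar.png")  # true
  puts upload_present?(dir, "missing.png") # false
end
```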
Leonardo Mosquera 48a08cc397
FIX: Vanilla importer fixes (#14699)
Import script was out of date
2021-10-27 14:22:37 +02:00
Gerhard Schlager a4d0d866aa
DEV: Bulk imports should find existing users by email (#14468)
Without this change, bulk imports unconditionally create new user records even when a user with the same email address exists.
2021-09-29 00:20:06 +02:00
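The lookup-before-create behaviour described in the commit above can be sketched with an in-memory hash standing in for the users table. Everything here is hypothetical: the helper name, the record shape, and the case-insensitive comparison are assumptions, and the real bulk importer works against the database rather than a hash.

```ruby
# Hypothetical in-memory stand-in for the users table.
# Assumption: emails are normalized by stripping whitespace and downcasing.
def find_or_create_user(users_by_email, email, username)
  key = email.strip.downcase
  users_by_email[key] ||= { id: users_by_email.size + 1, username: username, email: key }
end

users = {}
first = find_or_create_user(users, "Sam@example.com", "sam")
again = find_or_create_user(users, "sam@example.com", "sam_imported")

puts first.equal?(again) # true: the existing record is reused
puts users.size          # 1
```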