Commit Graph

918 Commits

Author SHA1 Message Date
Régis Hanol 4fdad12998 FIX: downsize_uploads script to support external storage
Also ensured we update the sha1 property of the upload record to match the actual file.
2019-10-08 17:54:39 +02:00
jelle van der Waa 2d4c9bbaac import_scripts: add fluxbb prefix to missing query (#8163)
Signed-off-by: Jelle van der Waa <jelle@archlinux.org>
2019-10-08 11:46:00 +11:00
Sam Saffron 1d5c2b36f6 DEV: improve diagnostics on mem leak checker
This adds mwrap logging to each iteration so we can see how much
leaks per iteration and where it is coming from
2019-10-04 09:47:33 +10:00
Sam Saffron 038a38ae1c DEV: add debugging scripts for memory leaks
These scripts are somewhat rough but I needed them to help debug a memory
leak we have noticed in rails 6.

The biggest object script finds all the biggest objects we have in memory
after boot.

The test memory leak runs a very simple iteration through all multisites
and observed memory.
2019-10-03 16:36:31 +10:00
Krzysztof Kotlarek 35b1185a08 FIX: Revert Demon::DemonBase back to Demon::Base (#8132)
I introduced DemonBase because I had got some conflict between `demon/base.rb` and `jobs/base.rb`, however, to not rename base class, it is possible to use regex on absolute path in Zeitwerk custom inflector.
2019-10-02 14:54:08 +10:00
Krzysztof Kotlarek 427d54b2b0 DEV: Upgrading Discourse to Zeitwerk (#8098)
Zeitwerk simplifies working with dependencies in dev and makes it easier reloading class chains. 

We no longer need to use Rails "require_dependency" anywhere and instead can just use standard 
Ruby patterns to require files.

This is a far reaching change and we expect some followups here.
2019-10-02 14:01:53 +10:00
Gerhard Schlager b48ca9dee9 DEV: Simplify username validation in base importer
The `UsernameValidator` does already all the hard work. No need to do any additional checks in the import script. The checks were out-of-date anyway.
2019-10-01 20:33:09 +02:00
Gerhard Schlager ed1e5ef6cc FIX: By default, don't abort Google Groups crawling on error 2019-09-18 18:14:04 +02:00
Gerhard Schlager ab96239f2a FIX: Google Groups crawler failed to login
Trying to automate the login into a Google account is quite hard. This makes the crawler use the content of a cookies.txt file instead. It also removes a couple of deprecation warnings and adds some color to the output.
2019-09-18 13:09:20 +02:00
Sam Saffron cd1ab206d9 DEV: add missing ultra low queue to mwrap sidekiq
note: mwrap is used for analysis of memory bloat and leaks of processes
2019-09-18 11:18:35 +10:00
Sam Saffron 015051ecaf PERF: avoid spinning a thread each time we close a connection
This is a temporary workaround for the issue in https://github.com/rails/rails/pull/36949

Discussing a proper fix in Rails with the Rails team.

Prior to this fix we were spinning up a thread every time we closed a connection
to the db.
2019-09-12 17:34:04 +10:00
Krzysztof Kotlarek 1d73754e84 FIX: Modify frozen String and profile_db_generator uses category id (#8080) 2019-09-09 17:38:37 +10:00
Dan Ungureanu ab7038bfc2
DEV: User simulator tried to modify frozen string. 2019-08-16 17:32:17 +03:00
Gerhard Schlager 888b635cfc Import avatars and likes in the Zendesk AP importer
Co-authored-by: Justin DiRose <justin@justindirose.com>
2019-08-14 10:42:52 +02:00
Gerhard Schlager 4ed517a344 DEV: Make Rubocop happy
Follow-up to 6cc9fe42
2019-08-12 23:10:58 +02:00
Mohamad Abras 6cc9fe42ce add mongo adapter to nodebb importer (#8000) 2019-08-12 14:15:11 -04:00
Régis Hanol 19dda59932 FIX: add back verbose option to DbHelper.remap 2019-07-31 17:30:08 +02:00
Rishabh dcb47d902b
REFACTOR: Rename SiteSetting.disable_edit_notifications to disable_system_edit_notifications (#7958)
* REFACTOR: Rename SiteSetting.disable_edit_notifications to disable_system_edit_notifications

- The older name could cause some confusion because the setting does not disable all edit notifications, only system ones.

* FIX: Add frozen_string_literal: true in the migration

* DEV: Deprecate 'disable_edit_notifications'
2019-07-31 20:20:41 +05:30
Régis Hanol 89fce2ce71 DEV: remove duplicate Remap class and use DbHelper.remap instead
Follow-up to 9cd3f96dee
2019-07-29 18:43:40 +02:00
Gerhard Schlager fd12c414e7 DEV: Refactor helper methods for upload markdown
Follow-up to a61ff167
2019-07-25 16:36:35 +02:00
Gerhard Schlager a61ff16740 DEV: Make attachment markdown reusable 2019-07-25 14:04:18 +02:00
Joffrey JAFFEUX cc46de8f46
s/discourse-staff-notes/discourse-user-notes (#7936) 2019-07-24 20:04:27 +02:00
Gerhard Schlager f0fea5991f FIX: Latest Selenium gem broke Google Groups import script
Selenium uses Keep-Alive since version 3.141, so the net-http-persistent gem shouldn't be needed anymore.
2019-07-10 09:45:33 +02:00
Dan Ungureanu ab6ad220c7
DEV: Fix user simulator script. 2019-07-09 18:52:08 +03:00
Arpit Jalan 6d30be1f94 Improve XenForo import script.
- ensure only active, unbanned users are imported.
- ensure only visible threads/posts are imported.
2019-06-18 15:52:34 +05:30
Arpit Jalan 77f5577e30 DEV: Improvements to AnswerHub import script. 2019-06-13 11:46:17 +05:30
Guo Xiang Tan 36c0cfa890 FIX: Use new attachment markdown format in `ImportScripts::Uploader`. 2019-06-11 14:49:28 +08:00
Blake Erickson 0955d9ece9 create answerhub importer (#7671) 2019-06-03 12:17:22 +10:00
Gerhard Schlager 0f3c3bc309 Make import scripts work with frozen strings 2019-05-30 22:22:24 +02:00
Gerhard Schlager c70d0c6659 Use an invalid domain for fake email addresses in importers 2019-05-30 22:22:24 +02:00
Gerhard Schlager d3ba338144 Make Telligent import script more generic 2019-05-30 22:22:24 +02:00
Joffrey JAFFEUX 630e9814bc
datetime is not available at this point (#7630) 2019-05-29 14:06:32 +02:00
Joffrey JAFFEUX 6439004161
DEV: do not use STDERR to print tests timestamps (#7629) 2019-05-29 13:28:02 +02:00
Joffrey JAFFEUX 6be9a6eb2e
DEV: adds time logging to docker_test script (#7627) 2019-05-29 12:06:43 +02:00
Sam Saffron 7429700389 FIX: ensure we can download maxmind without redis or db config
This also corrects FileHelper.download so it supports "follow_redirect"
correctly (it used to always follow 1 redirect) and adds a `validate_url`
param that will bypass all uri validation if set to false (default is true)
2019-05-28 10:28:57 +10:00
Sam Saffron 2bcc3ef46b correct type 2019-05-22 12:28:17 +10:00
Sam Saffron 12264747f7 DEV: script to analyze status of sidekiq queue
This returns a proper count of all queued jobs and finds potential dupes
2019-05-22 12:27:11 +10:00
Gerhard Schlager b788948985 FEATURE: English locale with international date formats
Makes en_US the new default locale
2019-05-20 13:47:20 +02:00
Sam Saffron 678a9a61c4 DEV: lint importer
commit #f490ed3b introduced a few linting issues, resolved now
2019-05-17 16:37:08 +10:00
Edmond Lepedus f490ed3bbc FEATURE: Add attachment support to xenforo importer (#7548)
* FEATURE: Add attachment support to XenForo importer

If `ATTACHMENT_DIR` is provided, importer will scan each imported post
for `[GALLERY]` and `[ATTACH]` tags, attempt to import the referenced files
as Discourse uploads and replace the tags with Discourse markup.

References to files which cannot be imported are stripped.

NOTE: This only imports attachments which are referenced in imported
posts. Any XenForo media or files which are not referenced in any post
using `[ATTACH]` or `[GALLERY]` tags will not be imported. The goal is to
ensure that we don't have posts with missing images and unsightly
markup, NOT to ensure that all attachments are migrated.

* FEATURE: Add attachment support to XenForo importer

If `ATTACHMENT_DIR` is provided, importer will scan each imported post
for `[GALLERY]` and `[ATTACH]` tags, attempt to import the referenced files
as Discourse uploads and replace the tags with Discourse markup.

References to files which cannot be imported are stripped.

NOTE: This only imports attachments which are referenced in imported
posts. Any XenForo media or files which are not referenced in any post
using `[ATTACH]` or `[GALLERY]` tags will not be imported. The goal is to
ensure that we don't have posts with missing images and unsightly
markup, NOT to ensure that all attachments are migrated.

* FEATURE: Add attachment support to XenForo importer

If `ATTACHMENT_DIR` is provided, importer will scan each imported post
for `[GALLERY]` and `[ATTACH]` tags, attempt to import the referenced files
as Discourse uploads and replace the tags with Discourse markup.

References to files which cannot be imported are stripped.

NOTE: This only imports attachments which are referenced in imported
posts. Any XenForo media or files which are not referenced in any post
using `[ATTACH]` or `[GALLERY]` tags will not be imported. The goal is to
ensure that we don't have posts with missing images and unsightly
markup, NOT to ensure that all attachments are migrated.
2019-05-17 16:18:28 +10:00
Sam Saffron b3d42b3f18 DEV: remove unmaintained script
osxdev script has not been maintained for a while, keeping it around is
only causing confusion
2019-05-17 11:47:48 +10:00
Sam Saffron 30990006a9 DEV: enable frozen string literal on all files
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.

Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
2019-05-13 09:31:32 +08:00
Guo Xiang Tan 152238b4cf DEV: Prefer `public_send` over `send`. 2019-05-07 09:33:21 +08:00
Sam Saffron 9be70a22cd DEV: introduce new API to look up dynamic site setting
This removes all uses of both `send` and `public_send` from consumers of
SiteSetting and instead introduces a `get` helper for dynamic lookup

This leads to much cleaner and safer code long term as we are always explicit
to test that a site setting is really there before sending an arbitrary
string to the class

It also removes a couple of risky stubs from the auth provider test
2019-05-07 11:00:30 +10:00
Gerhard Schlager 74ca49d7cd FIX: Importing of polls from phpBB3 was broken
Follow-up to 24369a81
2019-05-06 12:37:19 +02:00
Robin Ward 3cb0d27d38 DEV: Upgrade our widget handlebars compiler
Now supports subexpressions such as i18n and concat, plus automatic
attaching of widgets similar to ember.
2019-05-02 15:47:57 -04:00
Guo Xiang Tan 24347ace10 FIX: Properly associate user_profiles background urls via upload id.
`Upload#url` is more likely and can change from time to time. When it
does changes, we don't want to have to look through multiple tables to
ensure that the URLs are all up to date. Instead, we simply associate
uploads properly to `UserProfile` so that it does not have to replicate
the URLs in the table.
2019-05-02 14:58:24 +08:00
Michael Brown 7b1783bae8 FIX: cache_critical_dns was never caching pg replica (#7461)
* it's DISCOURSE_DB_REPLICA_HOST not DISCOURSE_DB_BACKUP_HOST
2019-04-30 08:42:51 +08:00
MMX 5d4aa256be FIX: category logo upload error in Discuz importer.(#7453) 2019-04-29 17:01:15 +02:00
Michael K Johnson 9fc3de01bb FEATURE: Add import script for Friends+Me Google+ Exporter JSON archives (#7334)
This script has been used to import over 50,000 Google+ posts
and over 300,000 comments from 29 communities into a single
Discourse instance, as well as for at least three other
imports.  Google+ has closed for the public, but it is still
available at this time for GSuite customers. If GSuite customers
decide to migrate from Google+ to Discourse, or if Google
"sunsets" Google+ for GSuite customers, this importer may be
useful.
https://www.reddit.com/r/FMGE_Support/comments/b8sa5h/fmge_for_gsuite/

Development and use of this script has been discussed in detail:
https://meta.discourse.org/t/bounty-google-private-communities-export-screenscraper-importer/108029
2019-04-23 14:04:09 +10:00
Arpit Jalan 110512d4d0 Improvements to vBulletin bulk import script
- import attachments
- import avatars
- import user signatures
- create permalink file
- reconnect to MySQL db in case of failure
2019-04-11 12:35:19 +05:30
Arpit Jalan a20f58554b IMPORT: create category definitions in `import:ensure_consistency` task 2019-04-11 12:06:37 +05:30
Robin Ward b58867b6e9 FEATURE: New 'Reviewable' model to make reviewable items generic
Includes support for flags, reviewable users and queued posts, with REST API
backwards compatibility.

Co-Authored-By: romanrizzi <romanalejandro@gmail.com>
Co-Authored-By: jjaffeux <j.jaffeux@gmail.com>
2019-03-28 12:45:10 -04:00
Gerhard Schlager 453ba2da7b Make Google Groups scraper work with latest chromedriver 2019-03-25 16:11:22 +01:00
Joe ec2123809f FEATURE: user and group cards on mobile (#7246) 2019-03-25 13:37:17 +01:00
Sam Saffron 3f35315391 DEV: add script to switch ruby version from inside container
This script can be used to flip Ruby to a patched Ruby version
or a different major version from inside the container

It is used to test and compare different Ruby versions
2019-03-25 17:41:24 +11:00
Gerhard Schlager 2349ba3bc4 Improve Google Groups scraper
* Better error detection during login phase
* Experimental support for 2FA and SMS codes
* Detect missing permissions to scrape email addresses
2019-03-24 23:15:13 +01:00
Penar Musaraj 0db2846a5b Add user bios to NodeBB importer 2019-03-20 16:40:26 -04:00
Penar Musaraj b6a7b851c7 Nodebb importer: add permalinks, exclude disabled categories 2019-03-18 21:59:02 -04:00
Penar Musaraj 9334d2f4f7
FEATURE: add more granular user option levels for email notifications (#7143)
Migrates email user options to a new data structure, where `email_always`, `email_direct` and `email_private_messages` are replace by

* `email_messages_level`, with options: `always`, `only_when_away` and `never` (defaults to `always`)
* `email_level`, with options: `always`, `only_when_away` and `never` (defaults to `only_when_away`)
2019-03-15 10:55:11 -04:00
Sam 819d4facda FIX: ruby bench script no longer working
The library used to generate random text changed, this caused the title
of the topic used for testing to change, which meant the slug changed, so
a hit to the topic was a redirect

This fix gives the topic used for performance testing a static name to avoid
this issue in future
2019-03-15 11:31:08 +11:00
Robin Ward fa5a158683 REFACTOR: Move `queue_jobs` out of `SiteSetting`
It is not a setting, and only relevant in specs. The new API is:

```
Jobs.run_later!        # jobs will be thrown on the queue
Jobs.run_immediately!  # jobs will run right away, avoid the queue
```
2019-03-14 10:47:38 -04:00
Gerhard Schlager 78f8114989 FEATURE: Allow discourse script to skip disabling of emails after restore 2019-03-07 21:49:33 +01:00
David Taylor fc7938f7e0
REFACTOR: Migrate GoogleOAuth2Authenticator to use ManagedAuthenticator (#7120)
https://meta.discourse.org/t/future-social-authentication-improvements/94691/3
2019-03-07 11:31:04 +00:00
Gerhard Schlager 941e096df4 Fix error in base import script
Follow-up to 655a08dbbd
2019-03-06 21:58:25 +01:00
maulkin 655a08dbbd FIX: Return actual errors if PostCreator fails (#7096) 2019-03-06 21:29:37 +01:00
Penar Musaraj b1035cc691 FIX: NodeBB import details
- mark imported users as active

- do not strip @ from usernames in post content

- improve uploads path matching
2019-03-06 12:30:36 -05:00
Joffrey JAFFEUX 703c724cf3
REFACTOR: Migrate InstagramAuthenticator to use ManagedAuthenticator (#7081) 2019-03-04 14:54:28 +01:00
Bianca Nenciu 714f6cde79 FIX: Remove duplicate definition of create_categories. 2019-03-04 10:32:09 +02:00
Gerhard Schlager c36c9c2ee5 FEATURE: Import script for AnswerBase
Improves the generic database used by some import scripts:
* Adds additional columns for users
* Adds support for attachments
* Allows setting the data type for keys (numeric or string) to ensure correct sorting
2019-02-28 22:08:12 +01:00
Gerhard Schlager 24369a8166 Improve phpBB3 importer
* Log errors when mapping of posts, messages, etc. fails
* Allow permalink normalizations for old subfolder installation
* Disable importing of polls for now. It's broken.
2019-02-17 23:20:20 +01:00
Gerhard Schlager 8d5dfe1e01 FIX: Don't import parts of the email address as name 2019-02-17 22:59:18 +01:00
Penar Musaraj c50db76f5d FIX: do not treat TIFF, BMP, WEBP as images
Treating TIFF and BMP as images cause us to add them to IMG tags, this is very inconsistent across browsers.

You can still upload these files they will simply not be displayed in IMG tags.
2019-02-11 16:28:43 +11:00
Jeff Atwood 444bc466b0 for docs, normalize on space after code fence when specifying lang 2019-01-21 01:19:28 -08:00
Régis Hanol 1e67bcb456
PERF: bulk feature topic users & reset topic counters after an import 2019-01-17 21:48:23 +01:00
Régis Hanol 788719d271 DEV: speed up posts base imports 2019-01-04 15:30:17 +01:00
Arpit Jalan 71a5369fef FIX: do not convert quote tags to markdown 2018-12-11 20:09:46 +05:30
Arpit Jalan 735a48415d FEATURE: option to use ruby-bbcode-to-md in bulk import script
ruby-bbcode-to-md provides better bbcode to markdown conversion
2018-12-10 10:28:07 +05:30
Arpit Jalan 0365d50797 Improve vBulletin bulk import script to support table prefix.
Improve base bulk import script to convert list tags to ul/li.
2018-12-10 10:10:44 +05:30
David Taylor 160d29b18a
REFACTOR: Migrate TwitterAuthenticator to use ManagedAuthenticator (#6739)
No changes to functionality. TwitterAuthenticator goes from 136 lines to 24, and all twitter-specific logic elsewhere has been deleted 🎉
2018-12-07 15:39:06 +00:00
Régis Hanol 3c9c95ac83 Update Rubocop to 0.60 2018-12-04 10:48:16 +01:00
David Taylor 9248ad1905 DEV: Enable `Style/SingleLineMethods` and `Style/Semicolon` in Rubocop (#6717) 2018-12-04 11:48:13 +08:00
David Taylor 208005f9c9 REFACTOR: Migrate FacebookAuthenticator to use ManagedAuthenticator
Changes to functionality
  - Removed syncing of user metadata including gender, location etc.
    These are no longer available to standard Facebook applications.
  - Removed the remote 'revoke' functionality. No other providers have
    it, and it does not appear to be standard practice in other apps.
  - The 'facebook_no_email' event is no longer logged. The system can
    cope fine with a missing email address.

Data is migrated to the new user_associated_accounts table.
facebook_user_infos can be dropped once we are confident the data has
been migrated successfully.
2018-11-30 11:18:11 +00:00
Sam 6acabec423 FIX: script was missing newlines when generating hosts 2018-11-28 15:18:08 +11:00
Sam 6d9d904df5 add missing newline to end of file 2018-11-23 15:43:27 +11:00
Sam d7b0f0069c no need to double strip this line 2018-11-23 14:48:02 +11:00
Sam 4c6eeaac15 Followup on 0739c3b1d1
This corrects some minor style issues
2018-11-23 14:43:52 +11:00
Sam 0739c3b1d1 DEV: this introduces a script capable of caching critical DNS locally
This is useful for cases where you want to add resiliency to DNS lookups
for redis and postgres, so they will continue to work even if there is
a DNS outage
2018-11-22 18:46:59 +11:00
Régis Hanol a0f0bac752
Add a comment to run the 'import:ensure_consistency' rake task after a bulk import 2018-11-21 16:28:35 +01:00
Guo Xiang Tan 5076487eaf Update `discuz_x` import script to not use `Category#logo_url`. 2018-11-09 14:15:31 +08:00
Gerhard Schlager 77fedaba88 DEV: Add script for pushing translations to Transifex 2018-11-08 23:31:05 +00:00
Gerhard Schlager d6f89a85ef Make Rubocop happy 2018-10-31 01:30:14 +01:00
Gerhard Schlager 65db9326b4 FEATURE: Add download script for Google Groups 2018-10-31 01:12:05 +01:00
Gerhard Schlager efa265cbc8 Rename mbox import script 2018-10-31 01:12:05 +01:00
Gerhard Schlager edbc004a9a Remove old mbox import script 2018-10-31 01:12:05 +01:00
Régis Hanol c39a1022cc PERF: user imports would slow down the more users were imported 2018-10-22 11:14:13 +02:00
Régis Hanol afa22a0c6f REFACTOR: more 'fake_email' to base importer 2018-10-22 11:12:40 +02:00
Régis Hanol 8b20e2500a
Remove unnecessary line 2018-10-19 15:48:48 +02:00
Régis Hanol 637123ff6f Merge users based on their email in vBulletin importer 2018-10-19 15:16:45 +02:00
Régis Hanol 53aa0344bf FIX: properly import vBulletin's hashed password 2018-10-18 10:22:55 +02:00