Commit Graph

1140 Commits

Author SHA1 Message Date
Constanza 067c4deb4c
Fix comment to include phpbb 3.3, which is now supported (#18006) 2022-08-19 16:42:32 -04:00
Constanza ef842a4b29
FEATURE: Adding a simple CSV importer (#17993) 2022-08-19 13:09:30 -04:00
Constanza 8836c8bcdf
FIX: the phpbbb import script was not parsing youtube tags (#17787) 2022-08-05 15:20:32 -04:00
communiteq 603f36ca4a
DEV: Support phpBB 3.3 imports (#17641)
* handle polls with duplicate items
* handle polls with incorrect poll_option_total values
* handle group IDs in personal messages
* support for version 3.3
2022-07-25 22:07:03 +02:00
Jay Pfaffman 7ab5dcf82f
FEATURE: my_bb import supports avatars (#17617) 2022-07-25 15:22:25 +02:00
Constanza b9ac8e5748
Adding 3.2 to the versions of phpbb supported by the migration script (#17483) 2022-07-14 18:06:47 +05:30
Michael Fitz-Payne 1867202a4d DEV(cache_critical_dns): add option to run once and exit
There are situations where a container running Discourse may want to
cache the critical DNS services without running the cache_critical_dns
service, for example running migrations prior to running a full bore
application container.

Add a `--once` argument for the cache_critical_dns script that will
only execute the main loop once, and return the status code for the
script to use when exiting. 0 indicates no errors occured during SRV
resolution, and 1 indicates a failure during the SRV lookup.

Nothing is reported to prometheus in run_once mode. Generally this
mode of operation would be a part of a unix pipeline, in which the exit
status is a more meaningful and immediate signal than a prometheus metric.

The reporting has been moved into it's own method that can be called
only when the script is running as a service.

See /t/69597.
2022-07-06 14:53:02 +10:00
Michael Fitz-Payne aabbc9e63e DOC(cache_critical_dns): add program description
Describes the behaviour and configuration of the cache_critical_dns
script, mainly cribbed from commit messages. Tries to make this program
a bit less of an enigma.
2022-05-26 14:26:57 +10:00
Michael Fitz-Payne 0553788d3b DEV(cache_critical_dns): improve postgres_healthcheck
The `PG::Connection#ping` method is only reliable for checking if the
given host is accepting connections, and not if the authentication
details are valid.

This extends the healthcheck to confirm that the auth details are
able to both create a connection and execute queries against the
database.

We expect the empty query to return an empty result set, so we can
assert on that. If a failure occurs for any reason, the healthcheck will
return false.
2022-05-24 08:20:10 +10:00
Martin Brennan fcc2e7ebbf
FEATURE: Promote polymorphic bookmarks to default and migrate (#16729)
This commit migrates all bookmarks to be polymorphic (using the
bookmarkable_id and bookmarkable_type) columns. It also deletes
all the old code guarded behind the use_polymorphic_bookmarks setting
and changes that setting to true for all sites and by default for
the sake of plugins.

No data is deleted in the migrations, the old post_id and for_topic
columns for bookmarks will be dropped later on.
2022-05-23 10:07:15 +10:00
Gabe Pacuilla 4284ba9c27
FIX(cache_critical_dns): use correct DISCOURSE_DB_USERNAME envvar (#16862) 2022-05-18 13:01:18 -04:00
Gabe Pacuilla 9f246e6969
FIX(cache_critical_dns): use discourse database name and user by default (#16856) 2022-05-17 16:09:32 -04:00
Michael Fitz-Payne 35d5c29e10 DEV(cache_critical_dns): add SRV priority tunables
An SRV RR contains a priority value for each of the SRV targets that
are present, ranging from 0 - 65535. When caching SRV records we may want to
filter out any targets above or below a particular threshold.

This change adds support for specifying a lower and/or upper bound on
target priorities for any SRV RRs. Any targets returned when resolving
the SRV RR whose priority does not fall between the lower and upper
thresholds are ignored.

For example: Let's say we are running two Redis servers, a primary and
cold server as a backup (but not a replica). Both servers would pass health
checks, but clearly the primary should be preferred over the backup
server. In this case, we could configure our SRV RR with the primary
target as priority 1 and backup target as priority 10. The
`DISCOURSE_REDIS_HOST_SRV_LE` could then be set to 1 and the target with
priority 10 would be ignored.

See /t/66045.
2022-05-12 08:08:56 +10:00
Loïc Guitaut ab6ca78486 FIX: Use proper ActiveRecord method in import scripts
`ActiveRecord::Base.connection_config` has been deprecated since Rails
6.1 and was completely removed from Rails 7.
Instead we need to use
`ActiveRecord::Base.connection_db_config.configuration_hash`.

Import scripts were forgotten when we did the Rails 7 upgrade, this
patch fixes them.
2022-05-09 11:09:27 +02:00
Martin Brennan 222c8d9b6a
FEATURE: Polymorphic bookmarks pt. 3 (reminders, imports, exports, refactors) (#16591)
A bit of a mixed bag, this addresses several edge areas of bookmarks and makes them compatible with polymorphic bookmarks (hidden behind the `use_polymorphic_bookmarks` site setting). The main ones are:

* ExportUserArchive compatibility
* SyncTopicUserBookmarked job compatibility
* Sending different notifications for the bookmark reminders based on the bookmarkable type
* Import scripts compatibility
* BookmarkReminderNotificationHandler compatibility

This PR also refactors the `register_bookmarkable` API so it accepts a class descended from a `BaseBookmarkable` class instead. This was done because we kept having to add more and more lambdas/properties inline and it was very messy, so a factory pattern is cleaner. The classes can be tested independently as well.

Some later PRs will address some other areas like the discourse narrative bot, advanced search, reports, and the .ics endpoint for bookmarks.
2022-05-09 09:37:23 +10:00
Leonardo Mosquera 3e5faffb0d
DEV: mbox importer improvements (#16557)
* FIX: support specifying parent_category_id in mbox import metadata
* FIX: elide tabs from topic titles
* FIX: optionally fix Mailman from: addresses
* DEV: optionally elide anything up to the last = in email addresses
* Fix Mailmain broken from: detection
2022-04-29 13:24:29 -03:00
Michael Fitz-Payne 1acc4751ff
FIX: remove refresh seconds override on cache_critical_dns (#16572)
This removes the option to override the sleep time between caching of
DNS records. The override was invalid because `''.to_i` is 0 in Ruby,
causing a tight loop calling the `run` method.
2022-04-27 12:42:35 +08:00
Michael Fitz-Payne 0784c28702 FIX: cache_critical_dns - add TLS support for Redis healthcheck
For Redis connections that operate over TLS, we need to ensure that we
are setting the correct arguments for the Redis client. We can utilise
the existing environment variable `DISCOURSE_REDIS_USE_SSL` to toggle
this behaviour.

No SSL verification is performed for two reasons:
- the Discourse application will perform a verification against any FQDN
  as specified for the Redis host
- the healthcheck is run against the _resolved_ IP address for the Redis
  hostname, and any SSL verification will always fail against a direct
  IP address

If no SSL arguments are provided, the IP address is never cached against
the hostname as no healthy address is ever found in the HealthyCache.
2022-04-27 12:27:58 +10:00
Michael Fitz-Payne c4ea439cc3 DEV: refactor cache_critical_dns for SRV RR awareness
Modify the cache_critical_dns script for SRV RR awareness. The new
behaviour is only enabled when one or more of the following environment
variables are present (and only for a host where the `DISCOURSE_*_HOST_SRV`
variable is present):
- `DISCOURSE_DB_HOST_SRV`
- `DISCOURSE_DB_REPLICA_HOST_SRV`
- `DISCOURSE_REDIS_HOST_SRV`
- `DISCOURSE_REDIS_REPLICA_HOST_SRV`

Some minor changes in refactor to original script behaviour:
- add Name and SRVName classes for storing resolved addresses for a hostname
- pass DNS client into main run loop instead of creating inside the loop
- ensure all times are UTC
- add environment override for system hosts file path and time between DNS
  checks mainly for testing purposes

The environment variable for `BUNDLE_GEMFILE` is set to enables Ruby to
load gems that are installed and vendored via the project's Gemfile.
This script is usually not run from the project directory as it is
configured as a system service (see
71ba9fb7b5/templates/cache-dns.template.yml (L19))
and therefore cannot load gems like `pg` or `redis` from the default
load paths. Setting this environment variable configures bundler to look
in the correct project directory during it's setup phase.

When a `DISCOURSE_*_HOST_SRV` environment variable is present, the
decision for which target to cache is as follows:
- resolve the SRV targets for the provided hostname
- lookup the addresses for all of the resolved SRV targets via the
  A and AAAA RRs for the target's hostname
- perform a protocol-aware healthcheck (PostgreSQL or Redis pings)
- pick the newest target that passes the healthcheck

From there, the resolved address for the SRV target is cached against
the hostname as specified by the original form of the environment
variable.

For example: The hostname specified by the `DISCOURSE_DB_HOST` record
is `database.example.com`, and the `DISCOURSE_DB_HOST_SRV` record is
`database._postgresql._tcp.sd.example.com`. An SRV RR lookup will return
zero or more targets. Each of the targets will be queried for A and AAAA
RRs. For each of the addresses returned, the newest address that passes
a protocol-aware healthcheck will be cached. This address is cached so
that if any newer address for the SRV target appears we can perform a
health check and prefer the newer address if the check passes.

All resolved SRV targets are cached for a minimum of 30 minutes in memory
so that we can prefer newer hosts over older hosts when more than one target
is returned. Any host in the cache that hasn't been seen for more than 30
minutes is purged.

See /t/61485.
2022-04-27 10:14:33 +10:00
David Taylor d81359246a
DEV: Be more lenient in CLI confirmation (#16290)
If someone types `yes` rather than `YES`, continue anyway.

The chance of typing `yes`, when you actually want to stop, is non-existent. The chance of typing `yes` when you meant `YES` is  high, and it's very frustrating when the script quite because you got the case wrong!
2022-03-25 20:14:41 +00:00
David Taylor f3aab19829
DEV: Promote historic post_deploy migrations (#16288)
This commit promotes all post_deploy migrations which existed in Discourse v2.7.13 (timestamp <= 20210328233843)

This reduces the likelihood of issues relating to migration run order

Also fixes a couple of typos in `script/promote_migrations`
2022-03-25 15:48:20 +00:00
Jarek Radosz 2fc70c5572
DEV: Correctly tag heredocs (#16061)
This allows text editors to use correct syntax coloring for the heredoc sections.

Heredoc tag names we use:

languages: SQL, JS, RUBY, LUA, HTML, CSS, SCSS, SH, HBS, XML, YAML/YML, MF, ICS
other: MD, TEXT/TXT, RAW, EMAIL
2022-02-28 20:50:55 +01:00
Jarek Radosz 6f6406ea03
DEV: Fix random typos (#16066) 2022-02-28 10:20:58 +08:00
David Taylor 5374e587a3
DEV: Add message-bus analysis script (#15979)
This will count how many messages are published per-channel and produce a table of channels ordered by 'most messages'
2022-02-18 20:21:17 +00:00
Michael Brown 3bf3b9a4a5 DEV: pull email address validation out to a new EmailAddressValidator
We validate the *format* of email addresses in many places with a match against
a regex, often with very slightly different syntax.

Adding a separate EmailAddressValidator simplifies the code in a few spots and
feels cleaner.

Deprecated the old location in case someone is using it in a plugin.

No functionality change is in this commit.

Note: the regex used at the moment does not support using address literals, e.g.:
* localpart@[192.168.0.1]
* localpart@[2001:db8::1]
2022-02-17 21:49:22 -05:00
Gerhard Schlager 6394d7cddf
DEV: Improve phpBB3 import script (#15956)
* Optional import of custom user fields from phpBB 3.1+
* Optional import of likes from phpBB3
  Requires the phpBB "Thanks for posts" extension
* Fix import of bookmarks from phpBB3
* Update `created_at` of existing user
* Support mapping of phpBB forums to existing Discourse categories
  This is in addition to the ability of merging phpBB forums and importing into newly created Discourse categories.
2022-02-16 13:04:31 +01:00
Gerhard Schlager 33d6ed60a4
DEV: Don't import year of birth (#15937)
The cakeday plugin doesn't use the year.
2022-02-14 18:10:35 +01:00
Gerhard Schlager 6a41ec179c
FIX: Default settings for phpBB3 import were broken (#15913) 2022-02-11 18:18:54 +01:00
David Taylor 9e43f0303d
DEV: Include DISCOURSE_REDIS_REPLICA_HOST in cache_critical_dns (#15877)
This is the replacement for DISCOURSE_REDIS_SLAVE_HOST
2022-02-09 14:41:26 +00:00
Canapin ea2fd75d10
DEV: Fix some regexes in phpBB3 import script (#15829)
1. bbcode hashes don't always have exactly 8 characters.

2. colors aren't always hex values, it can be a color string ("red", "blue", etc).

3. The closing tag of smileys doesn't always include a `:` character (the start of the regex was already right for this particular issue)
2022-02-07 16:16:46 +01:00
David Taylor ed2f700440
DEV: Wait for initdb to complete in docker.rake (#15614)
On slower hardware it can take a while to init the database. If we don't wait, the `rake db:create` step will fail.
2022-01-17 17:45:39 +00:00
Peter Zhu c5fd8c42db
DEV: Fix methods removed in Ruby 3.2 (#15459)
* File.exists? is deprecated and removed in Ruby 3.2 in favor of
File.exist?
* Dir.exists? is deprecated and removed in Ruby 3.2 in favor of
Dir.exist?
2022-01-05 18:45:08 +01:00
David Taylor 0e87f882a7
DEV: Use discourse image for postgres in GitHub Actions (#15291)
The discourse base image already contains a postgres installation, so pulling a separate postgres image is a little wasteful. Using the copy of Postgres in the discourse image saves about 20 seconds on every GitHub actions run.

This commit sets up Postgres with a few performance-improving flags, which we were already using for the `rake docker:test` task (used on our internal CI system).
2021-12-14 17:20:06 +00:00
Jarek Radosz cfabdb72bc
FIX: Ambiguous column in `downsize_uploads` (#14972) 2021-11-16 16:23:32 +01:00
Leonardo Mosquera 48a08cc397
FIX: Vanilla importer fixes (#14699)
Import script was out of date
2021-10-27 14:22:37 +02:00
Dan Ungureanu 69f0f48dc0
DEV: Fix rubocop issues (#14715) 2021-10-27 11:39:28 +03:00
David Taylor 46d96c9feb
DEV: Apply rubocop to script/import_scripts/phorum.rb (#14727)
Followup to b24002018a
2021-10-26 19:16:52 +01:00
Jeremy Waters b24002018a Update phorum.rb
Add attachment/file/upload handling to bring them in from phorum to discourse
2021-10-26 12:41:50 -04:00
Jarek Radosz 451cd4ec3f
DEV: Fix thor deprecation warning (#14680)
```
Deprecation warning: Thor exit with status 0 on errors. To keep this behavior, you must define `exit_on_failure?` in `DiscourseCLI`
```
2021-10-21 21:01:05 +02:00
Theodore Diamantidis 97178cd777
FIX: phpbb import - attachments not embedded in posts (#14570) 2021-10-11 14:27:54 +02:00
Constanza a413a1e015
DEV: process image uploads in the Zendesk API import script (#14524) 2021-10-06 12:24:12 -04:00
Gerhard Schlager a4d0d866aa
DEV: Bulk imports should find existing users by email (#14468)
Without this change, bulk imports unconditionally create new user records even when a user with the same email address exists.
2021-09-29 00:20:06 +02:00
Andrei Prigorshnev 1656b7ed01
DEV: Make db_timestamp_mover work with tables with unique constraints (#14027)
Some tables in the database have constraints on columns with dates. Because of them, the script for moving timestamps can fail from time to time. This PR makes the script work with such tables.

In general, in PostgreSQL it is not always possible to defer constraint checks to the transaction commit (Primary Keys and Unique Constraints can be deferred, but them should be declared as DEFERRABLE to make it possible. Indices created with CREATE UNIQUE INDEX can't be deferred at all).

Since we can't defer constraint checks, I've made it work using a little hack. For example, if we need to move all timestamps by one day, the script will move timestamps by 1000 years and one day, and then return timestamps back by 1000 years. The script use this hack only for columns that have unique constraints.
2021-08-12 19:24:21 +04:00
Vinoth Kannan cd9262b7d3
DEV: minor improvements in the vanilla import script. (#14026)
We're parsing the post raw based on the record format now.
2021-08-12 15:07:44 +05:30
Andrei Prigorshnev b66674fec2
DEV: ignore the given_daily_likes table when moving timestamps on Try (#13971)
This will fix the try-reset build that failed today. Probably this going to happen again with other tables that have constraints on date columns. I'm going to modify the script to make it work without ignoring such tables. After that, the only table we're going to need to ignore will be the 2FA table.

Before I fixed that, don't hesitate to tag me if the try-reset build fail again.
2021-08-06 18:27:23 +04:00
Ruoxin Wang f9aaed7020
FIX: MyBB importer exposes deleted posts (#13700)
The MyBB importer exposes posts marked as deleted in MyBB. This may lead to a privacy issue.
2021-07-22 09:55:02 +02:00
Joffrey JAFFEUX a10fd95a7e
DEV: removes unused version_bump script (#13811) 2021-07-21 10:35:24 -04:00
Joffrey JAFFEUX 265e32e3e2
DEV: uses main instead of master in version_bump script (#13805) 2021-07-21 10:40:35 +02:00
Andrei Prigorshnev efc8d5f134
DEV: ignore the 2FA table when moving timestamps (#13793)
or 2FA will be broken after moving. We use this script for moving timestamps when restoring Try.
2021-07-20 15:49:20 +04:00
Andrei Prigorshnev 7fe3e4bab8
DEV: fix joining in the script for moving timestamps (#13721)
Without checking if t.table_schema = '#{@schema}' the SELECT with JOIN in the script were returning every column twice in case there is a 'backup' scheme with exactly the same tables as in the 'public scheme'
2021-07-13 17:21:15 +04:00
Andrei Prigorshnev 8a0349a01f
DEV: ignore some tables when updating timestamps using db_timestamps_mover.rb (#13714)
Some tables have constraints on columns with a date which can cause problems when moving timestamps. By now I think it's enough to just ignore them.
2021-07-13 12:36:41 +04:00
Andrei Prigorshnev 696bd0bf05
DEV: add a script for moving timestamps in database (#13682)
We're going to use this script for updating timestamps on Try, but it can be used with a local database during development as well.

Usage:

Commands:
  ruby db_timestamp_updater.rb yesterday <date> move all timestamps by x days so that <date> will be moved to yesterday
  ruby db_timestamp_updater.rb 100              move all timestamps forward by 100 days
  ruby db_timestamp_updater.rb -100             move all timestamps backward by 100 days
The script moves all timestamps in the database by the same amount of days forward or backward. No need to change the script if we add a new column in the future.

The more simple solution would be just to move timestamps in several tables (topics, posts, and so on). I didn't want to go that way because it could generate additional work in the future. For example, if we add a new column with a timestamp and users can see that timestamp we'd need to add that column to the script. Or, for example, if we move a post's timestamp to the future but forget to move a timestamp of topic timer or user action it can cause weird bugs.
2021-07-09 20:04:28 +04:00
Alan Guo Xiang Tan 37b8ce79c9
FEATURE: Add last visit indication to topic view page. (#13471)
This PR also removes grey old unread bubble from the topic badges by
dropping `TopicUser#highest_seen_post_number`.
2021-07-05 14:17:31 +08:00
Dan Ungureanu 6ea4bbd2ec
DEV: Prefer .pluck_first over .pluck.first (#13607) 2021-07-02 10:03:54 +08:00
Ikko Ashimine 88da06cba0 FIX: typo in discourse
occurences -> occurrences
2021-06-30 11:08:29 -04:00
David Taylor 20070f5089
DEV: Update script/promote_migrations (#13513)
Introduces the --plugins-base flag for updating plugins in a different directory. Followup to 49f39434c4
2021-06-24 13:57:23 +01:00
Jarek Radosz 046a875222
DEV: Improve `script/downsize_uploads.rb` (#13508)
* Only shrink images that are used in Posts and no other models
* Don't save the upload if the size is the same
2021-06-24 00:09:40 +02:00
David Taylor 49f39434c4 DEV: Introduce script/promote_migrations tool
Post-deploy migrations exist to allow for seamless Discourse upgrades. By design, they cause migrations to run out of numerical order. This has the potential to cause some unexpected edge cases. To reduce the likelihood of these edge cases, we will promote historical post_deploy migrations to regular migrations after a full Discourse stable release cycle.

This script is intended to be run at least during every Discourse release cycle.

This means that truly seamless upgrades will not be possible between non-consecutive Discourse versions. (Upgrades will still work, but may cause some server errors for users during the upgrade)
2021-06-23 17:43:38 +01:00
Bianca Nenciu 5efed91128 FIX: Set random values for digest_attempted_at
Setting a random value in the interval 1 week ago ... now works better
because this spreads digest scheduling over a week because digests are
sent one week from the date of the last digest.
2021-06-22 12:05:15 +08:00
Arpit Jalan 365d339985
DEV: fix Flarum import script (#13385) 2021-06-15 19:08:55 +05:30
Jarek Radosz 3bb765ac92
DEV: Remove the remaining Travis code (#13255)
The second attempt at #10041 now that all our plugins use GitHub Actions CI instead.
2021-06-02 20:29:47 +02:00
Lecter 4bf195d502
FEATURE: Flarum import script (#13139) 2021-05-27 02:30:50 +02:00
Bianca Nenciu d0779a87bb
FIX: Use max_category_nesting when importing categories (#13105)
It allowed for a parent category and a sub-category.
2021-05-26 12:40:26 +03:00
Josh Soref 59097b207f
DEV: Correct typos and spelling mistakes (#12812)
Over the years we accrued many spelling mistakes in the code base. 

This PR attempts to fix spelling mistakes and typos in all areas of the code that are extremely safe to change 

- comments
- test descriptions
- other low risk areas
2021-05-21 11:43:47 +10:00
Blake Erickson 4b7b586861 DEV: Remove unused rswag script
We no longer use this script for helping generate rswag formatted docs because we can use json schema now.
2021-05-07 09:36:55 +08:00
Justin DiRose c1517e428e
DEV: Add vBulletin5 bulk importer (#12904)
This is a pretty straightforward bulk importer, just tailored to the vBulletin 5 database structure.

Also made a few minor improvements to the base importer -- should be self explanatory in the code.
2021-04-30 11:03:33 -05:00
David Taylor 8b87cb07c3
DEV: Return a non-zero exit code when `discourse remap` is aborted (#12873)
This makes it much easier to integrate the script with external tools
2021-04-28 14:10:35 +01:00
Pilou e7892df10d
FIX: handle charset=windows-1252 in mbox import script (#12832)
Co-authored-by: Loïc Dachary <loic@dachary.org>
2021-04-27 15:43:31 +02:00
Michael Maroszek 5bec0e5763
fix vbulletin importer to import unreferenced attachments (#12187) 2021-04-19 21:05:16 +02:00
Justin DiRose e302e32a3f
DEV: Add Higher Logic import script (#12623)
Wrote up a new script to import from Higher Logic. Nothing too crazy going on here. Two major things about this script:

    It requires you to convert a Microsoft SQL file to a format MySQL can read.
    Higher Logic stores posts (at least in the case of the import I ran) with the email thread shown in the post body. The script does its best to truncate this out, but the logic may need to be improved on future imports. For the import I ran, it worked just fine as is. 🤷‍♂️
2021-04-06 16:53:55 -05:00
Justin DiRose 1aaf588fb7
DEV: Improvements to vanilla_mysql importer (#12308)
Made some improvements to the Vanilla MySQL script -- mainly because not all SQL imports require use of the VanillaBodyParser. Still left it as an option to turn on and use if so desired. Also added subcategory support, importing of likes, and solve status.
2021-03-11 10:21:56 -06:00
Jarek Radosz a60e26e799
DEV: Clean up and refactor CI workflow(s) (#12144)
Includes:

* DEV: Remove external plugin linting (that's covered by CI in their repositories)
* DEV: Move lint stages to a separate workflow (partial de-`if`-ication of workflows)
* DEV: Run CI on `main` branch too
* DEV: Update postgres to 13
* DEV: Update redis to 6.x

Other changes:
* DEV: Remove matrix.os
* DEV: Remove env.BUILD_TYPE
* DEV: Remove env.TARGET
* DEV: Rename `build_types` config option to `build_type`
* DEV: Lowercase `target` and `build_type` names
* DEV: Rename `ci` to `tests`
* DEV: Rename `lint` to `linting`
* DEV: Lower the wizard qunit timeout (30 min -> 10)
* DEV: Ruby version is no longer configurable
* DEV: Run plugin tests only in the `plugins` target
* DEV: Use binstubs where applicable
* DEV: We don't open PRs to `tests-passed`
2021-02-22 10:28:32 +01:00
Blake Erickson dbcda617b3
DEV: Add a CSV importer for restoring deleted users (#12147)
This is an importer I wrote to restore some users that were
accidentally deleted for being purged as old staged users or old
unactivated users.

It reads from CSV files exported from a discourse sql backup.
2021-02-19 13:46:54 -07:00
Blake Erickson ed0e4582a1
DEV: If disabled do not change setting after import (#12142)
When running an import script there are many site settings that are
changed but we reset them back to where they were originally before the
import. However, there are two settings that we don't roll back:

```
purge_unactivated_users_grace_period_days
purge_deleted_uploads_grace_period_days
```

which could have some unintended consequences.

My first question is do we *really* have to change these settings? I'm
not a huge fan of changing someones settings without them really knowing
they were changed.

If we really do have to change these settings here is my proposed PR
where we don't alter the `purge_unactivated_users_grace_period_days` if
it has been disabled.

As I'm writing this another change we could make is that we don't change
either of these site settings if we detect that they aren't set to the
default values.

The drive behind this PR is that there is a discourse instance which
relies on staged users as part of their workflow and this setting was
changed by accident via the import script causing users to be deleted
that shouldn't have been.
2021-02-19 09:33:35 -07:00
Michael Maroszek 144584aacb
fix vbulletin importer to hide soft-deleted posts (#12057)
equal to theads posts can be soft-deleted which results in a visibile = 2 state. at the moment those posts will be imported fully visible.
2021-02-12 14:29:05 +01:00
Gerhard Schlager 4d719725c8
FEATURE: Allow overriding the backup location when restoring via CLI (#12015)
You can use `discourse restore --location=local FILENAME` if you want to restore a backup that is stored locally even though the `backup_location` has the value `s3`.
2021-02-09 16:02:44 +01:00
David Taylor 821bb1e8cb
FEATURE: Rename 'Discourse SSO' to DiscourseConnect (#11978)
The 'Discourse SSO' protocol is being rebranded to DiscourseConnect. This should help to reduce confusion when 'SSO' is used in the generic sense.

This commit aims to:
- Rename `sso_` site settings. DiscourseConnect specific ones are prefixed `discourse_connect_`. Generic settings are prefixed `auth_`
- Add (server-side-only) backwards compatibility for the old setting names, with deprecation notices
- Copy `site_settings` database records to the new names
- Rename relevant translation keys
- Update relevant translations

This commit does **not** aim to:
- Rename any Ruby classes or methods. This might be done in a future commit
- Change any URLs. This would break existing integrations
- Make any changes to the protocol. This would break existing integrations
- Change any functionality. Further normalization across DiscourseConnect and other auth methods will be done separately

The risks are:
- There is no backwards compatibility for site settings on the client-side. Accessing auth-related site settings in Javascript is fairly rare, and an error on the client side would not be security-critical.
- If a plugin is monkey-patching parts of the auth process, changes to locale keys could cause broken error messages. This should also be unlikely. The old site setting names remain functional, so security-related overrides will remain working.

A follow-up commit will be made with a post-deploy migration to delete the old `site_settings` rows.
2021-02-08 10:04:33 +00:00
Bianca Nenciu a71b219c9a
Improvements to phpBB3 import script (#10999)
* FEATURE: Import attachments

* FEATURE: Add support for importing multiple forums in one

* FEATURE: Add support for category and tag mapping

* FEATURE: Import groups

* FIX: Add spaces around images

* FEATURE: Custom mapping of user rank to trust levels

* FIX: Do not fail import if it cannot import polls

* FIX: Optimize existing records lookup

Co-authored-by: Gerhard Schlager <mail@gerhard-schlager.at>
Co-authored-by: Jarek Radosz <jradosz@gmail.com>
2021-01-14 21:44:43 +02:00
Arpit Jalan bd7cbcd8f8
Improve Vanilla import script. (#11701)
- import groups and group users
- import uploads/attachments
- improved code tag parsing
- improved text formatting
- mark topics as solved
2021-01-13 23:10:00 +05:30
Justin DiRose f6e87e1e5e
DEV: Improvements to Discourse Merger script (#11660)
After running the Discourse merge script, it was pretty evident it held up well after all these years ;)

Made a few fixes:

    Included an environment variable for DB_PASS as likely the password will need to be changed if running the import in an official Docker container (recommended)
    Set a hard order for imported categories, otherwise sometimes they'd be imported in a weird order making things unpredictable for parent/child category imports
    Fixed a couple of instances where we added unique indexes (such as on category slugs)
    Set up upload regex to handle AWS URLs better
    Fixed the script to work with frozen string literals
2021-01-08 09:31:39 -06:00
Gerhard Schlager fc9155f3ee
DEV: Lint MessageFormat strings to prevent usage of "one {foo 1 bar}" (#11608)
Follow-up to 6b53f26f
2021-01-04 12:29:20 +01:00
Gerhard Schlager 6b53f26fc0
DEV: Lint MessageFormat strings to prevent usage of "one {1 foo}" (#11605) 2020-12-29 21:42:47 +01:00
Régis Hanol a85d5edbf1
DEV: set digest_attempted_at during migrations (#11369) 2020-12-14 10:58:14 +11:00
David Taylor cf21de0e7a
DEV: Migrate Github authentication to ManagedAuthenticator (#11170)
This commit adds an additional find_user_by_email hook to ManagedAuthenticator so that GitHub login can continue to support secondary email addresses

The github_user_infos table will be dropped in a follow-up commit.

This is the last core authenticator to be migrated to ManagedAuthenticator 🎉
2020-11-10 10:09:15 +00:00
Jarek Radosz 2f4a1ff61b
DEV: Update rubocop-discourse from 2.3.2 to 2.4.0 (#11079)
Also fixes whitespace related issues raised by rubocop.
2020-10-30 15:04:29 +01:00
Gerhard Schlager 57095f0bb7 FIX: Killing a Unicorn worker shouldn't kill a running backup or restore process
By spawning and forking the backup and restore, the process owner changes from 🦄 to the init process.
2020-10-13 19:48:53 +02:00
Shane Dela Rosa 8b678b5dbe Update vanilla_body_parser for stability 2020-09-23 09:54:26 -04:00
Gerhard Schlager 76477a1c8b FIX: Forking prevented notifications from being sent after backup
This is a workaround for https://github.com/rubyjs/mini_racer/issues/175
2020-09-18 17:35:17 +02:00
Rachel Carvalho 812e0d6b5e
FIX: improve Vanilla importing (#10478)
* ensure emails don't have spaces
* import banned users as suspended for 1k yrs
* upgrade users to TL2 if they have comments
* topic: import views, closed and pinned info
* import messages
* encode vanilla usernames for permalinks. Vanilla usernames can contain spaces and special characters.
* parse Vanilla's new rich body format
2020-08-24 16:19:57 -04:00
Neil Lalonde 530058918e
DEV: Support env var for prometheus port in cache_critical_dns 2020-08-17 15:48:14 -04:00
Robin Ward 7df57b35da REFACTOR: Remove `Discourse.__widget_helpers`
It's now a variable in the context where the templates are created.
2020-08-06 14:35:46 -04:00
Justin DiRose d266baff2b
DEV: Improve q2a import script (#10346)
The changes here are largely pulled from the below gist by RGJ on meta.discourse.org. Thanks much for the changes!

https://gist.github.com/discoursehosting/769eff2014d5482f0ab776de03dc3349
2020-07-31 12:04:03 -05:00
Krzysztof Kotlarek e0d9232259
FIX: use allowlist and blocklist terminology (#10209)
This is a PR of the renaming whitelist to allowlist and blacklist to the blocklist.
2020-07-27 10:23:54 +10:00
Discourse Translator Bot 29788f2c26 DEV: Switch from Transifex to Crowdin 2020-07-16 14:00:02 +02:00
Krzysztof Kotlarek 93ff54e184
FIX: improvements for vanilla bulk import (#10212)
Adjustments to the base:
1. PG connection doesn't require host - it was broken on import droplet
2. Drop `topic_reply_count` - it was removed here - https://github.com/discourse/discourse/blob/master/db/post_migrate/20200513185052_drop_topic_reply_count.rb
3. Error with `backtrace.join("\n")` -> `e.backtrace.join("\n")`
4. Correctly link the user and avatar to quote block

Adjustments to vanilla:
1. Top-level Vanilla categories are valid categories
2. Posts have `format` column which should be used to decide if the format is HTML or Markdown
3. Remove no UTF8 characters
4. Remove not supported HTML elements like `font` `span` `sub` `u`
2020-07-14 15:58:27 +10:00
Régis Hanol 7a6d772ad2 DEV: couple bug fixes in getsatisfaction importer
- Ensure we don't modify a frozen string
- Ensure we have a slug before trying to create a permalink
2020-07-06 17:41:28 +02:00
Gerhard Schlager 1f053173a4 FEATURE: Import script for jForum 2020-06-21 12:12:42 +02:00
Régis Hanol 47a1157458 DEV: various bugfixes in bulk importer 2020-06-19 17:53:06 +02:00
Régis Hanol 5143309014 DEV: ensure values are converted to integers in bulk importer 2020-06-18 17:42:14 +02:00
Bernhard Suttner e31471585a
DEV: allow to have duplicate topic titles if categegory is different (#10034)
Co-authored-by: Robin Ward <robin.ward@gmail.com>

Co-authored-by: Robin Ward <robin.ward@gmail.com>
2020-06-18 11:19:47 -04:00
Régis Hanol 823b940b9d PERF: improve loading of indexes in bulk import
Similar strategy as for c52191d in which we stream the results from the database into
an automatically growing array instead of using a hash.
2020-06-18 16:32:27 +02:00
Régis Hanol c52191d49e PERF: improve loading a imported_ids in bulk imports
- Stream the queries that load the imported_ids
- Use an array instead of a hash for keeping the mapping between imported_ids and new ids
- Ensure we always treat the imported_ids as integers instead of strings
2020-06-16 19:55:08 +02:00
Jarek Radosz 669c940ec3 Revert "DEV: Remove the remaining ENV["TRAVIS"] usage (#10041)"
This reverts commit 78aff841e3.

See https://review.discourse.org/t/dev-remove-the-remaining-env-travis-usage-10041/12737/4?u=cvx
2020-06-16 19:42:00 +02:00
Jarek Radosz 78aff841e3
DEV: Remove the remaining ENV["TRAVIS"] usage (#10041) 2020-06-16 17:41:15 +02:00
Jarek Radosz 3d55f2e3b7
FIX: Improvements and fixes to the image downsizing script (#9950)
Fixed bugs, added specs, extracted the upload downsizing code to a class, added support for non-S3 setups, changed it so that images aren't downloaded twice.

This code has been tested on production and successfully resized ~180k uploads.

Includes:

* DEV: Extract upload downsizing logic
* DEV: Add support for non-S3 uploads
* DEV: Process only images uploaded by users
* FIX: Incorrect usage of `count` and `exist?` typo
* DEV: Spec S3 image downsizing
* DEV: Avoid downloading images twice
* DEV: Update filesizes earlier in the process
* DEV: Return false on invalid upload
* FIX: Download images that currently above the limit (If the image size limit is decreased, then there was no way to resize those images that now fall outside the allowed size range)
* Update script/downsize_uploads.rb (Co-authored-by: Régis Hanol <regis@hanol.fr>)
2020-06-11 14:47:59 +02:00
Justin DiRose be28fc73a0
DEV: Improvements to Drupal script (#10016)
Refactors script to follow conventions of other importers and adds some features including like import, processing of post raw text, and, if needed, SSO import.
2020-06-10 10:59:17 -05:00
Arpit Jalan a93d24501c FIX: base import script was not updating first_post_created_at column
FEATURE: new rake task to update first_post_created_at column

The not-equal operator (`<>`) in PostgreSQL does not compare values
with NULL. We should instead use `IS DISTINCT FROM` when comparing
values with NULL.
2020-06-04 11:26:40 +05:30
Gerhard Schlager f683c5d0e0 DEV: Check English locale for errors in CI
Moves the most important checks into a linter. It gets executed by Lefthook as well as the docker rake task and Github actions. Doing those checks in rspec takes too long and it produces errors when the discourse:test Docker image contains old, invalid locale files.
2020-06-03 21:54:58 +02:00
Jarek Radosz 4babdf510b DEV: Update facter usage
`Facter.reset` (65d167eac9/lib/facter.rb (L126-L137)) clears `Facter::Options[:external_dir]` which seems to be the 4.x equivalent of `Facter::Util::Config.external_facts_dirs`.

This commit also makes sure that version 4.0 or higher is installed.
2020-06-01 05:50:49 +02:00
Jarek Radosz 8c8391c742 DEV: Don't require wget in script/bench 2020-06-01 05:50:48 +02:00
Jarek Radosz eb462bfb3d
FIX: Improve image downsizing script (#9549)
Correctly handles more upload formats in posts, updates post custom fields, fixes more edge cases, adds debugging capabilities. (VERBOSE=1 and INTERACTIVE=1 flags)

Includes these commits and some more:

* DEV: Show the fixed image dimensions
* FIX: Support more upload url formats
* DEV: Remove the old upload after updating posts
* FIX: Use the `process_post_#{id}` mutex
* FIX: Avoid rebaking twice
* DEV: Print out the link to the post
* DEV: Process posts chronologically
* DEV: Do a dry-run before saving, pause on any issue
* FIX: Also process deleted posts
* DEV: Make matchers case-insensitive
* DEV: Pause on "detached" uploads, add more debug info
* DEV: Print out time when finished
* DEV: Add support for WORKER_ID/WORKER_COUNT
* DEV: Fix the onebox in cooked text heuristic
* DEV: Don't report already processed posts
* DEV: Beep when done!
* DEV: Ignore issues with deleted posts
* DEV: Ignore issues with deleted topics
* DEV: Multiline SQL
* DEV: Use the bulk attribute assignment
* DEV: Add ENV["INTERACTIVE"] mode
* DEV: Handle post custom fields
* DEV: Bail on non-S3 sites
* DEV: Allow sizes smaller than 1 mpix
2020-05-26 15:38:23 +02:00
Kane York 869f9b20a2
PERF: Dematerialize topic_reply_count (#9769)
* PERF: Dematerialize topic_reply_count

It's only ever used for trust level promotions that run daily, or compared to 0. We don't need to track it on every post creation.

* UX: Add symbol in TL3 report if topic reply count is capped

* DEV: Drop user_stats.topic_reply_count column
2020-05-14 15:42:00 -07:00
Blake Erickson ab919332dc
DEV: api documentation updates (#9612)
* DEV: api documentation updates

- Created a script to convert json responses to rswag
- Documented several api endpoints
- Switched rswag to use header based auth

* Update script, fix some schema missmatches
2020-05-11 13:06:49 -06:00
Francesco Frassinelli c6f68e4006
Fix syntax error in fluxbb.rb (#9727) 2020-05-11 11:07:57 -04:00
Kyle E. Mitchell 6c968b7945
Add script for compiling copyright deposits (#9646)
* Add script for compiling copyright deposits

* git mv copyright-deposit script/
2020-05-06 12:51:45 -04:00
Krzysztof Kotlarek 9bff0882c3
FEATURE: Nokogumbo (#9577)
* FEATURE: Nokogumbo

Use Nokogumbo HTML parser.
2020-05-05 13:46:57 +10:00
Martin Brennan 867bc3b48e
FIX: Change base importer to create new Bookmark records (#9603)
Also add a spec and fixture with a mock importer that we can use to test the create_X methods of the base importer
2020-05-01 11:34:55 +10:00
Sam Saffron d0d5a138c3
DEV: stop freezing frozen strings
We have the `# frozen_string_literal: true` comment on all our
files. This means all string literals are frozen. There is no need
to call #freeze on any literals.

For files with `# frozen_string_literal: true`

```
puts %w{a b}[0].frozen?
=> true

puts "hi".frozen?
=> true

puts "a #{1} b".frozen?
=> true

puts ("a " + "b").frozen?
=> false

puts (-("a " + "b")).frozen?
=> true
```

For more details see: https://samsaffron.com/archive/2018/02/16/reducing-string-duplication-in-ruby
2020-04-30 16:48:53 +10:00
discoursehosting 094ddb1c1f
VBulletin5 importer improvements (#9477)
- no more hard coded contenttypes
- permalinks for topics, categories, subcategories
- better uploads handling
- tag support
2020-04-22 22:04:59 +02:00
Jarek Radosz 28c706bd09
DEV: Do less work in docker_test (#9470)
* DEV: Update the working tree just once.

`git pull` was effectively doing `git fetch` and `git merge FETCH_HEAD`, and only then we were checking out the desired branch/commit. This change will skip the the merge step.

* DEV: Don't run lefthook in docker_test
2020-04-21 03:48:58 +02:00
Jay Pfaffman 3c72cbc5de
FIX: Google groups import changed login URL (#9432)
I'm not clear why changing only the `wait_for_url` address was necessary and not also the `get` a few lines above, but this change seems to work for me on both literatecomputing.com Groups and a public group.
2020-04-15 16:45:14 -04:00
Gerhard Schlager 2b2584912a Improve Telligent import script
* Imports private messages
* Replaces internal links for topics and replies
* Allows incremental import of accepted answers
2020-04-03 18:10:52 +02:00
Robin Ward b2b7afd310 Rename the server side widget hbs compiler 2020-03-27 12:06:14 -04:00
Gerhard Schlager 739430c01e FIX: mbox import failed if no tags were configured 2020-03-26 16:41:11 +01:00
Gerhard Schlager d216483c53 FIX: Importing with pgbouncer failed
Checking if all records have been imported uses a temp table in PostgreSQL. This fails when pgbouncer is used unless the temp table is created inside a transaction.
2020-03-26 16:41:09 +01:00
Gerhard Schlager c94b63bc75 DEV: Improve import of attachments from Telligent 2020-03-26 16:37:55 +01:00
Jarek Radosz d21d80198c
DEV: Update rubocop-discourse (#9270)
Includes:
* DEV: Use `eq_time` matcher
2020-03-26 16:32:41 +01:00
Robin Ward eaa324ecbd Revert "Move the widget-hbs compiler to js from es6"
This reverts commit 5d66a2c16e.
2020-03-25 16:13:26 -04:00
Robin Ward 5d66a2c16e Move the widget-hbs compiler to js from es6 2020-03-25 15:03:21 -04:00
Gerhard Schlager 5b2b769eb7 DEV: Ensure uploads aren't deleted during imports
Sidekiq might delete uploads if you, for some reason, create upload records before using them in posts.
2020-03-24 17:14:16 +01:00
Gerhard Schlager 445b35381d Improve Telligent import script
* Detects mostly all attachments and it's a lot faster
* Parses user properties in Ruby instead of the DB, because that's less errorprone
* Imports user avatars
* Imports topic views by users
* Better handling of quotes and YouTube links
2020-03-23 09:18:12 +01:00
Gerhard Schlager d27ece9ded FIX: Method from Telligent import script was deleted by accident 2020-03-14 22:10:40 +01:00
Gerhard Schlager e825f47daa DEV: Better handling of incremental scrapes for Google Groups 2020-03-14 00:00:36 +01:00
Gerhard Schlager 0a88232e87 DEV: Improve mbox import script
* Better documentation of settings
* Add option to exclude trimmed parts of emails (enabled by default) to not revail email addresses
2020-03-14 00:00:36 +01:00
Gerhard Schlager 36062f43c8 DEV: Improve Telligent import script
* Adds ability to map forums to categories and tags as well as ignore forums.
* Fixes regular expression for detecting attachments in posts.
* Handles "remote attachments" 😮 by inserting a link.
* Imports view counts for topics.
* Handles incorrect references of parent posts.
* Better handling of quotes.
* Finds a lot more attachments by trying to replace various Unicode characters in filenames.
2020-03-14 00:00:36 +01:00
Gerhard Schlager ba1b840816 DEV: Don't deactivate suspended users during import
Otherwise a cleanup job might delete those deactivated users.
2020-03-14 00:00:36 +01:00
Justin DiRose 6c948f27ea
FIX: Missing constant in SMF2 importer (#9178) 2020-03-11 10:19:59 -05:00
Gerhard Schlager 0e752db411 DEV: Improve mbox import script
* Customizable email subject prefixes to remove "Re" and "Fwd" as well as localized prefixes.
* Configuration option for prefixes like [FOO] or (BAR) which can be replaced with tags during import.
* Bugfix: Import script might have skipped some users due to missing ORDER BY.
2020-03-09 10:26:45 +01:00
Gerhard Schlager edc8d58ac3 FEATURE: Add site setting to disable staged user cleanup
... and disabled the cleanup during imports, otherwise a running Sidekiq might delete users before posts are created
2020-03-09 10:26:41 +01:00
Jarek Radosz f7ea2fdea5
FIX: Import posts of missing users from phpbb3 (#9085)
Posts without a user probably shouldn't happen unless there was some direct database tampering, but data like that has been seen in the wild.

The importer will assign those posts to the "system" user.
2020-03-06 22:54:40 +01:00
Gerhard Schlager d7ccb58559 FIX: Google Groups scraper failed to login 2020-03-02 17:24:48 +01:00
Brad Morrical ff5ff8d0d2
fix invalid byte sequence in UTF-8 (ArgumentError) (#9077) 2020-02-28 10:26:18 -05:00
Justin DiRose f35ee5e887
DEV: Improvements to SMF2 script (#9006) 2020-02-24 12:51:45 -06:00
Sam Saffron 28292d2759
PERF: avoid shelling to get hostname aggressively
Previously we had many places in the app that called `hostname` to get
hostname of a server. This commit replaces the pattern in 2 ways

1. We cache the result in `Discourse.os_hostname` so it is only ever called once

2. We prefer to use Socket.gethostname which avoids making a shell command

This improves performance as we are not spawning hostname processes throughout
the app lifetime
2020-02-18 15:13:19 +11:00
David Taylor 5919618a87
DEV: Drop legacy OpenID 2.0 support (#8894)
This is not used in core or official plugins, and has been printing a deprecation notice since v2.3.0beta4. All OpenID 2.0 code and dependencies have been dropped. The user_open_ids table remains for now, in case anyone has missed the deprecation notice, and needs to migrate their data.

Context at https://meta.discourse.org/t/-/113249
2020-02-07 17:32:35 +00:00
Régis Hanol 0843e3e6ce FIX: add support for sub-sub-categories in base_importer
Also delegates 'post_already_imported?' and 'user_already_imported?' to the base importer.
2020-02-05 10:40:28 +01:00
Jarek Radosz 8a82ceb3bc
FIX: Improve downsize_uploads (#8409)
With this change the script:
* Actually removes original large-sized images
* Doesn't save processed files if their size has increased
* Prevents inconsistent state
2020-01-27 03:31:11 +01:00
Gerhard Schlager ab07b945c2
Merge pull request #8736 from gschlager/rename_reply_id_column
REFACTOR: Rename `post_replies.reply_id` column to `post_replies.reply_post_id`
2020-01-17 17:24:49 +01:00
Gerhard Schlager e474cda321 REFACTOR: Restoring of backups and migration of uploads to S3 2020-01-14 11:41:35 +01:00
Sam Saffron 710eafdd35 FIX: ensure we consistently pick the same topic for bench
We pick the first topic with 30 responses as our bench topic.

Previously we simply picked the last topic, but hand no guarantee on ordering.

This also attempts to correct previous runs of the bench.
2020-01-08 16:33:45 +11:00