We've had a couple of problems with the R2 gem where it generated a broken RTL CSS bundle that caused a badly broken layout when Discourse is used in an RTL language, see a3ce93b and 5926386. For this reason, we're replacing R2 with `rtlcss` that can handle modern CSS features better than R2 does.
`rltcss` is written in JS and available as an npm package. Calling the `rltcss` from rubyland is done via the `rtlcss_wrapper` gem which contains a distributable copy of the `rtlcss` package and loads/calls it with Mini Racer. See https://github.com/discourse/rtlcss_wrapper for more details.
Internal topic: t/76263.
This should resolve these warnings under Ruby 3.1
```
warning: Passing safe_level with the 2nd argument of ERB.new is deprecated
```
Unfortunately Sprockets 3.x has not seen a rubygems release since 2018, so we need to fetch these improvements via git.
Our fork was needed for OpenSSL 3 and Ruby 2.X compatibility.
The OpenSSL 3 part was merged into the gem for version 3.
Discourse dropped support for Ruby 2.X.
That means we don't need our fork anymore.
This commit introduces the necessary gems and config, but adds all our ruby code directories to the `--ignore-files` list.
Future commits will apply syntax_tree to parts of the codebase, removing the ignore patterns as we go
* FIX: Ensure we have a patched version of CGI gem
Per https://github.com/ruby/cgi/pull/29 the current shipped version of
the CGI gem doesn't allow for leading dots in domain names, which breaks
setting cookies like `.example.com`.
* Update Gemfile
Co-authored-by: Jarek Radosz <jradosz@gmail.com>
Co-authored-by: Jarek Radosz <jradosz@gmail.com>
* Allow taking table prefix from env var
* FIX: remove unused column references
The columns `filedata` and `extension` are not present in a v4.2.4
database, and they aren't used in the method anyways.
* FIX: report progress for tables without imported_id
* FIX: effectively check for AR validation errors
NOTE: other migration scripts also have this problem; see /t/58202
* FIX: properly count Posts when importing attachments
* FIX: improve logging
* Remove leftover comment
* FIX: show progress when exporting Permalink file
* PERF: stream Permalink file
The current way results in tons of memory usage; write once per line instead
* Document fixes needed
* WIP - deduplicate category names
* Ignore non alphanumeric chars for grouping
* FIX: properly deduplicate user emails by merging accounts
* FIX: don't merge empty UserEmails
* Improve logging
* Merge users AFTER fixing primary key sequences
* Parallelize user merging
* Save duplicated users structure for debugging purposes
* Add progress logging for the (multiple hour) user merging step
We now use Ember CLI (core/plugins) and DiscourseJSProcessor (themes) for all Ember and template compilation. This commit removes the remnants of the legacy Sprockets-based Ember compilation system.
Sprockets, and its DiscourseJSProcess-based Babel transformations, is still in use for a few assets. Ideally that will be removed/replaced in the near future.
`Faraday` is very commonly used in official and third-party plugins, and we will likely be increasing our use of it in core. This commit adds it as a direct dependency and adds the official faraday-retry gem which is very commonly used (e.g. by Octokit).
This commit introduces rails system tests run with chromedriver, selenium,
and headless chrome to our testing toolbox.
We use the `webdrivers` gem and `selenium-webdriver` which is what
the latest Rails uses so the tests run locally and in CI out of the box.
You can use `SELENIUM_VERBOSE_DRIVER_LOGS=1` to show extra
verbose logs of what selenium is doing to communicate with the system
tests.
By default JS logs are verbose so errors from JS are shown when
running system tests, you can disable this with
`SELENIUM_DISABLE_VERBOSE_JS_LOGS=1`
You can use `SELENIUM_HEADLESS=0` to run the system
tests inside a chrome browser instead of headless, which can be useful to debug things
and see what the spec sees. See note above about `bin/ember-cli` to avoid
surprises.
I have modified `bin/turbo_rspec` to exclude `spec/system` by default,
support for parallel system specs is a little shaky right now and we don't
want them slowing down the turbo by default either.
### PageObjects and System Tests
To make querying and inspecting parts of the page easier
and more reusable inbetween system tests, we are using the
concept of [PageObjects](https://www.selenium.dev/documentation/test_practices/encouraged/page_object_models/) in
our system tests. A "Page" here is generally corresponds to
an overarching ember route, e.g. "Topic" for `/t/324345/some-topic`,
and this contains logic for querying components within the topic
such as "Posts".
I have also split "Modals" into their own entity. Further down the
line we may want to explore creating independent "Component"
contexts.
Capybara DSL should be included in each PageObject class,
reference for this can be found at https://rubydoc.info/github/teamcapybara/capybara/master#the-dsl
For system tests, since they are so slow, we want to focus on
the "happy path" and not do every different possible context
and branch check using them. They are meant to be overarching
tests that check a number of things are correct using the full stack
from JS and ember to rails to ruby and then the database.
### CI Setup
Whenever a system spec fails, a screenshot
is taken and a build artifact is produced _after the entire CI run is complete_,
which can be downloaded from the Actions UI in the repo.
Most importantly, a step to build the Ember app using Ember CLI
is needed, otherwise the JS assets cannot be found by capybara:
```
- name: Build Ember CLI
run: bin/ember-cli --build
```
A new `--build` argument has been added to `bin/ember-cli` for this
case, which is not needed locally if you already have the discourse
rails server running via `bin/ember-cli -u` since the whole server is built and
set up by default.
Co-authored-by: David Taylor <david@taylorhq.com>
Following the Rails 7 upgrade, the `DISCOURSE_SMTP_ENABLE_START_TLS`
setting doesn’t work anymore. This is because Rails upgraded the
`net-smtp` gem to the 0.3.1 version which enables `starttls` by default.
The `mail` gem doesn’t support this new behavior yet and doesn’t know
how to disable TLS. This should be fixed in an upcoming release.
Meanwhile applying this patch allows us to get back the previous
behavior which is expected by many.
Latest redis interoduces a block form of multi / pipelined, this was incorrectly
passed through and not namespaced.
Fix also updates logster, we held off on upgrading it due to missing functions
This reverts commit 01107e418e.
We have seen some random occurrences of corrupted assets, and think it may be related to the sprockets 4 update. Reverting for investigation
The main difference is that Sprockets 4.0 no longer tries to compile everything by default. This is good for us, because we can remove all our custom 'exclusion' logic which was working around the old sprockets 3.0 behavior.
The other big change is that lambdas can no longer be added to the `config.assets.precompile` array. Instead, we can do the necessary globs ourselves, and add the desired files manually.
A small patch is required to make ember-rails compatible. Since we plan to remove this dependency in the near future, I do not intend to upstream this change.
I have compared the `bin/rake assets:precompile` output before and after this change, and verified that all files are present.
The main difference is that Sprockets 4.0 no longer tries to compile everything by default. This is good for us, because we can remove all our custom 'exclusion' logic which was working around the old sprockets 3.0 behavior.
The other big change is that lambdas can no longer be added to the `config.assets.precompile` array. Instead, we can do the necessary globs ourselves, and add the desired files manually.
A small patch is required to make ember-rails compatible. Since we plan to remove this dependency in the near future, I do not intend to upstream this change.
I have compared the `bin/rake assets:precompile` output before and after this change, and verified that all files are present.
This reverts commit f5cf647e57.
The gem breaks usage of Rails URL helpers when used outside views and
controllers, for example in
88ecb83382/app/models/upload.rb (L239-L242)
the `upload_short_path` method call fails with an undefined method
exception when this gem is enabled.
The lazy route initialization cuts down boot time of rails.
On my local system it cuts out 200ms of boot time taking me from 3.2 to 3 seconds.
This is not a radically enormous amount of time, but paper cuts add up, and a faster boot in dev will make everyone happy.
TBD if we want to also include this in production.
Gem is heavily maintained by @amatsuda, last commit 3 days ago.
`bin/rake annotate` is an alias of `bin/annotate --models`
`bin/rake annotate:clean` generates annotations by using a temporary, freshly migrated database. This should help us to produce more consistent annotations, even if development databases have been polluted by plugin migrations.
A GitHub actions task is also added which generates annotations on a clean database, and raises an error if they differ from the committed annotations.
* Move onebox gem in core library
* Update template file path
* Remove warning for onebox gem caching
* Remove onebox version file
* Remove onebox gem
* Add sanitize gem
* Require onebox library in lazy-yt plugin
* Remove onebox web specific code
This code was used in standalone onebox Sinatra application
* Merge Discourse specific AllowlistedGenericOnebox engine in core
* Fix onebox engine filenames to match class name casing
* Move onebox specs from gem into core
* DEV: Rename `response` helper to `onebox_response`
Fixes a naming collision.
* Require rails_helper
* Don't use `before/after(:all)`
* Whitespace
* Remove fakeweb
* Remove poor unit tests
* DEV: Re-add fakeweb, plugins are using it
* Move onebox helpers
* Stub Instagram API
* FIX: Follow additional redirect status codes (#476)
Don’t throw errors if we encounter 303, 307 or 308 HTTP status codes in responses
* Remove an empty file
* DEV: Update the license file
Using the copy from https://choosealicense.com/licenses/gpl-2.0/#
Hopefully this will enable GitHub to show the license UI?
* DEV: Update embedded copyrights
* DEV: Add Onebox copyright notice
* DEV: Add MIT license, convert COPYRIGHT.txt to md
* DEV: Remove an incorrect copyright claim
Co-authored-by: Jarek Radosz <jradosz@gmail.com>
Co-authored-by: jbrw <jamie@goatforce5.org>
The main image_optim gem now includes the timeout feature
that we had in our fork. So it is now safe to switch off of our fork and
back to the image_optim gem.
This is the link to the commit in the image_optim repo that adds the
timeout option:
ec3767dde0
One difference with the new timeout implementation is that image_optim
now handles the timeout exceptions instead of bubbling them up:
1ed0328587/lib/image_optim.rb (L128-L129)
```
rescue Errors::TimeoutExceeded
handler.result
```
So a timeout will just return `nil`, which is the same response if it
couldn't optimize an image. I don't think we were really watching for
or doing anything about these timeout warnings in our logs so I think
this is an okay change to have and we will have less warnings in our
logs now too.
Rails 6.1.3.1 deprecates a few API and has some internal changes that break our tests suite, so this commit fixes all the deprecations and errors and now Discourse should be fully compatible with Rails 6.1.3.1. We also have a new release of the rails_failover gem that's compatible with Rails 6.1.3.1.
To add an extra layer of security, we sanitize settings before shipping them to the client. We don't sanitize those that have the "html" type.
The CookedPostProcessor already uses Loofah for sanitization, so I chose to also use it for this. I added it to our gemfile since we installed it as a transitive dependency.
Version 2.8 brings some changes to how address fields are handled and
this commits updates that and should also include a fix which handles
encoded attachment filenames.
The fork contains a bugfix to correctly decode mail attachments.
* DEV: Add schema checking to api doc testing
This commit improves upon rswag which lacks schema checking. rswag
really only checks that the https status matches, but this change adds
in the json-schema_builder gem which also has schema validation.
Now we can define schemas for each of our requests/responses in the
`spec/requests/api/schemas` directory which will make our documentation
specs a lot cleaner.
If we update a serializer by either adding or removing an attribute the
tests will now fail (this is a good thing!). Also if you change the type
of an attribute say from an array to a string the tests will now fail.
This will help significantly with keeping the docs in sync with actual
code changes! Now if you change how an endpoint will respond you will
have to update the docs too in order for the tests to pass. :D
This PR is inspired by:
https://www.tealhq.com/post/how-teal-keeps-their-api-tests-and-documentation-in-sync
* Swap out json schema validator gem
Swapped out the outdated json-schema_builder gem with the json_schemer
gem.
* Add validation fields to schema
In order to have "strict" validation we need to add
`additionalProperties: false` to the schema, and we need to specify
which attributes are required.
Updated the debugging test output to print out the error details if
there are any.
We are switching over to a fork because we are currently on a pinned
version of ember-rails 0.18.5 which is pretty old. Upgrading to the
latest version causes many things to break which isn't really worth the
time to debug while we plan to completely switch over to ember-cli
somewhat soonish. Our fork contains a single cherry-pick commit
https://github.com/emberjs/ember-rails/pull/534
which will fix an issue when running the `rails g migration` command and
it spits out a bunch of deprecation warnings.
We are no longer directly referencing the rb-inotify gem directly in
code. This was just a spec level dependency anyways.
Using `git log -S "Inotify"` resulted in these two commits as usages of
`Inotify`:
- b56b11d96a
- 9cf03b352c
both from 2013, but we no longer are using inotify in
https://github.com/discourse/discourse/blob/master/lib/tasks/autospec.rake
which appears to be the only file that was using it.
Based on this info we can safely remove rb-inotify from the Gemfile.
Just as a side note we still do have a couple of gems that do have
rb-inotify as a dependency: listen, and lru_redux.
* DEV: Switch our fast_xor gem for xorcist
We use the `xor` function as part of password hashing and we want to use
a faster version than the native ruby xor'ing feature so we use a gem
for this.
fast_xor has been abandoned, and xorcist fixed our initial holdup for
switching in https://github.com/fny/xorcist/issues/4
xorcist also has jruby support so we can remove our jruby fallback
logic.
* Move using statement inside of class
The rotp gem is currently pinned to version 5.1.0 and this will bump it
up to version 6.0.1.
Follow up to: 85d4370f79
because this issue we were waiting on is now closed:
https://github.com/mdp/rotp/issues/98
Because version 6 is now encoding the params I needed to update the
tests as well.
Currently we have pinned highline to version 1.7.0. This is the gem that
we use to have an interactive command line for tasks like `rake
admin:create`.
Upgrading to the latest version 2.0.3 will remove ruby 2.7 deprecation
warnings.
I'm not sure why *this* gem was pinned. I manually executed a couple of
our rake tasks that use this and everything seems fine.
Not ready for an upgrade due to: https://github.com/mdp/rotp/issues/98
The policy here is that for cases like this we pin the version and add
a comment explaining why it is pinned.
We can revisit in a few months depending on upstream.
This is very minor, see: https://github.com/advisories/GHSA-j6w9-fv6q-3q52
An attacker can elevate own cookie usage to bypass server cookie restrictions
Technically this is a security commit, but the surface area is extremely
low, we do not expect any real world impact.
This includes a fix for CVE-2020-8185 we are not vulnerable as we do not use
the impacted middleware. However it still makes sense to stay upgraded, other
small fixes exist in this release.
It adds a somewhat unnecessary middleware before `ActionDispatch::DebugExceptions` and totally bypasses it. Apps that register exception interceptors with `ActionDispatch::DebugExceptions` would therefore stop working if better_errors is used.
We removed pry-nav a while back because it is not up to date with pry but it is super useful. Luckily pry-byebug is here to save us all from Satan's power.
To get this to work you need to add the following to your $HOME/.pryrc file.
```
if defined?(PryByebug)
Pry.commands.alias_command 'c', 'continue'
Pry.commands.alias_command 's', 'step'
Pry.commands.alias_command 'n', 'next'
Pry.commands.alias_command 'f', 'finish'
end
Pry::Commands.command /^$/, "repeat last command" do
pry_instance.run_command Pry.history.to_a.last
end
```
The require-ing of pry, pry-rails, and pry-byebug in specs is controlled by the IMPROVED_SPEC_DEBUGGING flag (disabled by default).
This reverts commit 20780a1eee.
* SECURITY: re-adds accidentally reverted commit:
03d26cd6: ensure embed_url contains valid http(s) uri
* when the merge commit e62a85cf was reverted, git chose the 2660c2e2 parent to land on
instead of the 03d26cd6 parent (which contains security fixes)
Upgrades Rails to latest, this version has better compatibility
with Ruby 2.7
During the upgrade we needed a new cleaner mechanism for configuring
message bus.
All tests are green.
If anything weird pops up please revert.
This reverts commit e23f1a9071.
Reverting as this currently breaks our plugin linting job in GithHub Action and Jenkins. Will re-revert after all the plugins get the latest rubocop config and/or a (potential) rubocop issue is fixed.
rspec-rails 4.0 was released so we no longer need to depend on a
beta version. Also updates minor on a bunch of rspec gems.
Thanks to @ryanwi for raising this.
TLDR; this commit vastly improves how whitespaces are handled when converting from HTML to Markdown.
It also adds support for converting HTML <tables> to markdown tables.
The previous 'remove_whitespaces!' method was traversing the whole HTML tree and used a heuristic to remove
leading and trailing whitespaces whenever it was appropriate (ie. mostly before and after HTML block elements)
It was a good idea, but it was very limited and leaded to bad conversion when the html had leading whitespaces on several lines for example.
One such example can be found [here](https://meta.discourse.org/t/86782).
For various reasons, most of the whitespaces in a HTML file is ignored when the page is being displayed in a browser.
The rules that the browsers follow are the [CSS' White Space Processing Rules](https://www.w3.org/TR/css-text-3/#white-space-rules).
They can be quite complicated when you take into account RTL languages and other various tidbits but they boils down to the following:
- Collapse whitespaces down to one space (0x20) inside an inline context (ie. nodes/tags that are being displaying on the same line)
- Remove any leading/trailing whitespaces inside an inline context
One quick & dirty way of getting this 90% solved would be to do 'HTML.gsub!(/[[:space:]]+/, " ")'.
We would also need to hoist <pre> elements in order to not mess with their whitespaces.
Unfortunately, this solution let some whitespaces creep around HTML tags which leads to more '.strip!' calls than I can bear.
I decided to "emulate" the browser's handling of whitespaces and came up with a solution in 4 parts
1. remove_not_allowed!
The HtmlToMarkdown library is recursively "visiting" all the nodes in the HTML in order to convert them to Markdown.
All the nodes that aren't handled by the library (eg. <script>, <style> or any non-textual HTML tags) are "swallowed".
In order to reduce the number of nodes visited, the method 'remove_not_allowed!' will automatically delete all the nodes
that have no "visitor" (eg. a 'visit_<tag>' method) defined.
2. remove_hidden!
Similar purpose as the previous method (eg. reducing number of nodes visited), there's no point trying to convert something that is hidden.
The 'remove_hidden!' method removes any nodes that was hidden using the "hidden" HTML attribute, some CSS or with a width or height equal to 0.
3. hoist_line_breaks!
The 'hoist_line_breaks!' method is there to handle <br> tags. I know those tiny <br> don't do much but they can be quite annoying.
The <br> tags are inline elements but they visually work like a block element (ie. they create a new line).
If you have the following HTML "<i>Foo<br>Bar</i>", it ends up visually similar to "<i>Foo</i><br><i>Bar</i>".
The latter being much more easy to process than the former, so that's what this method is doing.
The "hoist_line_breaks" will hoist <br> tags out of inline tags until their parent is a block element.
4. remove_whitespaces!
The "remove_whitespaces!" is where all the whitespace removal is happening. It's broken down into 4 methods as well
- remove_whitespaces!
- is_inline?
- collapse_spaces!
- remove_trailing_space!
The 'remove_whitespace!' method is recursively walking the HTML tree (skipping <pre> tags).
If a node has any children, they will be chunked into groups of inline elements vs block elements.
For each chunks of inline elements, it will call the "collapse_space!" and "remove_trailing_space!" methods.
For each chunks of block elements, it will call "remote_whitespace!" to keep walking the HTML tree recursively.
The "is_inline?" method determines whether a node is part of a inline context.
A node is inline iif it's a text node or it's an inline tag, but not <br>, and all its children are also inline.
The "collapse_spaces!" method will collapse any kind of (white) space into a single space (" ") character, even accros tags.
For example, if we have " Foo \n<i> Bar </i>\t42", it will return "Foo <i>Bar </i>42".
Finally, the "remove_trailing_space!" method is there to remove any trailing space that might creep in at the end of the inline chunk.
This solution is not 100% bullet-proof.
It does not support RTL languages at all and has some caveats that I felt were not worth the work to get properly fixed.
FIX: better detection of hidden elements when converting HTML to Markdown
FIX: take into account the 'allowed_href_schemes' site setting when converting HTML <a> to Markdown
FIX: added support for 'mailto:' scheme when converting <a> from HTML to Markdown
FIX: added support for <img> dimensions when converting from HTML to Markdown
FIX: added support for <dl>, <dd> and <dt> when converting from HTML to Markdown
FIX: added support for multilines emphases, strongs and strikes when converting from HTML to Markdown
FIX: added support for <acronym> when converting from HTML to Markdown
DEV: remove unused 'sanitize' gem
Wow, did you just read all that?! Congratz, here's a cookie: 🍪.
pry-nav is not yet supported on latest pry, this holds off on
upgrading pry, which in turn holds off on upgrading deps
Stripping pry-nav for now till it works with latest pry
Latest version of Rails contains compatibility fixes for Ruby 2.7 and some
minor security fixes we would like to have
It also broke some of the multisite tests.
Rails tries to use the same connection for reading from a replica as writing
to the leader during tests, because, with everything happening in a
transaction, changes to the DB wouldn't otherwise be reflected in the
replica connection.
The difference now is that Rails tries to do this for connections opened
after the test has started which affected rails multisite connections.
The upshot of this is that, as things stand, you are likely to
experience problems if you try to connect to a different multisite DB in
a test when the `current_db` is not 'default'.
This adds rubocop-rspec, and enables some cops that were either already passing or are passing now, after fixing them in this commit.
Some new cops are disabled for now, with annotation: "TODO" or "To be decided". Those either need to be discussed first, or require manual changes, or the number of found and fixed offenses is too large to bundle them up in a single PR.
Includes:
* DEV: Update rubocop's `TargetRubyVersion` to 2.6
* DEV: Enable RSpec/VoidExpect
* DEV: Enable RSpec/SharedContext
* DEV: Enable RSpec/EmptyExampleGroup (Removed an obsolete empty spec file)
* DEV: Enable RSpec/ItBehavesLike
* DEV: Remove RSpec/ScatteredLet (It's too strict, as it doesn't recognize fab! as a let-like)
* DEV: Remove RSpec/MultipleExpectations
json is shipped out of sync with Ruby. Even though we use OJ for many things
we still use the json gem sometimes, this ensures we use the latest
b8b29e79ad/config/initializers/100-oj.rb (L9-L9)