Commit Graph

2972 Commits

Author SHA1 Message Date
Krzysztof Kotlarek 9bff0882c3
FEATURE: Nokogumbo (#9577)
* FEATURE: Nokogumbo

Use Nokogumbo HTML parser.
2020-05-05 13:46:57 +10:00
Krzysztof Kotlarek 37e93914fc
FIX: the muted message should be sent after edit (#9593)
Recently, we added feature that we are sending `/muted` to users who muted specific topic just before `/latest` so the client knows to ignore those messages - https://github.com/discourse/discourse/pull/9482

Same `/muted` message should be included when the post is edited
2020-05-01 08:33:57 +10:00
Vinoth Kannan 71241a50f7 DEV: improve code readability & add tests for user guardian.
a511bea4cc
2020-04-30 20:59:33 +05:30
Régis Hanol 501b19b6e0
FIX: server-side HtmlToMarkdown improvements (#9586)
TLDR; this commit vastly improves how whitespaces are handled when converting from HTML to Markdown.
It also adds support for converting HTML <tables> to markdown tables.

The previous 'remove_whitespaces!' method was traversing the whole HTML tree and used a heuristic to remove
leading and trailing whitespaces whenever it was appropriate (ie. mostly before and after HTML block elements)

It was a good idea, but it was very limited and leaded to bad conversion when the html had leading whitespaces on several lines for example.
One such example can be found [here](https://meta.discourse.org/t/86782).

For various reasons, most of the whitespaces in a HTML file is ignored when the page is being displayed in a browser.
The rules that the browsers follow are the [CSS' White Space Processing Rules](https://www.w3.org/TR/css-text-3/#white-space-rules).
They can be quite complicated when you take into account RTL languages and other various tidbits but they boils down to the following:

- Collapse whitespaces down to one space (0x20) inside an inline context (ie. nodes/tags that are being displaying on the same line)
- Remove any leading/trailing whitespaces inside an inline context

One quick & dirty way of getting this 90% solved would be to do 'HTML.gsub!(/[[:space:]]+/, " ")'.
We would also need to hoist <pre> elements in order to not mess with their whitespaces.
Unfortunately, this solution let some whitespaces creep around HTML tags which leads to more '.strip!' calls than I can bear.

I decided to "emulate" the browser's handling of whitespaces and came up with a solution in 4 parts

1. remove_not_allowed!

The HtmlToMarkdown library is recursively "visiting" all the nodes in the HTML in order to convert them to Markdown.
All the nodes that aren't handled by the library (eg. <script>, <style> or any non-textual HTML tags) are "swallowed".
In order to reduce the number of nodes visited, the method 'remove_not_allowed!' will automatically delete all the nodes
that have no "visitor" (eg. a 'visit_<tag>' method) defined.

2. remove_hidden!

Similar purpose as the previous method (eg. reducing number of nodes visited), there's no point trying to convert something that is hidden.
The 'remove_hidden!' method removes any nodes that was hidden using the "hidden" HTML attribute, some CSS or with a width or height equal to 0.

3. hoist_line_breaks!

The 'hoist_line_breaks!' method is there to handle <br> tags. I know those tiny <br> don't do much but they can be quite annoying.
The <br> tags are inline elements but they visually work like a block element (ie. they create a new line).
If you have the following HTML "<i>Foo<br>Bar</i>", it ends up visually similar to "<i>Foo</i><br><i>Bar</i>".
The latter being much more easy to process than the former, so that's what this method is doing.
The "hoist_line_breaks" will hoist <br> tags out of inline tags until their parent is a block element.

4. remove_whitespaces!

The "remove_whitespaces!" is where all the whitespace removal is happening. It's broken down into 4 methods as well

- remove_whitespaces!
- is_inline?
- collapse_spaces!
- remove_trailing_space!

The 'remove_whitespace!' method is recursively walking the HTML tree (skipping <pre> tags).
If a node has any children, they will be chunked into groups of inline elements vs block elements.
For each chunks of inline elements, it will call the "collapse_space!" and "remove_trailing_space!" methods.
For each chunks of block elements, it will call "remote_whitespace!" to keep walking the HTML tree recursively.

The "is_inline?" method determines whether a node is part of a inline context.
A node is inline iif it's a text node or it's an inline tag, but not <br>, and all its children are also inline.

The "collapse_spaces!" method will collapse any kind of (white) space into a single space (" ") character, even accros tags.
For example, if we have "  Foo \n<i> Bar </i>\t42", it will return "Foo <i>Bar </i>42".

Finally, the "remove_trailing_space!" method is there to remove any trailing space that might creep in at the end of the inline chunk.

This solution is not 100% bullet-proof.
It does not support RTL languages at all and has some caveats that I felt were not worth the work to get properly fixed.

FIX: better detection of hidden elements when converting HTML to Markdown
FIX: take into account the 'allowed_href_schemes' site setting when converting HTML <a> to Markdown
FIX: added support for 'mailto:' scheme when converting <a> from HTML to Markdown
FIX: added support for <img> dimensions when converting from HTML to Markdown
FIX: added support for <dl>, <dd> and <dt> when converting from HTML to Markdown
FIX: added support for multilines emphases, strongs and strikes when converting from HTML to Markdown
FIX: added support for <acronym> when converting from HTML to Markdown
DEV: remove unused 'sanitize' gem

Wow, did you just read all that?! Congratz, here's a cookie: 🍪.
2020-04-30 12:21:25 +02:00
Sam Saffron d0d5a138c3
DEV: stop freezing frozen strings
We have the `# frozen_string_literal: true` comment on all our
files. This means all string literals are frozen. There is no need
to call #freeze on any literals.

For files with `# frozen_string_literal: true`

```
puts %w{a b}[0].frozen?
=> true

puts "hi".frozen?
=> true

puts "a #{1} b".frozen?
=> true

puts ("a " + "b").frozen?
=> false

puts (-("a " + "b")).frozen?
=> true
```

For more details see: https://samsaffron.com/archive/2018/02/16/reducing-string-duplication-in-ruby
2020-04-30 16:48:53 +10:00
Roman Rizzi 394babcae3
FIX: Only show the review page to users that can see it. Do not publish the reviewable count update message to everyone. (#9556) 2020-04-27 14:51:25 -03:00
Sam Saffron 8c1e008c59
DEV: Skip erratic spec for now
Spec fails intermittently due to CDN state.
2020-04-25 13:20:04 +10:00
Penar Musaraj 585e7bcfe8 DEV: update specs followup to 67e96f6 2020-04-23 16:11:17 -04:00
Arpit Jalan 39be639c37 FIX: update GitHub screen_name on login via GitHub 2020-04-23 20:54:26 +05:30
Dan Ungureanu 4e5f9d4cd1
DEV: Drop 'key' column from user_api_keys (#9388) 2020-04-22 12:13:19 +03:00
Jarek Radosz ab52bed014
DEV: Remove the return value of disable_if_low_on_disk_space (#9469)
It was used only in specs.
2020-04-21 03:48:33 +02:00
Jarek Radosz 5a81e3999c
DEV: Remove `bypass_bump` from CookedPostProcessor (#9468)
It was only passing it along to `PullHotlinkedImages` and that class have not used that arg since April 2014 (c52ee665b4)
2020-04-21 03:48:19 +02:00
Robin Ward 8f5314bf98 FIX: An `opts` hash was not, in fact, optional :) 2020-04-20 14:17:13 -04:00
Jeff Wong e3590d4ead
FEATURE: add user_session_refreshed trigger (#9412)
Trigger an event for plugins to consume when a user session is refreshed.

This allows external auth to be notified about account activity, and be
able to take action such as use oauth refresh tokens to keep oauth
tokens valid.
2020-04-14 09:32:24 -07:00
Robin Ward e1f8014acd
FEATURE: Support for publishing topics as pages (#9364)
If the feature is enabled, staff members can construct a URL and publish a
topic for others to browse without the regular Discourse chrome.

This is useful if you want to use Discourse like a CMS and publish
topics as articles, which can then be embedded into other systems.
2020-04-08 12:52:36 -04:00
Sam Saffron 0375a5ac0b
DEV: reduce logging when no external id is specified
Previously we were returning an unknown sso error and logging a message
when external id was blank. This noise is not needed.
2020-04-08 12:42:28 +10:00
Blake Erickson d04ba4b3b2
DEPRECATION: Remove support for api creds in query params (#9106)
* DEPRECATION: Remove support for api creds in query params

This commit removes support for api credentials in query params except
for a few whitelisted routes like rss/json feeds and the handle_mail
route.

Several tests were written to valid these changes, but the bulk of the
spec changes are just switching them over to use header based auth so
that they will pass without changing what they were actually testing.

Original commit that notified admins this change was coming was created
over 3 months ago: 2db2003187

* fix tests

* Also allow iCalendar feeds

Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>
2020-04-06 16:55:44 -06:00
Joffrey JAFFEUX b248c553c2
DEV: ensures CustomEmoji cache is cleared after spec (#9361) 2020-04-06 19:41:59 +02:00
Vinoth Kannan fd39c85c1a FIX: add category hashtags support for sub-sub categories.
Hashtags will include last two levels only (ex: "parent:child").
2020-04-06 20:43:38 +05:30
Krzysztof Kotlarek ce00da3bcd
FIX: guardian always got user but sometimes it is anonymous (#9342)
* FIX: guardian always got user but sometimes it is anonymous

```
  def initialize(user = nil, request = nil)
    @user = user.presence || AnonymousUser.new
    @request = request
  end
```

AnonymouseUser defines `blank?` method
```
  class AnonymousUser
    def blank?
      true
    end
    ...
  end
```
so if we would use @user.present? it would be correct, however, just @user is always true
2020-04-06 09:56:47 +10:00
Daniel Waterworth 76610acb6f FIX: Default to light theme in wizard so that previews are displayed
Previously, without a theme selection, the previews wouldn't show.
2020-04-02 18:37:45 +01:00
Kane York cdaa60b56b FEATURE: Allow admins to disable self-service account deletion
https://meta.discourse.org/t/-/146276
2020-04-01 15:16:07 -07:00
Mark VanLandingham 689c61b462
DEV: Allow plugins to add wizard steps after specific steps (#9315) 2020-04-01 08:36:50 -05:00
Arpit Jalan b2a0d34bb7
FEATURE: add setting `auto_approve_email_domains` to auto approve users (#9323)
* FEATURE: add setting `auto_approve_email_domains` to auto approve users

This commit adds a new site setting `auto_approve_email_domains` to
auto approve users based on their email address domain.

Note that if a domain already exists in `email_domains_whitelist` then
`auto_approve_email_domains` needs to be duplicated there as well,
since users won’t be able to register with email address that is
not allowed in `email_domains_whitelist`.

* Update config/locales/server.en.yml

Co-Authored-By: Robin Ward <robin.ward@gmail.com>
2020-03-31 23:59:15 +05:30
Joffrey JAFFEUX 0996c3b7b3
FEATURE: allows multiple custom emoji groups (#9308)
Note: DBHelper would fail with a sql syntax error on columns like "group".

Co-authored-by: Jarek Radosz <jradosz@gmail.com>
2020-03-30 20:16:10 +02:00
Bianca Nenciu 7952cbb9a2
FIX: Perform crop using user-specified image sizes (#9224)
* FIX: Perform crop using user-specified image sizes

It used to resize the images to max width and height first and then
perform the crop operation. This is wrong because it ignored the user
specified image sizes from the Markdown.

* DEV: Use real images in test
2020-03-26 16:40:00 +02:00
Sam Saffron 25f1f23288
FEATURE: Stricter rules for user presence
Previously we would consider a user "present" and "last seen" if the
browser window was visible.

This has many edge cases, you could be considered present and around for
days just by having a window open and no screensaver on.

Instead we now also check that you either clicked, transitioned around app
or scrolled the page in the last minute in combination with window
visibility

This will lead to more reliable notifications via email and reduce load of
message bus for cases where a user walks away from the terminal
2020-03-26 17:36:52 +11:00
Mark VanLandingham c14f6d4ced
FEATURE: Allow plugins to exclude wizard steps (#9275) 2020-03-25 11:36:42 -05:00
Kane York 58ae0d4bd9
DEV: Add test case for /srv/status probers (#9259) 2020-03-24 16:28:07 +11:00
Sam Saffron 10b37e1e36
FIX: add support for sub-sub category slugs in search
Previous to this change slugs for leaves in 3 level nestings would not work

Our UX picks only the last two levels

This also makes the results consistent for slugs as it enforces order.
2020-03-20 15:36:50 +11:00
David Taylor 19814c5e81
FIX: Allow CSP to work correctly for non-default hostnames/schemes (#9180)
- Define the CSP based on the requested domain / scheme (respecting force_https)
- Update EnforceHostname middleware to allow secondary domains, add specs
- Add URL scheme to anon cache key so that CSP headers are cached correctly
2020-03-19 19:54:42 +00:00
Vinoth Kannan f6d6f1701f FIX: use the new duration attribute in `set_or_create_timer` method.
New `duration` attribute is introduced for the `set_or_create_timer` method in the commit aad12822b7 for "based on last post" and "auto delete replies" topic timers.
2020-03-19 21:45:05 +05:30
Dan Ungureanu 1393950dbc
FIX: Improve HTML to Markdown conversion (#9231)
This commit ensures that whitespaces are preserved in <pre>, but removed
inside text paragraphs.
2020-03-18 19:31:10 +02:00
David Taylor 3723c64257
DEV: Correct references to theme flags
Followup to d1474e94
2020-03-13 16:45:55 +00:00
David Taylor 3d71b68195
DEV: Introduce plugin api for conditionally rendering assets (#9200) 2020-03-13 15:30:31 +00:00
Martin Brennan 2237ba8c9d
FIX: Add topic deleted check to email/sender (#9166)
It already had a deleted post check and log reason, add a topic one too to avoid errors
2020-03-13 10:04:15 +10:00
Daniel Waterworth 59578dfc5b FIX: Notification emails with attachments are incorrectly structured
Two behaviors in the mail gem collide:

 1. Attachments are added as extra parts at the top level,
 2. When there are both text and html parts, the content type is set to
    'multipart/alternative'.

Since attachments aren't alternative renderings, for emails that contain
attachments and both html and text parts, some coercing is necessary.
2020-03-12 15:42:24 +00:00
Martin Brennan 793f39139a
FEATURE: Send notifications for time-based and At Desktop bookmark reminders (#9071)
* This PR implements the scheduling and notification system for bookmark reminders. Every 5 minutes a schedule runs to check any reminders that need to be sent before now, limited to **300** reminders at a time. Any leftover reminders will be sent in the next run. This is to avoid having to deal with fickle sidekiq and reminders in the far-flung future, which would necessitate having a background job anyway to clean up any missing `enqueue_at` reminders.

* If a reminder is sent its `reminder_at` time is cleared and the `reminder_last_sent_at` time is filled in. Notifications are only user-level notifications for now.

* All JavaScript and frontend code related to displaying the bookmark reminder notification is contained here. The reminder functionality is now re-enabled in the bookmark modal as well.

* This PR also implements the "Remind me next time I am at my desktop" bookmark reminder functionality. When the user is on a mobile device they are able to select this option. When they choose this option we set a key in Redis saying they have a pending at desktop reminder. The next time they change devices we check if the new device is desktop, and if it is we send reminders using a DistributedMutex. There is also a job to ensure consistency of these reminders in Redis (in case Redis drops the ball) and the at desktop reminders expire after 20 days.

* Also in this PR is a fix to delete all Bookmarks for a user via `UserDestroyer`
2020-03-12 10:16:00 +10:00
David Taylor d1474e94a1
FEATURE: Allow themes to specify modifiers in their about.json file (#9097)
There are three modifiers:
- serialize_topic_excerpts (boolean)
- csp_extensions (array of strings)
- svg_icons (array of strings)

When multiple themes are active, the values will be combined. The combination method varies based on the setting. CSP/SVG arrays will be combined. serialize_topic_excerpts will use `Enumerable#any`.
2020-03-11 13:30:45 +00:00
Jarek Radosz 29b35aa64c
DEV: Improve flaky time-sensitive specs (#9141) 2020-03-10 22:13:17 +01:00
Jarek Radosz aec26ad2f0
FIX: Preserve TopicCreator's timestamp resolution (#9158)
Continuation of #9140 (e35bc8b). It's the last piece required for #9141.
2020-03-10 15:35:40 +01:00
Roman Rizzi 826b4793c0
FEATURE: Approve suspect users is now true by default. The suspect users list was removed (#9151) 2020-03-10 08:56:42 -03:00
Jarek Radosz e35bc8bebd
FIX: Preserve PostCreator's created_at resolution (#9140)
PostMover passes to PostCreator a `created_at` that is a `ActiveSupport::WithTimeZone` instance (and also `is_a? Time`). Previously it was always being passed through `Time.zone.parse` so it would lose sub-second information. Now, it takes `Time` input as-is, while still parsing other types.
2020-03-09 17:38:13 +01:00
Joffrey JAFFEUX 60b47d622e
UX: adds support for a color setting type (#9016) 2020-03-09 10:07:03 +01:00
David Taylor 5b3630dba3
FIX: Do not raise an error when in:all search is performed by anon (#9113)
Also improve in:all specs to catch to catch similar failures
2020-03-05 17:50:29 +00:00
Sam Saffron e23e247dff
DEV: spec suite fails on leap years
Something about year math fails here on leap years, just
set up the clock so it is consistent to avoid it.
2020-02-29 18:30:08 +11:00
Dan Ungureanu 8cbb6e35cb
DEV: Fix build
Follow up to 60184a290c.
2020-02-28 11:31:04 +02:00
Robin Ward a47e0a3fda FIX: TOTP could not be used on sites with colons in their names
This is because the TOTP gem identifies as a colon as an addressable
protocol. The solution for now is to remove the colon in the issuer
name.

Changing the issuer changes the token values, but now it was completely
broken for colons so this should not be breaking anyone new.
2020-02-20 16:35:30 -05:00
Roman Rizzi c7787464cd
FEATURE: Admins can configure the reflag cooldown window and if posts flagged as spam by TL3+ users get automatically hidden (#9010) 2020-02-20 14:43:33 -03:00
Penar Musaraj 7a09e2cce2 DEV: Improve video onebox stripping spec
Followup to 70819080
2020-02-20 11:45:12 -05:00