* FEATURE: Let sites add a sitemap.xml file.
This PR adds the same features discourse-sitemap provides to core. Sitemaps are only added to the robots.txt file if the `enable_sitemap` setting is enabled and `login_required` disabled.
After merging discourse/discourse-sitemap#34, this change will take priority over the sitemap plugin because it will disable itself. We're also using the same sitemaps table, so our migration won't try to create it
again using `if_not_exists: true`.
Sometimes we need to update a _lot_ of ACLs on S3 (such as when secure media
is enabled), and since it takes ~1s per upload to update the ACL, this is
best spread out over many jobs instead of having to do the whole thing serially.
In future, it will be better to have a job that can be run based on
a column on uploads (e.g. acl_stale) so we can track progress, similar
to how we can set the baked_version to nil to rebake posts.
raw_html posts (i.e. those which are pulled as part of our comments integration) don't go through our markdown pipeline, so `upload://` URLs are not supported. Running pull_hotlinked_images will break any images in the post.
In future we may add support for pulling hotlinked images in these posts. But for now, disabling it will stop it breaking images.
This commit introduces a new use_polymorphic_bookmarks site setting
that is default false and hidden, that will be used to help continuous
development of polymorphic bookmarks. This setting **should not** be
enabled anywhere in production yet, it is purely for local development.
This commit uses the setting to enable create/update/delete actions
for polymorphic bookmarks on the server and client side. The bookmark
interactions on topics/posts are all usable. Listing, searching,
sending bookmark reminders, and other edge cases will be handled
in subsequent PRs.
Comprehensive UI tests will be added in the final PR -- we already
have them for regular bookmarks, so it will just be a matter of
changing them to be for polymorphic bookmarks.
The meaning of reminder_at and reminder_last_sent_at changed after
commit 6d422a8033. A bookmark reminder
will fire only if reminder_last_sent_at is null, but before that it
fired everytime reminder_at was set. This is no longer true because
sometimes reminder_at continues to exist even after a reminder fired.
The user can select what happens with a bookamrk after it expires. New
option allow bookmark's reminder to be kept even after it has expired.
After a bookmark's reminder notification is created, the reminder date
will be highlighted in red until the user resets the reminder date.
User can do that using the new Clear Reminder button from the dropdown.
This allows text editors to use correct syntax coloring for the heredoc sections.
Heredoc tag names we use:
languages: SQL, JS, RUBY, LUA, HTML, CSS, SCSS, SH, HBS, XML, YAML/YML, MF, ICS
other: MD, TEXT/TXT, RAW, EMAIL
We validate the *format* of email addresses in many places with a match against
a regex, often with very slightly different syntax.
Adding a separate EmailAddressValidator simplifies the code in a few spots and
feels cleaner.
Deprecated the old location in case someone is using it in a plugin.
No functionality change is in this commit.
Note: the regex used at the moment does not support using address literals, e.g.:
* localpart@[192.168.0.1]
* localpart@[2001:db8::1]
This commit introduces two new APIs for handling unused uploads, one
can be used to exclude uploads in bulk when the data model allow and
the other one excludes uploads one by one.
Whenever we got a bounced email in the Email::Receiver we
previously would just set bounced: true on the EmailLog and
discard the status/diagnostic code. This commit changes this
flow to store the bounce error code (defined in the RFC at
https://www.iana.org/assignments/smtp-enhanced-status-codes/smtp-enhanced-status-codes.xhtml)
not just in the Email::Receiver, but also via webhook events
from other mail services and from SNS.
This commit does not surface the bounce error in the UI,
we can do that later if necessary.
This commit introduces our own handling and warning for Sidekiq's new 'non-json-serializable' warning. This decouples us from Sidekiq's own deprecation cycle, and allows us to use our own deprecation system. It also means that the dump/parse happens in test mode, which will help us to catch occurrences before they reach production.
Job arguments go via JSON, and so symbols will appear as strings in the Job's `#execute` method. The latest version of Sidekiq has started warning about this to reduce developer confusion.
The latest version of Sidekiq introduced a warning when jobs are queued with arguments which 'do not stringify to JSON safely'. In the vast majority of cases, this is because a hash is passed with symbols as keys. When those args are passed to the job, the keys will be stringified.
Our job wrapper already takes care of this issue by calling '.with_indifferent_access' on the args before passing them to `#execute`, so we don't need to change anything about our use. All we need to do is satisfy Sidekiq's warning system by 'stringifying' all the keys before enqueuing the job.
* File.exists? is deprecated and removed in Ruby 3.2 in favor of
File.exist?
* Dir.exists? is deprecated and removed in Ruby 3.2 in favor of
Dir.exist?
This commit introduces scheduled problem checks for the admin dashboard, which are long running or otherwise cumbersome problem checks that will be run every 10 minutes rather than every time the dashboard is loaded. If these scheduled checks add a problem, the problem will remain until it is cleared or until the scheduled job runs again.
An example of a check that should be scheduled is validating credentials against an external provider.
This commit also introduces the concept of a `priority` to the problems generated by `AdminDashboardData` and the scheduled checks. This is `low` by default, and can be set to `high`, but this commit does not change any part of the UI with this information, only adds a CSS class.
I will be making a follow up PR to check group SMTP credentials.
We send the reminder using the GroupMessage class, which supports removing previous messages. We can't match them by raw because they could mention different moderators. Also, I had to change the subject to remove dynamically generated values, which is necessary for finding them.
This commit introduces a new site setting "google_oauth2_hd_groups". If enabled, group information will be fetched from Google during authentication, and stored in the Discourse database. These 'associated groups' can be connected to a Discourse group via the "Membership" tab of the group preferences UI.
The majority of the implementation is generic, so we will be able to add support to more authentication methods in the near future.
https://meta.discourse.org/t/managing-group-membership-via-authentication/175950
Old OnceOff job could perform pretty slowly on sites with millions of emails
New implementation operates in batches in a migration, minimizing locking.
This commit adds token_hash and scopes columns to email_tokens table.
token_hash is a replacement for the token column to avoid storing email
tokens in plaintext as it can pose a security risk. The new scope column
ensures that email tokens cannot be used to perform a different action
than the one intended.
To sum up, this commit:
* Adds token_hash and scope to email_tokens
* Reuses code that schedules critical_user_email
* Refactors EmailToken.confirm and EmailToken.atomic_confirm methods
* Periodically cleans old, unconfirmed or expired email tokens
When this setting is turned on, it will check that normalized emails
are unique. Normalized emails are emails without any dots or plus
aliases.
This setting can be used to block use of aliases of the same email
address.
Sometimes, a user may have a malformed email such as
`test@test.com<mailto:test@test.com` their email address,
and as a topic participant will be included as a CC email
when sending a GroupSmtpEmail. This causes the CC parsing to
fail and further down the line in Email::Sender the code
to check the CC addresses expects an array but gets a string
instead because of the parse failure.
Instead, we can just check if the CC addresses are valid
and drop them if they are not in the GroupSmtpEmail job.
An upstream validation bug in the aws-sdk-sns library could enable RCE under certain circumstances. This commit updates the upstream gem, and adds additional validation to provide defense-in-depth.
This reverts commit f5cf647e57.
The gem breaks usage of Rails URL helpers when used outside views and
controllers, for example in
88ecb83382/app/models/upload.rb (L239-L242)
the `upload_short_path` method call fails with an undefined method
exception when this gem is enabled.
The lazy route initialization cuts down boot time of rails.
On my local system it cuts out 200ms of boot time taking me from 3.2 to 3 seconds.
This is not a radically enormous amount of time, but paper cuts add up, and a faster boot in dev will make everyone happy.
TBD if we want to also include this in production.
Gem is heavily maintained by @amatsuda, last commit 3 days ago.
We don't actually use the reminder_type for bookmarks anywhere;
we are just storing it. It has no bearing on the UI. It used
to be relevant with the at_desktop bookmark reminders (see
fa572d3a7a)
This commit marks the column as readonly, ignores it, and removes
the index, and it will be dropped in a later PR. Some plugins
are relying on reminder_type partially so some stubs have been
left in place to avoid errors.
We don't need no stinkin' denormalization! This commit ignores
the topic_id column on bookmarks, to be deleted at a later date.
We don't really need this column and it's better to rely on the
post.topic_id as the canonical topic_id for bookmarks, then we
don't need to remember to update both columns if the bookmarked
post moves to another topic.
Discourse is sending regularly message to admins when potential problems are persisted. Most of the time they have exactly the same content. In that case, when there are no replies, the old one should be trashed before a new one is created.
PresenceChannel aims to be a generic system for allow the server, and end-users, to track the number and identity of users performing a specific task on the site. For example, it might be used to track who is currently 'replying' to a specific topic, editing a specific wiki post, etc.
A few key pieces of information about the system:
- PresenceChannels are identified by a name of the format `/prefix/blah`, where `prefix` has been configured by some core/plugin implementation, and `blah` can be any string the implementation wants to use.
- Presence is a boolean thing - each user is either present, or not present. If a user has multiple clients 'present' in a channel, they will be deduplicated so that the user is only counted once
- Developers can configure the existence and configuration of channels 'just in time' using a callback. The result of this is cached for 2 minutes.
- Configuration of a channel can specify permissions in a similar way to MessageBus (public boolean, a list of allowed_user_ids, and a list of allowed_group_ids). A channel can also be placed in 'count_only' mode, where the identity of present users is not revealed to end-users.
- The backend implementation uses redis lua scripts, and is designed to scale well. In the future, hard limits may be introduced on the maximum number of users that can be present in a channel.
- Clients can enter/leave at will. If a client has not marked itself 'present' in the last 60 seconds, they will automatically 'leave' the channel. The JS implementation takes care of this regular check-in.
- On the client-side, PresenceChannel instances can be fetched from the `presence` ember service. Each PresenceChannel can be used entered/left/subscribed/unsubscribed, and the service will automatically deduplicate information before interacting with the server.
- When a client joins a PresenceChannel, the JS implementation will automatically make a GET request for the current channel state. To avoid this, the channel state can be serialized into one of your existing endpoints, and then passed to the `subscribe` method on the channel.
- The PresenceChannel JS object is an ember object. The `users` and `count` property can be used directly in ember templates, and in computed properties.
- It is important to make sure that you `unsubscribe()` and `leave()` any PresenceChannel objects after use
An example implementation may look something like this. On the server:
```ruby
register_presence_channel_prefix("site") do |channel|
next nil unless channel == "/site/online"
PresenceChannel::Config.new(public: true)
end
```
And on the client, a component could be implemented like this:
```javascript
import Component from "@ember/component";
import { inject as service } from "@ember/service";
export default Component.extend({
presence: service(),
init() {
this._super(...arguments);
this.set("presenceChannel", this.presence.getChannel("/site/online"));
},
didInsertElement() {
this.presenceChannel.enter();
this.presenceChannel.subscribe();
},
willDestroyElement() {
this.presenceChannel.leave();
this.presenceChannel.unsubscribe();
},
});
```
With this template:
```handlebars
Online: {{presenceChannel.count}}
<ul>
{{#each presenceChannel.users as |user|}}
<li>{{avatar user imageSize="tiny"}} {{user.username}}</li>
{{/each}}
</ul>
```
This bug was introduced by f66007ec83.
In PostJobsEnqueuer we previously did not fire the after_post_create
event and after_topic_create event for private message topics. This was
changed in the above commit in order to publish message bus messages
for topic tracking state updates. Unfortunately this caused the
NotifyMailingListSubscribers job to be enqueued for all posts including
private messages, and admins and the users involved in the PMs got
emailed the contents of the PMs if they had mailing list mode enabled.
Luckily the impact of this was mitigated by a Guardian#can_see? check
for each mailing list mode user in the NotifyMailingListSubscribers job.
We never want to notify mailing list mode subscribers for private messages
so an early return has been added there, plus the logic in PostJobsEnqueuer
has been fixed, and tests have been added to that class where there were
none before.
There are certain design decisions that were made in this commit.
Private messages implements its own version of topic tracking state because there are significant differences between regular and private_message topics. Regular topics have to track categories and tags while private messages do not. It is much easier to design the new topic tracking state if we maintain two different classes, instead of trying to mash this two worlds together.
One MessageBus channel per user and one MessageBus channel per group. This allows each user and each group to have their own channel backlog instead of having one global channel which requires the client to filter away unrelated messages.