When copying an existing upload stub temporary object
on S3 to its final destination we were not copying across
its additional headers such as content-disposition and
cache-control, which led to issues like attachments not
downloading with their original filename when clicking
the download links in posts.
This is because the metadata_directive = REPLACE option
was not being passed to object.copy_from(), so only the
source object's headers were being used. Added an option
for apply_metadata_to_destination to apply this option
conditionally, because we may not always want to replace
this metadata, but we definitely do when copying a temporary
upload.
When a user archives a personal message, they are redirected back to the
inbox and will refresh the list of the topics for the given filter.
Publishing an event to the user results in an incorrect incoming message
because the list of topics has already been refreshed.
This does mean that if a user has two tabs opened, the non-active tab
will not receive the incoming message but at this point we do not think
the technical trade-offs are worth it to support this feature. We
basically have to somehow exclude a client from an incoming message
which is not easy to do.
Follow-up to fc1fd1b416
In order to include the new/unread count in the browse more message
under suggested topics, a couple of technical changes have to be made.
1. `PrivateMessageTopicTrackingState` is now auto-injected which is
similar to how it is done for `TopicTrackingState`. This is done so
we don't have to attempt to pass the `PrivateMessageTopicTrackingState`
object multiple levels down into the suggested-topics component. While
the object is auto-injected, we only fetch the initial state and start
tracking when the relevant private messages routes has been hit and only
when a private message's suggested topics is loaded. This is
done as we do not want to add the extra overhead of fetching the inital
state to all page loads but instead wait till the private messages
routes are hit.
2. Previously, we would stop tracking once the `user-private-messages`
route has been deactivated. However, that is not ideal since
navigating out of the route and back means we send an API call to the
server each time. Since `PrivateMessageTopicTrackingState` is kept in
sync cheaply via messageBus, we can just continue to track the state
even if the user has navigated away from the relevant stages.
When forwarding emails into the group inbox, we now use the
original sender email as the from_address since
2ac9fd9dff. However, we have not
been saving the original CC addresses of the forwarded email,
which are needed to include those recipients in on the conversation
when replying via the group inbox.
This commit captures the CC addresses on the incoming email, and
makes sure the emails are created as staged users and added to the
list of topic allowed users so they are included on CC's sent by
the GroupSmtpEmail and other jobs.
When 2ac9fd9dff was done, this
affected the small post that is created when forwarding an email
into the group inbox. Instead of using the name and the email of
the user who forwarded the email, it used the original from email
and name to create the small post. So instead of something like
"Discourse Team forwarded the above email" we ended up with
"John Smith forwarded the above email" which is incorrect.
This fixes the issue by creating a staged user for the forwarding
email address (if such a user does not yet exist) and uses that
for the "forwarded" small post instead.
Other locale characters in file names (e.g. é, ä) as well
as special characters can cause issues on S3, notably the S3
copy object operation does not support these special characters.
Instead of storing the original file name in the key, which is
unnecessary, we now generate a random file name with the original
extension for the temporary file and use that for all external
upload stub operations.
The seed-fu gem resets the sequence on all the tables it touches. In some situations, this can cause primary keys to be re-used. This commit introduces a freedom patch which ensures seed-fu only touches the sequence when it is less than the id of one of the seeded records.
PresenceChannel aims to be a generic system for allow the server, and end-users, to track the number and identity of users performing a specific task on the site. For example, it might be used to track who is currently 'replying' to a specific topic, editing a specific wiki post, etc.
A few key pieces of information about the system:
- PresenceChannels are identified by a name of the format `/prefix/blah`, where `prefix` has been configured by some core/plugin implementation, and `blah` can be any string the implementation wants to use.
- Presence is a boolean thing - each user is either present, or not present. If a user has multiple clients 'present' in a channel, they will be deduplicated so that the user is only counted once
- Developers can configure the existence and configuration of channels 'just in time' using a callback. The result of this is cached for 2 minutes.
- Configuration of a channel can specify permissions in a similar way to MessageBus (public boolean, a list of allowed_user_ids, and a list of allowed_group_ids). A channel can also be placed in 'count_only' mode, where the identity of present users is not revealed to end-users.
- The backend implementation uses redis lua scripts, and is designed to scale well. In the future, hard limits may be introduced on the maximum number of users that can be present in a channel.
- Clients can enter/leave at will. If a client has not marked itself 'present' in the last 60 seconds, they will automatically 'leave' the channel. The JS implementation takes care of this regular check-in.
- On the client-side, PresenceChannel instances can be fetched from the `presence` ember service. Each PresenceChannel can be used entered/left/subscribed/unsubscribed, and the service will automatically deduplicate information before interacting with the server.
- When a client joins a PresenceChannel, the JS implementation will automatically make a GET request for the current channel state. To avoid this, the channel state can be serialized into one of your existing endpoints, and then passed to the `subscribe` method on the channel.
- The PresenceChannel JS object is an ember object. The `users` and `count` property can be used directly in ember templates, and in computed properties.
- It is important to make sure that you `unsubscribe()` and `leave()` any PresenceChannel objects after use
An example implementation may look something like this. On the server:
```ruby
register_presence_channel_prefix("site") do |channel|
next nil unless channel == "/site/online"
PresenceChannel::Config.new(public: true)
end
```
And on the client, a component could be implemented like this:
```javascript
import Component from "@ember/component";
import { inject as service } from "@ember/service";
export default Component.extend({
presence: service(),
init() {
this._super(...arguments);
this.set("presenceChannel", this.presence.getChannel("/site/online"));
},
didInsertElement() {
this.presenceChannel.enter();
this.presenceChannel.subscribe();
},
willDestroyElement() {
this.presenceChannel.leave();
this.presenceChannel.unsubscribe();
},
});
```
With this template:
```handlebars
Online: {{presenceChannel.count}}
<ul>
{{#each presenceChannel.users as |user|}}
<li>{{avatar user imageSize="tiny"}} {{user.username}}</li>
{{/each}}
</ul>
```
The generate_presigned_put endpoint for direct external uploads
(such as the one for the uppy-image-uploader) records allowed
S3 metadata values on the uploaded object. We use this to store
the sha1-checksum generated by the UppyChecksum plugin, for later
comparison in ExternalUploadManager.
However, we were not doing this for the create_multipart endpoint,
so the checksum was never captured and compared correctly.
Also includes a fix to make sure UppyChecksum is the last preprocessor to run.
It is important that the UppyChecksum preprocessor is the last one to
be added; the preprocessors are run in order and since other preprocessors
may modify the file (e.g. the UppyMediaOptimization one), we need to
checksum once we are sure the file data has "settled".
This bug was introduced by f66007ec83.
In PostJobsEnqueuer we previously did not fire the after_post_create
event and after_topic_create event for private message topics. This was
changed in the above commit in order to publish message bus messages
for topic tracking state updates. Unfortunately this caused the
NotifyMailingListSubscribers job to be enqueued for all posts including
private messages, and admins and the users involved in the PMs got
emailed the contents of the PMs if they had mailing list mode enabled.
Luckily the impact of this was mitigated by a Guardian#can_see? check
for each mailing list mode user in the NotifyMailingListSubscribers job.
We never want to notify mailing list mode subscribers for private messages
so an early return has been added there, plus the logic in PostJobsEnqueuer
has been fixed, and tests have been added to that class where there were
none before.
This is unnecessary, as when the temporary key is created
in S3Store we already include the s3_bucket_folder_path, and
the key will always start with temp/ to assist with lifecycle
rules for multipart uploads.
This was affecting Discourse.store.object_from_path,
Discourse.store.signed_url_for_path, and possibly others.
See also: e0102a5
This can be used to change the list of topic posters. For example,
discourse-solved can use this to move the user who posted the solution
after the original poster.
There are certain design decisions that were made in this commit.
Private messages implements its own version of topic tracking state because there are significant differences between regular and private_message topics. Regular topics have to track categories and tags while private messages do not. It is much easier to design the new topic tracking state if we maintain two different classes, instead of trying to mash this two worlds together.
One MessageBus channel per user and one MessageBus channel per group. This allows each user and each group to have their own channel backlog instead of having one global channel which requires the client to filter away unrelated messages.
Previously we had temp/ in the middle of the S3 key path like so
* /uploads/default/temp/randomstring/test.png (normal site)
* /sitename/uploads/default/temp/randomstring/test.png (s3 folder path site)
* /standard10/uploads/sitename/temp/randomstring/test.png (multisite site)
However this necessitates making a lifecycle rule to clean up incomplete
S3 multipart uploads for every site, something which we cannot do. It makes
much more sense to have a structure with /temp at the start of the key,
which is what this commit does:
* /temp/uploads/default/randomstring/test.png (normal site)
* /temp/sitename/uploads/default/randomstring/test.png (s3 folder path site)
* /temp/standard10/uploads/sitename/randomstring/test.png (multisite site)
This pull request introduces the endpoints required, and the JavaScript functionality in the `ComposerUppyUpload` mixin, for direct S3 multipart uploads. There are four new endpoints in the uploads controller:
* `create-multipart.json` - Creates the multipart upload in S3 along with an `ExternalUploadStub` record, storing information about the file in the same way as `generate-presigned-put.json` does for regular direct S3 uploads
* `batch-presign-multipart-parts.json` - Takes a list of part numbers and the unique identifier for an `ExternalUploadStub` record, and generates the presigned URLs for those parts if the multipart upload still exists and if the user has permission to access that upload
* `complete-multipart.json` - Completes the multipart upload in S3. Needs the full list of part numbers and their associated ETags which are returned when the part is uploaded to the presigned URL above. Only works if the user has permission to access the associated `ExternalUploadStub` record and the multipart upload still exists.
After we confirm the upload is complete in S3, we go through the regular `UploadCreator` flow, the same as `complete-external-upload.json`, and promote the temporary upload S3 into a full `Upload` record, moving it to its final destination.
* `abort-multipart.json` - Aborts the multipart upload on S3 and destroys the `ExternalUploadStub` record if the user has permission to access that upload.
Also added are a few new columns to `ExternalUploadStub`:
* multipart - Whether or not this is a multipart upload
* external_upload_identifier - The "upload ID" for an S3 multipart upload
* filesize - The size of the file when the `create-multipart.json` or `generate-presigned-put.json` is called. This is used for validation.
When the user completes a direct S3 upload, either regular or multipart, we take the `filesize` that was captured when the `ExternalUploadStub` was first created and compare it with the final `Content-Length` size of the file where it is stored in S3. Then, if the two do not match, we throw an error, delete the file on S3, and ban the user from uploading files for N (default 5) minutes. This would only happen if the user uploads a different file than what they first specified, or in the case of multipart uploads uploaded larger chunks than needed. This is done to prevent abuse of S3 storage by bad actors.
Also included in this PR is an update to vendor/uppy.js. This has been built locally from the latest uppy source at d613b849a6. This must be done so that I can get my multipart upload changes into Discourse. When the Uppy team cuts a proper release, we can bump the package.json versions instead.
When emails were forwarded to a group inbox by the email address
of the group, for example when an email ends up in spam and must
be manually forwarded to the group+site@discoursemail.com address,
the OP of the topic ended up being the group's email address instead
of the sender who originally sent the email to the group inbox.
This commit detects that an email has been forwarded using existing
tools, and if the from address matches one of the group incoming
email addresses, then we look at the forwarded email's from address
and use that instead for the incoming email from address as well as
the staged/regular user used for the Topic.user.
This will make it much cleaner to forward emails into a group inbox,
and will prevent issues with PostAlerter where the OP is double-notified
for these emails.
Currently, pinned topics are ordered by the `bumped_at` column. This behavior is not desired because it gives admins no control over the order of pinned topics. This PR makes pinned topics ordered by the `pinned_at` column. A topic that is pinned last appears first in topic lists. If an admin wants an already pinned topic to appear first in the list of pinned topics, they'll have to unpin that topic and pin it again.
Meta topic: https://meta.discourse.org/t/how-do-i-set-the-order-of-pinned-topics/16935/23?u=osama.
Emails can include the marker in a different language, depending on
site and user settings. The email receiver always looked for the marker
in default language.
Uploads can be reused between site settings. This change allows the same
upload to be exported only once and then the same file is reused. The
same applies to import.
Allow admins to configure exceptions to our Rails rate limiter.
Configuration happens in the environment variables, and work with both
IPs and CIDR blocks.
Example:
```
env:
DISCOURSE_MAX_REQS_PER_IP_EXCEPTIONS: >-
14.15.16.32/27
216.148.1.2
```