Commit Graph

72 Commits

Author SHA1 Message Date
Natalie Tay 489aac3fdd
FIX: Disallow table cells to be weighted actual articles can be main content (#27508)
For Topic Embeds, we would prefer <article> to be the main article in a topic, rather than a table cell <td> with potentially a lot of data. However, in an example URL like here, the table cell (the very large code snippet) is seen as the Topic Embed's article due to the determined content weight by the Readability library we use.

In the newly released 0.7.1 cantino/ruby-readability#94, the library has a new option to exclude the library's default <td> element into content weighting. This is more in line with the original library where they only weighted <p>. So this PR excludes the td, as seen in the tests, to allow the actual article to be seen as the article. This PR also adds the details tag into the allow-list.
2024-06-19 09:50:49 +08:00
Sérgio Saquetim 766231b102
FIX: Prevent crash importing topics on a tagged embeddable host (#27254) 2024-05-30 12:04:36 -03:00
Jean 63b7a36fac
FEATURE: Extend embeddable hosts with Individual tags and author assignments (#26868)
* FEATURE: Extend embeddable hosts with tags and author assignments
2024-05-16 15:47:01 -04:00
Sérgio Saquetim 175d2451a9
DEV: Add `topic_embed_import_create_args` plugin modifier (#26527)
* DEV: Add `topic_embed_import_create_args` plugin modifier

This modifier allows a plugin to change the arguments used when creating
 a new topic for an imported article.

For example: let's say you want to prepend "Imported: " to the title of
every imported topic. You could use this modifier like so:

```ruby
# In your plugin's code
plugin.register_modifier(:topic_embed_import_create_args) do |args|
  args[:title] = "Imported: #{args[:title]}"
  args
end
```

In this example, the modifier is prepending "Imported: " to the `title` in the `create_args` hash. This modified title would then be used when the new topic is created.
2024-04-05 12:37:53 -03:00
Angus McLeod 7dc552c9cc
DEV: Add `import_embed_unlisted` site setting (#26222) 2024-03-27 08:57:43 -04:00
Ted Johansson 7e5d2a95ee
DEV: Convert min_trust_level_to_tag_topics to groups (#25273)
We're changing the implementation of trust levels to use groups. Part of this is to have site settings that reference trust levels use groups instead. It converts the min_trust_level_to_tag_topics site setting to tag_topic_allowed_groups.
2024-01-26 13:25:03 +08:00
Penar Musaraj f2cf5434f3
Revert "DEV: Convert min_trust_level_to_tag_topics to groups (#25258)" (#25262)
This reverts commit c7e3d27624 due to
test failures. This is temporary.
2024-01-15 11:33:47 -05:00
Ted Johansson c7e3d27624
DEV: Convert min_trust_level_to_tag_topics to groups (#25258)
We're changing the implementation of trust levels to use groups. Part of this is to have site settings that reference trust levels use groups instead. It converts the min_trust_level_to_tag_topics site setting to tag_topic_allowed_groups.
2024-01-15 20:59:08 +08:00
Rafael dos Santos Silva 13735f35fb
FEATURE: Cache embed contents in the database (#25133)
* FEATURE: Cache embed contents in the database

This will be useful for features that rely on the semantic content of topics, like the many AI features



Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
2024-01-05 10:09:31 -03:00
Daniel Waterworth 6e161d3e75
DEV: Allow fab! without block (#24314)
The most common thing that we do with fab! is:

    fab!(:thing) { Fabricate(:thing) }

This commit adds a shorthand for this which is just simply:

    fab!(:thing)

i.e. If you omit the block, then, by default, you'll get a `Fabricate`d object using the fabricator of the same name.
2023-11-09 16:47:59 -06:00
Krzysztof Kotlarek 5f20748e40 SECURITY: SSRF vulnerability in TopicEmbed
Block redirects when making the final request in TopicEmbed to prevent Server Side Request Forgery (SSRF)
2023-11-09 13:39:08 +11:00
Blake Erickson 644dded000
SECURITY: Use canonical url for topic embeddings (#22085)
This prevents duplicate topics from being created when using embed_urls
that only differ on query params.
2023-06-13 11:08:08 -06:00
Sam 2ccc5fc66e
FEATURE: add support for figure and figcaption tags in embeddings (#21276)
Many blog posts use these to illustrate and images were previously omitted

Additionally strip superfluous HTML and BODY tags from embed HTML.

This was incorrectly returned from server.
2023-04-27 19:57:06 +10:00
Ted Johansson f3f30d6865
SECURITY: Encode embed url (#21133)
The embed_url in "This is a companion discussion..." could be used for
XSS.

Co-authored-by: Blake Erickson <o.blakeerickson@gmail.com>
2023-04-18 15:05:29 +08:00
David Taylor cb932d6ee1
DEV: Apply syntax_tree formatting to `spec/*` 2023-01-09 11:49:28 +00:00
Loïc Guitaut 3eaac56797 DEV: Use proper wording for contexts in specs 2022-08-04 11:05:02 +02:00
Gerhard Schlager f3b2ee8e1b
FIX: Use default locale for footer of embedded topics (#17760)
The content from the remote site and the footer get cached for 10 minutes, so Discourse should use the default locale instead of the user locale for the footer. Otherwise Discourse might cache the message in a different language.
2022-08-02 20:49:28 +02:00
Phil Pirozhkov 493d437e79
Add RSpec 4 compatibility (#17652)
* Remove outdated option

04078317ba

* Use the non-globally exposed RSpec syntax

https://github.com/rspec/rspec-core/pull/2803

* Use the non-globally exposed RSpec syntax, cont

https://github.com/rspec/rspec-core/pull/2803

* Comply to strict predicate matchers

See:
 - https://github.com/rspec/rspec-expectations/pull/1195
 - https://github.com/rspec/rspec-expectations/pull/1196
 - https://github.com/rspec/rspec-expectations/pull/1277
2022-07-28 10:27:38 +08:00
Loïc Guitaut 296aad430a DEV: Use `describe` for methods in specs 2022-07-27 16:35:27 +02:00
David Taylor c9dab6fd08
DEV: Automatically require 'rails_helper' in all specs (#16077)
It's very easy to forget to add `require 'rails_helper'` at the top of every core/plugin spec file, and omissions can cause some very confusing/sporadic errors.

By setting this flag in `.rspec`, we can remove the need for `require 'rails_helper'` entirely.
2022-03-01 17:50:50 +00:00
Jarek Radosz 2fc70c5572
DEV: Correctly tag heredocs (#16061)
This allows text editors to use correct syntax coloring for the heredoc sections.

Heredoc tag names we use:

languages: SQL, JS, RUBY, LUA, HTML, CSS, SCSS, SH, HBS, XML, YAML/YML, MF, ICS
other: MD, TEXT/TXT, RAW, EMAIL
2022-02-28 20:50:55 +01:00
Alan Guo Xiang Tan e4e37257cc FIX: Handle malformed URLs in `TopicEmbed.absolutize_urls`. 2022-01-21 11:18:54 +08:00
Bianca Nenciu cc1b45f58b
FIX: Convert URLs embedded topics to absolute form (#14975)
Sometimes the expanded post contained broken relative URLs because they
were not converted to their absolute form.
2021-11-17 16:39:49 +11:00
Dan Ungureanu 69f0f48dc0
DEV: Fix rubocop issues (#14715) 2021-10-27 11:39:28 +03:00
Roman Rizzi 4c2d5158c5
FIX: Follow the canonical URL when importing a remote topic. (#14489)
FinalDestination now supports the `follow_canonical` option, which will perform an initial GET request, parse the canonical link if present, and perform a HEAD request to it.

We use this mode during embeds to avoid treating URLs with different query parameters as different topics.
2021-10-01 12:48:21 -03:00
Rafael dos Santos Silva 2e0992c757
DEV: Allow TopicEmbed.import to optionally receive a list of tags (#14301)
This will be used by the rss-polling plugin
2021-09-13 17:01:59 -03:00
Rafael dos Santos Silva ef0471d7ae
DEV: Allow passing cook_method to TopicEmbed.import to override default (#14209)
DEV: Allow passing cook_method to TopicEmbed.import to override default

This will be used in the rss-polling plugin when we want to have
oneboxes on feed content, like youtube for example.
2021-09-01 15:46:39 -03:00
Rafael dos Santos Silva 560c13211a
DEV: Allow passing a category parameter when importing a topic (#14069)
This will be used in the rss pooling plugin to address the feature
request at https://meta.discourse.org/t/-/200644?u=falco
2021-08-17 18:17:07 -03:00
Joffrey JAFFEUX ed818a4a19
FIX: prevents malformed href to crash TopicEmbed (#12910)
If the associated page of a remote url passed to `TopicEmber.new(remote_url)` contained a malformed link like: `<a href="(http://foo.bar)">Baz</a>` it would raise an uncaught exception:

```
Job exception: Invalid scheme format: (http
```
2021-04-30 11:10:19 +02:00
jbrw da9b837da0
DEV: More robust processing of URLs (#11361)
* DEV: More robust processing of URLs

The previous `UrlHelper.encode_component(CGI.unescapeHTML(UrlHelper.unencode(uri))` method would naively process URLs, which could result in a badly formed response.

`Addressable::URI.normalized_encode(uri)` appears to deal with these edge-cases in a more robust way.

* DEV: onebox should use UrlHelper

* DEV: fix spec

* DEV: Escape output when rendering local links
2020-12-03 17:16:01 -05:00
Robin Ward 80a5482f28 Embedded topics are now unlisted by default
Previously this site setting `embed unlisted` defaulted to false and
empty topics would be generated for embed, but those topics tend to take
up a lot of room on the topic lists.

This new default creates invisible topics by default until they receive
their first reply.
2020-10-05 12:09:20 -04:00
Krzysztof Kotlarek e0d9232259
FIX: use allowlist and blocklist terminology (#10209)
This is a PR of the renaming whitelist to allowlist and blacklist to the blocklist.
2020-07-27 10:23:54 +10:00
Roman Rizzi a41476800b
FIX: Don't raise an exception if a topic cannot be retrieved (#9906) 2020-05-28 11:59:20 -03:00
Blake Erickson da839e6d26 SECURITY: Use FinalDestination for topic embeds 2020-05-27 09:26:09 -06:00
Michael Brown d9a02d1336
Revert "Revert "Merge branch 'master' of https://github.com/discourse/discourse""
This reverts commit 20780a1eee.

* SECURITY: re-adds accidentally reverted commit:
  03d26cd6: ensure embed_url contains valid http(s) uri
* when the merge commit e62a85cf was reverted, git chose the 2660c2e2 parent to land on
  instead of the 03d26cd6 parent (which contains security fixes)
2020-05-23 00:56:13 -04:00
Jeff Atwood 20780a1eee Revert "Merge branch 'master' of https://github.com/discourse/discourse"
This reverts commit e62a85cf6f, reversing
changes made to 2660c2e21d.
2020-05-22 20:25:56 -07:00
Blake Erickson 03d26cd6f0 SECURITY: ensure embed_url contains valid http(s) uri 2020-05-22 14:54:56 -06:00
Krzysztof Kotlarek 9bff0882c3
FEATURE: Nokogumbo (#9577)
* FEATURE: Nokogumbo

Use Nokogumbo HTML parser.
2020-05-05 13:46:57 +10:00
Robin Ward 25bed4f643 FIX: Concurrency issues with making topic embedded posts visible 2020-04-20 15:11:59 -04:00
Robin Ward 56a23c68f1 FIX: Embedded topics couldn't update their titles 2020-04-20 14:27:43 -04:00
Robin Ward b6b92a562c
FEATURE: New site setting `embed_unlisted` (#9391)
If enabled, posts imported to discourse via embeddings will default to
unlisted until they receive a reply.
2020-04-13 15:17:02 -04:00
Penar Musaraj 99fd65328c FIX: Skip absolutizing URLs when source URI is invalid 2020-02-07 10:54:24 -05:00
Sam Saffron 2408d55551 FIX: embedding topics would fail with some HTML
When truncating content we try to search for first paragraph, if HTML had
no P it would fallback to first div which may have nested elements.
2019-08-07 12:45:55 +10:00
Kyle Zhao 0e1d6151b9 FIX: Frozen string error in `TopicEmbed.import` (#7938)
When `SiteSetting.embed_truncate` is enabled (by default), the truncated
string is mutatable and does not raise an error.

However, when the setting is disabled, the `contents` string is frozen
and immutable, and will raise a `FrozenError`.
2019-07-25 09:21:01 -04:00
Daniel Waterworth e219588142 DEV: Prefabrication (test optimization) (#7414)
* Introduced fab!, a helper that creates database state for a group

It's almost identical to let_it_be, except:

 1. It creates a new object for each test by default,
 2. You can disable it using PREFABRICATION=0
2019-05-07 13:12:20 +10:00
Sam Saffron 4ea21fa2d0 DEV: use #frozen_string_literal: true on all spec
This change both speeds up specs (less strings to allocate) and helps catch
cases where methods in Discourse are mutating inputs.

Overall we will be migrating everything to use #frozen_string_literal: true
it will take a while, but this is the first and safest move in this direction
2019-04-30 10:27:42 +10:00
Kyle Zhao e25a6e085e FIX: drop title updates through RSS feeds
can create an update loop
2018-08-28 16:25:04 +10:00
Guo Xiang Tan 932195d828 DEV: Update test case for `TopicEmbed`. 2018-08-24 09:42:12 +08:00
Kyle Zhao baf413d527 FIX: update TopicEmbed's title and user correctly 2018-08-21 18:31:01 +08:00
Sam bdd9775869 improve spec 2018-05-02 17:16:00 +10:00