* FEATURE: use canonical links in posts.rss feed
Previously we used non-canonical links in posts.rss.
Crawlers fetch these links frequently when discovering new content, so
each visit forced a hop through a non-canonical page just to end up on
the canonical page.
This uses up expensive crawl time and adds load on Discourse sites.
Old links were of the form:
`https://DOMAIN/t/SLUG/43/21`
New links are of the form:
`https://DOMAIN/t/SLUG/43?page=2#post_21`
This also adds the previously missing element identified by post_id to
the crawler view.
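
The mapping from a post number to the new link form can be sketched as below. This is only an illustration, not the code in this commit: `canonical_post_url` and its `posts_per_page` parameter are hypothetical names, the page size of 20 is an assumption that happens to match the example above (post 21 lands on page 2), and omitting `?page=` on the first page is also assumed. It ignores deleted posts, so post number is treated as position.

```ruby
# Hypothetical helper: builds a canonical post link from a post number.
# posts_per_page defaults to an assumed value, not one taken from this commit.
def canonical_post_url(base, slug, topic_id, post_number, posts_per_page: 20)
  # Integer division: posts 1..20 are on page 1, 21..40 on page 2, and so on.
  page = ((post_number - 1) / posts_per_page) + 1
  url = "#{base}/t/#{slug}/#{topic_id}"
  url += "?page=#{page}" if page > 1
  "#{url}#post_#{post_number}"
end

puts canonical_post_url("https://DOMAIN", "SLUG", 43, 21)
# prints https://DOMAIN/t/SLUG/43?page=2#post_21
```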
Note: to avoid the very expensive N+1 queries required to figure out
which page a post is on during RSS generation, we cache that information.
There is a smart "cache breaker" which ensures the worst case scenario is
"page drift": we would publicize that a post is on page 11 when it is
actually on page 10 due to post deletions. The cache holds for up to
12 hours.
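
The caching idea can be sketched roughly as follows. Everything here is hypothetical: the `PageCache` class, the in-memory hash, and the shape of the `breaker` value are assumptions made for illustration, since this message does not describe how the real cache breaker is built. The sketch only shows the general pattern of a time-limited cache whose key includes a value that changes when the entry may be stale.

```ruby
# Hypothetical in-memory cache; the real storage and cache breaker
# are not described in this commit message.
class PageCache
  TTL_SECONDS = 12 * 60 * 60 # entries hold for up to 12 hours

  def initialize(clock: -> { Time.now.to_f })
    @clock = clock
    @entries = {}
  end

  # breaker: any cheap value that changes when the cached page may be
  # stale (assumed here; for example a topic's post count).
  def fetch(post_id, breaker)
    key = [post_id, breaker]
    entry = @entries[key]
    return entry[:page] if entry && @clock.call - entry[:at] < TTL_SECONDS

    page = yield # the expensive lookup runs only on a miss
    @entries[key] = { page: page, at: @clock.call }
    page
  end
end

cache = PageCache.new
cache.fetch(21, 30) { 2 } # miss: runs the block and stores the page
cache.fetch(21, 30) { 3 } # hit: the block is skipped and the stored page is returned
```

With this shape, a stale entry can only report a page that was correct when it was stored, which is the "page drift" described above.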
The change only impacts public post RSS feeds (`/posts.rss`).
This commit updates the RSS post content to use email formatting. Many
plugins are using the `reduce_cooked` method to format content that is
not displayed outside of the Discourse application. Using email formatting
also strips secure media and various other things that are only meant
for the Discourse client side application.
The modification date should always be a meta tag to make this less confusing, especially for imported posts.
That's more in line with how the rest of Discourse presents post dates.
* Do not show "Uncategorized" category in topics list.
* Use "BreadcrumbList" only if topic is in a category.
* Add tags list as keywords to the first post.
* Add "dateModified" even if it is the same as "datePublished".
* Show "crawler-linkback-list" only if there are links to be shown.
- Eliminate superfluous "author wrote" block
- Eliminate block-quote for all posts
- Move participant count and reply count to 1 line
- Prioritize name over username if forum requests
- Use fabrication in list controller spec to speed up spec
The data-vocabulary.org schema is being deprecated.
We're now using the BreadcrumbList data from the latest and greatest schema.org.
FIX: categories_breadcrumb helper to support more than 2 levels of categories.
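
Supporting arbitrary nesting comes down to walking the parent chain instead of assuming a fixed depth. A minimal sketch, assuming each category is a plain hash with `:name` and `:parent` keys (a stand-in for the real model) and using the hypothetical name `breadcrumb_for`:

```ruby
# Hypothetical helper: returns categories ordered from root to leaf,
# following :parent links, so any nesting depth works.
def breadcrumb_for(category)
  trail = []
  while category
    trail.unshift(category)
    category = category[:parent]
  end
  trail
end

root = { name: "Support", parent: nil }
mid  = { name: "Plugins", parent: root }
leaf = { name: "Chat", parent: mid }

puts breadcrumb_for(leaf).map { |c| c[:name] }.join(" > ")
# prints Support > Plugins > Chat
```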
user_url() failed for usernames containing Unicode characters because it expects URL-encoded usernames. RSS feeds do not support IRIs, so let's convert them to URIs by encoding the usernames.
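
The IRI to URI conversion amounts to percent-encoding the UTF-8 bytes of the username. A minimal sketch using Ruby's standard `ERB::Util.url_encode`; the helper name `encoded_user_url` and the `/u/` path are assumptions for illustration and are not taken from this commit:

```ruby
require "erb"

# Hypothetical helper: builds a user link that is a valid URI by
# percent-encoding the username's UTF-8 bytes.
def encoded_user_url(base, username)
  "#{base}/u/#{ERB::Util.url_encode(username)}"
end

puts encoded_user_url("https://DOMAIN", "Löwe")
# prints https://DOMAIN/u/L%C3%B6we
```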
* Cleaning up crawler styles, improving some schema.org markup
* additional styling
* add space for pagination
* show likes value in crawler view if count is > 0
* remove <hr> since horizontal line is already provided by css - this removes one of 2 horizontal lines in post crawler view