Commit Graph

7 Commits

Author SHA1 Message Date
Sam Saffron 4ea21fa2d0 DEV: use #frozen_string_literal: true on all spec
This change both speeds up specs (less strings to allocate) and helps catch
cases where methods in Discourse are mutating inputs.

Overall we will be migrating everything to use #frozen_string_literal: true
it will take a while, but this is the first and safest move in this direction
2019-04-30 10:27:42 +10:00
Sam d84256a876 FEATURE: add Noindex to robots.txt for disallowed routes
This strips pages out of indexes that should not exist see:

https://meta.discourse.org/t/pages-listed-in-the-robots-txt-are-crawled-and-indexed-by-google/100309/11?u=sam
2018-11-02 16:39:47 +11:00
Robin Ward 3d7dbdedc0 FEATURE: An API to help sites build robots.txt files programatically
This is mainly useful for subfolder sites, who need to expose their
robots.txt contents to a parent site.
2018-04-16 15:43:20 -04:00
Régis Hanol df7970a6f6 prefix the robots.txt rules with the directory when using subfolder 2018-04-11 22:05:02 +02:00
Sam 3a7b696703 FEATURE: allow for setting crawl delay per user agent
Also moved to default crawl delay bing so no more than a req every 5 seconds is allowed

New site settings:

"slow_down_crawler_user_agents" - list of crawlers that will be slowed down
"slow_down_crawler_rate" - how many seconds to wait between requests

Not enforced server side yet
2018-04-06 10:15:23 +10:00
Neil Lalonde ced7e9a691 FEATURE: control which web crawlers can access using a whitelist or blacklist 2018-03-22 15:41:02 -04:00
Guo Xiang Tan 77d4c4d8dc Fix all the errors to get our tests green on Rails 5.1. 2017-09-25 13:48:58 +08:00