Commit Graph

6 Commits

Author SHA1 Message Date
Sam d84256a876 FEATURE: add Noindex to robots.txt for disallowed routes
This strips pages out of indexes that should not exist see:

https://meta.discourse.org/t/pages-listed-in-the-robots-txt-are-crawled-and-indexed-by-google/100309/11?u=sam
2018-11-02 16:39:47 +11:00
Robin Ward 3d7dbdedc0 FEATURE: An API to help sites build robots.txt files programatically
This is mainly useful for subfolder sites, who need to expose their
robots.txt contents to a parent site.
2018-04-16 15:43:20 -04:00
Régis Hanol df7970a6f6 prefix the robots.txt rules with the directory when using subfolder 2018-04-11 22:05:02 +02:00
Sam 3a7b696703 FEATURE: allow for setting crawl delay per user agent
Also moved to default crawl delay bing so no more than a req every 5 seconds is allowed

New site settings:

"slow_down_crawler_user_agents" - list of crawlers that will be slowed down
"slow_down_crawler_rate" - how many seconds to wait between requests

Not enforced server side yet
2018-04-06 10:15:23 +10:00
Neil Lalonde ced7e9a691 FEATURE: control which web crawlers can access using a whitelist or blacklist 2018-03-22 15:41:02 -04:00
Guo Xiang Tan 77d4c4d8dc Fix all the errors to get our tests green on Rails 5.1. 2017-09-25 13:48:58 +08:00