9 Commits

Author SHA1 Message Date
Sam f6fdc1ebe8 FEATURE: flexible crawler detection
You can use the crawler user agents site setting to amend which user agents
are considered crawlers, based on a string match in the user agent.

Also slightly improves the performance of crawler detection.
2017-09-29 12:31:50 +10:00
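
The idea in the commit above (a configurable list of user agent fragments, matched by string against the incoming user agent) can be sketched roughly as follows. This is a minimal illustration, not the code from crawler_detection.rb: the module name `CrawlerSketch`, its methods, and the `extra` argument standing in for the site setting are all hypothetical, and the base fragments are only the crawler names mentioned in this log.

```ruby
# Hypothetical sketch; the names here are illustrative and are NOT Discourse's actual API.
module CrawlerSketch
  # Fragments taken from the commit messages in this log (assumed to appear
  # verbatim in the corresponding user agent strings).
  BASE_FRAGMENTS = %w[Googlebot Twitterbot curl YandexBot Swiftbot 360Spider HTTrack].freeze

  # Builds a single regex from the base fragments plus a pipe-delimited
  # string of extra fragments (a stand-in for a site setting value).
  def self.build_regex(extra = "")
    extras = extra.to_s.split("|").map(&:strip).reject(&:empty?)
    fragments = (BASE_FRAGMENTS + extras).uniq
    # Escape each fragment so it is matched as a plain string, not a pattern.
    Regexp.new(fragments.map { |f| Regexp.escape(f) }.join("|"))
  end

  # Caches the compiled regex per setting value so it is not rebuilt
  # on every request.
  def self.crawler?(user_agent, extra = "")
    @cache ||= {}
    regex = (@cache[extra] ||= build_regex(extra))
    !user_agent.nil? && regex.match?(user_agent)
  end
end
```

Under these assumptions, `CrawlerSketch.crawler?("MyCustomBot/1.0", "MyCustomBot")` treats a newly configured fragment as a crawler without any code change, and caching the compiled regex is one plausible way to get the kind of small performance gain the commit message mentions.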
mcmcclur a307ad6517 Update crawler_detection.rb
Add HTTrack to the list of detected crawlers so that Discourse will serve vanilla HTML per https://meta.discourse.org/t/a-basic-discourse-archival-tool/62614/25
2017-05-16 11:17:05 -04:00
Robin Ward 2a4006fe0c Add `YandexBot` to our list of crawlers 2016-07-26 13:21:37 -04:00
Jeff Atwood bbb1348118 add Swiftbot to crawler regex 2015-05-02 03:18:58 -07:00
Erick Guan 026cdd8fc3 FEATURE: add 360Spider UA to allow 360 crawl Discourse sites 2015-03-16 22:58:33 +08:00
Jeff Atwood ceef06e771 add support for "Save Page Now" archive.org/web 2015-01-06 01:05:45 -08:00
riking 37dbc4b5e6 Add archive.org to crawler list to serve no-js to 2014-11-02 16:51:23 -08:00
Vikhyat Korrapati e3702ecb30 Improved crawler detection: add Twitterbot, Facebook, curl, Bing, Baidu. 2014-03-16 19:30:20 +05:30
Robin Ward c4b5455c21 REFACTOR: Rename `GooglebotDetection` to `CrawlerDetection` because we
will likely whitelist more crawlers in the future.
2014-02-20 16:07:02 -05:00