discourse

Commit Graph

Author	SHA1	Message	Date
Sam	d7657d8e47	correct specs, ensure crawler layout only applies to html	2018-01-16 16:28:11 +11:00
Sam	7b562d2f46	FEATURE: much improved and simplified crawler detection. Phase one: does it match 'trident|webkit|gecko|chrome|safari|msie|opera'? Yes: well, it is possibly a browser. Phase two: does it match 'rss|bot|spider|crawler|facebook|archive|wayback|ping|monitor'? Probably a crawler then. Based off: https://gist.github.com/SamSaffron/6cfad7ea3e6df321ffb7a84f93720a53	2018-01-16 15:41:45 +11:00
Sam	f6fdc1ebe8	FEATURE: flexible crawler detection. You can use the crawler user agents site setting to amend what user agents are considered crawlers based on a string match in the user agent. Also improves performance of crawler detection slightly.	2017-09-29 12:31:50 +10:00
mcmcclur	a307ad6517	Update crawler_detection.rb: add HTTrack to the list of detected crawlers so that Discourse will serve vanilla HTML, per https://meta.discourse.org/t/a-basic-discourse-archival-tool/62614/25	2017-05-16 11:17:05 -04:00
Robin Ward	2a4006fe0c	Add `YandexBot` to our list of crawlers	2016-07-26 13:21:37 -04:00
Jeff Atwood	bbb1348118	add Swiftbot to crawler regex	2015-05-02 03:18:58 -07:00
Erick Guan	026cdd8fc3	FEATURE: add 360Spider UA to allow 360 crawl Discourse sites	2015-03-16 22:58:33 +08:00
Jeff Atwood	ceef06e771	add support for "Save Page Now" archive.org/web	2015-01-06 01:05:45 -08:00
riking	37dbc4b5e6	Add archive.org to crawler list to serve no-js to	2014-11-02 16:51:23 -08:00
Vikhyat Korrapati	e3702ecb30	Improved crawler detection: add Twitterbot, Facebook, curl, Bing, Baidu.	2014-03-16 19:30:20 +05:30
Robin Ward	c4b5455c21	REFACTOR: Rename `GooglebotDetection` to `CrawlerDetection` because we will likely whitelist more crawlers in the future.	2014-02-20 16:07:02 -05:00

11 Commits
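
The two-phase check described in commit 7b562d2f46 can be sketched roughly as follows. This is a minimal illustration, not the code from crawler_detection.rb: the module name `CrawlerSketch`, the constant names, and the method signature are hypothetical. Case-insensitive matching, and treating a user agent that fails phase one as a crawler, are assumptions the commit message implies but does not spell out.

```ruby
# Hypothetical sketch of the two-phase check; not Discourse's actual code.
module CrawlerSketch
  # Phase one: tokens suggesting the user agent is possibly a browser.
  # (Case-insensitive matching is an assumption.)
  POSSIBLE_BROWSER = /trident|webkit|gecko|chrome|safari|msie|opera/i

  # Phase two: tokens suggesting the user agent is probably a crawler.
  LIKELY_CRAWLER = /rss|bot|spider|crawler|facebook|archive|wayback|ping|monitor/i

  def self.crawler?(user_agent)
    ua = user_agent.to_s

    # Assumption: anything that does not even look like a browser
    # (including a missing user agent) is treated as a crawler.
    return true unless ua.match?(POSSIBLE_BROWSER)

    # It looks like a browser; call it a crawler only if it also
    # carries one of the crawler tokens.
    ua.match?(LIKELY_CRAWLER)
  end
end

chrome = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 " \
         "(KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36"
googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

puts CrawlerSketch.crawler?(chrome)     # false
puts CrawlerSketch.crawler?(googlebot)  # true
```

Checking for browser tokens first keeps the crawler list short: a new bot with an unfamiliar name is caught by phase one without adding an entry, which the earlier commits in this log had to do one user agent at a time.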