This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.
Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
- phase one does it match 'trident|webkit|gecko|chrome|safari|msie|opera'
yes- well it is possibly a browser
- phase two does it match 'rss|bot|spider|crawler|facebook|archive|wayback|ping|monitor'
probably a crawler then
Based off: https://gist.github.com/SamSaffron/6cfad7ea3e6df321ffb7a84f93720a53
You can use the crawler user agents site setting to amend what user agents
are considered crawlers based on a string match in the user agent
Also improves performance of crawler detection slightly