A platform for community discussion. Free, open, simple.
Osama Sayegh 7bd3986b21
FEATURE: Replace `Crawl-delay` directive with proper rate limiting (#15131)
We have a couple of site settings, `slow_down_crawler_user_agents` and `slow_down_crawler_rate`, that are meant to allow site owners to signal to specific crawlers that they're crawling the site too aggressively and that they should slow down.

When a crawler is added to the `slow_down_crawler_user_agents` setting, Discourse currently adds a `Crawl-delay` directive for that crawler in `/robots.txt`. Unfortunately, many crawlers don't support the `Crawl-delay` directive in `/robots.txt`, which leaves site owners with no options if a crawler is crawling the site too aggressively.

This PR replaces the `Crawl-delay` directive with proper rate limiting for crawlers added to the `slow_down_crawler_user_agents` list. On every request made by a non-logged-in user, Discourse will check the User Agent string. If it contains one of the values in the `slow_down_crawler_user_agents` list, Discourse will only allow 1 request every N seconds for that User Agent (N is the value of the `slow_down_crawler_rate` setting), and the rest of the requests made within the same interval will get a 429 response.
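
The behavior described above can be illustrated with a minimal in-memory sketch. This is not the code from this PR: the `CrawlerThrottle` class, its method names, the case-insensitive matching, and the choice to keep one counter per matched setting value are all assumptions made for illustration, and a real deployment would need shared storage rather than a per-process hash.

```ruby
# Hypothetical sketch of "1 request every N seconds" per matched crawler.
# Nothing here is Discourse's actual implementation.
class CrawlerThrottle
  # user_agents: substrings to look for (stand-in for slow_down_crawler_user_agents)
  # rate: seconds between allowed requests (stand-in for slow_down_crawler_rate)
  def initialize(user_agents:, rate:)
    @user_agents = user_agents.map(&:downcase) # assumption: case-insensitive match
    @rate = rate
    @last_allowed = {} # matched value => time of the last allowed request
  end

  # Returns the HTTP status the request would get: 200 if allowed, 429 if throttled.
  def status_for(user_agent, logged_in: false, now: Time.now)
    return 200 if logged_in # only anonymous requests are checked

    ua = user_agent.to_s.downcase
    matched = @user_agents.find { |value| ua.include?(value) }
    return 200 if matched.nil? # not a listed crawler

    last = @last_allowed[matched]
    if last && now - last < @rate
      429 # another request inside the same interval
    else
      @last_allowed[matched] = now
      200
    end
  end
end

throttle = CrawlerThrottle.new(user_agents: ["examplebot"], rate: 60)
start = Time.at(0)
throttle.status_for("Mozilla/5.0 (compatible; ExampleBot/1.0)", now: start)      # => 200
throttle.status_for("Mozilla/5.0 (compatible; ExampleBot/1.0)", now: start + 10) # => 429
```

Passing `now:` explicitly keeps the sketch deterministic; in a request handler it would simply default to the current time.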

The `slow_down_crawler_user_agents` setting becomes quite dangerous with this PR, since it could rate limit much, if not all, of anonymous traffic if the setting is not used appropriately. To protect against this scenario, we've added a couple of new validations that run when the setting is changed:

1) each value added to the setting must be 3 characters or longer
2) each value cannot be a substring of tokens found in popular browser User Agent strings. The current list of prohibited values is: apple, windows, linux, ubuntu, gecko, firefox, chrome, safari, applewebkit, webkit, mozilla, macintosh, khtml, intel, osx, os x, iphone, ipad and mac.
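
As a rough illustration of the two validations, here is a small sketch. The constant and method names are hypothetical, the error strings are invented, and the case-insensitive comparison is an assumption; only the token list and the two rules come from the description above.

```ruby
# Hypothetical sketch of the two validations; not the PR's actual validator.
# The token list is copied from the commit message.
PROHIBITED_TOKENS = (%w[
  apple windows linux ubuntu gecko firefox chrome safari applewebkit webkit
  mozilla macintosh khtml intel osx iphone ipad mac
] + ["os x"]).freeze

# Returns nil when the value is acceptable, otherwise a short error string.
def crawler_value_error(value)
  v = value.to_s.strip.downcase # assumption: compare case-insensitively
  return "must be 3 characters or longer" if v.length < 3

  # Rule 2: reject values that are a substring of a common browser token.
  if PROHIBITED_TOKENS.any? { |token| token.include?(v) }
    return "cannot be a substring of a common browser User Agent token"
  end

  nil
end

crawler_value_error("ab")         # rejected: too short
crawler_value_error("fire")       # rejected: substring of "firefox"
crawler_value_error("examplebot") # accepted: returns nil
```

The second rule matters because the rate limiter matches by substring: a value such as "fire" would otherwise match every Firefox visitor.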
2021-11-30 12:55:25 +03:00
.devcontainer FEATURE: Support for GitHub Codespaces development (#11440) 2020-12-08 21:35:15 -03:00
.github Segment Ember CLI tests 2021-11-26 12:14:30 -05:00
.vscode-sample DEV: Move vscode config files to `.vscode-sample` directory (#11943) 2021-02-03 14:14:39 +00:00
app FEATURE: Replace `Crawl-delay` directive with proper rate limiting (#15131) 2021-11-30 12:55:25 +03:00
bin DEV: Re-allow node 17, with a warning (#15083) 2021-11-24 21:16:33 +01:00
config FEATURE: Replace `Crawl-delay` directive with proper rate limiting (#15131) 2021-11-30 12:55:25 +03:00
db FEATURE: Display pending posts on user’s page 2021-11-29 10:26:33 +01:00
docs DEV: Switch to using uppy uploads in composer by default (#15058) 2021-11-30 08:33:06 +10:00
images Replace README logo with PNG (#14044) 2021-08-13 14:23:49 -04:00
lib FEATURE: Replace `Crawl-delay` directive with proper rate limiting (#15131) 2021-11-30 12:55:25 +03:00
log
plugins DEV: Don't create unnecessary scope methods (#15104) 2021-11-26 16:34:07 +01:00
public DEV: updates popper to 2.10.2 (#14986) 2021-11-17 13:47:55 +01:00
script FIX: Ambiguous column in `downsize_uploads` (#14972) 2021-11-16 16:23:32 +01:00
spec FEATURE: Replace `Crawl-delay` directive with proper rate limiting (#15131) 2021-11-30 12:55:25 +03:00
test DEV: Print usedJSHeapSize to the console after QUnit run (#14462) 2021-09-28 16:32:56 +01:00
vendor FEATURE: Local chunked uppy backup uploads with a new uploader plugin (#14894) 2021-11-23 08:45:42 +10:00
.editorconfig
.eslintignore FIX: browser-update should work with old browsers (#12436) 2021-03-18 19:09:01 +02:00
.eslintrc DEV: Avoid using globals (#14909) 2021-11-13 13:10:13 +01:00
.git-blame-ignore-revs DEV: the referenced commit bc97… was rebased into 445d… (#11626) 2021-01-07 08:14:54 +11:00
.gitattributes
.gitignore DEV: Add `GeoList2-ASN.mmdb` and `.bundle` to `.gitignore` (#13902) 2021-07-30 16:17:55 +01:00
.licensed.yml DEV: Add a basic licensed config (#10128) 2020-06-25 18:01:36 -03:00
.npmrc DEV: Prevent npm usage (#13945) 2021-08-04 22:04:58 +02:00
.prettierignore FIX: browser-update should work with old browsers (#12436) 2021-03-18 19:09:01 +02:00
.prettierrc DEV: upgrades dev config (#10588) 2020-09-04 13:33:03 +02:00
.rspec
.rspec_parallel
.rubocop.yml Revert "Bump rubocop-discourse to 2.3.0." 2020-07-24 13:18:49 +08:00
.ruby-gemset.sample
.ruby-version.sample Update .ruby-version.sample 2021-07-30 14:42:14 -04:00
.template-lintrc.js enable eol-last for eslint and ember-template-lint (#12678) 2021-04-12 17:22:00 -07:00
Brewfile
CONTRIBUTING.md
COPYRIGHT.md DEV: Absorb onebox gem into core (#12979) 2021-05-26 15:11:35 +05:30
Gemfile Revert "DEV: Avoid duplication of gems in gemfile." (#14784) 2021-11-01 17:58:24 +05:30
Gemfile.lock Build(deps): Bump parser from 3.0.3.0 to 3.0.3.1 (#15130) 2021-11-29 22:52:04 +01:00
LICENSE.txt DEV: Absorb onebox gem into core (#12979) 2021-05-26 15:11:35 +05:30
README.md Remove Atom from README 2021-09-16 14:28:56 -04:00
Rakefile FIX: Do not dump schema during production database migrations (#12785) 2021-04-21 16:26:20 +01:00
adminjs
config.ru
d
discourse.sublime-project
jsapp
lefthook.yml DEV: Lint SCSS with prettier in pre-commit (#15033) 2021-11-22 11:30:12 +10:00
package.json FEATURE: Display pending posts on user’s page 2021-11-29 10:26:33 +01:00
translator.yml DEV: Add styleguide locale files to Crowdin (#10876) 2020-10-09 13:23:32 +11:00
yarn.lock DEV: updates popper to 2.10.2 (#14986) 2021-11-17 13:47:55 +01:00

README.md

Discourse is the 100% open source discussion platform built for the next decade of the Internet. Use it as a:

  • mailing list
  • discussion forum
  • long-form chat room

To learn more about the philosophy and goals of the project, visit discourse.org.

Screenshots

Boing Boing

Mobile

Browse lots more notable Discourse instances.

Development

To get your environment set up, follow the community setup guide for your operating system.

  1. If you're on macOS, try the macOS development guide.
  2. If you're on Ubuntu, try the Ubuntu development guide.
  3. If you're on Windows, try the Windows 10 development guide.

If you're familiar with how Rails works and are comfortable setting up your own environment, you can also try out the Discourse Advanced Developer Guide, which is aimed primarily at Ubuntu and macOS environments.

Before you get started, ensure you have the following minimum versions: Ruby 2.7+, PostgreSQL 13+, Redis 6.0+. If you're having trouble, please see our TROUBLESHOOTING GUIDE first!

Setting up Discourse

If you want to set up a Discourse forum for production use, see our Discourse Install Guide.

If you're looking for business class hosting, see discourse.org/buy.

If you're looking for our remote work solution, see teams.discourse.com.

Requirements

Discourse is built for the next 10 years of the Internet, so our requirements are high.

Discourse supports the latest, stable releases of all major browsers and platforms:

Browsers          Tablets   Phones
Apple Safari      iPadOS    iOS
Google Chrome     Android   Android
Microsoft Edge
Mozilla Firefox

Built With

  • Ruby on Rails — Our back end API is a Rails app. It responds to requests RESTfully in JSON.
  • Ember.js — Our front end is an Ember.js app that communicates with the Rails API.
  • PostgreSQL — Our main data store is in Postgres.
  • Redis — We use Redis as a cache and for transient data.
  • BrowserStack — We use BrowserStack to test on real devices and browsers.

Plus lots of Ruby Gems, a complete list of which is at /main/Gemfile.

Contributing


Discourse is 100% free and open source. We encourage and support an active, healthy community that accepts contributions from the public, including you!

Before contributing to Discourse:

  1. Please read the complete mission statements on discourse.org. Yes, we actually believe this stuff; you should too.
  2. Read and sign the Electronic Discourse Forums Contribution License Agreement.
  3. Dig into CONTRIBUTING.MD, which covers submitting bugs, requesting new features, preparing your code for a pull request, etc.
  4. Always strive to collaborate with mutual respect.
  5. Not sure what to work on? We've got some ideas.

We look forward to seeing your pull requests!

Security

We take security very seriously at Discourse; all our code is 100% open source and peer reviewed. Please read our security guide for an overview of security measures in Discourse, or if you wish to report a security issue.

The Discourse Team

The original Discourse code contributors can be found in AUTHORS.MD. For a complete list of the many individuals that contributed to the design and implementation of Discourse, please refer to the official Discourse blog and GitHub's list of contributors.

Copyright 2014 - 2021 Civilized Discourse Construction Kit, Inc.

Licensed under the GNU General Public License Version 2.0 (or later); you may not use this work except in compliance with the License. You may obtain a copy of the License in the LICENSE file, or at:

https://www.gnu.org/licenses/old-licenses/gpl-2.0.txt

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Discourse logo and “Discourse Forum” ®, Civilized Discourse Construction Kit, Inc.

Dedication

Discourse is built with love, Internet style.