
25 Commits

Author SHA1 Message Date
Sam Saffron d9413a61d2 PERF: move crawl_topic_links to the low queue
Crawling topic links can be somewhat delayed, so there is no need to run it in
the default queue.
2019-05-22 10:18:49 +10:00
Sam Saffron 30990006a9 DEV: enable frozen string literal on all files
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.

The test suite passes now, but some cases may remain, so we will run
a few sites on a branch prior to merging.
2019-05-13 09:31:32 +08:00
Guo Xiang Tan e1b16e445e Rename `FileHelper.is_image?` -> `FileHelper.is_supported_image?`. 2018-09-12 09:22:28 +08:00
Guo Xiang Tan 4ba5e678d8 Require dependencies to enable live reload in dev for Sidekiq. 2017-10-06 11:39:00 +08:00
Robin Ward 2f8f2aa1dd FEATURE: Whitelists for inline oneboxing 2017-07-21 15:41:47 -04:00
Régis Hanol 2e7753c27f Use 'FileHelper.is_image?' to check whether a link is pointing to an image 2017-06-22 12:54:42 +02:00
Robin Ward b23fc2bf84 Helper to find the final destination for a URL 2017-05-22 15:52:41 -04:00
Robin Ward 773445b8df FIX: Topic Crawling should only crawl HTTP/S urls 2017-05-22 11:57:20 -04:00
Robin Ward ea9f93dcc5 FIX: Don't crawl non-http/s links 2017-05-19 16:57:41 -04:00
Sam add6e12ce4 FIX: topic links with long titles cannot be crawled
0..255 == 256 numbers, but the column fits 255
2015-08-18 17:34:46 +10:00
Robin Ward 1434e46ed2 FIX: Excon was wrapping our `ReadOnly` exception
This was preventing the crawling of many topic links
2015-05-27 14:29:52 -04:00
Sam cd9e499b77 Don't try loading embeds on deleted topics 2015-05-06 16:53:28 +10:00
Sam bb20f64cb2 use standard error so it's easier to catch 2015-03-23 12:20:50 +11:00
Akshay 6301a43d57 Not initializing variable for looping if unused in loop 2014-08-15 03:24:55 +05:30
Sam a2e2d0e886 Merge pull request #2316 from mutiny/refactor-where-first
Refactor `where(...).first` to `find_by(...)`
2014-05-08 09:10:45 +10:00
Camille Roux f14c71b9d4 Fix the Amazon links regex 2014-05-06 19:19:32 +02:00
Camille Roux e77e7f23ca Update the Amazon links regexp
Added all the countries displayed in the Amazon footer
2014-05-06 18:36:07 +02:00
Louis Rose 1574485443 Perform the where(...).first to find_by(...) refactoring.
This refactoring was automated using the command: bundle exec "ruby refactorings/where_dot_first_to_find_by/app.rb"
2014-05-06 14:41:59 +01:00
Robin Ward a57f802048 If there's a `TopicEmbed` record for a url, we don't have to crawl it.
This should help sites like Boing Boing where sometimes links are
crawled before saved in WordPress.
2014-04-17 14:00:22 -04:00
Robin Ward e80851b0fa Special case: When crawling a link to an image, just put the filename as
the title.
2014-04-10 13:45:13 -04:00
Robin Ward 99e2bab62d Use `update_all` to prevent `after_commit` from executing again. 2014-04-10 13:19:57 -04:00
Robin Ward aa63868d5e FIX: Problem crawling amazon titles 2014-04-08 16:39:47 -04:00
Robin Ward 1e3faddfe4 FIX: Change crawl size to 10k. YouTube for example doesn't work with the
first 1k
2014-04-07 16:03:47 -04:00
Robin Ward 7e0028ba50 FIX: Don't crawl in test mode, raise correct exception when parameters
are missing
2014-04-07 14:38:18 -04:00
Robin Ward 7e3ea5d644 Support for crawling topic links 2014-04-07 14:08:34 -04:00
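
Several of the commits above describe small rules that may be easier to follow in code: crawl only HTTP/S URLs (773445b8df, ea9f93dcc5), use the filename as the title when the link points to an image (e80851b0fa), and keep titles within a 255-character column (add6e12ce4). The following is a minimal Ruby sketch of those rules, not the code from the repository. The method names, the `MAX_TITLE_LENGTH` constant, and the extension list are all assumptions made for illustration.

```ruby
require "uri"

# Hypothetical limit; the log only says the column fits 255 characters.
MAX_TITLE_LENGTH = 255

# Assumed list of image extensions, for illustration only.
IMAGE_EXTENSIONS = %w[.png .jpg .jpeg .gif].freeze

# True only for http:// and https:// URLs that have a host.
# URI::HTTPS is a subclass of URI::HTTP, so one check covers both.
def crawlable?(url)
  uri = URI.parse(url)
  uri.is_a?(URI::HTTP) && !uri.host.nil?
rescue URI::InvalidURIError
  false
end

# Returns the filename when the URL path ends in an image extension,
# otherwise nil (the caller would then fetch the page title instead).
def image_title(url)
  path = URI.parse(url).path
  ext = File.extname(path).downcase
  IMAGE_EXTENSIONS.include?(ext) ? File.basename(path) : nil
end

# An inclusive range 0..255 selects 256 characters, one too many for a
# 255-character column; the exclusive range 0...255 selects exactly 255.
def clamp_title(title)
  title[0...MAX_TITLE_LENGTH]
end

long_title = "a" * 300
puts long_title[0..255].length        # inclusive range: 256
puts clamp_title(long_title).length   # exclusive range: 255
puts crawlable?("mailto:someone@example.com")
puts image_title("https://example.com/img/cat.PNG")
```

The off-by-one in the inclusive range matches the note in add6e12ce4 that 0..255 covers 256 positions while the column fits 255.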