The 1.1 release comes two months after the previous one and is relatively lightweight by comparison. The main changes are:
Dependency upgrades
Jackson Databind (2.6.6) and Apache Storm (1.0.2)
Core
- HTTP protocol : store response headers verbatim in metadata (#317) - used by the WARC module
- FetcherBolt - added option to throttle based on number of URLs in queues (#311)
- Conventional 'never-refetch' Date for nextFetchDate (#331)
- Added metadata.lastProcessedDate
- Deprecated StatusStreamBolt and copied it as DummyIndexer
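To illustrate the idea behind a conventional 'never-refetch' date (#331), here is a minimal sketch in plain Java. It is not StormCrawler's actual code: the class name, the method names and the sentinel value are all assumptions made for the example. The point is simply that a single agreed far-future date stored as nextFetchDate lets any component recognise a URL that should never be fetched again.

```java
import java.time.Instant;
import java.util.Date;

public class NeverRefetchSketch {

    // Hypothetical sentinel: one fixed date far in the future.
    // The actual value used by StormCrawler may differ.
    static final Date NEVER = Date.from(Instant.parse("2099-12-31T00:00:00Z"));

    // Computes the next fetch date. A negative delay is treated here
    // (by assumption) as "never refetch" and maps to the sentinel.
    static Date nextFetchDate(Date now, long delayMinutes) {
        if (delayMinutes < 0) {
            return NEVER;
        }
        return new Date(now.getTime() + delayMinutes * 60_000L);
    }

    // A URL is due when its next fetch date is not after the current time.
    static boolean isDue(Date nextFetch, Date now) {
        return !nextFetch.after(now);
    }

    public static void main(String[] args) {
        Date now = Date.from(Instant.parse("2016-09-21T00:00:00Z"));
        Date never = nextFetchDate(now, -1);
        Date daily = nextFetchDate(now, 24 * 60);
        System.out.println("never-refetch due now? " + isDue(never, now));
        System.out.println("daily refetch scheduled for " + daily.toInstant());
    }
}
```

Using a shared constant rather than a null or missing value means that a status backend can keep sorting and filtering on nextFetchDate without special cases.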
Elasticsearch
Archetype
There have been several minor changes as well.
Remember that you can get regular updates about major commits on the project by following us on Twitter @stormcrawlerapi.
BTW there should be an exciting announcement in the next couple of weeks about a cool use of StormCrawler by a high-profile user. Watch this space!
As usual, thanks to all contributors and users and happy crawling!
PS: If you are near Bristol, you might be interested in coming to the talk I'll be giving at Bristech on Oct 6th.
NOTE
A patch release 1.1.1 has been published on the 21st Sept and includes #335 (thanks to Jeff Bolle for pointing it out).