StormCrawler 2.2 has just been released. From now on, releases will be made only for 2.x: 1.18 was the last release of the 1.x branch, which is now discontinued. In case you were wondering why there was no "What's new in StormCrawler 2.1", it is simply that 2.1 contained the same modifications as 1.18 and did not get its own announcement.
This version contains many bugfixes and, as usual, users are advised to upgrade to it.
Happy crawling and thanks to our sponsors, contributors and users!
PS: I am tempted to run a workshop on web crawling with StormCrawler at the BigData conference in Vilnius in November. Anyone interested? If so, please get in touch and let me know what you'd like to learn about. https://bigdataconference.eu/
Dependency upgrades
Core
- StackOverflow issue in CharsetIdentification #895
- OkHttp protocol: make connection pool configurable #918
- Remove selenium.instances.num #933
- Changed ProtocolFactory to be a singleton #932
- Need to register Status class with Kryo #924
- JSoupParserBolt cannot configure more than one JSoupFilters per worker #925
- Remove static keyword on JSoupFilters field #927
- Support HEAD method in okhttp protocol #923
- Allow setting http.content.limit per page via metadata #922
- OkHttp protocol: add support for Brotli compression (Content-Encoding) #919
- Protocols: Integer.MAX_VALUE not safe as max. content size #854
- Protocols: adding support for custom headers #912
- Replace Guava caches with Caffeine #903 and #905
- DelegatorProtocol #900
- Fixed bug with StackOverflowError in fast charset identification #895
- Multi proxy support #890
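
To give an idea of what the per-page http.content.limit change (#922) makes possible, here is a minimal sketch of how a per-page value found in the metadata could override the global setting. It is not the actual StormCrawler code: the class ContentLimitSketch and the method resolveLimit are hypothetical, plain Java maps stand in for the Metadata object, and reusing the configuration key as the metadata key is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

public class ContentLimitSketch {

    // Configuration key named in the changelog; using the same string as
    // the metadata key is an assumption made for this illustration.
    static final String LIMIT_KEY = "http.content.limit";

    /**
     * Hypothetical helper: returns the per-page limit if the page metadata
     * carries a parsable value, otherwise the global limit.
     */
    static int resolveLimit(Map<String, String> pageMetadata, int globalLimit) {
        String override = pageMetadata.get(LIMIT_KEY);
        if (override == null) {
            // No per-page value: use the global setting.
            return globalLimit;
        }
        try {
            return Integer.parseInt(override.trim());
        } catch (NumberFormatException e) {
            // Malformed per-page value: fall back to the global setting.
            return globalLimit;
        }
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new HashMap<>();
        // Without an override, the global limit applies.
        System.out.println(resolveLimit(metadata, 65536));

        // With an override, the per-page value wins.
        metadata.put(LIMIT_KEY, "1024");
        System.out.println(resolveLimit(metadata, 65536));
    }
}
```

The fallback on a malformed value is a design choice of this sketch, so that one bad metadata entry does not stop a page from being fetched; check the issue itself for the behaviour of the real implementation.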