Monday 20 July 2020

Please welcome StormCrawler 2.0

Nearly 6 years after its initial release and after another 32 releases, StormCrawler has just reached version 2.0! 

This is similar to what we did 4 years ago when 1.0 was released, in that the change of major version reflects the version of Apache Storm that StormCrawler is based on. This is not a major refactoring of StormCrawler in any way, although some minor changes can be found, mainly in the way the topologies are submitted. These changes are documented in the READMEs generated by our archetypes.

In terms of functionality and behavior, StormCrawler 2.0 is similar to version 1.17, released a few minutes ago.

I expect to keep both branches in parallel for a bit, at least until StormCrawler 2.0 has been sufficiently tested and is used by the majority of our users.

Moving to Apache Storm 2, which is the current branch of Apache Storm, is not just a way of future-proofing StormCrawler. By adopting Storm 2, we also get a platform that is 100% Java, which makes debugging and possible contributions to Apache Storm itself easier, and we benefit from Storm's recent improvements, such as improved performance and a better backpressure model.

I am looking forward to getting feedback (and bugfixes) from the StormCrawler community. Please give StormCrawler 2.0 a try if you can.

Happy crawling! 
