Monday, 16 September 2013
NUTCH FIGHT! 1.7 vs 2.2.1
We've had releases in the Nutch 2.x branch for over a year now. As I described in a previous post, the main difference with the 1.x b...
Monday, 29 July 2013
Nutch training course
We are planning to run a 2-day training course on Apache Nutch on 24/25 October 2013. It will take place in Bristol, UK (the exact v...
Wednesday, 5 June 2013
DigitalPebble is hiring!
We are looking for a candidate with the following skills and expertise: * experience in web crawling, ideally with Apache Nutch ...
Friday, 8 March 2013
Free your Nutch crawls with pluggable indexers
I have just committed what should be a very important new feature of the next 1.x release of Apache Nutch, namely the possibility to imple...
Wednesday, 5 September 2012
Using Behemoth on the CommonCrawl dataset
Behemoth is an open-source platform for document processing based on Hadoop which provides an excellent way to process document collection...