DigitalPebble's Blog
Friday, 21 October 2011

Nutch hosting and monitoring

We now provide hosting and monitoring services for Apache Nutch. For a fixed price, we will set up, run and monitor your Nutch crawler an...
Monday, 26 September 2011

Visualising Nutch mailing-lists traffic

The graph below shows the traffic on the Nutch dev and user mailing lists (http://mail-archives.apache.org/mod_mbox/nutch-user/ and http://...
Wednesday, 6 July 2011

Crawler-Commons 0.1 released

As announced on various mailing-lists: The initial release of crawler-commons is available from: http://code.google.com/p/crawler-comm...
Sunday, 12 June 2011

Nutch 1.3 released + BerlinBuzzwords presentation

Nutch 1.3 has been released and contains quite a few changes, some of which have been retrofitted from Nutch 2.0 in trunk. The main modif...
Friday, 27 May 2011

Parsing the Enron email dataset using Tika and Hadoop

In order to parse a large collection of emails, such as the Enron Email Dataset, we might choose to use Apache Hadoop, a scalable computin...
Saturday, 7 May 2011

Nutch talk at Berlin Buzzwords 2011

I'll be giving a talk on Apache Nutch at Berlin Buzzwords. This talk will give an overview of Apache Nutch. I will describe its main ...
Tuesday, 22 March 2011

Search for US properties with SOLR and Maptimize

Our clients 5k50 have recently opened a preview of their real-estate search system which is based on Apache SOLR and Maptimize. Maptimize ...