Monday 27 September 2010

Apache Nutch 1.2 released

[quoting the announcement by Chris Mattmann]

The Apache Nutch project is pleased to announce the release of Apache Nutch
1.2. The release contents have been pushed out to the main Apache release
site, so the releases should be available as soon as the mirrors sync.

Apache Nutch, one of the six new Apache TLPs as a result of the April 2010
Board Meeting, is an extensible framework for building out large-scale
web-based search. Layered on top of fellow Apache projects Hadoop,
Lucene/Solr, and Tika, Nutch provides an out-of-the-box platform for
fetching web pages, PDF files, Word documents, and more. Nutch parses the
content and its relevant information, indexes its metadata, and makes it
available for efficient query and retrieval over modern Internet protocols.

Apache Nutch 1.2 contains a number of improvements and bug fixes. Details
can be found in the changes file:

Apache Nutch is available in source and binary form from the following
download page:

In the initial 48 hours, the release may not be available on all mirrors.
When downloading from a mirror site, please remember to verify the downloads
using signatures found on the Apache site:

For more information on Apache Nutch, visit the project home page:
