org.apache.nutch.analysis
Class NutchAnalyzer
java.lang.Object
org.apache.lucene.analysis.Analyzer
org.apache.nutch.analysis.NutchAnalyzer
- All Implemented Interfaces:
- org.apache.hadoop.conf.Configurable, Pluggable
- Direct Known Subclasses:
- NutchDocumentAnalyzer
public abstract class NutchAnalyzer
- extends Analyzer
- implements org.apache.hadoop.conf.Configurable, Pluggable
Extension point for analysis.
All plugins found which implement this extension point are run
sequentially on the parse.
- Author:
- Jérôme Charron
Field Summary
protected org.apache.hadoop.conf.Configuration conf
- The current Configuration
Method Summary
org.apache.hadoop.conf.Configuration getConf()
void setConf(org.apache.hadoop.conf.Configuration conf)
abstract TokenStream tokenStream(String fieldName, Reader reader)
- Creates a TokenStream which tokenizes all the text in the provided Reader.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
conf
protected org.apache.hadoop.conf.Configuration conf
- The current Configuration
NutchAnalyzer
public NutchAnalyzer()
tokenStream
public abstract TokenStream tokenStream(String fieldName, Reader reader)
- Creates a TokenStream which tokenizes all the text in the provided Reader.
- Specified by: tokenStream in class Analyzer
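The contract above (one stream that tokenizes all the text in a Reader) can be illustrated with a dependency-free sketch. Everything here is an assumption made for illustration: SketchTokenStream and SketchAnalyzer are hypothetical stand-ins that only mirror the shape of Lucene's TokenStream and of tokenStream(String, Reader), and the splitting rule (letters and digits, lower-cased) is an arbitrary choice rather than what any Nutch analyzer does.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a token stream: one token per call, null when exhausted.
abstract class SketchTokenStream {
    abstract String next() throws IOException;
}

// Hypothetical stand-in mirroring the signature tokenStream(String, Reader).
abstract class SketchAnalyzer {
    public abstract SketchTokenStream tokenStream(String fieldName, Reader reader);
}

// Example subclass: splits on anything that is not a letter or digit and lower-cases.
// The fieldName argument is ignored in this sketch.
class SimpleSketchAnalyzer extends SketchAnalyzer {
    @Override
    public SketchTokenStream tokenStream(String fieldName, final Reader reader) {
        return new SketchTokenStream() {
            @Override
            String next() throws IOException {
                StringBuilder token = new StringBuilder();
                int c;
                while ((c = reader.read()) != -1) {
                    if (Character.isLetterOrDigit(c)) {
                        token.append(Character.toLowerCase((char) c));
                    } else if (token.length() > 0) {
                        break; // a separator ends the current token
                    }
                }
                return token.length() > 0 ? token.toString() : null;
            }
        };
    }

    // Drains the stream for the given text into a list of tokens.
    static List<String> tokens(String fieldName, String text) {
        SketchTokenStream stream =
                new SimpleSketchAnalyzer().tokenStream(fieldName, new StringReader(text));
        List<String> out = new ArrayList<>();
        try {
            for (String t = stream.next(); t != null; t = stream.next()) {
                out.add(t);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```

With this sketch, SimpleSketchAnalyzer.tokens("content", "Hello, Nutch 2006!") yields the tokens hello, nutch and 2006. A real subclass would extend NutchAnalyzer and return a Lucene TokenStream instead of the stand-in type.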
setConf
public void setConf(org.apache.hadoop.conf.Configuration conf)
- Specified by: setConf in interface org.apache.hadoop.conf.Configurable
getConf
public org.apache.hadoop.conf.Configuration getConf()
- Specified by: getConf in interface org.apache.hadoop.conf.Configurable
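The pairing of a protected conf field with setConf and getConf can be sketched as follows. This is a minimal illustration under stated assumptions: java.util.Properties stands in for Hadoop's Configuration so the example has no external dependencies, SketchConfigurable and the classes below are hypothetical names, the key "sketch.analyzer.lang" is made up, and the method bodies show the usual store-and-return pattern rather than the confirmed Nutch implementation.

```java
import java.util.Properties;

// Hypothetical stand-in for the Configurable interface, with Properties
// used in place of Hadoop's Configuration.
interface SketchConfigurable {
    void setConf(Properties conf);
    Properties getConf();
}

// Mirrors the documented shape: a protected conf field plus setConf/getConf.
abstract class SketchConfiguredAnalyzer implements SketchConfigurable {
    protected Properties conf;

    @Override
    public void setConf(Properties conf) {
        this.conf = conf; // assumed behaviour: simply remember the configuration
    }

    @Override
    public Properties getConf() {
        return this.conf;
    }
}

// A subclass can read its settings through the inherited protected field.
class LanguageAwareSketch extends SketchConfiguredAnalyzer {
    // Returns the configured language, or "en" when nothing has been set.
    // The key "sketch.analyzer.lang" is invented for this sketch.
    String language() {
        return conf == null ? "en" : conf.getProperty("sketch.analyzer.lang", "en");
    }
}
```

The point of the design is that whatever instantiates the analyzer can inject a configuration after construction, which fits the no-argument constructor listed above.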
Copyright © 2006 The Apache Software Foundation