Class TFIDFKeywordExtractor
- java.lang.Object
  - com.gengoai.hermes.extraction.keyword.TFIDFKeywordExtractor
- All Implemented Interfaces:
- Extractor, KeywordExtractor, Serializable
public class TFIDFKeywordExtractor extends Object implements KeywordExtractor
Keyword extractor that scores words based on their TFIDF value.
- Author:
- David B. Bracewell
- See Also:
- Serialized Form
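For orientation, a common textbook formulation of the TFIDF score is given below. This page does not state which weighting variant the class uses, so the symbols here are assumptions: tf(t, d) is the number of times term t occurs in document d, df(t) is the number of corpus documents containing t, and N is the number of documents in the corpus.

```latex
\[
\operatorname{tfidf}(t, d) = \operatorname{tf}(t, d) \cdot \log \frac{N}{\operatorname{df}(t)}
\]
```

Under this formulation, a term scores highly when it is frequent in the document but rare across the corpus.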
Constructor Summary
Constructors
- TFIDFKeywordExtractor()
  Instantiates a new TFIDFKeywordExtractor.
- TFIDFKeywordExtractor(@NonNull FeaturizingExtractor termExtractor)
  Instantiates a new TFIDFKeywordExtractor.
Method Summary
- Extraction extract(HString source)
  Generate an Extraction from the given HString.
- void fit(DocumentCollection corpus)
  In certain cases a keyword extractor needs to collect corpus-level statistics or construct a model of what a good keyword looks like.
Constructor Detail
TFIDFKeywordExtractor
public TFIDFKeywordExtractor()
Instantiates a new TFIDFKeywordExtractor.
TFIDFKeywordExtractor
public TFIDFKeywordExtractor(@NonNull FeaturizingExtractor termExtractor)
Instantiates a new TFIDFKeywordExtractor.
- Parameters:
- termExtractor - the specification for filtering and converting annotations to strings
Method Detail
extract
public Extraction extract(HString source)
Description copied from interface: Extractor
Generate an Extraction from the given HString.
fit
public void fit(DocumentCollection corpus)
Description copied from interface: KeywordExtractor
In certain cases a keyword extractor needs to collect corpus-level statistics or construct a model of what a good keyword looks like. The fit method allows implementations to perform this logic at a corpus level.
- Specified by:
- fit in interface KeywordExtractor
- Parameters:
- corpus - the corpus to fit the extractor to
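The fit-then-extract contract described above can be illustrated with a minimal, self-contained sketch. It does not use the Hermes API and is not the library's implementation: the class TfIdfSketch, its method signatures, the use of plain token lists in place of HString and DocumentCollection, and the add-one smoothed idf, log((1 + N) / (1 + df)), are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for illustration only; not the Hermes implementation. */
final class TfIdfSketch {
    // Corpus-level statistics gathered by fit(): term -> number of documents containing it.
    private final Map<String, Integer> documentFrequency = new HashMap<>();
    private int documentCount = 0;

    /** Corpus-level pass: count how many documents contain each term. */
    void fit(List<List<String>> corpus) {
        documentFrequency.clear();
        documentCount = corpus.size();
        for (List<String> document : corpus) {
            // A set ensures each term is counted at most once per document.
            for (String term : new HashSet<>(document)) {
                documentFrequency.merge(term, 1, Integer::sum);
            }
        }
    }

    /** Document-level pass: score each term as tf * log((1 + N) / (1 + df)). */
    Map<String, Double> extract(List<String> document) {
        Map<String, Integer> termFrequency = new HashMap<>();
        for (String term : document) {
            termFrequency.merge(term, 1, Integer::sum);
        }
        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, Integer> entry : termFrequency.entrySet()) {
            int df = documentFrequency.getOrDefault(entry.getKey(), 0);
            // Add-one smoothing (an assumption) keeps unseen terms from dividing by zero.
            double idf = Math.log((1.0 + documentCount) / (1.0 + df));
            scores.put(entry.getKey(), entry.getValue() * idf);
        }
        return scores;
    }

    /** Returns up to k terms, highest score first; ties are broken alphabetically. */
    List<String> topKeywords(List<String> document, int k) {
        Map<String, Double> scores = extract(document);
        List<String> terms = new ArrayList<>(scores.keySet());
        terms.sort(Comparator.comparingDouble((String t) -> scores.get(t))
                .reversed()
                .thenComparing(Comparator.<String>naturalOrder()));
        return new ArrayList<>(terms.subList(0, Math.min(k, terms.size())));
    }

    public static void main(String[] args) {
        TfIdfSketch sketch = new TfIdfSketch();
        sketch.fit(List.of(
                List.of("cat", "sat", "mat"),
                List.of("cat", "ate", "fish"),
                List.of("dog", "sat", "log")));
        System.out.println(sketch.topKeywords(List.of("cat", "fish", "fish"), 2));
    }
}
```

In this sketch, as in the interface description, fit must run before extract: without the document frequencies collected at the corpus level, every term has the same idf of log(1) = 0 and so every score is zero.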