Class RakeKeywordExtractor
- java.lang.Object
- com.gengoai.hermes.extraction.keyword.RakeKeywordExtractor
- All Implemented Interfaces: Extractor, KeywordExtractor, Serializable
public class RakeKeywordExtractor extends Object implements KeywordExtractor
Implementation of the RAKE keyword extraction algorithm as presented in: Rose, S., Engel, D., Cramer, N., & Cowley, W. (2010). Automatic Keyword Extraction from Individual Documents. In M. W. Berry & J. Kogan (Eds.), Text Mining: Theory and Applications: John Wiley & Sons.
- Author: David B. Bracewell
- See Also: Serialized Form
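For orientation, the scoring scheme described in the cited paper can be sketched in plain Java. This is not the Hermes implementation: the class name RakeSketch, its methods, and the tiny stop-word list are hypothetical and exist only for illustration. The sketch splits lowercased text into candidate phrases at stop words and punctuation, scores each word as its degree divided by its frequency, and scores a phrase as the sum of its word scores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical, library-independent sketch of RAKE scoring; not Hermes code. */
class RakeSketch {
    // Tiny illustrative stop list (an assumption); real use needs a full one.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "a", "an", "and", "are", "for", "in", "is", "of", "on", "the", "to", "with"));

    /** Splits text into candidate phrases at stop words and punctuation. */
    static List<List<String>> candidates(String text) {
        List<List<String>> phrases = new ArrayList<>();
        // Any run of non-alphanumeric, non-space characters ends a phrase.
        for (String fragment : text.toLowerCase(Locale.ROOT).split("[^a-z0-9\\s]+")) {
            List<String> current = new ArrayList<>();
            for (String word : fragment.trim().split("\\s+")) {
                if (word.isEmpty() || STOP_WORDS.contains(word)) {
                    // A stop word closes the phrase collected so far.
                    if (!current.isEmpty()) {
                        phrases.add(current);
                        current = new ArrayList<>();
                    }
                } else {
                    current.add(word);
                }
            }
            if (!current.isEmpty()) {
                phrases.add(current);
            }
        }
        return phrases;
    }

    /** Maps each candidate phrase to the sum of deg(w) / freq(w) over its words. */
    static Map<String, Double> score(String text) {
        List<List<String>> phrases = candidates(text);
        Map<String, Integer> freq = new HashMap<>();
        Map<String, Integer> degree = new HashMap<>();
        for (List<String> phrase : phrases) {
            for (String word : phrase) {
                freq.merge(word, 1, Integer::sum);
                // Degree counts co-occurrences within a phrase, including the word itself.
                degree.merge(word, phrase.size(), Integer::sum);
            }
        }
        Map<String, Double> scores = new LinkedHashMap<>();
        for (List<String> phrase : phrases) {
            double total = 0.0;
            for (String word : phrase) {
                total += degree.get(word).doubleValue() / freq.get(word);
            }
            scores.put(String.join(" ", phrase), total);
        }
        return scores;
    }

    public static void main(String[] args) {
        score("Compatibility of systems of linear constraints")
                .forEach((phrase, value) -> System.out.println(phrase + " -> " + value));
    }
}
```

With this stop list, the sample sentence yields three candidates: "compatibility" and "systems" each score 1.0, while "linear constraints" scores 4.0 because both of its words have degree 2 and frequency 1. Longer phrases built from words that rarely appear alone rank highest, which is the behavior the degree-to-frequency ratio is meant to produce.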
Constructor Summary
Constructors
RakeKeywordExtractor()
- Instantiates a new RAKE keyword extractor using a default FeaturizingExtractor that lowercases words.
RakeKeywordExtractor(@NonNull LyreExpression toStringExpression)
- Instantiates a new RAKE keyword extractor.
Method Summary
Extraction extract(@NonNull HString source)
- Generates an Extraction from the given HString.
void fit(DocumentCollection corpus)
- In certain cases a keyword extractor needs to collect corpus-level statistics or construct a model of what a good keyword looks like.
Constructor Detail
RakeKeywordExtractor
public RakeKeywordExtractor()
Instantiates a new RAKE keyword extractor using a default FeaturizingExtractor that lowercases words.

RakeKeywordExtractor
public RakeKeywordExtractor(@NonNull LyreExpression toStringExpression)
Instantiates a new RAKE keyword extractor.
- Parameters:
toStringExpression
- the specification for how to convert tokens/phrases to strings (all other options are ignored).
Method Detail
extract
public Extraction extract(@NonNull HString source)
Description copied from interface: Extractor
Generates an Extraction from the given HString.

fit
public void fit(DocumentCollection corpus)
Description copied from interface: KeywordExtractor
In certain cases a keyword extractor needs to collect corpus-level statistics or construct a model of what a good keyword looks like. The fit method allows implementations to perform this logic at the corpus level.
- Specified by:
fit in interface KeywordExtractor
- Parameters:
corpus
- the corpus to fit the extractor to