Class TextRank

  • All Implemented Interfaces:
    Extractor, KeywordExtractor, Serializable

    public class TextRank
    extends Object
    implements KeywordExtractor

    Implementation of the TextRank algorithm for keyword extraction as defined in: Mihalcea, R., Tarau, P.: "TextRank: Bringing Order into Texts". In: Lin, D., Wu, D. (eds.) Proceedings of EMNLP 2004. pp. 404–411. Association for Computational Linguistics, Barcelona, Spain. July 2004. Only unweighted, undirected graphs are currently supported.
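
To make the description above concrete, here is a minimal, self-contained sketch of the ranking step from the cited paper on an unweighted, undirected co-occurrence graph. It is an illustration only and not the code of this class: the class name `TextRankSketch`, its method names, the plain `String` tokens, and the default values in `main` are all assumptions made for the example. The paper's syntactic filtering of candidate words and its merging of adjacent keywords into phrases are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; none of these names come from the library.
class TextRankSketch {

    // Links two distinct words when they occur within `window` tokens of each
    // other (window = 2 links only adjacent tokens). Edges carry no weight and
    // no direction.
    public static Map<String, Set<String>> buildGraph(List<String> tokens, int window) {
        Map<String, Set<String>> graph = new LinkedHashMap<>();
        for (int i = 0; i < tokens.size(); i++) {
            String a = tokens.get(i);
            graph.computeIfAbsent(a, k -> new LinkedHashSet<>());
            int end = Math.min(i + window, tokens.size());
            for (int j = i + 1; j < end; j++) {
                String b = tokens.get(j);
                if (a.equals(b)) {
                    continue; // no self-loops
                }
                graph.get(a).add(b);
                graph.computeIfAbsent(b, k -> new LinkedHashSet<>()).add(a);
            }
        }
        return graph;
    }

    // Iterates S(v) = (1 - d) + d * sum over neighbours u of S(u) / degree(u)
    // until the largest change falls below `tol` or `maxIter` is reached.
    public static Map<String, Double> rank(Map<String, Set<String>> graph,
                                           double d, int maxIter, double tol) {
        Map<String, Double> score = new LinkedHashMap<>();
        for (String v : graph.keySet()) {
            score.put(v, 1.0);
        }
        for (int it = 0; it < maxIter; it++) {
            Map<String, Double> next = new LinkedHashMap<>();
            double delta = 0.0;
            for (String v : graph.keySet()) {
                double sum = 0.0;
                for (String u : graph.get(v)) {
                    sum += score.get(u) / graph.get(u).size();
                }
                double s = (1.0 - d) + d * sum;
                delta = Math.max(delta, Math.abs(s - score.get(v)));
                next.put(v, s);
            }
            score = next;
            if (delta < tol) {
                break;
            }
        }
        return score;
    }

    // Returns up to k words ordered by descending score.
    public static List<String> topKeywords(Map<String, Double> scores, int k) {
        List<String> words = new ArrayList<>(scores.keySet());
        words.sort((x, y) -> Double.compare(scores.get(y), scores.get(x)));
        return new ArrayList<>(words.subList(0, Math.min(k, words.size())));
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
                "graph", "ranking", "keyword", "graph", "extraction", "keyword", "graph");
        Map<String, Double> scores = rank(buildGraph(tokens, 2), 0.85, 100, 1e-4);
        System.out.println(topKeywords(scores, 3));
    }
}
```

The damping factor 0.85 and the convergence threshold 0.0001 used in `main` follow the values reported in the paper; the token list is made up for the demonstration.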

    See Also:
    Serialized Form
    • Constructor Detail

      • TextRank

        public TextRank()
    • Method Detail

      • extract

        public Extraction extract(@NonNull HString hString)
        Description copied from interface: Extractor
        Generate an Extraction from the given HString.
        Specified by:
        extract in interface Extractor
        Parameters:
        hString - the source text from which we will generate an Extraction
        Returns:
        the Extraction
      • fit

        public void fit(@NonNull DocumentCollection corpus)
        Description copied from interface: KeywordExtractor
        In certain cases, a keyword extractor needs to collect corpus-level statistics or build a model of what a good keyword looks like. The fit method lets implementations perform this work over an entire corpus.
        Specified by:
        fit in interface KeywordExtractor
        Parameters:
        corpus - the corpus to fit the extractor to
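
The fit-then-extract contract described above can be sketched with a small TF-IDF scorer that gathers document frequencies in `fit` and uses them in `extract`. Everything in this example is hypothetical: `SimpleKeywordExtractor` and `TfIdfExtractor` are stand-ins invented for illustration, and plain Java collections replace `DocumentCollection`, `HString`, and `Extraction`. It does not describe what `TextRank.fit` itself does, which this page does not state.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a two-phase keyword extractor; not library code.
class FitThenExtractSketch {

    // Stand-in for the fit/extract contract, using plain collections.
    interface SimpleKeywordExtractor {
        void fit(List<List<String>> corpus);

        Map<String, Double> extract(List<String> document);
    }

    static class TfIdfExtractor implements SimpleKeywordExtractor {
        private final Map<String, Integer> docFreq = new HashMap<>();
        private int numDocs = 0;

        // Corpus-level pass: count in how many documents each term occurs.
        @Override
        public void fit(List<List<String>> corpus) {
            docFreq.clear();
            numDocs = corpus.size();
            for (List<String> doc : corpus) {
                for (String term : new HashSet<>(doc)) {
                    docFreq.merge(term, 1, Integer::sum);
                }
            }
        }

        // Document-level pass: term frequency times a smoothed inverse
        // document frequency computed from the statistics gathered in fit.
        @Override
        public Map<String, Double> extract(List<String> document) {
            Map<String, Double> scores = new HashMap<>();
            for (String term : document) {
                scores.merge(term, 1.0, Double::sum);
            }
            for (Map.Entry<String, Double> e : scores.entrySet()) {
                int df = docFreq.getOrDefault(e.getKey(), 0);
                double idf = Math.log((1.0 + numDocs) / (1.0 + df)) + 1.0;
                e.setValue(e.getValue() * idf);
            }
            return scores;
        }
    }

    public static void main(String[] args) {
        SimpleKeywordExtractor extractor = new TfIdfExtractor();
        extractor.fit(Arrays.asList(
                Arrays.asList("the", "cat"),
                Arrays.asList("the", "dog"),
                Arrays.asList("the", "graph")));
        System.out.println(extractor.extract(Arrays.asList("the", "graph")));
    }
}
```

Keeping the corpus pass separate from the per-document pass means the statistics are computed once and reused for every later call to `extract`.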