Package com.gengoai.hermes.morphology
Class BreakIteratorTokenizer
- java.lang.Object
- com.gengoai.hermes.morphology.BreakIteratorTokenizer
- All Implemented Interfaces:
- Tokenizer, Serializable
public class BreakIteratorTokenizer extends Object implements Tokenizer, Serializable
A tokenizer implementation based on Java's BreakIterator class.
- Author:
- David B. Bracewell
- See Also:
- Serialized Form
Nested Class Summary
Nested classes/interfaces inherited from interface com.gengoai.hermes.morphology.Tokenizer
Tokenizer.Token
Constructor Summary
Constructor                              Description
BreakIteratorTokenizer(Locale locale)    Instantiates a new BreakIteratorTokenizer.
Method Summary
Modifier and Type            Method                     Description
Iterable<Tokenizer.Token>    tokenize(Reader reader)    Tokenizes a given reader into tokens.
Iterable<Tokenizer.Token>    tokenize(String input)     Tokenizes a given string into tokens.
Constructor Detail
BreakIteratorTokenizer
public BreakIteratorTokenizer(Locale locale)
Instantiates a new BreakIteratorTokenizer.
- Parameters:
- locale - the locale
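To illustrate how a locale-aware tokenizer can be built on java.text.BreakIterator, here is a minimal sketch. It is not the Hermes implementation: the class BreakIteratorSketch and the method words are hypothetical names, and the choice to drop whitespace-only segments is an assumption made for the example.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class BreakIteratorSketch {

    // Hypothetical helper (not part of Hermes): splits the input at the word
    // boundaries reported by BreakIterator for the given locale and keeps
    // every segment that is not pure whitespace.
    public static List<String> words(String input, Locale locale) {
        BreakIterator boundaries = BreakIterator.getWordInstance(locale);
        boundaries.setText(input);

        List<String> tokens = new ArrayList<>();
        int start = boundaries.first();
        for (int end = boundaries.next();
             end != BreakIterator.DONE;
             start = end, end = boundaries.next()) {
            String segment = input.substring(start, end);
            // Assumption for this sketch: whitespace runs are not tokens.
            if (!segment.trim().isEmpty()) {
                tokens.add(segment);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : words("Hello, world!", Locale.ENGLISH)) {
            System.out.println(token);
        }
    }
}
```

The locale matters because BreakIterator selects its boundary rules per locale, so the same text may be segmented differently for different languages.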
Method Detail
tokenize
public Iterable<Tokenizer.Token> tokenize(Reader reader)
Description copied from interface: Tokenizer
Tokenizes a given reader into tokens. All IO errors should be rethrown as runtime exceptions.
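One way to meet the "rethrow IO errors as runtime exceptions" requirement is to wrap the checked IOException in java.io.UncheckedIOException. The sketch below shows that pattern only. The class ReaderSketch and the method readAll are hypothetical names, and whether Hermes buffers the whole reader or uses this particular exception type is an assumption.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ReaderSketch {

    // Hypothetical helper (not part of Hermes): drains the reader into a
    // String and converts any IOException into an unchecked exception, so
    // callers are not forced to declare or catch a checked exception.
    public static String readAll(Reader reader) {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[4096];
        try {
            int count;
            while ((count = reader.read(buffer)) != -1) {
                text.append(buffer, 0, count);
            }
        } catch (IOException e) {
            // Assumption: UncheckedIOException is the runtime wrapper used.
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        System.out.println(readAll(new StringReader("A short example.")));
    }
}
```

The resulting string could then be passed to a String-based tokenize method, which keeps the reader overload a thin wrapper.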