Package com.gengoai.hermes.morphology
Interface Tokenizer
All Known Implementing Classes:
BreakIteratorTokenizer, ENTokenizer
public interface Tokenizer
Low-level tokenization of strings.
Author:
David B. Bracewell
Nested Class Summary
Modifier and Type | Interface | Description
static class | Tokenizer.Token | An internal token
Method Summary
Modifier and Type | Method | Description
default Iterable<Tokenizer.Token> | tokenize(@NonNull String input) | Tokenizes a given string into tokens.
Iterable<Tokenizer.Token> | tokenize(Reader reader) | Tokenizes a given reader into tokens.
Method Detail
tokenize
Iterable<Tokenizer.Token> tokenize(Reader reader)
Tokenizes a given reader into tokens. All IO errors should be rethrown as runtime exceptions.
Parameters:
reader - the reader
Returns:
an iterable of tokens
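The contract above (consume a Reader, rethrow IO failures as runtime exceptions) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: WhitespaceTokenizerSketch and its nested Token class are hypothetical stand-ins, not part of Hermes, and the fields of the real Tokenizer.Token are not documented on this page. The whitespace splitting is likewise only a placeholder for whatever segmentation a real implementation performs.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical whitespace tokenizer that mirrors the shape of tokenize(Reader).
final class WhitespaceTokenizerSketch {

    // Hypothetical stand-in for Tokenizer.Token (assumed fields: text and character offsets).
    static final class Token {
        final String text;
        final int start; // offset of the first character, inclusive
        final int end;   // offset just past the last character, exclusive

        Token(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    Iterable<Token> tokenize(Reader reader) {
        List<Token> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int offset = 0;     // number of characters read so far
        int tokenStart = 0; // offset where the token being built began
        try {
            int c;
            while ((c = reader.read()) != -1) {
                if (Character.isWhitespace(c)) {
                    // Whitespace closes the token being built, if any.
                    if (current.length() > 0) {
                        tokens.add(new Token(current.toString(), tokenStart, offset));
                        current.setLength(0);
                    }
                } else {
                    if (current.length() == 0) {
                        tokenStart = offset;
                    }
                    current.append((char) c);
                }
                offset++;
            }
        } catch (IOException e) {
            // Rethrow IO failures as an unchecked exception, as the contract asks.
            throw new UncheckedIOException(e);
        }
        // Flush a token that runs up to the end of the input.
        if (current.length() > 0) {
            tokens.add(new Token(current.toString(), tokenStart, offset));
        }
        return tokens;
    }

    public static void main(String[] args) {
        WhitespaceTokenizerSketch tokenizer = new WhitespaceTokenizerSketch();
        for (Token t : tokenizer.tokenize(new StringReader("Hello  world"))) {
            System.out.println(t.text + " [" + t.start + ", " + t.end + ")");
        }
    }
}
```

UncheckedIOException is one reasonable choice for the runtime exception because it keeps the original IOException as its cause; the page does not say which exception type the real implementations use.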
tokenize
default Iterable<Tokenizer.Token> tokenize(@NonNull String input)
Tokenizes a given string into tokens.
Parameters:
input - the input String
Returns:
an iterable of tokens
Throws:
NullPointerException - if the String is null
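A default String overload of this kind is often written as a null check followed by delegation to the Reader overload. The sketch below shows that pattern under stated assumptions: StringOverloadSketch and StringOverloadDemo are hypothetical names, tokens are plain strings for brevity, Objects.requireNonNull stands in for the @NonNull annotation, and the delegation through StringReader is a guess at the behavior rather than something this page confirms.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.Objects;

// Hypothetical demo; the method reference below supplies the abstract Reader overload.
final class StringOverloadDemo {

    public static void main(String[] args) {
        StringOverloadSketch tokenizer = StringOverloadDemo::splitOnWhitespace;
        for (String token : tokenizer.tokenize("Low level tokenization")) {
            System.out.println(token);
        }
    }

    // Placeholder segmentation: read everything, then split on runs of whitespace.
    static Iterable<String> splitOnWhitespace(Reader reader) {
        StringBuilder text = new StringBuilder();
        try {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String trimmed = text.toString().trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }
}

// Hypothetical stand-in for the Tokenizer interface.
interface StringOverloadSketch {

    // The abstract method that implementers supply.
    Iterable<String> tokenize(Reader reader);

    // Assumed behavior: reject null input, then delegate to the Reader overload.
    default Iterable<String> tokenize(String input) {
        Objects.requireNonNull(input, "input must not be null");
        return tokenize(new StringReader(input));
    }
}
```

Because both overloads accept a reference type, a literal null argument is ambiguous at compile time; callers that need to pass null explicitly (for example in a test) would have to cast it to String or Reader first.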