Package com.gengoai.hermes.preprocessing
Class UnicodeNormalizer
- java.lang.Object
  - com.gengoai.hermes.preprocessing.TextNormalizer
    - com.gengoai.hermes.preprocessing.UnicodeNormalizer
- All Implemented Interfaces:
- Serializable
public class UnicodeNormalizer extends TextNormalizer
Converts Unicode text to canonical form and removes smart quotes.
- Author:
- David B. Bracewell
- See Also:
- Serialized Form
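The class description can be illustrated with a minimal, self-contained sketch built only on the JDK's `java.text.Normalizer`. This is not the Hermes implementation: the class name `UnicodeNormalizationSketch` and the helper `normalize` are hypothetical, the choice of NFC as the canonical form is an assumption, and so is the reading of "removes smart quotes" as replacing curly quotes with their ASCII equivalents.

```java
import java.text.Normalizer;

public class UnicodeNormalizationSketch {

    // Hypothetical helper, not part of Hermes. Assumes NFC is the intended
    // canonical form and that smart quotes are mapped to ASCII quotes.
    static String normalize(String input) {
        if (input == null) {
            return null;
        }
        // NFC: canonical decomposition followed by canonical composition.
        String canonical = Normalizer.normalize(input, Normalizer.Form.NFC);
        return canonical
            .replace('\u2018', '\'')  // left single quotation mark
            .replace('\u2019', '\'')  // right single quotation mark
            .replace('\u201C', '"')   // left double quotation mark
            .replace('\u201D', '"');  // right double quotation mark
    }

    public static void main(String[] args) {
        // "cafe" followed by a combining acute accent, wrapped in curly double quotes.
        String raw = "\u201Ccafe\u0301\u201D";
        String clean = normalize(raw);
        // The combining accent is composed into a single precomposed character
        // and the curly quotes become plain ASCII double quotes.
        System.out.println(clean.equals("\"caf\u00E9\""));  // prints true
    }
}
```

With the real class, the equivalent call would go through `performNormalization(String input, Language inputLanguage)` or the inherited `apply` method; the exact output for a given input depends on the library's actual normalization rules.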
Constructor Summary
Constructors
Constructor            Description
UnicodeNormalizer()
Method Summary
Modifier and Type    Method and Description
String               performNormalization(String input, Language inputLanguage)
                     Performs a pre-processing operation on the input string in the given input language.
Methods inherited from class com.gengoai.hermes.preprocessing.TextNormalizer
apply
Method Detail
performNormalization
public String performNormalization(String input, Language inputLanguage)
Description copied from class: TextNormalizer
Performs a pre-processing operation on the input string in the given input language.
- Specified by:
- performNormalization in class TextNormalizer
- Parameters:
- input - The input text
- inputLanguage - The language of the input
- Returns:
- The post-processed text