Package com.gengoai.hermes.format
Class TaggedFormat
- java.lang.Object
  - com.gengoai.hermes.format.WholeFileTextFormat
    - com.gengoai.hermes.format.TaggedFormat
- All Implemented Interfaces:
DocFormat, OneDocPerFileFormat, Serializable
public class TaggedFormat extends WholeFileTextFormat implements OneDocPerFileFormat, Serializable
Format Name: tagged
Format in which words are separated by whitespace and labeled sequences are wrapped in SGML-like tags, e.g. <TAG>My text</TAG>. The annotation type assigned to the tagged spans is set via the "annotationType" parameter.
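To make the tagged layout concrete, the following is a minimal, standalone sketch of how text such as <TAG>My text</TAG> could be split into plain text plus labeled spans. It is not the Hermes implementation and does not use the Hermes API: the class TaggedTextSketch, its nested Span and Parsed types, and the parse method are hypothetical names introduced only for illustration, and the regular expression assumes simple, non-nested tags.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of com.gengoai.hermes.
public class TaggedTextSketch {

    /** A labeled span over the tag-free text; end is exclusive. */
    public static final class Span {
        public final String label;
        public final int start;
        public final int end;

        Span(String label, int start, int end) {
            this.label = label;
            this.start = start;
            this.end = end;
        }
    }

    /** The tag-free text together with the spans found in it. */
    public static final class Parsed {
        public final String text;
        public final List<Span> spans;

        Parsed(String text, List<Span> spans) {
            this.text = text;
            this.spans = spans;
        }
    }

    // Assumption: tags look like <NAME>...</NAME> and are not nested.
    private static final Pattern TAGGED =
            Pattern.compile("<([A-Za-z_][A-Za-z0-9_]*)>(.*?)</\\1>", Pattern.DOTALL);

    public static Parsed parse(String content) {
        StringBuilder plain = new StringBuilder();
        List<Span> spans = new ArrayList<>();
        Matcher m = TAGGED.matcher(content);
        int last = 0;
        while (m.find()) {
            // Copy the untagged text that precedes this tag.
            plain.append(content, last, m.start());
            int start = plain.length();
            // Copy the tagged text itself, without its markup.
            plain.append(m.group(2));
            spans.add(new Span(m.group(1), start, plain.length()));
            last = m.end();
        }
        // Copy whatever follows the final tag.
        plain.append(content, last, content.length());
        return new Parsed(plain.toString(), spans);
    }

    public static void main(String[] args) {
        Parsed p = parse("<PERSON>John Smith</PERSON> visited <LOCATION>Paris</LOCATION> .");
        System.out.println(p.text);
        for (Span s : p.spans) {
            System.out.println(s.label + " [" + s.start + ", " + s.end + "): "
                    + p.text.substring(s.start, s.end));
        }
    }
}
```

In this sketch the tag name is only recorded as a label on each span. In TaggedFormat itself, the annotation type given to such spans is governed by the "annotationType" parameter described above.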
- See Also:
- Serialized Form
-
Nested Class Summary
Modifier and Type    Class                            Description
static class         TaggedFormat.Provider            The type Provider.
static class         TaggedFormat.TaggedParameters    The type Tagged parameters.
-
Field Summary
Modifier and Type                     Field              Description
static ParameterDef<AnnotationType>   ANNOTATION_TYPE    The constant ANNOTATION_TYPE.
static ParameterDef<Boolean>          IS_TOKENIZED       The constant IS_TOKENIZED.
-
Method Summary
Modifier and Type            Method                                               Description
DocFormatParameters          getParameters()
protected Stream<Document>   readSingleFile(String content)                       Converts the content of an entire file into one or more documents.
void                         write(Document document, Resource outputResource)    Writes the given document in this format to the given output resource.
-
Methods inherited from class com.gengoai.hermes.format.WholeFileTextFormat
read
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface com.gengoai.hermes.format.OneDocPerFileFormat
write
-
Field Detail
-
ANNOTATION_TYPE
public static final ParameterDef<AnnotationType> ANNOTATION_TYPE
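The constant ANNOTATION_TYPE: the definition of the "annotationType" parameter, which sets the annotation type of the tagged spans.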
-
IS_TOKENIZED
public static final ParameterDef<Boolean> IS_TOKENIZED
The constant IS_TOKENIZED.
-
Method Detail
-
getParameters
public DocFormatParameters getParameters()
- Specified by:
getParameters in interface DocFormat
- Returns:
- the DocFormatParameters set for the instance of this format
-
readSingleFile
protected Stream<Document> readSingleFile(String content)
Description copied from class: WholeFileTextFormat
Converts the content of an entire file into one or more documents.
- Specified by:
readSingleFile in class WholeFileTextFormat
- Parameters:
content - the content
- Returns:
- the stream of documents.
-
write
public void write(Document document, Resource outputResource) throws IOException
Description copied from interface: DocFormat
Writes the given document in this format to the given output resource.
- Specified by:
write in interface DocFormat
- Parameters:
document - the document
outputResource - the output resource
- Throws:
IOException - Something went wrong writing the document