Interface HString
- All Superinterfaces:
CharSequence, Comparable<Span>, Serializable, Span, StringLike
- All Known Subinterfaces:
Annotation, Document
public interface HString extends Span, StringLike, Serializable
An HString (Hermes String) is a Java String on steroids. It represents the base type of all Hermes text objects. Every HString has an associated span denoting its starting and ending character offsets within the document. HStrings implement the CharSequence interface, which allows them to be passed to many of Java's built-in String methods, and they offer methods similar to those found on Java Strings. Importantly, methods that do not modify the underlying string, e.g. substring and find, return an HString, whereas methods that modify the string, e.g. toLowerCase, return a String object. The String-like operations (e.g. contains, startsWith, toLowerCase) come from the StringLike superinterface.
- Author:
- David B. Bracewell
-
-
Method Summary
Modifier and Type Method Description
void
add(Relation relation)
Adds an outgoing relation to the HString.default void
addAll(@NonNull Iterable<Relation> relations)
Adds multiple outgoing relations to the HString.default RelationGraph
annotationGraph(Tuple relationTypes, AnnotationType... annotationTypes)
Constructs a relation graph with the given relation types as the edges and the given annotation types as the vertices (the interleaved(AnnotationType...) method is used to get the annotations).
default List<Annotation>
annotations()
Gets all annotations overlapping this HStringdefault List<Annotation>
annotations(@NonNull AnnotationType type)
Gets annotations of a given type that overlap with this HString.default List<Annotation>
annotations(@NonNull AnnotationType type, @NonNull Predicate<? super Annotation> filter)
Gets annotations of a given type and that test positive for the given filter that overlap with this HString.default Stream<Annotation>
annotationStream()
Gets a java Stream over all annotations overlapping this HString.default Stream<Annotation>
annotationStream(@NonNull AnnotationType type)
Gets a java Stream over annotations of the given type overlapping this HString.default Annotation
asAnnotation()
Gets this HString as an annotation.default Annotation
asAnnotation(@NonNull AnnotationType type)
Attempts to cast this HString to an Annotation of the given type.default boolean
atBeginningOfSentence()
default boolean
atEndOfSentence()
default <T> T
attribute(@NonNull AttributeType<T> attributeType)
Gets the value for a given attribute typedefault <T> T
attribute(@NonNull AttributeType<T> attributeType, T defaultValue)
Gets the value for a given attribute typedefault <T> boolean
attributeEquals(@NonNull AttributeType<T> attributeType, Object targetValue)
Checks if the HString has an attribute of the given type that is equal to the given target value.
default <T> boolean
attributeIsA(@NonNull AttributeType<T> attributeType, Object targetValue)
Checks if the HString has an attribute of the given type that is an instance of the given target value.
AttributeMap
attributeMap()
Exposes the underlying attributes as a Mapdefault Set<BasicCategories>
categories()
default char
charAt(int index)
default List<HString>
charNGrams(int order)
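Extracts character n-grams of the given order (e.g. 1=unigram, 2=bigram, etc.)
default List<HString>
charNGrams(int minOrder, int maxOrder)
Extracts all character n-grams from the given minimum to given maximum order (e.g. 1=unigram, 2=bigram, etc.)
default List<Annotation>
children()
Gets all child annotations, i.e. those annotations that have a dependency relation pointing to this HString.
default List<Annotation>
children(@NonNull String relation)
Gets all child annotations, i.e. those annotations that have a dependency relation pointing to this HString, with the given dependency relation.
default <T> T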
computeIfAbsent(@NonNull AttributeType<T> attributeType, @NonNull Supplier<T> supplier)
Sets the value of an attribute if a value is not already set.default HString
context(int windowSize)
Generates a new HString consisting of this HString and its given window size (number) of tokens to the left and right.default HString
context(@NonNull AnnotationType type, int windowSize)
Generates a new HString consisting of this HString and its given window size (number) of annotation type to the left and right.default Tuple2<String,Annotation>
dependency()
default RelationGraph
dependencyGraph()
Creates a RelationGraph with dependency edges and token vertices.
default RelationGraph
dependencyGraph(@NonNull AnnotationType... types)
Creates a RelationGraph with dependency edges and vertices made up of the given types.
default boolean
dependencyIsA(String... values)
Checks if this HString has one of the given dependency relations to its parent.Document
document()
default NDArray
embedding()
default NDArray
embedding(@NonNull Predicate<HString> filter)
default List<Annotation>
enclosedAnnotations()
default List<Annotation>
enclosedAnnotations(@NonNull AnnotationType annotationType)
Gets all annotations of the given type enclosed by this HStringdefault boolean
encloses(HString other)
Checks if this HString encloses the given other HString.default HString
find(String text)
Finds the given text in this HString starting from the beginning of this HString.default HString
find(String text, int start)
Finds the given text in this HString starting from the given start index of this HString.default Stream<HString>
findAll(String text)
Finds all occurrences of the given text in this HStringdefault Annotation
first(@NonNull AnnotationType type)
Gets the first annotation overlapping this HString with the given annotation type.default Annotation
firstToken()
Gets the first token annotation overlapping this HString.default void
forEach(@NonNull AnnotationType type, @NonNull Consumer<? super Annotation> consumer)
Convenience method for processing annotations of a given type.default Language
getLanguage()
default String
getLemma()
Gets the lemmatized version of the HString.default UniversalFeatureSet
getMorphologicalFeatures()
default String
getStemmedForm()
Gets the stemmed version of the HString.default boolean
hasAnnotation(@NonNull AnnotationType annotationType)
Determines if an annotation of a given type is associated with the HString.
default boolean
hasAttribute(@NonNull AttributeType<?> attributeType)
Determines if an attribute of a given type is associated with the HStringdefault boolean
hasIncomingRelation(@NonNull RelationType relationType)
Determines if an incoming relation of a given type is associated with the HStringdefault boolean
hasIncomingRelation(@NonNull RelationType type, String value)
Checks if the HString has at least one incoming relation of the given type with the given value.default boolean
hasOutgoingRelation(@NonNull RelationType relationType)
Determines if an outgoing relation of a given type is associated with the HStringdefault boolean
hasOutgoingRelation(@NonNull RelationType type, String value)
Checks if the HString has at least one outgoing relation of the given type with the given value.default HString
head()
Gets the token that is highest in the dependency tree for this HStringdefault void
ifNotEmpty(@NonNull Consumer<? super HString> processor)
Runs the given processor on the HString if it is not empty.default List<Annotation>
incoming(@NonNull RelationType type)
Gets all annotations that have relation with this HString as the target where this HString includes all sub-annotations.default List<Annotation>
incoming(@NonNull RelationType type, boolean includeSubAnnotations)
Gets all annotations that have relation with this HString as the target.default List<Annotation>
incoming(@NonNull RelationType type, @NonNull String value, boolean includeSubAnnotations)
Gets all annotations that have relation with this HString as the target.default List<Annotation>
incoming(RelationType type, String value)
Gets all annotations that have relation with this HString as the target where this HString includes all sub-annotations.default List<Relation>
incomingRelations()
Get all incoming relations to this HString and its sub-annotations.default List<Relation>
incomingRelations(boolean includeSubAnnotations)
Gets all incoming relations to this HString.default List<Relation>
incomingRelations(@NonNull RelationType relationType)
Gets all relations of the given type targeting this HString or one of its sub-annotations.default List<Relation>
incomingRelations(@NonNull RelationType relationType, boolean includeSubAnnotations)
Gets all relations of the given type targeting this HString.default Stream<Relation>
incomingRelationStream()
Get all incoming relations to this HString and its sub-annotations.default Stream<Relation>
incomingRelationStream(boolean includeSubAnnotations)
Gets all incoming relations to this HString.default List<Annotation>
interleaved(AnnotationType... types)
Returns the annotations of the given types that overlap this string in a maximum match fashion.default boolean
isA(@NonNull BasicCategories... categories)
Checks if this HString has a base category of one of the ones given.default boolean
isAnnotation()
Is this HString an annotation?default boolean
isDocument()
default boolean
isInstance(AnnotationType type)
Returns true if this HString is an instance of the given annotation type.
default Annotation
last(@NonNull AnnotationType type)
Gets the last annotation overlapping this HString with the given annotation type.default Annotation
lastToken()
default HString
leftContext(int windowSize)
Generates an HString representing the windowSize tokens to the left of the start of this HString.
default HString
leftContext(@NonNull AnnotationType type, int windowSize)
Generates an HString representing the windowSize annotations of the given type to the left of the start of this HString.
default int
length()
default Annotation
next(@NonNull AnnotationType type)
Gets the annotation of a given type that is next in order (of span) to this onedefault List<Annotation>
outgoing(@NonNull RelationType type, boolean includeSubAnnotations)
Gets all annotations with which this HString has an outgoing relation of the given type.default List<Annotation>
outgoing(@NonNull RelationType type, String value, boolean includeSubAnnotations)
Gets all annotations with which this HString has an outgoing relation of the given type and value.default List<Annotation>
outgoing(RelationType type)
Gets all annotations with which this HString or any of its sub-annotations has an outgoing relation of the given type.default List<Annotation>
outgoing(RelationType type, String value)
Gets all annotations with which this HString or any of its sub-annotations has an outgoing relation of the given type and value.default List<Relation>
outgoingRelations()
Get all outgoing relations to this HString and its sub-annotations.default List<Relation>
outgoingRelations(boolean includeSubAnnotations)
Gets all outgoing relations to this HString.default List<Relation>
outgoingRelations(@NonNull RelationType relationType)
Gets all relations of the given type originating from this HString or one of its sub-annotations.default List<Relation>
outgoingRelations(@NonNull RelationType relationType, boolean includeSubAnnotations)
Gets all relations of the given type originating from this HString.default Stream<Relation>
outgoingRelationStream()
Get all outgoing relations to this HString and its sub-annotations.default Stream<Relation>
outgoingRelationStream(boolean includeSubAnnotations)
Gets all outgoing relations to this HString.default boolean
overlaps(HString other)
Checks if this HString overlaps with the given other.default Annotation
parent()
Gets the dependency parent of this HStringdefault PartOfSpeech
pos()
Gets the part-of-speech of the HStringdefault Annotation
previous(@NonNull AnnotationType type)
Gets the annotation of a given type that is previous in order (of span) to this onedefault <T> T
put(@NonNull AttributeType<T> attributeType, T value)
Sets the value of an attribute.
default <E,T extends Collection<E>> void
putAdd(@NonNull AttributeType<T> attributeType, @NonNull Iterable<E> items)
Allows adding multiple values to a Collection based attribute.default void
putAll(@NonNull HString hString)
Copies the attribute values from the given HString to this onedefault void
putAll(@NonNull Map<AttributeType<?>,?> map)
Sets attributes on this HString from those in the given map.default <T> T
putIfAbsent(@NonNull AttributeType<T> attributeType, T value)
Sets the value of an attribute if a value is not already set.default <T> T
removeAttribute(@NonNull AttributeType<T> attributeType)
Removes an attribute from the HString.void
removeRelation(@NonNull Relation relation)
Removes the given relation from this annotationdefault HString
rightContext(int windowSize)
Generates an HString representing the windowSize tokens to the right of the end of this HString without going past the sentence end.
default HString
rightContext(@NonNull AnnotationType type, int windowSize)
Generates an HString representing the windowSize annotations of the given type to the right of the end of this HString without going past the sentence end.
default Annotation
sentence()
Assumes the HString only overlaps with a single sentence and returns it.default List<Annotation>
sentences()
Gets the sentences overlapping this HStringdefault Stream<Annotation>
sentenceStream()
Gets a java Stream over the sentences overlapping this HString.default void
setLanguage(Language language)
Sets the language of the HStringdefault List<HString>
split(Predicate<? super Annotation> delimiterPredicate)
Splits this HString using the given predicate to apply against tokens.default List<Annotation>
startingHere(@NonNull AnnotationType type)
Gets annotations of a given type that have the same starting offset as this HString.default HString
substring(int relativeStart, int relativeEnd)
Returns a new HString that is a substring of this one.default Document
toDocument()
Converts this HString into a Document copying the annotations and relations.static HString
toHString(Object o)
Helper function for converting an Object into an HString.default Annotation
tokenAt(int tokenIndex)
Gets the token at the given token index which is a relative offset from this HString.default int
tokenLength()
The length of the HString in tokensdefault List<Annotation>
tokens()
Gets the tokens overlapping this HString.default Stream<Annotation>
tokenStream()
Gets a java Stream over the tokens overlapping this HString.default String
toPOSString()
Converts the HString to a string with part-of-speech information attached using _ as the delimiter.
default String
toPOSString(char delimiter)
Converts the HString to a string with part-of-speech information attached using the given delimiterdefault HString
trim(@NonNull Predicate<? super HString> toTrimPredicate)
Trims tokens off the left and right of this HString that match the given predicate.default HString
trimLeft(@NonNull Predicate<? super HString> toTrimPredicate)
Trims tokens off the left of this HString that match the given predicate.default HString
trimRight(@NonNull Predicate<? super HString> toTrimPredicate)
Trims tokens off the right of this HString that match the given predicate.default HString
union(@NonNull HString other)
Creates a new string by performing a union over the spans of this HString and at least one more HString.static HString
union(@NonNull HString first, @NonNull HString second, @NonNull HString... others)
Creates a new string by performing a union over the spans of two or more HStrings.static HString
union(@NonNull Iterable<? extends HString> strings)
Creates a new string by performing a union over the spans of two or more HStrings.
Methods inherited from interface java.lang.CharSequence
chars, codePoints, toString
-
Methods inherited from interface com.gengoai.collection.tree.Span
compareTo, encloses, end, isEmpty, overlaps, start
-
Methods inherited from interface com.gengoai.string.StringLike
contains, contentEquals, contentEqualsIgnoreCase, endsWith, indexOf, indexOf, matcher, matcher, matches, replace, replaceAll, replaceFirst, startsWith, subSequence, toCharArray, toLowerCase, toUpperCase
Method Detail
-
toHString
static HString toHString(Object o)
Helper function for converting an Object into an HString. Will construct fragments for nulls and strings. Objects not convertible into HStrings result in detached empty annotations.
- Parameters:
o - the object to convert
- Returns:
- the HString result
-
union
static HString union(@NonNull HString first, @NonNull HString second, @NonNull HString... others)
Creates a new string by performing a union over the spans of two or more HStrings. The new HString will have a span that starts at the minimum starting position of the given strings and ends at the maximum ending position of the given strings.
- Parameters:
first - the first HString
second - the second HString
others - the other HStrings to union
- Returns:
- A new HString representing the union over the spans of the given HStrings.
-
union
static HString union(@NonNull Iterable<? extends HString> strings)
Creates a new string by performing a union over the spans of two or more HStrings. The new HString will have a span that starts at the minimum starting position of the given strings and ends at the maximum ending position of the given strings.
- Parameters:
strings - the HStrings to union
- Returns:
- A new HString representing the union over the spans of the given HStrings.
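The span arithmetic described for union can be illustrated without Hermes at all. The following is a minimal sketch in plain Java; the `SpanUnionSketch` class and its nested `Span` type are hypothetical stand-ins, not Hermes classes, and the handling of an empty input is an assumption.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration of span union; none of these types come from Hermes.
final class SpanUnionSketch {

    // Hypothetical stand-in for a character span with an exclusive end offset.
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // Union as described above: minimum start and maximum end of the inputs.
    static Span union(List<Span> spans) {
        if (spans.isEmpty()) {
            return new Span(0, 0); // assumption: empty input yields an empty span
        }
        int start = Integer.MAX_VALUE;
        int end = Integer.MIN_VALUE;
        for (Span s : spans) {
            start = Math.min(start, s.start);
            end = Math.max(end, s.end);
        }
        return new Span(start, end);
    }

    public static void main(String[] args) {
        Span u = union(Arrays.asList(new Span(10, 14), new Span(3, 7), new Span(20, 25)));
        System.out.println(u.start + ".." + u.end); // prints 3..25
    }
}
```

Note that the result covers any gaps between the inputs: the union of 3..7 and 20..25 also includes offsets 7 through 19.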
-
add
void add(Relation relation)
Adds an outgoing relation to the HString.- Parameters:
relation
- the relation to add
-
addAll
default void addAll(@NonNull @NonNull Iterable<Relation> relations)
Adds multiple outgoing relations to the HString.- Parameters:
relations
- the relations to add
-
annotationGraph
default RelationGraph annotationGraph(Tuple relationTypes, AnnotationType... annotationTypes)
Constructs a relation graph with the given relation types as the edges and the given annotation types as the vertices (the interleaved(AnnotationType...) method is used to get the annotations). Relations will be determined for annotations by including the relations of their sub-annotations (i.e. sub-spans). This allows, for example, a dependency graph to be built over other annotation types, e.g. phrase chunks.
- Parameters:
relationTypes - the relation types making up the edges
annotationTypes - annotation types making up the vertices
- Returns:
- the relation graph
-
annotationStream
default Stream<Annotation> annotationStream()
Gets a java Stream over all annotations overlapping this HString.- Returns:
- the stream of annotations
-
annotationStream
default Stream<Annotation> annotationStream(@NonNull @NonNull AnnotationType type)
Gets a java Stream over annotations of the given type overlapping this HString.- Parameters:
type
- the type of annotation making up the stream- Returns:
- the stream of given annotation type
-
annotations
default List<Annotation> annotations()
Gets all annotations overlapping this HString- Returns:
- all annotations overlapping with this HString.
-
annotations
default List<Annotation> annotations(@NonNull @NonNull AnnotationType type, @NonNull @NonNull Predicate<? super Annotation> filter)
Gets annotations of a given type and that test positive for the given filter that overlap with this HString.- Parameters:
type
- the type of annotation wantedfilter
- The filter that annotations must pass in order to be accepted- Returns:
- the list of annotations of given type meeting the given filter that overlap with this HString
-
annotations
default List<Annotation> annotations(@NonNull @NonNull AnnotationType type)
Gets annotations of a given type that overlap with this HString.- Parameters:
type
- the type of annotation wanted- Returns:
- the list of annotations of given type that overlap with this HString
-
asAnnotation
default Annotation asAnnotation()
Gets this HString as an annotation. If the HString is already an annotation it is simply cast. Otherwise a detached annotation of type AnnotationType.ROOT is created.
- Returns:
- An annotation.
-
asAnnotation
default Annotation asAnnotation(@NonNull @NonNull AnnotationType type)
Attempts to cast this HString to an Annotation of the given type. If the HString does not represent an annotation of the given type it will create a dummy detached annotation (orphaned if this HString is orphaned).- Parameters:
type
- the desired annotation type- Returns:
- the annotation
-
atBeginningOfSentence
default boolean atBeginningOfSentence()
- Returns:
- True if this HString's start is the same as the start of its sentence.
-
atEndOfSentence
default boolean atEndOfSentence()
- Returns:
- True if this HString's end is the same as the end of its sentence.
-
attribute
default <T> T attribute(@NonNull @NonNull AttributeType<T> attributeType)
Gets the value for a given attribute type- Type Parameters:
T
- the type parameter- Parameters:
attributeType
- the attribute type- Returns:
- the value associated with the attribute or null
-
attribute
default <T> T attribute(@NonNull AttributeType<T> attributeType, T defaultValue)
Gets the value for a given attribute type
- Type Parameters:
T - the type parameter
- Parameters:
attributeType - the attribute type
defaultValue - the default value
- Returns:
- the value associated with the attribute or null
-
attributeEquals
default <T> boolean attributeEquals(@NonNull AttributeType<T> attributeType, Object targetValue)
Checks if the HString has an attribute of the given type that is equal to the given target value.
- Type Parameters:
T - the attribute type parameter
- Parameters:
attributeType - the attribute type to check
targetValue - the value to compare this string's attribute value against
- Returns:
- True if the HString has the attribute and it is equal to the given target value
-
attributeIsA
default <T> boolean attributeIsA(@NonNull AttributeType<T> attributeType, Object targetValue)
Checks if the HString has an attribute of the given type that is an instance of the given target value. When the attribute type is a Tag, the isInstance method is used; otherwise equals is used.
- Type Parameters:
T - the attribute type parameter
- Parameters:
attributeType - the attribute type to check
targetValue - the value we are checking if this string's attribute value is an instance of
- Returns:
- True if the HString has the attribute and it is an instance of the given target value
-
attributeMap
AttributeMap attributeMap()
Exposes the underlying attributes as a Map- Returns:
- The attribute names and values as a map
-
categories
default Set<BasicCategories> categories()
- Returns:
- the set of base categories covering all tokens of this HString.
-
charAt
default char charAt(int index)
- Specified by:
charAt in interface CharSequence
- Specified by:
charAt in interface StringLike
-
charNGrams
default List<HString> charNGrams(int order)
Extracts character n-grams of the given order (e.g. 1=unigram, 2=bigram, etc.)
- Parameters:
order - the order of the n-gram to extract
- Returns:
- the list of character n-grams of given order making up this HString
-
charNGrams
default List<HString> charNGrams(int minOrder, int maxOrder)
Extracts all character n-grams from the given minimum to given maximum order (e.g. 1=unigram, 2=bigram, etc.)
- Parameters:
minOrder - the minimum order
maxOrder - the maximum order
- Returns:
- the list of character n-grams of order minOrder to maxOrder making up this HString
- Throws:
IllegalArgumentException - If minOrder > maxOrder or minOrder <= 0
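To make the order range and the argument check concrete, here is a minimal plain-Java sketch of character n-gram extraction over a CharSequence. It does not use Hermes; `CharNGramSketch` is a hypothetical name, it returns plain Strings rather than HStrings, and the ordering of the returned n-grams (by start offset, then by order) is an assumption rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration of character n-gram extraction; not the Hermes implementation.
final class CharNGramSketch {

    static List<String> charNGrams(CharSequence text, int minOrder, int maxOrder) {
        // Mirrors the documented precondition: minOrder > 0 and minOrder <= maxOrder.
        if (minOrder <= 0 || minOrder > maxOrder) {
            throw new IllegalArgumentException("need 0 < minOrder <= maxOrder");
        }
        List<String> grams = new ArrayList<>();
        for (int start = 0; start < text.length(); start++) {
            // Emit every n-gram of order minOrder..maxOrder that fits at this offset.
            for (int order = minOrder; order <= maxOrder && start + order <= text.length(); order++) {
                grams.add(text.subSequence(start, start + order).toString());
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(charNGrams("abc", 1, 2)); // prints [a, ab, b, bc, c]
    }
}
```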
-
children
default List<Annotation> children(@NonNull String relation)
Gets all child annotations, i.e. those annotations that have a dependency relation pointing to this HString, with the given dependency relation.
- Parameters:
relation - The dependency relation value
- Returns:
- the list of child annotations
-
children
default List<Annotation> children()
Gets all child annotations, i.e. those annotations that have a dependency relation pointing to this HString.
- Returns:
- the list of child annotations
-
computeIfAbsent
default <T> T computeIfAbsent(@NonNull @NonNull AttributeType<T> attributeType, @NonNull @NonNull Supplier<T> supplier)
Sets the value of an attribute if a value is not already set. Removes the attribute if the value is null and ignores setting a value if the attribute is null.- Type Parameters:
T
- the type parameter- Parameters:
attributeType
- the attribute typesupplier
- the supplier to generate the new value- Returns:
- The old value of the attribute or null
-
context
default HString context(int windowSize)
Generates a new HString consisting of this HString and its given window size (number) of tokens to the left and right. Note sentence boundaries are observed and the context will not go across sentences.- Parameters:
windowSize
- the window size- Returns:
- the contextualized HString
-
context
default HString context(@NonNull @NonNull AnnotationType type, int windowSize)
Generates a new HString consisting of this HString and its given window size (number) of annotation type to the left and right. Note sentence boundaries are observed and the context will not go across sentences.- Parameters:
type
- the annotation typewindowSize
- the window size- Returns:
- the contextualized HString
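The window-plus-sentence-boundary rule can be expressed as simple index clamping. The sketch below is plain Java working on token indices only; `ContextWindowSketch` and its parameters are hypothetical, and it assumes the HString lies inside a single sentence whose first and last token indices are known.

```java
// Plain-Java illustration of a context window clamped to sentence boundaries.
final class ContextWindowSketch {

    // Returns {firstTokenIndex, lastTokenIndex}, both inclusive.
    static int[] context(int first, int last, int sentenceFirst, int sentenceLast, int windowSize) {
        int from = Math.max(sentenceFirst, first - windowSize); // do not cross the sentence start
        int to = Math.min(sentenceLast, last + windowSize);     // do not cross the sentence end
        return new int[] {from, to};
    }

    public static void main(String[] args) {
        // Sentence covers tokens 5..12, the span covers tokens 6..7, window of 3 tokens.
        int[] window = context(6, 7, 5, 12, 3);
        System.out.println(window[0] + ".." + window[1]); // prints 5..10
    }
}
```

In the example the left side is cut short at token 5 because only one token precedes the span within its sentence, while the right side receives the full three tokens.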
-
dependency
default Tuple2<String,Annotation> dependency()
- Returns:
- the outgoing dependency relation and parent for this HString, or a tuple of an empty string and an empty annotation if there is no parent.
-
dependencyGraph
default RelationGraph dependencyGraph()
Creates a RelationGraph with dependency edges and token vertices.
- Returns:
- the dependency relation graph
-
dependencyGraph
default RelationGraph dependencyGraph(@NonNull AnnotationType... types)
Creates a RelationGraph with dependency edges and vertices made up of the given types.
- Parameters:
types - The annotation types making up the vertices of the dependency relation graph.
- Returns:
- the dependency relation graph
-
dependencyIsA
default boolean dependencyIsA(String... values)
Checks if this HString has one of the given dependency relations to its parent.- Parameters:
values
- the dependency relation values to check- Returns:
- True if this HString dependency relation is one of the given values.
-
document
Document document()
- Returns:
- the document that this HString is associated with
-
embedding
default NDArray embedding()
-
enclosedAnnotations
default List<Annotation> enclosedAnnotations()
- Returns:
- all annotations enclosed by this HString
-
enclosedAnnotations
default List<Annotation> enclosedAnnotations(@NonNull @NonNull AnnotationType annotationType)
Gets all annotations of the given type enclosed by this HString- Parameters:
annotationType
- the annotation type we want- Returns:
- the enclosed annotations
-
encloses
default boolean encloses(HString other)
Checks if this HString encloses the given other HString.
- Parameters:
other - The other HString
- Returns:
- True if this one encloses the given other.
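For readers unfamiliar with span predicates, the following plain-Java sketch shows one common way to define enclosure and overlap on half-open character ranges. It is an assumption about the usual semantics, not a copy of the Hermes implementation (which may, for instance, also require both HStrings to belong to the same document), and `SpanRelationSketch` is a hypothetical name.

```java
// Plain-Java illustration of span predicates on [start, end) character ranges.
final class SpanRelationSketch {

    // True if [otherStart, otherEnd) lies entirely inside [start, end).
    static boolean encloses(int start, int end, int otherStart, int otherEnd) {
        return start <= otherStart && otherEnd <= end;
    }

    // True if the two ranges share at least one character position.
    static boolean overlaps(int start, int end, int otherStart, int otherEnd) {
        return start < otherEnd && otherStart < end;
    }

    public static void main(String[] args) {
        System.out.println(encloses(0, 10, 2, 5)); // prints true
        System.out.println(overlaps(0, 5, 5, 8));  // prints false
    }
}
```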
-
find
default HString find(String text)
Finds the given text in this HString starting from the beginning of this HString. If the document is annotated with tokens, the match will extend to the token(s) covering the match.- Parameters:
text
- the text to search for- Returns:
- the HString for the match or empty if no match is found.
-
find
default HString find(String text, int start)
Finds the given text in this HString starting from the given start index of this HString. If the document is annotated with tokens, the match will extend to the token(s) covering the match.- Parameters:
text
- the text to search forstart
- the index to start the search from- Returns:
- the HString for the match or empty if no match is found.
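The note that a match "will extend to the token(s) covering the match" can be pictured as snapping the raw match offsets outward to token boundaries. The sketch below is plain Java over a String and an array of token offsets; `TokenAlignedFindSketch` is a hypothetical name, and returning a zero-length range for "no match" is an assumption standing in for an empty HString.

```java
// Plain-Java illustration of extending a text match to the covering tokens.
final class TokenAlignedFindSketch {

    // tokens holds {start, end} pairs with exclusive ends; returns {start, end} of the match.
    static int[] find(String content, int[][] tokens, String text, int start) {
        int matchStart = content.indexOf(text, start);
        if (matchStart < 0) {
            return new int[] {0, 0}; // assumption: a zero-length range signals "no match"
        }
        int matchEnd = matchStart + text.length();
        int from = matchStart;
        int to = matchEnd;
        for (int[] token : tokens) {
            if (token[0] <= matchStart && matchStart < token[1]) {
                from = token[0]; // token covering the first matched character
            }
            if (token[0] < matchEnd && matchEnd <= token[1]) {
                to = token[1];   // token covering the last matched character
            }
        }
        return new int[] {from, to};
    }

    public static void main(String[] args) {
        String content = "The quick brown fox";
        int[][] tokens = {{0, 3}, {4, 9}, {10, 15}, {16, 19}};
        int[] match = find(content, tokens, "ick bro", 0);
        System.out.println(content.substring(match[0], match[1])); // prints quick brown
    }
}
```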
-
findAll
default Stream<HString> findAll(String text)
Finds all occurrences of the given text in this HString
- Parameters:
text - the text to search for
- Returns:
- A stream of HStrings that match the given text
-
first
default Annotation first(@NonNull @NonNull AnnotationType type)
Gets the first annotation overlapping this HString with the given annotation type.- Parameters:
type
- the annotation type- Returns:
- the first annotation of the given type overlapping this HString or an empty annotation if there is none.
-
firstToken
default Annotation firstToken()
Gets the first token annotation overlapping this HString.
- Returns:
- the first token annotation
-
forEach
default void forEach(@NonNull @NonNull AnnotationType type, @NonNull @NonNull Consumer<? super Annotation> consumer)
Convenience method for processing annotations of a given type.- Parameters:
type
- the annotation typeconsumer
- the consumer to use for processing annotations
-
getLanguage
default Language getLanguage()
- Specified by:
getLanguage in interface StringLike
-
setLanguage
default void setLanguage(Language language)
Sets the language of the HString- Parameters:
language
- The language of the HString.
-
getLemma
default String getLemma()
Gets the lemmatized version of the HString. Lemmas of longer phrases are constructed from token lemmas.- Returns:
- The lemmatized version of the HString.
-
getMorphologicalFeatures
default UniversalFeatureSet getMorphologicalFeatures()
-
getStemmedForm
default String getStemmedForm()
Gets the stemmed version of the HString. Stems of tokens are determined using the Stemmer associated with the language that the token is in. Tokens store their stem using the STEM attribute, so that the stem only needs to be calculated once. Stems of longer phrases are constructed from token stems.
- Returns:
- The stemmed version of the HString.
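The compute-once behavior described above is a form of per-token caching. The following plain-Java sketch shows the general pattern with a map standing in for the attribute store; the `Token` class, the `"STEM"` map key, the toy suffix-stripping stemmer, and joining token stems with a space are all illustrative assumptions, not Hermes code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

// Plain-Java illustration of caching a token's stem in an attribute map.
final class StemCacheSketch {
    static int stemmerCalls = 0; // counts how often the (toy) stemmer actually runs

    // Hypothetical token with a simple attribute map.
    static final class Token {
        final String text;
        final Map<String, String> attributes = new HashMap<>();

        Token(String text) {
            this.text = text;
        }
    }

    // Toy stemmer: lower-cases and strips a trailing "ing" or "s". Not a real algorithm.
    static String toyStem(String word) {
        stemmerCalls++;
        String lower = word.toLowerCase(Locale.ROOT);
        if (lower.endsWith("ing") && lower.length() > 5) {
            return lower.substring(0, lower.length() - 3);
        }
        if (lower.endsWith("s") && lower.length() > 3) {
            return lower.substring(0, lower.length() - 1);
        }
        return lower;
    }

    // The stem is computed on first access and read from the attribute afterwards.
    static String stemOf(Token token) {
        return token.attributes.computeIfAbsent("STEM", key -> toyStem(token.text));
    }

    // Assumption: a phrase stem is the token stems joined by a single space.
    static String stemOf(List<Token> phrase) {
        return phrase.stream().map(StemCacheSketch::stemOf).collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        List<Token> phrase = Arrays.asList(new Token("Walking"), new Token("dogs"));
        System.out.println(stemOf(phrase)); // prints walk dog
        stemOf(phrase);                     // second call is served from the cached attributes
        System.out.println(stemmerCalls);   // prints 2
    }
}
```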
-
hasAnnotation
default boolean hasAnnotation(@NonNull AnnotationType annotationType)
Determines if an annotation of a given type is associated with the HString
- Parameters:
annotationType - The annotation type
- Returns:
- True if an annotation of the given type is associated with the HString, False otherwise
-
hasAttribute
default boolean hasAttribute(@NonNull @NonNull AttributeType<?> attributeType)
Determines if an attribute of a given type is associated with the HString- Parameters:
attributeType
- The attribute type- Returns:
- True if the attribute is associated with the HString, False otherwise
-
hasIncomingRelation
default boolean hasIncomingRelation(@NonNull RelationType type, String value)
Checks if the HString has at least one incoming relation of the given type with the given value. Will check sub-annotations as well.
- Parameters:
type - the relation type
value - the relation value
- Returns:
- True if there is an incoming relation to this HString or a sub-annotation of the given type with the given value.
-
hasIncomingRelation
default boolean hasIncomingRelation(@NonNull @NonNull RelationType relationType)
Determines if an incoming relation of a given type is associated with the HString- Parameters:
relationType
- The relation type- Returns:
- True if the relation is associated with the HString, False otherwise
-
hasOutgoingRelation
default boolean hasOutgoingRelation(@NonNull RelationType type, String value)
Checks if the HString has at least one outgoing relation of the given type with the given value. Will check sub-annotations as well.
- Parameters:
type - the relation type
value - the relation value
- Returns:
- True if there is an outgoing relation from this HString or a sub-annotation of the given type with the given value.
-
hasOutgoingRelation
default boolean hasOutgoingRelation(@NonNull @NonNull RelationType relationType)
Determines if an outgoing relation of a given type is associated with the HString- Parameters:
relationType
- The relation type- Returns:
- True if the relation is associated with the HString, False otherwise
-
head
default HString head()
Gets the token that is highest in the dependency tree for this HString- Returns:
- the head
-
ifNotEmpty
default void ifNotEmpty(@NonNull @NonNull Consumer<? super HString> processor)
Runs the given processor on the HString if it is not empty.- Parameters:
processor
- the processor to run on this HString if it is not empty.
-
incoming
default List<Annotation> incoming(RelationType type, String value)
Gets all annotations that have a relation with this HString, or any of its sub-annotations, as the target.
- Parameters:
type - the relation type
value - the value of the relation
- Returns:
- the annotations
-
incoming
default List<Annotation> incoming(@NonNull RelationType type, @NonNull String value, boolean includeSubAnnotations)
Gets all annotations that have a relation of the given type and value with this HString as the target. If includeSubAnnotations is true, then all sub-annotations are examined as potential targets as well.
- Parameters:
type
- the relation type
value
- the relation value
includeSubAnnotations
- True: this HString or any of its sub-annotations can be the target. False: only relations with this exact HString as the target are considered.
- Returns:
- the annotations
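The effect of includeSubAnnotations can be illustrated with a small stand-alone model. The sketch below does not use the Hermes classes: RelationSketch, IncomingSketch, and the string ids are hypothetical names, and a relation is assumed to be a simple (type, value, source, target) tuple.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a relation edge; this is not the Hermes Relation class.
final class RelationSketch {
    final String type;
    final String value;
    final String source; // id of the annotation the relation starts from
    final String target; // id of the annotation the relation points to

    RelationSketch(String type, String value, String source, String target) {
        this.type = type;
        this.value = value;
        this.source = source;
        this.target = target;
    }
}

final class IncomingSketch {
    // Collects the source ids of all relations with the given type and value whose
    // target is `self`, or, when includeSubAnnotations is true, `self` or any id
    // listed in `subAnnotations`.
    static List<String> incoming(List<RelationSketch> relations,
                                 String self,
                                 List<String> subAnnotations,
                                 String type,
                                 String value,
                                 boolean includeSubAnnotations) {
        Set<String> targets = new HashSet<>();
        targets.add(self);
        if (includeSubAnnotations) {
            targets.addAll(subAnnotations);
        }
        List<String> sources = new ArrayList<>();
        for (RelationSketch r : relations) {
            if (r.type.equals(type) && r.value.equals(value) && targets.contains(r.target)) {
                sources.add(r.source);
            }
        }
        return sources;
    }
}
```

With includeSubAnnotations set to false, only relations that point at the exact span are returned; with true, relations that point at a token inside the span are returned as well.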
-
incoming
default List<Annotation> incoming(@NonNull RelationType type)
Gets all annotations that have a relation of the given type with this HString, or any of its sub-annotations, as the target.
- Parameters:
type
- the relation type
- Returns:
- the annotations
-
incoming
default List<Annotation> incoming(@NonNull RelationType type, boolean includeSubAnnotations)
Gets all annotations that have a relation of the given type with this HString as the target. If includeSubAnnotations is true, then all sub-annotations are examined as potential targets as well.
- Parameters:
type
- the relation type
includeSubAnnotations
- True: this HString or any of its sub-annotations can be the target. False: only relations with this exact HString as the target are considered.
- Returns:
- the annotations
-
incomingRelationStream
default Stream<Relation> incomingRelationStream()
Get all incoming relations to this HString and its sub-annotations.- Returns:
- the stream of relations
-
incomingRelationStream
default Stream<Relation> incomingRelationStream(boolean includeSubAnnotations)
Gets all incoming relations to this HString.- Parameters:
includeSubAnnotations
- True - include relations to sub-annotations- Returns:
- the stream of relations
-
incomingRelations
default List<Relation> incomingRelations()
Get all incoming relations to this HString and its sub-annotations.- Returns:
- the collection of relations
-
incomingRelations
default List<Relation> incomingRelations(boolean includeSubAnnotations)
Gets all incoming relations to this HString.- Parameters:
includeSubAnnotations
- True - include relations to sub-annotations- Returns:
- the collection of relations
-
incomingRelations
default List<Relation> incomingRelations(@NonNull RelationType relationType)
Gets all relations of the given type targeting this HString or one of its sub-annotations.- Parameters:
relationType
- the relation type- Returns:
- the relations
-
incomingRelations
default List<Relation> incomingRelations(@NonNull RelationType relationType, boolean includeSubAnnotations)
Gets all relations of the given type targeting this HString.
- Parameters:
relationType
- the relation type
includeSubAnnotations
- True - include relations to sub-annotations
- Returns:
- the relations
-
interleaved
default List<Annotation> interleaved(AnnotationType... types)
Returns the annotations of the given types that overlap this string in a maximum match fashion. Each token in the string is examined and the annotation type with the longest span on that token is chosen. If more than one type has the same span length, the first one found is chosen, i.e. the order in which the types are passed to the method can affect the outcome.
This is useful, for example, when dealing with multi-word expressions. Using the interleaved method you can retrieve all tokens and multi-word expressions to fully match the span of the string.
- Parameters:
types
- The annotation types to examine
- Returns:
- The list of interleaved annotations
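The maximum match selection can be sketched without the Hermes classes. In the model below, TypedSpan and InterleaveSketch are hypothetical names, spans are expressed in token indices with an exclusive end, and the loop is one possible reading of the description above rather than the library's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical annotation model: a type name plus a token range [start, end).
final class TypedSpan {
    final String type;
    final int start;
    final int end;

    TypedSpan(String type, int start, int end) {
        this.type = type;
        this.start = start;
        this.end = end;
    }

    int length() {
        return end - start;
    }

    @Override
    public String toString() {
        return type + "[" + start + "," + end + ")";
    }
}

final class InterleaveSketch {
    // For each token position, picks the longest annotation of the requested types
    // covering that token. The strict '>' comparison means that on a tie the
    // annotation found first (earlier type in `types`) is kept.
    static List<TypedSpan> interleaved(int tokenCount, List<TypedSpan> annotations, String... types) {
        List<TypedSpan> result = new ArrayList<>();
        int i = 0;
        while (i < tokenCount) {
            TypedSpan best = null;
            for (String type : types) {
                for (TypedSpan a : annotations) {
                    boolean coversToken = a.start <= i && i < a.end;
                    if (a.type.equals(type) && coversToken
                            && (best == null || a.length() > best.length())) {
                        best = a;
                    }
                }
            }
            if (best == null) {
                i++; // no annotation of the requested types on this token
            } else {
                result.add(best);
                i = best.end; // continue after the chosen annotation
            }
        }
        return result;
    }
}
```

For the tokens "New York City is", a multi-word expression over the first three tokens wins over the individual tokens, and a tie on the last token is resolved by the order of the type arguments.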
-
isA
default boolean isA(@NonNull BasicCategories... categories)
Checks if this HString has a base category of one of the ones given.
- Parameters:
categories
- the categories to check for
- Returns:
- True if this HString has a base category of one of the ones given.
-
isAnnotation
default boolean isAnnotation()
Is this HString an annotation?- Returns:
- True if this HString represents an annotation
-
isDocument
default boolean isDocument()
- Returns:
- True if this HString represents a document
-
isInstance
default boolean isInstance(AnnotationType type)
Returns true if this HString is an instance of the given annotation type.
- Parameters:
type
- the annotation type
- Returns:
- True if this HString is an annotation of the given type
-
last
default Annotation last(@NonNull AnnotationType type)
Gets the last annotation overlapping this HString with the given annotation type.- Parameters:
type
- the annotation type- Returns:
- the last annotation of the given type overlapping this HString or a detached empty annotation if there is none.
-
lastToken
default Annotation lastToken()
- Returns:
- the last token annotation overlapping this HString
-
leftContext
default HString leftContext(int windowSize)
Generates an HString representing the windowSize tokens to the left of the start of this HString.
- Parameters:
windowSize
- the number of tokens in the context.
- Returns:
- the HString context
-
leftContext
default HString leftContext(@NonNull AnnotationType type, int windowSize)
Generates an HString representing the windowSize annotations of the given type to the left of the start of this HString.
- Parameters:
type
- the annotation type to create the context of.
windowSize
- the number of annotations of the given type in the context.
- Returns:
- the HString context
-
length
default int length()
- Specified by:
length
in interface CharSequence
- Specified by:
length
in interface Span
- Specified by:
length
in interface StringLike
-
next
default Annotation next(@NonNull AnnotationType type)
Gets the annotation of a given type that is next in order (of span) to this one- Parameters:
type
- the type of annotation wanted- Returns:
- the next annotation of the given type or null
-
outgoing
default List<Annotation> outgoing(RelationType type)
Gets all annotations with which this HString or any of its sub-annotations has an outgoing relation of the given type.- Parameters:
type
- the relation type- Returns:
- the annotations
-
outgoing
default List<Annotation> outgoing(@NonNull RelationType type, boolean includeSubAnnotations)
Gets all annotations with which this HString has an outgoing relation of the given type.
- Parameters:
type
- the relation type
includeSubAnnotations
- True - include annotations for which any of the sub-annotations has an outgoing relation.
- Returns:
- the annotations
-
outgoing
default List<Annotation> outgoing(RelationType type, String value)
Gets all annotations with which this HString or any of its sub-annotations has an outgoing relation of the given type and value.
- Parameters:
type
- the relation type
value
- the relation value
- Returns:
- the annotations
-
outgoing
default List<Annotation> outgoing(@NonNull RelationType type, String value, boolean includeSubAnnotations)
Gets all annotations with which this HString has an outgoing relation of the given type and value.
- Parameters:
type
- the relation type
value
- the relation value
includeSubAnnotations
- True - include annotations for which any of the sub-annotations has an outgoing relation.
- Returns:
- the annotations
-
outgoingRelationStream
default Stream<Relation> outgoingRelationStream()
Gets all outgoing relations from this HString and its sub-annotations.
- Returns:
- the stream of relations
-
outgoingRelationStream
default Stream<Relation> outgoingRelationStream(boolean includeSubAnnotations)
Gets all outgoing relations from this HString.
- Parameters:
includeSubAnnotations
- True - include relations from sub-annotations
- Returns:
- the stream of relations
-
outgoingRelations
default List<Relation> outgoingRelations()
Gets all outgoing relations from this HString and its sub-annotations.
- Returns:
- the collection of relations
-
outgoingRelations
default List<Relation> outgoingRelations(boolean includeSubAnnotations)
Gets all outgoing relations from this HString.
- Parameters:
includeSubAnnotations
- True - include relations from sub-annotations
- Returns:
- the collection of relations
-
outgoingRelations
default List<Relation> outgoingRelations(@NonNull RelationType relationType)
Gets all relations of the given type originating from this HString or one of its sub-annotations.- Parameters:
relationType
- the relation type- Returns:
- the relations
-
outgoingRelations
default List<Relation> outgoingRelations(@NonNull RelationType relationType, boolean includeSubAnnotations)
Gets all relations of the given type originating from this HString.
- Parameters:
relationType
- the relation type
includeSubAnnotations
- True - include relations from sub-annotations
- Returns:
- the relations
-
overlaps
default boolean overlaps(HString other)
Checks if this HString overlaps with the given other.- Parameters:
other
- The other HString- Returns:
- True if this one overlaps with the given other.
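As an illustration of the overlap test, the sketch below works on plain character offsets. It assumes spans are half-open, i.e. the start offset is inclusive and the end offset is exclusive; OverlapSketch is a hypothetical name, and the real method may apply further checks (for example, that both HStrings belong to the same document).

```java
final class OverlapSketch {
    // Two half-open spans [start1, end1) and [start2, end2) overlap when each one
    // starts before the other one ends.
    static boolean overlaps(int start1, int end1, int start2, int end2) {
        return start1 < end2 && start2 < end1;
    }
}
```

Under this assumption, spans that merely touch, such as [4, 9) and [9, 12), do not overlap.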
-
parent
default Annotation parent()
Gets the dependency parent of this HString- Returns:
- the parent
-
pos
default PartOfSpeech pos()
Gets the part-of-speech of the HString- Returns:
- The best part-of-speech for the HString
-
previous
default Annotation previous(@NonNull AnnotationType type)
Gets the annotation of a given type that is previous in order (of span) to this one- Parameters:
type
- the type of annotation wanted- Returns:
- the previous annotation of the given type or null
-
put
default <T> T put(@NonNull AttributeType<T> attributeType, T value)
Sets the value of an attribute. Removes the attribute if the value is null and ignores the call if the attribute type is null.
- Type Parameters:
T
- the type parameter
- Parameters:
attributeType
- the attribute type
value
- the value
- Returns:
- The old value of the attribute or null
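The null-handling rules described for put can be modelled with an ordinary map. This is a minimal sketch, not the Hermes attribute storage: AttributeMapSketch is a hypothetical class and attribute types are represented by plain strings.

```java
import java.util.HashMap;
import java.util.Map;

final class AttributeMapSketch {
    private final Map<String, Object> attributes = new HashMap<>();

    // Mirrors the documented behaviour: a null attribute type is ignored, a null
    // value removes the attribute, and the previous value (or null) is returned.
    Object put(String attributeType, Object value) {
        if (attributeType == null) {
            return null;
        }
        if (value == null) {
            return attributes.remove(attributeType);
        }
        return attributes.put(attributeType, value);
    }

    Object get(String attributeType) {
        return attributes.get(attributeType);
    }

    int size() {
        return attributes.size();
    }
}
```

Calling put twice for the same attribute returns the first value on the second call, and a subsequent put with a null value removes the entry.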
-
putAdd
default <E,T extends Collection<E>> void putAdd(@NonNull AttributeType<T> attributeType, @NonNull Iterable<E> items)
Allows adding multiple values to a Collection-based attribute.
- Type Parameters:
E
- the element type parameter
T
- the attribute type parameter
- Parameters:
attributeType
- the attribute type
items
- the items to add
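A rough picture of adding to a Collection-based attribute, using a plain map in place of the Hermes attribute storage. PutAddSketch and the string keys are hypothetical, and creating an empty list when the attribute is not yet set is an assumption of this sketch, not documented behaviour.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Map;

final class PutAddSketch {
    // Appends all items to the collection stored under attributeType, creating an
    // empty list first if nothing is stored yet (an assumption of this sketch).
    static <E> void putAdd(Map<String, Collection<E>> attributes,
                           String attributeType,
                           Iterable<E> items) {
        Collection<E> values = attributes.computeIfAbsent(attributeType, key -> new ArrayList<>());
        for (E item : items) {
            values.add(item);
        }
    }
}
```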
-
putAll
default void putAll(@NonNull Map<AttributeType<?>,?> map)
Sets attributes on this HString from those in the given map.- Parameters:
map
- the attribute-value map
-
putAll
default void putAll(@NonNull HString hString)
Copies the attribute values from the given HString to this one- Parameters:
hString
- The HString whose attributes we want to copy.
-
putIfAbsent
default <T> T putIfAbsent(@NonNull AttributeType<T> attributeType, T value)
Sets the value of an attribute if a value is not already set. Removes the attribute if the value is null and ignores the call if the attribute type is null.
- Type Parameters:
T
- the type parameter
- Parameters:
attributeType
- the attribute type
value
- the value to put
- Returns:
- The old value of the attribute or null
-
removeAttribute
default <T> T removeAttribute(@NonNull AttributeType<T> attributeType)
Removes an attribute from the HString.- Type Parameters:
T
- the type parameter- Parameters:
attributeType
- the attribute type- Returns:
- the value that was associated with the attribute
-
removeRelation
void removeRelation(@NonNull Relation relation)
Removes the given relation from this annotation- Parameters:
relation
- the relation to remove
-
rightContext
default HString rightContext(int windowSize)
Generates an HString representing the windowSize tokens to the right of the end of this HString without going past the sentence end.
- Parameters:
windowSize
- the number of tokens in the context.
- Returns:
- the HString context
-
rightContext
default HString rightContext(@NonNull AnnotationType type, int windowSize)
Generates an HString representing the windowSize annotations of the given type to the right of the end of this HString without going past the sentence end.
- Parameters:
type
- the annotation type to create the context of.
windowSize
- the number of annotations of the given type in the context.
- Returns:
- the HString context
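The windowing behaviour of rightContext, including the stop at the sentence end, can be sketched over a plain token list. ContextSketch is a hypothetical name, positions are token indices, and both endToken and sentenceEnd are assumed to be exclusive indices.

```java
import java.util.ArrayList;
import java.util.List;

final class ContextSketch {
    // Returns up to windowSize tokens that follow the span ending at endToken
    // (exclusive), never reading past sentenceEnd (exclusive).
    static List<String> rightContext(List<String> tokens, int endToken, int sentenceEnd, int windowSize) {
        int from = Math.min(endToken, sentenceEnd);
        int to = Math.min(sentenceEnd, from + windowSize);
        return new ArrayList<>(tokens.subList(from, to));
    }
}
```

If fewer than windowSize tokens remain in the sentence, the context is simply shorter than requested.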
-
sentence
default Annotation sentence()
Assumes the HString only overlaps with a single sentence and returns it. This is equivalent to calling first(AnnotationType) with the annotation type set to Types.SENTENCE.
- Returns:
- the first, and possibly only, sentence this HString overlaps with.
-
sentenceStream
default Stream<Annotation> sentenceStream()
Gets a java Stream over the sentences overlapping this HString.- Returns:
- the stream of sentences
-
sentences
default List<Annotation> sentences()
Gets the sentences overlapping this HString- Returns:
- the sentences overlapping this annotation.
-
split
default List<HString> split(Predicate<? super Annotation> delimiterPredicate)
Splits this HString using the given predicate to apply against tokens. An example of where this might be useful is when we want to split long phrases on different punctuation, e.g. commas or semicolons.- Parameters:
delimiterPredicate
- the predicate to use to determine if a token is a delimiter or not- Returns:
- the list of split HStrings
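Splitting on delimiter tokens can be sketched over a list of token strings. SplitSketch is a hypothetical name; dropping the delimiter tokens and skipping empty pieces are assumptions of this sketch rather than documented behaviour.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class SplitSketch {
    // Breaks the token sequence at every token for which isDelimiter is true.
    // Delimiter tokens are dropped and empty pieces are not emitted.
    static List<List<String>> split(List<String> tokens, Predicate<String> isDelimiter) {
        List<List<String>> parts = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String token : tokens) {
            if (isDelimiter.test(token)) {
                if (!current.isEmpty()) {
                    parts.add(current);
                    current = new ArrayList<>();
                }
            } else {
                current.add(token);
            }
        }
        if (!current.isEmpty()) {
            parts.add(current);
        }
        return parts;
    }
}
```

For a phrase such as "red , dark green ; blue" with a predicate matching commas and semicolons, this yields three pieces.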
-
startingHere
default List<Annotation> startingHere(@NonNull AnnotationType type)
Gets annotations of a given type that have the same starting offset as this HString.
- Parameters:
type
- the type of annotation wanted
- Returns:
- the list of annotations of the given type that have the same starting offset as this HString.
-
substring
default HString substring(int relativeStart, int relativeEnd)
Returns a new HString that is a substring of this one. The substring begins at the specified relativeStart and extends to the character at index relativeEnd - 1. Thus the length of the substring is relativeEnd - relativeStart.
- Parameters:
relativeStart
- the relative start within this HString
relativeEnd
- the relative end within this HString
- Returns:
- the specified substring.
- Throws:
IndexOutOfBoundsException
- if the relativeStart is negative, or relativeEnd is larger than the length of this HString object, or relativeStart is larger than relativeEnd.
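The relationship between relative and document offsets can be made concrete with a small model. SpanTextSketch is a hypothetical class that keeps a reference to the document text plus absolute start and end offsets; it is a sketch of the arithmetic described above, not the Hermes implementation.

```java
final class SpanTextSketch {
    private final String document;
    private final int start; // absolute start offset in the document (inclusive)
    private final int end;   // absolute end offset in the document (exclusive)

    SpanTextSketch(String document, int start, int end) {
        this.document = document;
        this.start = start;
        this.end = end;
    }

    int start() {
        return start;
    }

    int end() {
        return end;
    }

    int length() {
        return end - start;
    }

    // Offsets are relative to this span; the result keeps absolute offsets.
    SpanTextSketch substring(int relativeStart, int relativeEnd) {
        if (relativeStart < 0 || relativeEnd > length() || relativeStart > relativeEnd) {
            throw new IndexOutOfBoundsException(
                "invalid range [" + relativeStart + ", " + relativeEnd + ")");
        }
        return new SpanTextSketch(document, start + relativeStart, start + relativeEnd);
    }

    @Override
    public String toString() {
        return document.substring(start, end);
    }
}
```

For the span "quick brown" at document offsets [4, 15) of "the quick brown fox", substring(6, 11) yields "brown" at document offsets [10, 15).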
-
toDocument
default Document toDocument()
Converts this HString into a Document copying the annotations and relations.- Returns:
- the new document covering this HString
-
toPOSString
default String toPOSString()
Converts the HString to a string with part-of-speech information attached using _ as the delimiter.
- Returns:
- the HString with part-of-speech information attached to tokens
-
toPOSString
default String toPOSString(char delimiter)
Converts the HString to a string with part-of-speech information attached using the given delimiter- Parameters:
delimiter
- the delimiter to use to separate word and part-of-speech- Returns:
- the HString with part-of-speech information attached to tokens
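A sketch of what a delimiter-joined part-of-speech string might look like. PosStringSketch is a hypothetical name, and the tag names and the single space between tokens are assumptions of this example; the exact output format of the library method is not specified here.

```java
import java.util.List;

final class PosStringSketch {
    // Joins each token with its tag using the delimiter and separates the
    // token/tag pairs with single spaces (the spacing is an assumption).
    static String toPOSString(List<String> tokens, List<String> tags, char delimiter) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokens.size(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(tokens.get(i)).append(delimiter).append(tags.get(i));
        }
        return sb.toString();
    }
}
```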
-
tokenAt
default Annotation tokenAt(int tokenIndex)
Gets the token at the given token index, which is a relative offset from this HString. For example, given a document with the following tokens:
["the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"]
and this annotated HString spanning ["quick", "brown", "fox"], "quick" would have a relative offset in this HString of 0 and a document offset of 1.
- Parameters:
tokenIndex
- the token index relative to the tokens overlapping this HString.
- Returns:
- the token annotation at the relative offset
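The example above translates into simple index arithmetic. In this sketch, TokenAtSketch is a hypothetical name, the span is given as a token range [spanStartToken, spanEndToken) within the document, and throwing on an out-of-range index is an assumption; the library may handle that case differently.

```java
import java.util.List;

final class TokenAtSketch {
    // Maps an index relative to the span onto the document's token list.
    static String tokenAt(List<String> documentTokens, int spanStartToken, int spanEndToken, int tokenIndex) {
        if (tokenIndex < 0 || tokenIndex >= spanEndToken - spanStartToken) {
            throw new IndexOutOfBoundsException("token index " + tokenIndex);
        }
        return documentTokens.get(spanStartToken + tokenIndex);
    }
}
```

With the nine-token document from the description and the span covering tokens 1 to 3, relative index 0 resolves to "quick" and relative index 2 to "fox".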
-
tokenLength
default int tokenLength()
The length of the HString in tokens- Returns:
- the number of tokens in this annotation
-
tokenStream
default Stream<Annotation> tokenStream()
Gets a java Stream over the tokens overlapping this HString.- Returns:
- the stream of tokens
-
tokens
default List<Annotation> tokens()
Gets the tokens overlapping this HString.- Returns:
- the tokens overlapping this annotation.
-
trim
default HString trim(@NonNull Predicate<? super HString> toTrimPredicate)
Trims tokens off the left and right of this HString that match the given predicate.
- Parameters:
toTrimPredicate
- the predicate to use to determine if a token should be removed (evaluates to TRUE) or kept (evaluates to FALSE).
- Returns:
- the trimmed HString
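Predicate-based trimming can be sketched over a list of token strings. TrimSketch is a hypothetical name; the sketch only illustrates that matching tokens are removed from both ends while matching tokens in the interior are kept.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class TrimSketch {
    // Drops leading and trailing tokens for which toTrim is true; tokens in the
    // middle are kept even if they match.
    static List<String> trim(List<String> tokens, Predicate<String> toTrim) {
        int start = 0;
        int end = tokens.size();
        while (start < end && toTrim.test(tokens.get(start))) {
            start++;
        }
        while (end > start && toTrim.test(tokens.get(end - 1))) {
            end--;
        }
        return new ArrayList<>(tokens.subList(start, end));
    }
}
```

Trimming stop words from "the king of spain the" leaves "king of spain": the inner "of" survives because trimming stops at the first non-matching token on each side.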
-
trimLeft
default HString trimLeft(@NonNull Predicate<? super HString> toTrimPredicate)
Trims tokens off the left of this HString that match the given predicate.
- Parameters:
toTrimPredicate
- the predicate to use to determine if a token should be removed (evaluates to TRUE) or kept (evaluates to FALSE).
- Returns:
- the trimmed HString
-
trimRight
default HString trimRight(@NonNull Predicate<? super HString> toTrimPredicate)
Trims tokens off the right of this HString that match the given predicate.
- Parameters:
toTrimPredicate
- the predicate to use to determine if a token should be removed (evaluates to TRUE) or kept (evaluates to FALSE).
- Returns:
- the trimmed HString
-
union
default HString union(@NonNull HString other)
Creates a new HString by performing a union over the spans of this HString and the given HString. The new HString will have a span that starts at the minimum starting position of the two HStrings and ends at the maximum ending position of the two HStrings.
- Parameters:
other
- the HString to union with
- Returns:
- A new HString representing the union over the spans of the two HStrings.
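The span arithmetic behind union is a minimum over the starts and a maximum over the ends. The sketch below uses hypothetical {start, end} integer pairs in place of HString spans; UnionSketch is not part of the library.

```java
final class UnionSketch {
    // Spans are {start, end} pairs of character offsets. The union starts at the
    // smaller start and ends at the larger end, so it also covers any gap between
    // the two inputs.
    static int[] union(int[] a, int[] b) {
        return new int[] { Math.min(a[0], b[0]), Math.max(a[1], b[1]) };
    }
}
```

For "quick" at [4, 9) and "fox" at [16, 19) in "the quick brown fox", the union is [4, 19), which covers "quick brown fox".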
-
-