Package jexer.bits
Class ExtendedGraphemeClusterUtils
java.lang.Object
  jexer.bits.ExtendedGraphemeClusterUtils
public class ExtendedGraphemeClusterUtils
extends java.lang.Object
ExtendedGraphemeClusterUtils implements most, but not all, of the grapheme cluster breaking rules of Unicode TR #29 section 3.1.1. Specifically:
- GB3 is deliberately ignored.
- GB4 and GB5 will break at all control characters including CR.
- GB9c is not implemented.
- GB11 and GB12 do not count "evenness" of previous regional indicator (RI) symbols, instead always joining.
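The last deviation can be illustrated with a small standalone sketch. It does not call Jexer and is not Jexer's implementation; the class `RegionalIndicatorPairingSketch` and its `cluster` method are hypothetical names introduced here, and the sketch only looks at runs of Regional Indicator symbols (every other code point becomes its own cluster). With `countEvenness` set to true it pairs RI symbols two at a time, as the Unicode rules describe; with false it joins every consecutive RI, which is the behavior this class documents.

```java
import java.util.ArrayList;
import java.util.List;

final class RegionalIndicatorPairingSketch {

    /** Regional Indicator symbols occupy U+1F1E6..U+1F1FF. */
    static boolean isRi(int cp) {
        return cp >= 0x1F1E6 && cp <= 0x1F1FF;
    }

    /**
     * Splits a string into clusters, considering only RI runs.
     * countEvenness = true pairs RIs two at a time; false joins every
     * consecutive RI into a single cluster.
     */
    static List<String> cluster(String s, boolean countEvenness) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        int riRun = 0; // number of RIs immediately before the current code point
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            boolean join = isRi(cp) && riRun > 0
                && (!countEvenness || riRun % 2 == 1);
            if (cur.length() > 0 && !join) {
                out.add(cur.toString());
                cur.setLength(0);
            }
            cur.appendCodePoint(cp);
            riRun = isRi(cp) ? riRun + 1 : 0;
        }
        if (cur.length() > 0) {
            out.add(cur.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // Three RIs in a row: regional indicator letters U, S, C.
        String threeRi = new String(new int[] {0x1F1FA, 0x1F1F8, 0x1F1E8}, 0, 3);
        System.out.println(cluster(threeRi, true).size());  // prints 2
        System.out.println(cluster(threeRi, false).size()); // prints 1
    }
}
```

For three consecutive RI symbols, pairing yields two clusters (one pair plus a lone symbol), while always joining yields a single cluster.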
-
-
Method Summary
Modifier and Type, Method, Description:
- static boolean isBraille(int ch): Check if character is in the braille range.
- static boolean isCjk(int ch): Check if character is in the CJK range.
- static boolean isControl(int ch): Check if codepoint has the Control Grapheme_Cluster_Break property.
- static boolean isCR(int ch): Check if codepoint has the CR Grapheme_Cluster_Break property.
- static boolean isEmoji(int ch): Check if character is in the emoji range (Emoji, Emoji_Component, Extended_Pictographic) AND not in the Basic Multilingual Plane.
- static boolean isEmojiBMP(int ch): Check if character is in the emoji range of the Basic Multilingual Plane (Emoji, Emoji_Component, Extended_Pictographic).
- static boolean isEmojiCombiner(int ch): Check if character will always be part of a larger emoji sequence.
- static boolean isEmojiComponent(int ch): Check if character is in the Emoji_Component range.
- static boolean isExtend(int ch): Check if codepoint has the Extend Grapheme_Cluster_Break property.
- static boolean isJexerDefaultGlyph(int ch): Check if character is a less-common Unicode symbol that is used by a default Jexer user interface component.
- static boolean isL(int ch): Check if codepoint has the L Grapheme_Cluster_Break property.
- static boolean isLegacyComputingSymbol(int ch): Check if character is in the Symbols for Legacy Computing range.
- static boolean isLF(int ch): Check if codepoint has the LF Grapheme_Cluster_Break property.
- static boolean isLV(int ch): Check if codepoint has the LV Grapheme_Cluster_Break property.
- static boolean isLVT(int ch): Check if codepoint has the LVT Grapheme_Cluster_Break property.
- static boolean isOther(int ch): Check if codepoint has the Other Grapheme_Cluster_Break property.
- static boolean isPrepend(int ch): Check if codepoint has the Prepend Grapheme_Cluster_Break property.
- static boolean isRegionalIndicator(int ch): Check if character is a Regional Indicator (RI) symbol.
- static boolean isSpacingMark(int ch): Check if codepoint has the SpacingMark Grapheme_Cluster_Break property.
- static boolean isT(int ch): Check if codepoint has the T Grapheme_Cluster_Break property.
- static boolean isV(int ch): Check if codepoint has the V Grapheme_Cluster_Break property.
- static boolean isZWJ(int ch): Check if codepoint has the ZWJ Grapheme_Cluster_Break property.
- static void main(java.lang.String[] args): Test the extended grapheme cluster boundary code.
- static boolean shouldBreak(int firstCh, int secondCh): See if a grapheme cluster break should occur between two codepoints, following most of the rules of Unicode TR #29 section 3.1.1.
- static java.util.List<ComplexCell> toComplexCells(java.lang.String input): Converts a string into a sequence of grapheme clusters following most of the rules of Unicode TR #29 section 3.1.1.
-
-
-
Method Detail
-
isCjk
public static boolean isCjk(int ch)
Check if character is in the CJK range.
Parameters:
  ch - character to check
Returns:
  true if this character is in the CJK range
-
isBraille
public static boolean isBraille(int ch)
Check if character is in the braille range.
Parameters:
  ch - character to check
Returns:
  true if this character is in the braille range
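The page does not say which code points count as "the braille range". As a rough illustration, assuming it corresponds to the Unicode Braille Patterns block (U+2800..U+28FF), a range check can be written with the JDK alone. The class `BrailleRangeSketch` and method `inBraillePatterns` are hypothetical names; this is not Jexer's code.

```java
final class BrailleRangeSketch {

    /**
     * True if cp lies in the Unicode Braille Patterns block.
     * Assumption: this block is what "the braille range" refers to.
     */
    static boolean inBraillePatterns(int cp) {
        return Character.UnicodeBlock.of(cp)
            == Character.UnicodeBlock.BRAILLE_PATTERNS;
    }

    public static void main(String[] args) {
        System.out.println(inBraillePatterns(0x2801)); // BRAILLE PATTERN DOTS-1
        System.out.println(inBraillePatterns('A'));
    }
}
```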
-
isEmojiBMP
public static boolean isEmojiBMP(int ch)
Check if character is in the emoji range of the Basic Multilingual Plane (Emoji, Emoji_Component, Extended_Pictographic).
Parameters:
  ch - character to check
Returns:
  true if this character is in the emoji range
-
isEmoji
public static boolean isEmoji(int ch)
Check if character is in the emoji range (Emoji, Emoji_Component, Extended_Pictographic) AND not in the Basic Multilingual Plane. For a full check of ALL emoji, use 'isEmoji(x) || isEmojiBMP(x)'.
Parameters:
  ch - character to check
Returns:
  true if this character is in the emoji range
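isEmoji and isEmojiBMP divide their work along the Basic Multilingual Plane boundary, and both take an int code point rather than a char. The sketch below uses only JDK methods to show how a string's code points fall on either side of that boundary. It does not call Jexer and makes no claim about which code points Jexer classifies as emoji; `CodePointPlaneSketch` and its methods are hypothetical names.

```java
final class CodePointPlaneSketch {

    /** Counts code points in s that lie inside the BMP (U+0000..U+FFFF). */
    static long countBmp(String s) {
        return s.codePoints().filter(Character::isBmpCodePoint).count();
    }

    /** Counts code points in s that lie outside the BMP (U+10000 and up). */
    static long countSupplementary(String s) {
        return s.codePoints().filter(Character::isSupplementaryCodePoint).count();
    }

    public static void main(String[] args) {
        // U+2764 is a BMP code point; U+1F600 needs a surrogate pair in a String.
        String sample = "\u2764\uD83D\uDE00";
        System.out.println(countBmp(sample));           // prints 1
        System.out.println(countSupplementary(sample)); // prints 1
    }
}
```

Iterating with `String.codePoints()` rather than `charAt` keeps supplementary code points intact, so each value can be passed to an `int ch` parameter as a whole code point.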
-
isEmojiComponent
public static boolean isEmojiComponent(int ch)
Check if character is in the Emoji_Component range. Emoji_Component codepoints are part of larger sequences, but some of them can also stand alone to represent glyphs (Emoji, Extended_Pictographic).
Parameters:
  ch - character to check
Returns:
  true if this character is in the emoji component range
-
isEmojiCombiner
public static boolean isEmojiCombiner(int ch)
Check if character will always be part of a larger emoji sequence.
Parameters:
  ch - character to check
Returns:
  true if this character is only used to combine/modify emoji codepoints
-
isRegionalIndicator
public static boolean isRegionalIndicator(int ch)
Check if character is a Regional Indicator (RI) symbol.
Parameters:
  ch - character to check
Returns:
  true if this character is a Regional Indicator (RI) symbol
-
isLegacyComputingSymbol
public static boolean isLegacyComputingSymbol(int ch)
Check if character is in the Symbols for Legacy Computing range.
Parameters:
  ch - character to check
Returns:
  true if this character is in the Symbols for Legacy Computing range
-
isJexerDefaultGlyph
public static boolean isJexerDefaultGlyph(int ch)
Check if character is a less-common Unicode symbol that is used by a default Jexer user interface component.
Parameters:
  ch - character to check
Returns:
  true if this character is used somewhere in the default Jexer user interface
-
isPrepend
public static boolean isPrepend(int ch)
Check if codepoint has the Prepend Grapheme_Cluster_Break property.
Parameters:
  ch - character to check
Returns:
  true if Prepend
-
isCR
public static boolean isCR(int ch)
Check if codepoint has the CR Grapheme_Cluster_Break property.
Parameters:
  ch - character to check
Returns:
  true if CR
-
isLF
public static boolean isLF(int ch)
Check if codepoint has the LF Grapheme_Cluster_Break property.
Parameters:
  ch - character to check
Returns:
  true if LF
-
isControl
public static boolean isControl(int ch)
Check if codepoint has the Control Grapheme_Cluster_Break property.
Parameters:
  ch - character to check
Returns:
  true if Control
-
isExtend
public static boolean isExtend(int ch)
Check if codepoint has the Extend Grapheme_Cluster_Break property.
Parameters:
  ch - character to check
Returns:
  true if Extend
-
isSpacingMark
public static boolean isSpacingMark(int ch)
Check if codepoint has the SpacingMark Grapheme_Cluster_Break property.
Parameters:
  ch - character to check
Returns:
  true if SpacingMark
-
isL
public static boolean isL(int ch)
Check if codepoint has the L Grapheme_Cluster_Break property.
Parameters:
  ch - character to check
Returns:
  true if L
-
isV
public static boolean isV(int ch)
Check if codepoint has the V Grapheme_Cluster_Break property.
Parameters:
  ch - character to check
Returns:
  true if V
-
isT
public static boolean isT(int ch)
Check if codepoint has the T Grapheme_Cluster_Break property.
Parameters:
  ch - character to check
Returns:
  true if T
-
isLV
public static boolean isLV(int ch)
Check if codepoint has the LV Grapheme_Cluster_Break property.
Parameters:
  ch - character to check
Returns:
  true if LV
-
isLVT
public static boolean isLVT(int ch)
Check if codepoint has the LVT Grapheme_Cluster_Break property.
Parameters:
  ch - character to check
Returns:
  true if LVT
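The LV and LVT properties apply to precomposed Hangul syllables, and the two can be told apart with the standard Hangul syllable arithmetic: syllables start at U+AC00, there are 11172 of them, and each leading-consonant and vowel pair is followed by 28 trailing-consonant slots, the first of which means "no trailing consonant". The sketch below shows that arithmetic on its own. Whether Jexer's isLV and isLVT use this calculation or a lookup table is not stated on this page; `HangulSyllableSketch` and its methods are hypothetical names.

```java
final class HangulSyllableSketch {

    static final int S_BASE = 0xAC00;  // first precomposed Hangul syllable
    static final int S_COUNT = 11172;  // number of precomposed syllables
    static final int T_COUNT = 28;     // trailing-consonant slots per LV pair

    /** True if cp is a precomposed Hangul syllable (U+AC00..U+D7A3). */
    static boolean isPrecomposed(int cp) {
        return cp >= S_BASE && cp < S_BASE + S_COUNT;
    }

    /** LV: a syllable with no trailing consonant. */
    static boolean isLvSyllable(int cp) {
        return isPrecomposed(cp) && (cp - S_BASE) % T_COUNT == 0;
    }

    /** LVT: a syllable with a trailing consonant. */
    static boolean isLvtSyllable(int cp) {
        return isPrecomposed(cp) && (cp - S_BASE) % T_COUNT != 0;
    }

    public static void main(String[] args) {
        System.out.println(isLvSyllable(0xAC00));  // U+AC00, prints true
        System.out.println(isLvtSyllable(0xAC01)); // U+AC01, prints true
    }
}
```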
-
isZWJ
public static boolean isZWJ(int ch)
Check if codepoint has the ZWJ Grapheme_Cluster_Break property.
Parameters:
  ch - character to check
Returns:
  true if ZWJ
-
isOther
public static boolean isOther(int ch)
Check if codepoint has the Other Grapheme_Cluster_Break property.
Parameters:
  ch - character to check
Returns:
  true if Other
-
shouldBreak
public static boolean shouldBreak(int firstCh, int secondCh)
See if a grapheme cluster break should occur between two codepoints, following most of the rules of Unicode TR #29 section 3.1.1.
Parameters:
  firstCh - the first codepoint in the sequence
  secondCh - the second codepoint in the sequence
Returns:
  true if a break should occur between these codepoints
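A two-codepoint break test like this is typically applied while walking a string one code point at a time. The sketch below shows that pattern with a pluggable rule. It is a standalone illustration, not Jexer's code: `PairwiseBreakSketch`, `BreakRule`, `split`, and `toyRule` are hypothetical names, and `toyRule` is a deliberately simplified stand-in (it only keeps combining marks and U+200D attached to the preceding code point) so that the example runs without Jexer.

```java
import java.util.ArrayList;
import java.util.List;

final class PairwiseBreakSketch {

    /** Same shape as a (firstCh, secondCh) -> boolean break test. */
    @FunctionalInterface
    interface BreakRule {
        boolean shouldBreak(int first, int second);
    }

    /** Splits input wherever the rule reports a break between neighbors. */
    static List<String> split(String input, BreakRule rule) {
        List<String> clusters = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int prev = -1; // no previous code point yet
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);
            if (prev != -1 && rule.shouldBreak(prev, cp)) {
                clusters.add(current.toString());
                current.setLength(0);
            }
            current.appendCodePoint(cp);
            prev = cp;
        }
        if (current.length() > 0) {
            clusters.add(current.toString());
        }
        return clusters;
    }

    /** Toy rule: no break before a combining mark or ZWJ; break elsewhere. */
    static boolean toyRule(int first, int second) {
        int type = Character.getType(second);
        boolean mark = type == Character.NON_SPACING_MARK
            || type == Character.ENCLOSING_MARK
            || type == Character.COMBINING_SPACING_MARK;
        return !(mark || second == 0x200D);
    }

    public static void main(String[] args) {
        // "e" + COMBINING ACUTE ACCENT + "x" splits into two clusters.
        System.out.println(split("e\u0301x", PairwiseBreakSketch::toyRule).size()); // prints 2
    }
}
```

Given the documented signature, a method reference to `ExtendedGraphemeClusterUtils.shouldBreak` should fit the `BreakRule` shape when Jexer is on the classpath, though that combination is not exercised here.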
-
toComplexCells
public static java.util.List<ComplexCell> toComplexCells(java.lang.String input)
Converts a string into a sequence of grapheme clusters following most of the rules of Unicode TR #29 section 3.1.1.
Parameters:
  input - a string of codepoints
Returns:
  a sequence of grapheme clusters
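For comparison only, the JDK offers its own string-to-cluster segmentation through `java.text.BreakIterator`. The sketch below returns plain strings rather than ComplexCell objects and follows the JDK's rules, which may differ from this class on exactly the points listed in the class description; results for emoji sequences also vary between JDK versions. `JdkClusterSketch` and `clusters` are hypothetical names.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;

final class JdkClusterSketch {

    /** Splits input at the JDK's user-perceived character boundaries. */
    static List<String> clusters(String input) {
        BreakIterator it = BreakIterator.getCharacterInstance();
        it.setText(input);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE;
                start = end, end = it.next()) {
            out.add(input.substring(start, end));
        }
        return out;
    }

    public static void main(String[] args) {
        // "e" + COMBINING ACUTE ACCENT stays together; "x" is separate.
        System.out.println(clusters("e\u0301x").size());
    }
}
```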
-
main
public static void main(java.lang.String[] args)
Test the extended grapheme cluster boundary code.
Parameters:
  args - command line arguments
-
-