Class Characters


  • public final class Characters
    extends Static
    Static methods working on char values, and some character constants. Apache SIS uses Unicode symbols directly in the source code for easier reading, except for some symbols that are difficult to differentiate from other similar symbols. For those symbols, constants are declared in this class.
    Since:
    0.3

    Defined in the sis-utility module

    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  Characters.Filter
      Subsets of Unicode characters identified by their general category.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static char HYPHEN
      Hyphen character ('‐', Unicode 2010).
      static char LINE_SEPARATOR
      The Unicode line separator (Unicode 2028, HTML <br>).
      static char NO_BREAK_SPACE
      The no-break space (Unicode 00A0, HTML &nbsp;).
      static char PARAGRAPH_SEPARATOR
      The Unicode paragraph separator (Unicode 2029, HTML <p>…</p>).
      static char SOFT_HYPHEN
      Hyphen character to be visible only if there is a line break to insert after it (Unicode 00AD, HTML &shy;).
    • Field Detail

      • HYPHEN

        public static final char HYPHEN
        Hyphen character ('‐', Unicode 2010). This code tells LineAppender that a line break may be inserted after this character.

        For non-breaking hyphen, use the Unicode 2011 character.

        See Also:
        Constant Field Values
      • SOFT_HYPHEN

        public static final char SOFT_HYPHEN
        Hyphen character to be visible only if there is a line break to insert after it (Unicode 00AD, HTML &shy;). Otherwise this character is invisible. When visible, the graphical symbol is similar to the HYPHEN character.
        See Also:
        Constant Field Values
      • NO_BREAK_SPACE

        public static final char NO_BREAK_SPACE
        The no-break space (Unicode 00A0, HTML &nbsp;). Apache SIS uses Unicode symbols directly in the source code for easier reading, except for no-break spaces, since they cannot be visually distinguished from the ordinary space (Unicode 0020).
        See Also:
        Constant Field Values
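        The following is a minimal sketch, not SIS source code, of why a named constant helps for characters that look like ordinary ones. The class CharacterConstantsSketch and its helpers withUnit and stripSoftHyphens are hypothetical; only the code point values come from the descriptions above.

```java
/** Illustrative sketch only; not the Apache SIS source. */
final class CharacterConstantsSketch {
    // Code point values as documented in the field descriptions.
    static final char HYPHEN = 0x2010;
    static final char SOFT_HYPHEN = 0x00AD;
    static final char NO_BREAK_SPACE = 0x00A0;

    private CharacterConstantsSketch() {}

    /** Hypothetical helper: joins a number and its unit with a no-break space. */
    static String withUnit(String number, String unit) {
        return number + NO_BREAK_SPACE + unit;
    }

    /** Hypothetical helper: removes soft hyphens, for example before comparing words. */
    static String stripSoftHyphens(String text) {
        return text.replace(String.valueOf(SOFT_HYPHEN), "");
    }
}
```

        Note that java.lang.Character.isWhitespace returns false for the no-break space while Character.isSpaceChar returns true, so code that trims or splits on whitespace may treat the two kinds of space differently.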
    • Method Detail

      • isValidWKT

        public static boolean isValidWKT(int c)
        Returns true if the given code point is a valid character for Well Known Text (WKT). This method returns true for the following characters:
        A-Z a-z 0-9 _ [ ] ( ) { } < = > . , : ; + - (space) % & ' " * ^ / \ ? | °
        They are ASCII codes 32 to 125 inclusive except ! (33), # (35), $ (36), @ (64) and ` (96), plus ° (176), which is formally outside the ASCII character set.
        Parameters:
        c - the code point to test.
        Returns:
        true if the given code point is a valid WKT character.
        Since:
        0.6
        See Also:
        Transliterator
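        The rule described above can be sketched as a standalone predicate. This is an illustration written from the description only, not the SIS implementation; the class name WktCharacterSketch and the helper isValidWktText are hypothetical.

```java
/** Illustrative sketch only; not the Apache SIS source. */
final class WktCharacterSketch {
    private WktCharacterSketch() {}

    /** Documented rule: ASCII 32 to 125 except ! # $ @ and backtick, plus the degree sign (176). */
    static boolean isValidWkt(int c) {
        if (c == 0x00B0) {
            return true;                 // degree sign, outside ASCII but accepted
        }
        if (c < 32 || c > 125) {
            return false;                // control characters, '~' and everything above
        }
        switch (c) {
            case '!': case '#': case '$': case '@': case '`':
                return false;            // the five excluded ASCII characters
            default:
                return true;
        }
    }

    /** Hypothetical helper: true if every code point of the text passes the rule. */
    static boolean isValidWktText(String text) {
        return text.codePoints().allMatch(c -> isValidWkt(c));
    }
}
```

        A caller could use such a check to decide whether a name needs transliteration before being written in a WKT string.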
      • isLineOrParagraphSeparator

        public static boolean isLineOrParagraphSeparator(int c)
        Returns true if the given code point is a line separator, a paragraph separator or one of the '\r' or '\n' control characters.
        Parameters:
        c - the code point to test.
        Returns:
        true if the given code point is a line or paragraph separator.
        See Also:
        LINE_SEPARATOR, PARAGRAPH_SEPARATOR
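        As a rough illustration of the four characters this test covers, the sketch below reimplements the check from its description. It is not the SIS implementation; the class SeparatorSketch and the helper indexOfSeparator are hypothetical, and the two constants simply repeat the documented code points.

```java
/** Illustrative sketch only; not the Apache SIS source. */
final class SeparatorSketch {
    static final char LINE_SEPARATOR = 0x2028;
    static final char PARAGRAPH_SEPARATOR = 0x2029;

    private SeparatorSketch() {}

    /** True for line feed, carriage return, and the Unicode line and paragraph separators. */
    static boolean isLineOrParagraphSeparator(int c) {
        return c == '\n' || c == '\r' || c == LINE_SEPARATOR || c == PARAGRAPH_SEPARATOR;
    }

    /** Hypothetical helper: index of the first separator in the text, or -1 if none. */
    static int indexOfSeparator(CharSequence text) {
        for (int i = 0; i < text.length(); i++) {
            if (isLineOrParagraphSeparator(text.charAt(i))) {
                return i;
            }
        }
        return -1;
    }
}
```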
      • isHexadecimal

        public static boolean isHexadecimal(int c)
        Returns true if the given character is a hexadecimal digit. This method returns true if c is between '0' and '9' inclusive, or between 'A' and 'F' inclusive, or between 'a' and 'f' inclusive.
        Parameters:
        c - the character to test.
        Returns:
        true if the given character is a hexadecimal digit.
        Since:
        0.5
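        The three ranges named above translate directly into a range test. The sketch below is written from the description and is not the SIS source; the class name HexDigitSketch is hypothetical.

```java
/** Illustrative sketch only; not the Apache SIS source. */
final class HexDigitSketch {
    private HexDigitSketch() {}

    /** True only for the ASCII ranges 0-9, A-F and a-f. */
    static boolean isHexadecimal(int c) {
        return (c >= '0' && c <= '9')
            || (c >= 'A' && c <= 'F')
            || (c >= 'a' && c <= 'f');
    }
}
```

        A test restricted to these ranges differs from java.lang.Character.digit(c, 16) >= 0, which also accepts non-ASCII digits such as the Arabic-Indic digit three (Unicode 0663).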
      • isSuperScript

        public static boolean isSuperScript(int c)
        Determines whether the given character is a superscript. Most (but not all) superscripts have a Unicode value in the [2070 … 207F] range; the exceptions are ¹ (00B9), ² (00B2) and ³ (00B3). Superscripts are the following symbols:
        ⁰ ¹ ² ³ ⁴ ⁵ ⁶ ⁷ ⁸ ⁹ ⁺ ⁻ ⁼ ⁽ ⁾ ⁿ
        Parameters:
        c - the character to test.
        Returns:
        true if the given character is a superscript.
      • isSubScript

        public static boolean isSubScript(int c)
        Determines whether the given character is a subscript. All subscripts have a Unicode value in the [2080 … 208E] range. Subscripts are the following symbols:
        ₀ ₁ ₂ ₃ ₄ ₅ ₆ ₇ ₈ ₉ ₊ ₋ ₌ ₍ ₎
        Parameters:
        c - the character to test.
        Returns:
        true if the given character is a subscript.
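        The two tests can be sketched from the symbol lists given for isSuperScript and isSubScript. This is an illustration, not the SIS implementation; the class name ScriptTestSketch is hypothetical, and the superscript test assumes that only the sixteen listed symbols count (so other code points in the 2070 to 207F range are rejected).

```java
/** Illustrative sketch only; not the Apache SIS source. */
final class ScriptTestSketch {
    /** The sixteen listed superscripts: digits 0 to 9, plus, minus, equals, parentheses, n. */
    private static final String SUPERSCRIPTS =
            "\u2070\u00B9\u00B2\u00B3\u2074\u2075\u2076\u2077\u2078\u2079"
          + "\u207A\u207B\u207C\u207D\u207E\u207F";

    private ScriptTestSketch() {}

    /** Lookup in the list, because superscripts 1, 2 and 3 lie outside the main block. */
    static boolean isSuperScript(int c) {
        return SUPERSCRIPTS.indexOf(c) >= 0;
    }

    /** Subscripts are contiguous, so a range test is enough. */
    static boolean isSubScript(int c) {
        return c >= 0x2080 && c <= 0x208E;
    }
}
```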
      • toSuperScript

        public static char toSuperScript(char c)
        Converts the given character argument to superscript. Only the following characters can be converted (other characters are left unchanged):
        0 1 2 3 4 5 6 7 8 9 + - = ( ) n
        Parameters:
        c - the character to convert.
        Returns:
        the given character as a superscript, or c if the given character cannot be converted.
      • toSubScript

        public static char toSubScript(char c)
        Converts the given character argument to subscript. Only the following characters can be converted (other characters are left unchanged):
        0 1 2 3 4 5 6 7 8 9 + - = ( )
        Parameters:
        c - the character to convert.
        Returns:
        the given character as a subscript, or c if the given character cannot be converted.
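        The conversions documented for toSuperScript and toSubScript amount to a lookup over the listed characters. The sketch below is one possible way to write them, not the SIS implementation; the class ScriptConversionSketch and the helper superscript(String) are hypothetical.

```java
/** Illustrative sketch only; not the Apache SIS source. */
final class ScriptConversionSketch {
    /** Convertible characters, in the order used by the documentation. */
    private static final String NORMAL = "0123456789+-=()n";

    /** Superscript forms, index-aligned with NORMAL. */
    private static final String SUPER =
            "\u2070\u00B9\u00B2\u00B3\u2074\u2075\u2076\u2077\u2078\u2079"
          + "\u207A\u207B\u207C\u207D\u207E\u207F";

    private ScriptConversionSketch() {}

    static char toSuperScript(char c) {
        int i = NORMAL.indexOf(c);
        return (i >= 0) ? SUPER.charAt(i) : c;
    }

    static char toSubScript(char c) {
        int i = NORMAL.indexOf(c);
        // Subscripts occupy 2080 to 208E in the same order; 'n' (index 15) is not convertible.
        return (i >= 0 && i < 15) ? (char) (0x2080 + i) : c;
    }

    /** Hypothetical helper: converts every character of an exponent such as "-2". */
    static String superscript(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            sb.append(toSuperScript(text.charAt(i)));
        }
        return sb.toString();
    }
}
```

        With such a helper, "m" + superscript("-2") yields the unit symbol m⁻².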
      • toNormalScript

        public static char toNormalScript(char c)
        Converts the given character argument to normal script.
        Parameters:
        c - the character to convert.
        Returns:
        the given character as a normal script, or c if the given character was not a superscript or a subscript.
      • toNormalScript

        public static int toNormalScript(int c)
        Converts the given code point to normal script.
        Parameters:
        c - the code point to convert.
        Returns:
        the given code point as a normal script, or c if the given code point was not a superscript or a subscript.
        Since:
        1.0
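        The reverse conversion can be sketched in the same way for both overloads. This is an illustration built on the superscript and subscript lists documented earlier, not the SIS implementation; the class name NormalScriptSketch is hypothetical.

```java
/** Illustrative sketch only; not the Apache SIS source. */
final class NormalScriptSketch {
    private static final String NORMAL = "0123456789+-=()n";

    /** Superscript forms, index-aligned with NORMAL. */
    private static final String SUPER =
            "\u2070\u00B9\u00B2\u00B3\u2074\u2075\u2076\u2077\u2078\u2079"
          + "\u207A\u207B\u207C\u207D\u207E\u207F";

    private NormalScriptSketch() {}

    /** Code point variant: anything that is not a listed superscript or subscript is returned as-is. */
    static int toNormalScript(int c) {
        if (c >= 0x2080 && c <= 0x208E) {
            return NORMAL.charAt(c - 0x2080);   // subscripts are contiguous
        }
        int i = SUPER.indexOf(c);
        return (i >= 0) ? NORMAL.charAt(i) : c;
    }

    /** Char variant, delegating to the code point variant. */
    static char toNormalScript(char c) {
        return (char) toNormalScript((int) c);
    }
}
```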