Object
Transliterator
- All Implemented Interfaces:
Serializable
Controls the replacement of characters, abbreviations and names between the objects in memory and their
WKT representations. The mapping is not necessarily one-to-one, for example the replacement of a Unicode
character by an ASCII character may not be reversible. The mapping may also depend on the element to transliterate,
for example some Greek letters like φ, λ and θ are mapped differently when they are used as mathematical symbols in
axis abbreviations rather than texts. Some mappings may also apply to words instead of characters, when the word
comes from a controlled vocabulary.
Permitted characters in Well Known Text
The ISO 19162 standard restricts Well Known Text to the following characters in all quoted texts except in REMARKS["…"] elements:
A-Z a-z 0-9 _ [ ] ( ) { } < = > . , : ; + - (space) % & ' " * ^ / \ ? | °
They are ASCII codes 32 to 125 inclusive except ! (33), # (35), $ (36), @ (64) and ` (96), plus the addition of ° (176) despite being formally outside the ASCII character set. The only exception to this rule is for the text inside REMARKS["…"] elements, where all Unicode characters are allowed.
The filter(String)
method is responsible for replacing or removing characters outside the above-cited
set of permitted characters.
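As a rough illustration of the character set described above, the following sketch tests whether a character, or a whole string, falls inside it. The class WktCharacters and its two methods are hypothetical names chosen for this example. They are not part of the Transliterator API, and the check only mirrors the rule as stated above (ASCII 32 to 125 minus five characters, plus the degree sign).

```java
/** Hypothetical helper for illustration only; not part of the Transliterator API. */
final class WktCharacters {
    private WktCharacters() {
    }

    /** Tells whether a single character is permitted in quoted text outside REMARKS. */
    static boolean isPermitted(char c) {
        if (c == '\u00B0') {
            return true;                // degree sign (176), allowed despite being non-ASCII
        }
        if (c < 32 || c > 125) {
            return false;               // control characters, '~' (126) and everything above
        }
        switch (c) {
            case '!': case '#': case '$': case '@': case '`':
                return false;           // the five excluded ASCII characters
            default:
                return true;
        }
    }

    /** Tells whether every character of the given text is permitted. */
    static boolean isAllPermitted(CharSequence text) {
        for (int i = 0; i < text.length(); i++) {
            if (!isPermitted(text.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

A text for which isAllPermitted returns false is the kind of input that filter(String) is expected to clean up before it is written in a quoted WKT element.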
Application to mathematical symbols
For Greek letters used as mathematical symbols in coordinate axis abbreviations, the ISO 19162 standard recommends:
- (P, L) as the transliteration of the Greek letters (phi, lambda), or (B, L) from German “Breite” and “Länge” used in academic texts worldwide, or (lat, long).
- (U) for (θ) in polar coordinate systems.
- (U, V) for (Ω, θ) in spherical coordinate systems.
The toLatinAbbreviation(…) and toUnicodeAbbreviation(…) methods are responsible for doing the transliteration at formatting and parsing time, respectively.
Note on conventions
At least two conventions exist about the meaning of (r, θ, φ) in a spherical coordinate system (see Wikipedia or MathWorld for more information). In the mathematics convention, θ is the azimuthal angle in the equatorial plane (roughly equivalent to longitude λ) while φ is an angle measured from a pole (also known as colatitude). In the physics convention, the meanings of θ and φ are interchanged. Furthermore, some other conventions measure the φ angle from the equatorial plane, like latitude, rather than from the pole. This class does not need to care about the meaning of those angles. The only recommendation is that φ is mapped to U and θ is mapped to V, regardless of their meaning.
Replacement of names
The longitude and latitude axis names are explicitly fixed by ISO 19111:2007 to "Geodetic longitude" and "Geodetic latitude". But ISO 19162:2015 §7.5.3(ii) says that the "Geodetic" part in those names shall be omitted at WKT formatting time. The toShortAxisName(…) and toLongAxisName(…) methods are responsible for doing the transliteration at formatting and parsing time, respectively.
- Since:
- 0.6
-
Field Summary
Modifier and Type | Field | Description
static final Transliterator | DEFAULT | A transliterator compliant with ISO 19162 on a "best effort" basis.
static final Transliterator | IDENTITY | A transliterator that does not perform any replacement.
-
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
String | filter(String text) | Returns a character sequence with the non-ASCII characters replaced or removed.
String | toLatinAbbreviation(CoordinateSystem cs, AxisDirection direction, String abbreviation) | Returns the axis abbreviation to format in WKT, or null if none.
String | toLongAxisName(String csType, AxisDirection direction, String name) | Returns the axis name to use in memory for an axis parsed from a WKT.
String | toShortAxisName(CoordinateSystem cs, AxisDirection direction, String name) | Returns the axis name to format in WKT, or null if none.
String | toUnicodeAbbreviation(String csType, AxisDirection direction, String abbreviation) | Returns the axis abbreviation to use in memory for an axis parsed from a WKT.
-
Field Details
-
DEFAULT
A transliterator compliant with ISO 19162 on a "best effort" basis. All methods perform the default implementation documented in this Transliterator class.
-
IDENTITY
A transliterator that does not perform any replacement. All methods let names, abbreviations and Unicode characters pass through unchanged.
-
-
Constructor Details
-
Transliterator
protected Transliterator()
For sub-class constructors.
-
-
Method Details
-
filter
Returns a character sequence with the non-ASCII characters replaced or removed. For example, this method replaces “ç” by “c” in “Triangulation française”. This operation is usually not reversible; there is no converse method.
Implementations shall not care about opening or closing quotes. The quotes will be doubled by the caller if needed after this method has been invoked.
The default implementation invokes CharSequences.toASCII(CharSequence), replaces line feeds and tabulations by single spaces, then removes control characters.
- Parameters:
text - the text to format without non-ASCII characters.
- Returns:
the text to write in Well Known Text.
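The three steps of the default implementation can be approximated with the standard library alone. The sketch below is not the SIS implementation: FilterSketch is a hypothetical class, and java.text.Normalizer is used here as a stand-in for CharSequences.toASCII(CharSequence), whose replacement table may well differ. Keeping the degree sign is also an assumption, made because that character is permitted in WKT.

```java
import java.text.Normalizer;

/** Hypothetical stand-in for illustration; the real default relies on CharSequences.toASCII. */
final class FilterSketch {
    private FilterSketch() {
    }

    static String filter(String text) {
        // Decompose accented letters into a base letter followed by combining marks.
        // Assumption: this only approximates the ASCII replacement done by the library.
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        StringBuilder buffer = new StringBuilder(decomposed.length());
        for (int i = 0; i < decomposed.length(); i++) {
            char c = decomposed.charAt(i);
            if (c == '\n' || c == '\t') {
                buffer.append(' ');         // line feed and tabulation become a single space
            } else if ((c >= 32 && c <= 126) || c == '\u00B0') {
                buffer.append(c);           // printable ASCII and the degree sign are kept
            }
            // Anything else (control characters, combining marks, other non-ASCII) is dropped.
        }
        return buffer.toString();
    }
}
```

With this sketch, the cedilla of “Triangulation française” is dropped and the base letter “c” is kept. Quotes are deliberately left alone, since doubling them is the caller's job.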
-
toShortAxisName
Returns the axis name to format in WKT, or null if none. This method performs the mapping between the names of axes in memory (designated by "long axis names" in this class) and the names to format in the WKT (designated by "short axis names").
Note: the "long axis names" are defined by ISO 19111 (Referencing by coordinates) while the "short axis names" are defined by ISO 19162 (Well-known text representation of coordinate reference systems).
This method can return null if the name should be omitted. ISO 19162 recommends omitting the axis name when it is already given through the mandatory axis direction.
The default implementation performs at least the following replacements:
- Replace “Geodetic latitude” (case insensitive) by “Latitude”.
- Replace “Geodetic longitude” (case insensitive) by “Longitude”.
- Return null if the axis direction is AxisDirection.GEOCENTRIC_X, GEOCENTRIC_Y or GEOCENTRIC_Z and the name is the same as the axis direction (ignoring case).
- Parameters:
cs - the enclosing coordinate system, or null if unknown.
direction - the direction of the axis to format.
name - the axis name, to be possibly replaced by this method.
- Returns:
the axis name to format, or null if the name shall be omitted.
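The three replacements listed for toShortAxisName can be sketched without any dependency as shown below. Everything here is hypothetical: the ShortAxisNameSketch class, the Direction enum standing in for AxisDirection, and the “Geocentric X/Y/Z” labels used for the comparison are assumptions made for this example. The enclosing coordinate system argument is left out for brevity.

```java
/** Hypothetical sketch; the Direction enum stands in for the AxisDirection type. */
final class ShortAxisNameSketch {
    /** Minimal stand-in for axis directions, with the label used for name comparison. */
    enum Direction {
        NORTH("North"), EAST("East"),
        GEOCENTRIC_X("Geocentric X"), GEOCENTRIC_Y("Geocentric Y"), GEOCENTRIC_Z("Geocentric Z");

        final String label;

        Direction(String label) {
            this.label = label;
        }

        boolean isGeocentric() {
            return this == GEOCENTRIC_X || this == GEOCENTRIC_Y || this == GEOCENTRIC_Z;
        }
    }

    private ShortAxisNameSketch() {
    }

    /** Returns the name to format, or null if the name can be omitted. */
    static String toShortAxisName(Direction direction, String name) {
        if (name.equalsIgnoreCase("Geodetic latitude")) {
            return "Latitude";
        }
        if (name.equalsIgnoreCase("Geodetic longitude")) {
            return "Longitude";
        }
        if (direction.isGeocentric() && name.equalsIgnoreCase(direction.label)) {
            return null;    // the name only repeats the direction, so it is omitted
        }
        return name;        // any other name passes through unchanged
    }
}
```

Callers of a method shaped like this one have to handle the null result, which signals that the axis name should not be written at all.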
-
toLongAxisName
Returns the axis name to use in memory for an axis parsed from a WKT. Since this method is invoked before the CoordinateSystem instance is created, most coordinate system characteristics are known only as String. In particular the csType argument, if non-null, should be one of the following values: "affine", "Cartesian" (note the upper-case "C"), "cylindrical", "ellipsoidal", "linear", "parametric", "polar", "spherical", "temporal" or "vertical".
This method is the converse of toShortAxisName(CoordinateSystem, AxisDirection, String). The default implementation performs at least the following replacements:
- Replace “Lat” or “Latitude” (case insensitive) by “Geodetic latitude” or “Spherical latitude”, depending on whether the axis is part of an ellipsoidal or spherical CS respectively.
- Replace “Lon”, “Long” or “Longitude” (case insensitive) by “Geodetic longitude” or “Spherical longitude”, depending on whether the axis is part of an ellipsoidal or spherical CS respectively.
- Return “Geocentric X”, “Geocentric Y” and “Geocentric Z” for AxisDirection.GEOCENTRIC_X, GEOCENTRIC_Y and GEOCENTRIC_Z respectively in a Cartesian CS, if the given axis name is only an abbreviation.
- Use unique camel-case names for axis names defined by ISO 19111 and ISO 19162. For example, this method replaces “ellipsoidal height” by “Ellipsoidal height”.
Usage note
Axis names are not really free text. They are specified by ISO 19111 and ISO 19162. SIS does not put restrictions on axis names, but we nevertheless try to use a unique name when we recognize it.
- Parameters:
csType - the type of the coordinate system, or null if unknown.
direction - the parsed axis direction.
name - the parsed axis name, to be possibly replaced by this method.
- Returns:
the axis name to use. Cannot be null.
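The latitude and longitude replacements of toLongAxisName depend only on the coordinate system type and the parsed name, so they can be sketched as below. LongAxisNameSketch is a hypothetical class written for this example. Leaving the name unchanged when csType is null or is neither "ellipsoidal" nor "spherical" is an assumption of the sketch, and the geocentric and letter-case rules are not covered.

```java
/** Hypothetical sketch of the latitude/longitude part of the parsing-time replacement. */
final class LongAxisNameSketch {
    private LongAxisNameSketch() {
    }

    /** Returns the name to use in memory; never null when the given name is not null. */
    static String toLongAxisName(String csType, String name) {
        final String prefix;
        if ("ellipsoidal".equalsIgnoreCase(csType)) {
            prefix = "Geodetic ";
        } else if ("spherical".equalsIgnoreCase(csType)) {
            prefix = "Spherical ";
        } else {
            return name;    // assumption: other or unknown CS types leave the name unchanged
        }
        if (name.equalsIgnoreCase("Lat") || name.equalsIgnoreCase("Latitude")) {
            return prefix + "latitude";
        }
        if (name.equalsIgnoreCase("Lon") || name.equalsIgnoreCase("Long")
                || name.equalsIgnoreCase("Longitude")) {
            return prefix + "longitude";
        }
        return name;
    }
}
```

The CS type is passed as a string rather than as an object for the reason given in the method description: at parsing time the coordinate system has not been created yet.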
-
toLatinAbbreviation
public String toLatinAbbreviation(CoordinateSystem cs, AxisDirection direction, String abbreviation)
Returns the axis abbreviation to format in WKT, or null if none. The given abbreviation may contain Greek letters, in particular φ, λ and θ. This toLatinAbbreviation(…) method is responsible for replacing Greek letters by Latin letters for ISO 19162 compliance, if desired.
The default implementation performs at least the following mapping:
- λ → L (from German Länge) if used in an ellipsoidal CS.
- φ → B (from German Breite) if used in an ellipsoidal CS.
- φ or φ′ or φc or Ω → U if used in a spherical CS, regardless of whether the coordinate system follows physics, mathematics or other conventions.
- θ → V if used in a spherical CS (regardless of above-cited coordinate system convention).
- θ → U if used in a polar CS.
- Parameters:
cs - the enclosing coordinate system, or null if unknown.
direction - the direction of the axis to format.
abbreviation - the axis abbreviation, to be possibly replaced by this method.
- Returns:
the axis abbreviation to format.
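The mapping above depends on both the abbreviation and the kind of coordinate system, which a small lookup makes visible. The sketch below is hypothetical: LatinAbbreviationSketch is not an SIS class, the coordinate system is represented by a plain type string instead of a CoordinateSystem instance, and only a subset of the listed mappings is covered (the φ′ and φc variants are left out).

```java
/** Hypothetical sketch of the Greek-to-Latin mapping applied at formatting time. */
final class LatinAbbreviationSketch {
    // Greek letters written as Unicode escapes so that the source stays ASCII.
    static final String PHI    = "\u03C6";   // small phi
    static final String LAMBDA = "\u03BB";   // small lambda
    static final String THETA  = "\u03B8";   // small theta
    static final String OMEGA  = "\u03A9";   // capital omega

    private LatinAbbreviationSketch() {
    }

    /** The csType argument is an assumption; the documented method takes a CoordinateSystem. */
    static String toLatinAbbreviation(String csType, String abbreviation) {
        if ("ellipsoidal".equals(csType)) {
            if (abbreviation.equals(LAMBDA)) return "L";
            if (abbreviation.equals(PHI))    return "B";
        } else if ("spherical".equals(csType)) {
            if (abbreviation.equals(PHI) || abbreviation.equals(OMEGA)) return "U";
            if (abbreviation.equals(THETA))  return "V";
        } else if ("polar".equals(csType)) {
            if (abbreviation.equals(THETA))  return "U";
        }
        return abbreviation;    // no mapping applies: keep the abbreviation as given
    }
}
```

Note that θ gives V in a spherical CS but U in a polar CS, and that φ and Ω both give U in a spherical CS, so the mapping cannot be inverted from the Latin letter alone.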
-
toUnicodeAbbreviation
Returns the axis abbreviation to use in memory for an axis parsed from a WKT. Since this method is invoked before the CoordinateSystem instance is created, most coordinate system characteristics are known only as String. In particular the csType argument, if non-null, should be one of the following values: "affine", "Cartesian" (note the upper-case "C"), "cylindrical", "ellipsoidal", "linear", "parametric", "polar", "spherical", "temporal" or "vertical".
This method is the converse of toLatinAbbreviation(CoordinateSystem, AxisDirection, String). The default implementation performs at least the following mapping:
- P or L → λ if csType is "ellipsoidal".
- B → φ if csType is "ellipsoidal".
- U → Ω if csType is "spherical", regardless of coordinate system convention.
- V → θ if csType is "spherical", regardless of coordinate system convention.
- U → θ if csType is "polar".
- Parameters:
csType - the type of the coordinate system, or null if unknown.
direction - the parsed axis direction.
abbreviation - the parsed axis abbreviation, to be possibly replaced by this method.
- Returns:
the axis abbreviation to use. Cannot be null.
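The parsing-time direction can be sketched the same way. UnicodeAbbreviationSketch is a hypothetical class, not the SIS implementation, and it covers only the L, B, U and V entries of the list above. Returning the abbreviation unchanged when no entry applies, or when csType is null, is an assumption of the sketch.

```java
/** Hypothetical sketch of the Latin-to-Greek mapping applied at parsing time. */
final class UnicodeAbbreviationSketch {
    // Greek letters written as Unicode escapes so that the source stays ASCII.
    static final String PHI    = "\u03C6";   // small phi
    static final String LAMBDA = "\u03BB";   // small lambda
    static final String THETA  = "\u03B8";   // small theta
    static final String OMEGA  = "\u03A9";   // capital omega

    private UnicodeAbbreviationSketch() {
    }

    /** Returns the abbreviation to use in memory; never null when the input is not null. */
    static String toUnicodeAbbreviation(String csType, String abbreviation) {
        if ("ellipsoidal".equals(csType)) {
            if (abbreviation.equals("L")) return LAMBDA;
            if (abbreviation.equals("B")) return PHI;
        } else if ("spherical".equals(csType)) {
            if (abbreviation.equals("U")) return OMEGA;
            if (abbreviation.equals("V")) return THETA;
        } else if ("polar".equals(csType)) {
            if (abbreviation.equals("U")) return THETA;
        }
        return abbreviation;    // assumption: anything else is kept as parsed
    }
}
```

Because a spherical φ is formatted as U and U is parsed back as Ω, formatting followed by parsing does not always restore the original abbreviation. This is one instance of the mapping not being one-to-one.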
-