Package com.ibm.icu.impl.number
Class AffixUtils
java.lang.Object
com.ibm.icu.impl.number.AffixUtils
Performs manipulations on affix patterns: the prefix and suffix strings associated with a decimal
format pattern. For example:

Affix Pattern | Example Unescaped (Formatted) String |
---|---|
abc | abc |
ab- | ab− |
ab'-' | ab- |
ab'' | ab' |

To manually iterate over tokens in a literal string, use the following pattern, which is designed to
be efficient.

    long tag = 0L;
    while (AffixUtils.hasNext(tag, patternString)) {
        tag = AffixUtils.nextToken(tag, patternString);
        int typeOrCp = AffixUtils.getTypeOrCp(tag);
        switch (typeOrCp) {
            case AffixUtils.TYPE_MINUS_SIGN:
                // Current token is a minus sign.
                break;
            case AffixUtils.TYPE_PLUS_SIGN:
                // Current token is a plus sign.
                break;
            case AffixUtils.TYPE_PERCENT:
                // Current token is a percent sign.
                break;
            // ... other types ...
            default:
                // Current token is an arbitrary code point.
                // The variable typeOrCp is the code point.
                break;
        }
    }
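The helpers used in that loop are private, so the pattern cannot be run from outside the class. As an illustration only, here is a self-contained sketch of the same idea: it walks a pattern, honors the quoting rules from the table, and reports each token as either a negative symbol type or a non-negative code point. The class name `AffixTokenSketch`, the constant values, and the boolean quote flag are assumptions made for this sketch; currency symbols are treated as ordinary literals here, and none of this is the ICU implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration; not the ICU implementation.
final class AffixTokenSketch {
    // Assumed negative markers for symbol tokens (values chosen for this sketch only).
    static final int TYPE_MINUS_SIGN = -1;
    static final int TYPE_PLUS_SIGN = -2;
    static final int TYPE_PERCENT = -3;
    static final int TYPE_PERMILLE = -4;

    /** Returns one int per token: negative = symbol type, non-negative = literal code point. */
    static List<Integer> tokenize(CharSequence pattern) {
        List<Integer> tokens = new ArrayList<>();
        boolean inQuote = false;
        int i = 0;
        while (i < pattern.length()) {
            int cp = Character.codePointAt(pattern, i);
            i += Character.charCount(cp);
            if (cp == '\'') {
                if (i < pattern.length() && pattern.charAt(i) == '\'') {
                    tokens.add((int) '\''); // two apostrophes form one literal apostrophe
                    i++;
                } else {
                    inQuote = !inQuote; // a lone apostrophe opens or closes a quoted run
                }
            } else if (inQuote) {
                tokens.add(cp); // quoted characters are always literals
            } else {
                tokens.add(classify(cp));
            }
        }
        if (inQuote) {
            throw new IllegalArgumentException("Unterminated quote: " + pattern);
        }
        return tokens;
    }

    // Maps an unquoted code point to a symbol type, or returns it unchanged if it is a literal.
    private static int classify(int cp) {
        switch (cp) {
            case '-': return TYPE_MINUS_SIGN;
            case '+': return TYPE_PLUS_SIGN;
            case '%': return TYPE_PERCENT;
            case '\u2030': return TYPE_PERMILLE;
            default: return cp;
        }
    }

    public static void main(String[] args) {
        for (int typeOrCp : tokenize("ab'-'-")) {
            if (typeOrCp < 0) {
                System.out.println("symbol type " + typeOrCp);
            } else {
                System.out.println("literal " + new String(Character.toChars(typeOrCp)));
            }
        }
    }
}
```

Returning a single int per token mirrors the "type or code point" convention described for getTypeOrCp: a caller can switch on the negative constants and fall through to a default branch for literals.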
-
Nested Class Summary
Modifier and Type | Class | Description |
---|---|---|
static interface | AffixUtils.SymbolProvider | |
static interface | AffixUtils.TokenConsumer | |
-
Field Summary
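Modifier and Type | Field | Description |
---|---|---|
private static final int | STATE_AFTER_QUOTE | |
private static final int | STATE_BASE | |
private static final int | STATE_FIFTH_CURR | |
private static final int | STATE_FIRST_CURR | |
private static final int | STATE_FIRST_QUOTE | |
private static final int | STATE_FOURTH_CURR | |
private static final int | STATE_INSIDE_QUOTE | |
private static final int | STATE_OVERFLOW_CURR | |
private static final int | STATE_SECOND_CURR | |
private static final int | STATE_THIRD_CURR | |
static final int | TYPE_APPROXIMATELY_SIGN | |
private static final int | TYPE_CODEPOINT | Represents a literal character; the value is stored in the code point field. |
static final int | TYPE_CURRENCY_DOUBLE | Represents a double currency symbol '¤¤'. |
static final int | TYPE_CURRENCY_OVERFLOW | Represents a sequence of six or more currency symbols. |
static final int | TYPE_CURRENCY_QUAD | Represents a quadruple currency symbol '¤¤¤¤'. |
static final int | TYPE_CURRENCY_QUINT | Represents a quintuple currency symbol '¤¤¤¤¤'. |
static final int | TYPE_CURRENCY_SINGLE | Represents a single currency symbol '¤'. |
static final int | TYPE_CURRENCY_TRIPLE | Represents a triple currency symbol '¤¤¤'. |
static final int | TYPE_MINUS_SIGN | Represents a minus sign symbol '-'. |
static final int | TYPE_PERCENT | Represents a percent sign symbol '%'. |
static final int | TYPE_PERMILLE | Represents a permille sign symbol '‰'. |
static final int | TYPE_PLUS_SIGN | Represents a plus sign symbol '+'. |
-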
Constructor Summary
Constructor | Description |
---|---|
AffixUtils() | |
-
Method Summary
Modifier and Type | Method | Description |
---|---|---|
static boolean | containsOnlySymbolsAndIgnorables(CharSequence affixPattern, UnicodeSet ignorables) | Returns whether the given affix pattern contains only symbols and ignorables as defined by the given ignorables set. |
static boolean | containsType(CharSequence affixPattern, int type) | Checks whether the given affix pattern contains at least one token of the given type, which is one of the "TYPE_" constants in AffixUtils. |
static String | escape(CharSequence input) | Version of escape(java.lang.CharSequence, java.lang.StringBuilder) that returns a String, or null if input is null. |
static int | escape(CharSequence input, StringBuilder output) | Takes a string and escapes (quotes) characters that have special meaning in the affix pattern syntax. |
static int | estimateLength(CharSequence patternString) | Estimates the number of code points present in an unescaped version of the affix pattern string (one that would be returned by unescape(java.lang.CharSequence, com.ibm.icu.impl.FormattedStringBuilder, int, com.ibm.icu.impl.number.AffixUtils.SymbolProvider, com.ibm.icu.text.NumberFormat.Field)), assuming that all interpolated symbols consume one code point and that currencies consume as many code points as their symbol width. |
private static int | getCodePoint(long tag) | |
static final NumberFormat.Field | getFieldForType(int type) | |
private static int | getOffset(long tag) | |
private static int | getState(long tag) | |
private static int | getType(long tag) | |
private static int | getTypeOrCp(long tag) | This function helps determine the identity of the token consumed by nextToken(long, java.lang.CharSequence). |
static boolean | hasCurrencySymbols(CharSequence affixPattern) | Checks whether the specified affix pattern has any unquoted currency symbols ("¤"). |
private static boolean | hasNext(long tag, CharSequence string) | Returns whether the affix pattern string has any more tokens to be retrieved from a call to nextToken(long, java.lang.CharSequence). |
static void | iterateWithConsumer(CharSequence affixPattern, AffixUtils.TokenConsumer consumer) | Iterates over the affix pattern, calling the TokenConsumer for each token. |
private static long | makeTag(int offset, int type, int state, int cp) | Encodes the given values into a 64-bit tag. |
private static long | nextToken(long tag, CharSequence patternString) | Returns the next token from the affix pattern. |
static String | replaceType(CharSequence affixPattern, int type, char replacementChar) | Replaces all occurrences of tokens with the given type with the given replacement char. |
static int | unescape(CharSequence affixPattern, FormattedStringBuilder output, int position, AffixUtils.SymbolProvider provider, NumberFormat.Field field) | Executes the unescape state machine. |
static int | unescapedCount(CharSequence affixPattern, boolean lengthOrCount, AffixUtils.SymbolProvider provider) | Same as unescape(java.lang.CharSequence, com.ibm.icu.impl.FormattedStringBuilder, int, com.ibm.icu.impl.number.AffixUtils.SymbolProvider, com.ibm.icu.text.NumberFormat.Field), but only calculates the length or code point count. |
-
Field Details
-
STATE_BASE
private static final int STATE_BASE
-
STATE_FIRST_QUOTE
private static final int STATE_FIRST_QUOTE
-
STATE_INSIDE_QUOTE
private static final int STATE_INSIDE_QUOTE
-
STATE_AFTER_QUOTE
private static final int STATE_AFTER_QUOTE
-
STATE_FIRST_CURR
private static final int STATE_FIRST_CURR
-
STATE_SECOND_CURR
private static final int STATE_SECOND_CURR
-
STATE_THIRD_CURR
private static final int STATE_THIRD_CURR
-
STATE_FOURTH_CURR
private static final int STATE_FOURTH_CURR
-
STATE_FIFTH_CURR
private static final int STATE_FIFTH_CURR
-
STATE_OVERFLOW_CURR
private static final int STATE_OVERFLOW_CURR
-
TYPE_CODEPOINT
private static final int TYPE_CODEPOINT
Represents a literal character; the value is stored in the code point field.
-
TYPE_MINUS_SIGN
public static final int TYPE_MINUS_SIGN
Represents a minus sign symbol '-'.
-
TYPE_PLUS_SIGN
public static final int TYPE_PLUS_SIGN
Represents a plus sign symbol '+'.
-
TYPE_APPROXIMATELY_SIGN
public static final int TYPE_APPROXIMATELY_SIGN
-
TYPE_PERCENT
public static final int TYPE_PERCENT
Represents a percent sign symbol '%'.
-
TYPE_PERMILLE
public static final int TYPE_PERMILLE
Represents a permille sign symbol '‰'.
-
TYPE_CURRENCY_SINGLE
public static final int TYPE_CURRENCY_SINGLE
Represents a single currency symbol '¤'.
-
TYPE_CURRENCY_DOUBLE
public static final int TYPE_CURRENCY_DOUBLE
Represents a double currency symbol '¤¤'.
-
TYPE_CURRENCY_TRIPLE
public static final int TYPE_CURRENCY_TRIPLE
Represents a triple currency symbol '¤¤¤'.
-
TYPE_CURRENCY_QUAD
public static final int TYPE_CURRENCY_QUAD
Represents a quadruple currency symbol '¤¤¤¤'.
-
TYPE_CURRENCY_QUINT
public static final int TYPE_CURRENCY_QUINT
Represents a quintuple currency symbol '¤¤¤¤¤'.
-
TYPE_CURRENCY_OVERFLOW
public static final int TYPE_CURRENCY_OVERFLOW
Represents a sequence of six or more currency symbols.
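The TYPE_CURRENCY_* constants distinguish currency tokens by the number of consecutive '¤' characters in the pattern, with everything from six upward collapsing into the overflow type. A minimal sketch of that mapping follows, using a hypothetical enum in place of the class's int constants (their numeric values are not stated on this page, so the sketch does not assume them). The names `CurrencyRunSketch`, `CurrencyType`, `forRunLength`, and `leadingRunLength` are invented for illustration.

```java
// Hypothetical illustration of how a run of currency signs maps to a token type.
final class CurrencyRunSketch {
    // Stand-in for the TYPE_CURRENCY_* int constants.
    enum CurrencyType { SINGLE, DOUBLE, TRIPLE, QUAD, QUINT, OVERFLOW }

    /** Maps the number of consecutive currency signs to a type; six or more is OVERFLOW. */
    static CurrencyType forRunLength(int count) {
        if (count < 1) {
            throw new IllegalArgumentException("Run length must be at least 1: " + count);
        }
        switch (count) {
            case 1: return CurrencyType.SINGLE;
            case 2: return CurrencyType.DOUBLE;
            case 3: return CurrencyType.TRIPLE;
            case 4: return CurrencyType.QUAD;
            case 5: return CurrencyType.QUINT;
            default: return CurrencyType.OVERFLOW;
        }
    }

    /** Counts the currency signs (U+00A4) at the very start of the given text. */
    static int leadingRunLength(CharSequence text) {
        int count = 0;
        while (count < text.length() && text.charAt(count) == '\u00A4') {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String pattern = "\u00A4\u00A4\u00A4 x";
        System.out.println(forRunLength(leadingRunLength(pattern)));
    }
}
```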
-
-
Constructor Details
-
AffixUtils
public AffixUtils()
-
-
Method Details
-
estimateLength
public static int estimateLength(CharSequence patternString)
Estimates the number of code points present in an unescaped version of the affix pattern string (one that would be returned by unescape(java.lang.CharSequence, com.ibm.icu.impl.FormattedStringBuilder, int, com.ibm.icu.impl.number.AffixUtils.SymbolProvider, com.ibm.icu.text.NumberFormat.Field)), assuming that all interpolated symbols consume one code point and that currencies consume as many code points as their symbol width. Used for computing padding width.
Parameters:
patternString - The original string whose width will be estimated.
Returns:
The length of the unescaped string.
-
escape
public static int escape(CharSequence input, StringBuilder output)
Takes a string and escapes (quotes) characters that have special meaning in the affix pattern syntax. This function does not reverse-lookup symbols.
Example input: "-$x"; example output: "'-'$x"
Parameters:
input - The string to be escaped.
output - The string builder to which to append the escaped string.
Returns:
The number of chars (UTF-16 code units) appended to the output.
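To make the escaping rule concrete, here is a minimal self-contained sketch that quotes the characters this page names as special ("-", "+", "%", "‰", "¤") and doubles apostrophes. It is an approximation written for illustration, assuming that adjacent special characters can share one pair of quotes; the class name `AffixEscapeSketch` and the helper `isSpecial` are hypothetical, and the real method may differ in detail.

```java
// Hypothetical sketch of affix-pattern escaping; not the ICU implementation.
final class AffixEscapeSketch {
    /** Appends the escaped form of input to output and returns the number of chars appended. */
    static int escape(CharSequence input, StringBuilder output) {
        if (input == null) {
            return 0;
        }
        int start = output.length();
        boolean inQuote = false;
        for (int i = 0; i < input.length(); ) {
            int cp = Character.codePointAt(input, i);
            i += Character.charCount(cp);
            if (cp == '\'') {
                output.append("''"); // a doubled apostrophe is a literal apostrophe
            } else if (isSpecial(cp)) {
                if (!inQuote) {
                    output.append('\''); // open a quoted run before the first special char
                    inQuote = true;
                }
                output.appendCodePoint(cp);
            } else {
                if (inQuote) {
                    output.append('\''); // close the quoted run before an ordinary char
                    inQuote = false;
                }
                output.appendCodePoint(cp);
            }
        }
        if (inQuote) {
            output.append('\'');
        }
        return output.length() - start;
    }

    // Characters with special meaning in affix patterns, per the unescape description.
    private static boolean isSpecial(int cp) {
        return cp == '-' || cp == '+' || cp == '%' || cp == '\u2030' || cp == '\u00A4';
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        int appended = escape("-$x", sb);
        System.out.println(sb + " (" + appended + " chars)");
    }
}
```

Note that "$" passes through untouched: as the description says, escaping does not reverse-lookup symbols, so a literal dollar sign is not turned back into "¤".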
-
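escape
public static String escape(CharSequence input)
Version of escape(java.lang.CharSequence, java.lang.StringBuilder) that returns a String, or null if input is null.
-
getFieldForType
public static final NumberFormat.Field getFieldForType(int type)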
-
unescape
public static int unescape(CharSequence affixPattern, FormattedStringBuilder output, int position, AffixUtils.SymbolProvider provider, NumberFormat.Field field)
Executes the unescape state machine. Replaces the unquoted characters "-", "+", "%", "‰", and "¤" with the corresponding symbols provided by the AffixUtils.SymbolProvider, and inserts the result into the FormattedStringBuilder at the requested location.
Example input: "'-'¤x"; example output: "-$x"
Parameters:
affixPattern - The original string to be unescaped.
output - The FormattedStringBuilder to mutate with the result.
position - The index into the FormattedStringBuilder at which to insert the string.
provider - An object to generate locale symbols.
Returns:
The length of the string added to output.
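The following sketch illustrates the unescape contract under simplifying assumptions: a plain StringBuilder stands in for FormattedStringBuilder, an `IntFunction<String>` keyed by the pattern character stands in for AffixUtils.SymbolProvider, the field argument is dropped, and every "¤" is looked up on its own rather than as a run. The class name `AffixUnescapeSketch` is hypothetical and the code is not the ICU state machine.

```java
import java.util.function.IntFunction;

// Hypothetical sketch of unescaping with a symbol provider; not the ICU implementation.
final class AffixUnescapeSketch {
    /**
     * Unescapes affixPattern, inserts the result into output at position,
     * and returns the number of chars inserted.
     */
    static int unescape(CharSequence affixPattern, StringBuilder output, int position,
                        IntFunction<String> provider) {
        StringBuilder result = new StringBuilder();
        boolean inQuote = false;
        int i = 0;
        while (i < affixPattern.length()) {
            int cp = Character.codePointAt(affixPattern, i);
            i += Character.charCount(cp);
            if (cp == '\'') {
                if (i < affixPattern.length() && affixPattern.charAt(i) == '\'') {
                    result.append('\''); // doubled apostrophe is a literal apostrophe
                    i++;
                } else {
                    inQuote = !inQuote; // lone apostrophe opens or closes a quoted run
                }
            } else if (!inQuote && isSymbol(cp)) {
                result.append(provider.apply(cp)); // unquoted symbol: ask the provider
            } else {
                result.appendCodePoint(cp); // literal character
            }
        }
        output.insert(position, result);
        return result.length();
    }

    private static boolean isSymbol(int cp) {
        return cp == '-' || cp == '+' || cp == '%' || cp == '\u2030' || cp == '\u00A4';
    }

    public static void main(String[] args) {
        // Assumed symbols for the demo: "$" for the currency sign, U+2212 for minus.
        IntFunction<String> symbols = cp -> cp == '\u00A4' ? "$" : cp == '-' ? "\u2212"
                : new String(Character.toChars(cp));
        StringBuilder out = new StringBuilder("[]");
        int added = unescape("'-'\u00A4x", out, 1, symbols);
        System.out.println(out + " (" + added + " chars inserted)");
    }
}
```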
-
unescapedCount
public static int unescapedCount(CharSequence affixPattern, boolean lengthOrCount, AffixUtils.SymbolProvider provider)
Same as unescape(java.lang.CharSequence, com.ibm.icu.impl.FormattedStringBuilder, int, com.ibm.icu.impl.number.AffixUtils.SymbolProvider, com.ibm.icu.text.NumberFormat.Field), but only calculates the length or code point count. More efficient than unescape(java.lang.CharSequence, com.ibm.icu.impl.FormattedStringBuilder, int, com.ibm.icu.impl.number.AffixUtils.SymbolProvider, com.ibm.icu.text.NumberFormat.Field) if you only need the length but not the string itself.
Parameters:
affixPattern - The original string to be unescaped.
lengthOrCount - true to count length (UTF-16 code units); false to count code points.
provider - An object to generate locale symbols.
Returns:
The length in UTF-16 code units or the number of code points in the unescaped string, depending on lengthOrCount.
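The two counting modes differ only when the unescaped text contains supplementary characters, which occupy two UTF-16 code units but one code point. The short sketch below shows that distinction with standard JDK calls; the class and method names are hypothetical and stand in for whatever the lengthOrCount flag selects internally.

```java
// Hypothetical helper showing the difference between the two counting modes.
final class LengthOrCountSketch {
    /** Returns UTF-16 length when lengthOrCount is true, code point count otherwise. */
    static int measure(CharSequence text, boolean lengthOrCount) {
        return lengthOrCount
                ? text.length()
                : Character.codePointCount(text, 0, text.length());
    }

    public static void main(String[] args) {
        // "a" followed by U+1F600, which needs a surrogate pair in UTF-16.
        String text = "a\uD83D\uDE00";
        System.out.println("length = " + measure(text, true));
        System.out.println("code points = " + measure(text, false));
    }
}
```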
-
containsType
public static boolean containsType(CharSequence affixPattern, int type)
Checks whether the given affix pattern contains at least one token of the given type, which is one of the "TYPE_" constants in AffixUtils.
Parameters:
affixPattern - The affix pattern to check.
type - The token type.
Returns:
true if the affix pattern contains the given token type; false otherwise.
-
hasCurrencySymbols
public static boolean hasCurrencySymbols(CharSequence affixPattern)
Checks whether the specified affix pattern has any unquoted currency symbols ("¤").
Parameters:
affixPattern - The string to check for currency symbols.
Returns:
true if the literal has at least one unquoted currency symbol; false otherwise.
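The word "unquoted" matters here: a "¤" inside apostrophes is a literal and should not count. A minimal sketch of such a scan, assuming the same quoting rules as the class overview and treating a null pattern as having no symbols (an assumption of this sketch), might look like this. `CurrencyScanSketch` and `hasUnquotedCurrency` are hypothetical names.

```java
// Hypothetical sketch of scanning for unquoted currency signs; not the ICU implementation.
final class CurrencyScanSketch {
    static boolean hasUnquotedCurrency(CharSequence affixPattern) {
        if (affixPattern == null) {
            return false; // assumption: a missing pattern contains no symbols
        }
        boolean inQuote = false;
        for (int i = 0; i < affixPattern.length(); i++) {
            char c = affixPattern.charAt(i);
            if (c == '\'') {
                if (i + 1 < affixPattern.length() && affixPattern.charAt(i + 1) == '\'') {
                    i++; // skip the second apostrophe of a literal ''
                } else {
                    inQuote = !inQuote;
                }
            } else if (c == '\u00A4' && !inQuote) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasUnquotedCurrency("\u00A4 x"));
        System.out.println(hasUnquotedCurrency("'\u00A4' x"));
    }
}
```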
-
replaceType
public static String replaceType(CharSequence affixPattern, int type, char replacementChar)
Replaces all occurrences of tokens with the given type with the given replacement char.
Parameters:
affixPattern - The source affix pattern (does not get modified).
type - The token type.
replacementChar - The char to substitute in place of chars of the given token type.
Returns:
A string containing the new affix pattern.
-
containsOnlySymbolsAndIgnorables
public static boolean containsOnlySymbolsAndIgnorables(CharSequence affixPattern, UnicodeSet ignorables)
Returns whether the given affix pattern contains only symbols and ignorables as defined by the given ignorables set.
-
iterateWithConsumer
public static void iterateWithConsumer(CharSequence affixPattern, AffixUtils.TokenConsumer consumer)
Iterates over the affix pattern, calling the TokenConsumer for each token.
-
nextToken
private static long nextToken(long tag, CharSequence patternString)
Returns the next token from the affix pattern.
Parameters:
tag - A bitmask used for keeping track of state from token to token. The initial value should be 0L.
patternString - The affix pattern.
Returns:
The bitmask tag to pass to the next call of this method to retrieve the following token (never negative), or -1 if there were no more tokens in the affix pattern.
-
hasNext
private static boolean hasNext(long tag, CharSequence string)
Returns whether the affix pattern string has any more tokens to be retrieved from a call to nextToken(long, java.lang.CharSequence).
Parameters:
tag - The bitmask tag of the previous token, as returned by nextToken(long, java.lang.CharSequence).
string - The affix pattern.
Returns:
true if there are more tokens to consume; false otherwise.
-
getTypeOrCp
private static int getTypeOrCp(long tag)
This function helps determine the identity of the token consumed by nextToken(long, java.lang.CharSequence). Converts from a bitmask tag, based on a call to nextToken(long, java.lang.CharSequence), to its corresponding symbol type or code point.
Parameters:
tag - The bitmask tag of the current token, as returned by nextToken(long, java.lang.CharSequence).
Returns:
If less than zero, a symbol type corresponding to one of the TYPE_ constants, such as TYPE_MINUS_SIGN. If greater than or equal to zero, a literal code point.
-
makeTag
private static long makeTag(int offset, int type, int state, int cp)
Encodes the given values into a 64-bit tag.
- Bits 0-31 => offset (int32)
- Bits 32-35 => type (uint4)
- Bits 36-39 => state (uint4)
- Bits 40-60 => code point (uint21)
- Bits 61-63 => unused
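The bit layout above can be sketched with plain shifts and masks. The version below is an illustration only: it assumes the type is passed as an unsigned 4-bit magnitude (how the real class converts its negative TYPE_ constants is not shown on this page), and the class name `AffixTagSketch` is hypothetical. Because bits 61-63 stay clear, a packed tag is never negative, which is consistent with -1 being available as the "no more tokens" marker for nextToken.

```java
// Hypothetical sketch of the 64-bit tag layout; not the ICU implementation.
final class AffixTagSketch {
    /** Packs offset (bits 0-31), type (32-35), state (36-39) and code point (40-60). */
    static long makeTag(int offset, int type, int state, int cp) {
        long tag = 0L;
        tag |= offset & 0xFFFFFFFFL;               // bits 0-31
        tag |= ((long) (type & 0xF)) << 32;        // bits 32-35
        tag |= ((long) (state & 0xF)) << 36;       // bits 36-39
        tag |= ((long) (cp & 0x1FFFFF)) << 40;     // bits 40-60
        return tag;
    }

    static int getOffset(long tag) {
        return (int) (tag & 0xFFFFFFFFL);
    }

    static int getType(long tag) {
        return (int) ((tag >>> 32) & 0xF);
    }

    static int getState(long tag) {
        return (int) ((tag >>> 36) & 0xF);
    }

    static int getCodePoint(long tag) {
        return (int) ((tag >>> 40) & 0x1FFFFF);
    }

    public static void main(String[] args) {
        long tag = makeTag(7, 4, 2, 0x1F600);
        System.out.println(Long.toHexString(tag));
        System.out.println(getOffset(tag) + " " + getType(tag) + " "
                + getState(tag) + " " + Integer.toHexString(getCodePoint(tag)));
    }
}
```

Keeping all iteration state in one long means the tokenizer needs no allocation per token, which fits the class overview's remark that the iteration pattern is designed to be efficient.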
-
getOffset
private static int getOffset(long tag) -
getType
private static int getType(long tag) -
getState
private static int getState(long tag) -
getCodePoint
private static int getCodePoint(long tag)
-