Class AnyTransliterator
- All Implemented Interfaces:
StringTransform, Transform<String,String>
An AnyTransliterator partitions text into runs of the same script, together with adjacent COMMON or INHERITED characters. After determining the script of each run, it transliterates from that script to the given target/variant. It does so by instantiating a transliterator from the source script to the target/variant. If a run consists only of the target script, COMMON, or INHERITED characters, then the run is not changed.
At startup, all possible AnyTransliterators are registered with the system, as determined by examining the registered script transliterators.
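The partitioning step described above can be sketched with the JDK alone. This is a simplified illustration, not the ICU implementation: the class name ScriptRuns, the Run type, and the partition method are hypothetical, and it uses java.lang.Character.UnicodeScript instead of ICU's script codes. In this sketch, COMMON and INHERITED characters simply stay with the run they are adjacent to.

```java
import java.lang.Character.UnicodeScript;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits text into runs of a single script.
final class ScriptRuns {
    /** A half-open range [start, limit) of the text, tagged with its script. */
    static final class Run {
        final int start;
        final int limit;
        final UnicodeScript script;

        Run(int start, int limit, UnicodeScript script) {
            this.start = start;
            this.limit = limit;
            this.script = script;
        }
    }

    /** COMMON and INHERITED characters never start a new run on their own. */
    static List<Run> partition(String text) {
        List<Run> runs = new ArrayList<>();
        int start = 0;
        UnicodeScript current = null; // null until the run sees a specific script
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            UnicodeScript s = UnicodeScript.of(cp);
            boolean neutral = s == UnicodeScript.COMMON || s == UnicodeScript.INHERITED;
            if (!neutral) {
                if (current == null) {
                    current = s;
                } else if (s != current) {
                    // A different specific script closes the current run.
                    runs.add(new Run(start, i, current));
                    start = i;
                    current = s;
                }
            }
            i += Character.charCount(cp);
        }
        if (start < text.length()) {
            // A run made only of neutral characters is reported as COMMON.
            runs.add(new Run(start, text.length(),
                    current == null ? UnicodeScript.COMMON : current));
        }
        return runs;
    }
}
```

A caller following the class description would then skip any run whose script is already the target script (or purely COMMON/INHERITED) and transliterate the remaining runs from their own script to the target.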
- Since:
- ICU 2.2
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description)
private static class - Returns a series of ranges corresponding to scripts.
private static class - Lazily initialize a special Transliterator for handling width characters.
Nested classes/interfaces inherited from class com.ibm.icu.text.Transliterator
Transliterator.Factory, Transliterator.Position
-
Field Summary
Fields (Modifier and Type, Field, Description)
(package private) static final String ANY
private ConcurrentHashMap<Integer, Transliterator> cache - Cache mapping UScriptCode values to Transliterator instances.
(package private) static final String LATIN_PIVOT
(package private) static final String NULL_ID
private String target - The target or target/variant string.
(package private) static final char TARGET_SEP
private int targetScript - The target script code.
(package private) static final char VARIANT_SEP
-
Constructor Summary
Constructors (Modifier, Constructor, Description)
public AnyTransliterator(String id, UnicodeFilter filter, String target2, int targetScript2, Transliterator widthFix2, ConcurrentHashMap<Integer, Transliterator> cache2)
private AnyTransliterator(String id, String theTarget, String theVariant, int theTargetScript) - Private constructor
-
Method Summary
Methods (Modifier and Type, Method, Description)
void addSourceTargetSet(UnicodeSet inputFilter, UnicodeSet sourceSet, UnicodeSet targetSet) - Returns the set of all characters that may be generated as replacement text by this transliterator, filtered by BOTH the input filter, and the current getFilter().
private Transliterator getTransliterator(int source) - Returns a transliterator from the given source to our target or target/variant.
protected void handleTransliterate(Replaceable text, Transliterator.Position pos, boolean isIncremental)
private boolean isWide(int script)
(package private) static void register() - Registers standard transliterators with the system.
safeClone() - Temporary hack for registry problem.
private static int scriptNameToCode(String name) - Return the script code for a given name, or UScript.INVALID_CODE if not found.
Methods inherited from class com.ibm.icu.text.Transliterator
baseToRules, createFromRules, filteredTransliterate, finishTransliteration, getAvailableIDs, getAvailableSources, getAvailableTargets, getAvailableVariants, getBasicInstance, getDisplayName, getDisplayName, getDisplayName, getElements, getFilter, getFilterAsUnicodeSet, getID, getInstance, getInstance, getInverse, getMaximumContextLength, getSourceSet, getTargetSet, handleGetSourceSet, registerAlias, registerAny, registerClass, registerFactory, registerInstance, registerInstance, registerSpecialInverse, setFilter, setID, setMaximumContextLength, toRules, transform, transliterate, transliterate, transliterate, transliterate, transliterate, transliterate, unregister
-
Field Details
-
TARGET_SEP
static final char TARGET_SEP
-
VARIANT_SEP
static final char VARIANT_SEP
-
ANY
static final String ANY
-
NULL_ID
static final String NULL_ID
-
LATIN_PIVOT
static final String LATIN_PIVOT
-
cache
private ConcurrentHashMap<Integer, Transliterator> cache
Cache mapping UScriptCode values to Transliterator instances.
-
target
private String target
The target or target/variant string.
-
targetScript
private int targetScript
The target script code. Never USCRIPT_INVALID_CODE.
-
-
Constructor Details
-
AnyTransliterator
private AnyTransliterator(String id, String theTarget, String theVariant, int theTargetScript)
Private constructor
- Parameters:
id - the ID of the form S-T or S-T/V, where T is theTarget and V is theVariant. Must not be empty.
theTarget - the target name. Must not be empty, and must name a script corresponding to theTargetScript.
theVariant - the variant name, or the empty string if there is no variant.
theTargetScript - the script code corresponding to theTarget.
-
AnyTransliterator
public AnyTransliterator(String id, UnicodeFilter filter, String target2, int targetScript2, Transliterator widthFix2, ConcurrentHashMap<Integer, Transliterator> cache2)
- Parameters:
id - the ID of the form S-T or S-T/V, where T is the target and V is the variant. Must not be empty.
filter - the Unicode filter.
target2 - the target name.
targetScript2 - the script code corresponding to target2.
widthFix2 - not used. This parameter is deprecated.
cache2 - the Map object for the cache.
-
-
Method Details
-
handleTransliterate
protected void handleTransliterate(Replaceable text, Transliterator.Position pos, boolean isIncremental)
- Specified by:
handleTransliterate in class Transliterator
- Parameters:
text - the buffer holding transliterated and untransliterated text
pos - the indices indicating the start, limit, context start, and context limit of the text.
isIncremental - if true, assume more text may be inserted at pos.limit and act accordingly. Otherwise, transliterate all text between pos.start and pos.limit and move pos.start up to pos.limit.
-
getTransliterator
private Transliterator getTransliterator(int source)
Returns a transliterator from the given source to our target or target/variant. Returns null if the source is the same as our target script, or if the source is USCRIPT_INVALID_CODE. Caches the result and returns the same transliterator the next time. The caller does NOT own the result and must not delete it.
-
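The caching contract described for getTransliterator (return null for the target script or an invalid code, otherwise create once and reuse) can be sketched as follows. This is a minimal illustration under stated assumptions, not the ICU code: PerScriptCache, its constructor parameters, and the factory function are hypothetical, and the use of computeIfAbsent is a choice made for this sketch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntFunction;

// Hypothetical per-script cache keyed by an integer script code.
final class PerScriptCache<T> {
    private final ConcurrentHashMap<Integer, T> cache = new ConcurrentHashMap<>();
    private final int targetScript;       // script that needs no conversion
    private final int invalidCode;        // stand-in for USCRIPT_INVALID_CODE
    private final IntFunction<T> factory; // builds the source-to-target converter

    PerScriptCache(int targetScript, int invalidCode, IntFunction<T> factory) {
        this.targetScript = targetScript;
        this.invalidCode = invalidCode;
        this.factory = factory;
    }

    /** Returns the cached converter for source, or null when none is needed. */
    T get(int source) {
        if (source == targetScript || source == invalidCode) {
            return null;
        }
        // The factory runs at most once per key; later calls reuse the instance.
        return cache.computeIfAbsent(source, key -> factory.apply(key));
    }
}
```

If the factory itself returns null, computeIfAbsent stores nothing and the lookup is attempted again on the next call, which may or may not match what a production cache should do.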
isWide
private boolean isWide(int script)
-
register
static void register()
Registers standard transliterators with the system. Called by Transliterator during initialization. Scans all current targets and registers those that are scripts T as Any-T/V.
-
scriptNameToCode
private static int scriptNameToCode(String name)
Return the script code for a given name, or UScript.INVALID_CODE if not found.
-
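The "return a sentinel instead of failing" contract of scriptNameToCode can be illustrated with the JDK's own script lookup. This is only an analogy: ScriptLookup and its INVALID_CODE constant are hypothetical, and the returned values are ordinals of java.lang.Character.UnicodeScript, not ICU UScript codes.

```java
import java.lang.Character.UnicodeScript;

// Hypothetical JDK-only analogue of a name-to-script-code lookup.
final class ScriptLookup {
    /** Sentinel used by this sketch for "no such script". */
    static final int INVALID_CODE = -1;

    /** Maps a script name or alias (case-insensitive) to a code, or INVALID_CODE. */
    static int scriptNameToCode(String name) {
        if (name == null) {
            return INVALID_CODE;
        }
        try {
            // forName throws IllegalArgumentException for unknown names.
            return UnicodeScript.forName(name).ordinal();
        } catch (IllegalArgumentException e) {
            return INVALID_CODE;
        }
    }
}
```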
safeClone
Temporary hack for registry problem. Needs to be replaced by better architecture.
-
addSourceTargetSet
public void addSourceTargetSet(UnicodeSet inputFilter, UnicodeSet sourceSet, UnicodeSet targetSet)
Description copied from class: Transliterator
Returns the set of all characters that may be generated as replacement text by this transliterator, filtered by BOTH the input filter, and the current getFilter().
SHOULD BE OVERRIDDEN BY SUBCLASSES. It is probably an error for any transliterator to NOT override this, but we can't force them to for backwards compatibility.
Other methods vector through this.
When gathering the information on source and target, the compound transliterator makes things complicated. For example, suppose we have:
    Global FILTER = [ax]
    a > b;
    :: NULL;
    b > c;
    x > d;
While the filter just allows a and x, b is an intermediate result, which could produce c. So the source and target sets cannot be gathered independently. What we have to do is filter the sources for the first transliterator according to the global filter, intersected with that transliterator's filter. Based on that we get the target. The next transliterator gets as a global filter (global + last target). And so on.
There is another complication:
    Global FILTER = [ax]
    a >|b;
    b >c;
Even though b would be filtered from the input, whenever we have a backup, it could be part of the input. So ideally we will change the global filter as we go.
- Overrides:
addSourceTargetSet in class Transliterator
- Parameters:
targetSet - TODO
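The filter propagation described for addSourceTargetSet can be sketched with plain sets. This is a toy model under stated assumptions, not the ICU algorithm: SourceTargetSketch and gather are hypothetical names, each pass is reduced to single-character rules in a Map, per-transliterator filters and the backup case are ignored, and "global + last target" is read as the filter growing by each pass's targets.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model of gathering source and target sets pass by pass.
final class SourceTargetSketch {
    /**
     * Each pass maps a source character to its replacement. A rule only
     * contributes if its source character is allowed by the current filter.
     */
    static void gather(Set<Character> globalFilter,
                       List<Map<Character, Character>> passes,
                       Set<Character> sourceSet,
                       Set<Character> targetSet) {
        Set<Character> filter = new LinkedHashSet<>(globalFilter);
        for (Map<Character, Character> pass : passes) {
            Set<Character> passTargets = new LinkedHashSet<>();
            for (Map.Entry<Character, Character> rule : pass.entrySet()) {
                if (filter.contains(rule.getKey())) {
                    sourceSet.add(rule.getKey());
                    passTargets.add(rule.getValue());
                }
            }
            targetSet.addAll(passTargets);
            // The next pass can also see what this pass produced.
            filter.addAll(passTargets);
        }
    }
}
```

For the first example in the text (filter [ax], passes "a > b" and "b > c; x > d"), this model reports c as a possible target only because b is added to the filter after the first pass, which is the reason the two sets cannot be gathered independently.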