Bidi algorithm for ICU
This is an implementation of the Unicode Bidirectional Algorithm. The
algorithm is defined in the
Unicode Standard Annex #9.
Note: Libraries that perform a bidirectional algorithm and reorder strings
accordingly are sometimes called "Storage Layout Engines". ICU's Bidi and
shaping (ArabicShaping) classes can be used at the core of such "Storage
Layout Engines".
General remarks about the API:
The "limit" of a sequence of characters is the position just after
their last character, i.e., one more than the position of that last character.
Some of the API methods provide access to "runs". Such a
"run" is defined as a sequence of characters that are at the same
embedding level after performing the Bidi algorithm.
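As a rough illustration of "limit" and "run", the sketch below lists the level runs of a short mixed-direction string in logical order. It deliberately uses the JDK class java.text.Bidi as a stand-in so that it runs without ICU4J on the classpath; LogicalRunsSketch and describeRuns are hypothetical names chosen for this example, and the base direction is assumed to be left-to-right.

```java
import java.text.Bidi;
import java.util.ArrayList;
import java.util.List;

final class LogicalRunsSketch {

    /** Returns one "[start,limit) level=N" entry per run, in logical order. */
    static List<String> describeRuns(String text) {
        // JDK stand-in for the ICU class; base direction forced to left-to-right.
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_LEFT_TO_RIGHT);
        List<String> runs = new ArrayList<>();
        for (int i = 0; i < bidi.getRunCount(); i++) {
            int start = bidi.getRunStart(i);
            int limit = bidi.getRunLimit(i); // index just past the run's last character
            runs.add("[" + start + "," + limit + ") level=" + bidi.getRunLevel(i));
        }
        return runs;
    }

    public static void main(String[] args) {
        // Latin letters, three Hebrew letters (alef, bet, gimel), Latin letters.
        for (String run : describeRuns("abc \u05D0\u05D1\u05D2 def")) {
            System.out.println(run);
        }
    }
}
```

Because a run is reported as a start and a limit, its length is simply limit - start, and the limit of one run is the start of the next.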
Basic concept: paragraph
A piece of text can be divided into several paragraphs by characters
with the Bidi class Block Separator. For handling of paragraphs, see:
- #countParagraphs
- #getParaLevel
- #getParagraph
- #getParagraphByIndex
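To make "Block Separator" concrete, here is a minimal sketch that finds paragraph boundaries using only the JDK's character data. It is not how ICU implements the methods listed above. ParagraphSplitSketch and its methods are hypothetical names, and two choices are assumptions of this sketch: a separator is counted as part of the paragraph it ends, and CR followed by LF is treated as a single separator.

```java
import java.util.ArrayList;
import java.util.List;

final class ParagraphSplitSketch {

    /** True if the character has the Bidi class B (Block Separator). */
    static boolean isBlockSeparator(char ch) {
        return Character.getDirectionality(ch)
                == Character.DIRECTIONALITY_PARAGRAPH_SEPARATOR;
    }

    /** Returns the limit (index just past the end) of each paragraph. */
    static List<Integer> paragraphLimits(String text) {
        List<Integer> limits = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            char ch = text.charAt(i++);
            if (isBlockSeparator(ch)) {
                // Assumption: CR LF counts as one separator.
                if (ch == '\r' && i < text.length() && text.charAt(i) == '\n') {
                    i++;
                }
                limits.add(i);
            }
        }
        // Text after the last separator (or text without any) is a final paragraph.
        if (limits.isEmpty() || limits.get(limits.size() - 1) != text.length()) {
            limits.add(text.length());
        }
        return limits;
    }

    public static void main(String[] args) {
        System.out.println(paragraphLimits("abc\ndef\r\nghi"));
    }
}
```

The number of entries in the returned list plays the role of a paragraph count, and consecutive limits delimit each paragraph in the same start/limit style used throughout this API.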
Basic concept: text direction
The direction of a piece of text may be:
- #LTR
- #RTL
- #MIXED
- #NEUTRAL
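A small sketch of how the first three directions arise, again using the JDK class java.text.Bidi as a stand-in so that it runs without ICU4J. DirectionSketch and classify are hypothetical names, the strings returned are only labels that mirror the constants above, and NEUTRAL has no counterpart in this sketch.

```java
import java.text.Bidi;

final class DirectionSketch {

    /** Labels the overall direction of the text as "LTR", "RTL" or "MIXED". */
    static String classify(String text) {
        // The base direction is taken from the first strong character,
        // defaulting to left-to-right if there is none.
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        if (bidi.isLeftToRight()) {
            return "LTR";
        }
        if (bidi.isRightToLeft()) {
            return "RTL";
        }
        return "MIXED";
    }

    public static void main(String[] args) {
        System.out.println(classify("abc"));
        System.out.println(classify("\u05D0\u05D1\u05D2"));
        System.out.println(classify("abc \u05D0\u05D1\u05D2"));
    }
}
```

Unidirectional text can be handed to a renderer in one piece; only the mixed case requires iterating over runs, which is the distinction the sample code in this section relies on.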
Basic concept: levels
Levels in this API represent embedding levels according to the Unicode
Bidirectional Algorithm.
Their low-order bit (even/odd value) indicates the visual direction.
Levels can be abstract values when used for the paraLevel and
embeddingLevels arguments of setPara(); there:
- the high-order bit of an embeddingLevels[] value indicates whether the
  using application is specifying the level of a character to override
  whatever the Bidi implementation would resolve it to.
- paraLevel can be set to the pseudo-level values LEVEL_DEFAULT_LTR and
  LEVEL_DEFAULT_RTL.
The related constants are not real, valid level values. DEFAULT_XXX can be
used to specify a default for the paragraph level for when the setPara()
method shall determine it but there is no character with a strong
directional type in the input.
Note that the value for LEVEL_DEFAULT_LTR is even and the one for
LEVEL_DEFAULT_RTL is odd, just like with normal LTR and RTL level values;
these special values are designed that way. Also, the implementation
assumes that MAX_EXPLICIT_LEVEL is odd.
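The effect of a default paragraph level can be sketched with the JDK class java.text.Bidi, whose DIRECTION_DEFAULT_LEFT_TO_RIGHT and DIRECTION_DEFAULT_RIGHT_TO_LEFT flags are used here as stand-ins for the two pseudo-levels; this is an analogy chosen so the example runs without ICU4J, not the ICU API itself. DefaultLevelSketch and baseLevel are hypothetical names.

```java
import java.text.Bidi;

final class DefaultLevelSketch {

    /** Resolves the paragraph (base) level of the text under the given flag. */
    static int baseLevel(String text, int defaultFlag) {
        return new Bidi(text, defaultFlag).getBaseLevel();
    }

    public static void main(String[] args) {
        String digitsOnly = "123";         // no strong directional character
        String hebrewFirst = "\u05D0 abc"; // first strong character is right-to-left

        // With no strong character, the default decides the level.
        System.out.println(baseLevel(digitsOnly, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT));
        System.out.println(baseLevel(digitsOnly, Bidi.DIRECTION_DEFAULT_RIGHT_TO_LEFT));

        // With a strong character, the default is not consulted.
        System.out.println(baseLevel(hebrewFirst, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT));
        System.out.println(baseLevel("abc", Bidi.DIRECTION_DEFAULT_RIGHT_TO_LEFT));
    }
}
```

The default only matters for the digits-only string; in the other two calls the first strong character determines the paragraph level regardless of the flag.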
See Also:
- #LEVEL_DEFAULT_LTR
- #LEVEL_DEFAULT_RTL
- #LEVEL_OVERRIDE
- #MAX_EXPLICIT_LEVEL
- #setPara
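The two bit conventions for levels (low-order bit for direction, high-order bit for an override) can be checked with plain integer arithmetic. The following is a minimal sketch with hypothetical helper names; LevelBitsSketch and its methods are not part of ICU. It assumes levels are stored in a byte, so that the high-order bit is 0x80, and it does not call setPara().

```java
final class LevelBitsSketch {

    /** An odd level is right-to-left, an even level is left-to-right. */
    static boolean isRightToLeft(byte level) {
        return (level & 1) != 0;
    }

    /** True if the high-order bit marks the level as an application override. */
    static boolean isOverride(byte level) {
        return (level & 0x80) != 0;
    }

    /** Strips the override flag, leaving the plain embedding level. */
    static int plainLevel(byte level) {
        return level & 0x7F;
    }

    /** Flags a plain level as an override, e.g. for an embeddingLevels[] entry. */
    static byte asOverride(int level) {
        return (byte) (level | 0x80);
    }

    public static void main(String[] args) {
        byte forcedRtl = asOverride(1);
        System.out.println(isOverride(forcedRtl));    // the override flag is set
        System.out.println(plainLevel(forcedRtl));    // the level without the flag
        System.out.println(isRightToLeft(forcedRtl)); // parity is unaffected by the flag
    }
}
```

Setting the override flag leaves the low-order bit untouched, so the direction of a level can be read the same way whether or not the flag is present.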
Basic concept: Reordering Mode
Reordering mode values indicate which variant of the Bidi algorithm to
use.
See Also:
- #setReorderingMode
- #REORDER_DEFAULT
- #REORDER_NUMBERS_SPECIAL
- #REORDER_GROUP_NUMBERS_WITH_R
- #REORDER_RUNS_ONLY
- #REORDER_INVERSE_NUMBERS_AS_L
- #REORDER_INVERSE_LIKE_DIRECT
- #REORDER_INVERSE_FOR_NUMBERS_SPECIAL
Basic concept: Reordering Options
Reordering options can be applied during Bidi text transformations.
See Also:
- #setReorderingOptions
- #OPTION_DEFAULT
- #OPTION_INSERT_MARKS
- #OPTION_REMOVE_CONTROLS
- #OPTION_STREAMING
Sample code for the ICU Bidi API
Rendering a paragraph with the ICU Bidi API
This is (hypothetical) sample code that illustrates how the ICU Bidi API
could be used to render a paragraph of text. Rendering code depends heavily
on the graphics system, so this sample code must make many assumptions,
which may or may not match the properties of any existing graphics system.
The basic assumptions are:
- Rendering is done from left to right on a horizontal line.
- A run of single-style, unidirectional text can be rendered at once.
- Such a run of text is passed to the graphics system with characters
(code units) in logical order.
- The line-breaking algorithm is very complicated and locale-dependent,
so its implementation is omitted from this sample code.
package com.ibm.icu.dev.test.bidi;

import com.ibm.icu.text.Bidi;
import com.ibm.icu.text.BidiRun;

public class Sample {

    static final int styleNormal = 0;
    static final int styleSelected = 1;
    static final int styleBold = 2;
    static final int styleItalics = 4;
    static final int styleSuper = 8;
    static final int styleSub = 16;

    static class StyleRun {
        int limit;
        int style;

        public StyleRun(int limit, int style) {
            this.limit = limit;
            this.style = style;
        }
    }

    static class Bounds {
        int start;
        int limit;

        public Bounds(int start, int limit) {
            this.start = start;
            this.limit = limit;
        }
    }

    static int getTextWidth(String text, int start, int limit,
                            StyleRun[] styleRuns, int styleRunCount) {
        // simplistic way to compute the width
        return limit - start;
    }

    // set limit and StyleRun limit for a line
    // from text[start] and from styleRuns[styleRunStart]
    // using Bidi.getLogicalRun(...)
    // returns line width
    static int getLineBreak(String text, Bounds line, Bidi para,
                            StyleRun[] styleRuns, Bounds styleRun) {
        // dummy return
        return 0;
    }

    // render runs on a line sequentially, always from left to right

    // prepare rendering a new line
    static void startLine(byte textDirection, int lineWidth) {
        System.out.println();
    }

    // render a run of text and advance to the right by the run width
    // the text[start..limit-1] is always in logical order
    static void renderRun(String text, int start, int limit,
                          byte textDirection, int style) {
    }

    // We could compute a cross-product
    // from the style runs with the directional runs
    // and then reorder it.
    // Instead, here we iterate over each run type
    // and render the intersections -
    // with shortcuts in simple (and common) cases.
    // renderParagraph() is the main function.

    // render a directional run with
    // (possibly) multiple style runs intersecting with it
    static void renderDirectionalRun(String text, int start, int limit,
                                     byte direction, StyleRun[] styleRuns,
                                     int styleRunCount) {
        int i;

        // iterate over style runs
        if (direction == Bidi.LTR) {
            int styleLimit;
            for (i = 0; i < styleRunCount; ++i) {
                styleLimit = styleRuns[i].limit;
                if (start < styleLimit) {
                    if (styleLimit > limit) {
                        styleLimit = limit;
                    }
                    renderRun(text, start, styleLimit,
                              direction, styleRuns[i].style);
                    if (styleLimit == limit) {
                        break;
                    }
                    start = styleLimit;
                }
            }
        } else {
            int styleStart;
            for (i = styleRunCount - 1; i >= 0; --i) {
                if (i > 0) {
                    styleStart = styleRuns[i - 1].limit;
                } else {
                    styleStart = 0;
                }
                if (limit >= styleStart) {
                    if (styleStart < start) {
                        styleStart = start;
                    }
                    renderRun(text, styleStart, limit, direction,
                              styleRuns[i].style);
                    if (styleStart == start) {
                        break;
                    }
                    limit = styleStart;
                }
            }
        }
    }

    // the line object represents text[start..limit-1]
    static void renderLine(Bidi line, String text, int start, int limit,
                           StyleRun[] styleRuns, int styleRunCount) {
        byte direction = line.getDirection();
        if (direction != Bidi.MIXED) {
            // unidirectional
            if (styleRunCount <= 1) {
                renderRun(text, start, limit, direction, styleRuns[0].style);
            } else {
                renderDirectionalRun(text, start, limit, direction,
                                     styleRuns, styleRunCount);
            }
        } else {
            // mixed-directional
            int count, i;
            BidiRun run;

            try {
                count = line.countRuns();
            } catch (IllegalStateException e) {
                e.printStackTrace();
                return;
            }
            if (styleRunCount <= 1) {
                int style = styleRuns[0].style;

                // iterate over directional runs
                for (i = 0; i < count; ++i) {
                    run = line.getVisualRun(i);
                    renderRun(text, run.getStart(), run.getLimit(),
                              run.getDirection(), style);
                }
            } else {
                // iterate over both directional and style runs
                for (i = 0; i < count; ++i) {
                    run = line.getVisualRun(i);
                    renderDirectionalRun(text, run.getStart(),
                                         run.getLimit(), run.getDirection(),
                                         styleRuns, styleRunCount);
                }
            }
        }
    }

    static void renderParagraph(String text, byte textDirection,
                                StyleRun[] styleRuns, int styleRunCount,
                                int lineWidth) {
        int length = text.length();
        Bidi para = new Bidi();
        try {
            para.setPara(text,
                         textDirection != 0 ? Bidi.LEVEL_DEFAULT_RTL
                                            : Bidi.LEVEL_DEFAULT_LTR,
                         null);
        } catch (Exception e) {
            e.printStackTrace();
            return;
        }
        byte paraLevel = (byte)(1 & para.getParaLevel());
        StyleRun styleRun = new StyleRun(length, styleNormal);

        if (styleRuns == null || styleRunCount <= 0) {
            styleRuns = new StyleRun[1];
            styleRunCount = 1;
            styleRuns[0] = styleRun;
        }
        // assume styleRuns[styleRunCount-1].limit>=length

        int width = getTextWidth(text, 0, length, styleRuns, styleRunCount);
        if (width <= lineWidth) {
            // everything fits onto one line

            // prepare rendering a new line from either left or right
            startLine(paraLevel, width);

            renderLine(para, text, 0, length, styleRuns, styleRunCount);
        } else {
            // we need to render several lines
            Bidi line = new Bidi(length, 0);
            int start = 0, limit;
            int styleRunStart = 0, styleRunLimit;

            for (;;) {
                limit = length;
                styleRunLimit = styleRunCount;
                width = getLineBreak(text, new Bounds(start, limit),
                                     para, styleRuns,
                                     new Bounds(styleRunStart, styleRunLimit));
                try {
                    line = para.setLine(start, limit);
                } catch (Exception e) {
                    e.printStackTrace();
                    return;
                }
                // prepare rendering a new line
                // from either left or right
                startLine(paraLevel, width);

                if (styleRunStart > 0) {
                    int newRunCount = styleRuns.length - styleRunStart;
                    StyleRun[] newRuns = new StyleRun[newRunCount];
                    System.arraycopy(styleRuns, styleRunStart, newRuns, 0,
                                     newRunCount);
                    renderLine(line, text, start, limit, newRuns,
                               styleRunLimit - styleRunStart);
                } else {
                    renderLine(line, text, start, limit, styleRuns,
                               styleRunLimit - styleRunStart);
                }
                if (limit == length) {
                    break;
                }
                start = limit;
                styleRunStart = styleRunLimit - 1;
                if (start >= styleRuns[styleRunStart].limit) {
                    ++styleRunStart;
                }
            }
        }
    }

    public static void main(String[] args) {
        renderParagraph("Some Latin text...", Bidi.LTR, null, 0, 80);
        renderParagraph("Some Hebrew text...", Bidi.RTL, null, 0, 60);
    }
}