author | Lars Knoll <lars.knoll@qt.io> | 2017-12-12 10:14:28 +0100 |
---|---|---|
committer | Lars Knoll <lars.knoll@qt.io> | 2018-01-03 07:47:26 +0000 |
commit | 41b4e154d617a820cd7f7f732838647425a58227 | |
tree | 27e9300e3fc275bf4e50de8fb2c5e1f8aeb40fab /util/unicode/data/LineBreak.txt | |
parent | 8bfabb34dec8a437a08b5a6e0ecac4a9dd3ae18c |
Update Text segmentation and line break data to Unicode 10.0
Also adjusted the text segmentation and line break algorithms
so that they can handle the new data, and pass the test suite.
Change-Id: Ib727fd80003e34e96458d7a681996de3fa3691e7
Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@qt.io>
Diffstat (limited to 'util/unicode/data/LineBreak.txt')
-rw-r--r-- | util/unicode/data/LineBreak.txt | 359 |
1 files changed, 284 insertions, 75 deletions
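For readers unfamiliar with the data file touched here: its header describes each entry as two fields separated by a semicolon (a code point or code point range, then a Line_Break value), followed by a `#` comment, with unlisted code points defaulting to "XX". The following is a minimal Python sketch of reading that format, not the generator Qt actually uses; the names `parse_line_break` and `line_break_class` are hypothetical, and the sample entries are copied from the new side of the diff.

```python
import re
from typing import Iterable, Iterator, List, Tuple

# (first code point, last code point, Line_Break value)
Entry = Tuple[int, int, str]

# "XXXX;PROP" or "XXXX..YYYY;PROP", with no spaces around the semicolon.
_ENTRY = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?;([A-Z0-9]+)$")


def parse_line_break(lines: Iterable[str]) -> Iterator[Entry]:
    """Yield one (first, last, value) tuple per data line (hypothetical helper)."""
    for raw in lines:
        # Everything after the number sign is a comment.
        data = raw.split("#", 1)[0].strip()
        if not data:
            continue  # blank line or pure comment line
        match = _ENTRY.match(data)
        if match is None:
            raise ValueError(f"unrecognised entry: {raw!r}")
        first = int(match.group(1), 16)
        last = int(match.group(2), 16) if match.group(2) else first
        yield first, last, match.group(3)


def line_break_class(table: List[Entry], code_point: int, default: str = "XX") -> str:
    """Linear lookup; code points not listed fall back to "XX" per the file header."""
    for first, last, value in table:
        if first <= code_point <= last:
            return value
    return default


# Sample entries taken from the Unicode 10.0 side of this diff.
SAMPLE = """\
200B;ZW # Cf       ZERO WIDTH SPACE
200C;CM # Cf       ZERO WIDTH NON-JOINER
200D;ZWJ # Cf       ZERO WIDTH JOINER
200E..200F;CM # Cf   [2] LEFT-TO-RIGHT MARK..RIGHT-TO-LEFT MARK
1F3FB..1F3FF;EM # Sk   [5] EMOJI MODIFIER FITZPATRICK TYPE-1-2..EMOJI MODIFIER FITZPATRICK TYPE-6
"""

table = list(parse_line_break(SAMPLE.splitlines()))
print(line_break_class(table, 0x200D))
print(line_break_class(table, 0x1F3FD))
```

A linear scan keeps the sketch short; a real consumer of the full file would more likely sort the ranges and use binary search, or expand them into a lookup table.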
diff --git a/util/unicode/data/LineBreak.txt b/util/unicode/data/LineBreak.txt
index b627f874d0..d80210bde3 100644
--- a/util/unicode/data/LineBreak.txt
+++ b/util/unicode/data/LineBreak.txt
@@ -1,45 +1,45 @@
-# LineBreak-8.0.0.txt
-# Date: 2015-02-13, 09:15:00 GMT [KW, LI]
+# LineBreak-10.0.0.txt
+# Date: 2017-03-08, 02:00:00 GMT [KW, LI]
+# © 2017 Unicode®, Inc.
+# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
+# For terms of use, see http://www.unicode.org/terms_of_use.html
+#
+# Unicode Character Database
+# For documentation, see http://www.unicode.org/reports/tr44/
 #
 # Line_Break Property
 #
 # This file is a normative contributory data file in the
 # Unicode Character Database.
-# It contains both normative and informative data.
-#
-# Copyright (c) 1991-2015 Unicode, Inc.
-# For terms of use, see http://www.unicode.org/terms_of_use.html
 #
 # The format is two fields separated by a semicolon.
 # Field 0: Unicode code point value or range of code point values
 # Field 1: Line_Break property, consisting of one of the following values:
-# Normative:
-# "BK", "CR", "LF", "CM", "SG", "GL", "CB", "SP", "ZW",
-# "NL", "WJ", "JL", "JV", "JT", "H2", "H3"
-# Informative:
-# "XX", "OP", "CL", "CP", "QU", "NS", "EX", "SY",
-# "IS", "PR", "PO", "NU", "AL", "ID", "IN", "HY",
-# "BB", "BA", "SA", "AI", "B2", "HL", "CJ", "RI"
+# Non-tailorable:
+# "BK", "CM", "CR", "GL", "LF", "NL", "SP", "WJ", "ZW", "ZWJ"
+# Tailorable:
+# "AI", "AL", "B2", "BA", "BB", "CB", "CJ", "CL", "CP", "EB",
+# "EM", "EX", "H2", "H3", "HL", "HY", "ID", "IN", "IS", "JL",
+# "JT", "JV", "NS", "NU", "OP", "PO", "PR", "QU", "RI", "SA",
+# "SG", "SY", "XX"
 # - All code points, assigned and unassigned, that are not listed
-# explicitly are given the value "XX".
-# The unassigned code points that default to "ID" include ranges in the
-# following blocks:
-# CJK Unified Ideographs Extension A: U+3400..U+4DBF
-# CJK Unified Ideographs: U+4E00..U+9FFF
-# CJK Compatibility Ideographs: U+F900..U+FAFF
-# CJK Unified Ideographs Extension B: U+20000..U+2A6DF
-# CJK Unified Ideographs Extension C: U+2A700..U+2B73F
-# CJK Unified Ideographs Extension D: U+2B740..U+2B81F
-# CJK Unified Ideographs Extension E: U+2B820..U+2CEAF
-# CJK Compatibility Ideographs Supplement: U+2F800..U+2FA1F
-# and any other reserved code points on
-# Planes 2 and 3: U+20000..U+2FFFD
-# U+30000..U+3FFFD
-# The unassigned code points that default to "PR" comprise a range in the
-# following block:
-# Currency Symbols: U+20A0..U+20CF
-# - Character ranges are specified as for other property files in
-# the Unicode Character Database.
+# explicitly are given the value "XX".
+# - The unassigned code points in the following blocks default to "ID":
+# CJK Unified Ideographs Extension A: U+3400..U+4DBF
+# CJK Unified Ideographs: U+4E00..U+9FFF
+# CJK Compatibility Ideographs: U+F900..U+FAFF
+# - All undesignated code points in Planes 2 and 3, whether inside or
+# outside of allocated blocks, default to "ID":
+# Plane 2: U+20000..U+2FFFD
+# Plane 3: U+30000..U+3FFFD
+# - All unassigned code points in the following Plane 1 range, whether
+# inside or outside of allocated blocks, also default to "ID":
+# Plane 1 range: U+1F000..U+1FFFD
+# - The unassigned code points in the following block default to "PR":
+# Currency Symbols: U+20A0..U+20CF
+#
+# Character ranges are specified as for other property files in the
+# Unicode Character Database.
 #
 # For legacy reasons, there are no spaces before or after the semicolon
 # which separates the two fields.
The comments following the number sign @@ -273,7 +273,11 @@ 0840..0858;AL # Lo [25] MANDAIC LETTER HALQA..MANDAIC LETTER AIN 0859..085B;CM # Mn [3] MANDAIC AFFRICATION MARK..MANDAIC GEMINATION MARK 085E;AL # Po MANDAIC PUNCTUATION +0860..086A;AL # Lo [11] SYRIAC LETTER MALAYALAM NGA..SYRIAC LETTER MALAYALAM SSA 08A0..08B4;AL # Lo [21] ARABIC LETTER BEH WITH SMALL V BELOW..ARABIC LETTER KAF WITH DOT BELOW +08B6..08BD;AL # Lo [8] ARABIC LETTER BEH WITH SMALL MEEM ABOVE..ARABIC LETTER AFRICAN NOON +08D4..08E1;CM # Mn [14] ARABIC SMALL HIGH WORD AR-RUB..ARABIC SMALL HIGH SIGN SAFHA +08E2;AL # Cf ARABIC DISPUTED END OF AYAH 08E3..08FF;CM # Mn [29] ARABIC TURNED DAMMA BELOW..ARABIC MARK SIDEWAYS NOON GHUNNA 0900..0902;CM # Mn [3] DEVANAGARI SIGN INVERTED CANDRABINDU..DEVANAGARI SIGN ANUSVARA 0903;CM # Mc DEVANAGARI SIGN VISARGA @@ -324,6 +328,8 @@ 09F9;PO # No BENGALI CURRENCY DENOMINATOR SIXTEEN 09FA;AL # So BENGALI ISSHAR 09FB;PR # Sc BENGALI GANDA MARK +09FC;AL # Lo BENGALI LETTER VEDIC ANUSVARA +09FD;AL # Po BENGALI ABBREVIATION SIGN 0A01..0A02;CM # Mn [2] GURMUKHI SIGN ADAK BINDI..GURMUKHI SIGN BINDI 0A03;CM # Mc GURMUKHI SIGN VISARGA 0A05..0A0A;AL # Lo [6] GURMUKHI LETTER A..GURMUKHI LETTER UU @@ -368,6 +374,7 @@ 0AF0;AL # Po GUJARATI ABBREVIATION SIGN 0AF1;PR # Sc GUJARATI RUPEE SIGN 0AF9;AL # Lo GUJARATI LETTER ZHA +0AFA..0AFF;CM # Mn [6] GUJARATI SIGN SUKUN..GUJARATI SIGN TWO-CIRCLE NUKTA ABOVE 0B01;CM # Mn ORIYA SIGN CANDRABINDU 0B02..0B03;CM # Mc [2] ORIYA SIGN ANUSVARA..ORIYA SIGN VISARGA 0B05..0B0C;AL # Lo [8] ORIYA LETTER A..ORIYA LETTER VOCALIC L @@ -436,6 +443,7 @@ 0C66..0C6F;NU # Nd [10] TELUGU DIGIT ZERO..TELUGU DIGIT NINE 0C78..0C7E;AL # No [7] TELUGU FRACTION DIGIT ZERO FOR ODD POWERS OF FOUR..TELUGU FRACTION DIGIT THREE FOR EVEN POWERS OF FOUR 0C7F;AL # So TELUGU SIGN TUUMU +0C80;AL # Lo KANNADA SIGN SPACING CANDRABINDU 0C81;CM # Mn KANNADA SIGN CANDRABINDU 0C82..0C83;CM # Mc [2] KANNADA SIGN ANUSVARA..KANNADA SIGN VISARGA 0C85..0C8C;AL # Lo [8] 
KANNADA LETTER A..KANNADA LETTER VOCALIC L @@ -458,11 +466,12 @@ 0CE2..0CE3;CM # Mn [2] KANNADA VOWEL SIGN VOCALIC L..KANNADA VOWEL SIGN VOCALIC LL 0CE6..0CEF;NU # Nd [10] KANNADA DIGIT ZERO..KANNADA DIGIT NINE 0CF1..0CF2;AL # Lo [2] KANNADA SIGN JIHVAMULIYA..KANNADA SIGN UPADHMANIYA -0D01;CM # Mn MALAYALAM SIGN CANDRABINDU +0D00..0D01;CM # Mn [2] MALAYALAM SIGN COMBINING ANUSVARA ABOVE..MALAYALAM SIGN CANDRABINDU 0D02..0D03;CM # Mc [2] MALAYALAM SIGN ANUSVARA..MALAYALAM SIGN VISARGA 0D05..0D0C;AL # Lo [8] MALAYALAM LETTER A..MALAYALAM LETTER VOCALIC L 0D0E..0D10;AL # Lo [3] MALAYALAM LETTER E..MALAYALAM LETTER AI 0D12..0D3A;AL # Lo [41] MALAYALAM LETTER O..MALAYALAM LETTER TTTA +0D3B..0D3C;CM # Mn [2] MALAYALAM SIGN VERTICAL BAR VIRAMA..MALAYALAM SIGN CIRCULAR VIRAMA 0D3D;AL # Lo MALAYALAM SIGN AVAGRAHA 0D3E..0D40;CM # Mc [3] MALAYALAM VOWEL SIGN AA..MALAYALAM VOWEL SIGN II 0D41..0D44;CM # Mn [4] MALAYALAM VOWEL SIGN U..MALAYALAM VOWEL SIGN VOCALIC RR @@ -470,11 +479,14 @@ 0D4A..0D4C;CM # Mc [3] MALAYALAM VOWEL SIGN O..MALAYALAM VOWEL SIGN AU 0D4D;CM # Mn MALAYALAM SIGN VIRAMA 0D4E;AL # Lo MALAYALAM LETTER DOT REPH +0D4F;AL # So MALAYALAM SIGN PARA +0D54..0D56;AL # Lo [3] MALAYALAM LETTER CHILLU M..MALAYALAM LETTER CHILLU LLL 0D57;CM # Mc MALAYALAM AU LENGTH MARK +0D58..0D5E;AL # No [7] MALAYALAM FRACTION ONE ONE-HUNDRED-AND-SIXTIETH..MALAYALAM FRACTION ONE FIFTH 0D5F..0D61;AL # Lo [3] MALAYALAM LETTER ARCHAIC II..MALAYALAM LETTER VOCALIC LL 0D62..0D63;CM # Mn [2] MALAYALAM VOWEL SIGN VOCALIC L..MALAYALAM VOWEL SIGN VOCALIC LL 0D66..0D6F;NU # Nd [10] MALAYALAM DIGIT ZERO..MALAYALAM DIGIT NINE -0D70..0D75;AL # No [6] MALAYALAM NUMBER TEN..MALAYALAM FRACTION THREE QUARTERS +0D70..0D78;AL # No [9] MALAYALAM NUMBER TEN..MALAYALAM FRACTION THREE SIXTEENTHS 0D79;PO # So MALAYALAM DATE MARK 0D7A..0D7F;AL # Lo [6] MALAYALAM LETTER CHILLU NN..MALAYALAM LETTER CHILLU K 0D82..0D83;CM # Mc [2] SINHALA SIGN ANUSVARAYA..SINHALA SIGN VISARGAYA @@ -700,7 +712,9 @@ 1820..1842;AL # 
Lo [35] MONGOLIAN LETTER A..MONGOLIAN LETTER CHI 1843;AL # Lm MONGOLIAN LETTER TODO LONG VOWEL SIGN 1844..1877;AL # Lo [52] MONGOLIAN LETTER TODO E..MONGOLIAN LETTER MANCHU ZHA -1880..18A8;AL # Lo [41] MONGOLIAN LETTER ALI GALI ANUSVARA ONE..MONGOLIAN LETTER MANCHU ALI GALI BHA +1880..1884;AL # Lo [5] MONGOLIAN LETTER ALI GALI ANUSVARA ONE..MONGOLIAN LETTER ALI GALI INVERTED UBADAMA +1885..1886;CM # Mn [2] MONGOLIAN LETTER ALI GALI BALUDA..MONGOLIAN LETTER ALI GALI THREE BALUDA +1887..18A8;AL # Lo [34] MONGOLIAN LETTER ALI GALI A..MONGOLIAN LETTER MANCHU ALI GALI BHA 18A9;CM # Mn MONGOLIAN LETTER ALI GALI DAGALGA 18AA;AL # Lo MONGOLIAN LETTER MANCHU ALI GALI LHA 18B0..18F5;AL # Lo [70] CANADIAN SYLLABICS OY..CANADIAN SYLLABICS CARRIER DENTAL S @@ -802,6 +816,7 @@ 1C5A..1C77;AL # Lo [30] OL CHIKI LETTER LA..OL CHIKI LETTER OH 1C78..1C7D;AL # Lm [6] OL CHIKI MU TTUDDAG..OL CHIKI AHAD 1C7E..1C7F;BA # Po [2] OL CHIKI PUNCTUATION MUCAAD..OL CHIKI PUNCTUATION DOUBLE MUCAAD +1C80..1C88;AL # Ll [9] CYRILLIC SMALL LETTER ROUNDED VE..CYRILLIC SMALL LETTER UNBLENDED UK 1CC0..1CC7;AL # Po [8] SUNDANESE PUNCTUATION BINDU SURYA..SUNDANESE PUNCTUATION BINDU BA SATANGA 1CD0..1CD2;CM # Mn [3] VEDIC TONE KARSHANA..VEDIC TONE PRENKHA 1CD3;AL # Po VEDIC SIGN NIHSHVASA @@ -814,6 +829,7 @@ 1CF2..1CF3;CM # Mc [2] VEDIC SIGN ARDHAVISARGA..VEDIC SIGN ROTATED ARDHAVISARGA 1CF4;CM # Mn VEDIC TONE CANDRA ABOVE 1CF5..1CF6;AL # Lo [2] VEDIC SIGN JIHVAMULIYA..VEDIC SIGN UPADHMANIYA +1CF7;CM # Mc VEDIC SIGN ATIKRAMA 1CF8..1CF9;CM # Mn [2] VEDIC TONE RING ABOVE..VEDIC TONE DOUBLE RING ABOVE 1D00..1D2B;AL # Ll [44] LATIN LETTER SMALL CAPITAL A..CYRILLIC LETTER SMALL CAPITAL EL 1D2C..1D6A;AL # Lm [63] MODIFIER LETTER CAPITAL A..GREEK SUBSCRIPT SMALL LETTER CHI @@ -822,8 +838,8 @@ 1D79..1D7F;AL # Ll [7] LATIN SMALL LETTER INSULAR G..LATIN SMALL LETTER UPSILON WITH STROKE 1D80..1D9A;AL # Ll [27] LATIN SMALL LETTER B WITH PALATAL HOOK..LATIN SMALL LETTER EZH WITH RETROFLEX HOOK 1D9B..1DBF;AL # Lm [37] 
MODIFIER LETTER SMALL TURNED ALPHA..MODIFIER LETTER SMALL THETA -1DC0..1DF5;CM # Mn [54] COMBINING DOTTED GRAVE ACCENT..COMBINING UP TACK ABOVE -1DFC..1DFF;CM # Mn [4] COMBINING DOUBLE INVERTED BREVE BELOW..COMBINING RIGHT ARROWHEAD AND DOWN ARROWHEAD BELOW +1DC0..1DF9;CM # Mn [58] COMBINING DOTTED GRAVE ACCENT..COMBINING WIDE INVERTED BRIDGE BELOW +1DFB..1DFF;CM # Mn [5] COMBINING DELETION MARK..COMBINING RIGHT ARROWHEAD AND DOWN ARROWHEAD BELOW 1E00..1EFF;AL # L& [256] LATIN CAPITAL LETTER A WITH RING BELOW..LATIN SMALL LETTER Y WITH LOOP 1F00..1F15;AL # L& [22] GREEK SMALL LETTER ALPHA WITH PSILI..GREEK SMALL LETTER EPSILON WITH DASIA AND OXIA 1F18..1F1D;AL # Lu [6] GREEK CAPITAL LETTER EPSILON WITH PSILI..GREEK CAPITAL LETTER EPSILON WITH DASIA AND OXIA @@ -855,7 +871,9 @@ 2007;GL # Zs FIGURE SPACE 2008..200A;BA # Zs [3] PUNCTUATION SPACE..HAIR SPACE 200B;ZW # Cf ZERO WIDTH SPACE -200C..200F;CM # Cf [4] ZERO WIDTH NON-JOINER..RIGHT-TO-LEFT MARK +200C;CM # Cf ZERO WIDTH NON-JOINER +200D;ZWJ # Cf ZERO WIDTH JOINER +200E..200F;CM # Cf [2] LEFT-TO-RIGHT MARK..RIGHT-TO-LEFT MARK 2010;BA # Pd HYPHEN 2011;GL # Pd NON-BREAKING HYPHEN 2012..2013;BA # Pd [2] FIGURE DASH..EN DASH @@ -928,7 +946,8 @@ 20BB;PO # Sc NORDIC MARK SIGN 20BC..20BD;PR # Sc [2] MANAT SIGN..RUBLE SIGN 20BE;PO # Sc LARI SIGN -20BF..20CF;PR # Cn [17] <reserved-20BF>..<reserved-20CF> +20BF;PR # Sc BITCOIN SIGN +20C0..20CF;PR # Cn [16] <reserved-20C0>..<reserved-20CF> 20D0..20DC;CM # Mn [13] COMBINING LEFT HARPOON ABOVE..COMBINING FOUR DOTS ABOVE 20DD..20E0;CM # Me [4] COMBINING ENCLOSING CIRCLE..COMBINING ENCLOSING CIRCLE BACKSLASH 20E1;CM # Mn COMBINING LEFT RIGHT ARROW ABOVE @@ -1091,7 +1110,7 @@ 23DC..23E1;AL # Sm [6] TOP PARENTHESIS..BOTTOM TORTOISE SHELL BRACKET 23E2..23EF;AL # So [14] WHITE TRAPEZIUM..BLACK RIGHT-POINTING TRIANGLE WITH DOUBLE VERTICAL BAR 23F0..23F3;ID # So [4] ALARM CLOCK..HOURGLASS WITH FLOWING SAND -23F4..23FA;AL # So [7] BLACK MEDIUM LEFT-POINTING TRIANGLE..BLACK CIRCLE FOR 
RECORD +23F4..23FF;AL # So [12] BLACK MEDIUM LEFT-POINTING TRIANGLE..OBSERVER EYE SYMBOL 2400..2426;AL # So [39] SYMBOL FOR NULL..SYMBOL FOR SUBSTITUTE FORM TWO 2440..244A;AL # So [11] OCR HOOK..OCR DOUBLE BACKSLASH 2460..249B;AI # No [60] CIRCLED DIGIT ONE..NUMBER TWENTY FULL STOP @@ -1143,7 +1162,9 @@ 2616..2617;AI # So [2] WHITE SHOGI PIECE..BLACK SHOGI PIECE 2618;ID # So SHAMROCK 2619;AL # So REVERSED ROTATED FLORAL HEART BULLET -261A..261F;ID # So [6] BLACK LEFT POINTING INDEX..WHITE DOWN POINTING INDEX +261A..261C;ID # So [3] BLACK LEFT POINTING INDEX..WHITE LEFT POINTING INDEX +261D;EB # So WHITE UP POINTING INDEX +261E..261F;ID # So [2] WHITE RIGHT POINTING INDEX..WHITE DOWN POINTING INDEX 2620..2638;AL # So [25] SKULL AND CROSSBONES..WHEEL OF DHARMA 2639..263B;ID # So [3] WHITE FROWNING FACE..BLACK SMILING FACE 263C..263F;AL # So [4] WHITE SUN WITH RAYS..MERCURY @@ -1188,19 +1209,23 @@ 26EB..26F0;AI # So [6] CASTLE..MOUNTAIN 26F1..26F5;ID # So [5] UMBRELLA ON GROUND..SAILBOAT 26F6;AI # So SQUARE FOUR CORNERS -26F7..26FA;ID # So [4] SKIER..TENT +26F7..26F8;ID # So [2] SKIER..ICE SKATE +26F9;EB # So PERSON WITH BALL +26FA;ID # So TENT 26FB..26FC;AI # So [2] JAPANESE BANK SYMBOL..HEADSTONE GRAVEYARD SYMBOL 26FD..26FF;ID # So [3] FUEL PUMP..WHITE FLAG WITH HORIZONTAL MIDDLE BLACK STRIPE 2700..2704;ID # So [5] BLACK SAFETY SCISSORS..WHITE SCISSORS 2705..2707;AL # So [3] WHITE HEAVY CHECK MARK..TAPE DRIVE -2708..270D;ID # So [6] AIRPLANE..WRITING HAND +2708..2709;ID # So [2] AIRPLANE..ENVELOPE +270A..270D;EB # So [4] RAISED FIST..WRITING HAND 270E..2756;AL # So [73] LOWER RIGHT PENCIL..BLACK DIAMOND MINUS WHITE X 2757;AI # So HEAVY EXCLAMATION MARK SYMBOL 2758..275A;AL # So [3] LIGHT VERTICAL BAR..HEAVY VERTICAL BAR 275B..2760;QU # So [6] HEAVY SINGLE TURNED COMMA QUOTATION MARK ORNAMENT..HEAVY LOW DOUBLE COMMA QUOTATION MARK ORNAMENT 2761;AL # So CURVED STEM PARAGRAPH SIGN ORNAMENT 2762..2763;EX # So [2] HEAVY EXCLAMATION MARK ORNAMENT..HEAVY HEART EXCLAMATION 
MARK ORNAMENT -2764..2767;AL # So [4] HEAVY BLACK HEART..ROTATED FLORAL HEART BULLET +2764;ID # So HEAVY BLACK HEART +2765..2767;AL # So [3] ROTATED HEAVY BLACK HEART BULLET..ROTATED FLORAL HEART BULLET 2768;OP # Ps MEDIUM LEFT PARENTHESIS ORNAMENT 2769;CL # Pe MEDIUM RIGHT PARENTHESIS ORNAMENT 276A;OP # Ps MEDIUM FLATTENED LEFT PARENTHESIS ORNAMENT @@ -1277,7 +1302,7 @@ 2B76..2B95;AL # So [32] NORTH WEST TRIANGLE-HEADED ARROW TO BAR..RIGHTWARDS BLACK ARROW 2B98..2BB9;AL # So [34] THREE-D TOP-LIGHTED LEFTWARDS EQUILATERAL ARROWHEAD..UP ARROWHEAD IN A RECTANGLE BOX 2BBD..2BC8;AL # So [12] BALLOT BOX WITH LIGHT X..BLACK MEDIUM RIGHT-POINTING TRIANGLE CENTRED -2BCA..2BD1;AL # So [8] TOP HALF BLACK CIRCLE..UNCERTAINTY SIGN +2BCA..2BD2;AL # So [9] TOP HALF BLACK CIRCLE..GROUP MARK 2BEC..2BEF;AL # So [4] LEFTWARDS TWO-HEADED ARROW WITH TRIANGLE ARROWHEADS..DOWNWARDS TWO-HEADED ARROW WITH TRIANGLE ARROWHEADS 2C00..2C2E;AL # Lu [47] GLAGOLITIC CAPITAL LETTER AZU..GLAGOLITIC CAPITAL LETTER LATINATE MYSLITE 2C30..2C5E;AL # Ll [47] GLAGOLITIC SMALL LETTER AZU..GLAGOLITIC SMALL LETTER LATINATE MYSLITE @@ -1355,6 +1380,7 @@ 2E40;BA # Pd DOUBLE HYPHEN 2E41;BA # Po REVERSED COMMA 2E42;OP # Ps DOUBLE LOW-REVERSED-9 QUOTATION MARK +2E43..2E49;BA # Po [7] DASH WITH LEFT UPTURN..DOUBLE STACKED COMMA 2E80..2E99;ID # So [26] CJK RADICAL REPEAT..CJK RADICAL RAP 2E9B..2EF3;ID # So [89] CJK RADICAL CHOKE..CJK RADICAL C-SIMPLIFIED TURTLE 2F00..2FD5;ID # So [214] KANGXI RADICAL ONE..KANGXI RADICAL FLUTE @@ -1453,7 +1479,7 @@ 30FC;CJ # Lm KATAKANA-HIRAGANA PROLONGED SOUND MARK 30FD..30FE;NS # Lm [2] KATAKANA ITERATION MARK..KATAKANA VOICED ITERATION MARK 30FF;ID # Lo KATAKANA DIGRAPH KOTO -3105..312D;ID # Lo [41] BOPOMOFO LETTER B..BOPOMOFO LETTER IH +3105..312E;ID # Lo [42] BOPOMOFO LETTER B..BOPOMOFO LETTER O WITH DOT ABOVE 3131..318E;ID # Lo [94] HANGUL LETTER KIYEOK..HANGUL LETTER ARAEAE 3190..3191;ID # So [2] IDEOGRAPHIC ANNOTATION LINKING MARK..IDEOGRAPHIC ANNOTATION REVERSE MARK 
3192..3195;ID # No [4] IDEOGRAPHIC ANNOTATION ONE MARK..IDEOGRAPHIC ANNOTATION FOUR MARK @@ -1476,8 +1502,8 @@ 3400..4DB5;ID # Lo [6582] CJK UNIFIED IDEOGRAPH-3400..CJK UNIFIED IDEOGRAPH-4DB5 4DB6..4DBF;ID # Cn [10] <reserved-4DB6>..<reserved-4DBF> 4DC0..4DFF;AL # So [64] HEXAGRAM FOR THE CREATIVE HEAVEN..HEXAGRAM FOR BEFORE COMPLETION -4E00..9FD5;ID # Lo [20950] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FD5 -9FD6..9FFF;ID # Cn [42] <reserved-9FD6>..<reserved-9FFF> +4E00..9FEA;ID # Lo [20971] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FEA +9FEB..9FFF;ID # Cn [21] <reserved-9FEB>..<reserved-9FFF> A000..A014;ID # Lo [21] YI SYLLABLE IT..YI SYLLABLE E A015;NS # Lm YI SYLLABLE WU A016..A48C;ID # Lo [1143] YI SYLLABLE BIT..YI SYLLABLE YYR @@ -1519,7 +1545,7 @@ A788;AL # Lm MODIFIER LETTER LOW CIRCUMFLEX ACCENT A789..A78A;AL # Sk [2] MODIFIER LETTER COLON..MODIFIER LETTER SHORT EQUALS SIGN A78B..A78E;AL # L& [4] LATIN CAPITAL LETTER SALTILLO..LATIN SMALL LETTER L WITH RETROFLEX HOOK AND BELT A78F;AL # Lo LATIN LETTER SINOLOGICAL DOT -A790..A7AD;AL # L& [30] LATIN CAPITAL LETTER N WITH DESCENDER..LATIN CAPITAL LETTER L WITH BELT +A790..A7AE;AL # L& [31] LATIN CAPITAL LETTER N WITH DESCENDER..LATIN CAPITAL LETTER SMALL CAPITAL I A7B0..A7B7;AL # L& [8] LATIN CAPITAL LETTER TURNED K..LATIN SMALL LETTER OMEGA A7F7;AL # Lo LATIN EPIGRAPHIC LETTER SIDEWAYS I A7F8..A7F9;AL # Lm [2] MODIFIER LETTER CAPITAL H WITH STROKE..MODIFIER LETTER SMALL LIGATURE OE @@ -1546,7 +1572,7 @@ A876..A877;EX # Po [2] PHAGS-PA MARK SHAD..PHAGS-PA MARK DOUBLE SHAD A880..A881;CM # Mc [2] SAURASHTRA SIGN ANUSVARA..SAURASHTRA SIGN VISARGA A882..A8B3;AL # Lo [50] SAURASHTRA LETTER A..SAURASHTRA LETTER LLA A8B4..A8C3;CM # Mc [16] SAURASHTRA CONSONANT SIGN HAARU..SAURASHTRA VOWEL SIGN AU -A8C4;CM # Mn SAURASHTRA SIGN VIRAMA +A8C4..A8C5;CM # Mn [2] SAURASHTRA SIGN VIRAMA..SAURASHTRA SIGN CANDRABINDU A8CE..A8CF;BA # Po [2] SAURASHTRA DANDA..SAURASHTRA DOUBLE DANDA A8D0..A8D9;NU # Nd [10] 
SAURASHTRA DIGIT ZERO..SAURASHTRA DIGIT NINE A8E0..A8F1;CM # Mn [18] COMBINING DEVANAGARI DIGIT ZERO..COMBINING DEVANAGARI SIGN AVAGRAHA @@ -2574,16 +2600,16 @@ FF62;OP # Ps HALFWIDTH LEFT CORNER BRACKET FF63;CL # Pe HALFWIDTH RIGHT CORNER BRACKET FF64;CL # Po HALFWIDTH IDEOGRAPHIC COMMA FF65;NS # Po HALFWIDTH KATAKANA MIDDLE DOT -FF66;AL # Lo HALFWIDTH KATAKANA LETTER WO +FF66;ID # Lo HALFWIDTH KATAKANA LETTER WO FF67..FF6F;CJ # Lo [9] HALFWIDTH KATAKANA LETTER SMALL A..HALFWIDTH KATAKANA LETTER SMALL TU FF70;CJ # Lm HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK -FF71..FF9D;AL # Lo [45] HALFWIDTH KATAKANA LETTER A..HALFWIDTH KATAKANA LETTER N +FF71..FF9D;ID # Lo [45] HALFWIDTH KATAKANA LETTER A..HALFWIDTH KATAKANA LETTER N FF9E..FF9F;NS # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK -FFA0..FFBE;AL # Lo [31] HALFWIDTH HANGUL FILLER..HALFWIDTH HANGUL LETTER HIEUH -FFC2..FFC7;AL # Lo [6] HALFWIDTH HANGUL LETTER A..HALFWIDTH HANGUL LETTER E -FFCA..FFCF;AL # Lo [6] HALFWIDTH HANGUL LETTER YEO..HALFWIDTH HANGUL LETTER OE -FFD2..FFD7;AL # Lo [6] HALFWIDTH HANGUL LETTER YO..HALFWIDTH HANGUL LETTER YU -FFDA..FFDC;AL # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER I +FFA0..FFBE;ID # Lo [31] HALFWIDTH HANGUL FILLER..HALFWIDTH HANGUL LETTER HIEUH +FFC2..FFC7;ID # Lo [6] HALFWIDTH HANGUL LETTER A..HALFWIDTH HANGUL LETTER E +FFCA..FFCF;ID # Lo [6] HALFWIDTH HANGUL LETTER YEO..HALFWIDTH HANGUL LETTER OE +FFD2..FFD7;ID # Lo [6] HALFWIDTH HANGUL LETTER YO..HALFWIDTH HANGUL LETTER YU +FFDA..FFDC;ID # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER I FFE0;PO # Sc FULLWIDTH CENT SIGN FFE1;PR # Sc FULLWIDTH POUND SIGN FFE2;ID # Sm FULLWIDTH NOT SIGN @@ -2610,7 +2636,7 @@ FFFD;AI # So REPLACEMENT CHARACTER 10175..10178;AL # No [4] GREEK ONE HALF SIGN..GREEK THREE QUARTERS SIGN 10179..10189;AL # So [17] GREEK YEAR SIGN..GREEK TRYBLION BASE SIGN 1018A..1018B;AL # No [2] GREEK ZERO SIGN..GREEK ONE QUARTER SIGN -1018C;AL 
# So GREEK SINUSOID SIGN +1018C..1018E;AL # So [3] GREEK SINUSOID SIGN..NOMISMA SIGN 10190..1019B;AL # So [12] ROMAN SEXTANS SIGN..ROMAN CENTURIAL SIGN 101A0;AL # So GREEK SYMBOL TAU RHO 101D0..101FC;AL # So [45] PHAISTOS DISC SIGN PEDESTRIAN..PHAISTOS DISC SIGN WAVY BAND @@ -2621,6 +2647,7 @@ FFFD;AI # So REPLACEMENT CHARACTER 102E1..102FB;AL # No [27] COPTIC EPACT DIGIT ONE..COPTIC EPACT NUMBER NINE HUNDRED 10300..1031F;AL # Lo [32] OLD ITALIC LETTER A..OLD ITALIC LETTER ESS 10320..10323;AL # No [4] OLD ITALIC NUMERAL ONE..OLD ITALIC NUMERAL FIFTY +1032D..1032F;AL # Lo [3] OLD ITALIC LETTER YE..OLD ITALIC LETTER SOUTHERN TSE 10330..10340;AL # Lo [17] GOTHIC LETTER AHSA..GOTHIC LETTER PAIRTHRA 10341;AL # Nl GOTHIC LETTER NINETY 10342..10349;AL # Lo [8] GOTHIC LETTER RAIDA..GOTHIC LETTER OTHAL @@ -2637,6 +2664,8 @@ FFFD;AI # So REPLACEMENT CHARACTER 10450..1047F;AL # Lo [48] SHAVIAN LETTER PEEP..SHAVIAN LETTER YEW 10480..1049D;AL # Lo [30] OSMANYA LETTER ALEF..OSMANYA LETTER OO 104A0..104A9;NU # Nd [10] OSMANYA DIGIT ZERO..OSMANYA DIGIT NINE +104B0..104D3;AL # Lu [36] OSAGE CAPITAL LETTER A..OSAGE CAPITAL LETTER ZHA +104D8..104FB;AL # Ll [36] OSAGE SMALL LETTER A..OSAGE SMALL LETTER ZHA 10500..10527;AL # Lo [40] ELBASAN LETTER A..ELBASAN LETTER KHE 10530..10563;AL # Lo [52] CAUCASIAN ALBANIAN LETTER ALT..CAUCASIAN ALBANIAN LETTER KIW 1056F;AL # Po CAUCASIAN ALBANIAN CITATION MARK @@ -2774,6 +2803,7 @@ FFFD;AI # So REPLACEMENT CHARACTER 1123A;AL # Po KHOJKI WORD SEPARATOR 1123B..1123C;BA # Po [2] KHOJKI SECTION MARK..KHOJKI DOUBLE SECTION MARK 1123D;AL # Po KHOJKI ABBREVIATION SIGN +1123E;CM # Mn KHOJKI SIGN SUKUN 11280..11286;AL # Lo [7] MULTANI LETTER A..MULTANI LETTER GA 11288;AL # Lo MULTANI LETTER GHA 1128A..1128D;AL # Lo [4] MULTANI LETTER CA..MULTANI LETTER JJA @@ -2806,6 +2836,19 @@ FFFD;AI # So REPLACEMENT CHARACTER 11362..11363;CM # Mc [2] GRANTHA VOWEL SIGN VOCALIC L..GRANTHA VOWEL SIGN VOCALIC LL 11366..1136C;CM # Mn [7] COMBINING GRANTHA DIGIT 
ZERO..COMBINING GRANTHA DIGIT SIX 11370..11374;CM # Mn [5] COMBINING GRANTHA LETTER A..COMBINING GRANTHA LETTER PA +11400..11434;AL # Lo [53] NEWA LETTER A..NEWA LETTER HA +11435..11437;CM # Mc [3] NEWA VOWEL SIGN AA..NEWA VOWEL SIGN II +11438..1143F;CM # Mn [8] NEWA VOWEL SIGN U..NEWA VOWEL SIGN AI +11440..11441;CM # Mc [2] NEWA VOWEL SIGN O..NEWA VOWEL SIGN AU +11442..11444;CM # Mn [3] NEWA SIGN VIRAMA..NEWA SIGN ANUSVARA +11445;CM # Mc NEWA SIGN VISARGA +11446;CM # Mn NEWA SIGN NUKTA +11447..1144A;AL # Lo [4] NEWA SIGN AVAGRAHA..NEWA SIDDHI +1144B..1144E;BA # Po [4] NEWA DANDA..NEWA GAP FILLER +1144F;AL # Po NEWA ABBREVIATION SIGN +11450..11459;NU # Nd [10] NEWA DIGIT ZERO..NEWA DIGIT NINE +1145B;BA # Po NEWA PLACEHOLDER MARK +1145D;AL # Po NEWA INSERTION SIGN 11480..114AF;AL # Lo [48] TIRHUTA ANJI..TIRHUTA LETTER HA 114B0..114B2;CM # Mc [3] TIRHUTA VOWEL SIGN AA..TIRHUTA VOWEL SIGN II 114B3..114B8;CM # Mn [6] TIRHUTA VOWEL SIGN U..TIRHUTA VOWEL SIGN VOCALIC LL @@ -2844,6 +2887,7 @@ FFFD;AI # So REPLACEMENT CHARACTER 11643;AL # Po MODI ABBREVIATION SIGN 11644;AL # Lo MODI SIGN HUVA 11650..11659;NU # Nd [10] MODI DIGIT ZERO..MODI DIGIT NINE +11660..1166C;BB # Po [13] MONGOLIAN BIRGA WITH ORNAMENT..MONGOLIAN TURNED SWIRL BIRGA WITH DOUBLE ORNAMENT 11680..116AA;AL # Lo [43] TAKRI LETTER A..TAKRI LETTER RRA 116AB;CM # Mn TAKRI SIGN ANUSVARA 116AC;CM # Mc TAKRI SIGN VISARGA @@ -2867,7 +2911,65 @@ FFFD;AI # So REPLACEMENT CHARACTER 118E0..118E9;NU # Nd [10] WARANG CITI DIGIT ZERO..WARANG CITI DIGIT NINE 118EA..118F2;AL # No [9] WARANG CITI NUMBER TEN..WARANG CITI NUMBER NINETY 118FF;AL # Lo WARANG CITI OM +11A00;AL # Lo ZANABAZAR SQUARE LETTER A +11A01..11A06;CM # Mn [6] ZANABAZAR SQUARE VOWEL SIGN I..ZANABAZAR SQUARE VOWEL SIGN O +11A07..11A08;CM # Mc [2] ZANABAZAR SQUARE VOWEL SIGN AI..ZANABAZAR SQUARE VOWEL SIGN AU +11A09..11A0A;CM # Mn [2] ZANABAZAR SQUARE VOWEL SIGN REVERSED I..ZANABAZAR SQUARE VOWEL LENGTH MARK +11A0B..11A32;AL # Lo [40] ZANABAZAR SQUARE LETTER 
KA..ZANABAZAR SQUARE LETTER KSSA +11A33..11A38;CM # Mn [6] ZANABAZAR SQUARE FINAL CONSONANT MARK..ZANABAZAR SQUARE SIGN ANUSVARA +11A39;CM # Mc ZANABAZAR SQUARE SIGN VISARGA +11A3A;AL # Lo ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA +11A3B..11A3E;CM # Mn [4] ZANABAZAR SQUARE CLUSTER-FINAL LETTER YA..ZANABAZAR SQUARE CLUSTER-FINAL LETTER VA +11A3F;BB # Po ZANABAZAR SQUARE INITIAL HEAD MARK +11A40;AL # Po ZANABAZAR SQUARE CLOSING HEAD MARK +11A41..11A44;BA # Po [4] ZANABAZAR SQUARE MARK TSHEG..ZANABAZAR SQUARE MARK LONG TSHEG +11A45;BB # Po ZANABAZAR SQUARE INITIAL DOUBLE-LINED HEAD MARK +11A46;AL # Po ZANABAZAR SQUARE CLOSING DOUBLE-LINED HEAD MARK +11A47;CM # Mn ZANABAZAR SQUARE SUBJOINER +11A50;AL # Lo SOYOMBO LETTER A +11A51..11A56;CM # Mn [6] SOYOMBO VOWEL SIGN I..SOYOMBO VOWEL SIGN OE +11A57..11A58;CM # Mc [2] SOYOMBO VOWEL SIGN AI..SOYOMBO VOWEL SIGN AU +11A59..11A5B;CM # Mn [3] SOYOMBO VOWEL SIGN VOCALIC R..SOYOMBO VOWEL LENGTH MARK +11A5C..11A83;AL # Lo [40] SOYOMBO LETTER KA..SOYOMBO LETTER KSSA +11A86..11A89;AL # Lo [4] SOYOMBO CLUSTER-INITIAL LETTER RA..SOYOMBO CLUSTER-INITIAL LETTER SA +11A8A..11A96;CM # Mn [13] SOYOMBO FINAL CONSONANT SIGN G..SOYOMBO SIGN ANUSVARA +11A97;CM # Mc SOYOMBO SIGN VISARGA +11A98..11A99;CM # Mn [2] SOYOMBO GEMINATION MARK..SOYOMBO SUBJOINER +11A9A..11A9C;BA # Po [3] SOYOMBO MARK TSHEG..SOYOMBO MARK DOUBLE SHAD +11A9E..11AA0;BB # Po [3] SOYOMBO HEAD MARK WITH MOON AND SUN AND TRIPLE FLAME..SOYOMBO HEAD MARK WITH MOON AND SUN +11AA1..11AA2;BA # Po [2] SOYOMBO TERMINAL MARK-1..SOYOMBO TERMINAL MARK-2 11AC0..11AF8;AL # Lo [57] PAU CIN HAU LETTER PA..PAU CIN HAU GLOTTAL STOP FINAL +11C00..11C08;AL # Lo [9] BHAIKSUKI LETTER A..BHAIKSUKI LETTER VOCALIC L +11C0A..11C2E;AL # Lo [37] BHAIKSUKI LETTER E..BHAIKSUKI LETTER HA +11C2F;CM # Mc BHAIKSUKI VOWEL SIGN AA +11C30..11C36;CM # Mn [7] BHAIKSUKI VOWEL SIGN I..BHAIKSUKI VOWEL SIGN VOCALIC L +11C38..11C3D;CM # Mn [6] BHAIKSUKI VOWEL SIGN E..BHAIKSUKI SIGN ANUSVARA +11C3E;CM # Mc BHAIKSUKI 
SIGN VISARGA +11C3F;CM # Mn BHAIKSUKI SIGN VIRAMA +11C40;AL # Lo BHAIKSUKI SIGN AVAGRAHA +11C41..11C45;BA # Po [5] BHAIKSUKI DANDA..BHAIKSUKI GAP FILLER-2 +11C50..11C59;NU # Nd [10] BHAIKSUKI DIGIT ZERO..BHAIKSUKI DIGIT NINE +11C5A..11C6C;AL # No [19] BHAIKSUKI NUMBER ONE..BHAIKSUKI HUNDREDS UNIT MARK +11C70;BB # Po MARCHEN HEAD MARK +11C71;EX # Po MARCHEN MARK SHAD +11C72..11C8F;AL # Lo [30] MARCHEN LETTER KA..MARCHEN LETTER A +11C92..11CA7;CM # Mn [22] MARCHEN SUBJOINED LETTER KA..MARCHEN SUBJOINED LETTER ZA +11CA9;CM # Mc MARCHEN SUBJOINED LETTER YA +11CAA..11CB0;CM # Mn [7] MARCHEN SUBJOINED LETTER RA..MARCHEN VOWEL SIGN AA +11CB1;CM # Mc MARCHEN VOWEL SIGN I +11CB2..11CB3;CM # Mn [2] MARCHEN VOWEL SIGN U..MARCHEN VOWEL SIGN E +11CB4;CM # Mc MARCHEN VOWEL SIGN O +11CB5..11CB6;CM # Mn [2] MARCHEN SIGN ANUSVARA..MARCHEN SIGN CANDRABINDU +11D00..11D06;AL # Lo [7] MASARAM GONDI LETTER A..MASARAM GONDI LETTER E +11D08..11D09;AL # Lo [2] MASARAM GONDI LETTER AI..MASARAM GONDI LETTER O +11D0B..11D30;AL # Lo [38] MASARAM GONDI LETTER AU..MASARAM GONDI LETTER TRA +11D31..11D36;CM # Mn [6] MASARAM GONDI VOWEL SIGN AA..MASARAM GONDI VOWEL SIGN VOCALIC R +11D3A;CM # Mn MASARAM GONDI VOWEL SIGN E +11D3C..11D3D;CM # Mn [2] MASARAM GONDI VOWEL SIGN AI..MASARAM GONDI VOWEL SIGN O +11D3F..11D45;CM # Mn [7] MASARAM GONDI VOWEL SIGN AU..MASARAM GONDI VIRAMA +11D46;AL # Lo MASARAM GONDI REPHA +11D47;CM # Mn MASARAM GONDI RA-KARA +11D50..11D59;NU # Nd [10] MASARAM GONDI DIGIT ZERO..MASARAM GONDI DIGIT NINE 12000..12399;AL # Lo [922] CUNEIFORM SIGN A..CUNEIFORM SIGN U U 12400..1246E;AL # Nl [111] CUNEIFORM NUMERIC SIGN TWO ASH..CUNEIFORM NUMERIC SIGN NINE U VARIANT FORM 12470..12474;BA # Po [5] CUNEIFORM PUNCTUATION SIGN OLD ASSYRIAN WORD DIVIDER..CUNEIFORM PUNCTUATION SIGN DIAGONAL QUADCOLON @@ -2914,7 +3016,12 @@ FFFD;AI # So REPLACEMENT CHARACTER 16F51..16F7E;CM # Mc [46] MIAO SIGN ASPIRATION..MIAO VOWEL SIGN NG 16F8F..16F92;CM # Mn [4] MIAO TONE RIGHT..MIAO TONE BELOW 
 16F93..16F9F;AL # Lm [13] MIAO LETTER TONE-2..MIAO LETTER REFORMED TONE-8
-1B000..1B001;ID # Lo [2] KATAKANA LETTER ARCHAIC E..HIRAGANA LETTER ARCHAIC YE
+16FE0..16FE1;NS # Lm [2] TANGUT ITERATION MARK..NUSHU ITERATION MARK
+17000..187EC;ID # Lo [6125] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187EC
+18800..18AF2;ID # Lo [755] TANGUT COMPONENT-001..TANGUT COMPONENT-755
+1B000..1B0FF;ID # Lo [256] KATAKANA LETTER ARCHAIC E..HENTAIGANA LETTER RE-2
+1B100..1B11E;ID # Lo [31] HENTAIGANA LETTER RE-3..HENTAIGANA LETTER N-MU-MO-2
+1B170..1B2FB;ID # Lo [396] NUSHU CHARACTER-1B170..NUSHU CHARACTER-1B2FB
 1BC00..1BC6A;AL # Lo [107] DUPLOYAN LETTER H..DUPLOYAN LETTER VOCALIC M
 1BC70..1BC7C;AL # Lo [13] DUPLOYAN AFFIX LEFT HORIZONTAL SECANT..DUPLOYAN AFFIX ATTACHED TANGENT HOOK
 1BC80..1BC88;AL # Lo [9] DUPLOYAN AFFIX HIGH ACUTE..DUPLOYAN AFFIX HIGH VERTICAL
@@ -2996,9 +3103,18 @@ FFFD;AI # So REPLACEMENT CHARACTER
 1DA8B;AL # Po SIGNWRITING PARENTHESIS
 1DA9B..1DA9F;CM # Mn [5] SIGNWRITING FILL MODIFIER-2..SIGNWRITING FILL MODIFIER-6
 1DAA1..1DAAF;CM # Mn [15] SIGNWRITING ROTATION MODIFIER-2..SIGNWRITING ROTATION MODIFIER-16
+1E000..1E006;CM # Mn [7] COMBINING GLAGOLITIC LETTER AZU..COMBINING GLAGOLITIC LETTER ZHIVETE
+1E008..1E018;CM # Mn [17] COMBINING GLAGOLITIC LETTER ZEMLJA..COMBINING GLAGOLITIC LETTER HERU
+1E01B..1E021;CM # Mn [7] COMBINING GLAGOLITIC LETTER SHTA..COMBINING GLAGOLITIC LETTER YATI
+1E023..1E024;CM # Mn [2] COMBINING GLAGOLITIC LETTER YU..COMBINING GLAGOLITIC LETTER SMALL YUS
+1E026..1E02A;CM # Mn [5] COMBINING GLAGOLITIC LETTER YO..COMBINING GLAGOLITIC LETTER FITA
 1E800..1E8C4;AL # Lo [197] MENDE KIKAKUI SYLLABLE M001 KI..MENDE KIKAKUI SYLLABLE M060 NYON
 1E8C7..1E8CF;AL # No [9] MENDE KIKAKUI DIGIT ONE..MENDE KIKAKUI DIGIT NINE
 1E8D0..1E8D6;CM # Mn [7] MENDE KIKAKUI COMBINING NUMBER TEENS..MENDE KIKAKUI COMBINING NUMBER MILLIONS
+1E900..1E943;AL # L& [68] ADLAM CAPITAL LETTER ALIF..ADLAM SMALL LETTER SHA
+1E944..1E94A;CM # Mn [7] ADLAM ALIF LENGTHENER..ADLAM NUKTA
+1E950..1E959;NU # Nd [10] ADLAM DIGIT ZERO..ADLAM DIGIT NINE
+1E95E..1E95F;OP # Po [2] ADLAM INITIAL EXCLAMATION MARK..ADLAM INITIAL QUESTION MARK
 1EE00..1EE03;AL # Lo [4] ARABIC MATHEMATICAL ALEF..ARABIC MATHEMATICAL DAL
 1EE05..1EE1F;AL # Lo [27] ARABIC MATHEMATICAL WAW..ARABIC MATHEMATICAL DOTLESS QAF
 1EE21..1EE22;AL # Lo [2] ARABIC MATHEMATICAL INITIAL BEH..ARABIC MATHEMATICAL INITIAL JEEM
@@ -3034,37 +3150,79 @@ FFFD;AI # So REPLACEMENT CHARACTER
 1EEAB..1EEBB;AL # Lo [17] ARABIC MATHEMATICAL DOUBLE-STRUCK LAM..ARABIC MATHEMATICAL DOUBLE-STRUCK GHAIN
 1EEF0..1EEF1;AL # Sm [2] ARABIC MATHEMATICAL OPERATOR MEEM WITH HAH WITH TATWEEL..ARABIC MATHEMATICAL OPERATOR HAH WITH DAL
 1F000..1F02B;ID # So [44] MAHJONG TILE EAST WIND..MAHJONG TILE BACK
+1F02C..1F02F;ID # Cn [4] <reserved-1F02C>..<reserved-1F02F>
 1F030..1F093;ID # So [100] DOMINO TILE HORIZONTAL BACK..DOMINO TILE VERTICAL-06-06
+1F094..1F09F;ID # Cn [12] <reserved-1F094>..<reserved-1F09F>
 1F0A0..1F0AE;ID # So [15] PLAYING CARD BACK..PLAYING CARD KING OF SPADES
+1F0AF..1F0B0;ID # Cn [2] <reserved-1F0AF>..<reserved-1F0B0>
 1F0B1..1F0BF;ID # So [15] PLAYING CARD ACE OF HEARTS..PLAYING CARD RED JOKER
+1F0C0;ID # Cn <reserved-1F0C0>
 1F0C1..1F0CF;ID # So [15] PLAYING CARD ACE OF DIAMONDS..PLAYING CARD BLACK JOKER
+1F0D0;ID # Cn <reserved-1F0D0>
 1F0D1..1F0F5;ID # So [37] PLAYING CARD ACE OF CLUBS..PLAYING CARD TRUMP-21
+1F0F6..1F0FF;ID # Cn [10] <reserved-1F0F6>..<reserved-1F0FF>
 1F100..1F10C;AI # No [13] DIGIT ZERO FULL STOP..DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT ZERO
+1F10D..1F10F;ID # Cn [3] <reserved-1F10D>..<reserved-1F10F>
 1F110..1F12D;AI # So [30] PARENTHESIZED LATIN CAPITAL LETTER A..CIRCLED CD
 1F12E;AL # So CIRCLED WZ
+1F12F;ID # Cn <reserved-1F12F>
 1F130..1F169;AI # So [58] SQUARED LATIN CAPITAL LETTER A..NEGATIVE CIRCLED LATIN CAPITAL LETTER Z
 1F16A..1F16B;AL # So [2] RAISED MC SIGN..RAISED MD SIGN
-1F170..1F19A;AI # So [43] NEGATIVE SQUARED LATIN CAPITAL LETTER A..SQUARED VS
+1F16C..1F16F;ID # Cn [4] <reserved-1F16C>..<reserved-1F16F>
+1F170..1F1AC;AI # So [61] NEGATIVE SQUARED LATIN CAPITAL LETTER A..SQUARED VOD
+1F1AD..1F1E5;ID # Cn [57] <reserved-1F1AD>..<reserved-1F1E5>
 1F1E6..1F1FF;RI # So [26] REGIONAL INDICATOR SYMBOL LETTER A..REGIONAL INDICATOR SYMBOL LETTER Z
 1F200..1F202;ID # So [3] SQUARE HIRAGANA HOKA..SQUARED KATAKANA SA
-1F210..1F23A;ID # So [43] SQUARED CJK UNIFIED IDEOGRAPH-624B..SQUARED CJK UNIFIED IDEOGRAPH-55B6
+1F203..1F20F;ID # Cn [13] <reserved-1F203>..<reserved-1F20F>
+1F210..1F23B;ID # So [44] SQUARED CJK UNIFIED IDEOGRAPH-624B..SQUARED CJK UNIFIED IDEOGRAPH-914D
+1F23C..1F23F;ID # Cn [4] <reserved-1F23C>..<reserved-1F23F>
 1F240..1F248;ID # So [9] TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-672C..TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-6557
+1F249..1F24F;ID # Cn [7] <reserved-1F249>..<reserved-1F24F>
 1F250..1F251;ID # So [2] CIRCLED IDEOGRAPH ADVANTAGE..CIRCLED IDEOGRAPH ACCEPT
-1F300..1F39B;ID # So [156] CYCLONE..CONTROL KNOBS
+1F252..1F25F;ID # Cn [14] <reserved-1F252>..<reserved-1F25F>
+1F260..1F265;ID # So [6] ROUNDED SYMBOL FOR FU..ROUNDED SYMBOL FOR CAI
+1F266..1F2FF;ID # Cn [154] <reserved-1F266>..<reserved-1F2FF>
+1F300..1F384;ID # So [133] CYCLONE..CHRISTMAS TREE
+1F385;EB # So FATHER CHRISTMAS
+1F386..1F39B;ID # So [22] FIREWORKS..CONTROL KNOBS
 1F39C..1F39D;AL # So [2] BEAMED ASCENDING MUSICAL NOTES..BEAMED DESCENDING MUSICAL NOTES
 1F39E..1F3B4;ID # So [23] FILM FRAMES..FLOWER PLAYING CARDS
 1F3B5..1F3B6;AL # So [2] MUSICAL NOTE..MULTIPLE MUSICAL NOTES
 1F3B7..1F3BB;ID # So [5] SAXOPHONE..VIOLIN
 1F3BC;AL # So MUSICAL SCORE
-1F3BD..1F3FA;ID # So [62] RUNNING SHIRT WITH SASH..AMPHORA
-1F3FB..1F3FF;AL # Sk [5] EMOJI MODIFIER FITZPATRICK TYPE-1-2..EMOJI MODIFIER FITZPATRICK TYPE-6
-1F400..1F49F;ID # So [160] RAT..HEART DECORATION
+1F3BD..1F3C1;ID # So [5] RUNNING SHIRT WITH SASH..CHEQUERED FLAG
+1F3C2..1F3C4;EB # So [3] SNOWBOARDER..SURFER
+1F3C5..1F3C6;ID # So [2] SPORTS MEDAL..TROPHY
+1F3C7;EB # So HORSE RACING
+1F3C8..1F3C9;ID # So [2] AMERICAN FOOTBALL..RUGBY FOOTBALL
+1F3CA..1F3CC;EB # So [3] SWIMMER..GOLFER
+1F3CD..1F3FA;ID # So [46] RACING MOTORCYCLE..AMPHORA
+1F3FB..1F3FF;EM # Sk [5] EMOJI MODIFIER FITZPATRICK TYPE-1-2..EMOJI MODIFIER FITZPATRICK TYPE-6
+1F400..1F441;ID # So [66] RAT..EYE
+1F442..1F443;EB # So [2] EAR..NOSE
+1F444..1F445;ID # So [2] MOUTH..TONGUE
+1F446..1F450;EB # So [11] WHITE UP POINTING BACKHAND INDEX..OPEN HANDS SIGN
+1F451..1F465;ID # So [21] CROWN..BUSTS IN SILHOUETTE
+1F466..1F469;EB # So [4] BOY..WOMAN
+1F46A..1F46D;ID # So [4] FAMILY..TWO WOMEN HOLDING HANDS
+1F46E;EB # So POLICE OFFICER
+1F46F;ID # So WOMAN WITH BUNNY EARS
+1F470..1F478;EB # So [9] BRIDE WITH VEIL..PRINCESS
+1F479..1F47B;ID # So [3] JAPANESE OGRE..GHOST
+1F47C;EB # So BABY ANGEL
+1F47D..1F480;ID # So [4] EXTRATERRESTRIAL ALIEN..SKULL
+1F481..1F483;EB # So [3] INFORMATION DESK PERSON..DANCER
+1F484;ID # So LIPSTICK
+1F485..1F487;EB # So [3] NAIL POLISH..HAIRCUT
+1F488..1F49F;ID # So [24] BARBER POLE..HEART DECORATION
 1F4A0;AL # So DIAMOND SHAPE WITH A DOT INSIDE
 1F4A1;ID # So ELECTRIC LIGHT BULB
 1F4A2;AL # So ANGER SYMBOL
 1F4A3;ID # So BOMB
 1F4A4;AL # So SLEEPING SYMBOL
-1F4A5..1F4AE;ID # So [10] COLLISION SYMBOL..WHITE FLOWER
+1F4A5..1F4A9;ID # So [5] COLLISION SYMBOL..PILE OF POO
+1F4AA;EB # So FLEXED BICEPS
+1F4AB..1F4AE;ID # So [4] DIZZY SYMBOL..WHITE FLOWER
 1F4AF;AL # So HUNDRED POINTS SYMBOL
 1F4B0;ID # So MONEY BAG
 1F4B1..1F4B2;AL # So [2] CURRENCY EXCHANGE..HEAVY DOLLAR SIGN
@@ -3074,31 +3232,80 @@ FFFD;AI # So REPLACEMENT CHARACTER
 1F517..1F524;AL # So [14] LINK SYMBOL..INPUT SYMBOL FOR LATIN LETTERS
 1F525..1F531;ID # So [13] FIRE..TRIDENT EMBLEM
 1F532..1F549;AL # So [24] BLACK SQUARE BUTTON..OM SYMBOL
-1F54A..1F579;ID # So [48] DOVE OF PEACE..JOYSTICK
-1F57B..1F5A3;ID # So [41] LEFT HAND TELEPHONE RECEIVER..BLACK DOWN POINTING BACKHAND INDEX
-1F5A5..1F5D3;ID # So [47] DESKTOP COMPUTER..SPIRAL CALENDAR PAD
+1F54A..1F573;ID # So [42] DOVE OF PEACE..HOLE
+1F574..1F575;EB # So [2] MAN IN BUSINESS SUIT LEVITATING..SLEUTH OR SPY
+1F576..1F579;ID # So [4] DARK SUNGLASSES..JOYSTICK
+1F57A;EB # So MAN DANCING
+1F57B..1F58F;ID # So [21] LEFT HAND TELEPHONE RECEIVER..TURNED OK HAND SIGN
+1F590;EB # So RAISED HAND WITH FINGERS SPLAYED
+1F591..1F594;ID # So [4] REVERSED RAISED HAND WITH FINGERS SPLAYED..REVERSED VICTORY HAND
+1F595..1F596;EB # So [2] REVERSED HAND WITH MIDDLE FINGER EXTENDED..RAISED HAND WITH PART BETWEEN MIDDLE AND RING FINGERS
+1F597..1F5D3;ID # So [61] WHITE DOWN POINTING LEFT HAND INDEX..SPIRAL CALENDAR PAD
 1F5D4..1F5DB;AL # So [8] DESKTOP WINDOW..DECREASE FONT SIZE SYMBOL
 1F5DC..1F5F3;ID # So [24] COMPRESSION..BALLOT BOX WITH BALLOT
 1F5F4..1F5F9;AL # So [6] BALLOT SCRIPT X..BALLOT BOX WITH BOLD CHECK
 1F5FA..1F5FF;ID # So [6] WORLD MAP..MOYAI
-1F600..1F64F;ID # So [80] GRINNING FACE..PERSON WITH FOLDED HANDS
+1F600..1F644;ID # So [69] GRINNING FACE..FACE WITH ROLLING EYES
+1F645..1F647;EB # So [3] FACE WITH NO GOOD GESTURE..PERSON BOWING DEEPLY
+1F648..1F64A;ID # So [3] SEE-NO-EVIL MONKEY..SPEAK-NO-EVIL MONKEY
+1F64B..1F64F;EB # So [5] HAPPY PERSON RAISING ONE HAND..PERSON WITH FOLDED HANDS
 1F650..1F675;AL # So [38] NORTH WEST POINTING LEAF..SWASH AMPERSAND ORNAMENT
 1F676..1F678;QU # So [3] SANS-SERIF HEAVY DOUBLE TURNED COMMA QUOTATION MARK ORNAMENT..SANS-SERIF HEAVY LOW DOUBLE COMMA QUOTATION MARK ORNAMENT
 1F679..1F67B;NS # So [3] HEAVY INTERROBANG ORNAMENT..HEAVY SANS-SERIF INTERROBANG ORNAMENT
 1F67C..1F67F;AL # So [4] VERY HEAVY SOLIDUS..REVERSE CHECKER BOARD
-1F680..1F6D0;ID # So [81] ROCKET..PLACE OF WORSHIP
+1F680..1F6A2;ID # So [35] ROCKET..SHIP
+1F6A3;EB # So ROWBOAT
+1F6A4..1F6B3;ID # So [16] SPEEDBOAT..NO BICYCLES
+1F6B4..1F6B6;EB # So [3] BICYCLIST..PEDESTRIAN
+1F6B7..1F6BF;ID # So [9] NO PEDESTRIANS..SHOWER
+1F6C0;EB # So BATH
+1F6C1..1F6CB;ID # So [11] BATHTUB..COUCH AND LAMP
+1F6CC;EB # So SLEEPING ACCOMMODATION
+1F6CD..1F6D4;ID # So [8] SHOPPING BAGS..PAGODA
+1F6D5..1F6DF;ID # Cn [11] <reserved-1F6D5>..<reserved-1F6DF>
 1F6E0..1F6EC;ID # So [13] HAMMER AND WRENCH..AIRPLANE ARRIVING
-1F6F0..1F6F3;ID # So [4] SATELLITE..PASSENGER SHIP
+1F6ED..1F6EF;ID # Cn [3] <reserved-1F6ED>..<reserved-1F6EF>
+1F6F0..1F6F8;ID # So [9] SATELLITE..FLYING SAUCER
+1F6F9..1F6FF;ID # Cn [7] <reserved-1F6F9>..<reserved-1F6FF>
 1F700..1F773;AL # So [116] ALCHEMICAL SYMBOL FOR QUINTESSENCE..ALCHEMICAL SYMBOL FOR HALF OUNCE
+1F774..1F77F;ID # Cn [12] <reserved-1F774>..<reserved-1F77F>
 1F780..1F7D4;AL # So [85] BLACK LEFT-POINTING ISOSCELES RIGHT TRIANGLE..HEAVY TWELVE POINTED PINWHEEL STAR
+1F7D5..1F7FF;ID # Cn [43] <reserved-1F7D5>..<reserved-1F7FF>
 1F800..1F80B;AL # So [12] LEFTWARDS ARROW WITH SMALL TRIANGLE ARROWHEAD..DOWNWARDS ARROW WITH LARGE TRIANGLE ARROWHEAD
+1F80C..1F80F;ID # Cn [4] <reserved-1F80C>..<reserved-1F80F>
 1F810..1F847;AL # So [56] LEFTWARDS ARROW WITH SMALL EQUILATERAL ARROWHEAD..DOWNWARDS HEAVY ARROW
+1F848..1F84F;ID # Cn [8] <reserved-1F848>..<reserved-1F84F>
 1F850..1F859;AL # So [10] LEFTWARDS SANS-SERIF ARROW..UP DOWN SANS-SERIF ARROW
+1F85A..1F85F;ID # Cn [6] <reserved-1F85A>..<reserved-1F85F>
 1F860..1F887;AL # So [40] WIDE-HEADED LEFTWARDS LIGHT BARB ARROW..WIDE-HEADED SOUTH WEST VERY HEAVY BARB ARROW
+1F888..1F88F;ID # Cn [8] <reserved-1F888>..<reserved-1F88F>
 1F890..1F8AD;AL # So [30] LEFTWARDS TRIANGLE ARROWHEAD..WHITE ARROW SHAFT WIDTH TWO THIRDS
-1F910..1F918;ID # So [9] ZIPPER-MOUTH FACE..SIGN OF THE HORNS
-1F980..1F984;ID # So [5] CRAB..UNICORN FACE
+1F8AE..1F8FF;ID # Cn [82] <reserved-1F8AE>..<reserved-1F8FF>
+1F900..1F90B;AL # So [12] CIRCLED CROSS FORMEE WITH FOUR DOTS..DOWNWARD FACING NOTCHED HOOK WITH DOT
+1F90C..1F90F;ID # Cn [4] <reserved-1F90C>..<reserved-1F90F>
+1F910..1F917;ID # So [8] ZIPPER-MOUTH FACE..HUGGING FACE
+1F918..1F91C;EB # So [5] SIGN OF THE HORNS..RIGHT-FACING FIST
+1F91D;ID # So HANDSHAKE
+1F91E..1F91F;EB # So [2] HAND WITH INDEX AND MIDDLE FINGERS CROSSED..I LOVE YOU HAND SIGN
+1F920..1F925;ID # So [6] FACE WITH COWBOY HAT..LYING FACE
+1F926;EB # So FACE PALM
+1F927..1F92F;ID # So [9] SNEEZING FACE..SHOCKED FACE WITH EXPLODING HEAD
+1F930..1F939;EB # So [10] PREGNANT WOMAN..JUGGLING
+1F93A..1F93C;ID # So [3] FENCER..WRESTLERS
+1F93D..1F93E;EB # So [2] WATER POLO..HANDBALL
+1F93F;ID # Cn <reserved-1F93F>
+1F940..1F94C;ID # So [13] WILTED FLOWER..CURLING STONE
+1F94D..1F94F;ID # Cn [3] <reserved-1F94D>..<reserved-1F94F>
+1F950..1F96B;ID # So [28] CROISSANT..CANNED FOOD
+1F96C..1F97F;ID # Cn [20] <reserved-1F96C>..<reserved-1F97F>
+1F980..1F997;ID # So [24] CRAB..CRICKET
+1F998..1F9BF;ID # Cn [40] <reserved-1F998>..<reserved-1F9BF>
 1F9C0;ID # So CHEESE WEDGE
+1F9C1..1F9CF;ID # Cn [15] <reserved-1F9C1>..<reserved-1F9CF>
+1F9D0;ID # So FACE WITH MONOCLE
+1F9D1..1F9DD;EB # So [13] ADULT..ELF
+1F9DE..1F9E6;ID # So [9] GENIE..SOCKS
+1F9E7..1FFFD;ID # Cn [1559] <reserved-1F9E7>..<reserved-1FFFD>
 20000..2A6D6;ID # Lo [42711] CJK UNIFIED IDEOGRAPH-20000..CJK UNIFIED IDEOGRAPH-2A6D6
 2A6D7..2A6FF;ID # Cn [41] <reserved-2A6D7>..<reserved-2A6FF>
 2A700..2B734;ID # Lo [4149] CJK UNIFIED IDEOGRAPH-2A700..CJK UNIFIED IDEOGRAPH-2B734
@@ -3106,7 +3313,9 @@ FFFD;AI # So REPLACEMENT CHARACTER
 2B740..2B81D;ID # Lo [222] CJK UNIFIED IDEOGRAPH-2B740..CJK UNIFIED IDEOGRAPH-2B81D
 2B81E..2B81F;ID # Cn [2] <reserved-2B81E>..<reserved-2B81F>
 2B820..2CEA1;ID # Lo [5762] CJK UNIFIED IDEOGRAPH-2B820..CJK UNIFIED IDEOGRAPH-2CEA1
-2CEA2..2F7FF;ID # Cn [10590] <reserved-2CEA2>..<reserved-2F7FF>
+2CEA2..2CEAF;ID # Cn [14] <reserved-2CEA2>..<reserved-2CEAF>
+2CEB0..2EBE0;ID # Lo [7473] CJK UNIFIED IDEOGRAPH-2CEB0..CJK UNIFIED IDEOGRAPH-2EBE0
+2EBE1..2F7FF;ID # Cn [3103] <reserved-2EBE1>..<reserved-2F7FF>
 2F800..2FA1D;ID # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D
 2FA1E..2FFFD;ID # Cn [1504] <reserved-2FA1E>..<reserved-2FFFD>
 30000..3FFFD;ID # Cn [65534] <reserved-30000>..<reserved-3FFFD> |