From a98b541f26beb6d2ffcd0a720885224c17b54470 Mon Sep 17 00:00:00 2001
From: Konstantin Ritt
Date: Sun, 1 Nov 2015 08:15:29 +0400
Subject: Update Unicode data files to v8.0

Change-Id: I0aa368cb07353924031a9af4f0bdc33692eb1053
Reviewed-by: Lars Knoll
---
 util/unicode/data/SentenceBreakProperty.txt | 90 +++++++++++++++++++++--------
 1 file changed, 65 insertions(+), 25 deletions(-)

(limited to 'util/unicode/data/SentenceBreakProperty.txt')

diff --git a/util/unicode/data/SentenceBreakProperty.txt b/util/unicode/data/SentenceBreakProperty.txt
index 19752103f9..8dd1abff0f 100644
--- a/util/unicode/data/SentenceBreakProperty.txt
+++ b/util/unicode/data/SentenceBreakProperty.txt
@@ -1,8 +1,8 @@
-# SentenceBreakProperty-7.0.0.txt
-# Date: 2014-02-19, 15:51:38 GMT [MD]
+# SentenceBreakProperty-8.0.0.txt
+# Date: 2015-03-11, 22:29:43 GMT [MD]
 #
 # Unicode Character Database
-# Copyright (c) 1991-2014 Unicode, Inc.
+# Copyright (c) 1991-2015 Unicode, Inc.
 # For terms of use, see http://www.unicode.org/terms_of_use.html
 # For documentation, see http://www.unicode.org/reports/tr44/
@@ -53,7 +53,7 @@
 0825..0827 ; Extend # Mn [3] SAMARITAN VOWEL SIGN SHORT A..SAMARITAN VOWEL SIGN U
 0829..082D ; Extend # Mn [5] SAMARITAN VOWEL SIGN LONG I..SAMARITAN MARK NEQUDAA
 0859..085B ; Extend # Mn [3] MANDAIC AFFRICATION MARK..MANDAIC GEMINATION MARK
-08E4..0902 ; Extend # Mn [31] ARABIC CURLY FATHA..DEVANAGARI SIGN ANUSVARA
+08E3..0902 ; Extend # Mn [32] ARABIC TURNED DAMMA BELOW..DEVANAGARI SIGN ANUSVARA
 0903 ; Extend # Mc DEVANAGARI SIGN VISARGA
 093A ; Extend # Mn DEVANAGARI VOWEL SIGN OE
 093B ; Extend # Mc DEVANAGARI VOWEL SIGN OOE
@@ -216,8 +216,6 @@
 1932 ; Extend # Mn LIMBU SMALL LETTER ANUSVARA
 1933..1938 ; Extend # Mc [6] LIMBU SMALL LETTER TA..LIMBU SMALL LETTER LA
 1939..193B ; Extend # Mn [3] LIMBU SIGN MUKPHRENG..LIMBU SIGN SA-I
-19B0..19C0 ; Extend # Mc [17] NEW TAI LUE VOWEL SIGN VOWEL SHORTENER..NEW TAI LUE VOWEL SIGN IY
-19C8..19C9 ; Extend # Mc [2] NEW TAI LUE TONE MARK-1..NEW TAI LUE TONE MARK-2
 1A17..1A18 ; Extend # Mn [2] BUGINESE VOWEL SIGN I..BUGINESE VOWEL SIGN U
 1A19..1A1A ; Extend # Mc [2] BUGINESE VOWEL SIGN E..BUGINESE VOWEL SIGN O
 1A1B ; Extend # Mn BUGINESE VOWEL SIGN AE
@@ -291,7 +289,7 @@
 A66F ; Extend # Mn COMBINING CYRILLIC VZMET
 A670..A672 ; Extend # Me [3] COMBINING CYRILLIC TEN MILLIONS SIGN..COMBINING CYRILLIC THOUSAND MILLIONS SIGN
 A674..A67D ; Extend # Mn [10] COMBINING CYRILLIC LETTER UKRAINIAN IE..COMBINING CYRILLIC PAYEROK
-A69F ; Extend # Mn COMBINING CYRILLIC LETTER IOTIFIED E
+A69E..A69F ; Extend # Mn [2] COMBINING CYRILLIC LETTER EF..COMBINING CYRILLIC LETTER IOTIFIED E
 A6F0..A6F1 ; Extend # Mn [2] BAMUM COMBINING MARK KOQNDON..BAMUM COMBINING MARK TUKWENTIS
 A802 ; Extend # Mn SYLOTI NAGRI SIGN DVISVARA
 A806 ; Extend # Mn SYLOTI NAGRI SIGN HASANTA
@@ -345,7 +343,7 @@ ABEC ; Extend # Mc MEETEI MAYEK LUM IYEK
 ABED ; Extend # Mn MEETEI MAYEK APUN IYEK
 FB1E ; Extend # Mn HEBREW POINT JUDEO-SPANISH VARIKA
 FE00..FE0F ; Extend # Mn [16] VARIATION SELECTOR-1..VARIATION SELECTOR-16
-FE20..FE2D ; Extend # Mn [14] COMBINING LIGATURE LEFT HALF..COMBINING CONJOINING MACRON BELOW
+FE20..FE2F ; Extend # Mn [16] COMBINING LIGATURE LEFT HALF..COMBINING CYRILLIC TITLO RIGHT HALF
 FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK
 101FD ; Extend # Mn PHAISTOS DISC SIGN COMBINING OBLIQUE STROKE
 102E0 ; Extend # Mn COPTIC EPACT THOUSANDS MARK
@@ -376,6 +374,7 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT
 111B3..111B5 ; Extend # Mc [3] SHARADA VOWEL SIGN AA..SHARADA VOWEL SIGN II
 111B6..111BE ; Extend # Mn [9] SHARADA VOWEL SIGN U..SHARADA VOWEL SIGN O
 111BF..111C0 ; Extend # Mc [2] SHARADA VOWEL SIGN AU..SHARADA SIGN VIRAMA
+111CA..111CC ; Extend # Mn [3] SHARADA SIGN NUKTA..SHARADA EXTRA SHORT VOWEL MARK
 1122C..1122E ; Extend # Mc [3] KHOJKI VOWEL SIGN AA..KHOJKI VOWEL SIGN II
 1122F..11231 ; Extend # Mn [3] KHOJKI VOWEL SIGN U..KHOJKI VOWEL SIGN AI
 11232..11233 ; Extend # Mc [2] KHOJKI VOWEL SIGN O..KHOJKI VOWEL SIGN AU
@@ -385,7 +384,7 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT
 112DF ; Extend # Mn KHUDAWADI SIGN ANUSVARA
 112E0..112E2 ; Extend # Mc [3] KHUDAWADI VOWEL SIGN AA..KHUDAWADI VOWEL SIGN II
 112E3..112EA ; Extend # Mn [8] KHUDAWADI VOWEL SIGN U..KHUDAWADI SIGN VIRAMA
-11301 ; Extend # Mn GRANTHA SIGN CANDRABINDU
+11300..11301 ; Extend # Mn [2] GRANTHA SIGN COMBINING ANUSVARA ABOVE..GRANTHA SIGN CANDRABINDU
 11302..11303 ; Extend # Mc [2] GRANTHA SIGN ANUSVARA..GRANTHA SIGN VISARGA
 1133C ; Extend # Mn GRANTHA SIGN NUKTA
 1133E..1133F ; Extend # Mc [2] GRANTHA VOWEL SIGN AA..GRANTHA VOWEL SIGN I
@@ -411,6 +410,7 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT
 115BC..115BD ; Extend # Mn [2] SIDDHAM SIGN CANDRABINDU..SIDDHAM SIGN ANUSVARA
 115BE ; Extend # Mc SIDDHAM SIGN VISARGA
 115BF..115C0 ; Extend # Mn [2] SIDDHAM SIGN VIRAMA..SIDDHAM SIGN NUKTA
+115DC..115DD ; Extend # Mn [2] SIDDHAM VOWEL SIGN ALTERNATE U..SIDDHAM VOWEL SIGN ALTERNATE UU
 11630..11632 ; Extend # Mc [3] MODI VOWEL SIGN AA..MODI VOWEL SIGN II
 11633..1163A ; Extend # Mn [8] MODI VOWEL SIGN U..MODI VOWEL SIGN AI
 1163B..1163C ; Extend # Mc [2] MODI VOWEL SIGN O..MODI VOWEL SIGN AU
@@ -424,6 +424,11 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT
 116B0..116B5 ; Extend # Mn [6] TAKRI VOWEL SIGN U..TAKRI VOWEL SIGN AU
 116B6 ; Extend # Mc TAKRI SIGN VIRAMA
 116B7 ; Extend # Mn TAKRI SIGN NUKTA
+1171D..1171F ; Extend # Mn [3] AHOM CONSONANT SIGN MEDIAL LA..AHOM CONSONANT SIGN MEDIAL LIGATING RA
+11720..11721 ; Extend # Mc [2] AHOM VOWEL SIGN A..AHOM VOWEL SIGN AA
+11722..11725 ; Extend # Mn [4] AHOM VOWEL SIGN I..AHOM VOWEL SIGN UU
+11726 ; Extend # Mc AHOM VOWEL SIGN E
+11727..1172B ; Extend # Mn [5] AHOM VOWEL SIGN AW..AHOM SIGN KILLER
 16AF0..16AF4 ; Extend # Mn [5] BASSA VAH COMBINING HIGH TONE..BASSA VAH COMBINING HIGH-LOW TONE
 16B30..16B36 ; Extend # Mn [7] PAHAWH HMONG MARK CIM TUB..PAHAWH HMONG MARK CIM TAUM
 16F51..16F7E ; Extend # Mc [46] MIAO SIGN ASPIRATION..MIAO VOWEL SIGN NG
@@ -436,10 +441,16 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT
 1D185..1D18B ; Extend # Mn [7] MUSICAL SYMBOL COMBINING DOIT..MUSICAL SYMBOL COMBINING TRIPLE TONGUE
 1D1AA..1D1AD ; Extend # Mn [4] MUSICAL SYMBOL COMBINING DOWN BOW..MUSICAL SYMBOL COMBINING SNAP PIZZICATO
 1D242..1D244 ; Extend # Mn [3] COMBINING GREEK MUSICAL TRISEME..COMBINING GREEK MUSICAL PENTASEME
+1DA00..1DA36 ; Extend # Mn [55] SIGNWRITING HEAD RIM..SIGNWRITING AIR SUCKING IN
+1DA3B..1DA6C ; Extend # Mn [50] SIGNWRITING MOUTH CLOSED NEUTRAL..SIGNWRITING EXCITEMENT
+1DA75 ; Extend # Mn SIGNWRITING UPPER BODY TILTING FROM HIP JOINTS
+1DA84 ; Extend # Mn SIGNWRITING LOCATION HEAD NECK
+1DA9B..1DA9F ; Extend # Mn [5] SIGNWRITING FILL MODIFIER-2..SIGNWRITING FILL MODIFIER-6
+1DAA1..1DAAF ; Extend # Mn [15] SIGNWRITING ROTATION MODIFIER-2..SIGNWRITING ROTATION MODIFIER-16
 1E8D0..1E8D6 ; Extend # Mn [7] MENDE KIKAKUI COMBINING NUMBER TEENS..MENDE KIKAKUI COMBINING NUMBER MILLIONS
 E0100..E01EF ; Extend # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

-# Total code points: 1834
+# Total code points: 1967

 # ================================================
@@ -764,6 +775,7 @@ E0020..E007F ; Format # Cf [96] TAG SPACE..CANCEL TAG
 052D ; Lower # L& CYRILLIC SMALL LETTER DCHE
 052F ; Lower # L& CYRILLIC SMALL LETTER EL WITH DESCENDER
 0561..0587 ; Lower # L& [39] ARMENIAN SMALL LETTER AYB..ARMENIAN SMALL LIGATURE ECH YIWN
+13F8..13FD ; Lower # L& [6] CHEROKEE SMALL LETTER YE..CHEROKEE SMALL LETTER MV
 1D00..1D2B ; Lower # L& [44] LATIN LETTER SMALL CAPITAL A..CYRILLIC LETTER SMALL CAPITAL EL
 1D2C..1D6A ; Lower # Lm [63] MODIFIER LETTER CAPITAL A..GREEK SUBSCRIPT SMALL LETTER CHI
 1D6B..1D77 ; Lower # L& [13] LATIN SMALL LETTER UE..LATIN SMALL LETTER TURNED G
@@ -1094,15 +1106,19 @@ A7A3 ; Lower # L& LATIN SMALL LETTER K WITH OBLIQUE STROKE
 A7A5 ; Lower # L& LATIN SMALL LETTER N WITH OBLIQUE STROKE
 A7A7 ; Lower # L& LATIN SMALL LETTER R WITH OBLIQUE STROKE
 A7A9 ; Lower # L& LATIN SMALL LETTER S WITH OBLIQUE STROKE
+A7B5 ; Lower # L& LATIN SMALL LETTER BETA
+A7B7 ; Lower # L& LATIN SMALL LETTER OMEGA
 A7F8..A7F9 ; Lower # Lm [2] MODIFIER LETTER CAPITAL H WITH STROKE..MODIFIER LETTER SMALL LIGATURE OE
 A7FA ; Lower # L& LATIN LETTER SMALL CAPITAL TURNED M
 AB30..AB5A ; Lower # L& [43] LATIN SMALL LETTER BARRED ALPHA..LATIN SMALL LETTER Y WITH SHORT RIGHT LEG
 AB5C..AB5F ; Lower # Lm [4] MODIFIER LETTER SMALL HENG..MODIFIER LETTER SMALL U WITH LEFT HOOK
-AB64..AB65 ; Lower # L& [2] LATIN SMALL LETTER INVERTED ALPHA..GREEK LETTER SMALL CAPITAL OMEGA
+AB60..AB65 ; Lower # L& [6] LATIN SMALL LETTER SAKHA YAT..GREEK LETTER SMALL CAPITAL OMEGA
+AB70..ABBF ; Lower # L& [80] CHEROKEE SMALL LETTER A..CHEROKEE SMALL LETTER YA
 FB00..FB06 ; Lower # L& [7] LATIN SMALL LIGATURE FF..LATIN SMALL LIGATURE ST
 FB13..FB17 ; Lower # L& [5] ARMENIAN SMALL LIGATURE MEN NOW..ARMENIAN SMALL LIGATURE MEN XEH
 FF41..FF5A ; Lower # L& [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN SMALL LETTER Z
 10428..1044F ; Lower # L& [40] DESERET SMALL LETTER LONG I..DESERET SMALL LETTER EW
+10CC0..10CF2 ; Lower # L& [51] OLD HUNGARIAN SMALL LETTER A..OLD HUNGARIAN SMALL LETTER US
 118C0..118DF ; Lower # L& [32] WARANG CITI SMALL LETTER NGAA..WARANG CITI SMALL LETTER VIYO
 1D41A..1D433 ; Lower # L& [26] MATHEMATICAL BOLD SMALL A..MATHEMATICAL BOLD SMALL Z
 1D44E..1D454 ; Lower # L& [7] MATHEMATICAL ITALIC SMALL A..MATHEMATICAL ITALIC SMALL G
@@ -1133,7 +1149,7 @@ FF41..FF5A ; Lower # L& [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN
 1D7C4..1D7C9 ; Lower # L& [6] MATHEMATICAL SANS-SERIF BOLD ITALIC EPSILON SYMBOL..MATHEMATICAL SANS-SERIF BOLD ITALIC PI SYMBOL
 1D7CB ; Lower # L& MATHEMATICAL BOLD SMALL DIGAMMA

-# Total code points: 2029
+# Total code points: 2172

 # ================================================
@@ -1412,6 +1428,7 @@ FF41..FF5A ; Lower # L& [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN
 10A0..10C5 ; Upper # L& [38] GEORGIAN CAPITAL LETTER AN..GEORGIAN CAPITAL LETTER HOE
 10C7 ; Upper # L& GEORGIAN CAPITAL LETTER YN
 10CD ; Upper # L& GEORGIAN CAPITAL LETTER AEN
+13A0..13F5 ; Upper # L& [86] CHEROKEE LETTER A..CHEROKEE LETTER MV
 1E00 ; Upper # L& LATIN CAPITAL LETTER A WITH RING BELOW
 1E02 ; Upper # L& LATIN CAPITAL LETTER B WITH DOT ABOVE
 1E04 ; Upper # L& LATIN CAPITAL LETTER B WITH DOT BELOW
@@ -1729,9 +1746,11 @@ A7A4 ; Upper # L& LATIN CAPITAL LETTER N WITH OBLIQUE STROKE
 A7A6 ; Upper # L& LATIN CAPITAL LETTER R WITH OBLIQUE STROKE
 A7A8 ; Upper # L& LATIN CAPITAL LETTER S WITH OBLIQUE STROKE
 A7AA..A7AD ; Upper # L& [4] LATIN CAPITAL LETTER H WITH HOOK..LATIN CAPITAL LETTER L WITH BELT
-A7B0..A7B1 ; Upper # L& [2] LATIN CAPITAL LETTER TURNED K..LATIN CAPITAL LETTER TURNED T
+A7B0..A7B4 ; Upper # L& [5] LATIN CAPITAL LETTER TURNED K..LATIN CAPITAL LETTER BETA
+A7B6 ; Upper # L& LATIN CAPITAL LETTER OMEGA
 FF21..FF3A ; Upper # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LATIN CAPITAL LETTER Z
 10400..10427 ; Upper # L& [40] DESERET CAPITAL LETTER LONG I..DESERET CAPITAL LETTER EW
+10C80..10CB2 ; Upper # L& [51] OLD HUNGARIAN CAPITAL LETTER A..OLD HUNGARIAN CAPITAL LETTER US
 118A0..118BF ; Upper # L& [32] WARANG CITI CAPITAL LETTER NGAA..WARANG CITI CAPITAL LETTER VIYO
 1D400..1D419 ; Upper # L& [26] MATHEMATICAL BOLD CAPITAL A..MATHEMATICAL BOLD CAPITAL Z
 1D434..1D44D ; Upper # L& [26] MATHEMATICAL ITALIC CAPITAL A..MATHEMATICAL ITALIC CAPITAL Z
@@ -1768,7 +1787,7 @@ FF21..FF3A ; Upper # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LAT
 1F150..1F169 ; Upper # So [26] NEGATIVE CIRCLED LATIN CAPITAL LETTER A..NEGATIVE CIRCLED LATIN CAPITAL LETTER Z
 1F170..1F189 ; Upper # So [26] NEGATIVE SQUARED LATIN CAPITAL LETTER A..NEGATIVE SQUARED LATIN CAPITAL LETTER Z

-# Total code points: 1641
+# Total code points: 1782

 # ================================================
@@ -1806,7 +1825,7 @@ FF21..FF3A ; Upper # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LAT
 0824 ; OLetter # Lm SAMARITAN MODIFIER LETTER SHORT A
 0828 ; OLetter # Lm SAMARITAN MODIFIER LETTER I
 0840..0858 ; OLetter # Lo [25] MANDAIC LETTER HALQA..MANDAIC LETTER AIN
-08A0..08B2 ; OLetter # Lo [19] ARABIC LETTER BEH WITH SMALL V BELOW..ARABIC LETTER ZAIN WITH INVERTED V ABOVE
+08A0..08B4 ; OLetter # Lo [21] ARABIC LETTER BEH WITH SMALL V BELOW..ARABIC LETTER KAF WITH DOT BELOW
 0904..0939 ; OLetter # Lo [54] DEVANAGARI LETTER SHORT A..DEVANAGARI LETTER HA
 093D ; OLetter # Lo DEVANAGARI SIGN AVAGRAHA
 0950 ; OLetter # Lo DEVANAGARI OM
@@ -1843,6 +1862,7 @@ FF21..FF3A ; Upper # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LAT
 0ABD ; OLetter # Lo GUJARATI SIGN AVAGRAHA
 0AD0 ; OLetter # Lo GUJARATI OM
 0AE0..0AE1 ; OLetter # Lo [2] GUJARATI LETTER VOCALIC RR..GUJARATI LETTER VOCALIC LL
+0AF9 ; OLetter # Lo GUJARATI LETTER ZHA
 0B05..0B0C ; OLetter # Lo [8] ORIYA LETTER A..ORIYA LETTER VOCALIC L
 0B0F..0B10 ; OLetter # Lo [2] ORIYA LETTER E..ORIYA LETTER AI
 0B13..0B28 ; OLetter # Lo [22] ORIYA LETTER O..ORIYA LETTER NA
@@ -1869,7 +1889,7 @@ FF21..FF3A ; Upper # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LAT
 0C12..0C28 ; OLetter # Lo [23] TELUGU LETTER O..TELUGU LETTER NA
 0C2A..0C39 ; OLetter # Lo [16] TELUGU LETTER PA..TELUGU LETTER HA
 0C3D ; OLetter # Lo TELUGU SIGN AVAGRAHA
-0C58..0C59 ; OLetter # Lo [2] TELUGU LETTER TSA..TELUGU LETTER DZA
+0C58..0C5A ; OLetter # Lo [3] TELUGU LETTER TSA..TELUGU LETTER RRRA
 0C60..0C61 ; OLetter # Lo [2] TELUGU LETTER VOCALIC RR..TELUGU LETTER VOCALIC LL
 0C85..0C8C ; OLetter # Lo [8] KANNADA LETTER A..KANNADA LETTER VOCALIC L
 0C8E..0C90 ; OLetter # Lo [3] KANNADA LETTER E..KANNADA LETTER AI
@@ -1885,7 +1905,7 @@ FF21..FF3A ; Upper # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LAT
 0D12..0D3A ; OLetter # Lo [41] MALAYALAM LETTER O..MALAYALAM LETTER TTTA
 0D3D ; OLetter # Lo MALAYALAM SIGN AVAGRAHA
 0D4E ; OLetter # Lo MALAYALAM LETTER DOT REPH
-0D60..0D61 ; OLetter # Lo [2] MALAYALAM LETTER VOCALIC RR..MALAYALAM LETTER VOCALIC LL
+0D5F..0D61 ; OLetter # Lo [3] MALAYALAM LETTER ARCHAIC II..MALAYALAM LETTER VOCALIC LL
 0D7A..0D7F ; OLetter # Lo [6] MALAYALAM LETTER CHILLU NN..MALAYALAM LETTER CHILLU K
 0D85..0D96 ; OLetter # Lo [18] SINHALA LETTER AYANNA..SINHALA LETTER AUYANNA
 0D9A..0DB1 ; OLetter # Lo [24] SINHALA LETTER ALPAPRAANA KAYANNA..SINHALA LETTER DANTAJA NAYANNA
@@ -1945,7 +1965,6 @@ FF21..FF3A ; Upper # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LAT
 1312..1315 ; OLetter # Lo [4] ETHIOPIC SYLLABLE GWI..ETHIOPIC SYLLABLE GWE
 1318..135A ; OLetter # Lo [67] ETHIOPIC SYLLABLE GGA..ETHIOPIC SYLLABLE FYA
 1380..138F ; OLetter # Lo [16] ETHIOPIC SYLLABLE SEBATBEIT MWA..ETHIOPIC SYLLABLE PWE
-13A0..13F4 ; OLetter # Lo [85] CHEROKEE LETTER A..CHEROKEE LETTER YV
 1401..166C ; OLetter # Lo [620] CANADIAN SYLLABICS E..CANADIAN SYLLABICS CARRIER TTSA
 166F..167F ; OLetter # Lo [17] CANADIAN SYLLABICS QAI..CANADIAN SYLLABICS BLACKFOOT W
 1681..169A ; OLetter # Lo [26] OGHAM LETTER BEITH..OGHAM LETTER PEITH
@@ -1971,7 +1990,7 @@ FF21..FF3A ; Upper # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LAT
 1950..196D ; OLetter # Lo [30] TAI LE LETTER KA..TAI LE LETTER AI
 1970..1974 ; OLetter # Lo [5] TAI LE LETTER TONE-2..TAI LE LETTER TONE-6
 1980..19AB ; OLetter # Lo [44] NEW TAI LUE LETTER HIGH QA..NEW TAI LUE LETTER LOW SUA
-19C1..19C7 ; OLetter # Lo [7] NEW TAI LUE LETTER FINAL V..NEW TAI LUE LETTER FINAL B
+19B0..19C9 ; OLetter # Lo [26] NEW TAI LUE VOWEL SIGN VOWEL SHORTENER..NEW TAI LUE TONE MARK-2
 1A00..1A16 ; OLetter # Lo [23] BUGINESE LETTER KA..BUGINESE LETTER HA
 1A20..1A54 ; OLetter # Lo [53] TAI THAM LETTER HIGH KA..TAI THAM LETTER GREAT SA
 1AA7 ; OLetter # Lm TAI THAM SIGN MAI YAMOK
@@ -2021,7 +2040,7 @@ FF21..FF3A ; Upper # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LAT
 31A0..31BA ; OLetter # Lo [27] BOPOMOFO LETTER BU..BOPOMOFO LETTER ZY
 31F0..31FF ; OLetter # Lo [16] KATAKANA LETTER SMALL KU..KATAKANA LETTER SMALL RO
 3400..4DB5 ; OLetter # Lo [6582] CJK UNIFIED IDEOGRAPH-3400..CJK UNIFIED IDEOGRAPH-4DB5
-4E00..9FCC ; OLetter # Lo [20941] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FCC
+4E00..9FD5 ; OLetter # Lo [20950] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FD5
 A000..A014 ; OLetter # Lo [21] YI SYLLABLE IT..YI SYLLABLE E
 A015 ; OLetter # Lm YI SYLLABLE WU
 A016..A48C ; OLetter # Lo [1143] YI SYLLABLE BIT..YI SYLLABLE YYR
@@ -2037,6 +2056,7 @@ A6A0..A6E5 ; OLetter # Lo [70] BAMUM LETTER A..BAMUM LETTER KI
 A6E6..A6EF ; OLetter # Nl [10] BAMUM LETTER MO..BAMUM LETTER KOGHOM
 A717..A71F ; OLetter # Lm [9] MODIFIER LETTER DOT VERTICAL BAR..MODIFIER LETTER LOW INVERTED EXCLAMATION MARK
 A788 ; OLetter # Lm MODIFIER LETTER LOW CIRCUMFLEX ACCENT
+A78F ; OLetter # Lo LATIN LETTER SINOLOGICAL DOT
 A7F7 ; OLetter # Lo LATIN EPIGRAPHIC LETTER SIDEWAYS I
 A7FB..A801 ; OLetter # Lo [7] LATIN EPIGRAPHIC LETTER REVERSED F..SYLOTI NAGRI LETTER I
 A803..A805 ; OLetter # Lo [3] SYLOTI NAGRI LETTER U..SYLOTI NAGRI LETTER O
@@ -2046,6 +2066,7 @@ A840..A873 ; OLetter # Lo [52] PHAGS-PA LETTER KA..PHAGS-PA LETTER CANDRABIN
 A882..A8B3 ; OLetter # Lo [50] SAURASHTRA LETTER A..SAURASHTRA LETTER LLA
 A8F2..A8F7 ; OLetter # Lo [6] DEVANAGARI SIGN SPACING CANDRABINDU..DEVANAGARI SIGN CANDRABINDU AVAGRAHA
 A8FB ; OLetter # Lo DEVANAGARI HEADSTROKE
+A8FD ; OLetter # Lo DEVANAGARI JAIN OM
 A90A..A925 ; OLetter # Lo [28] KAYAH LI LETTER KA..KAYAH LI LETTER OO
 A930..A946 ; OLetter # Lo [23] REJANG LETTER KA..REJANG LETTER A
 A960..A97C ; OLetter # Lo [29] HANGUL CHOSEONG TIKEUT-MIEUM..HANGUL CHOSEONG SSANGYEORINHIEUH
@@ -2140,6 +2161,8 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
 1083F..10855 ; OLetter # Lo [23] CYPRIOT SYLLABLE ZO..IMPERIAL ARAMAIC LETTER TAW
 10860..10876 ; OLetter # Lo [23] PALMYRENE LETTER ALEPH..PALMYRENE LETTER TAW
 10880..1089E ; OLetter # Lo [31] NABATAEAN LETTER FINAL ALEPH..NABATAEAN LETTER TAW
+108E0..108F2 ; OLetter # Lo [19] HATRAN LETTER ALEPH..HATRAN LETTER QOPH
+108F4..108F5 ; OLetter # Lo [2] HATRAN LETTER SHIN..HATRAN LETTER TAW
 10900..10915 ; OLetter # Lo [22] PHOENICIAN LETTER ALF..PHOENICIAN LETTER TAU
 10920..10939 ; OLetter # Lo [26] LYDIAN LETTER A..LYDIAN LETTER C
 10980..109B7 ; OLetter # Lo [56] MEROITIC HIEROGLYPHIC LETTER A..MEROITIC CURSIVE LETTER DA
@@ -2166,8 +2189,14 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
 11183..111B2 ; OLetter # Lo [48] SHARADA LETTER A..SHARADA LETTER HA
 111C1..111C4 ; OLetter # Lo [4] SHARADA SIGN AVAGRAHA..SHARADA OM
 111DA ; OLetter # Lo SHARADA EKAM
+111DC ; OLetter # Lo SHARADA HEADSTROKE
 11200..11211 ; OLetter # Lo [18] KHOJKI LETTER A..KHOJKI LETTER JJA
 11213..1122B ; OLetter # Lo [25] KHOJKI LETTER NYA..KHOJKI LETTER LLA
+11280..11286 ; OLetter # Lo [7] MULTANI LETTER A..MULTANI LETTER GA
+11288 ; OLetter # Lo MULTANI LETTER GHA
+1128A..1128D ; OLetter # Lo [4] MULTANI LETTER CA..MULTANI LETTER JJA
+1128F..1129D ; OLetter # Lo [15] MULTANI LETTER NYA..MULTANI LETTER BA
+1129F..112A8 ; OLetter # Lo [10] MULTANI LETTER BHA..MULTANI LETTER RHA
 112B0..112DE ; OLetter # Lo [47] KHUDAWADI LETTER A..KHUDAWADI LETTER HA
 11305..1130C ; OLetter # Lo [8] GRANTHA LETTER A..GRANTHA LETTER VOCALIC L
 1130F..11310 ; OLetter # Lo [2] GRANTHA LETTER EE..GRANTHA LETTER AI
@@ -2176,19 +2205,24 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
 11332..11333 ; OLetter # Lo [2] GRANTHA LETTER LA..GRANTHA LETTER LLA
 11335..11339 ; OLetter # Lo [5] GRANTHA LETTER VA..GRANTHA LETTER HA
 1133D ; OLetter # Lo GRANTHA SIGN AVAGRAHA
+11350 ; OLetter # Lo GRANTHA OM
 1135D..11361 ; OLetter # Lo [5] GRANTHA SIGN PLUTA..GRANTHA LETTER VOCALIC LL
 11480..114AF ; OLetter # Lo [48] TIRHUTA ANJI..TIRHUTA LETTER HA
 114C4..114C5 ; OLetter # Lo [2] TIRHUTA SIGN AVAGRAHA..TIRHUTA GVANG
 114C7 ; OLetter # Lo TIRHUTA OM
 11580..115AE ; OLetter # Lo [47] SIDDHAM LETTER A..SIDDHAM LETTER HA
+115D8..115DB ; OLetter # Lo [4] SIDDHAM LETTER THREE-CIRCLE ALTERNATE I..SIDDHAM LETTER ALTERNATE U
 11600..1162F ; OLetter # Lo [48] MODI LETTER A..MODI LETTER LLA
 11644 ; OLetter # Lo MODI SIGN HUVA
 11680..116AA ; OLetter # Lo [43] TAKRI LETTER A..TAKRI LETTER RRA
+11700..11719 ; OLetter # Lo [26] AHOM LETTER KA..AHOM LETTER JHA
 118FF ; OLetter # Lo WARANG CITI OM
 11AC0..11AF8 ; OLetter # Lo [57] PAU CIN HAU LETTER PA..PAU CIN HAU GLOTTAL STOP FINAL
-12000..12398 ; OLetter # Lo [921] CUNEIFORM SIGN A..CUNEIFORM SIGN UM TIMES ME
+12000..12399 ; OLetter # Lo [922] CUNEIFORM SIGN A..CUNEIFORM SIGN U U
 12400..1246E ; OLetter # Nl [111] CUNEIFORM NUMERIC SIGN TWO ASH..CUNEIFORM NUMERIC SIGN NINE U VARIANT FORM
+12480..12543 ; OLetter # Lo [196] CUNEIFORM SIGN AB TIMES NUN TENU..CUNEIFORM SIGN ZU5 TIMES THREE DISH TENU
 13000..1342E ; OLetter # Lo [1071] EGYPTIAN HIEROGLYPH A001..EGYPTIAN HIEROGLYPH AA032
+14400..14646 ; OLetter # Lo [583] ANATOLIAN HIEROGLYPH A001..ANATOLIAN HIEROGLYPH A530
 16800..16A38 ; OLetter # Lo [569] BAMUM LETTER PHASE-A NGKUE MFON..BAMUM LETTER PHASE-F VUEQ
 16A40..16A5E ; OLetter # Lo [31] MRO LETTER TA..MRO LETTER TEK
 16AD0..16AED ; OLetter # Lo [30] BASSA VAH LETTER ENNI..BASSA VAH LETTER I
@@ -2241,9 +2275,10 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
 20000..2A6D6 ; OLetter # Lo [42711] CJK UNIFIED IDEOGRAPH-20000..CJK UNIFIED IDEOGRAPH-2A6D6
 2A700..2B734 ; OLetter # Lo [4149] CJK UNIFIED IDEOGRAPH-2A700..CJK UNIFIED IDEOGRAPH-2B734
 2B740..2B81D ; OLetter # Lo [222] CJK UNIFIED IDEOGRAPH-2B740..CJK UNIFIED IDEOGRAPH-2B81D
+2B820..2CEA1 ; OLetter # Lo [5762] CJK UNIFIED IDEOGRAPH-2B820..CJK UNIFIED IDEOGRAPH-2CEA1
 2F800..2FA1D ; OLetter # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D

-# Total code points: 99420
+# Total code points: 106002

 # ================================================
@@ -2293,12 +2328,13 @@ ABF0..ABF9 ; Numeric # Nd [10] MEETEI MAYEK DIGIT ZERO..MEETEI MAYEK DIGIT N
 114D0..114D9 ; Numeric # Nd [10] TIRHUTA DIGIT ZERO..TIRHUTA DIGIT NINE
 11650..11659 ; Numeric # Nd [10] MODI DIGIT ZERO..MODI DIGIT NINE
 116C0..116C9 ; Numeric # Nd [10] TAKRI DIGIT ZERO..TAKRI DIGIT NINE
+11730..11739 ; Numeric # Nd [10] AHOM DIGIT ZERO..AHOM DIGIT NINE
 118E0..118E9 ; Numeric # Nd [10] WARANG CITI DIGIT ZERO..WARANG CITI DIGIT NINE
 16A60..16A69 ; Numeric # Nd [10] MRO DIGIT ZERO..MRO DIGIT NINE
 16B50..16B59 ; Numeric # Nd [10] PAHAWH HMONG DIGIT ZERO..PAHAWH HMONG DIGIT NINE
 1D7CE..1D7FF ; Numeric # Nd [50] MATHEMATICAL BOLD DIGIT ZERO..MATHEMATICAL MONOSPACE DIGIT NINE

-# Total code points: 532
+# Total code points: 542

 # ================================================
@@ -2358,18 +2394,22 @@ FF61 ; STerm # Po HALFWIDTH IDEOGRAPHIC FULL STOP
 11141..11143 ; STerm # Po [3] CHAKMA DANDA..CHAKMA QUESTION MARK
 111C5..111C6 ; STerm # Po [2] SHARADA DANDA..SHARADA DOUBLE DANDA
 111CD ; STerm # Po SHARADA SUTRA MARK
+111DE..111DF ; STerm # Po [2] SHARADA SECTION MARK-1..SHARADA SECTION MARK-2
 11238..11239 ; STerm # Po [2] KHOJKI DANDA..KHOJKI DOUBLE DANDA
 1123B..1123C ; STerm # Po [2] KHOJKI SECTION MARK..KHOJKI DOUBLE SECTION MARK
+112A9 ; STerm # Po MULTANI SECTION MARK
 115C2..115C3 ; STerm # Po [2] SIDDHAM DANDA..SIDDHAM DOUBLE DANDA
-115C9 ; STerm # Po SIDDHAM END OF TEXT MARK
+115C9..115D7 ; STerm # Po [15] SIDDHAM END OF TEXT MARK..SIDDHAM SECTION MARK WITH CIRCLES AND FOUR ENCLOSURES
 11641..11642 ; STerm # Po [2] MODI DANDA..MODI DOUBLE DANDA
+1173C..1173E ; STerm # Po [3] AHOM SIGN SMALL SECTION..AHOM SIGN RULAI
 16A6E..16A6F ; STerm # Po [2] MRO DANDA..MRO DOUBLE DANDA
 16AF5 ; STerm # Po BASSA VAH FULL STOP
 16B37..16B38 ; STerm # Po [2] PAHAWH HMONG SIGN VOS THOM..PAHAWH HMONG SIGN VOS TSHAB CEEB
 16B44 ; STerm # Po PAHAWH HMONG SIGN XAUS
 1BC9F ; STerm # Po DUPLOYAN PUNCTUATION CHINOOK FULL STOP
+1DA88 ; STerm # Po SIGNWRITING FULL STOP

-# Total code points: 96
+# Total code points: 117

 # ================================================
--
cgit v1.2.3