*TODO Misc property proposal

1. Possible stability policy invariant

    • Z = whitespace - [\u0009-\u000D \u0085]
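
The invariant above can be spot-checked against whatever Unicode version a given Python build ships. The sketch below is only illustrative and rests on two assumptions: that "Z" means General_Category Zs, Zl, or Zp, and that the White_Space set is the one transcribed by hand here from PropList.txt (Unicode 6.3 or later, where U+180E is no longer White_Space). The names `WHITE_SPACE`, `EXCLUDED_CONTROLS`, `separator_code_points`, and `invariant_holds` are made up for this example.

```python
import sys
import unicodedata

# White_Space code points, transcribed by hand from PropList.txt
# (assumption: Unicode 6.3 or later).
WHITE_SPACE = (
    set(range(0x0009, 0x000E))
    | {0x0020, 0x0085, 0x00A0, 0x1680}
    | set(range(0x2000, 0x200B))
    | {0x2028, 0x2029, 0x202F, 0x205F, 0x3000}
)

# The control characters the invariant excludes: [\u0009-\u000D \u0085].
EXCLUDED_CONTROLS = set(range(0x0009, 0x000E)) | {0x0085}


def separator_code_points():
    """Collect every code point whose General_Category is Zs, Zl, or Zp."""
    return {
        cp
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)).startswith("Z")
    }


def invariant_holds():
    """Check Z == White_Space - [\\u0009-\\u000D \\u0085] for this Python's UCD."""
    return separator_code_points() == WHITE_SPACE - EXCLUDED_CONTROLS


if __name__ == "__main__":
    print(unicodedata.unidata_version, invariant_holds())
```

If the check fails on some future version, the set difference between the two sides shows which characters break the invariant.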

2. Add to Default_Ignorable_Code_Point

U+FFFC (  ) OBJECT REPLACEMENT CHARACTER

3. Change Scripts from Common to Specific for the following Other_Symbols

U+0CF1 ( ೱ ) KANNADA SIGN JIHVAMULIYA

U+0CF2 ( ೲ ) KANNADA SIGN UPADHMANIYA

U+FDFD ( ﷽ ) ARABIC LIGATURE BISMILLAH AR-RAHMAN AR-RAHEEM

About 1000 symbols have specific scripts:

http://unicode.org/cldr/utility/list-unicodeset.jsp?a=[[:s:]-[:script=common:]]

[༔ ፠ ᥀ ႞ ႟ ͵ ᎐-᎙ ҂ ؈ ؎ ؏ ۩ ߶ ৺ ୰ ௳-௸ ௺ ౿ ൹ ꠨-꠫ ༁-༃ ༓ ༕-༗ ༚-༟ ༴ ༶ ༸ ྾-࿅ ࿇-࿌ ࿎ ࿏ ᧠-᧿ ᭡-᭪ ᭴-᭼ ϶ ؆ ؇ ⳥-⳪ ⠀-⣿ ꒐-꓆ 𐅹-𐆉 𝈀-𝉁 𝉅 ؋ ৲ ৳ ૱ ௹ ៛ ۽ ۾ ⺦ ⺀ ⺄ ⺃ ⺂ ⺅-⺋ ⺁ ⺌ ⺍ ⺐⺎ ⺏⺑-⺓ ⺕ ⺔ ⺗ ⺖ ⺘ ⺙ ⺛-⺞ ⺠-⺥ ⺧-⺰ ⺲⺵ ⺱⺳ ⺴ ⺶-⺹ ⺻ ⺺ ⺼-⻂ ⻄ ⻃ ⻅-⻝ ⻟⻞ ⻠-⻲]

with decomposition (dt!=none):

[` ΄´ ΅῭῁ ᾽᾿῎῍῏ ῾῞῝῟ ῀ ㈀ ㈎ ㈁ ㈏ ㈂ ㈐ ㈃ ㈑ ㈄ ㈒ ㈅ ㈓ ㈆ ㈔ ㈇ ㈕ ㈝ ㈞ ㈈ ㈖ ㈜ ㈉ ㈗ ㈊ ㈘ ㈋ ㈙ ㈌ ㈚ ㈍ ㈛ ﬩ ﷼ ㉠ ㉮ ㉡ ㉯ ㉢ ㉰ ㉣ ㉱ ㉤ ㉲ ㉥ ㉳ ㉦ ㉴ ㉧ ㉵ ㉾ ㉨ ㉶ ㉽ ㉩ ㉷ ㉼ ㉪ ㉸ ㉫ ㉹ ㉬ ㉺ ㉭ ㉻ ㋐ ㌃ ㌀-㌂ ㋑ ㌄ ㌅ ㋒ ㌆ ㋓ ㌈ ㌇ ㋔ ㌊ ㌉ ㋕ ㌋-㌏ ㋖ ㌐-㌗ ㋗ ㌘-㌛ ㋘ ㌜ ㋙ ㌞ ㌝ ㋚ ㌟ ㌠ ㋛ ㌡ ㋜ ㋝ ㌢ ㌣ ㋞ ㋟ ㌤ ㋠-㋢ ㌥ ㋣ ㌦ ㌧ ㋤ ㌨ ㋥-㋨ ㌩ ㋩ ㌫-㌭ ㌪ ㋪ ㌮-㌱ ㋫ ㌲-㌵ ㋬ ㌻ ㌼ ㌶-㌺ ㋭ ㍁ ㍂ ㌽-㍀ ㋮ ㍃-㍇ ㋯ ㍈-㍊ ㋰ ㋱ ㍍ ㍋ ㍌ ㋲ ㋳ ㍎ ㍏ ㋴ ㍐ ㋵-㋷ ㍑ ㍒ ㋸ ㍔ ㍓ ㋹ ㍕ ㍖ ㋺ ㋻ ㍗ ㋼-㋾ ⼀-⽏ ⺟ ⽐-⿔ ⻳ ⿕]

4. We are inconsistent regarding manufactured symbols

㉠ is Script=Hangul, Other_Symbol

㋐ is Script=Katakana, Other_Symbol

⒜ is Script=Common, Other_Symbol

㊏ is Script=Common, Other_Symbol

㈹ is Script=Common, Other_Symbol

ⓐ is Script=Common, Other_Symbol

U+32CD ( ㋍ ) SQUARE ERG is Script=Common, Other_Symbol

U+3300 ( ㌀ ) SQUARE APAATO is Script=Katakana, Other_Symbol

U+337F ( ㍿ ) SQUARE CORPORATION is Script=Common, Other_Symbol

µ is Script=Common, Lowercase_Letter

U+1D400 ( 𝐀 ) MATHEMATICAL BOLD CAPITAL A is Script=Common, Uppercase_Letter

There is no principled reason why ㉠ or ㌀ should not have Script=Common, like the others. Either that, or the others should be given the appropriate specific scripts.
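
Part of the listing above can be reproduced with Python's standard `unicodedata` module, which exposes General_Category and the decomposition mapping. It does not expose the Script property, so the Script values would still have to come from Scripts.txt or the UnicodeSet utility. This is a rough sketch; `SAMPLES`, `decomposition_type`, and `describe` are names invented for the example.

```python
import unicodedata

# Sample characters taken from the list above.
SAMPLES = [
    "\u3260", "\u32D0", "\u249C", "\u328F", "\u3239", "\u24D0",
    "\u32CD", "\u3300", "\u337F", "\u00B5", "\U0001D400",
]


def decomposition_type(ch):
    """Return the compatibility tag (e.g. 'circle'), 'canonical', or ''."""
    decomp = unicodedata.decomposition(ch)
    if decomp.startswith("<"):
        # Compatibility mappings look like "<circle> 1100".
        return decomp[1:decomp.index(">")]
    return "canonical" if decomp else ""


def describe(ch):
    """Return (code point, name, General_Category, decomposition type)."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch, "<unnamed>"),
        unicodedata.category(ch),
        decomposition_type(ch),
    )


if __name__ == "__main__":
    for ch in SAMPLES:
        print(*describe(ch), sep="\t")
```

All of the samples carry a compatibility decomposition, which is one way to read "manufactured" in this item.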

About 1000 letters have Common Script, typically where they are functionally symbols:

http://unicode.org/cldr/utility/list-unicodeset.jsp?a=[[:Letter:]-[:^script=common:]]

[ـ ʹ ʺ ˆ-ˏ ˬ ꜗ-ꜟ ꞈ ː ˑ 〱-〵 ー ʻ ʽ ˀ ʼ ˮ ʾ ʿ ˁ ⸯ 〼 〆]

with decomposition:

[µʹℂℇℊ-ℓℕℙ-ℝℤℨℬℭℯ-ℱℳ-ℹ ℼ-ℿⅅ-ⅉー゙゚𝐀-𝑔𝑖-𝒜𝒞𝒟𝒢𝒥𝒦𝒩-𝒬 𝒮-𝒹𝒻𝒽-𝓃𝓅-𝔅𝔇-𝔊𝔍-𝔔𝔖-𝔜𝔞-𝔹 𝔻-𝔾𝕀-𝕄𝕆𝕊-𝕐𝕒-𝚥𝚨-𝛀𝛂-𝛚𝛜-𝛺 𝛼-𝜔𝜖-𝜴𝜶-𝝎𝝐-𝝮𝝰-𝞈𝞊-𝞨𝞪-𝟂 𝟄-𝟋]

5. I was using the subblocks (from NamesList) to help organize, and found the following:

Singular plus Plural forms.

In many cases, we have a subblock whose name is the singular form of a plural name used elsewhere. I'd suggest we always use the plural form for nouns or noun phrases, even if there is only a single character in the subblock. That is, "XXX symbol" => "XXX symbols". Adjectival phrases ("Arabic") would be left alone.

This would affect the following (plus perhaps more: I only checked singular forms where there was a regular plural also).

Additional consonant => Additional consonants

Additional letter => Additional letters

Circle => Circles

Circled Hangul syllable => Circled Hangul syllables

Consonant => Consonants

Currency symbol => Currency symbols

Dependent vowel sign => Dependent vowel signs

Double diacritic => Double diacritics

Extended Arabic letter => Extended Arabic letters

Format character => Format characters

Gender symbol => Gender symbols

Keyboard symbol => Keyboard symbols

Latin letter => Latin letters

Letter => Letters

Letterlike symbol => Letterlike symbols

Medievalist addition => Medievalist additions

Miscellaneous addition => Miscellaneous additions

Miscellaneous arrow => Miscellaneous arrows

Miscellaneous mark => Miscellaneous marks

Miscellaneous mathematical operator => Miscellaneous mathematical operators

Miscellaneous mathematical symbol => Miscellaneous mathematical symbols

Miscellaneous symbol => Miscellaneous symbols

Modifier letter => Modifier letters

Operator => Operators

Relation => Relations

Rest => Rests

Sign => Signs

Space => Spaces

Special => Specials

Special character extension => Special character extensions

Squared Latin abbreviation => Squared Latin abbreviations

Subjoined consonant => Subjoined consonants

Syllable => Syllables

Symbol => Symbols

Tamil symbol => Tamil symbols

Variant letterform => Variant letterforms

Vertical line operator => Vertical line operators

Vowel => Vowels
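
The check described above (singular forms for which a regular plural also exists) could be sketched roughly as follows. The function name `singular_with_plural` and the sample names are hypothetical; in practice the input would be the full set of subblock names extracted from NamesList.

```python
def singular_with_plural(subblock_names):
    """Return, sorted, each name N for which the regular plural N + 's' also occurs."""
    names = set(subblock_names)
    return sorted(n for n in names if n + "s" in names)


if __name__ == "__main__":
    # Hypothetical sample; "Arabic" and "Rests" have no counterpart and are skipped.
    sample = ["Currency symbol", "Currency symbols", "Arabic",
              "Letter", "Letters", "Rests"]
    print(singular_with_plural(sample))  # ['Currency symbol', 'Letter']
```

Because it only looks for a trailing "s", this misses singular names whose plural never occurs as a subblock name, which matches the caveat above that there are perhaps more cases.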

There are a number of cases where the subblock equals the block.

Singletons: If it is a singleton (the only subblock in the block), add more of a gloss.

Braille Patterns

CJK Radicals Supplement

CJK Strokes

Currency Symbols

Ideographic Description Characters

Kanbun

Kangxi Radicals

Small Form Variants

Variation Selectors

Yi Radicals

Yijing Hexagram Symbols

Non-Singletons: also change the subblock name, either by narrowing the description or by changing it to just "Miscellaneous". It doesn't make sense for block X to contain subblocks X and Y; that would imply both that Y's are X's and that Y's are not X's.

Block Elements

CJK Symbols And Punctuation

Combining Diacritical Marks For Symbols

Combining Half Marks

General Punctuation

Geometric Shapes

IPA Extensions

Letterlike Symbols

Miscellaneous Symbols

Miscellaneous Technical
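
The two lists above (singletons and non-singletons where the subblock name equals the block name) could be regenerated with something like the sketch below. It assumes the NamesList layout in which a block header is a tab-separated line starting with `@@` (start, block name, end) and a subblock header is a line starting with `@` followed by an empty field and the subblock name. `SAMPLE` is a hand-written illustration in that layout, not a verbatim excerpt of the real file, and the function names are invented for the example.

```python
# Hand-written sample in the assumed NamesList layout (not copied from the file).
SAMPLE = (
    "@@\t2800\tBraille Patterns\t28FF\n"
    "@\t\tBraille patterns\n"
    "2800\tBRAILLE PATTERN BLANK\n"
    "@@\t25A0\tGeometric Shapes\t25FF\n"
    "@\t\tGeometric shapes\n"
    "25A0\tBLACK SQUARE\n"
    "@\t\tControl code graphics\n"
    "25F0\tWHITE SQUARE WITH UPPER LEFT QUADRANT\n"
)


def subblocks_by_block(text):
    """Map each block name to the list of subblock names that follow it."""
    blocks = {}
    current = None
    for line in text.splitlines():
        fields = line.split("\t")
        if fields[0] == "@@" and len(fields) >= 3:
            current = fields[2]
            blocks[current] = []
        elif fields[0] == "@" and len(fields) >= 3 and current is not None:
            blocks[current].append(fields[2])
    return blocks


def classify(blocks):
    """Split blocks that have a same-named subblock into singletons and the rest."""
    singletons, non_singletons = [], []
    for block, subs in blocks.items():
        # Compare case-insensitively: block names are title case, subblocks are not.
        if any(sub.casefold() == block.casefold() for sub in subs):
            (singletons if len(subs) == 1 else non_singletons).append(block)
    return singletons, non_singletons


if __name__ == "__main__":
    print(classify(subblocks_by_block(SAMPLE)))
```

Other NamesList line types (notices, cross references, character entries) are simply ignored by this sketch, since only the two header kinds matter for the comparison.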