
Clarifying Compatibility Characters

L2/09-xxxx
Subject: Clarifying Compatibility Characters
Author: Mark Davis
Date: 2009-10-23
To: UTC


Ken and I had a disagreement in the editorial committee, and I'd like to resolve it in the UTC.

From my perspective, "compatibility characters" include those characters that we wouldn't have encoded in Unicode except that we had to. Examples include characters such as the Angstrom sign, which we needed in order to round-trip with another standard, but also characters that didn't come from any other standard and were instead the price we paid for avoiding other problems:
  • the Arabic ligatures, which we needed in order to achieve the merger with ISO 10646
  • the Tag characters, which we encoded to pre-emptively avoid a goofy variant of UTF-8 being proposed in the IETF.
  • U+206A INHIBIT SYMMETRIC SWAPPING
    ...
    U+206F NOMINAL DIGIT SHAPES
  • and so on.
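
As a rough illustration only, the sketch below uses Python's standard unicodedata module to print the character-database properties of one sample from each group above. The particular code points (U+212B, U+FEFB, U+E0001, U+206A) and the names SAMPLES, describe, and report are choices made for this example, not part of the proposal. The decomposition field is shown because it is one observable property; it is not offered as a definition of "compatibility character" under either interpretation.

```python
import unicodedata

# Illustrative samples, one per group mentioned above (assumed picks, not from the memo).
SAMPLES = {
    "Angstrom sign": "\u212B",
    "Arabic ligature": "\uFEFB",
    "Tag character": "\U000E0001",
    "Format control": "\u206A",
}


def describe(ch):
    """Collect a few Unicode Character Database properties for one character."""
    return {
        "code": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        # Empty string means the character has no decomposition mapping.
        "decomposition": unicodedata.decomposition(ch),
    }


report = {label: describe(ch) for label, ch in SAMPLES.items()}

for label, info in report.items():
    print(f"{label}: {info}")
```

In this sample the Angstrom sign and the Arabic ligature carry decomposition mappings, while the tag character and U+206A do not, so a reading based only on mappings to other characters would leave the latter two out.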
There is text in the standard to support either a broad or narrow interpretation:

p. 18
However, considerations of interoperability with other standards and systems [my bold] often require
that such compatibility characters be included in the Unicode Standard.

2.3
Conceptually, compatibility characters are characters that would not have been encoded in
the Unicode Standard except for compatibility and round-trip convertibility with other
standards.

I think the broader interpretation is both more useful and more reflective of the purpose of calling these things "compatibility characters", so I would prefer that we use that interpretation in the text.