IDNA behavior

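The following compares how browsers normalize IDNA labels in the cases affected by two Unicode normalization corrigenda. For each case, a browser follows either the Unicode 3.2 or the Unicode 5.2 behavior.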

Corrigendum #4: Five Unihan Canonical Mapping Errors

Example

  • 2F868 (㛼) = xn--g22n
  • 3.2 normalization => xn--j74i = 2136A (𡍪)
  • 5.2 normalization => xn--snl = 36FC (㛼)
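
As a rough sketch, the two mappings can be compared with Python's standard unicodedata module, which ships the current Unicode database alongside a frozen 3.2.0 copy (unicodedata.ucd_3_2_0). This assumes CPython's 3.2.0 copy keeps the pre-corrigendum mapping for 2F868. The helper names nfc_both and code_points are hypothetical, chosen for illustration only.

```python
import unicodedata

def nfc_both(text):
    """Return (NFC using Unicode 3.2.0 data, NFC using current data)."""
    old = unicodedata.ucd_3_2_0.normalize("NFC", text)
    new = unicodedata.normalize("NFC", text)
    return old, new

def code_points(text):
    """Format a string as space-separated hex code points."""
    return " ".join(f"{ord(ch):04X}" for ch in text)

old, new = nfc_both("\U0002F868")
print("3.2 data:    ", code_points(old))
print("current data:", code_points(new))
```

If the two printed code points differ, the installed Python reproduces the split seen in the browsers above.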

Behavior

  • IE - 3.2
  • Chrome/Safari - 3.2
  • FF - 5.2

Corrigendum #5: Normalization Idempotency

Example

  • 1100 0300 1161 0323 (ᄀ̀ᅡ̣) = xn--ksa4ez54cela
  • 3.2 normalization => xn--ksa4ez795d = AC00 0300 0323 (가̣̀)
    => xn--ksa3e0795d = AC00 0323 0300 (가̣̀)
  • 5.2 normalization => xn--ksa4ez54cela = 1100 0300 1161 0323 (ᄀ̀ᅡ̣)
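
The instability of the 3.2 result can be checked by normalizing the intermediate string AC00 0300 0323 a second time. The sketch below uses Python's unicodedata.normalize, which follows the corrected composition rule and so cannot reproduce the faulty first pass itself. The intermediate string is copied from the example above rather than computed. The helper name is_nfc_stable is hypothetical.

```python
import unicodedata

def nfc(text):
    return unicodedata.normalize("NFC", text)

def is_nfc_stable(text):
    """True if text is already in NFC, i.e. normalizing leaves it unchanged."""
    return nfc(text) == text

# Source sequence from the example: 1100 0300 1161 0323
source = "\u1100\u0300\u1161\u0323"
# Output of the uncorrected 3.2 rule, taken from the example (not computed here)
first_pass_3_2 = "\uAC00\u0300\u0323"

print(is_nfc_stable(source))
print(is_nfc_stable(first_pass_3_2))
print([f"{ord(c):04X}" for c in nfc(first_pass_3_2)])
```

A second pass reorders the two combining marks, which is why applying 3.2 normalization twice gives a different label than applying it once.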

Behavior

  • IE - 5.2
  • Chrome/Safari - 3.2
  • FF - 3.2 -- applied twice!

The same thing happens with an Indic example:
  • 0BC6 0300 0BBE 0323 (ெ̀ா̣) = xn--ksa4e234a4a.com
  • 3.2 normalization =...
  • 5.2 normalization = ...
