
UTS #46 Test Suite

Erik van der Poel, Markus Scherer and I were discussing having a test suite for UTS #46. Here are some thoughts we had about how it could be constructed.

The suite would be a table made up of a series of lines of the form:

source_string ; options ; ToASCII-result ; ToASCII-status ; ToUnicode-result ; ToUnicode-status # comment
  • source_string and results are space-separated code points (eg "0061 0062")
  • options are space-delimited options: "noSTD3", "Transitional", "noContextJ", "noBidi"
  • status is "error" on failure, or "ok" (or blank) on success
Example:

0061 0062 ; noBidi ; 0061 0062 ; ok ; 0061 0062 ; ok # ab
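
A reader for this format could be sketched as follows. This is only an illustration of the proposal above, not an agreed design: the names `TestLine`, `decode_code_points` and `parse_line` are made up for the example, and treating a blank status as "ok" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class TestLine:
    """One parsed line of the proposed table (hypothetical structure)."""
    source: str
    options: frozenset
    to_ascii: str
    to_ascii_status: str
    to_unicode: str
    to_unicode_status: str
    comment: str


def decode_code_points(field):
    """Turn space-separated hex code points, eg '0061 0062', into 'ab'."""
    return "".join(chr(int(cp, 16)) for cp in field.split())


def normalize_status(field):
    """Assumption: a blank status means success, the same as 'ok'."""
    status = field or "ok"
    if status not in ("ok", "error"):
        raise ValueError(f"unknown status: {field!r}")
    return status


def parse_line(line):
    # Everything after the first '#' is a comment; the data fields hold
    # only hex digits and option names, so '#' cannot occur inside them.
    data, _, comment = line.partition("#")
    fields = [f.strip() for f in data.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    source, options, a_res, a_stat, u_res, u_stat = fields
    return TestLine(
        source=decode_code_points(source),
        options=frozenset(options.split()),
        to_ascii=decode_code_points(a_res),
        to_ascii_status=normalize_status(a_stat),
        to_unicode=decode_code_points(u_res),
        to_unicode_status=normalize_status(u_stat),
        comment=comment.strip(),
    )
```

Applied to the example line, `parse_line` would give a `TestLine` whose `source` is "ab" and whose `options` contain only "noBidi".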

The source strings would come from the following:
  1. Add all of the examples that are used in the spec.
  2. For every place where an error condition could result, include one example that is just enough to trigger the error (eg a label of length 64), plus one that is just at the limit (eg a label of length 63).
  3. Randomly generate a large number of strings using interesting classes of characters, eg a mixture of representative characters from the classes (a)-(k) in http://unicode.org/draft/reports/tr46/tr46.html#Table_IDNA_Comparisons, both in normalized and denormalized form.
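
Items 2 and 3 could be generated along these lines. This is a rough sketch under stated assumptions: the function names are invented, `SAMPLE_POOL` is a placeholder rather than the actual classes (a)-(k) from the comparison table, and only the source field (plus, for the length cases, the expected ToASCII status) is produced. The expected results would still have to come from the spec or a reference implementation.

```python
import random
import unicodedata


def encode_code_points(text):
    """Turn 'ab' into space-separated hex code points, '0061 0062'."""
    return " ".join(f"{ord(ch):04X}" for ch in text)


def label_length_cases(limit=63):
    """Yield (source field, expected ToASCII status) for an ASCII label
    exactly at the length limit and for one that is one character over."""
    for length, status in ((limit, "ok"), (limit + 1, "error")):
        yield encode_code_points("a" * length), status


# Placeholder pool (an assumption, not the real classes (a)-(k)):
# a few ASCII letters, sharp s, a precomposed accented letter,
# ZERO WIDTH JOINER, and the label separators '.' and '-'.
SAMPLE_POOL = ["a", "A", "\u00DF", "\u00E9", "\u200D", ".", "-"]


def random_sources(count, max_len=8, seed=0):
    """Yield each random string twice: in NFC and in NFD form."""
    rng = random.Random(seed)  # fixed seed keeps the suite reproducible
    for _ in range(count):
        length = rng.randint(1, max_len)
        text = "".join(rng.choice(SAMPLE_POOL) for _ in range(length))
        for form in ("NFC", "NFD"):
            yield encode_code_points(unicodedata.normalize(form, text))
```

Seeding the generator is a deliberate choice here: a test suite that changes on every run would be hard to compare across implementations.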
