UTS #46 Test Suite
Erik van der Poel, Markus Scherer and I were discussing having a test suite for UTS #46. Here are some thoughts we had about how it could be constructed.
Have a table that includes a series of lines of the form:
source_string ; options ; ToASCII-result ; ToASCII-status ; ToUnicode-result ; ToUnicode-status # comment
source_string and the two results are space-separated hexadecimal code points (e.g. "0061 0062")
options is a space-delimited list drawn from: "noSTD3", "Transitional", "noContextJ", "noBidi"
status is "error" on failure; on success it is blank or "ok"
Example:
0061 0062 ; noBidi ; 0061 0062 ; ok ; 0061 0062 ; ok # ab
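As a rough illustration of how a line in this format could be read, here is a minimal Python sketch. The names IdnaTestCase, decode_code_points and parse_line are hypothetical and not part of any existing tool. It assumes that the comment starts at the first "#", that exactly six ";"-separated fields precede it, and that any status other than "error" (blank or "ok") means success.

```python
from dataclasses import dataclass


@dataclass
class IdnaTestCase:
    """One parsed test line (hypothetical structure, for illustration only)."""
    source: str
    options: frozenset
    to_ascii: str
    to_ascii_error: bool
    to_unicode: str
    to_unicode_error: bool
    comment: str


def decode_code_points(field):
    """Turn space-separated hex code points such as '0061 0062' into 'ab'."""
    return "".join(chr(int(cp, 16)) for cp in field.split())


def parse_line(line):
    """Parse one 'source ; options ; result ; status ; result ; status # comment' line."""
    # Assumption: everything after the first '#' is a free-form comment.
    data, _, comment = line.partition("#")
    fields = [f.strip() for f in data.split(";")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    source, options, a_result, a_status, u_result, u_status = fields
    return IdnaTestCase(
        source=decode_code_points(source),
        options=frozenset(options.split()),
        to_ascii=decode_code_points(a_result),
        # Assumption: only the literal word "error" marks a failure.
        to_ascii_error=(a_status.lower() == "error"),
        to_unicode=decode_code_points(u_result),
        to_unicode_error=(u_status.lower() == "error"),
        comment=comment.strip(),
    )


case = parse_line("0061 0062 ; noBidi ; 0061 0062 ; ok ; 0061 0062 ; ok # ab")
```

Storing the options as a set keeps the check for a given flag (for example "Transitional" in case.options) independent of the order in which the flags were written.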
The source strings would come from the following:
Add all of the examples that are used in the spec.
For every place where an error condition could arise, include one example that is just enough to trigger the error (e.g. a label of length 64) and one that is just within the limit (e.g. a label of length 63).
Randomly generate a large number of strings from interesting classes of characters, e.g. a mixture of representative characters from the classes (a)-(k) in http://unicode.org/draft/reports/tr46/tr46.html#Table_IDNA_Comparisons, in both normalized and denormalized form.
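The last two sources could be generated mechanically. The following Python sketch is only one possible approach. The names boundary_labels, random_sources, encode_code_points and SAMPLE_POOL are hypothetical, and the character pool is a small stand-in rather than the actual classes (a)-(k) from the table. It also assumes that NFC and NFD are acceptable as the "normalized" and "denormalized" forms, and that code points are written as at least four uppercase hex digits.

```python
import random
import unicodedata

MAX_LABEL_LENGTH = 63  # longest label that should not trigger a length error

# Stand-in pool (an assumption): a real suite would draw representative
# characters from each class in the IDNA comparison table instead.
SAMPLE_POOL = ["a", "A", "\u00DF", "\u03C2", "\u200C", "\u200D",
               "\u00E9", "\u3002", "-", "\uFF41"]


def boundary_labels(ch="a"):
    """Return a label exactly at the length limit and one just over it."""
    return ch * MAX_LABEL_LENGTH, ch * (MAX_LABEL_LENGTH + 1)


def random_sources(count, max_len=8, seed=0):
    """Generate `count` random strings, each emitted in NFC and in NFD form."""
    rng = random.Random(seed)  # fixed seed keeps the generated table reproducible
    sources = []
    for _ in range(count):
        length = rng.randint(1, max_len)
        text = "".join(rng.choice(SAMPLE_POOL) for _ in range(length))
        sources.append(unicodedata.normalize("NFC", text))
        sources.append(unicodedata.normalize("NFD", text))
    return sources


def encode_code_points(text):
    """Format a string as space-separated hex code points, e.g. 'ab' -> '0061 0062'."""
    return " ".join(f"{ord(c):04X}" for c in text)


at_limit, over_limit = boundary_labels()
generated = random_sources(5)
```

Each generated string would still have to be run through a reference implementation to fill in the result and status columns; this sketch only produces the source_string column.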