When running the string to UTF-8 byte array tests in the Google Closure library, the input string is:
\u0000\u007F\u0080\u07FF\u0800\uFFFF
This string is expected to be converted into the following array:
[0x00, 0x7F, 0xC2, 0x80, 0xDF, 0xBF, 0xE0, 0xA0, 0x80, 0xEF, 0xBF, 0xBF]
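One way to cross-check the expected array is to encode the same string with the standard `TextEncoder` API, available in modern browsers and Node.js. This is only a minimal sketch and not the Closure library's own code; the names `input`, `expected`, `actual`, and `matches` are illustrative.

```javascript
// Cross-check with the platform's TextEncoder (NOT Closure's implementation).
const input = "\u0000\u007F\u0080\u07FF\u0800\uFFFF";

// Expected bytes, copied from the test expectation quoted above.
const expected = [0x00, 0x7F, 0xC2, 0x80, 0xDF, 0xBF, 0xE0, 0xA0, 0x80, 0xEF, 0xBF, 0xBF];

// TextEncoder always produces UTF-8; convert the Uint8Array to a plain array.
const actual = Array.from(new TextEncoder().encode(input));

// Compare length and every byte.
const matches =
  actual.length === expected.length &&
  actual.every((byte, i) => byte === expected[i]);

console.log(matches);
```

If `matches` is true, the platform encoder agrees with the array the Closure test expects for this input.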
When the same string is run through other JavaScript and TypeScript implementations of UTF-8 byte array conversion, some of them report that the string is invalid.
The string appears to cover the boundary code points at which the UTF-8 encoding grows from 1 byte to 2 bytes and from 2 bytes to 3 bytes.
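The boundaries can be illustrated with a small sketch that maps each code point to its UTF-8 length using the standard ranges (up to U+007F: 1 byte, up to U+07FF: 2 bytes, up to U+FFFF: 3 bytes, above that: 4 bytes). The helper `utf8Length` is a hypothetical name used for illustration, not a function from any library, and it performs no validity checks such as rejecting surrogates.

```javascript
// Hypothetical helper: number of UTF-8 bytes needed for one code point.
// It only applies the standard range boundaries and validates nothing.
function utf8Length(codePoint) {
  if (codePoint <= 0x7F) return 1;    // U+0000..U+007F
  if (codePoint <= 0x7FF) return 2;   // U+0080..U+07FF
  if (codePoint <= 0xFFFF) return 3;  // U+0800..U+FFFF
  return 4;                           // U+10000..U+10FFFF
}

const testString = "\u0000\u007F\u0080\u07FF\u0800\uFFFF";

// for...of iterates a string by code point.
const lengths = [];
for (const ch of testString) {
  lengths.push(utf8Length(ch.codePointAt(0)));
}

console.log(lengths);
```

Each pair of characters in the test string is the first and last code point of one length class, and the lengths sum to the 12 bytes in the expected array.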
The question remains: is Google's implementation correct, or are the other libraries right?