“UTF-8 or UTF-16 may fully supplant those two standards for encoding Chinese text.”
“UTF-8 bytes, or even UTF-16 double-bytes, are not Unicode codepoints.”
“Since the first two characters of a JSON text will always be ASCII characters [RFC0020], it is possible to determine whether an octet stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking at the pattern of nulls in the first four octets.”
“UTF-8, UTF-16, UTF-32 and their variants are ways of expressing Unicode using different rules for transforming bytes into characters.”
“That's because UTF-16 uses two bytes for most complex characters, while UTF-8 can”
“· Full Unicode support: everything you see in Sigil is in UTF-16”
“You've defined both a UTF-8 format and a UTF-16 format.”
“Are the first two bytes two characters in UTF-8 encoding? Or a single character in UTF-16 encoding?”
“· For the wchar_t character type the only valid value for this option is 'auto' and the encoding is automatically selected between UTF-16 and UTF-32, depending on the wchar_t type size.”
“Many text editors claim they support Unicode (UTF-7, UTF-8 and UTF-16), however, they only convert Unicode to the system encoding (or ANSI) internally when they open files, so they cannot actually edit characters that are not supported by the system encoding.”
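The quotation above about JSON text describes a detection rule that is easy to sketch in code. The Python function below is a minimal illustration, not a reference implementation: the name `detect_json_encoding` is hypothetical, and it assumes the input has no byte order mark and that its first two characters are ASCII, as the quotation states. Inputs shorter than four octets are simply treated as UTF-8, which is a simplification.

```python
def detect_json_encoding(data: bytes) -> str:
    """Guess the encoding of a JSON text from the nulls in its first four octets.

    Hypothetical helper: assumes no byte order mark and that the first two
    characters of the text are ASCII.
    """
    if len(data) < 4:
        # Too short to show the four-octet pattern; fall back to UTF-8.
        return "utf-8"

    # Record which of the first four octets are null.
    is_null = tuple(octet == 0 for octet in data[:4])

    patterns = {
        (True, True, True, False): "utf-32-be",    # 00 00 00 xx
        (False, True, True, True): "utf-32-le",    # xx 00 00 00
        (True, False, True, False): "utf-16-be",   # 00 xx 00 xx
        (False, True, False, True): "utf-16-le",   # xx 00 xx 00
    }
    # Any other pattern (typically no nulls at all) is treated as UTF-8.
    return patterns.get(is_null, "utf-8")


if __name__ == "__main__":
    sample = '{"a": 1}'
    for codec in ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"):
        print(codec, "->", detect_json_encoding(sample.encode(codec)))
```

The lookup table keeps the four null patterns side by side, so each one can be checked against the quoted rule at a glance.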