Punycode
Punycode, defined in RFC 3492, is an instance of the general "Bootstring" algorithm that encodes Unicode strings into the limited character set supported by the Domain Name System (ASCII letters, digits and the hyphen). The encoding is used as part of IDNA, a system that enables internationalized domain names in all scripts supported by Unicode while placing the burden of translation entirely on the user application (e.g., a web browser).
The encoding is applied separately to each component (label) of a domain name that cannot be represented solely within the ASCII character set, and the reserved prefix 'xn--' is added to the resulting Punycode string. For example, bücher becomes bcher-kva in Punycode, so the domain name bücher.ch is represented as xn--bcher-kva.ch in IDNA.
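The per-label translation described above can be sketched with the "punycode" codec that ships with Python's standard library. The helper names to_ascii_label and to_ascii_domain are illustrative, not part of any standard, and the sketch deliberately skips the Nameprep normalisation that real IDNA processing performs first.

```python
def to_ascii_label(label: str) -> str:
    """Encode one domain label; ASCII-only labels pass through unchanged."""
    if label.isascii():
        return label
    # Assumption: the label is already normalised (no Nameprep is applied here).
    return "xn--" + label.encode("punycode").decode("ascii")


def to_ascii_domain(domain: str) -> str:
    """Apply the encoding separately to each dot-separated label."""
    return ".".join(to_ascii_label(label) for label in domain.split("."))


print("bücher".encode("punycode").decode("ascii"))  # bcher-kva
print(to_ascii_domain("bücher.ch"))                 # xn--bcher-kva.ch
```

For production use, an IDNA-aware library that also handles normalisation and length checks is a safer choice than this hand-rolled wrapper.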
Punycode is designed to work across all script systems, and to be self-optimising by adapting to the character sets within the string as it operates. It is optimised for the case where the string is composed of zero or more ASCII characters plus characters from only one other script system, but it will cope with any arbitrary Unicode string. Note that for DNS use, the domain name string is assumed to have been normalised using Nameprep before being encoded, and that the DNS protocol limits the length of the output Punycode string (a label may be at most 63 octets, including the 'xn--' prefix).
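To make the adaptive behaviour concrete, the following is a minimal sketch of the encoding direction, written from the pseudocode and parameter values in RFC 3492. The function names adapt and punycode_encode are chosen here for illustration; the sketch omits the overflow checks the RFC requires and assumes the input has already been normalised. The bias value is re-tuned after every encoded character, which is what lets the variable-length integers stay short when the non-ASCII characters come from a single script.

```python
# Bootstring parameters for Punycode, as given in RFC 3492.
BASE, TMIN, TMAX = 36, 1, 26
SKEW, DAMP = 38, 700
INITIAL_BIAS, INITIAL_N = 72, 128
DIGITS = "abcdefghijklmnopqrstuvwxyz0123456789"


def adapt(delta: int, numpoints: int, first_time: bool) -> int:
    """Bias adaptation: re-tune thresholds to the size of recent deltas."""
    delta = delta // DAMP if first_time else delta // 2
    delta += delta // numpoints
    k = 0
    while delta > ((BASE - TMIN) * TMAX) // 2:
        delta //= BASE - TMIN
        k += BASE
    return k + ((BASE - TMIN + 1) * delta) // (delta + SKEW)


def punycode_encode(text: str) -> str:
    """Encode a string to Punycode (no 'xn--' prefix, no overflow checks)."""
    basic = [c for c in text if ord(c) < 128]
    output = basic[:]            # ASCII characters are copied literally
    b = handled = len(basic)
    if b > 0:
        output.append("-")       # delimiter between literal and encoded parts
    n, delta, bias = INITIAL_N, 0, INITIAL_BIAS
    while handled < len(text):
        # Next smallest code point that has not been encoded yet.
        m = min(ord(c) for c in text if ord(c) >= n)
        delta += (m - n) * (handled + 1)
        n = m
        for c in text:
            cp = ord(c)
            if cp < n:
                delta += 1
            elif cp == n:
                # Emit delta as a generalised variable-length integer.
                q, k = delta, BASE
                while True:
                    t = TMIN if k <= bias else TMAX if k >= bias + TMAX else k - bias
                    if q < t:
                        break
                    output.append(DIGITS[t + (q - t) % (BASE - t)])
                    q = (q - t) // (BASE - t)
                    k += BASE
                output.append(DIGITS[q])
                bias = adapt(delta, handled + 1, handled == b)
                delta = 0
                handled += 1
        delta += 1
        n += 1
    return "".join(output)


print(punycode_encode("bücher"))  # bcher-kva
```

The result can be cross-checked against Python's built-in codec, for example by comparing punycode_encode(s) with s.encode("punycode").decode("ascii") for a few sample strings.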
The national registrars of Germany, Austria and Switzerland adopted Punycode starting on March 1, 2004.