Opera supports internationalized domain names (IDN), which allow, for example, Russian or Chinese domain names to be written in their own native scripts.
However, this also makes it possible to have domain names that look exactly the same as known, legitimate domain names while actually being written in a different script. Such possibilities can be used for fraud.
Since 2003, domain names written with characters outside the US-ASCII character set have been supported by the IDNA (RFC 3490) standard. IDNA is based on Unicode, which offers one big character set for the whole world. Standards for Unicode within HTML or plain text documents have existed for a longer time. UTF-8 is now a common encoding for Web pages, making ASCII-only domain names stand out because all non-English letters have to be mangled or “romanized”.
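As a rough illustration of what IDNA does to a name, the sketch below converts a Unicode host name to the ASCII form that is actually sent to domain name servers. It relies on Python's built-in "idna" codec, which follows RFC 3490; the helper name to_ascii and the sample host names are illustrative assumptions, not something taken from Opera.

```python
def to_ascii(hostname: str) -> str:
    """Return the ASCII-compatible form of a host name (RFC 3490 ToASCII)."""
    # The built-in "idna" codec processes each dot-separated label:
    # labels with non-ASCII characters get the "xn--" prefix plus Punycode,
    # pure ASCII labels pass through unchanged.
    return hostname.encode("idna").decode("ascii")


print(to_ascii("bücher.example"))  # xn--bcher-kva.example
print(to_ascii("opera.com"))       # opera.com
```

The user sees the native-script spelling, while the network only ever sees the "xn--" form.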
The promise of Unicode is that a single version of a program shall be able to support all scripts of the world. The scripts can be mixed within the same text, without any escape codes or extra metadata. This is a boon for electronic typography and interoperability.
Before Unicode, the character set and the script used by the program were the character set and script chosen by the user. With a few exceptions (such as Japanese) all characters were visually different, because they had to be distinguishable to the user. And since the script was native to the user, the user would be trained to tell all the characters apart.
With Unicode this is no longer true. Now programs can display characters that are not native to the user in between the familiar characters. And within Unicode there are many so-called homographs: different characters that are visually identical. For example, several characters in the Cyrillic alphabet look the same as letters in the Roman alphabet. They even tend to map to the same glyph in the same font; this is by design. But as far as the programs are concerned, they are different characters.
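The difference between what the user sees and what the program sees can be shown in a few lines. This is a minimal sketch: the spoofed string is a made-up example in which both occurrences of the Latin letter "a" are replaced by the Cyrillic letter U+0430, and the helper describe is only there for illustration.

```python
import unicodedata

latin = "paypal"
# Hypothetical spoof: each "a" is CYRILLIC SMALL LETTER A (U+0430),
# which most fonts draw with the same glyph as the Latin "a".
spoof = "p\u0430yp\u0430l"


def describe(text: str) -> list[str]:
    """List the code point and Unicode name of every character."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


print(latin == spoof)        # False: the strings differ for the program
print(describe(latin)[1])    # U+0061 LATIN SMALL LETTER A
print(describe(spoof)[1])    # U+0430 CYRILLIC SMALL LETTER A

# The ASCII form sent to name servers also differs from the plain name.
print(spoof.encode("idna"))
```

On screen the two strings are likely to be indistinguishable, yet every comparison a program performs treats them as unrelated names.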
Internationalized domain names make it possible to have several domain names that look exactly the same typographically, as they are supposed to. As far as domain name servers, security protocols and Web browsers are concerned, the domain names are still different from each other. This can be abused to mislead the user. The deception will be totally convincing, unlike the trivial ASCII-only substitutions, for example “paypa1”.
Opera has added a whitelist of top-level domains that are trusted to enforce a safe policy on domain names. Several top-level registrars have strict rules for domain names. Opera for Windows, Mac and UNIX will check for an updated list of trusted TLDs on a regular basis.
For top-level domains that are not on the whitelist, Opera now only accepts Latin-1 characters in domain names. This covers Western European languages without introducing any convincing homographs.
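The policy described above can be sketched roughly as follows. This is not Opera's code: the function name accepts_idn and the contents of TRUSTED_TLDS are hypothetical placeholders, since the actual whitelist is not given in this text, and the sketch says nothing about what the browser does with a name it does not accept.

```python
# Hypothetical whitelist; the real list of trusted TLDs is not given here.
TRUSTED_TLDS = {"trusted"}


def accepts_idn(hostname: str) -> bool:
    """Return True if a Unicode host name passes the whitelist policy."""
    # The top-level domain is the last dot-separated label.
    tld = hostname.rsplit(".", 1)[-1].lower()
    if tld in TRUSTED_TLDS:
        # The registrar is trusted to reject misleading registrations.
        return True
    # Otherwise allow only Latin-1, i.e. code points U+0000 to U+00FF.
    return all(ord(ch) <= 0xFF for ch in hostname)


print(accepts_idn("bücher.example"))         # True: "ü" is U+00FC
print(accepts_idn("p\u0430ypal.example"))    # False: U+0430 is Cyrillic
print(accepts_idn("p\u0430ypal.trusted"))    # True: TLD is on the whitelist
```

The Latin-1 test keeps accented Western European letters usable everywhere while excluding Cyrillic look-alikes outside the trusted domains.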
Top-level domain registrars who have enforced strict domain name policies are encouraged to contact Opera Software to be included in the browser’s whitelist, provided that their policies are approved.