Internet Draft                                            Yoshiro Yoneya
draft-ietf-idn-jpchar-00.txt                          Yasuhiro Morishita
November 17, 2000                                                  JPNIC
Expires May 17, 2001

         Japanese characters in multilingual domain name label

Status of this memo

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Abstract

This document describes Japanese characters and their canonicalization
rules in multilingual domain name labels. It is based on discussions
and examinations within JPNIC.

Despite the IDN WG's rough consensus that the character set for
multilingual domain names is UCS [UCS], the most popular character set
in Japan is Japanese Industrial Standards X 0208 [JISX0208], hereafter
abbreviated as "JIS". As a result, many PCs and most PDAs in Japan,
including mobile phones, can display only JIS and ASCII. Therefore, it
is strongly recommended that Japanese characters used in multilingual
domain names be limited to the common part of JIS, ASCII and UCS.

Furthermore, for historical reasons, JIS has many compatibility code
points for Kana and alphanumerics. Such code points are still widely
used, so these characters SHOULD be accepted, especially in the user
interface, and MUST be canonicalized before transmission on the wire.
The former should be implemented as localization, and the latter must
be implemented as internationalization.

1. Japanese characters in multilingual domain name labels

In principle, a domain name is a symbolic name for a resource on the
Internet, intended to be easy for Internet users to understand and
memorize. Internationalization or multilingualization of domain names
MUST obey this principle. That is, characters in multilingual domain
name labels SHOULD be unambiguous.

JIS contains many characters, including graphical and compatibility
characters. For domain names, however, the characters that are
significant for representing names are Kanji, Hiragana and Katakana
[CJK]. Therefore, in accordance with this principle, Japanese
characters in multilingual domain names MUST be Kanji, Hiragana or
Katakana in JIS.

The file "idntabjp10.txt" defines the Japanese characters that can be
used in multilingual domain name labels. It uses the format of
[VERSION], with the corresponding JIS code point added as the 3rd
field. Some of these characters, such as PROLONGED SOUND MARK (U+30FC),
are categorized as graphical characters in JIS, but they are used as
part of Kanji, Hiragana or Katakana text. The characters in this file
are in canonicalized form.

2. Canonicalization rules of Japanese characters in multilingual
domain name labels

This section describes the canonicalization rules in two parts. The
first covers "localization" and the second covers
"internationalization". In other words, the first applies at the
input/display level and the second at the API level [IDNA].

2.1 Localization: Characters to be canonicalized before NAMEPREP

As mentioned above, JIS has many compatibility characters that are
regarded as alphanumerics or Katakana. The former are the so-called
FULL-WIDTH alphanumerics, and the latter are the so-called HALF-WIDTH
kana. These characters are prohibited in [NAMEPREP], but they are
still widely used on many PCs and most PDAs in Japan. Hence,
application software that handles Japanese characters in multilingual
domain name labels SHOULD accept these compatibility characters as
input and canonicalize them before [NAMEPREP].

The file "idntabjpcanon10.txt" defines the compatibility characters,
with the canonicalized character code added as the 3rd field; that is,
it is a mapping table from FULL-WIDTH alphanumerics to ASCII and from
HALF-WIDTH kana to Katakana.

The file "idntabjpcomp10.txt" defines compatibility character
sequences to be composed, with the canonicalized character code added
as the 3rd field; that is, it is a composition table for Kana and the
voiced sound mark.

The recommended order for applying the canonicalization rules is as
follows:

(1) "idntabjpcanon10"
(2) "idntabjpcomp10"

This is the local part of canonicalization.

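The two-step order in Section 2.1 can be illustrated with a minimal
Python sketch. It does not read "idntabjpcanon10.txt" or
"idntabjpcomp10.txt", whose contents are not reproduced in this
document. Instead it assumes that the first table behaves like a
per-character mapping of FULL-WIDTH alphanumerics and HALF-WIDTH kana,
and that the second behaves like Unicode canonical composition of Kana
with a combining (semi-)voiced sound mark. The function names and the
use of Python's unicodedata module are assumptions of this sketch, not
part of the draft.

```python
import unicodedata


def _is_fullwidth_alnum(cp):
    """True for FULLWIDTH digits and Latin letters (U+FF10..U+FF5A subset)."""
    return (0xFF10 <= cp <= 0xFF19      # FULLWIDTH DIGIT ZERO..NINE
            or 0xFF21 <= cp <= 0xFF3A   # FULLWIDTH LATIN CAPITAL LETTER A..Z
            or 0xFF41 <= cp <= 0xFF5A)  # FULLWIDTH LATIN SMALL LETTER A..Z


def _is_halfwidth_kana(cp):
    """True for HALFWIDTH KATAKANA LETTER WO .. SEMI-VOICED SOUND MARK."""
    return 0xFF66 <= cp <= 0xFF9F


def apply_canon_table(text):
    """Step (1): stand-in for "idntabjpcanon10" (assumed behaviour)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if _is_fullwidth_alnum(cp):
            # FULL-WIDTH alphanumerics sit at a fixed offset from ASCII.
            out.append(chr(cp - 0xFEE0))
        elif _is_halfwidth_kana(cp):
            # Per-character compatibility mapping to full-size Katakana;
            # the half-width sound marks become combining marks.
            out.append(unicodedata.normalize("NFKC", ch))
        else:
            out.append(ch)
    return "".join(out)


def apply_comp_table(text):
    """Step (2): stand-in for "idntabjpcomp10" (assumed behaviour).

    NFC composes Kana + combining sound mark, but it also normalizes
    other characters, so it is broader than a Kana-only table.
    """
    return unicodedata.normalize("NFC", text)


def localize_canonicalize(text):
    """Apply step (1), then step (2), in the recommended order."""
    return apply_comp_table(apply_canon_table(text))


print(localize_canonicalize("\uff2a\uff30\uff2e\uff29\uff23"))
print(localize_canonicalize("\uff84\uff9e\uff92\uff72\uff9d"))
```

The order matters in this sketch: a HALF-WIDTH kana followed by a
HALF-WIDTH voiced sound mark can only be composed into a single
Katakana character after both have been mapped in step (1).
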
2.2 Internationalization: Characters to be canonicalized in NAMEPREP

Japanese characters in multilingual domain name labels MUST be
characters defined in "idntabjp10". Characters other than those in
"idntabjp10" SHOULD be canonicalized by [NAMEPREP].

[NAMEPREP] is the common, recommended rule for IDN.

This is the international part of canonicalization.

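The membership rule in Section 2.2 can be sketched as a simple check
that reports which characters of a label fall outside the permitted
table. The table used here is a toy stand-in, not the contents of
"idntabjp10.txt": it assumes the Hiragana range U+3041..U+3093, the
Katakana range U+30A1..U+30F6, PROLONGED SOUND MARK (U+30FC) and a
handful of sample Kanji. The names chars_outside_table and TOY_TABLE
are hypothetical.

```python
def chars_outside_table(label, allowed):
    """Return, in order, the characters of `label` that are not in `allowed`."""
    return [ch for ch in label if ch not in allowed]


# Toy stand-in for "idntabjp10" (assumption: the real table is larger
# and is the only authoritative list of permitted characters).
HIRAGANA = {chr(cp) for cp in range(0x3041, 0x3094)}
KATAKANA = {chr(cp) for cp in range(0x30A1, 0x30F7)}
SAMPLE_KANJI = {"\u65e5", "\u672c", "\u8a9e", "\u540d"}
TOY_TABLE = HIRAGANA | KATAKANA | SAMPLE_KANJI | {"\u30fc"}

# A label made only of permitted characters yields an empty list.
print(chars_outside_table("\u65e5\u672c\u8a9e\u30c9\u30e1\u30a4\u30f3\u540d", TOY_TABLE))

# FULL-WIDTH letters are reported; they would need canonicalization first.
print(chars_outside_table("\uff2a\uff30\u30c9\u30e1\u30a4\u30f3", TOY_TABLE))
```

Any character reported by such a check is one that, per the text
above, would be left to [NAMEPREP] to canonicalize.
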
3. Security considerations

None in particular.

4. References

[UCS]       "Universal Multiple-Octet Coded Character Set",
            ISO/IEC 10646-1:1993, ISBN 0-201-61633-5
[JISX0208]  "Japanese Industrial Standards",
            Information Technology (Terms/Code/Date elements)-99,
            ISBN 4-542-12976-4
[IDNREQ]    "Requirements of Internationalized Domain Names",
            draft-ietf-idn-requirements-03.txt, Jun 2000, Z Wenzel, J Seng
[NAMEPREP]  "Preparation of Internationalized Host Names",
            draft-ietf-idn-nameprep-00.txt, Jul 2000, P Hoffman, M Blanchet
[CJK]       "Han Ideograph (CJK) for Internationalized Domain Names",
            draft-ietf-idn-cjk-00.txt, Sep 2000, J Seng, Y Yoneya,
            K Huang, K Kyongsok
[VERSION]   "Handling versions of internationalized domain names protocols",
            draft-ietf-idn-version-00.txt, Nov 2000, M Blanchet

5. Acknowledgements

JPNIC IDN-TF members.

6. Authors' Addresses

Yoshiro Yoneya
Japan Network Information Center
Fuundo Bldg 1F, 1-2 Kanda-ogawamachi
Chiyoda-ku Tokyo 101-0052, Japan
yone@nic.ad.jp

Yasuhiro Morishita
Japan Network Information Center
Fuundo Bldg 1F, 1-2 Kanda-ogawamachi
Chiyoda-ku Tokyo 101-0052, Japan
yasuhiro@nic.ad.jp