Internet Draft                                              Maynard Kang
draft-ietf-idn-mua-00.txt                                    i-EMAIL.net
February 5, 2001
Expires on August 5, 2001

Internationalizing Domain Names in Mail User Agents

Status of this Memo

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Abstract

This document describes a way in which domain names used in Internet
e-mail can be internationalized by making changes only to end-user Mail
User Agents, thereby avoiding damage to other applications which handle
Internet e-mail, such as Message Transfer Agents and Delivery Agents.

1. Introduction

One of the proposed solutions for internationalized domain names (IDN)
involves updating only the user applications, with no changes required
to the DNS protocol, servers or resolvers [IDNA]. Other solutions
require changes to the protocol, servers, resolvers and applications.

The underlying principle of [IDNA] may be similarly applied to the
Internet e-mail system today, by making changes only to the Mail User
Agent (MUA) component of the e-mail system. Thus, existing Message
Transfer Agents, Delivery Agents and other applications which handle
e-mail do not have to be changed at all.

1.1 Definitions and Conventions

Terms related to the character encoding model are used as defined in
Unicode Technical Report 17 [UTR17].

The terms "international character", "non-ASCII character" and
"multilingual character" are used interchangeably and mean any
abstract character which is not included in the range specified by
[US-ASCII].

1.2 Terminology

The key words "MUST", "SHALL", "REQUIRED", "SHOULD", "RECOMMENDED",
and "MAY" in this document are to be interpreted as described in RFC
2119 [RFC2119].

1.3 Design Philosophy

As the Internet e-mail system is a diverse, distributed and
heterogeneous system with many vendors deploying a vast number of
applications, it is of utmost importance that interoperability amongst
these various components is maintained. Thus, the ideal solution would
be one which does not compromise or damage the operation of any of
these existing components when internationalized domain names are
encountered.

Also, solutions which call for changes to be made to many or even all
components of the Internet e-mail system would require far too much
time and effort to deploy, given that Internet e-mail has such a huge
installed base.

This solution adheres to both of the above principles, in that
interoperability is preserved and the cost and time of implementation
are low. All that the user has to do to use IDNs in e-mail is update
his or her MUA.

1.4 IDN Summary

This solution specifies an IDN architecture of arch-3 (just send ACE)
and a transition strategy of trans-1 (always do current plus new
architecture) as described in [IDNCOMP]. The choice of ACE (ASCII
Compatible Encoding) format is not defined in this document, but MUST
be the same as that specified in [IDNA] in order to maintain uniqueness
and consistency.

1.5 E-mail Internationalization Summary

As many Internet e-mail standards such as the SMTP protocol [RFC821]
and the e-mail message format [RFC822] only specify usage of the 7-bit
ASCII character set [US-ASCII], international characters which use
octet-based character encoding schemes (CES) cannot be used in e-mail
transmission, headers and bodies.

Although this issue has been addressed in [RFC2045] for message bodies
and [RFC2047] for message headers through the use of a Transfer
Encoding Syntax (TES) such as Quoted-Printable or Base64, there is no
similar solution which extends the functionality of [RFC821] to include
usage of international characters, except for [RFC1652], which allows
transmission of 8-bit data passed by the DATA command in an SMTP
session.

[RFC1652], however, does not fully address the problem of using IDNs in
an SMTP session: an IDN may be used in areas within the SMTP session
other than the DATA command, such as the MAIL FROM and RCPT TO
commands, where an IDN may be part of the e-mail address(es) specified
there.

Hence, this would be a major stumbling block to deploying
"just-send-8bit" IDNs for use in Internet e-mail, as such IDNs could
not be used in SMTP e-mail transmissions due to [RFC821] restrictions.

2. Architectural Overview

The end-user MUA may encounter IDNs in the scenarios below:

(i) When specifying the transmission server (i.e. SMTP server)
(ii) When specifying the retrieval server (i.e. POP3/IMAP4/any other
retrieval mechanism)
(iii) When specifying e-mail addresses during composition of a message
(iv) When reading messages with e-mail addresses in them

As with [IDNA], the MUA is updated to process IDNs which are input by
users and IDNs which are displayed to users, in all of the scenarios
above.

For (i) and (ii), the IDN MUST be handled in the same manner as
specified in [IDNA]. The method of handling an IDN for (iii) and (iv)
is described below in 2.1.

2.1 Interfaces between E-mail components when composing/reading a mail

The interfaces between e-mail components can be pictorially represented
as shown below.

The example assumes the setup of a POP3/IMAP4 retrieval client and
server, but the exact nature of end-to-end e-mail transmission may vary
(e.g. elm or pine would read directly from the mail store). However,
these variations do not materially affect the description of this
solution, as no changes are required at these levels.

  +------+                                            +------+
  | User |                                            | User |
  +------+                                            +---^--+
      | User Input:            User Display: Characters/  |
      | Keyboard/Pen/etc       Glyphs on CRT or other     |
+-----v---------------+        Representation (e.g. sound)|
| Input Method Editor |                      +------------|-----+
+---------------------+                      | Rendering Engine |
      | Input: Any localized/                +------------^-----+
      |        internationalized   Output: Any localized/ |
      |        charset             internationalized      |
 +----v-----------------+          charset                |
 | +------------------+ |                      +----------|------------+
 | | Mail Composition | |                      | +--------------+      |
 | |    Interface     | | Sender's             | | Mail Reading |      |
 | +------------------+ |   MUA                | |  Interface   |      |
 |    |                 |                      | +--------^-----+      |
 |    | Nameprepped ACE |           Receiver's |          | Nameprepped|
 |    v                 |              MUA     |          | ACE        |
 | +-------------+      |                      | +-------------------+ |
 | | SMTP Client |      |                      | | POP3/IMAP4 Client | |
 | +-------------+      |                      | +-------------------+ |
 +----|-----------------+                      +----------^------------+
      | Nameprepped                                       | Nameprepped
      v ACE   Nameprepped             Nameprepped         | ACE
+-------------+   ACE    +------------+   ACE    +-------------------+
| SMTP Server | -------> | Mail Store | -------> | POP3/IMAP4 Server |
+-------------+          +------------+          +-------------------+

2.1.1 Interface between User and Input Method Editor

For ASCII characters, input is straightforward: the user types on the
keyboard and the character for whichever key is pressed is sent to the
application.

However, for international characters, the end-user has to use a
script-specific Input Method Editor (IME), which may or may not be
built into the OS, to interpret what the user communicates to the
system and thereafter send the respective international characters to
the application.

For example, for input of Chinese characters, some users use IMEs
which support the "Pinyin" input method. When a user types "zhongguo"
(in ASCII characters) on the keyboard and selects the characters which
represent "China" (in Chinese) from a list, the IME sends the
international characters to the application in a user-determined
charset (e.g. GB2312).

2.1.2 Interface between Input Method Editor and MUA Composition
Interface

The MUA mail composition interface (i.e. the "Compose Message"
function of the MUA) SHOULD be able to accept IDNs using 8-bit
character encoding schemes, including those represented in any
localized (e.g. GB2312) or internationalized (e.g. UTF-8) charsets.

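As a minimal sketch of this step, assuming the MUA knows which charset
the IME delivered the text in, the composition interface can decode the
raw octets into abstract characters before any further processing. The
helper name decode_address_input is hypothetical and is not defined by
this memo or by [IDNA].

```python
def decode_address_input(raw: bytes, charset: str) -> str:
    """Decode address text delivered by the IME in the given charset.

    `charset` is whatever the user or OS has selected, for example
    "gb2312" or "utf-8". This helper is illustrative only.
    """
    try:
        return raw.decode(charset)
    except UnicodeDecodeError as exc:
        # The octets are not valid text in the declared charset.
        raise ValueError("input is not valid %s text" % charset) from exc
    except LookupError as exc:
        # The charset name itself is unknown to this platform.
        raise ValueError("unknown charset: %s" % charset) from exc
```

The same address typed through a GB2312 IME or a UTF-8 IME decodes to
the same sequence of abstract characters, so later steps do not need to
know which charset was used for input.
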
This input typically takes place where e-mail addresses are entered,
such as the "From", "To", "Cc" and "Bcc" fields, amongst others, as
IDNs may be used on the right-hand side of the "@" sign in an e-mail
address (the domain part).

The mail composition interface MAY allow ACE input for the same
reasons as specified in [IDNA], but this is not recommended as ACE is
opaque and ugly.

2.1.3 Interface between MUA Composition Interface and SMTP Client

The MUA composition interface communicates with the SMTP client in the
MUA, typically through internal function calls within the software
itself or through an API. It is at this level that ACE conversion of
any IDN encountered by the MUA composition interface takes place.

Before converting the name parts of the IDN into ACE, the MUA MUST
prepare each name part as specified in [NAMEPREP]. Thereafter, the MUA
MUST convert the name parts into ACE before passing any data to the
SMTP client.

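The two steps above can be sketched as follows. This memo does not
define the ACE format, so the sketch makes a loud assumption: Python's
built-in "idna" codec, which applies nameprep to each name part and
then encodes it as Punycode with an "xn--" prefix, stands in for
whichever ACE [IDNA] specifies. The function names are hypothetical.

```python
def domain_to_ace(domain: str) -> str:
    """Nameprep each name part of a domain, then convert it to ACE."""
    # ASSUMPTION: the stdlib "idna" codec (nameprep + "xn--" Punycode)
    # stands in for the ACE that [IDNA] finally selects. It processes
    # the domain one dot-separated name part at a time.
    return domain.encode("idna").decode("ascii")


def address_to_ace(address: str) -> str:
    """Convert only the domain part (right of the last "@") to ACE."""
    local_part, at, domain = address.rpartition("@")
    if not at:
        raise ValueError("not an e-mail address: %r" % address)
    # The local part is passed through untouched; see Section 6.
    return local_part + "@" + domain_to_ace(domain)
```

An all-ASCII address passes through unchanged, so the conversion can be
applied to every address handed to the SMTP client without first
checking whether it contains an IDN.
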
The SMTP client then prepares the e-mail for transmission using the
SMTP protocol [RFC821], and thereafter establishes an SMTP connection
with the user-specified SMTP server to transmit the e-mail.

It is important to note that an IDN specified in the parameters of any
SMTP command MUST be represented in nameprepped ACE at this point. This
includes SMTP commands which require domain parameters (such as the
HELO and EHLO commands) and commands where e-mail addresses are
specified (such as the MAIL FROM, RCPT TO, DATA, VRFY, EXPN, SEND, SOML
and SAML commands).

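For illustration, the command parameters for a simple SMTP envelope
could be prepared as in the sketch below. It only builds the command
strings and does not open a connection. As an assumption, Python's
"idna" codec again stands in for the ACE chosen by [IDNA], and the
helper names are hypothetical.

```python
from typing import List


def to_ace(domain: str) -> str:
    # ASSUMPTION: the stdlib "idna" codec stands in for the real ACE.
    return domain.encode("idna").decode("ascii")


def address_to_ace(address: str) -> str:
    # Only the domain part, right of the last "@", is converted.
    local_part, _, domain = address.rpartition("@")
    return local_part + "@" + to_ace(domain)


def envelope_commands(client_domain: str, sender: str,
                      recipients: List[str]) -> List[str]:
    """Build EHLO, MAIL FROM and RCPT TO lines with ACE parameters."""
    commands = ["EHLO " + to_ace(client_domain)]
    commands.append("MAIL FROM:<%s>" % address_to_ace(sender))
    for recipient in recipients:
        commands.append("RCPT TO:<%s>" % address_to_ace(recipient))
    return commands
```

Every parameter in the resulting lines is 7-bit ASCII, so an unmodified
[RFC821] server sees nothing out of the ordinary.
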
As for data passed by the DATA command, ACE conversion MUST be
performed whenever the "domain" portion of an "addr-spec", or a
"domain" itself, within the context of [RFC822], is encountered. This
is necessary because an updated MUA may originate a message which is
read by a non-updated MUA. If this happens, the non-updated MUA may
face operational problems dealing with IDNs in an "addr-spec" which are
not in ACE.

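A rough sketch of this conversion for an address header value such as
the contents of "To" or "Cc" is given below. It relies on Python's
email.utils helpers to split the value into display names and
addr-specs, and assumes, as before, that the "idna" codec stands in for
the ACE of [IDNA]. The function name is hypothetical, and headers which
carry a bare "domain" (such as "Received") are not handled.

```python
from email.utils import formataddr, getaddresses


def ace_address_header(value: str) -> str:
    """Rewrite each addr-spec in a header value to use an ACE domain."""
    rewritten = []
    for display_name, addr_spec in getaddresses([value]):
        local_part, at, domain = addr_spec.rpartition("@")
        if at:
            # ASSUMPTION: the stdlib "idna" codec stands in for the ACE.
            ace_domain = domain.encode("idna").decode("ascii")
            addr_spec = local_part + "@" + ace_domain
        # The ACE step itself touches only the domain, never the name.
        rewritten.append(formataddr((display_name, addr_spec)))
    return ", ".join(rewritten)
```

A non-updated MUA receiving such a header sees only ordinary ASCII
domains and can reply to them without any knowledge of IDNs.
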
Any transfer encoding syntax to be applied to the mail headers as
specified in [RFC2047] SHOULD be performed before nameprepped ACE
conversion. This is to reduce confusion between IDNs within "addr-spec"
and "domain" portions, in the context of [RFC822], and IDNs which
appear as arbitrary data in mail headers and bodies.

2.1.4 Interface between POP3/IMAP4 client (or local mail store) and
Mail Reading Interface

The MUA mail reading interface (i.e. the "Read mail" function of an
MUA) typically displays e-mail data retrieved from either a POP3/IMAP4
client or a local mail store, through internal function calls within
the MUA software or through an API.

When e-mail containing an ACE-represented IDN is to be displayed, the
MUA SHOULD convert the ACE-represented IDN contained within the
"addr-spec" or "domain" portion specified in [RFC822] back into any
localized or internationalized charset of the user's choice, whenever
possible. In the event that it is impossible to achieve conversion back
into the selected localized charset (for example, conversion of
RACE-represented Hangeul characters into ISO-8859-1 is impossible), the
MUA should prompt the user with an error message.

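The display-side conversion, including the failure case, might look
like the sketch below. It assumes that Python's "idna" codec stands in
for the ACE of [IDNA] (the RACE format mentioned above is not available
in the standard library), and the exception and function names are
hypothetical.

```python
class UndisplayableDomain(Exception):
    """The decoded IDN cannot be represented in the chosen charset."""


def ace_to_charset(ace_domain: str, charset: str) -> bytes:
    """Decode an ACE domain and re-encode it in the user's charset."""
    # ASSUMPTION: the stdlib "idna" codec stands in for the real ACE.
    unicode_domain = ace_domain.encode("ascii").decode("idna")
    try:
        # The resulting octets are what is handed on for rendering.
        return unicode_domain.encode(charset)
    except UnicodeEncodeError as exc:
        # For example, Hangeul characters have no ISO-8859-1 form.
        raise UndisplayableDomain(
            "cannot show %s in %s" % (ace_domain, charset)
        ) from exc
```

A caller would catch UndisplayableDomain and present the error message
to the user, possibly falling back to showing the ACE form itself.
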
It may be possible to save and retrieve information about the original
charset of the ACE-converted IDN through the use of additional
[RFC822] mail headers, but that is not (yet) addressed by this memo.

Although it is possible to render ACE into properly decoded glyphs and
display the actual abstract characters without any conversion to other
charsets, the MUA SHOULD NOT do this as it is not the primary function
of an MUA to render characters. This should be left to a rendering
engine which is separate from the MUA and typically embedded into the
OS. It is sufficient for the MUA to pass the appropriate charset to the
rendering engine for proper display.

3. ACE Length Considerations

As [RFC821] in Section 4.5.3 restricts the maximum total length of a
domain name to 64 characters, representation of IDNs using ACE may
pose a potential problem. Most ACEs typically require 3-4 ASCII
characters to represent one international character (especially in the
case of CJK characters, where compression is less effective).

That would leave only about 16-21 international characters for the
whole IDN, before any name part separators (dots) are counted. This is
highly undesirable, as names in some languages such as Arabic cannot be
abbreviated and the domain names may require a greater length than that
which is allowed by [RFC821].

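The length budget can be checked with simple arithmetic, as in the
sketch below. At 3 to 4 ASCII characters per international character,
integer division of the 64-character limit gives 21 and 16 characters
respectively, before dots or any ACE prefix are counted. The names are
hypothetical, and Python's "idna" codec is assumed as a stand-in for
the ACE of [IDNA], so real expansion ratios may differ.

```python
RFC821_MAX_DOMAIN = 64  # RFC 821, Section 4.5.3: domain name limit


def max_international_chars(ascii_per_char: int) -> int:
    """Upper bound on characters that fit at a given expansion ratio."""
    return RFC821_MAX_DOMAIN // ascii_per_char


def fits_rfc821(domain: str) -> bool:
    """Check whether the ACE form of a domain is within the limit."""
    # ASSUMPTION: the stdlib "idna" codec stands in for the real ACE.
    ace = domain.encode("idna").decode("ascii")
    return len(ace) <= RFC821_MAX_DOMAIN
```

An MUA could run a check like fits_rfc821 at composition time and warn
the user before an over-long address is handed to the SMTP client.
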
To further complicate matters, several mailing list software packages,
such as ezmlm, embed domain names into the local part of an e-mail
address during management of subscriptions, together with randomly
generated subscription information. This would leave an even smaller
maximum ACE length if interoperability with such mailing list software
is to be maintained, given that there is also a 64-character
restriction on local parts.

4. Security Considerations

As this memo is based on [IDNA], its security considerations are
similar to those of [IDNA]. These include the security considerations
of [NAMEPREP] as well.

5. Other Considerations

Although this document largely addresses end-user MUAs (e.g. elm, mutt,
pine, Eudora, Outlook Express, etc.), the definition of an MUA could be
extended to include web-based e-mail server software and automated
programs such as mailing list management software.

End-user MUAs may also include additional functionality where IDNs may
be encountered, such as calendaring/scheduling, directory services and
digital certificate storage. This is not (yet) addressed in this memo.

6. Future Extensions

It is possible to achieve internationalization of the entire e-mail
address by representing international characters in the local part of
an "addr-spec" using nameprepped ACE conversion, in a fashion similar
to that described in this memo.

However, this is a different problem altogether and is currently beyond
the scope of this memo.

7. References

[IDNA] Paul Hoffman & Patrik Faltstrom, "Internationalizing Host Names
in Applications (IDNA)", draft-ietf-idn-idna.

[UTR17] K. Whistler & M. Davis, Unicode Consortium, "Character Encoding
Model", Unicode Technical Report #17,
http://www.unicode.org/unicode/reports/tr17/

[US-ASCII] United States of America Standards Institute, "USA Code for
Information Interchange", X3.4, 1968.

[RFC2119] Scott Bradner, "Key words for use in RFCs to Indicate
Requirement Levels", March 1997, RFC 2119.

[IDNCOMP] Paul Hoffman, "Comparison of Internationalized Domain Name
Proposals", draft-ietf-idn-compare.

[RFC821] Jonathan B. Postel, "Simple Mail Transfer Protocol", August
1982, RFC 821.

[RFC822] David H. Crocker, "Standard for the Format of ARPA Internet
Text Messages", August 1982, RFC 822.

[RFC2045] N. Freed & N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part One: Format of Internet Message Bodies",
November 1996, RFC 2045.

[RFC2047] K. Moore, "MIME (Multipurpose Internet Mail Extensions)
Part Three: Message Header Extensions for Non-ASCII Text", November
1996, RFC 2047.

[RFC1652] J. Klensin et al., "SMTP Service Extension for
8bit-MIMEtransport", July 1994, RFC 1652.

[NAMEPREP] Paul Hoffman & Marc Blanchet, "Preparation of
Internationalized Host Names", draft-ietf-idn-nameprep.

A. Author's Address

Maynard Kang
i-EMAIL.net Pte Ltd
1 Kim Seng Promenade #12-07
Great World City West Tower
Singapore 237994
E-mail: maynard@i-email.net