ISO/IEC 10646
ISO/IEC 10646 (UCS;
History of development and its impact
Published standards
※ 1999
Date | Designation | Title |
---|---|---|
1993/05/01 | ISO/IEC 10646-1:1993 | Universal Multiple-Octet Coded Character Set (UCS) -- Part 1: Architecture and basic Multilingual Plane |
1996/03/01 | ISO/IEC 10646-1:1993/Cor.1 | TECHNICAL CORRIGENDUM 1 to ISO/IEC 10646-1:1993 |
1996/10/15 | ISO/IEC 10646-1:1993/Amd.1 | Transformation Format for 16 planes of group 00 (UTF-16) |
1996/10/15 | ISO/IEC 10646-1:1993/Amd.2 | UCS Transformation Format 8 (UTF-8) |
1996/10/15 | ISO/IEC 10646-1:1993/Amd.3 | Code positions for control characters |
1996/10/15 | ISO/IEC 10646-1:1993/Amd.4 | Removal of annex G (UTF-1) |
1997/11/15 | ISO/IEC 10646-1:1993/Amd.6 | Tibetan |
1997/11/15 | ISO/IEC 10646-1:1993/Amd.7 | 33 additional characters |
1997/12/15 | ISO/IEC 10646-1:1993/Amd.8 | New annex on CJK Ideographs to ISO/IEC 10646-1:1993 |
1997/12/15 | ISO/IEC 10646-1:1993/Amd.9 | Identifiers for Characters |
1998/05/15 | ISO/IEC 10646-1:1993/Amd.5 | Hangul syllables |
1998/07/15 | ISO/IEC 10646-1:1993/Cor.2 | TECHNICAL CORRIGENDUM 2 to ISO/IEC 10646-1:1993 |
1998/07/15 | ISO/IEC 10646-1:1993/Amd.11 | Unified Canadian Aboriginal Syllabics |
1998/09/01 | ISO/IEC 10646-1:1993/Amd.12 | Cherokee |
1998/10/01 | ISO/IEC 10646-1:1993/Amd.10 | Ethiopic script |
1998/10/15 | ISO/IEC 10646-1:1993/Amd.13 | CJK unified ideographs |
1998/11/01 | ISO/IEC 10646-1:1993/Amd.16 | Braille Patterns |
1998/11/01 | ISO/IEC 10646-1:1993/Amd.19 | Runic |
1998/11/01 | ISO/IEC 10646-1:1993/Amd.20 | Ogham |
1999/05/15 | ISO/IEC 10646-1:1993/Amd.23 | Bopomofo and various other characters |
1999/06/01 | ISO/IEC 10646-1:1993/Amd.21 | Sinhala |
1999/07/15 | ISO/IEC 10646-1:1993/Amd.17 | CJK Unified Ideograph Extension |
1999/07/15 | ISO/IEC 10646-1:1993/Amd.18 | Symbols and Others |
1999 | ISO/IEC 10646-1:1993/Cor.3 | TECHNICAL CORRIGENDUM 3 to ISO/IEC 10646-1:1993 |
1999 | ISO/IEC 10646-1:1993/Amd.14 | Yi syllables and Yi radicals |
1999 | ISO/IEC 10646-1:1993/Amd.22 | Keyboard symbols |
1999 | ISO/IEC 10646-1:1993/Amd.24 | Thaana Script |
1999 | ISO/IEC 10646-1:1993/Amd.25 | Khmer Script |
1999 | ISO/IEC 10646-1:1993/Amd.26 | Burmese Script |
1999 | ISO/IEC 10646-1:1993/Amd.27 | Syriac Script |
1999 | ISO/IEC 10646-1:1993/Amd.29 | Mongolian |
1999 | ISO/IEC 10646-1:1993/Amd.30 | Additional Latin and other characters |
2000 | ISO/IEC 10646-1:1993/Amd.15 | Radicals and Numerals |
2000 | ISO/IEC 10646-1:1993/Amd.28 | Ideographic Description Sequences |
2000 | ISO/IEC 10646-1:1993/Amd.31 | Tibetan Extension |
2000/09/15 | ISO/IEC 10646-1:2000 | UCS -- Part 1: Architecture and basic Multilingual Plane |
2001/11/01 | ISO/IEC 10646-2:2001 | UCS -- Part 2: Supplementary Planes |
2002/07/16 | ISO/IEC 10646-1:2000/Amd.1 | Mathematical symbols and other characters |
2003/12/15 | ISO/IEC 10646:2003 | Universal Multiple-Octet Coded Character Set (UCS) |
2005/11/15 | ISO/IEC 10646:2003/Amd.1 | Glagolitic, Coptic, Georgian and other characters |
2006/07/01 | ISO/IEC 10646:2003/Amd.2 | N'Ko, Phags-pa, Phoenician and other characters |
2008/02/15 | ISO/IEC 10646:2003/Amd.3 | Lepcha, Ol Chiki, Saurashtra, Vai and other characters |
2008/07/01 | ISO/IEC 10646:2003/Amd.4 | Cham, Game Tiles, and other characters |
2008/12/01 | ISO/IEC 10646:2003/Amd.5 | Tai Tham, Tai Viet, Avestan, Egyptian Hieroglyphs, CJK Unified Ideographs Extension C, and other characters |
2009/10/15 | ISO/IEC 10646:2003/Amd.6 | Bamum, Javanese, Lisu, Meetei Mayek, Samaritan, and other characters |
2010/07/15 | ISO/IEC 10646:2003/Amd.7 | Mandaic, Batak, Brahmi, and other characters |
2011/05/02 | ISO/IEC 10646:2011 | Universal Coded Character Set (UCS) |
2012/05/21 | ISO/IEC 10646:2012 | Information technology -- Universal Coded Character Set (UCS) |
2013/04/09 | ISO/IEC 10646:2012/Amd 1:2013 | Linear A, Palmyrene, Manichaean, Khojki, Khudawadi, Bassa Vah, Duployan, and other characters |
2014/08/29 | ISO/IEC 10646:2014 | Information technology -- Universal Coded Character Set (UCS) |
2015 | ISO/IEC 10646:2014/Amd 1:2015 | Cherokee supplement and other characters |
2016 | ISO/IEC 10646:2014/Amd 2:2016 | Bhaiksuki, Marchen, Tangut and other characters |
2017/12/22 | ISO/IEC 10646:2017 | Information technology -- Universal Coded Character Set (UCS) |
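Character encoding forms
In Unicode, "UTF" stands for "Unicode Transformation Format"; in ISO/IEC 10646 the same abbreviation is expanded as "UCS Transformation Format", as in the title of ISO/IEC 10646-1:1993/Amd.2.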
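- UTF-1
  A scheme based on 8-bit codes that was proposed early on. It saw almost no use and was superseded by UTF-8.
- UCS-2
  A fixed-width, 2-octet encoding of the UCS (Universal Coded Character Set). It cannot represent characters outside the BMP (Basic Multilingual Plane) and was superseded by UTF-16, which can encode every character. The 2011 revision marked it as deprecated.
- UTF-8
  Identical to Unicode's UTF-8 [3].
- UTF-16
  Identical to Unicode's UTF-16 [4].
- UTF-32 (UCS-4)
  Identical to Unicode's UTF-32.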
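The difference between UCS-2 and UTF-16 described above can be illustrated with a minimal Python sketch. The function names `encode_ucs2` and `encode_utf16` are hypothetical helpers written for this illustration, not part of any standard or library, and big-endian byte order is an assumption made only to keep the example short.

```python
def encode_ucs2(text: str) -> bytes:
    """Encode text as big-endian UCS-2 (hypothetical helper).

    UCS-2 uses exactly two octets per character, so any code point
    above U+FFFF (outside the BMP) cannot be represented.
    """
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            raise ValueError(f"U+{cp:04X} is outside the BMP; no UCS-2 form")
        out += cp.to_bytes(2, "big")
    return bytes(out)


def encode_utf16(text: str) -> bytes:
    """Encode text as big-endian UTF-16 (hypothetical helper).

    BMP characters are stored as a single 16-bit unit; characters in
    the supplementary planes are split into a surrogate pair.
    """
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            v = cp - 0x10000                 # 20-bit offset into planes 1..16
            high = 0xD800 | (v >> 10)        # upper 10 bits -> high surrogate
            low = 0xDC00 | (v & 0x3FF)       # lower 10 bits -> low surrogate
            out += high.to_bytes(2, "big") + low.to_bytes(2, "big")
        else:
            out += cp.to_bytes(2, "big")
    return bytes(out)
```

For text that stays within the BMP the two helpers produce the same bytes. For a supplementary-plane character such as U+20BB7, `encode_utf16` emits a surrogate pair, while `encode_ucs2` raises `ValueError`.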
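Implementation levels
- Level 1
  Does not handle combining sequences and the like.
- Level 2
  Handles the combining sequences that are required.
- Level 3
  Handles everything.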
Unicode corresponds to Level 3.
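As a rough illustration of what a Level 1 restriction means in practice, the following Python sketch checks whether a string contains any combining marks. It is only an approximation under stated assumptions: ISO/IEC 10646 defines its own set of combining characters, whereas this sketch uses the Unicode general categories Mn, Mc and Me from Python's `unicodedata` module as a stand-in, and the function name `has_combining_marks` is hypothetical.

```python
import unicodedata


def has_combining_marks(text: str) -> bool:
    """Return True if text contains a character in category Mn, Mc or Me.

    Assumption: general category "M*" approximates the standard's notion
    of a combining character; the standard's own list is authoritative.
    """
    return any(unicodedata.category(ch).startswith("M") for ch in text)


# "e" followed by U+0301 COMBINING ACUTE ACCENT is a combining sequence.
decomposed = "e\u0301"
# NFC normalization replaces it with the precomposed U+00E9.
precomposed = unicodedata.normalize("NFC", decomposed)
```

Here `has_combining_marks(decomposed)` is true and `has_combining_marks(precomposed)` is false, so only the precomposed form would pass this simplified Level 1 style check.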
Footnotes
- [1] "The Unicode Standard Version 11.0" (PDF). The Unicode Consortium. p. 1 (June 5, 2018). Retrieved January 21, 2019. "The Unicode Standard is code-for-code identical with International Standard ISO/IEC 10646."
- [2] "The Unicode Standard Version 11.0" (PDF). The Unicode Consortium. p. 88 (June 5, 2018). Retrieved January 21, 2019. "The character names in the Unicode Standard match those of the English edition of ISO/IEC 10646."
- [3] "The Unicode Standard Version 11.0" (PDF). The Unicode Consortium. p. 930 (June 5, 2018). Retrieved January 21, 2019. "The ISO/IEC 10646 definition of UTF-8 is identical to UTF-8 as described under Definition D92 in Section 3.9, Unicode Encoding Forms."
- [4] "The Unicode Standard Version 11.0" (PDF). The Unicode Consortium. p. 930 (June 5, 2018). Retrieved January 21, 2019. "The ISO/IEC 10646 definition of UTF-16 is identical to UTF-16 as described under Definition D91 in Section 3.9, Unicode Encoding Forms."
References
- ISO/IEC 10646:2003 Information technology -- Universal Multiple-Octet Coded Character Set (UCS)
- ISO/IEC 10646:2003/Amd 1:2005 Glagolitic, Coptic, Georgian and other characters
- ISO/IEC 10646:2003/Amd 2:2006 N'Ko, Phags-pa, Phoenician and other characters
- ISO/IEC 10646:2011(E) Universal Coded Character Set (UCS)
- ISO/IEC 10646:2017 Universal Coded Character Set (UCS)