Mojikyō

*Mojikyō*
*Konjaku Mojikyō*
今昔文字鏡
	The Mojikyō character map highlighting the Taiwanese kana　セ
Developer(s)	Tadahisa Ishikawa (石川忠久); Tokio Furuya (古家時雄); Mojikyō Institute (文字鏡研究会)
Initial release	1.0 / July 1997; 26 years ago
Final release	4.0 / December 15, 2018; 5 years ago
Operating system	Microsoft Windows
Size	51MB
Available in	Japanese
Type	Character set bundled with fonts and a character map
License	Proprietary
Website	mojikyo.org

Mojikyō (Japanese: 文字鏡), also known by its full name Konjaku Mojikyō (今昔文字鏡, lit. '(the) past and present character mirror'), is a character encoding scheme. The Mojikyō Institute (文字鏡研究会, Mojikyō Kenkyūkai), which published the character set, also published computer software and TrueType fonts to accompany it. The Mojikyō Institute, chaired by Tadahisa Ishikawa (石川忠久),^[1] originally distributed its character set and related software and data on CD-ROMs sold in Kinokuniya stores.^[2]

Conceptualized in 1996,^[3] the first version of the CD-ROM was released in July 1997.^[4] For a time, the Mojikyō Institute also offered a web subscription, termed "Mojikyō WEB" (文字鏡WEB), which had more up-to-date characters.^[5]

As of September 2006, Mojikyō encoded 174,975 characters.^[6] Among those, 150,366 characters (≈86%) then belonged to the extended Chinese–Japanese–Korean–Vietnamese (CJKV)^{[note 2]} family.^[5] Many of Mojikyō's characters are considered obsolete or obscure, and are not encoded by any other character set, including the most widely used international text encoding standard, Unicode.

Mojikyō was originally a paid proprietary software product. In 2015, the Mojikyō Institute began to upload its latest releases to the Internet Archive as freeware,^[7] as a memorial to one of its developers, Tokio Furuya (古家時雄), who died that year.^[3] On December 15, 2018, version 4.0 was released. The next day, Ishikawa announced that without Furuya this would be the final release of Mojikyō.^[3]

Premise

The Mojikyō encoding was created to provide a complete index of Chinese, Korean, and Japanese characters. It also encodes a large number of characters in ancient scripts, such as the oracle bone script, the seal script, and Sanskrit (Siddhaṃ). For many characters, Mojikyō is the only encoding that includes them, and its data is often used as a starting point for Unicode proposals.^[8]^[9] However, Mojikyō has much looser standards than Unicode for encoding, which leads Mojikyō to have many encoded glyphs of dubious, or even unintentionally fictional, origin.^[10]^[11] As such, while many non-Unicode Mojikyō characters are suitable for addition to Unicode, not all can become Unicode characters, because the two apply different standards of evidence.

Composition

The Mojikyō fonts (文字鏡フォント) are TrueType fonts distributed in a ZIP file, each around 2–5 megabytes; the different fonts contain different numbers of characters.^{[note 3]} Also included is a Windows executable that implements a graphical character map, the "Mojikyō Character Map" (文字鏡MAP), MOCHRMAP.EXE.^{[note 4]}^{[note 5]} MOCHRMAP.EXE allows users to browse through the Mojikyō fonts and to copy and paste characters in lieu of typing them on the keyboard. Unlike the regular Windows character map or KCharSelect, both of which also support TrueType fonts, MOCHRMAP.EXE displays the numbered Mojikyō encoding slot of the requested character.^[12]^{[note 6]} For MOCHRMAP.EXE to work, all Mojikyō fonts must be installed.^{[note 7]}

Encoding

When referring to a character encoded in Mojikyō, the format MJXXXXXX is often used, similar to the U+XXXX format used for Unicode. For example, the hentaigana 𛀈 (HENTAIGANA LETTER I-3) has Mojikyō encoding MJ090007 and Unicode encoding U+1B008.^[13] A difference, however, is that Mojikyō numbers written this way are decimal, while Unicode's U+ notation is hexadecimal.
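The difference in radix matters when such labels are converted to integers. The following is a minimal Python sketch, assuming that MJ labels always carry exactly six decimal digits (as the MJXXXXXX pattern suggests); the function names are hypothetical and are not part of any Mojikyō software.

```python
import re

# Assumption: an MJ label is "MJ" followed by exactly six decimal digits.
MJ_PATTERN = re.compile(r"^MJ(\d{6})$")
# U+ notation uses four to six hexadecimal digits.
UPLUS_PATTERN = re.compile(r"^U\+([0-9A-Fa-f]{4,6})$")


def parse_mj(label: str) -> int:
    """Return the Mojikyō slot number; the digits are read as base 10."""
    match = MJ_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not an MJ label: {label!r}")
    return int(match.group(1), 10)


def parse_uplus(label: str) -> int:
    """Return the Unicode code point; the digits are read as base 16."""
    match = UPLUS_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a U+ label: {label!r}")
    return int(match.group(1), 16)


def format_mj(number: int) -> str:
    """Format a slot number as a zero-padded decimal MJ label."""
    return f"MJ{number:06d}"


def format_uplus(codepoint: int) -> str:
    """Format a code point as an uppercase hexadecimal U+ label."""
    return f"U+{codepoint:04X}"


if __name__ == "__main__":
    # The hentaigana example from the text above.
    print(parse_mj("MJ090007"))    # decimal value of the MJ label
    print(parse_uplus("U+1B008"))  # decimal value of the U+ label
```

Reading "MJ090007" as hexadecimal, or "U+1B008" as decimal, would either fail or silently yield a different number, which is why the two parsers are kept separate.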

From the earliest days of Unicode, Mojikyō has both influenced—and been influenced by—the standard. Glyphs originating from Mojikyō first appear in a proposal to the Ideographic Rapporteur Group (IRG),^{[note 8]} which is responsible for maintaining all CJK blocks in Unicode,^[14]^[15] on 18 April 2002.^[16] In May 2007, Mojikyō played a minor role in an eventually successful series of proposals to encode the Tangut script in Unicode;^[17]^{[note 9]} Mojikyō already had within its encoding 6,000 Tangut characters by October 2002.^[6]

The Unicode Standard's Unihan Database refers to Mojikyō as the "Japanese KOKUJI Collection" (日本国字集),^[18] abbreviated "JK".^[19]^[20] For example, U+2B679 𫙹 CJK UNIFIED IDEOGRAPH-2B679,^{[note 10]} an ideograph read in Japanese as burizādo (ブリザード, lit. 'blizzard'), has a J-Source^{[note 11]} equal to JK-66038. All Unicode characters with a JK-prefixed J-Source originate from Mojikyō.^[21]^{[note 12]} According to Ken Lunde, a subject matter expert in character encodings and East Asian languages, as of Unicode 13.0, 782 ideographs in Unicode originate from Mojikyō, split somewhat evenly between two blocks: CJK Unified Ideographs Extension C, with 367, and CJK Unified Ideographs Extension E, with 415.^[20]^[22] Not all Unicode characters with Mojikyō origins (JK-prefixed J-Sources) have the same representative glyph in the code chart as in the Mojikyō font;^{[note 13]} some characters had their shapes changed before final encoding, as investigation showed the shapes assigned by the Mojikyō Institute were wrong.^[11]^{[note 14]}
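As an illustration of how JK-prefixed J-Sources can be picked out, the following Python sketch filters lines in the tab-separated layout used by the Unihan data files (code point, field name, value). It is a sketch under stated assumptions: the function names are hypothetical, the first sample row is the example from the text above, and the second sample row is an illustrative non-JK entry whose value has not been checked against the real file.

```python
from collections import Counter

# Block ranges for the two blocks named in the text.
BLOCKS = {
    "CJK Unified Ideographs Extension C": range(0x2A700, 0x2B740),
    "CJK Unified Ideographs Extension E": range(0x2B820, 0x2CEB0),
}


def jk_sources(lines):
    """Yield (code point, J-Source value) for JK-prefixed kIRG_JSource rows."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) != 3:
            continue  # not a "code point, field, value" row
        codepoint, field, value = fields
        if field == "kIRG_JSource" and value.startswith("JK-"):
            yield int(codepoint[2:], 16), value  # drop the "U+" prefix


def block_of(codepoint):
    """Return the name of the block containing codepoint, or 'other'."""
    for name, span in BLOCKS.items():
        if codepoint in span:
            return name
    return "other"


# Sample input. Only the JK row is taken from the article text.
SAMPLE = [
    "# illustrative excerpt, not the real data file",
    "U+2B679\tkIRG_JSource\tJK-66038",
    "U+4EE4\tkIRG_JSource\tJ0-4E61",  # illustrative non-JK row (unverified)
]

found = dict(jk_sources(SAMPLE))
counts = Counter(block_of(cp) for cp in found)

if __name__ == "__main__":
    for cp, source in found.items():
        print(f"U+{cp:04X} {source} {block_of(cp)}")
```

Run against the full data file instead of `SAMPLE`, the same counter would be one way to check the per-block totals quoted above.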

Blocks

As of September 2006, Mojikyō encoded 174,975 characters.^[6] Among those, 150,366 characters then belonged to the extended CJKV^{[note 2]} family.^[5] Many of the encoded characters are considered obsolete or otherwise obscure, and are not encoded by any other character set, including the international standard, Unicode. Each Mojikyō character has a unique number, and the characters are organized into blocks.

Mojikyō puts CJKV characters in different blocks according to their traditional Kangxi radical. Common radicals containing an especially high number of characters, such as Radicals 9 (人) and 162 (⻌), are split further by stroke order.^{[note 15]}

No unification

Unlike Unicode, Mojikyō purposely avoids Han unification; no attempt at compactness of the encoding is made, nor is there an attempt to keep all common characters below U+FFFF as there is in Unicode.

Unicode, on the other hand, sorts its CJK characters into blocks based on how common they are: the most common are generally put into the Basic Multilingual Plane,^{[note 14]} while those that are rare or obscure are put into the Supplementary Planes.

For example, under Radical 9 Mojikyō has two characters where Unicode has one: MJ054435 (令) and MJ059031 (令), both represented in Unicode as U+4EE4 令 CJK UNIFIED IDEOGRAPH-4EE4.
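The practical consequence is that a conversion from Mojikyō numbers to Unicode is many-to-one and cannot be inverted without losing information. The Python sketch below shows this with the two slots named above; the table contains only those two entries, and the helper names are hypothetical.

```python
from collections import defaultdict

# Mojikyō slot number -> Unicode code point, limited to the example above.
MJ_TO_UNICODE = {
    54435: 0x4EE4,  # MJ054435
    59031: 0x4EE4,  # MJ059031
}


def group_by_unicode(mapping):
    """Invert the table: each code point maps to every MJ slot unified into it."""
    groups = defaultdict(list)
    for mj_number, codepoint in sorted(mapping.items()):
        groups[codepoint].append(mj_number)
    return dict(groups)


def is_bmp(codepoint):
    """True if the code point lies in the Basic Multilingual Plane (<= U+FFFF)."""
    return 0 <= codepoint <= 0xFFFF


if __name__ == "__main__":
    for codepoint, slots in group_by_unicode(MJ_TO_UNICODE).items():
        labels = ", ".join(f"MJ{n:06d}" for n in slots)
        plane = "BMP" if is_bmp(codepoint) else "supplementary"
        print(f"U+{codepoint:04X} ({plane}) <- {labels}")
```

Converting either slot to U+4EE4 and back gives no way to tell which of the two Mojikyō glyphs was meant, so a round trip would need extra information stored alongside the code point.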

License

Mojikyō is proprietary software under a restrictive license. Originally, the Mojikyō Institute tried to prevent its character data from being used, and threatened those who published conversion tables to and from its character set. In July 2010, the Mojikyō Institute abandoned its legal efforts to stop at least one Japanese user from publishing conversion tables or converting characters encoded in Mojikyō to Unicode or other character sets.^[23] Mere data, sometimes including the shapes of letters, are considered in many jurisdictions to be common property as they do not meet the threshold of originality.^{[note 16]}

Because of this legacy, however, GlyphWiki disallowed Mojikyō data as of 2020.^[24]

Collected writing systems

Living

Chinese — Hanzi
Japanese — Kanji, Kana (including Hentaigana)
Korean — Hanja
Latin alphabet with diacritics
Cyrillic script with diacritics

Dead or obsolete

Ancient Chinese
- Oracle bone script
- Seal script
Taiwanese kana
Vietnamese — Chữ Nôm
Sanskrit — Siddhaṃ
Tangut script
Sui script

References

^ "今昔文字鏡について" [About Mojikyō]. Mojikyō Institute (in Japanese). Archived from the original on 3 February 2001. Retrieved 6 July 2020.
^ ようこそ、今昔文字鏡の世界へ！ [Welcome to the world of Mojikyō!] (in Japanese). Kinokuniya KK. Archived from the original on 4 March 2005. Retrieved 5 July 2020.
^ ^a ^b ^c Ishikawa, Tadahisa (August 2015). "古家時雄君を悼む" [Tokio Furuya, we grieve your death]. Mojikyō Institute (in Japanese). Retrieved 8 July 2020.
^ Konjaku Mojikyō 今昔文字鏡 (in Japanese), July 1997, ISBN 9784314900034
^ ^a ^b ^c 今昔文字鏡とは [About Mojikyo] (in Japanese). Kinokuniya KK. Archived from the original on 27 April 2010. Retrieved 5 July 2020.
^ ^a ^b ^c 今昔文字鏡とは [What is Mojikyō?] (in Japanese). Kinokuniya KK. Archived from the original on 5 February 2005. Retrieved 5 July 2020.
^ "Search: creator:"MOJIKYO Institute"". Internet Archive. Retrieved 6 July 2020.
^ Takada, Tomokazu; Yada, Tsutomu; Saito, Tatsuya (18 September 2015). Proposal for hentaigana (PDF). Translated by Kobayashi, Tatsuo; Kobayashi, Daniel. Information Processing Society of Japan. L2/15-239. Retrieved 5 July 2020 – via Unicode Consortium.
^ Hiura, Hideki; Kobayashi, Tatsuo; et al. (31 October 2003). Ideograph Variation Selector and Variation Collection Identifier. Open Internationalization Initiative. L2/03-413. Retrieved 5 July 2020 – via Unicode Consortium.
^ Takada, Tomokazu [高田智和]; Oda, Tetsuji [織田哲治]; et al. (26 August 2013). 平成25年度第3回文字情報検討サブワーキンググループ議事録 [Meeting Minutes of the Third Character Information Examination Sub-Working Group of 2013 (Heisei 25)] (PDF). Information Technology Promotion Agency, Government of Japan (in Japanese). p. 2. Retrieved 6 July 2020. 文字鏡研究会の関係者にヒアリングしたところ、オランダから提案されたWG2 N36981には文字鏡のフォントが使用されているが、文字鏡研究会は関与しておらず、提案内容についても疑問があるとのことであった。[According to an interview with a representative of the Mojikyō Institute, a Mojikyō font is used in WG2 N36981 proposed by the Netherlands, but the Mojikyō Institute itself is not involved with the proposal; it furthermore has doubts about some of the content of that proposal.]
^ ^a ^b Suzuki, Toshiya [鈴木俊哉] (30 July 2009). 統合漢字に申請された「殷周金文集成引得」図形文字の調査 [Investigation on Glyphs collected from "Index to Collection of Inscriptions of the Yin-Zhou Period" to submit to CJK Unified Ideographs]. IPSJ SIG Technical Report (in Japanese). 2009-DD-72 (7). Information Processing Society of Japan: 2 – via Internet Archive. しかし、拡張Cの標準化作業が8年の長期にわたり、また事後的に用例が必須とされたため、正式に公布された拡張C漢字の典拠は当初の典拠とはかなり異なるものとなっている。たとえば日本では当初は文字鏡研究会によって選定された1000文字程度の漢字を申請していた[。] [...] 典拠用例確認は文字鏡とは独立に行われたため、字形が文字鏡漢字から変更されたものも多い。[As the standardization effort for CJK Unified Ideographs Extension C has been eight long years in the making and examples of kanji have been requested after their encoding, the officially promulgated Extension C kanji standard is quite different from the original standard. For example, we, the Government of Japan, initially applied for about 1,000 kanji selected by the Mojikyō Institute[.] [...] Since the verification of the kanji was performed independently of the Mojikyō Institute, the character shapes were often changed from Mojikyō's version of that same codepoint.]
^ Ishikawa, Tadahisa (25 May 1999). "パソコン悠悠漢字術今昔文字鏡徹底活用" [Kanji on your PC, Made Easy—The Complete Mojikyō Manual]. Mojikyō Institute. Retrieved 6 July 2020.
^ MJ文字情報一覧表 [Table of MJ Character Encodings] (in Japanese). Information Technology Promotion Agency. Archived from the original on 29 September 2018. Retrieved 5 July 2020.
^ "Unicode Standard Annex #45: U-source Ideographs". The Unicode Standard. Unicode Consortium.
^ "Appendix E: Han Unification History" (PDF). The Unicode Standard. Unicode Consortium. March 2020.
^ "CJK Extension C1 From Japan". Ideographic Rapporteur Group. IRG#19 N895 – via The Chinese University of Hong Kong's Department of Computer Science and Engineering. N895-Japan_C1
^ Cook, Richard (9 May 2007). Proposal to encode Tangut characters in UCS Plane 1 (PDF). UC Berkeley Script Encoding Initiative. p. 4. L2/07-143 – via Unicode Consortium.
^ Jenkins, John H.; Cook, Richard; Lunde, Ken, eds. (5 March 2020), "kIRG JSource", Unicode Standard Annex #38, Unicode Consortium
^ Kobayashi, Tatsuo (3 December 2001). "List of Japanese Ideographs which may be proposed in Extension-C". ISO/IEC JTC1/SC2/WG2/IRG N853.
^ ^a ^b Ken Lunde [@ken_lunde] (6 July 2020). "In particular, all 782 JK-prefixed ideographs are indeed from 今昔文字鏡 per IRG N862. Most were encoded in #ExtensionC, and the stragglers were encoded in #ExtensionE." (Tweet). Retrieved 6 July 2020 – via Twitter.
^ Ken Lunde [@ken_lunde] (7 July 2020). "JK-prefixed J-Source ideographs came from 今昔文字鏡, which are in Extensions C and E (the mention of Extension D was simply that what became Extension E was originally targeted to become Extension D)" (Tweet). Archived from the original on 7 July 2020. Retrieved 6 July 2020 – via Twitter.
^ Ken Lunde [@ken_lunde] (7 July 2020). "367 JK-prefixed ideographs are in Extension C, and the remaining 415 are in Extension E." (Tweet). Retrieved 6 July 2020 – via Twitter.
^ "終戦宣言" [Announcement: The War is Over]. 青蛙亭漢語塾 [Seiwatei's Kanji Cram School] (in Japanese) (28 January 2016 ed.). 21 July 2010. Retrieved 7 July 2020.
^ "データ・記事のライセンス" [License of our data and articles]. GlyphWiki (9 June 2010 ed.). Retrieved 6 July 2020. 今昔文字鏡およびその関連製品、データは、そのライセンス上グリフウィキには用いることができません。文字鏡番号（独自部分）および文字鏡のフォントに収録されているグリフそのもの、およびそれを参照、利用して作成していると判断できる情報は、グリフウィキに登録する際の典拠とすることはできませんので、ご協力をお願いいたします。 [Konjaku Mojikyō and related products and associated data are licensed in such a way that they are incompatible with our above GlyphWiki license. Neither the number of the Mojikyō encoding slot, nor the appearance of the glyph itself in Mojikyō's fonts, nor any information whatsoever that can be judged to have been gathered by referring to a Mojikyō product, can be used when entering data into GlyphWiki. We absolutely cannot accept Mojikyō data. Please cooperate with us.]

Notes

^ As yet, lacks a Unicode encoding, so is approximated here with CSS and U+30BB セ KATAKANA LETTER SE.
^ ^a ^b For Korean, Hanja are referred to. For Vietnamese, Chữ Nôm.
^ Download the file MojikyoCmap400ALL49TTF.7z from the official website
^ English name from the title of the window produced by running the executable; Japanese name from the icon of the executable.
^ Also called the "Mojikyō Cmap".
^ See the screenshots on the official website
^ Into the system fonts directory C:\Windows\Fonts.
^ As of 2019, the IRG rebranded as the Ideographic Research Group.
^ The history of the encoding of the Tangut script is quite complicated; see Tangut (Unicode block) § History for a full listing of all the related proposals and a timeline.
^ Ideographic Description Sequence: ⿰魚嵐
^ This is a column name in the Unihan database; ⟨J⟩ here is short for "Japanese glyph source". The full name of the column is kIRG_JSource. Under Han unification, there are nine such sources. See §3.1 of UAX#38 for a complete list and more information.
^ Other J-Source prefixes exist, such as J4, meaning the character originates from JIS X 0213:2004.
^ That is to say, a glyph made up of the same radicals in the same positions.
^ ^a ^b Errors in large collections of ideographs are, of course, not uncommon. Such errors even accidentally occur in well-funded government-produced collections, such as the famous kanji from unknown sources in the Japanese Industrial Standards Committee's JIS X 0208 double-byte character encoding standard. All of these JIS X 0208 error kanji (Ghost characters, 幽霊文字; e.g., 彁) have made their way into Unicode despite not being "real" kanji.
^ For proof, see the list in the Mojikyō Character Map, MOCHRMAP.EXE.
^ See also: fictitious entry; trap street.

External links

Official website


