|last updated Tue Nov-01-2005 12:45|
CJK Compatibility Database
The Library of Congress has posted a CJK Compatibility Database on the CPSO home page to help CJK catalogers quickly and conveniently replace non-MARC21 characters with their MARC21 equivalents or a missing character symbol. Go to
Non-MARC21 Characters And Their MARC21 Equivalents
The Library of Congress database will soon be upgraded to Unicode compatibility. RLG's Union Catalog and OCLC's WorldCat databases are now also Unicode compatible. Chinese, Japanese and Korean (CJK) scripts are input into these systems using Microsoft input method editors (IMEs).
The Unicode character set includes several hundred duplicate CJK characters (for example, the compatibility ideograph F937 and 路, 8DEF), as well as many others that represent close variants (for example, 步, 6B65, and 歩, 6B69). Generally, one of these variants is a MARC21 character, while the other is not.
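The difference between a duplicate and a close variant can be checked with Unicode normalization. The following is a minimal sketch using Python's standard `unicodedata` module; the function name `canonical_equivalent` is hypothetical and is not part of any Library of Congress tool. It only shows how Unicode itself relates the two pairs above. It does not determine which form is the MARC21 character.

```python
import unicodedata


def canonical_equivalent(ch: str) -> str:
    """Return the NFC-normalized form of a single character."""
    return unicodedata.normalize("NFC", ch)


# Duplicate pair: F937 is a CJK compatibility ideograph whose
# canonical decomposition is the unified ideograph 8DEF.
duplicate = "\uF937"
unified = "\u8DEF"
print(canonical_equivalent(duplicate) == unified)

# Close-variant pair: 6B69 and 6B65 are separate unified ideographs,
# so normalization leaves 6B69 unchanged.
variant_a = "\u6B69"
variant_b = "\u6B65"
print(canonical_equivalent(variant_a) == variant_b)
```

Normalization collapses the duplicate pair but not the variant pair, so close variants can only be matched through an explicit list of equivalents.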
Only MARC21 characters can be displayed in USMARC records. However, the most logical way to create a character with a Microsoft IME sometimes produces a non-MARC21 character. For example, if one creates the common character 李 by keying in one reading and converting it in the Korean IME, the result is a non-MARC21 character (F9E1). One must key in a different reading and convert it to create the valid MARC21 form, 李, 674E.
The character 歩, 6B69, is created with the Japanese IME. However, this Japanese form is not a valid MARC21 character. The valid MARC21 equivalent, 步, 6B65, can be created only with the Korean or Chinese IME.
Only MARC21 characters can be displayed properly in a MARC21 bibliographic record. Therefore, a non-MARC21 character in a bibliographic record must be replaced by its MARC21 equivalent.
The CJK Compatibility Database
The CJK Compatibility Database includes more than 450 non-MARC21 Chinese, Japanese and Korean characters, Hangul syllables and diacritic marks, matched with their MARC21 equivalents. The characters in the database were initially identified by LC staff and were supplemented by entries from a similar database at Yale University. Characters that do not have a MARC21 equivalent are matched with the missing character symbol.
The database is intended to enable catalogers to quickly and conveniently replace a non-MARC21 character with its MARC21 equivalent. It is also possible to view all of the non-MARC21 characters in the database, along with their MARC21 equivalents, the Unicode value for each character, and other information that may be helpful in identifying the characters and describing how the MARC21 character may be input.
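The replacement the database supports can be pictured as a simple lookup table. The sketch below is illustrative only: the names `EQUIVALENTS`, `MISSING`, `replace_non_marc21` and `describe` are hypothetical, the table holds just the three pairs mentioned on this page, and the use of U+3013 as the missing character symbol is an assumption, not something stated here.

```python
# Assumption: U+3013 stands in for the missing character symbol.
MISSING = "\u3013"

# Non-MARC21 character -> MARC21 equivalent (the three pairs cited on this page).
EQUIVALENTS = {
    "\uF937": "\u8DEF",
    "\uF9E1": "\u674E",
    "\u6B69": "\u6B65",
}


def replace_non_marc21(text: str, table: dict = EQUIVALENTS) -> str:
    """Replace every character found in the table; leave all others untouched."""
    return "".join(table.get(ch, ch) for ch in text)


def describe(table: dict = EQUIVALENTS) -> list:
    """List each pair by Unicode value, e.g. 'F937 -> 8DEF'."""
    return [f"{ord(src):04X} -> {ord(dst):04X}" for src, dst in table.items()]


print(describe())
```

A character with no MARC21 equivalent would be entered in the same table with `MISSING` as its value, so one lookup covers both cases.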
Updating The Database
The database is a cooperative undertaking, and is intended for the use of all CJK catalogers. If, in the course of your work, you encounter a non-MARC21 character that is not listed in this database, please report it to us so that it can be added to the database. Notify:
Young Ki Lee, Senior Cataloging Specialist
Library of Congress