BEIJING, March 28 (Xinhua) -- Chinese scientists have succeeded in
digitalizing all Chinese characters with a four-byte coding technology, enabling
ancient texts full of rare characters to be printed.
Wang Hongyuan, the inventor of the coding technology, said on Tuesday that
it will not only help people type and search all Chinese characters, but also
solve problems caused by rare characters or words in people's daily lives.
Taking bank service as an example, Wang said, "If a man's name contained rare
characters, he had difficulty in setting up a deposit account since the computer
system of the bank did not have a sophisticated enough coding method to
recognize his name."
With the four-byte coding technology, people can easily type 70,000
characters on any computer that has the coordinated database installed, Wang
said, adding that the original two-byte coding could handle only 20,000
characters.
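The arithmetic behind the byte counts can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not describe how Wang's coding assigns codes, so the GB18030 standard, which Python ships as a codec, is used here as a publicly available stand-in for a scheme that spends four bytes on rare characters. The helper name max_codes is hypothetical.

```python
# Upper bound on the number of distinct codes a fixed-width
# encoding of num_bytes bytes can represent (hypothetical helper).
def max_codes(num_bytes: int) -> int:
    return 256 ** num_bytes

TARGET = 70_000  # character count quoted in the article

# Two bytes allow at most 65,536 codes, fewer than 70,000.
two_byte_fits = max_codes(2) >= TARGET
# Four bytes allow far more than 70,000 codes.
four_byte_fits = max_codes(4) >= TARGET

# Assumption: GB18030 stands in for a four-byte scheme; it is not
# claimed to be the coding described in the article.
common = "\u4e2d"      # a common character
rare = "\U00020000"    # a rare character from CJK Extension B
common_len = len(common.encode("gb18030"))
rare_len = len(rare.encode("gb18030"))

print(max_codes(2), two_byte_fits, four_byte_fits)
print(common_len, rare_len)
```

The two-byte ceiling is a theoretical maximum. Practical two-byte encodings reserve parts of the range, which is consistent with the lower figure of 20,000 characters cited above.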
Statistics show that 60 million Chinese people out of a population of 1.3
billion have rare characters in their names.
Wang said that although some printing methods for rare characters have been
invented, no database had previously included the format, spelling,
pronunciation and source of the characters.
Feng Zheng, an expert in Chinese language with the Beijing-based Capital
Normal University, said that research on the Chinese language has also been
hampered by a lack of digitalized reading materials.
Generally speaking, about one in every 1,000 characters in an ancient book
is too rare to be printed with two-byte coding. As a result, many ancient
books have no digitalized version that is open to all researchers.
Wang Hongyuan said the database based on four-byte coding comprises 13
categories with millions of records, covering almost all ancient Chinese
dictionaries, unearthed documents and files.
The Kangxi Dictionary, a famous Chinese dictionary compiled during the
reign of Kangxi Emperor of the Qing Dynasty (1644-1911), is now being
published with the help of four-byte coding. The dictionary is best
known for including the largest number of rare characters in the Chinese language.
"Apart from its own meaning, one character also embodies the culture and
history of the user," Feng said. "We should better preserve and protect our
Chinese characters by using advanced technology."
According to Wang, 20 patent applications have been filed for the four-byte
coding and the coordinated database, which have been on trial in more than
100 Chinese and foreign universities. In the long run, the database will be
used to design digitalized textbooks on characters for Chinese primary and
middle schools.
At present, 1.5 billion people use the Chinese language. The number of
Chinese learners worldwide has reached 30 million. Enditem