Unicode Covers All Major Living Languages

Unicode is a 16-bit character set designed to cover all the world's major living languages, as well as scientific symbols and dead languages that are the subject of scholarly interest. It eliminates the complexity of the multibyte character sets currently used on UNIX and Windows to support Asian languages. Unicode was created by a consortium of companies including Apple, Microsoft, HP, Digital and IBM. The consortium merged its efforts with the ISO 10646 standard to produce a single standard in 1993. Unicode is already the basis for at least one operating system: Windows NT.
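To illustrate the multibyte complexity mentioned above, here is a minimal Python sketch. It assumes Shift-JIS as a representative multibyte character set and the "utf-16-be" codec as the 16-bit form of Unicode; neither choice comes from the original text, and the two sample characters were picked because both fit in a single 16-bit value.

```python
# Sample text: Latin capital A followed by katakana A (U+30A2).
text = "A\u30a2"

# Shift-JIS (an assumed example of a multibyte set): the width of a
# character depends on which character it is.
sjis_widths = [len(ch.encode("shift_jis")) for ch in text]

# 16-bit Unicode, big-endian, no byte-order mark: both characters
# occupy the same two bytes.
utf16_widths = [len(ch.encode("utf-16-be")) for ch in text]

print(sjis_widths)   # [1, 2]
print(utf16_widths)  # [2, 2]
```

With the multibyte encoding, a program cannot find the third character of a string without scanning from the start; with fixed-width 16-bit values it can index directly.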

Because Unicode is a 16-bit character set, every character occupies the same amount of space. The first 256 values are the same as the ISO Latin-1 (ISO 8859-1) character set, which is also the basis for the ANSI character set used in Windows 3.1 and Windows 95. But Unicode goes on to define 34,168 distinct coded characters. In most character sets, a single value often has to stand for several characters. For example, in ASCII a "-" represents a hyphen, a minus sign, a dash and a non-breaking hyphen. In Unicode each meaning is given its own code. The Unicode standard contains only one instance of each character and assigns it a unique name and code value. It also supports "combining" accent characters, which follow the base character that they modify.
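The points in the paragraph above can be checked with a short Python sketch. It relies on the standard unicodedata module, which reflects whatever version of the Unicode database ships with the interpreter, so it is an illustration rather than a reproduction of the standard as described here. The particular dash characters and the "e" plus acute accent pair are examples chosen for this sketch.

```python
import unicodedata

# The first 256 code values coincide with ISO 8859-1 (Latin-1).
latin1_matches = bytes(range(256)).decode("latin-1") == "".join(
    chr(i) for i in range(256)
)

# ASCII has one "-"; Unicode gives each meaning its own code and name.
dash_like = ["\u002d", "\u2010", "\u2011", "\u2212", "\u2013", "\u2014"]
names = {f"U+{ord(ch):04X}": unicodedata.name(ch) for ch in dash_like}
for code, name in names.items():
    print(code, name)

# A combining accent follows the base character it modifies.
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT
accent_name = unicodedata.name(decomposed[1])
accent_is_combining = unicodedata.combining(decomposed[1]) != 0

# Normalization (form NFC) folds the pair into one precomposed character.
composed = unicodedata.normalize("NFC", decomposed)
print(len(decomposed), len(composed), f"U+{ord(composed):04X}")
```

The loop prints one line per character, for example "U+2212 MINUS SIGN", which shows that the hyphen, the minus sign and the dashes each carry a separate code and name.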

For more information on Unicode, visit the Unicode Web Site.
