
Colloquium - Haddon

Unicode 6.0 -- Finally There?
Bruce Haddon
Paladin Software International

This talk is the latest update to the series of talks that have covered the ongoing development of the Unicode Standard. The presenter has given three previous talks covering Unicode 3.2, Unicode 4.1, and Unicode 5.0. Over the series, it has become evident that the Standard is becoming more mature. However, Unicode 6.0 brings with it some additional theory and techniques surrounding the use of this character set and its representation.

The Unicode 6.0 Standard was published in September 2010, incorporating 109,242 graphic characters. Thus the concept of Unicode being a character set assignable in 16 bits is forever gone. However, the rate of growth is slowing, as only 2,088 new characters were added with this version of the Standard. Most of the significant changes were in the database of descriptors for the entire character set, and in the technical details of using the character set (such as the Unicode Collation Algorithm, which defines the sorting of Unicode character strings).
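As a rough illustration of the 16-bit point (a sketch added for clarity, not material from the talk), the Python snippet below compares the character count quoted above with the number of values a 16-bit unit can hold, and checks one code point above U+FFFF. The helper name `fits_in_16_bits` and the choice of U+10330 (GOTHIC LETTER AHSA) as the sample character are assumptions made for the example.

```python
# Number of distinct values a single 16-bit unit can represent.
SIXTEEN_BIT_CAPACITY = 2 ** 16  # 65,536

# Graphic character count quoted in the abstract for Unicode 6.0.
GRAPHIC_CHARACTERS_6_0 = 109_242


def fits_in_16_bits(ch: str) -> bool:
    """Hypothetical helper: True if the code point is at most U+FFFF."""
    return ord(ch) <= 0xFFFF


latin_a = "A"                # U+0041, inside the 16-bit range
gothic_ahsa = "\U00010330"   # U+10330, sample character outside that range

# The repertoire no longer fits in 16 bits.
print(GRAPHIC_CHARACTERS_6_0 > SIXTEEN_BIT_CAPACITY)
print(fits_in_16_bits(latin_a))
print(fits_in_16_bits(gothic_ahsa))
```
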

The other indicator of maturity is the widespread adoption of Unicode for Internet content and URLs, programming languages and operating systems, word processors and other layout programs, and so on. What was once an "add-on" feature in those arenas is now standard practice. Yet it remains true that many "users" are unaware of most of the complexities of such a general character set, and it is the purpose of this talk to highlight those complexities.

The talk will give a brief history of the motivation for and development of Unicode, explain its representations via transformations, and conclude with a survey of the current state of the Standard and its annexes.
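To make "representations via transformations" a little more concrete, here is a minimal sketch, assuming that phrase refers to the Unicode transformation formats UTF-8, UTF-16, and UTF-32. It encodes one short string in each format using Python's built-in codecs and reports the byte length. The function name `encoded_sizes` and the sample string are illustrative choices, not taken from the talk.

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Hypothetical helper: byte length of text in each transformation format."""
    sizes = {}
    for codec in ("utf-8", "utf-16-be", "utf-32-be"):
        encoded = text.encode(codec)
        # Each format is lossless: decoding returns the original string.
        assert encoded.decode(codec) == text
        sizes[codec] = len(encoded)
    return sizes


# 'A' (U+0041), 'é' (U+00E9), and U+10330 (outside the 16-bit range).
sample = "A\u00e9\U00010330"

for codec, size in encoded_sizes(sample).items():
    print(codec, size)
```

The same three characters occupy a different number of bytes in each format, while the underlying code points stay the same.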

Dr. Bruce K. Haddon is currently a senior engineer in software quality assurance, an aspect of software engineering in which he has been involved since first encountering computer science and software engineering. He has also long been associated with the software engineering aspects of portable software, programming language design, and internationalization and localization, and hence with the problems of representing various natural languages. These interests converged in his work as a consulting Java Architect, a role in which he has advised world-wide banks, telecommunications companies, the US Air Force, and others. This talk deals with that issue of the representation of natural languages.

Hosted by William Waite.
Haddon has provided his slides.

Department of Computer Science
University of Colorado Boulder
Boulder, CO 80309-0430 USA
May 5, 2012 (14:13)