IUC 32
Hotel cut-off: 09/30/2009

Venue:
Hilton San Jose
300 Almaden Blvd.
San Jose, CA 95110

 

 

Program - Session Descriptions


 

Wednesday, October 14, 2009

09:00-12:30  MORNING TUTORIALS

Presenter:

Richard Ishida
Internationalization Lead
W3C


Track 1: An Introduction to Writing Systems & Unicode
The tutorial will provide you with a good understanding of the many unique characteristics of non-Latin writing systems, and illustrate the problems involved in implementing such scripts in products. It does not provide detailed coding advice, but does provide the essential background information you need to understand the fundamental issues related to Unicode deployment, across a wide range of scripts. It has also proved to be an excellent orientation for newcomers to the conference, providing the background needed to assist understanding of the other talks! The tutorial goes beyond encoding issues to discuss characteristics related to input of ideographs, combining characters, context-dependent shape variation, text direction, vowel signs, ligatures, punctuation, wrapping and editing, font issues, sorting and indexing, keyboards, and more. The concepts are introduced through the use of examples from Chinese, Japanese, Korean, Arabic, Hebrew, Thai, Hindi/Tamil, Russian and Greek. While the tutorial is perfectly accessible to beginners, it has also attracted very good reviews from people at an intermediate and advanced level, due to the breadth of scripts discussed. No prior knowledge is needed.

Presenter:

Addison Phillips
Globalization Architect
Lab126 (Amazon)

Track 2: Internationalization: An Introduction, Part I: Characters and Character Encodings
What is internationalization? What do developers, product managers, or quality engineers need to know about it? How does a software development organization incorporate internationalization into the design, implementation, and delivery of an application?

This tutorial track provides an introduction to the topics of internationalization, localization and globalization. Attendees will understand the overall concepts and approach necessary to analyze a product for internationalization issues, develop a design or approach, and deliver a global-ready solution. The focus is on architectural approaches and general concepts, but will include specific examples and exercises.

Part I focuses on characters, character encodings, and the basics of Unicode.


Presenter:

Elizabeth Pyatt
Instructional Designer
Penn State

Track 3: Building a Custom Keyboard Layout for the Mac with Ukelele and XML
Building custom keyboards can be a useful timesaver if you work with an unusual range of characters across a large number of documents. The tutorial will describe how to create a custom keyboard layout on the Mac OS X platform using the freeware Ukelele tool from SIL plus modifications to the XML file. Although the main example will be a keyboard built for symbolic logic characters, the tutorial will cover how to create keyboards for many foreign languages.
   
10:30-10:45 - Morning Refreshments
   
Presenter:

Addison Phillips
Globalization Architect
Lab126 (Amazon)

Track 2: Internationalization: An Introduction, Part II: Writing Global-Ready Code
Part II focuses on preparing for the localization (translation) of user interfaces; making applications “locale-aware”, including format and display differences; as well as approaches to delivering multi-lingual and multi-locale software or content.

Presenter:

Thomas Milo
President
DecoType

Track 3: Arabic Script: Structure, Geographic and Regional Classification
A new tutorial about Arabic script (including Arabic script for dummies, structural analysis, typology, stylistic geography, technical and aesthetic aspects, language-dependent preferences within calligraphic styles, and extra attention to orthographies east of Iraq), against the background of the development of a brand-new Nastaliq typeface that covers the Unicode characters for all languages that require this Persian-derived style.
   
12:30-13:30 - LUNCH
   
13:30-15:30  AFTERNOON TUTORIALS

Presenters:

Craig Cummings
Mike McKenna
Internationalization Architects 
Yahoo! Inc.


Track 1 - Unicode - A Grand Tour
This tutorial will cover the next level of detail of what Unicode is, and how it is used in the real world. The modules of the tutorial will cover:
  • The Unicode standard - what are the "Guiding Lights", or design principles behind Unicode? A tour of Unicode's structure, encoding forms, behavior, technical reports, database, and how to use the Unicode Standard.
  • Implementation according to Unicode - a walk through the details of attributes, compatibility, non-spacing characters, directionality, normalization, graphemes, complex scripts, surrogates, collation, regular expressions and other aspects according to the Unicode Standard and associated Technical Reports.
  • Unicode and the Real World - an overview of International Components for Unicode (ICU) and implementations supporting Unicode in web servers, application servers, browsers, C/C++, Java, PHP, SQL, and various operating systems.
  • On-going programs - how Unicode is evolving to support more minority scripts, languages, and help solve linguistic processing issues.

Presenter:

Tex Texin
Xen Master
XenCraft

Track 2 - Web Internationalization - Standards and Best Practices
This tutorial is an introduction to internationalization on the World Wide Web. The audience will learn about the standards that provide for global interoperability and come away with an understanding of how to work with multilingual data on the Web. Character representation and the Unicode-based Reference Processing Model are described in detail. HTML, XHTML, XML (eXtensible Markup Language; for general markup), and CSS (Cascading Style Sheets; for styling information) are given particular emphasis. The tutorial addresses language identification and selection, character encoding models and negotiation, text presentation features, and more. The design and implementation of multilingual Web sites and localization considerations are also introduced.

Presenter:

Jim DeLaHunt
Principal
Jim DeLaHunt & Associates

Track 3 - Building Multilingual Websites in Joomla [Drupal]
A practical look at the language and locale capabilities of Joomla! and Drupal, two leading free software content management systems (CMSs). They let you build more powerful, more international websites faster. We look at: their core services for internationalization and locale support; localization of UI and content; and localization support in some leading modules. You will leave with specific tips for building your own site. We don't assume Joomla or Drupal experience, but do include material for advanced practitioners. A good tutorial for web site product managers, for web designers and developers, and for managers of international web site teams.

15:30-15:45 - Afternoon Refreshments
15:45-17:45  AFTERNOON TUTORIALS

Track 1 - Unicode - A Grand Tour (Cont'd.)

Presenter:

Richard Ishida
Internationalization Lead
W3C

 

Track 2 - Creating XHTML/HTML Pages with Right-to-Left Scripts
This short tutorial explains how to go about creating XHTML and HTML pages containing text written in the Arabic or Hebrew scripts. The tutorial examines how best to achieve the correct effect for these bi-directional scripts using appropriate markup, CSS properties and Unicode code points or entities. It covers the basics, and goes beyond to provide recommended techniques for some of the tricky situations that even native speakers can struggle with. The tutorial assumes a basic familiarity with the bi-directional characteristics of Arabic and Hebrew, as well as a basic knowledge of HTML and CSS.

Presenter:

Behdad Esfahbod
Software Developer
Red Hat/GNOME

Track 3 - Free Software Stack for Unicode Text Rendering
The Free Software world has a lot to offer when it comes to building a stack from the ground up. Be it building an ARM-based Linux mobile platform, cross-platform text rendering, or rendering downloadable CFF fonts on Windows, the Free Software stack provides all the bits and pieces one needs to assemble a high-quality OpenType-based Unicode text rendering pipeline with great flexibility. In this tutorial we will go over the building blocks involved and how to put them together.

18:00-19:00 - Welcome Reception hosted by Adobe Systems

 

Thursday, October 15, 2009

09:00-09:15 WELCOME & OPENING REMARKS


09:15-10:00

Nicholas Ostler
Chairman
Foundation for Endangered Languages

KEYNOTE Presentation: The Alphabetic Principle and its Enemies
The alphabetic principle for writing seems brilliantly simple, and its implementation, often subverting other options, has often caused explosive growths in literacy, with important historical consequences for cultural survival. Its great advantages are economy of effort in the learner, and ready application to new languages. However, it has drawbacks as to speed for the initiated user, and also (by being essentially mechanical and phonetic) in representing many of the cultural overtones which people like their written language to have. There is, too, a certain resistance to the role of art in writing. But as alphabetic traditions age, becoming less purely alphabetic, these disadvantages can be reduced. New structures may emerge, meaningful patterns that leave alphabets far behind. Alphabetic scripts have more recently revealed new aspects, defining a convenient order to index anything, inspiring the phonemic principle of structural linguistics, and later mapping more easily than other systems onto digital systems, and hence a whole new set of functions for written language. But the alphabet remains a rather arbitrary means of representing meanings, since its icons are parasitic on the particular sounds of particular words in particular languages, a long way from thoughts.
10:00-20:00 -  EXHIBIT AREA OPEN
10:00-10:30 - Morning Refreshments in Exhibit Area
10:30-11:20  SESSION 1

Presenter:

Kirti Velankar
Senior Software Engineer
Yahoo! Inc.


Track 1 - Internationalization with PHP
PHP is one of the most prominent and popular platforms for modern Web development. This updated session discusses PHP from the perspective of internationalization, what some of the challenges in PHP are, the features available in PHP 5, and the promise of Unicode in PHP 6.

This session also includes examples and usage in practical scenarios. You will learn how to effectively build applications for multiple languages and cultures using PHP with some of the new internationalization features such as locales, sorting, resource bundles, as well as date, number and message formatting.


Presenter:

Ken Lunde
Senior Computer Scientist
Adobe

Track 2 - Designing & Developing Pan-CJK Fonts for Today
Designing and developing Pan-CJK fonts, meaning fonts whose CJK Unified Ideographs can serve more than a single CJK locale, region, or culture, is both challenging and time-consuming. But, like most things that require effort, there are great rewards: smaller overall font footprint, design consistency across locales, and so on. In developing such fonts, there are challenges related to the actual design of the glyphs, which transcend any font format concerns. This presentation pinpoints specific design and implementation problems that developers of such fonts will face, and then details workable solutions. A prototype Pan-CJK font will be demonstrated during the presentation.

Presenter:

Mark Davis
Sr. Internationalization Architect
Google Inc.

Track 3 - Unicode Update: Unicode 5.2 and CLDR 1.7
The 5.2 version of Unicode (Fall 09) adds many new characters, new properties, and fixes to existing properties, and is being issued as a complete online book. CLDR 1.7 (Spring 09) contains over 21% more locale data than the previous release, with over 40,000 new or modified data items from over 140 different contributors, including Adobe, Apple, Google, IBM, and Sun, plus official representatives from a number of countries.

This presentation, from the president and co-founder of the Unicode Consortium, covers the new features of both standards, examples of the impact on companies such as Google, and future directions for these and other globalization standards -- the new emoji characters, international domain names, Unicode security, and others.


11:30-12:20  SESSION 2

Presenter:

Martin Duerst
Aoyama Gakuin University


Track 1 - Internationalization in Ruby 1.9
Ruby is a purely object-oriented scripting language which is easy to learn for beginners and highly appreciated by experts for its productivity and depth. Internationalization of Ruby made a big leap forwards when this January, Ruby 1.9.1, the first stable release of the Ruby 1.9 series, was released. While previous versions of Ruby mostly treated text data as byte sequences, strings in Ruby 1.9 are sequences of characters. Because Ruby tags each string with encoding information internally, different applications can choose different internationalization models.

The presentation will give a short overview of Ruby as a programming language, and introduce the new internationalization features in detail. We will be concentrating on how to use Ruby with Unicode, which in Ruby's case means UTF-8. We will also discuss internationalization support in Ruby on Rails, the popular Web application framework written in Ruby.


Presenter:

Kamal Mansour
Manager of Non-Latin Products
Monotype Imaging

Track 2 - Unicode & Fonts: a status report
The adoption of Unicode as the universal character code standard has profoundly changed the computing landscape. We now expect to be able to exchange multilingual text documents across platforms and software applications. Since its inception, Unicode has cautiously distanced itself from the process of displaying glyphs, delegating it to an external “rendering layer” that includes fonts. Alongside Unicode, the OpenType Standard has enabled new levels of sophistication in fonts. However, one is often disappointed when a particular font doesn’t work as it should. We will give a brief overview of what works today and what we can expect in the future.

Presenters:

Deborah Anderson
Project Leader, Script Encoding Initiative
Department of Linguistics, UC Berkeley
Richard Cook
Post-Doctoral Researcher, Dept. of Linguistics
UC Berkeley
Charles Riley
Catalog Librarian for African Languages
Yale University
Anshuman Pandey
C.Phil. History
University of Michigan

Track 3 - Patching Holes in the Unicode Pipeline: A Status Report on the Unencoded Scripts of Asia and Africa
In 2002, 96 scripts listed on the Unicode Pipeline were unencoded. Today, the number is considerably smaller. Currently about 25 scripts from Asia and Africa remain unencoded, but they present particular challenges: many are not well-known and will involve considerable research to acquire materials and to track down experts. This session will be made up of 3 speakers who have worked on South Asian and African script proposals. They will discuss the work that remains to be done and highlight specific issues for implementers.

12:30-13:30 - LUNCH
13:30-14:20  SESSION 3

Presenter:

Norbert Lindenberg
Internationalization Architect
Yahoo! Inc.


Track 1 - Internationalization for JavaScript Applications
JavaScript, as defined by the ECMAScript standard and implemented in browsers, is a rather weak platform for internationalized web applications. Several toolkits have attempted to fill the gap in different ways, ranging from reliance on existing server-side internationalization libraries to implementing the functionality in JavaScript itself. This presentation surveys the landscape and compares the different solutions.

Presenter:

Ken Lunde
Senior Computer Scientist
Adobe

Track 2 - The Design & Development of Fully Proportional Japanese Fonts
Japanese fonts have traditionally been designed on the principle that each glyph occupies a fixed design space. Some fonts have overcome this principle by providing alternate metrics, which really amount to pseudo-proportional metrics. It is possible to develop Japanese fonts whereby each glyph has proportional metrics by default, in both horizontal and vertical writing directions. In addition to the obvious design challenges, there are also several technical hurdles related to implementing the typeface design as an OpenType font. This presentation details the unique design aspects of Kazuraki, a fully proportional Japanese font, along with details about its OpenType implementation.

Presenter:

Martin Duerst
Aoyama Gakuin University

Track 3 - Update on Internationalized Domain Names and Internationalized Resource Identifiers
In domain names such as www.unicode.org, only a limited number of characters are allowed. This limitation also applies to Uniform Resource Identifiers (URIs) such as http://www.unicode.org. Internationalized Domain Names (IDNs) and Internationalized Resource Identifiers (IRIs) changed this a few years ago, both allowing a wide range of characters from the Unicode repertoire. The specifications underlying these technologies are currently facing an overhaul, major for IDNs and minor for IRIs. The long-overdue and now imminent introduction of the first international top-level domain names will mean that the importance of IDNs and IRIs will significantly increase in the near future.

The presentation will give a general overview of IDNs and IRIs and discuss the current revisions of the specifications in detail. For IDNs, the set of allowed characters is defined using an inclusion-based model rather than the earlier exclusion-based model. Fixed tables are replaced by a property-based selection process to avoid fixing the specification to a single version of Unicode. The mapping step (dealing with casing and normalization, among other things) is moved out of the core libraries and closer to the user to allow adaptations for special cases and reduce user surprises. The IRI specification is being extended with descriptions of widely used variants for handling characters that, strictly speaking, are not allowed in IRIs. Both specifications are affected by bug fixes to bidirectionality restrictions.


14:30-15:20  SESSION 4

Presenter:

Umesh Nair
Software Engineer
Google Inc.


Track 1 - Implementing International Calendars in JavaScript
Conversion routines between the Gregorian calendar and non-Gregorian calendars involve complex floating point computations, large lookup tables and calendar-specific computations. Floating point operations impact performance and accuracy, while lookup tables impact memory footprint and download time. Calendar-specific computations require special algorithms and data structures. Implementing such algorithms efficiently with compact data structures is essential for the successful deployment of online calendars for the international audience. This presentation discusses several such techniques for calendrical calculations in client-side JavaScript. The techniques described here are applicable to a number of other areas in internationalization as well as general software usage with JavaScript.

Presenter:

Thomas Milo
President
DecoType

Track 2 - The Unicode-based Koran: a Conflict Between Calligraphic Tradition and Computer Typography

A technical talk about the practical problems encountered in the project to produce a Unicode-based Koran at the behest of the Omani Ministry of Awqaf and Religious Affairs. The focus is on the discrepancies discovered between the age-old calligraphic tradition and the 1924 revision of the Koran. The pivotal issues will be identified and explained. A workable solution will be presented.


Presenters:

Mark Davis
Sr. Internationalization Architect
Google Inc.

Addison Phillips
Globalization Architect
Lab126 (Amazon)

Track 3 - Language Identification and Usage
In 2006, the IETF issued an updated version of BCP 47 "Tags for Identifying Languages", which updated the way languages are identified in most computer programs and protocols. The latest version of BCP 47 (2009) incorporates over 7,000 new languages and many other improvements. This presentation, from the authors of the updated and previous RFCs, covers:
  • the format of language tags and the language subtag registry
  • the matching algorithms for comparing language tags to user preferences
  • plus distance-based algorithms
  • the new features in BCP 47 and their impact on developers

and how BCP 47 is being used in:

  • Unicode locales (CLDR)
  • prominent open-source libraries such as ICU
  • companies such as Google and Amazon

15:20-16:00 - Afternoon Refreshments in Exhibit Area
16:00-16:50  SESSION 5

Presenters:

Steven Loomis
Software Engineer
IBM
Markus Scherer
Unicode Software Engineer
Google Inc.


Track 1 - What's New with ICU
The International Components for Unicode library, or ICU, provides a full range of services for Unicode enablement, and is the globalization foundation used by many software packages and operating systems. Freely available as open-source, it provides cross-platform C, C++ and Java APIs, with a thread-safe programming model. This presentation will provide a brief overview of ICU, with emphasis on the current status of ICU (4.2), including the latest support for Unicode 5.1 and CLDR 1.7, and an update on ICU’s planned direction for 4.4 and future releases.

Presenters:

Michael Manca
Project Manager and Solution Quality Analyst
IT Flex Services
Intel Corporation
Tomas Galicia
Solutions Quality Analyst
IT Flex Services
Intel Corporation
Loic Dufresne de Virel
Localization Strategist
IT Flex Services
Intel Corporation

Track 2 - A Systematic Approach to I18N Testing
Building on last year's presentation "We're World-Ready, What Does This Really Mean?", Intel's localization experts will present and discuss the steps they follow, the tools they use, and their overall I18N testing philosophy. They will explain in detail how they proceed when working with development teams to ensure applications are properly internationalized before they're released or localized. Based on recent I18N testing efforts conducted by Intel, this interactive session will provide a solid framework of reference for I18N testing, as well as valuable pointers that can be easily and directly applied to your own localization projects or reused within your organization.

Presenter:

Toshiya Suzuki
Research Assistant
Hiroshima University

Track 3 - Investigation of Opaque Glyphs Synthesized from Old Hanzi
After seven years of effort, ISO/IEC 10646:2008 has finally included CJK Unified Ideographs Extension C. It has 366 glyphs taken from "Index to Collections of the Inscriptions in Yin-Zhou period" (I2CIYZ), proposed by the PRC, and more glyphs are scheduled for the future Extension E project. They are suspected to be glyphs invented only for the specification of Old Hanzi. In this report, the source is investigated and compared with existing dictionaries for Bronze scripts. The requirements for some glyph shapes are questionable, and the expected procedure to standardize these opaque glyphs is discussed.

   
   
17:00-17:50  SESSION 6

Presenter:

Behdad Esfahbod
Software Developer
Red Hat/GNOME


Track 1 - HarfBuzz, the Free and Open OpenType Shaping Engine
In this session we will introduce HarfBuzz, the unified Free Software and Open Source, OpenType-based, text shaping engine. We will discuss design considerations, technical decisions made, and performance and other features that make HarfBuzz an attractive alternative to the existing OpenType engines.

HarfBuzz is already being used by both GNOME and KDE desktop environments and is at the heart of the GTK+ and Qt desktop and mobile platforms, with others planning to use it in the coming months, including Mozilla Firefox, OpenOffice.org, and ICU Layout.


Presenters:

Andrew Swerdlow
Internationalization Tech Program Mng
Google Inc.
Manish Bhargava
Google Inc.
Jens Riegelsberger
Google Inc.
Laura Cuozzo
Google Inc.

Track 2 - Google Internationalization Quality Control Framework
There are many obstacles to a great international user experience. There is a range of issues that cut across organizational boundaries, such as localization, internationalization, visual design, interaction design, business analysis, usability analysis, and market research. Against this backdrop we at Google started experimenting with a standardized review framework that relies on a global network of external evaluators. These evaluators live in the target markets and thus are familiar with local standards and practices. This framework allows us to identify themes that may point to requirements that are common across multiple regions, aiding in prioritizing features or giving resources to projects.

Presenter:

Murray Sargent III
Partner Software Design Engineer
Microsoft

Track 3 - Math Editing and Display in Microsoft Office
This session describes math editing in Microsoft Office 2010, which uses math context menus, a math ribbon, keyboard navigation, and formula autobuildup. The math typography is similar to TeX’s, the input methods are state of the art, the math character set is Unicode’s, and the environment is Office’s, which comes with the many features one expects from a leading office suite. Demonstrations will be given using Office 2010.
   
18:00-20:00 -  IUC32 CONFERENCE RECEPTION (IN EXHIBIT AREA)


Friday, October 16, 2009

09:00-09:50  SESSION 7

Presenter:

Douglas Davidson
Software Engineer
Apple, Inc.


Track 1 - International Features of Mac OS X Snow Leopard
From its inception, Mac OS X has been designed with top-to-bottom international and multilingual support. The latest version, Mac OS X 10.6 Snow Leopard, expands on that with new bidirectional input support, multilingual spellchecking, and many other new features. This session covers the international capabilities of Mac OS X from both a user and a developer perspective, with a particular emphasis on new features in Snow Leopard. Topics covered include localization, locale data, text input, text display, proofing tools, and user customization.

Presenter:

Brent Ramerth
Software Engineer
Apple, Inc.

Track 2 - International Features of iPhone OS
The iPhone OS platform starts with the internationalization architecture fundamental to Mac OS X, and adds a unique virtual keyboard and text input system that handles a wide array of languages. This session covers the international capabilities of the platform from both a user and a developer perspective, with particular attention to iPhone-specific features. Topics covered include localization, text display, and text input.

Presenter:

Elizabeth Pyatt
Instructional Designer
Penn State

Track 3 - Practical "Unicode Logic" for Online Tech Courses
This session describes some of the challenges and workarounds for implementing Unicode content in two online courses in symbolic logic and thermodynamics. Topics include development utilities, templates and guidance for students, issues with multiple applications and font selection across platforms. The presentation will also discuss some differences between implementing Unicode for math courses and Unicode for foreign language courses.
   
   
10:00-10:50  SESSION 8

Presenter:

Derek Murnam
Senior Program Manager
Microsoft Corporation


Track 1 - Windows 7: Writing World-Ready Applications
This session centers on the new globalization features for Windows 7, including sorting and string comparison, locale support, and coverage for new languages, with an eye to helping developers extend their applications to a global user base. In addition to introducing the Extended Linguistic Services API, this session will also cover the Multilingual User Interface (MUI) resource technology available in Windows 7. This session will provide an end-to-end look at how to make your application world-ready so that you can easily take your application worldwide and extend your customer base into new language markets.

Presenters:

Markus Scherer
Unicode Software Engineer
Google Inc.
Katsuhiko Momoi
Staff Test Engineer & I18n Consultant
Google Inc.
Mark Davis
Sr. Internationalization Architect
Google Inc.

Track 2 - Emoji in Unicode: Cell Phones Meet the Internet
"Emoji" symbols or "picture characters" are used in email by more than 80 million Japanese cell phone users. They are treated as characters, via vendor-specific extensions of the Japanese character sets. Other email providers have to be able to exchange emails with the Japanese cell phone companies without losing or corrupting data. Most email providers use Unicode, requiring conversion of mail data to/from Unicode. Unicode Private Use characters are used for this purpose. However, they do not provide for reliable public interchange. For a permanent solution, the Unicode Consortium has approved the addition of the Emoji symbols to Unicode 6.0, and is working with ISO to ensure inclusion in the corresponding version of ISO 10646. This paper presents the state and progress of the Unicode encoding proposal with an overview of the Emoji symbols.

Presenter:

Adam Asnes
President
Lingoport, Inc.

Track 3 - Creating an I18n Project Plan
Many initial internationalization scoping efforts focus on creating findings documents. But often the real trick is gathering accurate metrics and turning them into realistic, budgetable and actionable project plans. In this presentation we will demonstrate how we assess source code and architecture, and then review a detailed project plan and how we arrived at tasks, durations and staffing.
   
10:50-11:10 - Morning Refreshments
   
11:10-12:00  SESSION 9

Presenter:

Mihai Nita
Globalization Architect
Adobe Systems, Inc.


Track 1 - Accessing Globalization Services on Multiple Operating Systems
This presentation will cover the experience gained by implementing a cross-platform C library that makes use of operating-system-dependent services for language- and region-specific functionality. In contrast to ICU, which carries its own set of locale data, this solution provides a cross-platform set of APIs but uses the facilities provided by the operating system. This presentation will explore the pros and cons of such an approach, trade-offs, implementation issues, major traps, and some of the surprises we encountered.

Presenters:

Loic Dufresne de Virel
Localization Strategist
IT Flex Services
Intel Corporation
Michael Kuperstein
Senior Localization Engineer
IT Flex Services
Intel Corporation
Margie Foster
Localization Project Manager
Moblin Project
Intel Corporation

Track 2 - Taking Moblin to the World
When the Moblin project asked for our help to localize their application, our initial reaction was enthusiastic! "Finally a cool open-source project to work on", we thought! After coming to our senses, we realized that localizing Moblin (Moblin stands for Mobile Linux) was not our typical localization project... Far from it! In this session, we will review the thought process we followed to define and limit the scope of this significant undertaking, give an update on the current status of this on-going project, explain how we addressed the first major challenges of this amazing journey, and provide an overview of the first-ever attempt at community-based translation by Intel's localization team.

Presenter:

Cindy Conlin
Senior Engineer
The Church of Jesus Christ of Latter-day Saints

Track 3 - Building a Global Names System: A Case Study
This case study discusses our experience building a global names application containing records for all members of the LDS Church worldwide. We'll discuss the interesting challenges and requirements we face, such as building a data structure flexible enough to accommodate names from multiple cultures simultaneously. We'll talk about using ICU's transliteration functionality to generate romanizations of non-Latin names, and about our experience supporting private-use characters in Chinese names. We'll also discuss how we've created a user interface that allows users from multiple locales to work with data that originated in many other locales.
   
12:00-13:00 - LUNCH
   
13:00-13:50  SESSION 10

Presenter:

Sumit Sarkar
i18n Product Specialist
DataDirect Technologies


Track 1 - Internationalization in Database Drivers for C/C++/Java/.NET Applications
Everything you want to know about i18n and database drivers across the C/C++/Java/.NET programming languages. The discussion starts by asking what Unicode support encompasses at the database access API level, and what components affect Unicode support. We then take a closer look under the covers at low-level data access across major RDBMSs, including DB2, SQL Server, Oracle, and Sybase. This includes identifying which component performs the conversions at each layer of the data access application. To summarize and apply the learned concepts, the host will answer key questions about your globalized application's data access: Why should conversions be avoided when possible? What high-level features of a database driver are recommended?

Moderators:

Steven Loomis
Software Engineer
IBM
Mark Davis
Sr. Internationalization Architect
Google Inc.

Track 2 - Deploying the Common Locale Data Repository (CLDR)
The Common Locale Data Repository is a project for the exchange of language and locale information used in application development, and to gather, store, and make such data publicly available. By pooling resources, the time and expense of collecting good data is minimized, and language groups have an avenue to get their data into implementations. This session will discuss implementation of CLDR, the latest project status, and how the process is being improved to produce higher-quality data. Ample time will be given for comments and questions from the audience.

Presenter:

Chris Weber
Casaba Security

Track 3 - Unicode Transformations and Security Vulnerabilities
Web applications are being exploited every day as attackers find new vectors for performing cross-site scripting attacks. This talk will cover ways in which latent character and string handling can transform clever inputs into malicious outputs. Many application frameworks such as .NET and ICU enable these behaviors without the developer's knowledge. String transformations through best-fit mappings, casing operations, normalization, over-consumption and other means will be discussed, with inputs useful for testing. A testing tool is also planned for release.

The current state of visual spoofing attacks will also be discussed. Phishing attacks are prevalent on the Web, and well-designed URLs can increase an attack's chance of success. It's eye-opening to see demonstrations of just how vulnerable modern Web browsers still are to many forms of visual spoofing attacks.

   
14:00-14:50  SESSION 11

Presenter:

Su Liu
AIX Globalization Architect
IBM


Track 1 - Unicode Technology and Globalization Support in IBM UNIX, AIX
AIX, an IBM UNIX, supports more than 60 languages and about 250 locales. Unicode is a key technology for supporting globalization features that meet different national language requirements. This presentation discusses the impact of Unicode on globalization strategy and mechanisms at the UNIX operating system level. It first focuses on how Unicode technologies are used to simplify globalization configurations. It then covers the impact of Unicode on system performance, locale data testing, and national language support procedures. Examples are given to show Unicode support for complex text, CJK input methods, Unicode conversions, and automated tests. A further look into Unicode highlights customization topics such as user-defined locale settings and user-defined Unicode conversion tables. Finally, issues in implementations, market requirements and solutions for future Unicode support in UNIX are assessed.

Presenters:

Benedicto Franco Jr.
Software Engineer
Yahoo! Inc.
Marco Aurelio Carvalho
Senior Software Engineer
Yahoo! Inc.

Track 2 - CLDR on the Cloud
The value of CLDR (Common Locale Data Repository) for global applications is undeniable. But how do you update time zone and daylight saving rules, or a new currency, or geo-political changes that might be relevant for the application without taking the inherent risks and costs of a release deployment process? In this presentation, we are going to talk about a solution that exposes CLDR as a service and how CLDR on the Cloud can be used to help create robust internationalized JavaScript and Ajax applications fed by CLDR data published in JSON format ubiquitously.

Presenter:

Tex Texin
Xen Master
XenCraft

Track 3 - My Unicode Disk Storage Went into the Circular File
This session will present some of the difficulties of providing a common international interface to file services on different operating systems. Although Unicode supports all the necessary characters, identifying the set of characters that are legitimate on any OS can be difficult, and rules for case-insensitivity, normalization, etc. vary, and may even vary by user. The presentation will describe the problem space. It may offer possible solutions.
   
14:50 – 15:10 - Afternoon Refreshments
   
15:10 - 16:00  SESSION 12

Presenters:

Ryan Cavalcante
Software Development Engineer
Microsoft
James Lyle
Program Manager
Microsoft


Track 1 - Extended Linguistic Services in Windows 7
In this presentation we will discuss the Extended Linguistic Services (ELS) platform, new to Windows 7, which provides diverse linguistic services to developers through a common API. We will discuss the linguistic services now available to developers through the ELS platform in Windows 7—Language Detection, Script Detection, and various Transliteration services—as well as the future vision for the platform.

Presenter:

Adil Allawi
Technical Director
Diwan Software Limited

Track 2 - Mashing-up Bi-Di
"Mash-up" is a relatively new, fashionable word on the Web - taking bits of other web sites to build up your own web page. It is not new or special - any search engine showing a snippet of a web site that it has found is a form of mash-up. Integrating a news or micro-blogging feed is another. And it seems that every company and their mother has its own mash-up API. But what happens when an Arabic web site integrates content that may be Arabic or English or both? The Unicode Bi-Di Algorithm can render text and numbers unreadable. URLs may become unusable or, in the worst case, direct to fraudulent sites. It can be hard to predict how to mark up the integrated content for the right result. This presentation will cover real-world issues and attempt to suggest practical solutions.

Presenter:

Jim DeLaHunt
Principal
Jim DeLaHunt & Associates

Track 3 - Twanguages of the World: a Language Census of Twitter
What "twanguage" do you "tweet"? Twitter, the buzzing conversation of brief web and SMS messages, exploded into wide use in 2009. But just how wide? To how many countries has it spread? And into which languages? We aimed to find out. Our "Twanguage" project is a language census on a sample of Twitter's global traffic. Come hear our findings. Which are the top languages? Are #hashtags localized? How does language correlate with location? And which Unicode character is the most rarely used? Accessible to everyone, this talk is especially interesting to students of social media and of quantitative language analysis.
   
16:10 - 17:00  SESSION 13

Presenters:

Frank Yung-Fong Tang
Sr. Software Engineer
Google Inc.
Wenchao Tong
Software Engineer
Google Inc.


Track 1 - Google APIs for Text Input and Translation
In this talk, we introduce several public Google APIs that empower web developers to build more powerful internationalized web sites, including, but not limited to:
  • Using the Google AJAX Language API and element API to perform machine translation
  • Using the Google AJAX Language API and element API to let users input text in different languages by transliteration
  • Using Maps in the Google Chart API and Geomap in the Google Visualization API to represent information divided by geographical distribution

For each of these topics, we will first introduce the issues, follow with a brief description of the API, and then demonstrate some real Google and non-Google products that use these APIs. Short code samples will also be walked through.


Presenter:

Roozbeh Pournader
Internationalization Specialist
HighTech Passport

Track 2 - Bidirectionalization: Demystifying Bidi Enabling
Bidirectionalization, or enabling software to be usable by people who write in bidirectional languages like Arabic and Hebrew, has sometimes been dismissed as a superfluous and strenuous endeavor. This presentation will explain why bidi enabling is a must for every application and website intended for bidirectional users of the Middle East, as well as for other parts of Asia and Africa. It will also include suggestions on how to plan for, design, code, and test the bidirectionalization of such applications and sites. Last but not least, it will cover common internationalization requirements for the Middle East, including alternative calendars, local digits, and geopolitical sensitivities.

The intended audience of this presentation is developers, software architects, and managers planning to bidirectionalize their software or add support for other requirements of the bidirectional language markets.


Presenter:

Ilya Shtein
IT Architect
Metavante

Track 3 - Banking in the Cloud: Challenges of Internationalizing Banking Software (Case Study)
Based on the experience of building the Metavante Global Banking platform, we will discuss the challenges of internationalization in a distributed, service-oriented, heterogeneous banking environment.

Internationalization in the banking industry presents a number of challenges, such as the large number of legacy applications that do not share the same terminology and the need for further terminology customization on multiple hierarchy levels, as well as transactions spanning multiple locales and time zones.

We will talk about the applicability of Unicode and Unicode standards in different architecture layers, using W3C-i18n recommendations, and discuss the effect the listed challenges have on internationalization decisions.


Program is subject to change.

Object Management Group (OMG) organizes the Internationalization and Unicode Conferences around the world under an exclusive license granted by the Unicode Consortium. Personal information provided to OMG via this website is subject to OMG’s Privacy Policy. All responsibility for conference finances and operations is borne by OMG. The independent conference board provides technical review of the program and papers. All inquiries regarding the Internationalization and Unicode Conferences should be addressed to info@unicodeconference.org. Copyright © 2009 Object Management Group. All Rights Reserved.
 
