Re: more xhtml 2.0 comments

From: Michael Day <mikeday@yeslogic.com>
Date: Fri, 18 Apr 2003 12:00:38 +1000 (EST)
To: Ernest Cline <ernestcline@mindspring.com>
Cc: www-html@w3.org, <Donna.Worby@dardni.gov.uk>
Message-ID: <Pine.LNX.4.44.0304181155270.25201-100000@lorien.yeslogic.com>


Hi Ernest,

> I strongly doubt that an 'Mc' character will ever be part of Unicode.
> The Unicode view is that 'Mc' is what the standard refers to as a
> grapheme, and as such it should be encoded as two characters 'M' and
> 'c'.  Existing multi-letter characters, such as 'Dz', were included in
> Unicode only because they existed in pre-Unicode character sets, to
> facilitate conversion between those character sets and Unicode on a
> character-for-character basis.

That's interesting. So, given that "Mc" is rendered differently and
collated differently from the sequence of two characters "M" and "c", how
should this be handled?

Is it in fact an issue of script/language, in the same way that Spanish 
collates the character combinations "ll" and "ch" differently?
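
A minimal sketch of what language-dependent collation of character combinations can look like, assuming the traditional Spanish ordering in which "ch" sorts as a letter after "c" and "ll" as a letter after "l". The names TRADITIONAL_SPANISH and spanish_sort_key are hypothetical, made up for illustration, and accented vowels are not handled:

```python
# Hypothetical sketch: traditional Spanish collation treats "ch" and "ll"
# as single letters that sort after "c" and "l" respectively.
TRADITIONAL_SPANISH = [
    "a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k", "l", "ll",
    "m", "n", "ñ", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z",
]
WEIGHT = {letter: i for i, letter in enumerate(TRADITIONAL_SPANISH)}


def spanish_sort_key(word):
    """Split a word into collation elements, preferring digraphs,
    and map each element to its position in the alphabet above."""
    lowered = word.lower()
    key = []
    i = 0
    while i < len(lowered):
        pair = lowered[i:i + 2]
        if len(pair) == 2 and pair in WEIGHT:
            # "ch" or "ll": one collation element made of two characters.
            key.append(WEIGHT[pair])
            i += 2
        else:
            # Characters not in the table sort after all listed letters.
            key.append(WEIGHT.get(lowered[i], len(WEIGHT)))
            i += 1
    return key


words = ["cuna", "chico", "luz", "llama", "dedo"]
print(sorted(words))                        # plain code-point order
print(sorted(words, key=spanish_sort_key))  # digraph-aware order
```

With plain code-point ordering "chico" comes before "cuna" and "llama" before "luz"; with the digraph-aware key both pairs swap, even though the text itself contains only ordinary characters.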

Presumably then if the sequence "Mc" is encountered in text with language
en-UK (or some other code?) it should be collated differently and rendered
using a superscript c or other method.
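
A rough sketch of how the collation half of this could work if it is keyed off the language of the text. It assumes one particular convention, used in some name indexes and directories, of filing a leading "Mc" as if it were spelled "Mac"; whether that is the right rule here is an assumption, not something taken from a standard. The function name_sort_key and its lang parameter are hypothetical, and rendering is not addressed at all:

```python
def name_sort_key(name, lang="en"):
    """Return a sort key for a name.

    Assumption for illustration: in English-language text, a leading "Mc"
    followed by a capital letter is filed as if it were spelled "Mac".
    """
    folded = name.casefold()
    primary_language = lang.split("-")[0].lower()
    if (
        primary_language == "en"
        and name.startswith("Mc")
        and len(name) > 2
        and name[2].isupper()
    ):
        folded = "mac" + folded[2:]
    return folded


names = ["Mason", "McDonald", "MacArthur", "Mabel", "Madden"]
print(sorted(names))                                  # plain code-point order
print(sorted(names, key=name_sort_key))               # "Mc" filed as "Mac"
print(sorted(names, key=lambda n: name_sort_key(n, lang="es")))  # rule not applied
```

In plain order "McDonald" lands after "Mason"; with the English key it files next to "MacArthur". The same two characters "M" and "c" are stored either way, and only the language passed alongside the text changes the result.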

Surely there must be some existing standard for this?

Michael Day

YesLogic Pty. Ltd.
Received on Thursday, 17 April 2003 20:44:26 GMT