Re: Problem in publishing multilingual HTML document on web in UTF-8 encoding

From: Sebastian Redl <sebastian.redl@getdesigned.at>
Date: Sat, 03 Jun 2006 15:31:00 +0200
Message-ID: <44818F14.5020608@getdesigned.at>
To: www-html@w3.org

Philip TAYLOR wrote:

> That certainly addresses my paradox issue, but seems to suggest (to me)
> that a single document may actually use two (or perhaps more) character
> sets, one which obtains up to the point of the META element, and another
> thereafter.  If this were not the case, the parenthesis "(at least until
> the META element is parsed)" would appear to be redundant.

Not true. To me, this suggests that the whole construct is only valid in
character sets that have ASCII as a direct subset, such as UTF-8 and
ISO-8859-*, and that characters outside the ASCII range may appear only
after the meta element. It is not valid, however, to change the character
set completely with the meta element.
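
To illustrate what "ASCII as a direct subset" means at the byte level, here is a minimal Python sketch. The helper name `is_ascii_superset` is hypothetical, and the check only compares how the 128 ASCII characters are encoded; it is not a test taken from any spec.

```python
# All 128 ASCII characters as text, used as the probe string.
ASCII_TEXT = bytes(range(128)).decode("ascii")


def is_ascii_superset(encoding: str) -> bool:
    """Return True if ASCII text has the same bytes in `encoding` as in ASCII.

    Hypothetical helper for illustration: when this holds, a parser can read
    the markup up to the meta element without knowing the final encoding.
    """
    return ASCII_TEXT.encode(encoding) == ASCII_TEXT.encode("ascii")


for name in ("utf-8", "iso-8859-1", "utf-16"):
    print(name, is_ascii_superset(name))
```

For UTF-8 and ISO-8859-1 the check succeeds; for UTF-16 it fails, because every character takes two bytes, so the bytes of the meta element itself cannot be read as ASCII.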

> Is this at the heart of Ian's ("Hixie"s) example :
>
>     <meta http-equiv="content-type" content="text/html; charset=utf-16">
>
> was everything prior to and including this in UTF-8, and everything
> thereafter in UTF-16 ?

As I understand it, unless the character encoding is signalled in some
other way (the Content-Type HTTP header or a similar mechanism), such
code is invalid, except for UTF-16 and UTF-8, provided that the same
support requirement applies to HTML as applies to XML. (I'm not that
familiar with the HTML spec.)
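
A rough Python sketch of that order of precedence follows. The function name `sniff_encoding`, the 1024-byte scan window, and the regular expression are my own assumptions for illustration; this is not the algorithm from the HTML or XML spec.

```python
import codecs
import re

# Matches charset=... inside a meta tag, scanning the raw bytes as ASCII.
_META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE
)


def sniff_encoding(data, http_charset=None):
    """Guess the declared encoding of an HTML byte string (illustrative only)."""
    # 1. An external signal such as the Content-Type header wins.
    if http_charset:
        return http_charset.lower()
    # 2. A byte order mark identifies UTF-8 or UTF-16 without any markup.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # 3. Otherwise look for the meta element, which only works if the
    #    markup so far is byte-compatible with ASCII.
    match = _META_CHARSET.search(data[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    return None
```

With no header and no byte order mark, a page that is really encoded in UTF-16 returns `None` here: the bytes of the meta element are interleaved with zero bytes, so the ASCII scan never finds it.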

Sebastian Redl
Received on Saturday, 3 June 2006 13:30:55 GMT