Re: Problem in publishing multilingual HTML document on web in UTF-8 encoding
From: Philip TAYLOR <P.Taylor@Rhul.Ac.Uk>
Date: Fri, 02 Jun 2006 21:31:06 +0100
Message-ID: <4480A00A.50802@Rhul.Ac.Uk>
To: "आशीष शुक्ला \"Wah Java !!\"" <wahjava@gmail.com>
CC: W3C HTML Mailing List <www-html@w3.org>

आशीष शुक्ला "Wah Java !!" wrote:

> If UA (user agent), finds a "Content-Type" in <meta> tag in HTML document,
> it should use that to identify the document's character encoding,
> because it is a part of the document. The server's reply should only
> be considered when document doesn't explicitly states its character
> encoding.

Much as I think your argument has merit, I cannot see how you can resolve the following paradox: suppose, in some as-yet unknown encoding (say, ISO-9999-9), the character positions which in ISO-8859-1 correspond to the letters "M", "E", "T" and "A" correspond instead to the letters "B", "O", "D" and "Y".

Now the server says that the document is in ISO-8859-1, so when the UA sees

<META http-equiv="content-type" content="text/html; charset=iso-9999-9">

it interprets the META directive as you would wish. But in so doing, it starts to parse the document on the basis of it being expressed in ISO-9999-9, whereupon it discovers that there wasn't a META directive at all, there was, rather, a(n ill-formed) BODY tag. But because it now knows there /was/ no META directive, it parses using ISO-8859-1. But that means there IS a META directive. And so on.

I'm sure you see the problem ...

Philip Taylor

Received on Friday, 2 June 2006 20:30:32 GMT
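
The loop described in the message can be traced with a small sketch. Everything here is an assumption made for illustration: "iso-9999-9" is the imaginary encoding from the message, modelled as ISO-8859-1 with the bytes for "M", "E", "T", "A" read as "B", "O", "D", "Y", and the names decode, declared_charset and trace are hypothetical helpers, not how any real user agent detects encodings.

```python
import re

# Imaginary encoding from the message: identical to ISO-8859-1, except that
# the byte values of "M", "E", "T", "A" decode as "B", "O", "D", "Y".
SWAP = str.maketrans("META", "BODY")

# Deliberately naive: only recognises an upper-case <META ... charset=...>.
META_RE = re.compile(r"<META[^>]*charset=([\w-]+)")


def decode(raw, charset):
    """Decode raw bytes under the named charset (only two are modelled)."""
    text = raw.decode("latin-1")
    return text.translate(SWAP) if charset == "iso-9999-9" else text


def declared_charset(text):
    """Return the charset named in a META directive, or None if none is seen."""
    match = META_RE.search(text)
    return match.group(1) if match else None


def trace(raw, server_charset, max_rounds=4):
    """Follow the 'META overrides the server' rule and list each charset tried."""
    charset = server_charset
    history = [charset]
    for _ in range(max_rounds):
        # No META directive visible: fall back to what the server said.
        wanted = declared_charset(decode(raw, charset)) or server_charset
        if wanted == charset:
            break  # the decoding agrees with itself, so detection is stable
        charset = wanted
        history.append(charset)
    return history


RAW = b'<META http-equiv="content-type" content="text/html; charset=iso-9999-9">'

for name in trace(RAW, "iso-8859-1"):
    print(name)
```

Under these assumptions the printed list alternates between iso-8859-1 and iso-9999-9 until max_rounds cuts it off, which is the "And so on" of the message. A document whose META directive names the charset the server already announced settles after the first pass.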