
Re: Problem with LANG keyword

From: David Woolley <david@djwhome.demon.co.uk>
Date: Wed, 24 Sep 2003 23:05:35 +0100 (BST)
Message-Id: <200309242205.h8OM5Zm14204@djwhome.demon.co.uk>
To: www-html@w3.org

> 
> Hmmm, nice, I did not think about that. So "&#...;" should actually
> be used only for a very specific list of symbols.

&#....; always represents an ISO 10646 (loosely, Unicode) code point,
whatever character set the document itself is encoded in.

In very old versions of HTML the repertoire was the initial 256-character
subset, which is identical to ISO 8859-1.  Most of the control characters
and some other control-like characters are not allowed.  In particular,
although they are generated by certain common authoring tools, &#146; and
&#147; are control characters and are not permitted.
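
As a rough illustration of the point about &#146; and &#147;, here is a
minimal Python sketch.  The helper names (code_point_of, is_control) are
made up for this example.  It also rests on an assumption not stated
above: that the authoring tools in question meant the Windows-1252 curly
quotes that sit at bytes 146 and 147 in that code page.

```python
import re
import unicodedata

# Matches a single decimal (&#146;) or hexadecimal (&#x92;) numeric reference.
NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def code_point_of(reference: str) -> int:
    """Return the ISO 10646 code point named by a numeric reference."""
    match = NUMERIC_REF.fullmatch(reference)
    if match is None:
        raise ValueError(f"not a numeric character reference: {reference!r}")
    hex_digits, dec_digits = match.groups()
    return int(hex_digits, 16) if hex_digits else int(dec_digits)


def is_control(reference: str) -> bool:
    """True if the referenced code point is in Unicode category Cc (control)."""
    return unicodedata.category(chr(code_point_of(reference))) == "Cc"


for ref in ("&#146;", "&#147;", "&#8217;", "&#8220;"):
    cp = code_point_of(ref)
    print(f"{ref:9} U+{cp:04X} control={is_control(ref)}")

# Assumption: the tools meant the Windows-1252 *bytes* 146 and 147.
# Decoding those bytes shows which code points should have been referenced.
intended = bytes([146, 147]).decode("cp1252")
print([f"&#{ord(ch)};" for ch in intended])
```

If that assumption holds, the references the tool should have written are
the ones printed on the last line, not the byte values themselves.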

The conceptual process is:

- if the character set is in the real HTTP Content-Type header, note that;
- otherwise, if the document appears to be in 16-bit Unicode or an ASCII
  superset, scan it for a meta element giving the content type, and
  extract the character set from it;
- if neither succeeds in extracting a character set, the document is in
  error, and here the spec contradicts itself by saying that the browser
  must not use a default but suggesting that it may use heuristics
  (to me a default is a heuristic);
- translate the whole document from the character set identified above into
  ISO 10646;
- parse it, including expanding any numeric entities;
- render it;
- map the result onto platform fonts that include the appropriate
  characters, using CSS font hinting, but not so as to force a false
  encoding: specifying 5<span style="font-family: Symbol">m</span>V should
  produce five millivolts, not the five microvolts that is likely to appear
  on many browsers.  Browsers that handle other fonts correctly are likely
  to deliberately misinterpret Symbol.
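
The first five steps above can be sketched in Python roughly as follows.
This is only an illustration under stated assumptions, not how any
particular browser does it: the function names are invented, the meta
scan is a crude regular expression rather than a real parser, only the
first 2048 bytes are examined, and a missing character set is simply
treated as an error instead of applying heuristics.

```python
import re

# Crude pattern for charset=... inside a <meta ...> tag (an assumption of
# this sketch; a real user agent would tokenise the markup properly).
META_RE = re.compile(r"<meta[^>]+charset\s*=\s*[\"']?([\w.:-]+)", re.IGNORECASE)
HEADER_RE = re.compile(r"charset\s*=\s*\"?([^\s;\"]+)", re.IGNORECASE)
NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def charset_from_header(content_type):
    """Step 1: take the charset from the real HTTP Content-Type header."""
    match = HEADER_RE.search(content_type or "")
    return match.group(1) if match else None


def charset_from_meta(body: bytes):
    """Step 2: scan a 16-bit Unicode or ASCII-superset document for a meta."""
    head = body[:2048]
    if head.startswith((b"\xff\xfe", b"\xfe\xff")):  # UTF-16 byte order mark
        text = head.decode("utf-16", errors="replace")
    else:
        text = head.decode("latin-1")  # any ASCII superset is readable enough
    match = META_RE.search(text)
    return match.group(1) if match else None


def expand_numeric_references(text: str) -> str:
    """Step 5 (in part): replace &#N; and &#xN; with the code point N."""
    def replace(match):
        hex_digits, dec_digits = match.groups()
        return chr(int(hex_digits, 16) if hex_digits else int(dec_digits))
    return NUMERIC_REF.sub(replace, text)


def decode_document(content_type, body: bytes) -> str:
    charset = charset_from_header(content_type) or charset_from_meta(body)
    if charset is None:
        # Step 3: no declared character set, so the document is in error.
        raise ValueError("no character set declared")
    text = body.decode(charset)  # Step 4: translate into ISO 10646 / Unicode
    return expand_numeric_references(text)
```

Note that the numeric references are expanded only after the bytes have
been translated, so &#233; means the same character whether the document
was served as ISO 8859-1 or as UTF-8.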
Received on Wednesday, 24 September 2003 18:12:28 GMT