Re: Identifying (X)HTML without MIME

From: Lachlan Hunt <lachlan.hunt@iinet.net.au>
Date: Tue, 09 Nov 2004 09:23:35 +1100
Message-ID: <418FF1E7.1010609@iinet.net.au>
To: trejkaz@xaoza.net
CC: James Cerra <jfcst24_public@yahoo.com>, www-html@w3.org

Trejkaz Xaoza wrote:
> On Sun, 7 Nov 2004 09:48, James Cerra wrote:
>> What are the recommendations for
>> identifying the document's type when MIME or HTTP is 
>> not available?
> 
> If it starts with "<?xml" it's an XML document.  If it is then in the XHTML1 
> namespace, it's XHTML1.  If it's in the XHTML2 namespace, it's XHTML2.

That is not always reliable.  Hixie has explained in detail [1] the 
cases where it will not work.  Technically, the description below is 
about sniffing documents that were sent as text/html, but similar rules 
should apply where no MIME information is available.  I'd recommend you 
do as Anne already suggested and use the file extension, like Mozilla 
does.
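
As a rough illustration of the file-extension approach, here is a minimal Python sketch.  The extension table and the function name are assumptions made for this example; they are not taken from Mozilla's source, and a real implementation may use a different mapping.

```python
import os

# Hypothetical extension table, for illustration only; the mapping a
# given browser actually uses may differ.
EXTENSION_TYPES = {
    ".html": "text/html",
    ".htm": "text/html",
    ".xhtml": "application/xhtml+xml",
    ".xht": "application/xhtml+xml",
}


def guess_type_from_extension(filename, default=None):
    """Guess a MIME type from the file extension alone.

    The document content is never inspected, so the result does not
    depend on whether an XML declaration is present.
    """
    _, ext = os.path.splitext(filename)
    return EXTENSION_TYPES.get(ext.lower(), default)
```

For example, guess_type_from_extension("index.xhtml") yields "application/xhtml+xml", and an unknown extension falls back to the default.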

----
     + You can't sniff for the five characters "<?xml" because:

       - The <?xml ... ?> header is optional per Appendix C, and it is
         recommended not to include it as it causes IE6 to trigger
         quirks mode.

       - SGML can also contain PIs (see the example below).

    ...

    e.g. what language is this text/html document in?:

       <?xml this is not?>
       <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN"
           [ <!-- SYSTEM "not XHTML" --> ]>
       <!-- -- -->
         This is a comment. This document is not XHTML.
         <html xmlns="http://www.w3.org/1999/xhtml"/>
         Ok, I'm done now. -->
       <html>
        <title> Need a title in HTML4! </title>
        <p> This is a valid HTML4 document.
       </html>

  ...

  * The HTML working group said that UAs should not do this:
       http://lists.w3.org/Archives/Public/www-html/2000Sep/0024.html 
----
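
To make the two failure cases concrete, here is a small Python sketch of a naive "<?xml" sniffer applied to two inputs.  The function name and the second sample document are assumptions made up for this illustration; the first sample is the HTML 4 example quoted above.

```python
def naive_is_xml(text):
    """Naive check: treat any document that starts with '<?xml' as XML."""
    return text.startswith("<?xml")


# The valid HTML 4 document from the quoted example.  It begins with an
# SGML processing instruction that merely looks like an XML declaration.
SGML_EXAMPLE = """<?xml this is not?>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN"
    [ <!-- SYSTEM "not XHTML" --> ]>
<html>
 <title> Need a title in HTML4! </title>
 <p> This is a valid HTML4 document.
</html>
"""

# A hypothetical XHTML 1.0 document that omits the optional XML
# declaration.
XHTML_NO_DECLARATION = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
 <head><title>No declaration</title></head>
 <body><p>This is XHTML.</p></body>
</html>
"""

# False positive: the HTML 4 document is classified as XML.
sgml_result = naive_is_xml(SGML_EXAMPLE)

# False negative: the XHTML document is not classified as XML.
xhtml_result = naive_is_xml(XHTML_NO_DECLARATION)
```

The sniffer gets both inputs wrong, which is why the first five characters alone cannot settle the question.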

[1] http://hixie.ch/advocacy/xhtml 
-- 
Lachlan Hunt
http://lachy.id.au/ 
http://GetFirefox.com/     Rediscover the Web
http://SpreadFirefox.com/    Igniting the Web
Received on Monday, 8 November 2004 22:24:16 GMT