

Re: dtds/xhtml, huh?


Author Pier Fumagalli <pier at apache dot org>
Date 2002-06-07 18:26:01 PDT
Message From: todd fahrner <tfahrner at collab dot net>
> I can't parse "they are NOT html, or at least that's how my browser
> thinks they are". Does Pier's funk start before or after he views
> source? <g>

Let me explain why I think that your "HTML" is not "HTML"... First of all,
what you did WORKS, it actually does, no problems with any browser
whatsoever, nada.

But "literally" speaking (just to go down into the spec a little), this is
how I interpret the DOCTYPE "pseudo-element" in a document. Let's assume:

1) I am reading the content of one of your files from a file called "test"
(not test.xml, or test.html, or anything that would associate the content
with a given "type")

2) I am not being served the file through an HTTP connection, therefore I
don't have any information on the MIME type of the file either (no
text/html, no text/xml, no nothing)...

All I can see, looking at the document, is that it begins with this:

<?xml version="1.0"?>

Good. Since this is there (and no matter what encoding you want to use in
your document), I can read that and figure out that this is an XML document.
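The check above can be sketched in a few lines of Python. This is only a rough illustration: the function name `looks_like_xml` is made up for this example, and it handles ASCII-compatible encodings only (a UTF-16 file would need extra work), so treat it as a sketch rather than a real detector.

```python
def looks_like_xml(data: bytes) -> bool:
    """Guess whether raw bytes are XML, using nothing but the content.

    No file extension, no MIME type: we only look for the XML
    declaration at the very start of the data.
    Assumption: the encoding is ASCII-compatible (e.g. UTF-8, ISO-8859-1).
    """
    # Skip an optional UTF-8 byte order mark.
    if data.startswith(b"\xef\xbb\xbf"):
        data = data[3:]
    # The XML declaration, when present, must be the first thing in the file.
    return data.startswith(b"<?xml")


print(looks_like_xml(b'<?xml version="1.0"?>\n<html/>'))
print(looks_like_xml(b"<html><body>hi</body></html>"))
```

Note that a document without the declaration can still be XML; the sketch only covers the case where the declaration is there.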

Now, this tells me how your document is "structured", not what's inside it.
Now that we have a way to "read" the document and its structure (it's XML),
we can go ahead and try to figure out what it contains. To do so, we check
out the DOCTYPE "pseudo-element":

<!DOCTYPE html PUBLIC "-//CollabNet//DTD XHTML 1.0 Transitional//EN"

Good, we got one, and to be picky, let's say what it means:

<!DOCTYPE html:
    OK, cool, we know that this document has a root element called "html",
and that it should be valid against the DTD associated with this DOCTYPE.

PUBLIC "-//CollabNet...":
    The PUBLIC identifier of the DTD is something like "-//CollabNet...",
humanly interpretable as "it's something CollabNet did".

(this is implied after PUBLIC) SYSTEM "http://..."
    Great, I can find the formal definition of the document at this URL...
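To make the breakdown concrete, here is a minimal Python sketch that pulls the three pieces out of a DOCTYPE and splits the public identifier on "//" to get at the owner. The names `parse_doctype` and `DOCTYPE_RE` are invented for this example, the regular expression only covers the simple PUBLIC form with double-quoted identifiers, and the system URL in the sample is a placeholder, not the real one.

```python
import re

# Matches only the simple form: <!DOCTYPE root PUBLIC "public-id" "system-id">
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+(?P<root>[^\s>]+)\s+PUBLIC\s+'
    r'"(?P<public>[^"]*)"\s+"(?P<system>[^"]*)"\s*>'
)


def parse_doctype(text):
    """Return root element name, public id, system id and owner, or None."""
    match = DOCTYPE_RE.search(text)
    if match is None:
        return None
    public = match.group("public")
    # Public identifiers of this style use "//" as the field separator:
    # registration mark, owner, description, language.
    fields = public.split("//")
    owner = fields[1] if len(fields) > 1 else None
    return {
        "root": match.group("root"),
        "public": public,
        "system": match.group("system"),
        "owner": owner,
    }


# The system identifier below is a hypothetical placeholder URL.
sample = (
    '<!DOCTYPE html PUBLIC "-//CollabNet//DTD XHTML 1.0 Transitional//EN"\n'
    '    "http://example.invalid/placeholder.dtd">'
)
info = parse_doctype(sample)
print(info["root"], "|", info["owner"])
```

All the parser learns is a root element name, an owner string, and a place to fetch the DTD. Nothing in those three values says what the elements mean.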

So far, inside here, there's absolutely NOTHING which tells me that YOUR
document type (which just happens to have a root element called "html")
is something "related" to the W3C HTML standard... Nothing...

All I can see, from your collection, is that your document is importing some
entity definitions from the W3C HTML specification (just the &nbsp; entity
and such, for example)...

But who the hell TELLS ME that your <p> means "paragraph" as in HTML (and
so must be rendered in a browser with its own default style), rather than
"problem" (for example), which has a completely different semantic meaning
(and therefore we're not guaranteed it's going to render the same way your
"paragraph" does)?

"Semantically" reading your documents, there is NOTHING that tells me that
your document _IS_ something "comparable" to the W3C HTML, nada, zero. And
although browsers are not as picky as I am (I'm a big son of a B*** most of
the time), they are just "guessing" or "being lucky" when they render your
"<p>" as a "paragraph"...

Some browsers are looser, such as IE. Have you ever tried to serve an HTML
file over an HTTP connection, forcing the Content-Type to "text/plain" but
leaving either no extension or a .html extension? IE is just going to
forget about the Content-Type header and do what HE thinks is right: render
it as HTML when it's only plain text. I had this problem when I wrote a
document (no extension, content type text/plain) beginning with:


This how you should begin all your HTML documents....
[blablabla, examples of tags, but REALLY this is a plaintext document, no

Some browsers are more strict... But so far, I can't see anything in your
DTDs and documents that tells me that your <b> means the same as the W3C's
HTML <b> tag, "bold" and not "brutal" or "belligerent"...

It works, but it's far from correct IMVHO... (and no, Brian, I didn't smoke
anything weird, just my usual 25 Marlboros, same as every day)...

Have fun! :)


[Perl] combines all the worst aspects of C and Lisp: a billion of different
sublanguages in one monolithic executable. It combines the power of C with
the readability of PostScript. [Jamie Zawinski - DNA Lounge - San Francisco]

