The structure of an HTML 3.2 document

Writing a structured document does not mean that you are writing in a straitjacket. It only means you have to lay out the document in advance. It also means the document becomes easier to read, maintain and extend. While this may not seem too important if you just want a homepage, when you have a whole site to maintain, well-structured documents make life a lot easier!

It is also important to note that HTML uses the ISO-8859-1 character set. Apart from the entities defined in the Wilbur draft, the characters from this list are the only ones you should use. Other characters are not guaranteed to show up at all in a browser, let alone show up as the character you're hoping for.

Every HTML 3.2 compliant document should look basically as follows. (Note: the line numbers are only here for the explanation below.)

     1. <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
     2. <HTML>
     3. <HEAD>
     4. <TITLE>The title of the document</TITLE>
     5. <META NAME="description" CONTENT="This is a document">
     6. <LINK REV="made" HREF="mailto:galactus at">
     7. </HEAD>
     8. <BODY>
     9. ... document body
    10. </BODY>
    11. </HTML>

1. DOCTYPE
This is a so-called DOCTYPE declaration. It is used by SGML tools to detect what kind of document is being processed. If your document adheres to the Wilbur standard, the above is what it should look like. If your document is HTML 2.0 compliant, its DOCTYPE is
    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
Some HTML editors like to include an arbitrary DOCTYPE declaration in your documents, even when it is not correct. Note in particular that any DOCTYPE for HTML 3.0 is not an "official" declaration, since that proposal expired a long time ago.

2. HTML
This tag goes around the entire document. Basically, it states that the rest is all HTML, as opposed to some other language which may use tags within < and > brackets. In theory, it can also be used by servers to see that the document they want to send is actually HTML and not plain text.
However, this is almost never done (usually for performance reasons).

3. HEAD
The head of your document contains information about the document itself. Nothing within the HEAD section should be displayed in the document window. The head section must include the TITLE of the document. It can optionally contain things like a description, a list of keywords for search engines, and the name of the program used to create the HTML document. The HEAD tag itself is optional: if you put all the information about the document at the top and all body tags below it, a parser can tell where the head ends and the body begins.

4. TITLE
The TITLE tag is the only required tag in the head section. It is typically displayed in the browser's window title bar, and used in bookmark files and search engine result listings. For the last two situations, you should make sure the title describes the document - "Homepage" or "Index" doesn't say much in a bookmark file.

5. META
META tags provide "meta information" about the document. For example, they can give a description of the document, indicate when the document expires, or name the program that was used to generate it. There are many possible META constructs, so please read the section on meta tags in the list of HTML tags. This particular META tag provides a description of the document, which is used by search engines such as Alta Vista and Infoseek.

6. LINK
A LINK tag provides information about the document relative to the rest of the site. For example, you can have a LINK tag stating where the table of contents is, what the next document is or where the style sheet can be found. This particular LINK tag gives the address of the document's author. Some browsers (most notably Lynx) allow you to send a comment to this person with one keystroke if this tag is defined.

9. BODY
The BODY of the document contains the actual information. There may be only one BODY statement in the document.
Some editors incorrectly insert another BODY statement for each new attribute you want to add to the body, but this can have unexpected side-effects (such as some of the attributes getting ignored completely).
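
The points above can be put together in one small document. This is only a sketch: the title, the file names (toc.html, menu.html), the keywords and the colour values are made up for illustration and do not come from this guide. It shows a descriptive TITLE using an ISO-8859-1 entity, a description and keywords in META tags, LINK tags pointing at other documents in the site, and a single BODY tag that carries all of its attributes at once.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<!-- A descriptive title; &eacute; is an entity for an ISO-8859-1 character -->
<TITLE>Caf&eacute; Wilbur: opening hours and menu</TITLE>
<!-- Hypothetical description and keywords for search engines -->
<META NAME="description" CONTENT="Opening hours and menu of an example cafe">
<META NAME="keywords" CONTENT="cafe, menu, opening hours">
<!-- Hypothetical related documents in the same site -->
<LINK REL="contents" HREF="toc.html">
<LINK REL="next" HREF="menu.html">
</HEAD>
<!-- One BODY tag with every attribute, not one BODY tag per attribute -->
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#800080">
<H1>Caf&eacute; Wilbur</H1>
<P>... document body</P>
</BODY>
</HTML>
```

If a second attribute is needed later, it is added to the existing BODY tag rather than placed in a new one.
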
Designing structured content for your HTML document is an art in itself. I won't go into it too deeply here. Initially, use only the six headers to set up the structure of the document, adding lists, tables and other block elements until the general layout of the document is finished. Then begin filling in the blocks, marking up the text with the appropriate text-level elements. Images are very important, but as the IMG tag is a text-level tag, it should be placed inside a block-level tag. Often a document will be part of a set, so it will use a common style. This style should specify a standard structure for documents, including navigation aids and standard images. Writing a template is then a very handy thing. The WDG's Style guide for online hypertext discusses this in more detail.
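
A template along these lines might look like the sketch below. It is one possible layout, not a prescribed one: the headings, the file names (logo.gif, index.html, toc.html) and the image dimensions are placeholders. The headers come first to fix the structure, block elements fill it in, the IMG sits inside a P, and a shared navigation aid closes every page in the set.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<TITLE>Site name: page topic</TITLE>
</HEAD>
<BODY>
<!-- Standard image for the set; IMG is text-level, so it goes inside a P -->
<P><IMG SRC="logo.gif" ALT="Site logo" WIDTH="100" HEIGHT="40"></P>

<!-- Headers set up the structure of the document -->
<H1>Page topic</H1>

<H2>First section</H2>
<!-- Block elements, filled in with text-level markup -->
<P>An introductory paragraph with <EM>emphasised</EM> text.</P>
<UL>
<LI>First point
<LI>Second point
</UL>

<H2>Second section</H2>
<P>...</P>

<!-- Navigation aid repeated on every page in the set -->
<HR>
<P><A HREF="index.html">Home</A> | <A HREF="toc.html">Contents</A></P>
</BODY>
</HTML>
```

Copying this file for each new page and replacing only the title, headers and blocks keeps the whole set consistent.
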
Copyright (c) 1997 Arnoud "Galactus" Engelfriet.