
The World Wide Web;
or, Shameless Self-promotion

Daphne Chong and Gabriela Marcionetti with material from Piers Johnson and Eliseo d'Annunzio

I'm terrible with analogies, so I'm going to give you just the one throughout this whole chapter. If you already know what HTML is and what it does, skip the first paragraph and head for the "teach me something already, dammit" section of the chapter. But if you're unfamiliar with what HTML is, do read on.

Getting Started

HTML is the language that is used to write web pages you see on the internet. It is basically a formatting language, which can turn a plain text document into something nice and presentable, including graphics and links to other web pages, with the help of a browser such as Internet Explorer or Netscape. I would ask you to think of it in terms of using Microsoft Word's little formatting buttons on the toolbar at the top of the screen, but HTML is a little more stubborn than Word - it needs some extra work on your behalf, just like an authentic English cottage rose garden, or a complete Australia Post stamp collection which has been maintained since 1975.

A browser, by the way, is a program that will convert the HTML gobbledygook you've written into something formatted, swish and pretty so that others can admire it too. By far the most common browser around is Microsoft's Internet Explorer [insert huge debate about Netscape, Microsoft and the antitrust case right here, baby.] If you're keen on making your website browser-independent, you'll need to keep in mind that there are other browsers being used, like Netscape, Mozilla, Opera, Safari and the text-only Lynx.

Something that looks great in MSIE may not look so great in Netscape, and will definitely look even worse in Lynx if it's composed purely of graphics. Text alternatives for your graphical pages, or those that are heavily formatted, are a good idea.

But back to the HTML. What you need to be able to publish your own web site is: a) a place to store your files, and b) the HTML to format those documents or files into something presentable.

Luckily, ProgSoc members get an account on ProgSoc servers where you can store files for a web site. You can also store files on your Faculty of IT account, if you're doing a faculty course or subject at UTS. And hopefully you'll also be able to get the skills to write your own HTML for your web site from this chapter.

Now, to store your web site on your ProgSoc or IT account, you'll need to create a directory in your account called public_html. In that directory you'll need to make a file called index.html or index.htm, like so:

    [aperson@ftoomsh]$ mkdir public_html
    [aperson@ftoomsh]$ cd public_html
    [aperson@ftoomsh]$ vi index.html

It needs to be index.html because if someone types in an address such as http://www.progsoc.org/~example/, which is not a particular file but a directory, the web server will look for a file called index.html and serve that.

If you don't make an index.html or index.htm file, your visitor will be presented with a message from the server saying the file doesn't exist. Note that this doesn't happen if your visitor types in the full address, http://www.progsoc.org/~example/page.html, but human beings tend to be lazy, and try to get away with doing the least amount of work possible to achieve something.

So even though your page exists, at home.html, your visitor won't be able to see your work of art, because you've done things your own creative way.

You can save yourself this heartache by following the index.html convention for every new site you make, and for every new folder that you store your HTML files in.

But enough digressing - once you've created your index you'll need to change the permissions of the file, and of any directories that your files are stored in, to readable (r) for group and others so that everyone can view them. Directories need to be executable (x) for group and others too, so that the web server can get into them, and so do any CGI scripts. Generally,

    [example@ftoomsh]$ chmod 755 filename

(with filename replaced by the file or directory in question) should do the trick, or if you want to give everyone read and execute permissions to everything in your public_html/ directory, type

    [example@ftoomsh]$ chmod -R go+rx public_html

If you're using your FIT account, make sure your home directory is executable for group and others as well. Type

    [example@charlie]$ chmod 711 .

while in your home directory. ProgSoc home directories seem to be executable for group and others by default (in my experience) but I may be wrong. Check the permissions of your home directory by typing

    [example@ftoomsh]$ ls -ld .

while in your home directory. If all appears to be well, we're ready to start with the HTML.

Basic concepts of HTML

HTML is ever-evolving, and let me tell you there's been a hell of a lot of advancement since the last TFM was written :-) There's no possible way that all of the HTML tags known can be listed here, but there are several comprehensive indexes and tutorials on the net, which can help you if you're stuck on any concepts, or want to find a particular tag or attribute.

The HTML Compendium
http://www.htmlcompendium.org A listing of all HTML tags, and any possible attributes or events that you can use with them. A very comprehensive site.

HTML Goodies
http://www.htmlgoodies.com A good tutorial site if you get sick of my waffle. :)

Webmonkey
http://hotwired.lycos.com/webmonkey/authoring/html_basics/ Webmonkey also has excellent HTML tutorials, plus a very handy HTML reference sheet for the basic tags that you'll be using to create a site: http://hotwired.lycos.com/webmonkey/reference/html_cheatsheet/index.html

Now for the brave souls who haven't abandoned ship...

HTML tags work like brackets, and their logic is very simple. If you 'open' a tag <i> for example (which stands for italics) and write this sentence:

    The word <i>italics</i> will be italicised.

Then...well, the word italics will be italicised. Neat, huh? However, if you wrote:

    The word <i>italics will be italicised.

Then everything from the word "italics" onwards would be in italics. That's because you didn't "close" the tag, with </i>. A command that is opened with <command> should always be closed (eventually) by the </command> tag. As with all rules, there are some exceptions, but don't fret - we'll cover those later.

You can also nest tags, like so:

    The word <b><i>italics</i> will be bolded and italicised</b>.

and your tags can also have attributes, or options, which you specify inside a particular tag when you open it. For example, the <hr> tag stands for horizontal rule, and it will rule a line across your page which is the full width of your browser. But by adding just one nifty attribute, you can make it 80% the width of your browser window.

    <hr width="80%"> 

Voilà! There's no real limit to the number of attributes that you can add to a particular tag either. The aforementioned HTML Compendium has an extensive list of attributes that are available with a particular HTML tag, and there's a list of basic attributes further on in this chapter. Experiment and see what your page looks like in different browsers.
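As a rough sketch of stacking several attributes on the one tag, the rule below should come out 80% of the window wide, four pixels thick, centred, and drawn solid rather than shaded. SIZE, ALIGN and NOSHADE are ordinary <hr> attributes, but the values here are picked purely for illustration, and each browser may draw them a little differently.

```html
<!-- One tag, several attributes: width, thickness, alignment, no 3D shading -->
<hr width="80%" size="4" align="center" noshade>
```

Attributes can go in any order, and each one is separated from the next by a space.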

Basic HTML Tags

These are the basic tags that you really can't do without; almost any document needs these with varying frequency.

Formatting Text and Pages
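As a minimal sketch of how the basic tags fit together, an index.html might look something like the following. The title, heading and text are made up for illustration; the tags themselves (<HTML>, <HEAD>, <TITLE>, <BODY>, <H1>, <P>, <B>, <I>, <A> and <HR>) are the standard ones.

```html
<HTML>
<HEAD>
<!-- The title appears in the browser's title bar, not on the page -->
<TITLE>My Home Page</TITLE>
</HEAD>
<BODY>
<!-- H1 is the biggest heading; H2 to H6 get progressively smaller -->
<H1>Welcome to my home page</H1>
<P>This is a paragraph of plain text, with one word in <B>bold</B>
and another in <I>italics</I>.</P>
<!-- A link: the HREF attribute holds the address to jump to -->
<P>Here is a link to <A HREF="http://www.progsoc.org/">ProgSoc</A>.</P>
<HR>
<P>Last updated some time ago.</P>
</BODY>
</HTML>
```

Everything you want your visitor to see goes between <BODY> and </BODY>; the <HEAD> section holds information about the page.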

Attributes

Attributes go inside HTML tags. For example,

    <IMG SRC="hello.gif" BORDER="1">

In this instance the basic tag is IMG, and the attributes are SRC (source) and BORDER.

Many tags share the same attributes - BORDER can be used with <IMG> and <TABLE>; SIZE can be used with <HR> and <FONT>; and so on. The cases listed below are definitely not the only times when certain attributes will work, so feel free to play around.

Trickier Stuff

Now you have the basics, we can examine some of the nastier tags.
Lists
Lists in HTML are of several different types. They all take list items preceded by <LI>.
 

There is another list type which does not take the <LI> tag. This is the <DL> ...</DL> list, which is the definition or glossary list. Elements for this list are <DT> for the term being defined and <DD> for the definition of the term. Like the <LI> tag, these don't need a terminator (just as there's no need for </LI>); the next tag automatically closes them.
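Here is a small sketch of three list types side by side. It assumes the usual <UL> (unordered, bulleted) and <OL> (ordered, numbered) containers for the <LI> kind of list, alongside the <DL> list described above; the items themselves are made up.

```html
<!-- Unordered (bulleted) list -->
<UL>
<LI>Apples
<LI>Oranges
</UL>

<!-- Ordered (numbered) list: the browser does the numbering for you -->
<OL>
<LI>Make a public_html directory
<LI>Write index.html
</OL>

<!-- Definition (glossary) list: DT is the term, DD is its definition -->
<DL>
<DT>HTML
<DD>The language used to write web pages.
<DT>Browser
<DD>A program that displays HTML.
</DL>
```

Note that the list containers themselves (<UL>, <OL> and <DL>) do need to be closed, even though the items inside them don't.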

Tables
Every table is enclosed in <TABLE> ...</TABLE> tags. The TABLE tag has several attributes, some of which aren't supported by all browsers (anything fancy or new is a bit dubious), but the standard ones are things like HEIGHT, WIDTH, BORDER and BGCOLOR. The table is divided into rows, specified by <TR> ...</TR>, and each row is divided into cells delimited by <TD> ...</TD>. There is also <TH> ...</TH> for table header cells, and the CELLSPACING and CELLPADDING attributes, which work in much the same way as BORDER and HSPACE/VSPACE do.

Your average table might look like:

    <TABLE BORDER=0 WIDTH=200 HEIGHT=400 CELLSPACING=1>
    <TR><TD>Example text</TD>
    <TD><IMG SRC="thing.gif"></TD></TR>
    </TABLE>

Note that anything can be in a table: plain text, links, images, whatever. I could also have written that table on one line, but separating it out makes debugging easier. Believe me, it's often very difficult to find why your table isn't doing what you expect.

There are attributes for the <TD> tag as well. The attribute ROWSPAN=X determines how many rows deep a cell is. This is useful if you wish to align an image alongside several boxes of text. Another attribute is COLSPAN, which works the same way, except for columns instead of rows. You can use the ALIGN attribute, as you would for images, and there's also a VALIGN attribute which takes the values TOP, MIDDLE and BOTTOM. Another reminder: when putting images into table cells, align them using the ALIGN and VALIGN attributes of the <TD> tag, rather than those of the <IMG> tag. Otherwise, you will get unpredictable results.
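To make ROWSPAN and COLSPAN a bit more concrete, here is a sketch of an image sitting alongside two boxes of text, with a caption running underneath both columns. The text and the ALT description are invented for the example; thing.gif is the same stand-in image as before.

```html
<TABLE BORDER=1 CELLPADDING=4>
<TR>
<!-- The image cell is two rows deep, and is aligned via the TD tag -->
<TD ROWSPAN=2 ALIGN=CENTER VALIGN=MIDDLE><IMG SRC="thing.gif" ALT="[A thing]"></TD>
<TD>First box of text</TD>
</TR>
<TR>
<!-- Only one TD here: the first column is already taken by the image -->
<TD>Second box of text</TD>
</TR>
<TR>
<!-- This cell stretches across both columns -->
<TD COLSPAN=2 ALIGN=CENTER>A caption running under everything</TD>
</TR>
</TABLE>
```

BORDER=1 is handy while you're debugging, since you can see exactly where each cell ends up; set it back to 0 once the layout behaves.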

Backgrounds
When TFM was first written, the Web was still (relatively) young and exciting, and background graphics were the norm. Now, they're considered hideously unfashionable in the same way that flares were unfashionable a couple of years ago. Anyway, you still might see them around. To include them, you extend the <BODY> tag. There are two ways to do this: by specifying either a background picture or a background colour. To specify a picture, you would use the syntax <BODY BACKGROUND="thing.gif">. To specify a colour, you would use <BODY BGCOLOR="#rrggbb">, where rrggbb signifies a number (in hex) representing the RGB values of a colour. 000000 is black, ffffff is white. Other colours come between.
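As a quick sketch of how the hex number breaks down: the first two digits are the amount of red, the next two green, and the last two blue, each running from 00 (none) to ff (full). So ff0000 is pure red and 0000ff is pure blue. The example below sets both a picture and a colour; the TEXT attribute (for the colour of the text itself) is an extra not covered above, and the values are only illustrative.

```html
<!-- White page with black text; the tiled picture sits on top of the colour -->
<BODY BACKGROUND="thing.gif" BGCOLOR="#ffffff" TEXT="#000000">
<P>Black text on a white (or tiled) background.</P>
</BODY>
```

Setting BGCOLOR as well as BACKGROUND means the page still looks sensible if the picture fails to load.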

Extended Highlights

Text Effects

Having plain text throughout a Web page, although legible, can be, in a word, "boring"... I mean, let's face it, you won't get the Nobel Prize in Literature for sticking with plain text, will you? So it often helps if you use a few tags to spruce up your document a little:

Special Characters

HTML has a special system for displaying characters which HTML itself uses as markup (such as < and &), as well as foreign characters. Thus HTML is able to display languages such as French, Italian, Spanish, Portuguese, German, Swiss, and Scandinavian.

In general the syntax for the appropriate character is &(letter)(accent code);
e.g. ü is written as &uuml;

Thus typing p&acirc;t&eacute; de foie gras will show the words pâté de foie gras on the page.

The following accent codes are currently available:


Accent        Accent Code   Example   Written As
Acute         acute         ó         &oacute;
Grave         grave         ì         &igrave;
Circumflex    circ          ê         &ecirc;
Ring          ring          å         &aring;
Tilde         tilde         Ñ         &Ntilde;
Umlaut        uml           Ä         &Auml;
Slash         slash         Ø         &Oslash;
Cedilla       cedil         Ç         &Ccedil;


Other characters have different codes:


Symbol   Code        Symbol   Code
ß        &szlig;     æ        &aelig;
Æ        &AElig;     <        &lt;
>        &gt;        &        &amp;
"        &quot;      ¢        &cent;
£        &pound;



Another code can be used to display a particular ASCII character by its number, e.g. if you want an "A", for the sake of example, you would type &#65; (the hash symbol (#) must be used when calling the character up).

A full list of these codes can be found on the web.
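To see the codes at work, here is a small made-up snippet mixing accented letters, the characters HTML reserves for itself, and a numeric code:

```html
<!-- Accented letters, a pound sign and an ampersand -->
<P>p&acirc;t&eacute; de foie gras costs &pound;5 &amp; up.</P>
<!-- Showing a tag on the page instead of having the browser obey it -->
<P>To make text bold, wrap it in &lt;B&gt; and &lt;/B&gt;.</P>
<!-- A character called up by number -->
<P>The letter &#65; and the letter A are the same character.</P>
```

A browser should show the first line as "pâté de foie gras costs £5 & up.", and the second with the <B> and </B> tags visible as text rather than acted upon.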

Advanced Linking

The standard form of a link, as mentioned above, is the regular text link, e.g.:

    <A HREF="http://www.crap.com/~crap/crap.html">The Crap Page</A>

The next way is to put in a nice novelty button to click on in order to reach the link. A slight step up, but quite okay if you want to keep things simple:

    <FORM ACTION="http://www.crap.com/~crap/crap.html">
    <INPUT TYPE="submit" VALUE="Visit the Crap Page!">
    </FORM>

This code will produce a nice button with the text which you put into the VALUE attribute. Pressing this button will take you to the address in the ACTION attribute.

But if you want to go that extra step beyond, go graphic, and use a picture to link to the site:

    <A HREF="http://www.crap.com/~crap/crap.html">
    <IMG SRC="./images/turd.gif" ALT="[A turd]" BORDER=0>
    </A><I> Click the turd to reach the Crap Page!</I>

Just slip in the image with the <IMG> tag between the <A> ... </A> tags. Pretty easy, provided you have the pictures at hand. Notice that I've slipped in a bit of italics there to enhance the effect of the text. Also, notice the use of the BORDER attribute: BORDER=0 ensures you don't have a blue line around linking pictures, which looks ugly.

Playing with images

There is often a lot of confusion about the images one should use for Web pages. Talk of interlaced images, transparency, and so on just confuses the tyro. I shall attempt to clear up that confusion here. GIFs are easier to play with, but JPEGs are smaller, which means that the advantage of interlacing GIFs is somewhat offset by the longer transmission time. However, you can still make GIFs have transparent bits ...

Interlacing and Transparency for GIF images

On UNIX

There are a number of useful utilities on UNIX which allow you to make an image interlaced and transparent. My method goes something like this:

    sally:~ :1>giftoppm thing.gif > thing.ppm
    sally:~ :2>ppmtogif -interlace -transparent red thing.ppm > thing2.gif

This produces a GIF89a format image, which is interlaced and has all the pixels which were formerly red now being transparent (their transparency bit is set). Now that was easy, wasn't it?! Oh, you might have to add something to your path: ppmtogif is at /usr/X11/bin/ppmtogif.

On a Mac

You don't have ppmtogif on a Mac. You don't even have a command line on a Mac. What you might have is a programme called Graphic Converter. It's shareware, so if you don't have it, it's out there somewhere. This programme is mostly useless except for its Picture : Color and Save as... options.

To make a colour transparent, choose the Colors option from the Picture menu. From there choose the Transparent GIF color option. You will be shown a colour map, and you may select any one of the available colours to make transparent. This doesn't work as well as ppmtogif, as you may have the same colour mapped to different identifiers.

To interlace an image, choose the Save as... option from the File menu, choose GIF format (if not already selected), and open the Options... box by clicking on its button. Here you can select GIF89a format and Interlaced. Click on OK, then click on Save. You might want to rename the file as well. There's also a utility called Transparency; check it out.

On a PC

There's a Windows programme called LView Pro. The latest versions of it support both interlaced GIFs (89a) and transparent GIF colours.

There are a few other utilities out there - Gif Construction Kit allows you to make a GIF not only interlaced and transparent, but also animated. A Web search will probably give you a lot of sites dedicated to animated GIFs.

Ideal images for the Web

Simply, keep it small. Large images deter people from waiting for your page. A good way to keep file sizes down is to use fewer colours - saving a 16 colour image as 4-bit makes it much smaller than a 256 colour image, or even an 8-bit 16 colour image. It can be a good idea to warn users about the size of the file beforehand, particularly if you have a whole page of them. If you do this, it's also a good idea to tell the user what the picture is so they can decide if they want to view it or not. This is part of the developing etiquette for Web pages.
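One way to follow that etiquette, sketched below, is to put a small thumbnail on the page and link it to the big version, saying what the picture is and roughly how large the file is. The file names, dimensions and size are all made up; WIDTH and HEIGHT are standard <IMG> attributes that tell the browser how much room to leave before the image arrives.

```html
<!-- Small thumbnail on the page, linked to the full-size picture -->
<A HREF="holiday-large.jpg">
<IMG SRC="holiday-small.gif" WIDTH=100 HEIGHT=75 ALT="[Beach at sunset]" BORDER=0>
</A>
<!-- Tell the visitor what they're in for before they click -->
<I>Beach at sunset (full-size JPEG, about 250K)</I>
```

The ALT text doubles as the description for anyone using a text-only browser such as Lynx.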

An Introduction to CGI Scripts

So, what is CGI? It stands for Common Gateway Interface. CGI scripts are special programs that were designed to make our job of being both ProgSoc members and Web page authors a hell of a lot easier. CGI scripts are often used to take input or give output in one way or another: for email, locating people on accounts (a process known as fingering), showing animations, or for searches.

There are various CGI scripts located in cgi-bin directories on two of the university's servers. One such directory is on the university's main UTS server (http://www.uts.edu.au/cgi-bin/), and the other is on our very own ftoomsh (http://ftoomsh.progsoc.uts.edu.au/cgi-bin/). Here are a few examples: one for email, one for fingering other people's accounts and one for counting hits.

If you want to use the mailing script, the following code can be used:

    <A HREF="/cgi-bin/mail?to=qris@138.25.6.1">
    qris@ftoomsh.progsoc.uts.edu.au
    </A>

Another thing that needs to be mentioned is an extension to the address. If you want the email script to send you your mail with a personalised subject line, you will need to use the following code:

    <A HREF="/cgi-bin/mail?to=qris@138.25.6.1&Subject=Home+Page">
    qris@ftoomsh.progsoc.uts.edu.au
    </A>

You might note that the subject itself must be typed out with plus signs and not spaces. This syntax is necessary because URLs cannot contain spaces. The plus signs are nevertheless treated as spaces when the mail is finally sent.

There is a mailto: protocol, which can be included in your HTML document just like any other link. The syntax is, simply enough, <A HREF="mailto:user@machine">...</A>.
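A couple of hedged examples of mailto: links follow, using the same user@machine stand-in address. The second one tacks a subject line onto the address; whether the subject is actually filled in depends on the visitor's browser and mail program, so treat it as a nicety rather than something to rely on. Note that here a space is written as %20 rather than as a plus sign.

```html
<!-- Plain mailto link: clicking it opens the visitor's mail program -->
<A HREF="mailto:user@machine">Mail me</A>

<!-- With a suggested subject line; the space is written as %20 -->
<A HREF="mailto:user@machine?subject=Home%20Page">Mail me about my home page</A>
```

Unlike the CGI mail script, mailto: relies on the visitor having a mail program set up on their own machine.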

The Finger script does the same thing as the finger command on UNIX: it finds information about a particular user, and shows whether they are currently logged on. The following code can be used:

    <FORM ACTION="http://www.uts.edu.au/cgi-bin/finger">
    <INPUT TYPE="submit" VALUE="Find me!">
    <SELECT NAME="isindex">
    <OPTION> qris@ftoomsh.progsoc.uts.edu.au
    <OPTION> edannunz@acs.itd.uts.edu.au
    </SELECT>
    </FORM>

The concept of FORMs used in the above example hasn't been explained in any detail here, and I don't have enough room in this chapter to go through it (I'm still learning about it, you see). Of course, you could modify this code to finger a whole lot of other people by adding more <OPTION> tags where shown in the example. But a word of caution: as the selection is made on the email address, you must type it correctly so that the script is executed properly.

There is, of course, the almost ubiquitous counter script. Nearly every page has some version on it. There are plain text and graphical versions, but I think the text versions are better, so that users don't see You are visitor # [image] when it fails to load the picture. To use the text version, simply include this bit of code:

    <!--#exec cgi="/cgi-bin/counter" -->

This sets up a link to the page of the writers of the script, as well as saying how many times a page has been accessed. If you don't want the link, <!--#exec cgi="/cgi-bin/counter-nl" --> should fix that.

Where To Next?

The best place to find the most up-to-date information about creating web pages is on the web. There are more pages devoted to good website design than you can poke a stick at. Most importantly, if you see something you like on another site, you can always view the source and see how they did it.
