UKOLN Article for Online & CD-ROM Review

Technologies and Standards For Building Web Sites



1. Introduction

The impression given on reading popular Internet magazines is that creating a web site is easy. Mastery of a few simple HTML tags (<H1>, <P>, <IMG> and <A HREF=...>) enables a multimedia web site to be created. And for users who wish to avoid HTML tags, many "WYSIWYG" editors enable web pages to be created without any knowledge of HTML. These range from public domain or cheap shareware packages, often bundled on CD-ROMs affixed to the front covers of Internet magazines, through to popular commercial products such as Microsoft's FrontPage.

But is it really that easy? This article looks at some of the problems with this approach and outlines some possible solutions.

2. The Problems

The use of authoring tools which have a file-based approach can cause problems in the maintenance of large web sites. Authoring tools which make use of templates can help to reduce the maintenance associated with the redesign of a web site. However, many of the current generation of authoring tools still create web sites whose page contents are difficult to maintain and which suffer from performance and accessibility problems.

The current generation of mainstream authoring packages make use of HTML tables to control the appearance of web pages and make intensive use of graphics to provide design features. The use of tables can result in delays in the rendering of the display, since a table must be completely downloaded before the browser can start displaying the text. As all web users will be aware, the use of graphics also degrades the performance of web sites. For the visually impaired, the use of graphical images to present textual information can result in the user being denied access to the information.

Once a web site has been created, it has to be maintained. As experienced web site creators know, the maintenance of a web site is often more time-consuming than its initial creation. There are several aspects to the maintenance, including keeping the content up-to-date, updating the appearance and maintaining the links from the web site.

3. Solutions

Standards

The World Wide Web Consortium (W3C) is the body responsible for coordinating the development of World Wide Web standards, including developing new standards and updating existing ones. Standards which emerge from W3C are developed in accordance with W3C's roadmap, which is based on the vision for the future of Tim Berners-Lee (W3C Director and father of the Web). W3C member organisations (which include companies such as Microsoft and Netscape) are involved in the development of new standards. This process helps to ensure that W3C work is not carried out in isolation from real-world commercial and marketing considerations.

Style Sheets

W3C recommends that the appearance of HTML resources is controlled through the use of Cascading Style Sheets (CSS). The current recommendation, CSS 2.0 [1], provides a great deal of control over the appearance of HTML pages, including font characteristics and the positioning of objects.

An example of the use of CSS is shown below.


<STYLE TYPE="text/css">
BODY { 
  color: #000;
  background: #FFF;
  margin-left: 7%;
  font-family: verdana, helvetica, sans-serif;
}

H1, H2 { 
  margin-left: -6%;
  color: #900;
}

#p1 {
  margin-top: -30px;
  text-align: right;
}

#s1 {
  color: #DDD; 
  font: 100px Impact, sans-serif;
}
</STYLE>
Figure 1: Portion of a Style Sheet

The CSS fragment illustrated in Figure 1 is taken from the W3C's CSS home page [2]. It contains definitions for the appearance of the document body and of H1 and H2 elements (which are given a left margin of -6%, placing them to the left of the indented body text, and are displayed in a colour which has the code #900).

The next CSS fragments define how a paragraph with the identity #p1 and a portion of text with the identity #s1 are to be displayed. The display is illustrated in Figure 2.

Figure 2: Positioning of HTML Elements Using CSS

Without the use of CSS, the overlaying of the text shown in Figure 2 would require the text to be captured in an image, which would result in performance degradation. In addition, the text could not be rendered by a speech synthesizer or indexed by search engines such as Alta Vista or Infoseek.

As well as providing performance and accessibility benefits, the use of CSS can also enable web sites to be more easily maintained. Rather than including a style sheet definition within an HTML file, HTML files can include a pointer to an external style sheet file. A single style sheet file can be used to define the appearance of a large web site containing many hundreds, or even thousands, of HTML files. Changing the look and feel of the web site can be achieved by altering the single style sheet file - there is no need to make changes to the individual HTML files.

HTML 4.0

The current HTML standard, HTML 4.0 [3], is not significantly different from the HTML 3.2 standard, since control over the appearance is now the responsibility of the CSS specification. The main developments in HTML 4.0 include accessibility aids (such as providing ease-of-use for users without access to a mouse) and support for the use of style sheets and client-side scripting languages.

DOM

The Document Object Model (DOM) [4] provides a mechanism for enabling web pages to be updated and modified by the use of programs and scripts. The term Dynamic HTML is often used in this context. Dynamic HTML refers to the use of a client-side scripting language (such as JavaScript) to alter the content and appearance of web pages by updating HTML and CSS elements.
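
The idea of a program updating a page through the DOM can be sketched as follows. In a browser this would normally be done from a client-side script such as JavaScript; the sketch below uses Python's standard xml.dom.minidom module, a minimal DOM implementation, purely so that it can be run outside a browser. The page content, the function name restyle_heading and the style value are illustrative assumptions and are not taken from the DOM specification.

```python
from xml.dom.minidom import parseString

# Hypothetical page content, written as well-formed markup because
# minidom is an XML parser rather than an HTML parser.
PAGE = '<html><body><h1 id="s1">Welcome</h1><p>Body text.</p></body></html>'


def restyle_heading(document, new_text, new_style):
    """Change the text and the inline style of the first h1 element."""
    heading = document.getElementsByTagName("h1")[0]
    heading.firstChild.data = new_text        # update the content
    heading.setAttribute("style", new_style)  # update the appearance
    return heading


document = parseString(PAGE)
heading = restyle_heading(document, "Latest news", "color: #900")
print(heading.toxml())
```

The same two steps (locate an element in the document tree, then change its text or attributes) are what a Dynamic HTML script performs inside the browser.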

XML

If HTML 4.0 and CSS 2.0 enable attractive, well-designed web sites to be developed which are also fast, maintainable and accessible, does this mean that we have reached a levelling off in the development of data formats for the web? The answer is no. HTML provides very limited support for structured documents, since only basic document-like features are available in HTML (such as headings, paragraphs, bulleted and numbered lists, tables, etc.). HTML was not designed as a storage format for structured information which would enable data to be reused in a variety of ways. HTML can be regarded as a display format, as described by Austin and Sherwin [5].

XML (the Extensible Markup Language) [6] aims to address these deficiencies. XML permits arbitrary elements to be created, as illustrated in Figure 3.

<album> 
<artist>Jackson Browne</artist>  
<title>Running on empty</title>
<tracklist>    
  <track>Running on empty</track>    
  <track>The Road</track>
   ...
  <track>Stay</track>   
</tracklist>  
<label>Asylum Records</label>  
<year>1977</year> 
</album>

Figure 3: An XML Fragment

Figure 3 shows an XML fragment containing details of an album from a record collection. Since the information is stored in a structured format, it is possible not only to display the information but also to perform additional functions, such as displaying only the artist's name, counting the number of tracks per album, etc.
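
The kind of reuse described above can be sketched in a few lines of Python using the standard xml.etree.ElementTree module. This is only an illustration under stated assumptions: the function names artist_name and track_count are hypothetical, and the tracks elided by "..." in Figure 3 are left out, so the count covers only the three tracks actually listed.

```python
import xml.etree.ElementTree as ET

# The fragment from Figure 3, with the elided tracks ("...") omitted.
ALBUM_XML = """
<album>
  <artist>Jackson Browne</artist>
  <title>Running on empty</title>
  <tracklist>
    <track>Running on empty</track>
    <track>The Road</track>
    <track>Stay</track>
  </tracklist>
  <label>Asylum Records</label>
  <year>1977</year>
</album>
"""


def artist_name(album):
    """Return only the artist's name from an album element."""
    return album.findtext("artist")


def track_count(album):
    """Count the track elements listed inside the album's tracklist."""
    return len(album.findall("tracklist/track"))


album = ET.fromstring(ALBUM_XML.strip())
print(artist_name(album))
print(track_count(album))
```

Neither function depends on how the album is to be displayed, which is the sense in which XML separates the stored data from its presentation.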

4. Issues

Although HTML 4.0, CSS 2.0 and DOM would appear to provide an attractive solution to the problems currently facing providers of web services, with XML also having a possible role to play, deployment of these technologies raises several issues:

Browser Support
Many browsers which are still in use do not support these technologies, or their support is poor.
Authoring Tools
Many authoring tools do not enable newer technologies to be used.
Support Issues
The deployment of new browsers, authoring tools and backend services has associated support and financial implications.

In the longer term we should see the widespread deployment of software which supports the new standards. Until that happens, web site managers will have to address the issues involved in migrating to these technologies. The use of back end databases and sophisticated document management systems [7] may provide a migration strategy, but at a cost, both in money and in support effort. However, the cost may be worth paying in order to avoid even higher costs in the future.

References

  1. Cascading Style Sheets, level 2 (CSS2) Specification, W3C
    <URL: http://www.w3.org/TR/REC-CSS2>
  2. Cascading Style Sheets, W3C
    <URL: http://www.w3.org/Style/CSS/>
  3. HTML 4.0 Specification, W3C
    <URL: http://www.w3.org/TR/REC-html40>
  4. Document Object Model (DOM), W3C
    <URL: http://www.w3.org/DOM/>
  5. The Role of HTML as a Display Format, D. Austin and G. Sherwin
    <URL: http://www.cnet.com/Papers/W3C/kungpao.html>
  6. XML, W3C
    <URL: http://www.w3.org/XML/>
  7. What are ... Document Management Systems?, Ariadne, issue 17
    <URL: http://www.ariadne.ac.uk/issue17/what-is/>

The Author

Brian Kelly has an unusual job title: UK Web Focus. Brian works at UKOLN (UK Office for Library and Information Networking), which is based at the University of Bath. Brian has been involved with the World Wide Web since early 1993, when he helped to set up a web service at Leeds University. He now advises the UK Higher Education community on matters relating to the web.