Enhancing Browsers with Web Page Levels of Detail

Web pages can be viewed at different levels of detail depending on how much information is required. This article describes six levels of detail and explains how they can be used to implement three browser enhancements: enhanced link tooltips, navigation maps, and pre-emptive page downloading.

1. Introduction

This article is organized as follows. Section 2 explains web page levels of detail. Section 3 describes three ways that web page levels of detail can be used to enhance web browsers. Section 4 describes how web page levels of detail can be implemented.

2. Web Page Levels of Detail

Visualization applications render objects in a scene in various amounts of detail; nearer objects are drawn in more detail than objects that are further away. The amount of detail that is drawn—the level of detail—is determined by the distance of the object from the viewer. Non-graphical information can also be viewed at different levels of detail. For example, the three parts of an academic paper:

  1. the title and authors;
  2. the abstract;
  3. the full text

can be viewed at three levels of detail:

  1. the title and authors;
  2. the abstract (plus the title and authors);
  3. the full text (plus the title and authors and the abstract).

Levels of detail are cumulative: each level of detail adds to the information in the previous level.

Many different levels of detail can be produced from a web page. The first four levels of detail, from the lowest to the highest level of detail, are taken directly from the HTML of a web page:

  1. the information contained in the <head> tag such as the title and the meta-data;
  2. the text contained in the heading tags (<h1>, <h2>, <h3>, etc.);
  3. the first 1000 characters of text; and
  4. the full page of HTML.

Remember that levels of detail are cumulative; the second level of detail, for example, contains the information in the <head> tag as well as the text contained in the heading tags (<h1>, <h2>, <h3>, etc.). The following blocks show the first three levels of detail produced from the HTML of a web page.

First Level of Detail

The information contained in the <head> tag such as the title and the meta-data.

<html>
  <head>
    <meta name="description" content="Some 'how to' tips for the Apache httpd server">
    <meta name="keywords" content="apache,redirect,robots,rotate,logfiles">
    <title>Apache HOWTO documentation</title>
  </head>
</html>

Second Level of Detail

The information in the first level of detail plus the text contained in the heading tags (<h1>, <h2>, <h3>, etc.).

<html>
  <head>
    <meta name="description" content="Some 'how to' tips for the Apache httpd server">
    <meta name="keywords" content="apache,redirect,robots,rotate,logfiles">
    <title>Apache HOWTO documentation</title>
  </head>
  <body>
    <h2>Apache HTTP Server Version 1.3</h2>
    <h1>Apache HOWTO documentation</h1>
    <h2>How to redirect an entire server or directory to a single URL</h2>
    <h2>How to reset your log files</h2>
    <h2>How to stop or restrict robots</h2>
    <h2>How to proxy SSL requests through your non-SSL Apache server</h2>
    <h2>Apache HTTP Server Version 1.3</h2>
  </body>
</html>

Third Level of Detail

The information in the first and second levels of detail plus the first 1000 characters of text.

<html>
  <head>
    <meta name="description" content="Some 'how to' tips for the Apache httpd server">
    <meta name="keywords" content="apache,redirect,robots,rotate,logfiles">
    <title>Apache HOWTO documentation</title>
  </head>
  <body>
    <div>
      <img src="../images/sub.gif" alt="[APACHE DOCUMENTATION]">
      <h2>Apache HTTP Server Version 1.3</h2>
    </div>
    <h1>Apache HOWTO documentation</h1>
    How to:
    <ul>
      <li><a href="#redirect">redirect an entire server or directory to a single URL</a>
      <li><a href="#logreset">reset your log files</a>
      <li><a href="#stoprob">stop/restrict robots</a>
      <li><a href="#proxyssl">proxy SSL requests <em>through</em> your non-SSL server</a>
    </ul>
    <hr>
    <h2><A NAME="redirect">How to redirect an entire server or directory to a single URL</A></h2>
    <p>There are two chief ways to redirect all requests for an entire server to a single location:
    one which requires the use of <code>mod_rewrite </code>, and another which uses a CGI script.
    <p>First: if all you need to do is migrate a server from one name to another,
    simply use the <code>Redirect</code> directive, as supplied by <code>mod_alias</code>:
    <blockquote><pre>Redirect / http://www.apache.org/</pre></blockquote>
    <p>Since <code>Redirect</code>will forward along the complete path, however,
    it may not be appropriate - for example, when the directory structure has changed after the
    move, and you simply want to direct people to the home page.
    <p>The best option is to use the standard Apache module <code>mod_rewrite</CODE>.
    If that module is compiled in, the following lines
    <blockquote><pre>RewriteEngine On
    RewriteRule /.* http://www.apache.org/ [R]
    </pre></blockquote>
    ...
  </body>
</html>
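
As a rough illustration of how the first three levels might be derived from a page, the following Python sketch uses the standard library's html.parser module. The class and function names are hypothetical. The result is returned as a dictionary rather than as trimmed HTML, and the third level is approximated by the first 1000 characters of body text with the markup dropped. These are assumptions made for the sketch, not something the article prescribes.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class LevelOfDetailParser(HTMLParser):
    """Collects the title, meta-data, headings, and body text of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}        # meta name -> content
        self.headings = []    # (tag, text) pairs in document order
        self.body_text = []   # text fragments found inside <body>
        self._in_title = False
        self._in_body = False
        self._heading = None  # [tag, text] while inside a heading

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attrs = dict(attrs)
            if "name" in attrs:
                self.meta[attrs["name"]] = attrs.get("content") or ""
        elif tag == "title":
            self._in_title = True
        elif tag == "body":
            self._in_body = True
        elif tag in HEADING_TAGS:
            self._heading = [tag, ""]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in HEADING_TAGS and self._heading is not None:
            self.headings.append((self._heading[0], self._heading[1].strip()))
            self._heading = None

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        if self._heading is not None:
            self._heading[1] += data
        if self._in_body:
            self.body_text.append(data)


def level_of_detail(html, level, max_chars=1000):
    """Return levels 1 to 3 cumulatively: each level includes the previous ones."""
    parser = LevelOfDetailParser()
    parser.feed(html)
    parser.close()
    result = {"title": parser.title.strip(), "meta": parser.meta}
    if level >= 2:
        result["headings"] = parser.headings
    if level >= 3:
        # Collapse runs of whitespace before truncating.
        text = " ".join("".join(parser.body_text).split())
        result["text"] = text[:max_chars]
    return result
```

Calling level_of_detail(html, 2) on the Apache page above would return its title, its meta-data, and its list of headings, but none of the body text.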

Further levels of detail can be produced by processing and rendering a web page:

  1. the keywords extracted from the full text of the page after stop words have been removed;
  2. an image of the page scaled to various sizes (thumbnail, quarter size, etc.).

The following list shows the keywords extracted from the web page shown above at the third level of detail.

apache
appropriate
best
cgi
changed
chief
compiled
complete
direct
directive
directory
documentation
engine
entire
example
files
first
following
forward
home
http
httpd
lines
location
log
logfiles
migrate
mod_alias
mod_rewrite
module
move
name
need
option
page
path
people
proxy
redirect
requests
requires
reset
restrict
rewrite
robots
rotate
rule
script
server
simply
since
single
some
ssl
standard
stop
structure
supplied
through
tips
url
use
uses
version
want
ways
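
A keyword list of this kind could be produced along the following lines. This is a minimal Python sketch: the function name is hypothetical, the stop word list is a tiny illustrative sample rather than the list used to build the keywords above, and compound identifiers are not split into their component words.

```python
import re

# A very small, illustrative stop word list (an assumption, not the
# list that produced the keywords shown above).
STOP_WORDS = {
    "a", "all", "an", "and", "are", "as", "at", "be", "by", "for", "if",
    "in", "is", "it", "of", "on", "one", "or", "that", "the", "there",
    "to", "two", "which", "with", "you", "your",
}


def extract_keywords(text, stop_words=STOP_WORDS):
    """Return the unique non-stop words of `text` in alphabetical order."""
    words = re.findall(r"[a-z][a-z0-9_]*", text.lower())
    return sorted({w for w in words if len(w) > 1 and w not in stop_words})
```

The input would be the full text of the page with the markup already removed.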

The following images scale the web page shown above at the third level of detail to three different sizes.

Small version of the third level of detail
Medium version of the third level of detail
Large version of the third level of detail

Web page levels of detail enable a range of enhancements to be made to web browsers.

3. Web Browser Enhancements

Three of the web browser enhancements that can be implemented with web page levels of detail are enhanced link tooltips, navigation maps, and pre-emptive downloading of web pages.

3.1 Enhanced Link Tooltips

Web page levels of detail can be used to enhance link tooltips. In most browsers, when the cursor hovers over a link, the text contained in the title attribute of the <a> tag is displayed as a tooltip. The disadvantage of storing this text in the linking page is that if the linked-to page changes, the linking page might also need to be updated. If the browser requested the first level of detail (the information contained in the <head> tag) for each link contained in a web page, it could use the title of the linked-to page as the tooltip text for the link. If the title of a linked-to page changed, the tooltip text for that link would be updated automatically.

The first level of detail for each link in a web page would be downloaded immediately after the page has finished downloading. Waiting to download the information for the tooltip until it is needed might delay the presentation of the tooltip and interfere with the feedback expected by users.
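
The tooltip lookup might be sketched in Python as follows. The function and class names are hypothetical, fetch stands for whatever download function the browser provides (it takes a URL and returns the response body as text), and the @1 suffix is the level of detail request notation introduced in Section 4.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def tooltip_titles(page_url, page_html, fetch):
    """Map each linked-to URL to the title in its first level of detail."""
    collector = LinkCollector()
    collector.feed(page_html)
    titles = {}
    for href in collector.hrefs:
        # Resolve relative links and ignore the #fragment part.
        target = urldefrag(urljoin(page_url, href)).url
        if target == page_url or target in titles:
            continue  # link within the same page, or already requested
        head = fetch(target + "@1")
        match = re.search(r"<title>(.*?)</title>", head,
                          re.IGNORECASE | re.DOTALL)
        titles[target] = match.group(1).strip() if match else ""
    return titles
```

Passing the download function as a parameter keeps the sketch independent of any particular HTTP library.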

Although the first level of detail requires only a small amount of data to be downloaded, an HTTP connection must be made for each link. Enhanced tooltips can be made more efficient by combining the level of detail request mechanism described in this article with multiple HTTP request packages. The browser would package together the requests for levels of detail from the same web site to reduce the number of HTTP connections.
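
The grouping step could look like the following Python sketch. It only shows how requests might be collected per web site. The function name is hypothetical, and the format in which a package would actually be sent to the server is outside the scope of the sketch.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_requests_by_site(urls, level):
    """Group level of detail request URLs by the web site they belong to."""
    packages = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        site = f"{parts.scheme}://{parts.netloc}"
        packages[site].append(f"{url}@{level}")
    return dict(packages)
```

Each entry of the returned dictionary would then correspond to one package, and so to one HTTP connection.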

Enhancements to link tooltips are not restricted to text. The levels of detail that provide thumbnail and quarter-size scaled images can be used to produce a visual preview of the web page that would be displayed if the link were clicked. The tooltip could be further enhanced by combining a scaled image with the title of the linked-to page. The requests for these two levels of detail can be reduced to a single HTTP request using HTTP request packages.

3.2 Navigation Maps

Most web browsers provide limited navigation facilities. One way to enhance browser navigation is to provide a map of the web pages that are linked to by a web page. Such a map might present web pages as a graph of linked nodes positioned in 2D or 3D space; the distance between a pair of web pages would represent their semantic distance as calculated by analysing the similarities between the keywords provided by the fifth level of detail. Each web page node can be presented as a scaled image provided by the sixth level of detail. As the user zooms in and out of the map, higher or lower levels of detail can be requested to render more or less detail as required.
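
The article does not fix a particular similarity measure. As one possibility, the Python sketch below uses the Jaccard distance between two keyword sets: 0 when the sets are identical and 1 when they share no keywords. The function names are hypothetical.

```python
from itertools import combinations


def keyword_distance(keywords_a, keywords_b):
    """Jaccard distance between two keyword collections (0.0 to 1.0)."""
    a, b = set(keywords_a), set(keywords_b)
    if not a and not b:
        return 0.0  # two empty pages are treated as identical
    return 1.0 - len(a & b) / len(a | b)


def distance_table(page_keywords):
    """Distance for every pair of pages, keyed by (page, page) tuples."""
    return {
        (p, q): keyword_distance(page_keywords[p], page_keywords[q])
        for p, q in combinations(sorted(page_keywords), 2)
    }
```

A layout algorithm could then use these pairwise distances to position the nodes of the map.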

3.3 Pre-Emptive Downloading

The enhanced tooltips described above are an example of pre-emptive downloading; the first level of detail for each link is downloaded to provide the tooltip. Pre-emptive downloading can also be used to increase the speed at which web pages appear to download. The second or third level of detail for each page linked to by a page would be downloaded once the page has finished downloading. Clicking on a link would display the downloaded level of detail immediately, which would give the user information to scan while the full page is downloaded. Pre-emptively downloading web pages that are never looked at increases internet traffic needlessly, so pre-emptive downloading algorithms must choose carefully which pages to download.
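
The article leaves the selection policy open. Purely as an illustration, the Python sketch below applies a very simple one: skip links within the same page, skip duplicates and non-HTTP links, and stop after a fixed number of pages. The function name, the default level, and the limit are all assumptions.

```python
from urllib.parse import urldefrag, urljoin


def choose_prefetch_urls(page_url, hrefs, level=2, limit=5):
    """Return at most `limit` level of detail request URLs to download."""
    chosen = []
    seen = {urldefrag(page_url).url}  # never prefetch the current page
    for href in hrefs:
        if len(chosen) >= limit:
            break
        target = urldefrag(urljoin(page_url, href)).url
        if not target.startswith(("http://", "https://")):
            continue  # e.g. mailto: links
        if target in seen:
            continue
        seen.add(target)
        chosen.append(f"{target}@{level}")
    return chosen
```

A more careful policy might rank the links, for example by their position on the page, before applying the limit.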

4. Implementation

Web page levels of detail can be implemented by a simple server script that responds to a request for a level of detail encoded as a URL query. URL queries can be cumbersome so a shorthand notation is used when writing level of detail request URLs. An @ is appended to the URL of the web page for which a level of detail is requested, followed by the number of the level of detail. The @ should be read as “at level of detail”. For example, to request the web page

http://www.website.com/catalog/index.html

at the second level of detail (the contents of the <head> tag plus the headings <h1>, <h2>, <h3>, etc.), the following URL would be used:

http://www.website.com/catalog/index.html@2

The @ is a shorthand notation that must be mapped onto a valid URL query suitable for processing by a web server script. The following rewrite rule, added to the configuration file of the Apache web server, rewrites URLs that end with @ and a level of detail number as a valid URL query.

RewriteRule /(.+)@(.+)$ http://www.website.com/LOD?page=$1&lod=$2 [R,L]

For example, the rewrite rule instructs an Apache web server to transform the URL

http://www.website.com/catalog/index.html@2

into

http://www.website.com/LOD?page=catalog/index.html&lod=2

where the URL query parameter page holds the page to provide the level of detail for, lod is the number of the requested level of detail, and LOD is the name of the web server script.
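
The effect of the rewrite rule, and the way a server script might read the resulting query, can be mimicked in a few lines of Python. This is only an illustration of the mapping: the function names are hypothetical, and the regular expression is copied from the rewrite rule above.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Same pattern as the RewriteRule: a path, an @, and the level of detail.
LOD_PATTERN = re.compile(r"/(.+)@(.+)$")


def rewrite_lod_path(path, script_url="http://www.website.com/LOD"):
    """Map a path in @ notation onto the URL query form, or return None."""
    match = LOD_PATTERN.search(path)
    if match is None:
        return None  # not a level of detail request
    page, lod = match.groups()
    return f"{script_url}?page={page}&lod={lod}"


def parse_lod_query(url):
    """Extract (page, level) from a rewritten level of detail URL."""
    query = parse_qs(urlsplit(url).query)
    return query["page"][0], int(query["lod"][0])
```

For the example above, rewrite_lod_path("/catalog/index.html@2") yields the rewritten URL, from which parse_lod_query recovers the page catalog/index.html and the level 2.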

The six levels of detail for each page in a web site can be produced in two ways. In the first, every level of detail is produced when a page is added to the server. The disadvantage is that levels of detail that might never be used are stored. A more efficient method generates and stores each level of detail on the fly the first time it is requested; the stored level of detail is returned on subsequent requests. The advantage of this method is that storage is required only for the levels of detail that have actually been used. If the content of a web page changes, the levels of detail that have already been created are deleted; they are regenerated and stored the next time they are requested.
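
The second method amounts to a cache that is filled on demand and cleared when a page changes. A minimal Python sketch follows, with an in-memory dictionary standing in for the server's storage. The class name is hypothetical, and generate is a caller-supplied function that produces a given level of detail for a page.

```python
class LevelOfDetailStore:
    """Generates each level of detail on first request and stores it."""

    def __init__(self, generate):
        self._generate = generate  # callable: (page, level) -> content
        self._stored = {}          # (page, level) -> content

    def get(self, page, level):
        key = (page, level)
        if key not in self._stored:
            # First request for this level: generate and store it.
            self._stored[key] = self._generate(page, level)
        return self._stored[key]

    def invalidate(self, page):
        """Delete every stored level of detail for a page that has changed."""
        for key in [k for k in self._stored if k[0] == page]:
            del self._stored[key]
```

The server script would call get for each request, and invalidate would be called whenever the content of a page is updated.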