Reducing Perceived Download Time with Web Page Pre-Emption

Users often experience a delay between clicking on a link to request a web page and the presentation of the page in the browser. This article describes web page pre-emption, a browser-based method of decreasing the download delays perceived by users, and presents five strategies for selecting which pages to pre-emptively download.

1. Introduction

Users often experience a delay between clicking on a link to request a web page and the presentation of the page in the browser. This delay depends on how quickly web servers respond to requests and on the amount of competing internet traffic. One method of reducing download delays is to use a proxy cache. A proxy cache stores a local copy of each web page requested by users of the cache and fulfils each subsequent request for a page with the cached copy rather than with the original, which reduces the download time for those repeat requests to almost zero. The disadvantage of proxy cache software is that it is typically installed by the internet service provider, which puts installation beyond user control. Another approach is to delegate the task of reducing download time to the user’s browser with web page pre-emption.

2. Web Page Pre-Emption

Web page pre-emption decreases the delay between the request for a web page and its presentation in the browser by pre-fetching web pages. After downloading a web page, the browser’s pre-emption algorithm extracts the links in the page and downloads the linked-to pages. When users click on a link, the page is displayed immediately because it has already been downloaded. The browser extracts links and pre-fetches web pages in the background while users read and scroll the current page. If users click on the link for a page that has not yet been pre-fetched, pre-fetching is suspended, the partially downloaded pages are stored, and the available bandwidth is dedicated to downloading the requested page. If users subsequently request a partially pre-fetched page, the partial page is displayed immediately to give users information to read while the rest of the page downloads.
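
The link extraction and partial-page behaviour described above can be sketched as follows. This is a minimal illustration, not a browser implementation: the names LinkExtractor, extract_links and PrefetchStore are hypothetical, and the network download itself is left out, so chunks are simply handed to the store as they would arrive.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the absolute URL of every <a href="..."> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            url = urljoin(self.base_url, href)
            if url not in self.links:  # pre-fetch each page only once
                self.links.append(url)


def extract_links(html, base_url):
    """Return the linked-to URLs of a page, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


class PrefetchStore:
    """Keeps whole or partial copies of pre-fetched pages, keyed by URL."""

    def __init__(self):
        self.pages = {}  # url -> (bytes received so far, download complete?)

    def add_chunk(self, url, chunk, complete=False):
        data, _ = self.pages.get(url, (b"", False))
        self.pages[url] = (data + chunk, complete)

    def on_click(self, url):
        """Return (data to display now, whether more must be downloaded)."""
        data, complete = self.pages.get(url, (b"", False))
        return data, not complete
```

A click on a fully pre-fetched page returns the whole page with nothing left to download; a click on a partially pre-fetched page returns the stored part, which the browser can display while it fetches the remainder.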

Pre-emption is a trade-off between minimising extra internet traffic and downloading every linked-to page to ensure rapid feedback. The best case is that every linked-to page that users request has been pre-fetched and that only linked-to pages that users request are pre-fetched; the worst case is that every linked-to page is pre-fetched and that users request none of those pages. This trade-off is addressed by the strategy that selects which pages to pre-fetch and the strategy that controls the amount of each page to pre-fetch.
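
One way to make this trade-off measurable is to compute two ratios for a browsing session: the share of requested pages that had been pre-fetched, and the share of pre-fetched pages that were never requested. The function below is an illustrative sketch; the name prefetch_scores and the two ratios are assumptions introduced here, not measures defined by this article.

```python
def prefetch_scores(prefetched, requested):
    """Return (hit_rate, waste) for one browsing session.

    hit_rate: fraction of requested pages that had been pre-fetched.
    waste:    fraction of pre-fetched pages that were never requested.
    """
    prefetched, requested = set(prefetched), set(requested)
    hits = prefetched & requested
    # With no requests at all, nothing was served from the pre-fetched pages.
    hit_rate = len(hits) / len(requested) if requested else 0.0
    waste = len(prefetched - requested) / len(prefetched) if prefetched else 0.0
    return hit_rate, waste
```

In these terms the best case is a hit rate of 1 with a waste of 0, and the worst case is a waste of 1.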

3. Pre-Fetching Strategies

Pre-emptively downloading web pages that users subsequently request produces no extra internet traffic. However, pre-emptively downloading web pages not requested by users generates unnecessary internet traffic and wastes bandwidth. A web page pre-emption algorithm must therefore select the pages to pre-fetch carefully to minimise extra internet traffic. The following list presents five strategies for selecting which linked-to web pages to pre-fetch:

  1. pre-fetch every linked-to page;
  2. pre-fetch only the linked-to pages whose links appear above the fold, i.e. the links shown in the browser window before scrolling (every other linked-to page is pre-fetched only when its link appears as the page is scrolled);
  3. pre-fetch linked-to pages whose links intersect the current cursor trajectory;
  4. pre-fetch a linked-to page when the browser displays the link’s tool tip (the hover necessary to activate the tool tip distinguishes genuine interest in the link from the passage of the cursor through the link on its way to another part of the screen); and
  5. pre-fetch the linked-to pages frequently visited from the current page.
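
Three of these strategies can be sketched as selection functions over the links of the current page. The sketch makes several assumptions that are not part of the strategies themselves: each link is represented by a hypothetical Link record holding its bounding box in page coordinates, the cursor trajectory is approximated by a straight ray continuing the last observed cursor movement, and the visit history is a list of (from_url, to_url) pairs.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    url: str
    x: float       # left edge, page coordinates
    y: float       # top edge, page coordinates
    width: float
    height: float


def visible_links(links, scroll_y, viewport_height):
    """Strategy 2: links whose box overlaps the visible part of the page."""
    bottom = scroll_y + viewport_height
    return [k for k in links if k.y < bottom and k.y + k.height > scroll_y]


def links_on_trajectory(links, prev, curr):
    """Strategy 3: links hit by the ray that continues the prev -> curr move."""
    dx, dy = curr[0] - prev[0], curr[1] - prev[1]
    if dx == 0 and dy == 0:
        return []  # the cursor is not moving, so there is no trajectory
    hit = []
    for k in links:
        t_min, t_max = 0.0, float("inf")
        slabs = ((curr[0], dx, k.x, k.x + k.width),
                 (curr[1], dy, k.y, k.y + k.height))
        for origin, delta, low, high in slabs:
            if delta == 0:
                if not low <= origin <= high:
                    t_min, t_max = 1.0, 0.0  # parallel to the box and outside
                    break
                continue
            t1, t2 = (low - origin) / delta, (high - origin) / delta
            t_min = max(t_min, min(t1, t2))
            t_max = min(t_max, max(t1, t2))
        if t_min <= t_max:
            hit.append(k)
    return hit


def frequent_links(links, history, current_url, top_n=3):
    """Strategy 5: the top_n pages most often visited from current_url."""
    counts = Counter(to for frm, to in history if frm == current_url)
    on_page = {k.url for k in links}
    return [url for url, _ in counts.most_common() if url in on_page][:top_n]
```

Strategy 1 needs no selection function because it returns every link, and strategy 4 depends on the browser's tool tip event rather than on geometry, so neither is shown.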

The pre-fetching strategy determines the number of linked-to pages a browser selects for pre-emptive downloading. The second consideration for a web page pre-emption algorithm is the amount of information to download for each page.

4. Pre-Fetching Web Page Information

Web servers return the complete HTML source of each requested web page. However, even after selecting the most efficient pre-emption strategy, downloading the whole page can be inefficient, especially if users do not subsequently request the page. The web page levels of detail (LODs) technique enables web servers to return a range of information about each web page. An implementation of web page LODs can extract many different textual and graphical levels of detail from a web page. Four of the most useful levels of detail are listed below, from the lowest level of detail to the highest:

  1. the information contained in the <head> tag such as the title and the meta-data;
  2. the text contained in the heading tags (<h1>, <h2>, <h3>, etc.);
  3. the first 1000 characters of text; and
  4. the full page of HTML.

The first level of detail provides the least amount of information about a page. However, the browser’s pre-emption algorithm can use the title of the page to provide an enhanced tool tip: whenever the cursor hovers over a link, the browser displays the title of the linked-to page. The title of the linked-to page will likely be more descriptive than the title of the link recorded in the title attribute of the <a> tag.

The second level of detail returns the headings in a linked-to page. The browser can use this information to display a structural overview, such as a table of contents. Web page LODs are cumulative, which means that each level of detail includes the detail in the levels below it. Therefore, the browser can head the table of contents of a linked-to page with the title of the page using a single web server request.

The third level of detail provides the first 1000 characters of marked-up text in a linked-to page. The first 1000 characters approximate the content of the page above the fold, which is the region users read most often. The browser can use the first 1000 characters to create an introduction to a linked-to page.

The fourth level of detail returns the complete HTML source of a linked-to page. Content referenced by the page, such as images, audio clips, Java applets and Flash files, is downloaded only when users request the page.
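
The four levels of detail can be illustrated with a small extraction routine of the kind a web server might run before responding. This is a sketch under stated assumptions rather than the LODs implementation itself: the names LodParser and level_of_detail are hypothetical, the result is returned as a dictionary, and the third level is taken to be visible text with the tags removed.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
SKIPPED_TAGS = {"script", "style"}


class LodParser(HTMLParser):
    """Collects the title, named meta-data, headings and visible text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.headings = []
        self.text = []
        self._stack = []  # open tags that change how text is handled

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            a = dict(attrs)
            if a.get("name") and a.get("content") is not None:
                self.meta[a["name"]] = a["content"]
        elif tag in HEADING_TAGS:
            self.headings.append("")
        if tag in HEADING_TAGS or tag == "title" or tag in SKIPPED_TAGS:
            self._stack.append(tag)

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()

    def handle_data(self, data):
        current = self._stack[-1] if self._stack else None
        if current == "title":
            self.title += data
        elif current in SKIPPED_TAGS:
            return  # script and style content is not visible text
        else:
            if current in HEADING_TAGS:
                self.headings[-1] += data
            self.text.append(data)


def level_of_detail(html, level, max_chars=1000):
    """Return levels 1..level of a page; each level includes the lower ones."""
    if not 1 <= level <= 4:
        raise ValueError("level must be between 1 and 4")
    parser = LodParser()
    parser.feed(html)
    lod = {"title": parser.title.strip(), "meta": parser.meta}
    if level >= 2:
        lod["headings"] = [" ".join(h.split()) for h in parser.headings]
    if level >= 3:
        lod["text"] = " ".join(" ".join(parser.text).split())[:max_chars]
    if level == 4:
        lod["html"] = html
    return lod
```

Because each level adds keys to the same dictionary, a request for the second level already carries the title needed to head a table of contents.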

The following list presents two strategies for selecting the amount of information to download for each page selected for pre-fetching:

  1. choose a level of detail and download that level of detail for each page; and
  2. download the first level of detail for each page followed by each subsequent level of detail for each page, up to a specified maximum level of detail.

The first strategy limits the amount of information downloaded for each page to a single, fixed level of detail; the second strategy provides increasingly detailed information up to a specified maximum.
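
The difference between the two strategies is the order and number of requests they generate, which can be sketched as two scheduling functions. The function names and the (url, level) request tuples are hypothetical, introduced only for this illustration.

```python
def fixed_level_schedule(urls, level):
    """Strategy 1: one request per page, all at the same level of detail."""
    return [(url, level) for url in urls]


def progressive_schedule(urls, max_level):
    """Strategy 2: level 1 for every page, then level 2 for every page, etc."""
    return [(url, level)
            for level in range(1, max_level + 1)
            for url in urls]
```

For two pages and a maximum level of 3, the progressive schedule issues six requests: both pages at level 1, then both at level 2, then both at level 3. Every selected page therefore gains at least a title before any page gains its headings.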

Although web page LODs minimise the number of bytes returned by a web server for each page, one HTTP request is still required for each level of detail. The HTTP packages technique minimises the number of requests to the same web server by grouping them into a single request.
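
The grouping step can be sketched as follows. The format in which an HTTP package is sent to the server is not described here, so the sketch stops at collecting the pending requests for each server; the function name group_by_server and the (url, level) request tuples are assumptions.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_server(requests):
    """Group (url, level) requests by host: {host: [(path, level), ...]}."""
    packages = defaultdict(list)
    for url, level in requests:
        parts = urlsplit(url)
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        packages[parts.netloc].append((path, level))
    return dict(packages)
```

Each key of the result corresponds to one server, so the number of requests falls from one per page and level to one per server.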