Research Issues for Virtual Documents

Carolyn Watters and Michael Shepherd
Faculty of Computer Science
Dalhousie University, Halifax, NS, Canada B3J 2X4

An electronic document consists of both the content and the links associated with that document. Documents on the Web may therefore be composed of one or more Web pages [Crowston & Williams, 1999]. Such documents may be static and persistent, or they may be generated dynamically and be virtual. A virtual document is a document for which no persistent state exists and for which some or all of each instance is generated at run time [Watters, 1999]. A virtual document can therefore take the form of multiple pages, a guided tour, Java applets, or application results, and may or may not have associated links. The content may be defined by tags, a template, a program, a database query, or some application. Virtual documents have grown out of a need for interactivity and individualization of documents, particularly on the Web.

The paradigm of the Web has shifted our expectations for access to information. Previously, we accessed information by retrieving electronic copies of documents from a large repository of relatively static information. We now expect to access information by manipulating a large collection of information resources. Some of these resources are documents and some are processes that create documents. In addition, the role of the user is shifting from reader to active participant and author. Users expect hypertext functionality to be available with digital documents, i.e., they expect to be able to make comments and annotations, to initiate discussion, and to add content and links while reading, both individually and collaboratively.

Research Issues

A number of interesting research issues surrounding virtual documents on the Web must be resolved. These issues cover a wide range and are described briefly below.

Generation - At what point in time is a virtual document defined? A virtual document can be defined by an author through the use of templates and links or it can be defined as the result of a search or application. Guided tours can be generated dynamically, based on an information need as defined by a user profile and/or an explicitly stated query.
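As a minimal sketch of template-based generation, the following Python fragment builds one instance of a document only when it is requested. The names REPOSITORY, VirtualDocumentSpec, and generate are hypothetical and chosen for illustration; the in-memory list stands in for whatever search engine or database would supply the content in practice.

```python
from dataclasses import dataclass
from string import Template

# Hypothetical in-memory repository; a real system would query a search
# engine or database here.
REPOSITORY = [
    {"title": "Tide tables", "topic": "harbour"},
    {"title": "Ferry schedule", "topic": "harbour"},
    {"title": "Campus map", "topic": "university"},
]


@dataclass(frozen=True)
class VirtualDocumentSpec:
    """The author-defined part: a template plus the query that fills it."""
    template: str
    topic: str


def generate(spec: VirtualDocumentSpec) -> str:
    """Build one instance of the document at request time."""
    hits = [r["title"] for r in REPOSITORY if r["topic"] == spec.topic]
    items = "\n".join(f"- {title}" for title in hits)
    return Template(spec.template).substitute(topic=spec.topic, items=items)


spec = VirtualDocumentSpec(template="Guided tour: $topic\n$items", topic="harbour")
page = generate(spec)
```

Only the specification persists. Each call to generate produces a fresh instance, so the same specification can yield different content once the repository changes.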

Search - How do you search for virtual documents? What is the domain in which to perform the search? Will the document exist by the time the user requests it?

Revisiting - Users expect that a document found once will be available on a subsequent search. The notion of a bookmark does not apply to virtual documents in its normal, simplistic way. A bookmark needs enough information to recreate the document as it was.
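One possible shape for such a bookmark, sketched in Python under stated assumptions: it stores the inputs needed to re-run the generating process together with a digest of the content the user saw, so that a regenerated instance can be checked against the original. The Bookmark fields and both function names are hypothetical, not part of any existing browser or standard.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Bookmark:
    generator: str     # hypothetical name of the process that built the page
    parameters: str    # canonical JSON of the inputs (query, user profile, ...)
    generated_at: str  # time the instance was generated, as ISO 8601 text
    digest: str        # SHA-256 of the content the user actually saw


def make_bookmark(generator: str, parameters: dict,
                  generated_at: str, content: str) -> Bookmark:
    """Record how the instance was produced and what it contained."""
    canonical = json.dumps(parameters, sort_keys=True)
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return Bookmark(generator, canonical, generated_at, digest)


def is_same_instance(bookmark: Bookmark, regenerated_content: str) -> bool:
    """True if re-running the generator reproduced the bookmarked instance."""
    digest = hashlib.sha256(regenerated_content.encode("utf-8")).hexdigest()
    return digest == bookmark.digest
```

A digest can only detect that the regenerated document differs from the one bookmarked. Recovering the earlier content would additionally require stored snapshots or versioned sources.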

Versioning - Version control has long been a concern of Information Retrieval research and is now a central issue for management of virtual documents. Users need to be able to return to a bookmarked version of a virtual document and to go forward and backward in time through changes to that virtual document.
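Moving backward and forward through versions can be sketched as a list of stored instances with a cursor. This is an illustrative assumption rather than a proposed design: the VersionHistory class is hypothetical, and it keeps whole copies of each version where a real system might store differences or regeneration parameters instead.

```python
from dataclasses import dataclass, field


@dataclass
class VersionHistory:
    versions: list = field(default_factory=list)  # stored oldest first
    position: int = -1                            # index of the version in view

    def record(self, content: str) -> int:
        """Append a new version, move the cursor to it, and return its index."""
        self.versions.append(content)
        self.position = len(self.versions) - 1
        return self.position

    def current(self) -> str:
        """Return the version in view; raises IndexError if nothing is recorded."""
        return self.versions[self.position]

    def back(self) -> str:
        """Step to the previous version, stopping at the oldest."""
        self.position = max(self.position - 1, 0)
        return self.current()

    def forward(self) -> str:
        """Step to the next version, stopping at the newest."""
        self.position = min(self.position + 1, len(self.versions) - 1)
        return self.current()

    def goto(self, index: int) -> str:
        """Jump to a bookmarked version by the index returned from record()."""
        if not 0 <= index < len(self.versions):
            raise IndexError("no such version")
        self.position = index
        return self.current()
```

A bookmark would then hold the index returned by record, and goto returns the reader to that version before stepping through later or earlier changes.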

Authentication - Who is responsible for the quality of the contents of a virtual document whose components may come from a variety of sources and/or processes?

Reference - How do authors cite virtual documents or versions of virtual documents?

Annotation - The roles of user of information and supplier of information are merging. Readers expect to be able to add data, such as comments, annotations, paths, and links, as well as content, while they are reading.

Summary

The Web has not only increased the scale of information retrieval systems and applications but has also introduced a new variation of the notion of a document. Basic research is required to bring our understanding of virtual documents, and our measures of the effectiveness and efficiency of access to them, to the level already achieved for persistent documents.

Crowston, K. and M. Williams. 1999. The Effects of Linking on Genres of Web Documents. Proceedings of the Thirty-Second Annual Hawaii International Conference on System Sciences. Maui, Hawaii. CD-ROM Publication.

Watters, C. 1999. Information Retrieval and the Virtual Document. Journal of the American Society for Information Science. To appear.
