HT'98
Demos and Posters
Separating Textual Contents from Structures for Reading
Hypertext Structured Medical Records
Vincent Brunie (1, 2), Pierre Morizet-Mahoudeaux (1),
Bruno Bachimont (3)
(2) UMR CNRS 6599 Heudiasyc
University of Technology of Compiegne
BP 529, 60205 Compiegne cedex, France
Tel: (33) 1 40 77 96 06
Fax: (33) 1 45 86 56 85
E-mail: brunie@biomath.jussieu.fr
(1) Service d'Informatique Medicale de l'AP-HP
91 boulevard de l'hopital, 75634 Paris cedex 13, France
(3) Institut National de l'Audiovisuel
4, avenue de l'Europe, 94366 Bry-sur-Marne cedex, France
Hypertext systems are typically organised as sets of nodes and links.
Nodes represent data, and links support the navigation functions. An efficient
way to implement a hypertext system is to use a structural markup scheme,
such as SGML, to build structured representations of the documents
contained in the hypertext. Navigation is then driven by the documentary
structures rather than by explicitly marked links. Furthermore,
this structural approach makes it possible to generate links automatically
and to build a typology of links.
We have used this approach at the Pitie-Salpetriere Hospital, Paris,
France, to implement a prototype of a computerised medical record, called
Hospitext. This experiment has shown that the efficiency of this approach
relies on the existence of synthesis documents, which correspond to medically
standardised readings of the record. The Hospitext system is equipped with
tools that automatically collect textual units and handle pointers to
their positions in the structures, in order to build dynamically new
documents corresponding to these standard readings of the record. These
synthesis tools fall into two categories, which together cover all
the syntheses that we have identified in the hospital. The first category
groups tools that build traditional navigation devices: tables of contents,
indexes, navigation buttons, etc. The second category corresponds to more
sophisticated tools that provide readings with added medical value: summaries
of long documents, graphs, patient discharge summaries, etc.
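As an illustration of the first category of tools, the following minimal sketch builds a table of contents from a structured record. It is not the Hospitext implementation: XML stands in for SGML, the element names (record, report, title, body) and the sample text are invented for the example, and a pointer is assumed to be the path of child indexes from the root.

```python
import xml.etree.ElementTree as ET

# Hypothetical record markup; element names and text are illustrative only.
RECORD = (
    "<record>"
    "<report><title>Admission note</title><body>Patient admitted.</body></report>"
    "<report><title>Blood count</title><body>Results enclosed.</body></report>"
    "</record>"
)


def table_of_contents(xml_text):
    """Collect (pointer, title) pairs from a structured record.

    A pointer is the tuple of child indexes leading from the root to the
    title element, so the entry can be traced back to its place in the
    structure.
    """
    root = ET.fromstring(xml_text)
    entries = []

    def walk(node, path):
        for index, child in enumerate(node):
            child_path = path + (index,)
            if child.tag == "title":
                entries.append((child_path, child.text))
            walk(child, child_path)

    walk(root, ())
    return entries


TOC = table_of_contents(RECORD)
for pointer, title in TOC:
    print(pointer, title)
```

Each entry keeps a pointer into the source structure, but the title text itself is copied into the generated list, which is the kind of redundancy this approach entails.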
This first experiment has already revealed several limits of this hypertext
documentary representation approach when synthesis tools are used. The main
weakness is that the synthesis documents generated by the system contain
textual information that duplicates textual information contained
in the original documents. As a consequence:
efficient management of updates is not possible when the information
is redundant,
it is not possible to have structural links between the generated and
the original documents, which makes it difficult for the reader to build
associative relationships,
it is not possible to build a structure representation spanning
several documents (which we call inter-documentary structures). Indeed,
synthesis documents are just replications of parts of other documents,
and thus cannot represent the corresponding inter-documentary structures.
They are the material proof of the existence of these structures, but cannot
represent them,
it is not possible to represent multiple levels of structure for the
same document content, since the structured representation scheme is hierarchical.
We propose a new representation scheme for hypertexts to resolve the
problems described above. It is based first on separating the textual
contents from the structures, and second on dynamically building documents
for reading. The initial goal of this scheme was to allow the representation
of multiple levels of structure for the same document, but it has also
proved to be a good solution to all four problems.
This representation scheme is based on two main categories of objects.
First, contents are built from the textual parts of the documents. They are
composed of characters that stand for themselves. Contents
have an addressing mechanism covering the whole record, so that any character
can be identified unambiguously. In addition, addresses carry a partial order
relation corresponding to the order in which the characters appear in the
original documents. The order relation does not need to be total,
so the addressing scheme may be multidimensional. Second, structures are
tag trees, each representing a valid instance of an SGML Document Type Definition
(DTD). Each tag is an SGML-like tag bound to an address pointing to a content,
and is thus composed of a label, a list of attributes, and an address.
There are two types of tags: opening tags and closing tags. The address
points to the first character after the tag insertion point. Each tree
represents an original document or a generated structuring document.
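The two categories of objects can be sketched as follows. This is only one possible reading of the scheme, not the data model of our kernel: an address is assumed to be a (content unit, offset) pair, two addresses are assumed comparable only within the same unit, and the names Address, Tag, CONTENTS and covered_text, like the sample text, are invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Address:
    """Identifies one character of the record (assumed form: unit + offset)."""
    unit: str
    offset: int

    def precedes(self, other: "Address") -> Optional[bool]:
        """Partial order: None when the two addresses are not comparable."""
        if self.unit != other.unit:
            return None
        return self.offset < other.offset


@dataclass(frozen=True)
class Tag:
    """An SGML-like tag: a label, a list of attributes, and an address."""
    label: str
    opening: bool        # True for an opening tag, False for a closing tag
    address: Address     # first character after the tag insertion point
    attributes: Tuple[Tuple[str, str], ...] = ()


# Contents are stored once, apart from any structure (sample text is made up).
CONTENTS: Dict[str, str] = {"report-1": "Hb 9.2 g/dL. Transfusion advised."}


def covered_text(opening: Tag, closing: Tag) -> str:
    """Return the characters lying between an opening tag and its closing tag."""
    unit = opening.address.unit
    return CONTENTS[unit][opening.address.offset:closing.address.offset]


end = len("Hb 9.2 g/dL.")
open_tag = Tag("result", True, Address("report-1", 0), (("type", "biology"),))
close_tag = Tag("result", False, Address("report-1", end))
TEXT = covered_text(open_tag, close_tag)
print(TEXT)
```

Because tags only hold addresses, several structures can point into the same content without copying any character.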
The data model is managed by several operators: inputs,
increments, and outputs. Input operations add a structured
document to the hypertext. Increment operations may add contents, structures,
or both, to the hypertext. This can be done automatically by synthesis
tools, or manually by means of annotations. Outputs are based on the projection
operation, which builds an output tagged
document from contents and structures. Projected units are documents (structure
trees), which select a set of content items. Each item can then be output
in the order prescribed by the main structure. The output is a tagged document
of the type defined by the projection. Moreover, a projection must define
how content shared between several structures should influence the output
process.
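The projection operation can be illustrated by a minimal sketch, under simplifying assumptions that are not part of the scheme itself: a single content string, a structure tree whose leaves select content items by (start, end) offsets, and no handling of content shared between several structures. The names Node, span_of and project, and the sample text, are invented for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Sample content, stored once (the text is made up for the example).
CONTENT = "Hb 9.2 g/dL. Transfusion advised. CRP 4 mg/L."


@dataclass
class Node:
    """A node of a structure tree; leaves select a content item by offsets."""
    label: str
    span: Optional[Tuple[int, int]] = None
    children: List["Node"] = field(default_factory=list)


def span_of(fragment: str) -> Tuple[int, int]:
    """Locate a fragment in CONTENT and return its (start, end) offsets."""
    start = CONTENT.index(fragment)
    return (start, start + len(fragment))


def project(node: Node, content: str) -> str:
    """Build the tagged output for a structure tree, in structure order."""
    if node.span is not None:
        start, end = node.span
        inner = content[start:end]
    else:
        inner = "".join(project(child, content) for child in node.children)
    return f"<{node.label}>{inner}</{node.label}>"


# A generated structuring document that selects only the two results.
synthesis = Node("results", children=[
    Node("result", span=span_of("Hb 9.2 g/dL.")),
    Node("result", span=span_of("CRP 4 mg/L.")),
])

OUTPUT = project(synthesis, CONTENT)
print(OUTPUT)
```

The synthesis document holds no text of its own: the tagged output is assembled from the shared content only at projection time.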
We have already implemented the kernel of a hypertext system corresponding
to this representation scheme. It has been developed in the Java
programming language. The system is able to take SGML and XML documents
as input, to build their representation as a hypertext, to add structuring
documents with a number of synthesis tools, and to make projections to
an HTML-based display browser.
This prototype has been used for building a hypertext system containing
ten computerised medical records. It has shown the technical validity of
the approach, and provides a kernel for future experiments. Although many
developments and improvements are still necessary to obtain a complete hypertext
system comparable to the Hospitext prototype, it has already provided efficient
solutions to the problems of the Hospitext approach presented above.
The next stage of our project is to build an experiment comparable in
volume to the one carried out for developing the Hospitext system. We intend
to build a prototype containing hypertextual medical records based on the
scheme proposed in this paper. The automatic synthesis tools implemented in
the Hospitext prototype will be reimplemented in this new architecture, and new
synthesis tools will be tested. More precisely, we will give the
user the possibility of specifying personalised syntheses based on
several levels of structure. These could make it possible, for example, to request
syntheses of all the biological results appearing in the introduction or
conclusion of a given patient report. They could also make it possible
to build synthesis documents corresponding to given annotations, such as
"all the bacteriological reports tagged as important".
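A personalised synthesis combining two levels of structure could look like the following sketch. It describes planned work, so everything here is an assumption: both levels are reduced to labelled spans over the same content, a report is taken to be "tagged" when an annotation overlaps it, and the names Span and select, like the offsets, are invented for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Span:
    """A labelled range of content addresses, [start, end)."""
    label: str
    start: int
    end: int


def select(reports: List[Span], annotations: List[Span],
           report_label: str, annotation_label: str) -> List[Span]:
    """Return the reports of one type that overlap a matching annotation."""
    marks = [a for a in annotations if a.label == annotation_label]
    return [
        r for r in reports
        if r.label == report_label
        and any(a.start < r.end and r.start < a.end for a in marks)
    ]


# First level: the documentary structure of the record (made-up offsets).
reports = [
    Span("bacteriology", 0, 40),
    Span("biology", 40, 90),
    Span("bacteriology", 90, 130),
]
# Second level: annotations added by readers over the same content.
annotations = [Span("important", 10, 20), Span("important", 50, 60)]

SELECTED = select(reports, annotations, "bacteriology", "important")
print(SELECTED)
```

The selected spans could then be wrapped in a new structure tree and projected for reading, without duplicating any text.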