Technical Paper 11 – Use of an Integrated CASE Tool for GIS Customization

by Gillian Kendrick and Peter Batty

Abstract

The implementation of large corporate GIS systems places heavy demands on the
customisability of GIS products. This paper examines the use of an integrated CASE
tool in GIS customisation. The authors start by discussing some of the common problems
which designers of GIS systems face in developing new applications. The paper describes
the features which a CASE tool should include in order to address these issues.
The similarities between the technologies which underpin a CASE tool and a GIS
are also discussed.

The paper describes the practical use of a CASE tool in application development.
It also discusses the management of multi-user application development and the
benefits gained from utilising a CASE tool in the development process. It is shown
that the use of an integrated tool facilitates the adoption of a new incremental
design methodology.

Introduction

One of the greatest costs in large GIS implementations is that of customising
the basic system to meet an organisation’s specific requirements. There are many
aspects to this customisation. The user interface of the system may be tuned to
speed up data capture or else to streamline the execution of particular queries.
The way in which objects’ spatial attributes are displayed must be specified. However,
one of the most significant parts of any customisation involves the design and
maintenance of the application’s data model.

Data modelling is a complex task for most computer applications, but this is particularly
true in GIS for several reasons. The first of these is the large number of classes
of objects which are involved in these systems. Each implementation will typically
involve hundreds of different object classes. The second factor is the number and
variety of relationships between these objects. Relationships are of three main
types: aggregation and association, e.g. a building is ‘part of’ a school; spatial,
e.g. a house is ‘near to’ a lake; and topological, e.g. a valve is ‘connected to’
a pipe. Of these types, only the first may be explicitly represented in terms of
traditional relational joins. The others will be derived indirectly from spatial
attributes of the objects or else through the interaction between these spatial
attributes.
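
The distinction between stored and derived relationships can be illustrated with a minimal Python sketch. The class, the feature names, the coordinates and the distance threshold are all hypothetical, and connectivity is approximated here by coincident coordinates; a real GIS would derive it from its topology model.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Feature:
    """A hypothetical GIS object with a single point as its spatial attribute."""
    name: str
    x: float
    y: float


# Aggregation/association: stored explicitly, much like a relational join key.
part_of = {"Science block": "Hilltop School"}


def near_to(a: Feature, b: Feature, max_distance: float) -> bool:
    """Spatial relationship: derived from the coordinates, not stored."""
    return hypot(a.x - b.x, a.y - b.y) <= max_distance


def connected_to(a: Feature, b: Feature) -> bool:
    """Topological relationship: derived here from coincident coordinates (an assumption)."""
    return (a.x, a.y) == (b.x, b.y)


house = Feature("House", 0.0, 0.0)
lake = Feature("Lake", 30.0, 40.0)       # 50 units away from the house
valve = Feature("Valve", 5.0, 5.0)
pipe_end = Feature("Pipe end", 5.0, 5.0)

print(part_of["Science block"])          # looked up directly
print(near_to(house, lake, 50.0))        # computed from spatial attributes
print(connected_to(valve, pipe_end))     # computed from spatial attributes
```

Only the first relationship is held as data; the other two are recomputed from the spatial attributes whenever they are needed.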

For many organisations, the GIS will only form a part of the corporate computing
system. There will be existing databases holding data such as customer information.
Some of this data may be relevant to the GIS, and therefore part of the customisation
will involve integrating these existing data models with that of the GIS (Bundock & Theriault
1992). This integration will again increase the size and complexity of the overall
design.

Another aspect of GIS implementations, which has an impact on the maintenance
of the data model, is that user requirements change. GIS projects typically have
quite long life cycles, and the technology is relatively new to most users. This
means that at the outset of the project they may not realise the full capabilities
of the system. As new applications are developed and added to the system, changes
are required in the data model.

Since GIS projects involve capturing large amounts of data, it is critical that
these changes to the data model can be made without losing or compromising any
data which is already stored in the system.

With most traditional GIS software, data model evolution is a difficult issue.
This has meant that a long period of time has been spent at the beginning of the
project on requirements analysis and data model design to try to make sure that
it is exactly right, which of course is rarely achieved. This has had to happen
before any other work could begin, meaning that there has been a very long delay
for users between the purchasing of a GIS and the start of productive use of the
customised system.

This paper describes how an integrated CASE tool can be used to address all of
these data modelling issues. This in turn radically changes the way in which one
can approach the problem of customising a GIS, using an interactive design methodology.

What is a CASE Tool?

The acronym CASE stands for Computer Aided Software Engineering. It is used to
describe a variety of computer-based tools which can be used to assist in the design
and development of computer programs. Such tools have been developed for various
purposes including the analysis and documentation of procedures and data flows,
and the design and documentation of a data model. It is the latter function which
we consider in this paper as this is the most appropriate kind of tool to help
with GIS customisation.

A CASE tool which is to be used for data model design must display a graphical
representation of the model. This is usually in the form of an entity-relationship
diagram which shows the entities, or classes of objects, and the relationships
between them. Such a facility gives the designer a good picture of a large design.
The tool allows plots of the diagram to be made. These can be included as part
of the documentation of the overall design.

The tool allows the designer to interact with the spatial representations of object
classes and relationships. A graphical user interface (GUI) provides an environment
in which the attributes and behaviour of the objects and relationships can be edited
and queried. The tool produces automatic documentation on selected parts of the
design.

One of the ways in which a CASE tool can reduce the time taken to implement a
design is by automatically generating the code which creates the data model. This
facility has two main benefits. Firstly the designer no longer has to worry about
implementation details; he can concentrate on the task of data modelling. Secondly,
the automatic code generation speeds up the process of creating data models, and
avoids programming errors.
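
As an illustration of automatic code generation, the following Python sketch turns a design-time description of an object class into a statement that creates it. The `EntityDef` class and the choice of SQL as the output format are assumptions made for this example; the tool described later in this paper generates Smallworld Magik rather than SQL.

```python
from dataclasses import dataclass, field


@dataclass
class EntityDef:
    """Hypothetical design-time description of one object class."""
    name: str
    fields: dict[str, str] = field(default_factory=dict)  # field name -> SQL type


def generate_ddl(entity: EntityDef) -> str:
    """Emit a CREATE TABLE statement for one entity definition."""
    columns = ",\n".join(
        f"    {column} {sql_type}" for column, sql_type in entity.fields.items()
    )
    return f"CREATE TABLE {entity.name} (\n{columns}\n);"


house = EntityDef("house", {"id": "INTEGER PRIMARY KEY", "address": "TEXT"})
ddl = generate_ddl(house)
print(ddl)
```

The designer edits only the definition; the implementation text is derived from it each time, so the two cannot drift apart through manual editing.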

The tool should also provide facilities which enforce the correctness of the data
model. It should validate that the design is consistent. Checks made by the tool
at the design stage will reduce the amount of time spent in tracing bugs by the
designers at a later date. This will again speed up the delivery of finished applications.
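
One simple consistency check of this kind is that every relationship refers to classes which have actually been defined. The Python sketch below shows the idea under that assumption; the function name and the data layout are hypothetical and are not taken from any particular tool.

```python
def validate_model(
    entities: set[str], relationships: list[tuple[str, str, str]]
) -> list[str]:
    """Return a list of problems; an empty list means the design is consistent."""
    problems = []
    for source, kind, target in relationships:
        for end in (source, target):
            if end not in entities:
                problems.append(f"'{kind}' refers to undefined class '{end}'")
    return problems


entities = {"valve", "pipe", "building"}
relationships = [
    ("valve", "connected to", "pipe"),
    ("building", "part of", "school"),   # 'school' has not been defined yet
]

problems = validate_model(entities, relationships)
for problem in problems:
    print(problem)
```

Reporting problems rather than rejecting the design outright lets the designer continue working on an incomplete model while still being told what remains to be fixed.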

This section has described general requirements for a CASE tool which is to be
used for data model design. The next two sections describe in more detail some
of the requirements for CASE tools which are to be used in GIS. The first section
covers general requirements for tools for GIS data model design. The second considers
the extra requirements for an integrated tool and covers some of the benefits provided
by such a tool.

CASE Tools for GIS

Multi-User Design

It was mentioned in the introduction that one of the reasons for the complexity
of GIS data models was that they involved a large number of classes of objects.
It will usually be the case that more than one designer will be involved in the
development of these objects, their relationships and their behaviour. This means
that it is important that the tool used to develop the data model is multi-user.

Each designer should be able to work on additions or modifications to the data
model without being affected by changes which others are making to different parts
of the design. These changes will also be made over a long time scale – maybe hours,
days or weeks. The designer may wish to try out several alternative versions of
the data model in order to settle on a best design. When a new part of the model
has been completed or re-engineered, the tool must offer support for ‘merging’
the new work in with the rest of the design.
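
The merging step can be sketched as a three-way comparison between the common starting point and the two designers' versions. The Python below is a minimal sketch of that general technique, assuming a data model is simply a mapping from class name to a list of field names; it is not a description of how any particular DBMS implements merging.

```python
def merge_models(base: dict, mine: dict, theirs: dict) -> tuple[dict, list[str]]:
    """Three-way merge of {class name: definition} mappings.

    Returns the merged model and the names of classes which were changed
    differently on both sides and so need manual reconciliation.
    """
    merged, conflicts = {}, []
    for name in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(name), mine.get(name), theirs.get(name)
        if m == t:        # both sides agree, or neither changed the class
            chosen = m
        elif m == b:      # only the other designer changed it
            chosen = t
        elif t == b:      # only this designer changed it
            chosen = m
        else:             # both changed it in different ways
            conflicts.append(name)
            chosen = b    # keep the original until the conflict is resolved
        if chosen is not None:
            merged[name] = chosen
    return merged, conflicts


base = {"pipe": ["diameter"], "valve": ["type"]}
mine = {"pipe": ["diameter", "material"], "valve": ["type"]}
theirs = {"pipe": ["diameter"], "valve": ["type"], "hydrant": ["flow_rate"]}

merged, conflicts = merge_models(base, mine, theirs)
print(merged)
print(conflicts)
```

Changes made to different parts of the design combine without intervention; only a class edited differently by both designers is reported for reconciliation.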

The requirements of CASE tool database technology in this area, involving long
transactions and version management, are very similar to those of GIS (Easterfield
et al. 1990).

Modelling of Spatial Data and Topological Relationships

GIS data models are special because they require the user to model spatial attributes
of objects. New data types such as point, area, chain, raster and grid are supported.
The tool also models the topological relationships which will exist between some
objects. For example a valve location ‘connects to’ a pipe centreline, a land parcel
‘shares’ its boundary with another land parcel.

These aspects of GIS data models should be supported in a CASE tool which is to
be used in GIS customisation.

Integrated CASE Tools for GIS

For high level conceptual design, it is not necessary to have a close coupling
between the CASE tool and the DBMS used to implement the system. However, if the
CASE tool is to be used for the physical database design and to generate code,
then a much closer link is required between the tool, the DBMS, and the application
development environment used to implement the actual system. The requirements for
the tool if it is to achieve this higher level of integration can be separated
into two parts.

First, the CASE tool must support all of the data modelling concepts supported
by the GIS. These include things such as the datatypes, relationships and other
system specific concepts such as triggers, validators, and enumerators. For object
oriented systems, the CASE tool should also manage the behaviour for the object
classes (Booch 1991). In particular for GIS the CASE tool needs to manage spatial
information and topological relationships. The CASE tool may extend to covering
aspects of the user interface of the application, such as which fields are visible
to the user and what sort of interface is used to edit objects and their individual
fields.

Second, the CASE tool must generate code in an appropriate format which implements
the data model which has been designed. In the case of the work reported here,
the tool produces a script in the language Smallworld Magik.

These requirements are to a certain extent independent of each other. If the first
one is met, but not the second, then it is at least possible to use the CASE tool
to do the logical design for the system, but the code to implement this then has
to be written manually. On the other hand, it is possible to have a CASE tool which
supports a subset of the concepts supported by the DBMS (with GIS, for example,
a CASE tool which supports alphanumeric datatypes but not spatial datatypes). This
could be made to output code in an appropriate format for the DBMS, but further
development work would then be required in the DBMS environment (outside the CASE
tool) to incorporate any of these additional concepts in the application. In this
situation it is likely to be much more difficult to usefully maintain the data
model using the CASE tool after its initial creation, since it does not know about
any of the changes which have been made to the data model within the DBMS environment.
Clearly, the CASE tool is much more useful if it meets both of these requirements
in full.

A CASE tool which can generate the data model for a GIS application automatically
is a very useful part of a development tool kit. There are two more facilities
however which are necessary in order to make the tool truly useful for corporate
GIS design.

Integration of Existing Data Models in the GIS Data Model

As mentioned in the introduction, a common requirement in GIS implementations
is that existing (legacy) databases, and therefore data models, should be
integrated into the system. It will often be the case that
there are relationships between objects which have been created as part of the
GIS and those which belong to these other databases.

As an example consider a large utility company that has a customer information
database. It is undesirable to move this data to the GIS as many other applications
will already be designed to run on it. Part of the requirement of the GIS application
is to associate spatial information with those customer records.

To facilitate this integration, the CASE tool can ‘reverse engineer’ the data
models from these existing databases. The objects can then be integrated into the
whole design within the domain of one tool.
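
Reverse engineering amounts to reading the schema back out of an existing database's catalogue. The Python sketch below does this for SQLite using the standard `sqlite3` module; SQLite and the `customer` table are stand-ins chosen for the example, since the paper does not name the legacy DBMS involved.

```python
import sqlite3


def reverse_engineer(conn: sqlite3.Connection) -> dict[str, dict[str, str]]:
    """Return {table name: {column name: declared type}} for a SQLite database."""
    model = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        model[table] = {column[1]: column[2] for column in columns}
    return model


# A stand-in for an existing customer information database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, address TEXT)")

legacy_model = reverse_engineer(conn)
print(legacy_model)
```

The recovered definitions can then be treated like any other object classes in the design, while the data itself stays in the original database.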

Maintenance of the Data Model

GIS applications involve large amounts of data, and as mentioned before, they
are also particularly prone to having the user requirements change during the life
cycle of the project. If a change in the user requirements means that the data
model must be modified, then it is essential that facilities are provided to ‘forward
engineer’ the populated database to be consistent with the new data model.

A common feature of CASE tools is that they generate the code which creates a
design. It is less common to generate the code which will ‘forward engineer’ an
existing design when the data model definition held in the tool has been changed.
A CASE tool for GIS should offer support to evolve a database from one schema to
another.
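
Forward engineering can be sketched as comparing the old and new definitions and generating only the statements needed to move a populated database from one to the other. The Python below is a deliberately limited sketch which handles added tables and added columns only; the schema layout, the function name and the use of SQLite are assumptions made for illustration.

```python
import sqlite3


def forward_engineer(old: dict, new: dict) -> list[str]:
    """Return SQL statements which evolve schema `old` into schema `new`.

    Both arguments map table name -> {column name: type}. Only added
    tables and added columns are handled in this sketch.
    """
    statements = []
    for table, columns in new.items():
        if table not in old:
            body = ", ".join(f"{c} {t}" for c, t in columns.items())
            statements.append(f"CREATE TABLE {table} ({body})")
            continue
        for column, sql_type in columns.items():
            if column not in old[table]:
                statements.append(f"ALTER TABLE {table} ADD COLUMN {column} {sql_type}")
    return statements


old_model = {"pipe": {"id": "INTEGER", "diameter": "REAL"}}
new_model = {"pipe": {"id": "INTEGER", "diameter": "REAL", "material": "TEXT"}}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pipe (id INTEGER, diameter REAL)")
conn.execute("INSERT INTO pipe VALUES (1, 150.0)")   # data captured before the change

migration = forward_engineer(old_model, new_model)
for statement in migration:
    conn.execute(statement)

rows = conn.execute("SELECT id, diameter, material FROM pipe").fetchall()
print(migration)
print(rows)
```

The existing record survives the change, with the new field left empty until it is populated.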

Features of an Integrated CASE Tool for GIS

Based on the previous sections, we can summarise that an integrated CASE tool
should have the following features:

  • Support all of the data modelling concepts which are supported by the GIS
    DBMS
  • Generate the code which will implement the data model in the GIS DBMS
  • ‘Reverse engineer’ existing data models so that their schema can be integrated
    into the overall design for the system
  • ‘Forward engineer’ populated databases when the design of the objects and
    relationships is updated.

Such a tool provides significant benefits in the speed of GIS customisation and
also in the maintenance of the applications throughout their life cycles.

Similarities Between a GIS Application and a CASE Tool

Several of the capabilities provided by a CASE tool are similar to those which
are found in a GIS.

Firstly, in both systems, the user creates objects which combine both spatial
and alpha-numeric attributes. In the GIS, these objects are things such as houses,
pipes or rivers. The alpha-numeric attributes of a house are things such as its
address and owner, and its spatial attributes are its position and footprint. In
the CASE tool, the objects which the user works with are the definitions of the
classes of objects which are to be used in the GIS application. For example, a
CASE tool object has an alphanumeric attribute which holds the name of an object
class, such as ‘House’. The spatial attributes of these objects are used
to position them in the entity-relationship diagrams. Similarly, the CASE tool
stores relationship objects, whose attributes include the type of the relationship,
e.g. ‘part of’ or ‘connected to’.

In both the CASE tool and the GIS, users can query and edit the properties of
the objects through a GUI. This interface provides facilities for plotting and
reporting.

One of the areas in which a GIS and CASE tool are most similar is in their need
to provide support for multi-user working in a long transaction database. This
should not be surprising as both systems are used as design tools as well as information
systems.

The Smallworld CASE Tool

After a consideration of the benefits of using an integrated CASE tool, and a
survey of the tools currently available on the market, Smallworld chose to implement
its own. Because of the similarities outlined above between GIS and CASE, it was
possible to implement the tool as a specific GIS application with its own data
model.

Working with the CASE Tool

The tool is activated during a normal GIS session. The designer can have both
the GIS application and the CASE tool running at the same time.

[Figure not yet available]

Figure 1. Integration of GIS and CASE in one environment

Application Development with the CASE Tool

Smallworld GIS contains a powerful data dictionary which supports many advanced
data modelling features. These include the definition of new data types, triggers
and validators. The data dictionary is capable of storing the definition of object
classes, their attributes, behaviour and also the details of the topological and
associative relationships between them. The data dictionary can manage many different
versions of a data model and allows the designer to look at these different versions
within one GIS session.

The CASE tool operates directly on the GIS application’s data dictionary. If the
GIS application involves integration with external databases, the data models held
in these databases are reverse engineered into the tool. Once present in the tool,
the user can add spatial attributes to the objects, and can also define relationships
between them and the objects which live in the main GIS database. The user can
also add behaviour to these external objects.

The CASE tool supports a steady progression from a logical to a physical definition
of the required data model. It does not demand that the data model is completely
defined but will advise on areas which are not yet complete. Facilities are provided
which allow the designers to document the model extensively as they are working.

Developing the Data Model

As already mentioned, the CASE tool is a multi-user facility. We will now describe
how many users can work on the same model.

[Figure not yet available]

Figure 2. Version management of both schema and GIS data

In figure 2, the boxes to the left of the dotted line represent the different
alternatives in the CASE tool’s database. Beneath the Top alternative there are
two others where the designers Phil and Betty can work independently. The user
Phil also has another two sub alternatives where he can try out different designs.
The area to the right of the dotted line shows the alternative tree in the main
GIS application. Beneath the Top, there are two alternatives Live Data and Test.
The alternatives beneath the Live Data alternative are those where the GIS application
is being used. The GIS applications may include data capture, analysis of existing
data and design. The part of the tree under the ‘Test’ alternative is used by CASE
designers to try out their developments. When Phil decides that he has completed
a part of a design in the alternative ‘Design 1’, he can make the CASE tool ‘apply’
the new data model to the alternative ‘Test 1’. At this point, he can immediately
start testing the new data model in the GIS application. If after testing, the
new part of the design is found to be a success, the changes he has made can be
‘posted’ up to the Top of the CASE database. If the design was unsatisfactory,
he can proceed to improve it and then ‘apply’ the new changes.

When changes to the design have been thoroughly tested, they can be applied
to the Top alternative of the GIS application’s database. That alternative
will have its data model ‘forward engineered’. Data model changes then spread through
the alternative tree as users at lower alternatives request to see them.
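
The way changes move through a tree of alternatives can be sketched in a few lines of Python. The `Alternative` class and its `spawn`, `post` and `refresh` methods are hypothetical names chosen for this example and are not the Smallworld API; posting here simply overwrites the parent's model, with none of the merging or conflict detection a real version-managed database would provide.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Alternative:
    """A hypothetical node in a tree of data model versions."""
    name: str
    parent: Optional["Alternative"] = None
    model: dict[str, list[str]] = field(default_factory=dict)

    def spawn(self, name: str) -> "Alternative":
        """Create a child which starts with a copy of this alternative's model."""
        copy = {k: list(v) for k, v in self.model.items()}
        return Alternative(name, parent=self, model=copy)

    def post(self) -> None:
        """Push this alternative's model up to its parent (no merging in this sketch)."""
        if self.parent is not None:
            self.parent.model = {k: list(v) for k, v in self.model.items()}

    def refresh(self) -> None:
        """Pull the parent's current model down, when the user asks for it."""
        if self.parent is not None:
            self.model = {k: list(v) for k, v in self.parent.model.items()}


top = Alternative("Top", model={"pipe": ["diameter"]})
phil = top.spawn("Phil")
betty = top.spawn("Betty")

phil.model["valve"] = ["type"]   # Phil extends the design in his own alternative
phil.post()                      # and posts the tested change up to Top
seen_before_refresh = "valve" in betty.model
betty.refresh()                  # Betty requests the latest design
seen_after_refresh = "valve" in betty.model
print(seen_before_refresh, seen_after_refresh)
```

Betty's alternative is unaffected by Phil's change until she asks to see it, which is the isolation that long-running design work relies on.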

[Figure not yet available]

Figure 3. Part of a GIS application’s data model

The use of an integrated CASE tool in GIS allows designers to develop data models
iteratively. There no longer needs to be such a great reliance on getting the design
right first time. This in turn means that the end users can rapidly be presented
with a prototype system which allows them to more easily understand what the system
can offer. They are therefore able to offer much more feedback in the design of
the system, leading to the development of systems which are much better suited
to their requirements.

Data Model Re-use

Another feature of the CASE tool is that it facilitates the re-use of pieces of
design. It can archive complete sections of data model in a format which can be
read in by another tool. This has enabled the setting up of a library of designs.
This greatly speeds up the development of new applications as they rarely have
to be designed from scratch. An existing ‘template’ data model can be loaded into
an application’s CASE tool, and this can be thought of as the first prototype.
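
Archiving and re-loading a section of a design can be sketched as serialising it to a portable text form. The Python below uses JSON purely as an assumed interchange format; the paper does not specify the archive format, and the function names are hypothetical.

```python
import json


def archive_model(model: dict) -> str:
    """Serialise a section of a data model to a portable text form."""
    return json.dumps(model, indent=2, sort_keys=True)


def load_template(archive: str, into: dict) -> dict:
    """Load archived classes into another design without overwriting its own."""
    merged = dict(json.loads(archive))
    merged.update(into)   # classes already in the target design take precedence
    return merged


# A library entry created in one project...
library_entry = archive_model({"pipe": ["diameter", "material"], "valve": ["type"]})

# ...loaded as the starting point of another design, which already defines 'valve'.
new_design = load_template(library_entry, {"valve": ["type", "status"]})
print(new_design)
```

Letting the target design's own definitions take precedence is one possible policy; a tool could equally report the clash and ask the designer to choose.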

Conclusions

Since Smallworld commenced using an integrated CASE tool for application development
the following benefits from this approach have been found:

  • Customisation of the system has been made more accessible. The tool has reduced
    the size of the knowledge barrier which designers have to overcome, as they no
    longer have to be trained in how to talk to the underlying DBMS directly. They
    can instead spend more of their time with design considerations. This means that
    more people are able to customise the system with less initial training.
  • The tool has enabled the creation of applications which better meet the customer’s
    requirements through the interactive development approach. Users are given the
    chance to interact with a version of the application at an early stage. They
    can then refine their requirements, the developers can evolve the design and
    the tool will evolve the database.
  • Designs are more easily re-used. The archiving facility in the CASE tool allows
    parts of designs to be stored in a form where they can be loaded into another
    tool. This means that a library of commonly used parts of designs can be set
    up. When a new application is being developed, much of the initial data model
    can be created from these standard components. This has two main benefits: firstly
    the development time for new applications is greatly reduced; secondly, the quality
    of developed applications is improved. Time invested in improving the quality
    of the basic data model components leads to improvements in the overall system
    quality.
  • Multi-user working is easier to manage. It is always a difficult problem when
    many designers are working on the same project to avoid duplication of effort,
    and conflicts between their different designs. The CASE tool helps with these problems
    by incorporating a number of validation and completeness checks which will prevent
    the design from becoming inconsistent. The users can develop parts of the data
    model independently in separate alternatives. When they incorporate these into
    the total design by ‘merging’ their changes, the DBMS automatically spots any
    conflicts in the design which the designer has introduced and allows him/her
    to reconcile them.
  • An incremental development methodology means that designers can get production
    systems in place much more quickly.

References

Booch, G. (1991): Object Oriented Design with Applications. Benjamin/Cummings,
Redwood City, California.

Bundock, M. & Theriault, D. (1992): Integration of CASE Technology into the
GIS Environment, in AGI92 Conference Papers, Birmingham, November 1992.

Easterfield, M., Newell, D.G. & Theriault, D. (1990): Version Management in
GIS – Applications and Techniques, EGIS 90 Conference Proceedings, Amsterdam, April
1990.
