Technical Paper 3 – An Object-Oriented GIS – Issues and Solutions

by Arthur Chance, Richard G. Newell & David G. Theriault

Abstract

The current generation of software tools is inadequate to satisfy the wide diversity
of GIS requirements in a seamless manner. In addition, the tools provided to develop
user applications or to customise current GIS offerings exacerbate the problem.
Object oriented programming systems (OOPS) are now recognised as a key component
in building powerful applications which are robust, maintainable and seamlessly
extensible. Unfortunately, many myths surround OOPS: that
they are difficult, simply fashionable, or inherently slow. This paper will propose
that an OOPS coupled with an interactive programming environment can be highly
effective when applied to the demanding requirements of GIS. It describes a rich,
but user-friendly, polymorphic, exemplar-based environment which supports today’s
emerging standards and which has proven highly appropriate to the development and
implementation of GIS.

Introduction

The largest costs incurred for any organisation embarking on the implementation
of a GIS are data conversion, hardware and software, and system implementation,
particularly customisation. It is now recognised that no GIS on the market today
does it all, thus implying that the missing functionality has to be added to the
base system in order to satisfy any particular customer. Yet rarely do you see,
in the many papers describing how to choose a GIS, the need for assessing the ability
of a system to be customised. It is common for the next most expensive item after
data conversion to be customisation.

Most checklists for choosing a GIS concentrate on long lists of superficial functionality,
and little is said of an assessment of the quality of basic system fundamentals
and foundations. The most important fundamentals to get right in a technology as
complex as GIS are the basic database architecture, and the data models implemented
within that architecture together with the software architecture needed to extend
the system at low cost.

During the last ten years or so, enormous strides have been made in the development
of hardware. Processors are at least 20 times faster, memory is 500 times bigger,
disks are hundreds of times bigger. The advance in hardware has fuelled a parallel
advance in fundamental database software. The relational model has advanced from
the era when there were doubts as to whether it could be made to work at all to
now being the dominant technology.

In contrast, the advance in the development of tools to develop and adapt software
systems has been far less dramatic. We all know that most of the world’s commercial
software is still written in Cobol, and most of the technical software is still written
in Fortran. Even “modern” languages such as C offer little more in productivity
and maintainability when compared to the huge advances in other technologies.

The advent of CASE tools and 4GLs makes some attempt to address the issue. It is
our belief that the fundamental system architecture itself has to be rethought
if large strides are to be made. This paper proposes an interactive hybrid object-oriented,
procedural language as a key component of such an architecture. We are well aware
that the term “object-oriented” is frequently bandied about with very few people
understanding what it means. We also attempt in this paper to explain what the
technology means, and why it should be of significance to the implementation of
GIS.

Conventional System Structure

Traditionally, large interactive systems were developed in a non-interactive procedural
language, such as Fortran or C. In order that an end user could drive such a system,
an interactive command language was provided so that the user could type in his
commands. Many command languages evolved into programming languages sometimes by
borrowing the programming concepts of Basic. In more modern user interfaces, this
command language may well be hidden behind a system of screen menus, tablet menus
or other input devices. Large modular systems were glued together using operating
system commands or scripts (see Figure 1). Within the system, developers and customisers
had a number of other languages available to them to define such things as syntax,
menus, data descriptions, graphic codes, etc, all running as different processes
communicating by files.

The structure of such systems was commonly organised at the highest level around
the command syntax, and complex commands were structured in a top down approach
from there. If one examines systems that have been put together in this manner
over the last five to ten years they all suffer a number of difficulties:

  • Development is slow. Users’ requests for enhancement have to wait for the next
    release, which is usually over a year away.
  • They are difficult and expensive to maintain. During the life cycle of the
    system, probably 90% of development goes into maintenance.
  • Major restructuring in the light of five or ten years of hindsight is unthinkable.
  • Customisation is arbitrarily done in one or more of the many languages used
    to put the system together, typically Fortran or the command language.
  • Integration with other systems is nearly impossible.

[ Figure 1 not available ]

Much effort has gone into toolkits for developing and customising systems including
standard graphics libraries, user interface managers, data managers, windowing
systems, etc. However, if one wishes to get in and program or customise any of
these systems, one is confronted with operating system commands, Fortran, C, SQL,
embedded SQL, some online command language, domain specific 4GLs or a combination
of these; not to mention auxiliary “languages” to define user syntax, menus,
data definitions, data dictionaries,
etc. With these kinds of programming tools it can take many man-months of skilled
programmer effort to achieve even modest system customisation.

Development Languages

Systems whose parts are front ended by different languages commonly cannot be
integrated properly. For example,
a graphics system for mapping, which is “hooked into” a database, typically does
not allow the full power of the database to be accessed from within the graphics
command language, nor can the power of the graphics system be invoked from within
the database query language. What is really needed is a system such that all data
and functions can be accessed and manipulated in one seamless programming environment
(Butler 1988).

What has been shown by a number of organisations is that the same development
carried out with an on-line object oriented programming language can cut such
development times by a very large factor (e.g. 20). Object orientation does not
just mean that there is a database with objects in it, but that the system is organised
around the concept of objects which have behaviour (methods). Objects belong to
classes which are arranged in a hierarchy (preferably a heterarchy). Subclasses
inherit behaviour, and communication between objects is via a system of message
passing, thereby providing very robust interfaces.

Firstly, such a language should be able to provide facilities covering the normal
requirements of an operating system, customisation, applications programming and
most systems programming. Secondly, the language should have a friendly syntax
that allows casual users to write simple programs for quick online customisation.
Larger programs should be in a form that is easily readable and debuggable. Thirdly,
the language must be usable in an interactive mode.

There are several languages around which satisfy some of these requirements: Basic
is all right for simple programs (large impressive systems have been implemented
using some of the more advanced modern Basics); Lisp has much of the power and
speed, but is hardly readable (however, much of the success of AutoCAD may be attributed
to AutoLISP). Smalltalk has both speed and object orientation, but with the total
exclusion of any other programming construct. PostScript is very widely used and
has a number of the desired features, but is another write-only language (i.e. “unreadable” by
anyone, including the original programmer). HyperTalk is wonderful, but you would
not write a large system in it. C++ has much of the required syntax and semantics,
but it is not available as a front end language and can therefore only be accessed
by a select few system builders, normally employed by the system vendor.

Having dismissed most of the well known languages developed during the last 30
years, what then is required? It is an on-line programming language, with a friendly
Algol-like control structure and the powerful object oriented semantics of languages
like Smalltalk.

What Is Object Orientation?

The dominant style of programming has always been procedural, the structure of
the program being organised around the functions being performed, usually in a
top down hierarchy of procedure calls. An object oriented programming language
is one where the program is organised around the objects being processed, usually
in a hierarchy of objects which can share (inherit) the procedures (methods) belonging
to other objects.

Object-orientation is not an easy concept to explain; however, its importance
is not in doubt. Object orientation will become the dominant approach to structuring
and building large complex systems in the future.

Efficiency in the development of computer systems depends on how easily they are
modified and enhanced. Changes in an evolving system are either concerned with
changes in function or changes in data structure. Procedural programming does a
reasonable job in localising changes to function by means of such devices as routine
libraries, but changes to data structures usually mean a cascade of side effects
as the data structures are referred to in many parts of the system. Object-oriented
programming goes a long way to localising these changes also.

An object comprises two things, its own state (manifested as a set of instance
variables) which no other part of the system can access directly and a set of procedures
(called methods) which describe its behaviour. Everything about an object is encapsulated
within it and the only way of getting data out of it, or changing it, or getting
it to do something is by sending messages to it. An object is a rather sophisticated
extension of the concept of variable in other languages.
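
The idea of encapsulated state reached only through messages can be sketched as follows. This is an illustrative Python fragment, not Magik; the class `Parcel`, its methods and the helper `send` are hypothetical names chosen for the example. Python marks internal state only by convention (a leading underscore), so the sketch shows the discipline rather than enforcing it.

```python
class Parcel:
    """Hypothetical object: state is held internally, behaviour is exposed as methods."""

    def __init__(self, width_m, depth_m):
        # Instance variables; the leading underscore marks them as internal.
        self._width_m = width_m
        self._depth_m = depth_m

    def area(self):
        """Respond to the 'area' message with a value derived from the state."""
        return self._width_m * self._depth_m

    def resize(self, width_m, depth_m):
        """Respond to the 'resize' message; the object guards its own state."""
        if width_m <= 0 or depth_m <= 0:
            raise ValueError("dimensions must be positive")
        self._width_m = width_m
        self._depth_m = depth_m


def send(receiver, message, *args):
    """Deliver a named message to an object and return its response."""
    return getattr(receiver, message)(*args)


parcel = Parcel(10.0, 20.0)
print(send(parcel, "area"))  # 200.0
```

Callers depend only on the messages `area` and `resize`, so the internal representation (for example, storing corner coordinates instead of width and depth) could change without affecting them.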

Quite frequently, objects of different classes might be similar, in that one class
might exhibit all the behaviour of another plus some additional behaviour. In other
words, one class may share a number of other classes’ methods, but also has a different
version of some methods and some new methods of its own. In these cases one class
can be defined as a subclass of another so that it can inherit the shared behaviour.

Object orientation, which embodies the above concepts of encapsulation and inheritance
brings the following benefits:

  • Code sharing is greatly increased, thereby increasing programmer productivity.
  • System modules communicate via well defined interfaces, which means it is easier
    to find bugs.
  • It is easier to maintain software: both minor enhancements and sometimes major
    restructuring. Maintenance costs are reduced.
  • The environment is not only extremely good for prototyping, but also can be
    used as a base for a production system.
  • User interfaces can be iterated to their optimum much more easily.
  • Object orientation is very suited to the manipulation of heavily structured
    data.

One of the commonest reservations voiced about object orientation is whether it
can be made to work with acceptable performance. One recalls the same remarks being
made about relational databases nearly 15 years ago.

It is true, object oriented languages do run slower than procedural ones. This
is mainly because the message expression may take more time to evaluate and also
programming style tends to lead to large numbers of messages being sent (procedures
being called) to rather small methods (procedures). This performance issue is now
considerably offset by improved compiler techniques, faster hardware, and using
a procedural approach where appropriate.

In any case, much customisation today is carried out by writing macros in the
system command language. As these are run via an interpreter, they are far slower
to execute than a properly implemented object-oriented language.

At the end of the day, the benefit to the system customiser is that he can perform
major system enhancements in a fraction of the time taken with a conventional system;
the benefit to the end user is a much richer functionality in his system and far
fewer delays in waiting for enhancements.

Object Oriented Databases

One often hears the term object-oriented applied (sometimes wrongly) to many kinds
of systems. So what does the term object-oriented database mean? At first sight
it seems a strange use of a term which was originally used to describe a programming
language, usually in comparison to procedural languages.

Now when one comes to databases, all conventional databases are (sort of) object-based.
In a relational database, the objects around which the data are organised are tables,
records and fields. There are some higher level objects, such as views, but there
is no explicit representation of the highest level objects such as real world things.
Much of the more recent development has concerned how to embody high level semantics
in the model, but the database itself still does not embody any of the
behaviour definitions of the objects it contains (Oxborrow 1989).

Now, one doesn’t often hear the term procedural applied to databases, although
one recalls the work by Martin Newell (Newell 1975) on procedure models, in which
he built a modelling system out of procedures, and all operations on the model
(such as rendering) made calls to the procedure models. These then made the right
responses and did the required things when asked. The behaviour of the objects
to be rendered was encapsulated within the procedure models and not within the
rendering algorithms. This is similar to the encapsulation concept of object-oriented
programming languages.

The term object-oriented database is commonly used to mean that the unit of communication
to the database is an object: you put an object in, and you get an object out.
But this is a facile use of the term, since the crucial thing about object-orientation
is that the objects contain their own behaviour and therefore the database needs
to manage the procedures (methods) that define that behaviour. Further, communication
with the database should be by a system of message passing where the user is isolated
from the actual internal representations of the objects.

One view of an object-oriented database is that it is an extension of an object-oriented
language to handle persistence, queries, concurrency (multi-user), security and
integrity. Another view is that it is an extension of a conventional database to
handle procedures, inheritance and message passing. It is a moot point as to whether
the conventional divide between the ephemeral working data of the programming environment
and the persistent data in the database should be maintained or not.

Commercial object-oriented database systems are only just beginning to appear
in the market place. Relational databases suffer a number of serious short-comings
for applications in GIS (Egenhofer 1989, Frank 1988, Oxborrow 1989). Some of these
problems stem from particular features of the implementation of current commercial
relational databases, from not using the facilities of those databases in the most
appropriate manner, and from the awful syntax and current limitations of SQL (Herring et al 1988).
However, the absence of real world semantics in the relational model itself means
that the tools provided are at a very low level.

All implementations of the relational model compromise Codd’s rules to some extent,
from those which are no more than tabular representations to the ones that satisfy
most. In particular, the semantics of range queries and versions are missing. However,
tabular representations are a good way to make an efficient engine for managing
persistent data. It is therefore our current belief that a practical way
to implement an object-oriented database across many platforms is to combine relational
(or tabular) technology with an object-oriented language. This allows higher level
semantics to be embodied in the object-oriented environment of the language.

The Magik Object-Oriented Programming Environment

Magik is an extremely powerful language for the implementation of large interactive
systems. The language is a hybrid of the procedural and object oriented approaches
and program development is carried out in an interactive environment. The interactive
environment allows changes to the system to be immediately tested, without a prolonged
linking process and regardless of the size of the system.

We have implemented Magik in order to build an open, seamless development environment.
The way this is achieved is by embodying the following features in the language
and its development environment:

  • There is but one language for system, application and customisation development.
  • Both object orientation and procedural methodologies are supported.
  • Development is in an interactive environment.
  • The language is expressive and very readable.
  • There is an extensive library of standard object classes, methods and procedures.
  • The language is built as a platform suitable for delivering commercial systems.
  • Applications can be transferred with a minimum of effort between hardware platforms.

It is our belief that the presence of all these features is essential if commercial
systems are to be developed, maintained and customised with a minimum of programmer
effort. It is the lack of a viable language with a sufficient subset of these facilities
that has stimulated us to produce our own which embodies all of them.

[ Figure 2 not available ]

Magik allows programs to be developed in one seamless environment, meaning that
systems programming, applications development, system integration, and customisation
are all written in one environment in the same language. Thus, end users who wish
to customise the system can be confident in the quality of the tools provided because
they are identical to the development tools used by the core and application system
developers. Further, existing systems, such as most database management systems,
can be fully integrated so that to the user they appear as part of one homogeneous
system.

The Virtual Database

It has been said that GIS could be regarded as an integrating technology providing
a window into many disparate distributed databases. If this goal is to be achieved
then an architecture is needed in which the databases to be integrated are
set up as servers to the single client GIS.

There are a number of shortcomings in existing available database technology for
the building of a GIS. Nevertheless, there are now many organisations which have
committed in a big way to one of the emerging de facto standard database systems.
It is not acceptable for a GIS vendor to try to displace such a database with something
else tuned for GIS applications. It is necessary to engineer a solution which preserves
the user’s investment while at the same time doing as good a job as possible in
providing a GIS capability. As mentioned above, if one tries to handle all data
in the commercial DBMS, then it is highly likely that a serious performance problem
will result. If one runs a geometric DBMS alongside, then serious problems of backout,
recovery and integrity may result. It would seem that what is needed is some “virtual
DBMS” which can act as a front end to two or more physical DBMSs, and that this
should handle versioning (Easterfield et al 1990, Newell et al 1990) and have a
knowledge of all aspects to do with data dictionary, integrity, and access to the
various data. Data modelling of objects allows the user’s models to be built with
full recognition of their semantic content, a key feature not provided in the relational
world (Worboys et al 1989).
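
The notion of a front end that sits over two or more physical stores can be sketched very roughly as below. This is an illustrative Python fragment under strong assumptions: the class `VirtualDatabase`, its methods and the dataset names are hypothetical, plain dictionaries stand in for the physical DBMSs, and versioning, integrity checking and recovery are deliberately left out.

```python
class VirtualDatabase:
    """Hypothetical front end that routes each dataset to the store that holds it."""

    def __init__(self):
        self._stores = {}      # store name -> backend (here a plain dict)
        self._dictionary = {}  # dataset name -> store name (a tiny data dictionary)

    def attach(self, store_name, backend):
        """Make a physical store known to the front end."""
        self._stores[store_name] = backend

    def register(self, dataset, store_name):
        """Record in the data dictionary which store holds a dataset."""
        if store_name not in self._stores:
            raise KeyError(f"unknown store: {store_name}")
        self._dictionary[dataset] = store_name

    def _backend(self, dataset):
        return self._stores[self._dictionary[dataset]]

    def put(self, dataset, key, value):
        self._backend(dataset).setdefault(dataset, {})[key] = value

    def get(self, dataset, key):
        return self._backend(dataset)[dataset][key]


corporate = {}  # stands in for the commercial DBMS
geometric = {}  # stands in for a store tuned for geometry

vdb = VirtualDatabase()
vdb.attach("corporate", corporate)
vdb.attach("geometric", geometric)
vdb.register("customers", "corporate")
vdb.register("outlines", "geometric")

vdb.put("customers", 1, {"name": "A. Smith"})
vdb.put("outlines", 1, [(0, 0), (10, 0), (10, 8)])
print(vdb.get("outlines", 1))
```

The caller addresses datasets by name only; which physical store answers is decided by the data dictionary held in the front end.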

We have built a low level interface between the object-oriented and tabular worlds
in which a table maps onto an object class, a record maps onto an instance and
a field maps onto a slot. Higher level abstractions are then modelled wholly in
the object-oriented world. Such a representation is ideal for many of the navigational
style queries that one undertakes in a GIS.
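
The mapping of table to class, record to instance and field to slot can be illustrated with a small sketch. It is written in Python against the standard `sqlite3` module purely for convenience and is not the interface described above; the helper names `class_for_table` and `instances`, and the `house` table with its columns, are assumptions made for the example.

```python
import sqlite3


def class_for_table(conn, table):
    """Build a class for a table: each field of the table becomes a slot."""
    # Selecting zero rows still populates cursor.description with column names.
    cursor = conn.execute(f'SELECT * FROM "{table}" LIMIT 0')
    fields = tuple(column[0] for column in cursor.description)
    return type(table.capitalize(), (), {"__slots__": fields})


def instances(conn, table, cls):
    """Yield one instance of cls per record in the table."""
    fields = cls.__slots__
    query = f'SELECT {", ".join(fields)} FROM "{table}"'
    for record in conn.execute(query):
        obj = cls()
        for name, value in zip(fields, record):
            setattr(obj, name, value)
        yield obj


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE house (id INTEGER PRIMARY KEY, occupants INTEGER, volume_m3 REAL)"
)
conn.execute("INSERT INTO house VALUES (1, 4, 300.0)")

House = class_for_table(conn, "house")
houses = list(instances(conn, "house", House))
print(House.__name__, houses[0].occupants)  # House 4
```

Higher level behaviour would then be added to the generated class (or to a subclass of it) in the object-oriented world, leaving the tabular store unaware of it.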

The User’s View Of An Object-Oriented GIS

Any system built using an object-oriented environment could also be built by other
methods, and the end user may well be hard pressed to tell the difference. Sometimes,
systems claim to be object-oriented, because they are built out of an object-oriented
language which is non-interactive. In such systems, end users are denied the major
merit of the approach, which is the ability to modify and extend the system on
line. The claim to object-orientation may be valid, but so what, if the rest of
the world cannot use it?

Icon driven user interfaces are sometimes called object-oriented. This description
might be justified if the icons represent data objects and when clicked they know
what to do. For example on the Macintosh, clicking on a document icon results in
the appropriate word processor (drawing program, spread sheet, etc.) being started.
This differs from function icons to which you must later supply the data.

For a GIS to deserve the term object-oriented it needs rather more than an object-oriented
systems language and user interface.

The customiser of a system built with an interactive object-oriented front end
language is provided with an extremely open architecture in which he can access
and use many existing classes and their methods. He is provided with browsers to
explore this rich environment of existing functionality in order that he can utilise
it and modify it to make his own extensions to the system. The analogy has been
made to the hardware designer who is given an extensive library of standard components:
he knows how they respond to given inputs, even though he does not need
to know how they work internally. However, in the object oriented world he can
also make his own components (classes) which can borrow behaviour from one or more
existing classes (multiple inheritance).

The GIS applications programmer perceives all items as objects which have their
own behaviour. Although the data and behaviour may eventually be stored in separate
locations (in our case, in a Magik object library and an underlying database),
from the user’s point of view the objects are self contained items.

As a simple example, consider the following fragment from a GIS shown in Figure
3.

The object type BUILDING understands messages which are relevant for all types
of building, such as footprint (square metres) and volume (cubic metres), which
are then automatically inherited by HOUSE and OFFICE.

HOUSE extends the behaviour of BUILDING with, for example, approximate gas consumption
according to the rules used by the gas board knowing the volume of the house and
the number of occupants.
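
The BUILDING, HOUSE and OFFICE relationship can be sketched as a small class hierarchy. The fragment is illustrative Python rather than Magik, the volume is simplified to footprint times height, and the two gas coefficients are placeholders invented for the example; they are not the gas board's actual rules.

```python
class Building:
    """Behaviour relevant to all types of building."""

    def __init__(self, footprint_m2, height_m):
        self._footprint_m2 = footprint_m2
        self._height_m = height_m

    def footprint(self):
        """Footprint in square metres."""
        return self._footprint_m2

    def volume(self):
        """Volume in cubic metres (simplified to footprint times height)."""
        return self._footprint_m2 * self._height_m


class Office(Building):
    """Inherits footprint and volume unchanged."""


class House(Building):
    """Adds behaviour that only makes sense for houses."""

    # Placeholder coefficients, assumed for illustration only.
    GAS_PER_M3_OF_VOLUME = 2.0
    GAS_PER_OCCUPANT = 150.0

    def __init__(self, footprint_m2, height_m, occupants):
        super().__init__(footprint_m2, height_m)
        self._occupants = occupants

    def approximate_gas_consumption(self):
        """Estimate from the inherited volume and the number of occupants."""
        return (
            self.GAS_PER_M3_OF_VOLUME * self.volume()
            + self.GAS_PER_OCCUPANT * self._occupants
        )


house = House(80.0, 6.0, 4)
office = Office(500.0, 12.0)
print(house.volume(), office.volume())      # 480.0 6000.0
print(house.approximate_gas_consumption())  # 1560.0
```

Neither subclass redefines `footprint` or `volume`; a correction made once in `Building` is picked up by both.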

[ Figure 3 not available ]

From the application programmer’s point of view, he can retrieve HOUSEs from the
database using queries on stored or calculated values.

Within a GIS context, for all these objects, spatial properties such as AREA may
be inherited from SPATIAL_BEHAVIOUR. BUILDING and its sub-classes could understand
and respond to messages like ADJACENT_TO, CONTAINED_IN, NEAR_TO, etc. Should a
user wish to enforce his own definition of say, NEAR_TO for a HOUSE he could do
so.
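
A user-supplied definition of NEAR_TO for HOUSE might look like the following sketch. It is illustrative Python, with the spatial behaviour modelled as a mixin class; the centroid-based distance test and the thresholds of 100 and 25 metres are assumptions made for the example, not definitions taken from any real system.

```python
import math


class SpatialBehaviour:
    """Mixin giving spatial messages to any class that provides centroid()."""

    NEAR_DISTANCE_M = 100.0  # assumed default threshold

    def distance_to(self, other):
        (x1, y1), (x2, y2) = self.centroid(), other.centroid()
        return math.hypot(x2 - x1, y2 - y1)

    def near_to(self, other):
        """Default meaning of 'near': within NEAR_DISTANCE_M metres."""
        return self.distance_to(other) <= self.NEAR_DISTANCE_M


class Building(SpatialBehaviour):
    def __init__(self, x, y):
        self._x = x
        self._y = y

    def centroid(self):
        return (self._x, self._y)


class House(Building):
    def near_to(self, other):
        """User's own, stricter definition of 'near' for houses."""
        return self.distance_to(other) <= 25.0


depot = Building(60.0, 0.0)
office = Building(0.0, 0.0)
house = House(0.0, 0.0)

print(office.near_to(depot))  # True: 60 m is within the default 100 m
print(house.near_to(depot))   # False: 60 m exceeds the house's 25 m
```

Only `House` is affected by the override; every other class continues to use the inherited definition.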

One step further is that geometry (one of the spatial attributes) is also treated
as an attribute of an object. That is, the geometry of a HOUSE can be retrieved
or changed by using methods which operate on the object HOUSE; the fact that the
actual geometry may be stored in many separate tables in an underlying database
is irrelevant for the GIS applications programmer.
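
The point that geometry appears as a simple attribute, however it is stored, can be sketched as below. This is an illustrative Python fragment: `GeometryStore` is a hypothetical stand-in for the underlying database, with two dictionaries playing the part of separate tables, and the two-table layout itself is an assumption made for the example.

```python
class GeometryStore:
    """Hypothetical stand-in for a database that splits geometry over two tables."""

    def __init__(self):
        self.rings = {}     # object id -> ordered list of vertex ids
        self.vertices = {}  # vertex id -> (x, y)
        self._next_vertex_id = 1

    def read(self, object_id):
        """Reassemble the geometry of an object from both tables."""
        return [self.vertices[v] for v in self.rings.get(object_id, [])]

    def write(self, object_id, points):
        """Replace the geometry of an object, updating both tables."""
        for vertex_id in self.rings.get(object_id, []):
            del self.vertices[vertex_id]
        vertex_ids = []
        for point in points:
            self.vertices[self._next_vertex_id] = tuple(point)
            vertex_ids.append(self._next_vertex_id)
            self._next_vertex_id += 1
        self.rings[object_id] = vertex_ids


class House:
    def __init__(self, object_id, store):
        self._id = object_id
        self._store = store

    @property
    def geometry(self):
        return self._store.read(self._id)

    @geometry.setter
    def geometry(self, points):
        self._store.write(self._id, points)


store = GeometryStore()
house = House(1, store)
house.geometry = [(0, 0), (10, 0), (10, 8), (0, 8)]
print(house.geometry)  # [(0, 0), (10, 0), (10, 8), (0, 8)]
```

The applications programmer reads and assigns `house.geometry`; the split into rings and vertices stays inside the store and could be reorganised without changing that code.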

Conclusions

We have been engaged in developing a new kind of software architecture for building
and maintaining large, interactive, databased applications such as GIS. The main
issue that we have tried to address is the large costs involved in developing such
systems and particularly the costs of implementation and customisation. It is our
conclusion that object-oriented programming technology is sufficiently well understood
and gives such astounding benefits that now is the time to apply it to real commercial
systems. We strongly advocate that such an environment itself should be interactive
and that access to all objects in the system should be available in the user interface,
and not hidden deep in the bowels of the system where only the vendor’s system
programmers have access.

We advocate implementing an object-oriented database capability by front ending
a version managed tabular datastore with an object-oriented language. Existing
databases should also be accommodated in the same way, as large amounts of data
are already committed to these databases.

We observe that GIS is particularly well suited to object-orientation, and so
the benefits of the approach are considerable.

References

Butler, R. (1988). The Use of Artificial Intelligence in GIS. Mapping Awareness
and Integrated Spatial Information Systems, Vol. 2, No. 3.

Easterfield, M. E., Newell, R. G. and Theriault, D. G. (1990). Version Management
in GIS Applications and Techniques. Conference Proceedings, EGIS, Amsterdam, April
1990.

Egenhofer, M. J., and Frank, A. U. (1989). Object-Oriented Modelling in GIS: Inheritance
and Propagation. Auto-Carto 9, Baltimore, April 1989.

Frank, A. U. (1988). Requirements for a Database Management System for a GIS.
PE & RS Vol. LIV, No. 11, November 1988.

Herring, J. R., Larsen, R. C. and Shivakumar, J. (1988). Extensions to the SQL
Query Language to Support Spatial Analysis in a Topological Database. Proceedings
of GIS/LIS '88, Vol. 2, San Antonio, Nov 1988.

Newell, M. E. (1975). The Utilization of Procedure Models in Digital Image Synthesis.
Doctoral Dissertation, Dept. of Comp. Science, University of Utah, Summer 1975.

Newell, R. G., Theriault, D. G. and Easterfield, M. E. (1990). Temporal GIS Modelling
the Evolution of Spatial Data in Time. Conference Proceedings, GIS Design Models
and Functionality, Leicester, March 1990.

Oxborrow, E. and Kemp, Z. (1989). An Object-Oriented Approach to the Management
of Geographical Data. Conference Proceedings: Managing Geographical Information
Systems and Databases, Lancaster University, September 1989.

Worboys, M., Hearnshaw, H. and Maguire, D. (1989). The IFO Object-Oriented Data
Model. Conference Proceedings: Managing Geographical Information Systems and Databases,
Lancaster University, September 1989.
