Technical Paper 10 – Discrete Geometry with Seamless Topology in a GIS

by R. G. Newell and M. Doe

Abstract

Modern GIS systems go to great lengths to model a seamless world containing a
rich topological model. However, there is a requirement, particularly among utility
companies such as electricity and telecommunications companies to model their network
in both geometric and, possibly, multiple schematic coordinate systems. The schematics
may hold an alternative representation or a detail of par of the network. This
paper describes an elegant method of modelling such situations so that topology
is handled seamlessly and data common to the alternative

representations is not replicated. The technique, known as ‘multiple worlds ,
is a natural and straightforward extension of the spatial indexing techniques used
to handle a seamless database efficiently. One of the benefits of the approach
is that any network analysis application written to work in a single coordinate
world will work unchanged in a multiple world situation. Applications include the
integration of electricity substation schematics within a geographic database,
facilities management, multi-sheet schematic applications such as power systems
diagrams as well as P&ID diagrams in chemical plant applications.

Introduction

The work reported in this paper was carried out as a result of the requirement
to model complex network applications, particularly in the electricity supply and
telecommunications industries. The ideas presented have been developed within Smallworld
GIS. Smallworld GIS is one of a number of systems whose origins lay more in database
management foundations than in file and CAD foundations (Chance et al. 1990, Easterfield
et al.1990). In such systems, the underlying principle is that the GIS manages
a single logical database that manages both spatial and aspatial data together.
The principle extends to cases where the data is held in multiple existing (legacy)
databases.

One of the benefits of this approach is that the spatial data can be modelled
as it really is – a seamless carpet of topologically structured geometry. Such
an architecture requires different solutions to those implemented in tile-based
systems. In particular, spatial organisation is very important so the system gives
a good response time for spatial queries on very large seamless databases. There
are many successful spatial indexing schemes (Abel 1983, Abel 1984, Batty 1990,
Bundock 1987, Guttman 1984, Kriegel 1990, Libera 1986, Newell 1991, Samet 1989,
Seeger 1988, Smith 1990). In this paper we briefly describe the method employed
in Smallworld GIS. For a fuller explanation see Newell et al. 1991.

Although much has been written about managing spatial data within one coordinate
system, less has been written on managing data in multiple coordinate systems,
including cases where there is a topological structure, but possibly no corresponding
spatial coordinate data: for example, a complex junction where a large number of
inputs are connected to corresponding outputs using a connection’matrix’.

A common example of multiple coordinate systems in the electricity supply and
telecommunications industries is schematic diagrams of parts of a network where
there is too much detail to represent the structure in the ‘geographic’ coordinate
system. Such a case arises with substation schematics. Figure 1 shows such an example,
where a substation at Barnes Road has 3 separate representations, one geographic
and 2 schematic, as well as a duct containing cables represented geographically
and as a cross-section.

[Figure not yet available]

Figure 1: Geographic, schematic and cross-section representations

On the face of it, one might think that a tile-based architecture is more suited
to handling such situations, since presumably such systems have ways of handling
connections across tile boundaries. We hope to go on to show in this paper an elegant
approach that maintains connections across multiple coordinate systems without
incorporating any other compromises into the system.

Handling Topology

Smallworld GIS models the world as a collection of objects that have relationships
and attributes. Complex object relationships are modelled by one-to-one, one-to-many
or many-to-many relations that are defined in a CASE tool. Object attributes are
aspatial (represented as numbers or characters for example) and spatial (represented
as points, chains and areas). An aspatial attribute may be defined as a ‘logical’
attribute; such an attribute is not defined in the database, it is always derived
by a method on the object when needed.

[Figure not yet available]

Figure 2: Smallworld Object Structure

Topological relationships are modelled by points, chains and areas sharing nodes,
links or polygons. A chain consists of a linear sequence of links (xy or xyz coordinate
strings) and an area consists of a contiguous set of polygons. The two level structure
allows for a much richer modelling of topology than a layer based approach. Figure
2 shows an idealised representation of the structure underpinning the Smallworld
topology model. One object may have multiple geometries and many objects may share
one geometry.

Spatial Organisation

The Smallworld GIS Version Managed Datastore holds all of its spatial and aspatial
data seamlessly; there are no tiles or layers. The basic storage structure is tabular,
but unlike a conventional relational database, the access language is the object-oriented
language, Smallworld Magik. Fast spatial retrievals are achieved by means of an
improved quadtree indexing technique, which neither complicates nor compromises
the logical data structure described above. The primary key of all spatial data
contains an automatically generated ‘spatial key’. The method both clusters data
spatially on disk and provides an efficient index; Figure 3a illustrates how a
simple quadtree index can sometimes produce an inefficient index. Figure 3b illustrates
the overlapping quadtree index in which a small overlap produces additional digits
in the key.

[Figure not yet available]

Figure 3a: Traditional Quadtree Index

[Figure not yet available]

Figure 3b: Smallworld Overlapping Quadtree Index

Because the spatial key is not necessarily unique for two objects in proximity,
it is extended by an automatically generated unique identifier (see figure 3c).

       Quadtree key Unique id
      Figure 3c: Unique spatial key

Multiple Worlds

Although it is essential in many cases to model a seamless real world with a seamless
geographic database, there are also cases where some information needs to be handled
in a different coordinate system – a different world. Examples include applications
where part of the model is represented as a drawing schematic, either as an alternative
representation or as an expanded detail of part of the geographic world. An example
of the latter is a schematic that represents the internal structure of an electricity
substation. In such cases, although the geometry may be spread across different
coordinate systems, it is important to handle the topology seamlessly, so that
applications such as network analysis can work across multiple worlds unhindered
by modelling artifacts.

Implementation

Smallworld GIS handles multiple worlds as an extension to the quadtree indexing
scheme described above. The spatial index can be extended to include extra high
order bits. In this way topological tracing operations work in an identical manner
regardless of whether the geometry is in a single coordinate system or multiple
coordinate systems. Figure 4 shows how the spatial key is extended to handle multiple
worlds,

       World Id Quadtree Key  Unique id
      Figure 4: Structure of Smallworld spatial
      index.

A further extension of this idea allows the data to be partitioned according to
various non-spatial criteria. In Smallworld GIS, this technique is used to control
both drawing priority and to implement the idea of multiple worlds. A number of
high order bits are reserved to store a world ID. The next block of bits is reserved
to store drawing priority (see figure 5).

The clustering properties of the spatially indexed database will ensure that all
geometry in a particular world will be clustered on disc and it is therefore very
efficient to scan for geometry within one world.

The spatial scanners, used by the GIS in hit-searching and in screen refresh,
can be constrained to find data which has a particular world ID. Hence, viewing
and interaction with the spatial database is made world-specific.

Types of World

There may be many types of world in a GIS. Each world type typically has a different
meaning for the geometry it contains.

Geographic worlds represent the location of objects in the real world. This geometry
may or may not have associated topology.

Schematic worlds have geometry which represents the existence of objects and possibly
their topology, but does not carry any real-world positional information.

Cross Section worlds are essentially geographic, i.e. geometry represents real
world locations of objects, but use a coordinate system in which the x axis represents
distance along some arbitrary direction and the y axis represents vertical position.

Worlds and Universes

There are a fixed number of bits available to be shared between the spatial index,
drawing priority and world ID.

The number of bits used for the spatial index (together with the overall extents
of the scanned area) determines the size of the smallest quad-tree box and hence
the resolution of the quad-tree. This, in turn, affects the efficiency of hit-searching
and graphics re-draws. In a world which is densely populated and which has many
geographically small objects, a relatively fine resolution would be required. Conversely,
a world with very little geometry, or with very sparse geometry, can be scanned
with coarser resolution.

The number of bits used for the world ID and for drawing priority determines the
total number of worlds and priorities that can exist in the database. Hence, there
are competing requirements which have to be juggled to achieve a compromise allocation
of bits.

Using the topmost few bits of the spatial index to store a ‘universe id’, each
‘universe’ uses a different allocation of the rest of the bits. This allows each
universe to contain worlds with certain characteristics. A geographic universe
may contain just one world and hence uses its remaining bits to obtain maximum
spatial resolution. A substation internals universe or cross-section universe would
need to contain a large number of worlds, but each world is small and has little
geometry in it. We can therefore use up a large number of bits to store the world
ID, leaving relatively few for the spatial index without performance loss. A universe
for network schematics lies somewhere between the previous two examples: there
may be a relatively small number of schematic diagrams and the diagrams will be
relatively sparsely populated compared with the geographic world. Figure 5 shows
the full structure of the spatial key.

  Universe  World  Drawing      Quadtree  Unique
  Id        Id     priority     Key       id

Figure 5: Complete structure of spatial key

Alternative Representations of Objects and Networks

An object may be represented in several worlds and it may take a different form
in each. For example, an electricity substation is represented in the geographic
network by an area which is its boundary. In a schematic diagram, however, the
substation is represented as a collection of transformers, busbars and switches.

A cable or pipe may be represented by its centre line (chain) geometry in the
geographic world, but by a point in the world owned by a cross section object.
The cable may appear in many cross sections.

A network may also have more than one representation. An electricity network,
for example, may be represented in the geographic world by a connected network
of cables, overhead lines and joints, and in a schematic world as a connected network
of load sections. Both collections of geometry describe the same network.

Connections Between Worlds

Topological connections from one world to another may be made with a special object
called a hypernode. This object has two point geometries, one in each world. This
allows network traces to jump from world to world. For example, a network trace
in an electricity network may arrive at a substation, jump into its internals world
follow the geometry related topology inside it, then jump back out into the network
again.

Hypernodes are a simple example of a multi-pin device and can be given special
behaviour to change the trace state, or to prevent certain types of trace from
jumping.

Complex Objects

An object may have a very simple representation in the network but may have considerable
internal topological complexity which is hidden from the outside world. This internal
topology may be implemented in a number of ways:

Objects with Internals worlds

Each object owns its own internals world. The internal topology of the object
is associated with geometry in that world.

For example, an electricity substation has components: busbars, transformers and
switches, which have their geometry in the owning substation’s internals world.
These components form an internal topological network for the substation.

Multi-pin-devices

These are objects with several connection pins (point geometries). The number
of pins may be fixed, as in a simple switch or junction box, or dynamically variable
– pins may be added or removed as required by the user.

External connections to the network are made to these pins, but the internal topology
of the object is encapsulated as behaviour on the object itself. This topology
may be implemented as a hard coded set of rules, which may refer to attributes
of the object itself or to other conditions, or as a connectivity table which may
be updated by the user.

For example, the following pseudo-code describes a method on a simple switch object
which responds to the message ‘connectivity_trace’ from the network tracer.

      method simple_switch.connectivity_trace(pin)
      if self.closed
      then if pin = input_pin
           then return output pin
           else return input_pin
           endif
      else
           return unset
      endif

A multi-input/output junction box may have an associated connectivity table. When
asked for output pins connected to a particular input pin, the junction box will
perform a lookup of this table and return a number of pins. Records may be added
to or removed from this table as connections are made or broken in the junction
box.

Objects with trace behavior

These are objects with simple, geometry related topology but which have behaviour
which modifies that simple topology in some way.

For example, a road_junction may have a single node in the road network, which
may be connected to several roads. However, when asked for exit roads given a particular
approach road, the junction object overrides the simple topology (in which all
roads which join that junction are connected) and may take into account such things
as one way streets or restricted access. The road junction may also assign various
costs to these transitions; turning right (in the UK) may be a more costly move
than going straight on or turning left.

As with the multi-pin device, this internal topological complexity may be defined
algorithmically or by table lookup, or some combination of the two. There may be
a dependence on various internal external parameters such as the day of the week,
or the angle between approach road and exit road at a roundabout.

Encapsulating topology inside objects in this way allows us to simplify the geometry
of topologically complex networks. In a fibre-optic cable network, for instance,
each cable may contain 256 fibres and at each junction, the connections between
individual fibres must be modelled. However, the cable can be modelled by a single
chain and when network traces are carried out, the current fibre number is carried
as an attribute of the trace. The junction objects can use this attribute to determine
connectivity and will modify it when passing back output cables. Hence, the network
tracer will ask the junction “I have arrived along this cable on fibre number
125. Give me connected output cables and corresponding fibre numbers”.

Conclusion

The approach described in this paper to handling topologically structured data
in multiple coordinate systems is a simple and natural extension to the spatial
indexing techniques employed to organise a single large seamless geographic database.
The method is elegant in that it adds no complexity to the topological capabilities
of the system implemented for a single coordinate world. Even though geometry is
stored seamlessly, the spatial key implemented organises the data physically so
that data belonging to each coordinate system is stored contiguously in different
parts of the base datastore tables. The approach has been highly effective for
implementing complex models for a range of network applications.

References

Abel, D.J. & Smith, J.L. (1983): A Data Structure and Query Algorithm Based
on a Linear Key for a Rectangle Retrieval Problem, Computer Vision, Graphics, and
Image Processing 24,1, October 1983.

Abel,D.J. & Smith, J.L. (1984): A Data Structure and Query Algorithm for a
Database of Areal Entities, The Australian Computer Journal, Vol 16, No 4.

Batty, P. (1990): Exploiting Relational Database Technology in GIS, Mapping Awareness
magazine, Volume 4 No 6, July/August 1990.

Bundock, M. (1987): An Integrated DBMS Approach to Geographical Information Systems,
Autocarto 8 Conference Proceedings, Baltimore, March/April 1987.

Chance, A., Newell, R.G. & Theriault, D.G. (1990). An Object-Oriented GIS
– Issues and Solutions, EGIS ’90 Conference Proceedings, Amsterdam,, April 1990.

Easterfield, M.E., Newell, R.G. & Theriault, D.G. (1990): Version Management
in GIS Applications and Techniques, EGIS ’90 Conference Proceedings, Amsterdam,
April 1990.

Guttman, A. (1984): R-trees: A Dynamic Index Structure for Spatial Searching,
Proceedings of ACM SIGMOD Conference on Management of Data, Boston, June 1984.

Kriegel, H., Schiwietz, M., Schneider, R., Seeger, B. (1990): Performance Comparison
of Point and Spatial Access Methods, in Design and Implementation of Large Spatial
Databases: Proceedings of the First Symposium SSD ’89, Santa Barbara, July 1989.

Libera, F.D. & Gosen, F. (1986): Using B-trees to Solve Geographic Range Queries,
The Computer Journal, Vol 29, No 2.

Newell, R.G., Easterfield, M. and Theriault, D.G. (1991): Integration of Spatial
Objects in a GIS, Proceedings of Auto-carto 10, Baltimore March 1991, Volume 7,
pp 408-415.

Samet, H. (1989): The Design and Analysis of Spatial Data Structures, Addison
Wesley, 1990, ISBN 0-201-50255-0.

Seeger, B. & Kriegel,H. (1988): Techniques for Design and Implementation of
Efficient Spatial Access Methods, Proceedings of the 14th VLDB Conference, Los
Angeles, California, 1988.

Smith, T. R. and Gao, P. (1990): Experimental Performance Evaluations on Spatial
Access Methods Proceedings of the 4th Spatial Data Handling Symposium, Zurich 1990,
Vol 2, p991.

CFIS

CFIS

CFIS

Technical Paper 10 – Discrete Geometry with Seamless Topology in a GIS

Abstract

Introduction

Handling Topology

Spatial Organisation

Multiple Worlds

Implementation

Conclusion

References