Technical Paper 7 – Object-orientation: Some Objectivity, Please!

by Peter Batty, Senior Applications Consultant, Smallworld Systems

Abstract

“Object” would appear, to most people, to be a fairly innocuous, uninteresting word. Strange,
then, that the objective of many GIS vendors these days appears to be to mention
the word object as frequently as possible, objectivity being no object in labelling
a system as object-oriented, to the extent that the word object is in danger
of becoming an object of ridicule in the industry. Many people object to this
rather objectionable state of affairs, where the word object is used as frequently
and in as many different contexts as in this abstract, so the object of this
paper is to try to introduce some objectivity into the use of the word object
in relation to GIS.

The paper explains the difference between object-based systems, object-oriented
interfaces, object-oriented programming and object-oriented databases. It concentrates
in particular on explaining object-oriented programming, using real examples from
a GIS application which the author has just implemented in an object-oriented environment.

The paper also asks the question, “So what?” Even if the user can work out in
what respects a particular system is object-oriented, need they be concerned about
the answer?

Peter Batty is a Senior Applications Consultant with Smallworld Systems of Cambridge,
England. He has seven years of experience in the GIS industry. Most of this time
was spent working for IBM in both the UK and the USA, and he moved to Smallworld
Systems in 1992. His experience includes a wide range of GIS development and implementation
projects in many different industries and countries. He has written and presented
many articles and papers on technical issues in GIS. He has a BA in Mathematics
and an MSc in Computing from Oxford University.

Outline

The term object-orientation has become so widely used in GIS that describing a
system as object-oriented has become fairly meaningless. Rather than try to produce
a single definition of what constitutes an object-oriented system, this paper attempts
to outline the various ways in which the terms object and object-oriented are used
in GIS, and produces a summary checklist which can be used to clarify exactly what
a vendor means when they describe their system as object-oriented.

Topics which will be covered include the following:

  • Object-based (as opposed to sheet-based or tile-based) systems
  • Object-centred (as opposed to geometry-centred) systems
  • Object-oriented user interfaces
  • Object-oriented programming
  • Object-oriented databases

The first two topics are largely a question of defining terminology and are fairly
straightforward to understand. The third topic is rather vague, and is only briefly
discussed. The fourth area of object-oriented programming is somewhat more complex
to understand and it is this area that this paper will primarily focus on. The
final topic of object-oriented databases is not well defined, and various definitions
are discussed. Most of the (sensible) definitions of an object-oriented database
relate back to object-oriented programming, which emphasises the importance of
understanding this area in order to make sense of all the other definitions of
object-orientation which one may meet.

The following sections explain each of the above terms, and in each case discuss
the relevance of the topic to GIS.

Object-based Rather Than Map-based or Tile-based Systems

Perhaps the lowest level of functionality which is sometimes described as object-oriented
in the context of GIS is what this author would describe as an object-based or
feature-based approach to storing geographic data. Many GIS, especially those derived
from CAD systems, split the database into map sheets, tiles, or geographic partitions.
In such systems, any feature which crosses a map or tile boundary needs to be physically
stored as multiple geometric objects, although the system will usually contain
some functionality to make this split largely invisible to the end user.

In contrast, an object-based system does not partition the database into tiles,
and stores geographic objects or features as a fundamental unit in the database.
In such a system, linear or area objects never need to be artificially split because
they cross a tile boundary.

Although, as mentioned above, systems which use a tile-based approach can make
this reasonably transparent to the end user, extra code is required to do this,
and typically application development is more complex because of the need to handle
special cases where objects are split. There are also potential data integrity
problems in trying to ensure that all parts of the object are correctly maintained.
It is therefore generally agreed that an object-based approach is preferable to
a tile-based approach. The main reason for using a tile-based approach is that
this makes it simpler to achieve reasonable performance. However, modern spatial
indexing techniques make it possible to get very good performance with a seamless
object-based approach.

An object-based system as described in this section has nothing to do with the
term object-orientation as it is used in the rest of the computer industry. However,
in the experience of this author, many of the GIS which are described as object-oriented
by their vendors fall into this category.

Object-centred Systems (as Opposed to Geometry-centred Systems)

Another sense in which the word object is used in relation to GIS data modelling
is the term “object-centred” used by Newell (1). He contrasts what he terms object-centred
and geometry-centred data models for use in GIS.

A geometry-centred model is one in which the primary classification of objects
is a geometric one – for example each object is either a point, line or area. Each
of these geometric types is then subdivided into classes which represent objects
in the real world, for example a line might represent a road, a river or a gas
pipe, and have appropriate attributes associated with it in each case.

In contrast, in an object-centred model, the primary classification of objects
is based on the real world – so an object might be a road or a school. This object
has multiple attributes, which may be either alphanumeric or geometric. Hence an
object could have multiple different geometries of different types with this model.
For example, a road might have a line geometry representing its centreline, which
could be used for route tracing applications, and an area geometry representing
its extent, which might be used in cadastral applications. This approach also facilitates
generalisation, by allowing multiple geometric representations of an object which
can be used at different scales.

In general, an object-centred data model provides a number of advantages over
a geometry-centred model. However, as with the previous case, this use of the word
object has nothing to do with object-orientation. At the risk of confusing the
issue, it is possible to have an object-oriented system which is either geometry-centred
or object-centred as defined in this section (real-world-object-centred might be
a more accurate, if rather long, description for the latter data modelling approach).
These two approaches are just different ways of classifying the objects within
the system.
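
As a rough illustration of the difference, the two models can be sketched as data
structures. The sketch is in Python rather than Magik so that it can be run as
written, and every class and field name in it (GeometryCentredLine, Road,
feature_type and so on) is invented for this illustration rather than taken from
any GIS product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Point = Tuple[float, float]


# Geometry-centred: the geometric type comes first, and the real-world
# meaning is a sub-classification attached to the geometry.
@dataclass
class GeometryCentredLine:
    feature_type: str                    # e.g. "road", "river", "gas pipe"
    points: List[Point]
    attributes: Dict[str, str] = field(default_factory=dict)


# Object-centred: the real-world object comes first, and geometries are
# ordinary attributes of it, so one object may carry several of them.
@dataclass
class Road:
    name: str
    centre_line: List[Point]             # line geometry, e.g. for route tracing
    extent: List[Point]                  # area outline, e.g. for cadastral use


high_street = Road(
    name="High Street",
    centre_line=[(0.0, 0.0), (255.0, 0.0)],
    extent=[(0.0, -3.0), (255.0, -3.0), (255.0, 3.0), (0.0, 3.0)],
)

# The same road expressed in the geometry-centred style can only be one
# geometric thing at a time; here it is a line.
as_line = GeometryCentredLine("road", high_street.centre_line,
                              {"name": "High Street"})
```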

Object-oriented User Interfaces

The term object-oriented interface is a somewhat
nebulous one. Some people use it to describe any graphical user interface which
makes use of windows, icons, etc., such as Microsoft Windows or X-windows. However,
almost all GIS use a standard windowing system, so this is hardly a distinguishing
factor in comparing different systems.

Some people use the term object-oriented interface in more specific ways, for
example to describe the sort of interface used by systems such as the Macintosh,
where the general approach is to select an object first and then choose an action
to be carried out upon it, rather than choosing an action first and then an object.
However, there is no general agreement on a precise definition of an object-oriented
interface.

It is also true to say that, while user interfaces are obviously important in
GIS, they are not really a major consideration in terms of the use of object-orientation
in GIS, so we will not consider them any further here.

Object-oriented Programming

As stated earlier, understanding object-oriented programming is really the key
to understanding definitions of object-orientation in relation to other areas such
as databases, so it is this area that this paper will focus on.

One of the main challenges in explaining object-oriented programming is to find
examples which are detailed enough to show how it can give significant benefits
in practice, without being too long and difficult to understand. This section attempts
to provide some such examples, based on real GIS applications.

The language used in the following examples is Smallworld Magik. This paper will
just explain the minimum amount about the language syntax which is necessary to
understand the examples, since the aim is to explain the important concepts of
object-oriented programming in general, rather than any specific language. For
a more detailed introduction to the Magik language, see (2).

First we will introduce the basic ideas of object-oriented programming: objects,
classes, messages and methods. We will then look in turn at the concepts of encapsulation,
polymorphism and inheritance, which are defined by most authors to be the key things
which characterise an object-oriented programming language.

Objects, Classes, Methods and Messages

Somewhat predictably, the idea of an object is central to object-oriented programming.
An object is an item of data, very much like a variable (or constant) in a conventional
programming language. Every object belongs to an object class, which is analogous
to a data type in a conventional language. So for example, the number 1 is an object
belonging to the class integer, and the letter x is an object belonging to the
class character. These basic classes are defined as part of the system, as are
slightly more complicated classes analogous to other data types in conventional
languages, such as arrays.

However, one of the most important things about object classes is that new classes
can be defined by the programmer, based on existing classes. For example, we could
define a coordinate class, specifying that each coordinate has an x component and
a y component, each of which is a floating point number. This is similar to defining
a structure in a conventional programming language such as C. We can access the
components of an object as shown in the following example, which creates a new
coordinate object with x coordinate of 100 and a y coordinate of 200, and then
prints out the x and y coordinates separately:

c << coordinate.new(100, 200)
write("x = ", c.x)
write("y = ", c.y)

This would produce the output:

x = 100
y = 200

The first line creates a new coordinate object and stores this in a variable called
c (<< is the Magik assignment operator, like = in C or := in Pascal). In
the second line, the expression c.x sends the message x to the object c, and this
returns the value 100 (we will look further at messages in a moment). Components
of an object, like x and y in this example, are known as slots in Magik or instance
variables in Smalltalk.

As an aside, in Magik a slot does not have a fixed type – we could store a character
string like “Hello” in the x component of a coordinate if we wanted to. This would
not be a good idea in this case, but we will look at examples later where this capability
is very useful. In some other languages like C++, the type (or class) of a slot
has to be declared in advance and cannot be changed (this is known as static typing).
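
The same freedom exists in other dynamically typed languages. As a small illustration
in Python (used here only as a widely available stand-in for Magik; the Coordinate
class is hypothetical), an attribute can be rebound at run time to an object of a
different class:

```python
class Coordinate:
    def __init__(self, x, y):
        self.x = x
        self.y = y


c = Coordinate(100, 200)
type_before = type(c.x).__name__   # class of the value currently held in x

c.x = "Hello"                      # legal: the attribute has no declared type
type_after = type(c.x).__name__

print(type_before, type_after)
```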

Now we will look at messages and methods. So far everything we have discussed
relating to objects has an equivalent in conventional languages, like C, which
support the definition of composite data types or structures. However, a key difference
in an object-oriented programming language is that an object class not only defines
the data stored in objects of that class (as we have just briefly discussed), but
it also defines all the functions which can operate on objects of that class. These
functions are known as methods in an object-oriented system. Data in an object
can only be accessed via methods defined on its object class. The significance
of this will be discussed in the next section on encapsulation. A method is invoked
on an object by sending a message to that object, which causes a method of the
same name to be invoked. The distinction between messages and methods can be confusing
at first, but the same message could be sent to objects of different classes and
result in different methods being executed, because the method was defined differently
on each class. This will be discussed further in the section on polymorphism.

To finish this section, we will look at a few examples of invoking methods on
objects by sending messages to them. The Magik syntax for sending a message to
an object is of the form

object_name.message_name

Many methods will return a value (more accurately, they return an object). For
example, suppose we had an object called a_road. The following example shows several
Magik expressions in the left hand column and the object which is returned on the
right:

  Expression                  Object returned

  a_road.name                 "High Street" (a character string object)

  a_road.centre_line          A chain object (a chain is a basic geometric
                              object in the Smallworld GIS)

  a_road.centre_line.length   255.0 (the length of the road centre-line in
                              metres - this example shows how we can send
                              another message to an object which is returned
                              from another method . . . expressions like this
                              are evaluated from left to right).

As well as simply returning objects, as shown so far, methods can change data
or cause other actions. For example, the message draw() will invoke a method which
draws an object on all current windows in the GIS:

a_road.draw()

Parameters can be passed to a method – for example, the method draw_on() will
draw an object on a specified window:

a_road.draw_on(a_window)

Some methods create or change objects. In our first example we saw the method
new(), which creates a new object:

c << coordinate.new(100, 200)

There is a special message syntax for assigning data to slots, as in the following
example:

c.y << 300

This sets the y component of the coordinate c to 300.

Encapsulation

We mentioned in passing in the previous section that the only way that data within
an object (i.e. data in a slot) can be accessed or changed is via methods defined
on that object’s class. This is known as encapsulation, and we will consider its
significance in this section. The most important thing about encapsulation is that
it provides a well-defined and strictly enforced external interface to an object.
This makes it possible to change the internal implementation of an object without
affecting any of the other code which uses the object. This is a great advantage
when building large and complex systems. We will look at some examples of how encapsulation
could be used.

First consider the coordinate example we have already looked at. Our coordinate
object class has two slots, and we have methods x and y which allow us to access
these slots, and methods x<< and y<< which allow us to directly assign values to
those slots. For some operations it
may be more convenient to work with coordinates expressed in terms of a polar coordinate
system, as a radius and angle. We could define two new methods on the coordinate
object class, called radius and angle, as follows:

method coordinate.radius
  return sqrt(x*x + y*y)
endmethod

method coordinate.angle
  return atan2(y, x)
endmethod

Now executing the following . . .

c << coordinate.new(3, 4)
write("x = ", c.x, " y = ", c.y,
      " radius = ", c.radius, " angle = ", c.angle)

Would produce this output . . .

x = 3 y = 4 radius = 5 angle = 0.9273

Notice that there is no visible difference between the methods which directly
access data in the slots (x and y) and the methods which access derived data (radius
and angle). At the moment we have no way of setting the radius or angle directly
though, as we have not defined methods to do this. However, we could do this as
follows:

method coordinate.radius << new_radius
  current_angle << self.angle   # self.angle tells this object to
                                # send the message angle to itself
  x << new_radius * cos(current_angle)
  y << new_radius * sin(current_angle)
endmethod

method coordinate.angle << new_angle
  current_radius << self.radius
  x << current_radius * cos(new_angle)
  y << current_radius * sin(new_angle)
endmethod

We can now change and access radius and angle just as though they were slots.
If we now discovered that our application was using the polar form of the coordinates
much more than the cartesian form, we could redefine our coordinate object to have
slots called radius and angle instead of x and y, for efficiency, and define appropriate
methods x, y, x<< and y<< so that all the methods which were previously
defined were still available. Any existing programs using coordinates would run
without any change, even though the underlying implementation of coordinate has
completely changed. Note that slot access and update methods need not exist for
all slots in an object, so slots can be hidden from the external programming interface
for an object.
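
To make the representation change concrete, here is a minimal sketch of a coordinate
that stores radius and angle internally while still offering x and y. It is written
in Python rather than Magik so that it can be run as shown, and the underscore-prefixed
attribute names are an assumption of this sketch. Python hides such attributes only by
convention, so this is a weaker form of encapsulation than the strict enforcement
described in this section.

```python
import math


class Coordinate:
    """Public interface: x, y, radius and angle, all readable and assignable."""

    def __init__(self, x, y):
        # Internal storage is polar; callers are not meant to touch these.
        self._radius = math.hypot(x, y)
        self._angle = math.atan2(y, x)

    @property
    def radius(self):
        return self._radius

    @radius.setter
    def radius(self, new_radius):
        self._radius = new_radius

    @property
    def angle(self):
        return self._angle

    @angle.setter
    def angle(self, new_angle):
        self._angle = new_angle

    @property
    def x(self):
        return self._radius * math.cos(self._angle)

    @x.setter
    def x(self, new_x):
        y = self.y                     # read y before the storage changes
        self._radius = math.hypot(new_x, y)
        self._angle = math.atan2(y, new_x)

    @property
    def y(self):
        return self._radius * math.sin(self._angle)

    @y.setter
    def y(self, new_y):
        x = self.x                     # read x before the storage changes
        self._radius = math.hypot(x, new_y)
        self._angle = math.atan2(new_y, x)


c = Coordinate(3, 4)
c.radius = 10                          # same direction, new distance from origin
print(round(c.x, 6), round(c.y, 6))
```

Code that was written against x and y keeps working unchanged, which is the point
of the argument above.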

Encapsulation is a technique which can actually be used in non-object-oriented
languages, but it is usually not enforced by the language itself. For example, one
could define a coordinate data structure in a language like C, and define functions
called set_x, set_y, get_x and get_y, which were analogous to the slot access methods
we have described. However, we are entirely reliant on the discipline of the programmers
who use this data structure: whenever they access or update a coordinate, they must
use the specially provided access functions for doing so, rather than accessing
the underlying data structure directly. An object-oriented system strictly enforces
this principle of encapsulation.

Polymorphism

Polymorphism is the ability for the same variable to refer at different times
to different classes of object. We have found this particularly useful in GIS applications,
where there is often a requirement to handle heterogeneous groups of objects. We
can send a message to an object without knowing its class, and the appropriate
method for that class will be invoked on the object.

We will consider as an example a function to carry out Quality Assurance (QA)
on electrical network data which has just been captured. We have a set of rules
such as the following, for each object class which is relevant:

  • All low voltage (LV) joints must have at least 1, and no more than 4, cables
    connected to them.
  • All pole mounted transformers must have at least 1, and no more than 2, lines
    or cables connected to each of the low voltage and high voltage connections.

If we find an object which does not satisfy all the specified rules, we want to
tell the user, highlight the object, and change the currently displayed area so
that the object is in the centre of the screen. The way in which we will implement
this function is to define a method called valid? on each relevant object class,
which returns true or false depending on whether the object satisfies all the rules
or not.

At capture time we run interactive checks to ensure that the only object which
can be connected to an LV joint is an LV cable, so all we need to check at this
stage is the number of objects which are connected to the LV joint. In this particular
data model, an LV joint has a single point geometry called location. Thus our validation
method can be written as follows:

method lv_joint.valid?
  num_cables << self.location.all_connected_geometry.size
  if num_cables < 1 or num_cables > 4 then
    return false
  else
    return true
  endif
endmethod

In the first line, self.location returns the point geometry associated with this
lv_joint object. We then send this point object the message all_connected_geometry,
which returns a set containing all the geometries which are connected to that object.
In this case we do not wish to look at the individual items in this set, we just
want to know the size of the set, so we just send the set the message size. All
these methods are already defined in the standard class libraries (i.e. class definitions
and methods) which are provided with the system. This last object which is returned
(the size of the set, i.e. the number of cables connected to this joint) is assigned
to the variable num_cables. We then do a simple test on the value of num_cables
to check whether this is valid or not, and return true or false accordingly.

We can define a similar method for pole mounted transformers as follows. This
is slightly more complicated since this object has two point geometries, called
lv_connection and hv_connection. These represent the distinct connection points
for low voltage and high voltage cables or lines belonging to this transformer.
Again we validate interactively that only LV cables and LV lines can be connected
to the lv_connection, and that only HV cables and HV lines can be connected to
the hv_connection, so we just need to check the number of objects connected to
each of these geometries.

method pm_transformer.valid?
  num_lv_conns << self.lv_connection.all_connected_geometry.size
  num_hv_conns << self.hv_connection.all_connected_geometry.size
  if min(num_lv_conns, num_hv_conns) < 1 or
     max(num_lv_conns, num_hv_conns) > 2 then
     return false
  else
     return true
  endif
endmethod

This method is very similar in principle to the last one, except that in this
case we have to check the connections to each of the two geometries belonging to
the object.

We can now define our QA validation function as follows:

method qa_menu.validate_objects()
  for an_object over grs.objects_inside_area(current_qa_area)
  loop
    if not an_object.valid? then
      grs.current_object << an_object
      an_object.goto()
      grs.show_message("Invalid object found")
      return
    endif
  endloop
  grs.show_message("QA completed successfully")
endmethod

This is the complete code for this application. We define the validation function
as a method on an object called qa_menu, which is a menu we have created from which
the user will initiate the QA function by pressing a button. The qa_menu object
has a couple of slots which are referred to in this method. The first is called
grs, which is the graphics system we are currently running – this is quite a complex
object which is essentially the whole GIS application, which has slots referring
to the current database, all the menus displayed, the currently selected object,
etc. There is also a slot which stores the QA area we are currently working within.
We just want to check objects inside this area, so we send the message objects_inside_area()
to the graphics system. This is a special type of method called an iterator method
– it returns objects one at a time to the loop which follows. Inside the loop,
we send the message valid? to each object which is found inside the area. This
is where polymorphism is important – even though we don’t know the type of object
which has been returned (we could find out, but we don’t need to), we can send
it the same message, valid?, and the appropriate method called valid? gets invoked
depending on the class of the object. This example should clarify the difference
between a message and a method. We have defined two distinct methods called valid?,
one on the class lv_joint and one on the class pm_transformer. However, we can send
exactly the same message called valid? to an object of either class and the appropriate
method will be invoked.

If the object fails the validation test then we make it the current object in
the graphics system, which causes its geometry to be highlighted and its object
class and attributes to be displayed. We then send the object the message goto(),
which causes this object to be displayed in the current graphics view, and finally
we display an alert message to the user and exit from the loop (and the method)
with a return statement.

The great beauty of this approach is that we can add a new object class to our
application, define a method called valid? for it, and the validation code will
work immediately without requiring any changes. You don’t even need to compile
or link anything. In contrast, with a conventional procedural language it would
be very hard to write an equivalent QA function which could be extended to accommodate
new object classes and rules without having to modify the source code of the validation
routine itself. Allowing customers to directly modify product source code is highly
undesirable for a software vendor (and indeed for the customer, as it makes support
and problem resolution much more difficult), so it is much easier to produce systems
which can be easily and cleanly extended in an object-oriented environment like
the one we are discussing.
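
The extension argument can be sketched in a few lines. The example below is in Python
rather than Magik, purely so that it can be run as written. The class names mirror the
ones above and is_valid stands in for valid?, while the Pole class and its height rule
are invented for illustration. Connection counts are passed in directly instead of
being read from geometry.

```python
class LvJoint:
    def __init__(self, num_cables):
        self.num_cables = num_cables

    def is_valid(self):
        # Rule from the text: at least 1 and no more than 4 cables.
        return 1 <= self.num_cables <= 4


class PmTransformer:
    def __init__(self, num_lv_conns, num_hv_conns):
        self.num_lv_conns = num_lv_conns
        self.num_hv_conns = num_hv_conns

    def is_valid(self):
        # Rule from the text: 1 or 2 connections on each of the two sides.
        return 1 <= self.num_lv_conns <= 2 and 1 <= self.num_hv_conns <= 2


def first_invalid(objects):
    """Return the first object that fails its own rules, or None."""
    for an_object in objects:
        if not an_object.is_valid():   # same message, class-specific method
            return an_object
    return None


# A class added later. first_invalid() is not edited in any way.
class Pole:
    def __init__(self, height_m):
        self.height_m = height_m

    def is_valid(self):
        return self.height_m > 0       # hypothetical rule


captured = [LvJoint(3), PmTransformer(1, 2), Pole(-1.0)]
bad = first_invalid(captured)
print(type(bad).__name__ if bad else "QA completed successfully")
```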

As another aside, one important point which has been touched on in passing is
that the Magik programming environment is interactive. One can be running the GIS,
modify a validation method like those above while the system is running, and immediately
test the effects of the change without having to compile or link anything. The
same is true of Smalltalk, but not of C++, which requires you to compile, link,
and re-run your application before you can test the change. Having an interactive
programming environment makes a huge difference to development productivity.

Inheritance

The third main area which characterises object-oriented programming is inheritance.
Inheritance allows new object classes to be defined in terms of existing object
classes, inheriting both data structure (i.e. definition of slots) and behaviour
(definition of methods) from the defining parent class or superclass. A class which
inherits from another class is said to be a subclass of its parent. It is possible
to define additional slots and additional methods on a subclass. It is also possible
to define a method in a subclass with the same name as a method in its parent class,
and this new method will override the method from the parent class. We will look
at examples of all these things shortly. In overview though, the inheritance mechanism
provides a very powerful way of writing generic code which can be shared by many
classes, whilst at the same time allowing any differences from this generic behaviour
to be easily defined in subclasses. This results in much smaller amounts of code
overall, which again greatly helps the reliability and maintainability of a system.

The value of inheritance is most apparent in quite complicated systems, so it
is difficult to illustrate its full benefits in a short paper such as this. To
illustrate the basic concept of inheritance though, we will return to our coordinate
example. Suppose that for some applications we need to handle 3-D coordinates,
and that for the most part these will be used in the same way as 2-D coordinates
(displayed on 2-D maps etc), but that in some cases 3-D coordinates will have additional,
or different, behaviour.

First we will define a few examples of behaviour on 2-D coordinates. We will assume
that we have the access methods x and y which we used before.

We can define a method to measure the distance between two coordinates as follows:

method coordinate.distance_to(another_coordinate)
  dx << self.x - another_coordinate.x
  dy << self.y - another_coordinate.y
  return sqrt(dx*dx + dy*dy)
endmethod

We could also define a method to check whether a coordinate was inside a bounding
box (this is a horizontal rectangular area, often used for initial area comparisons
in a GIS, which is defined by its bottom left corner (xmin, ymin) and its top
right corner (xmax, ymax)).

method coordinate.inside?(a_bounding_box)
  bb << a_bounding_box
  if self.x >= bb.xmin and self.x <= bb.xmax and
     self.y >= bb.ymin and self.y <= bb.ymax
  then
    return true
  else
    return false
  endif
endmethod

There would obviously be a lot more methods defined on a coordinate in practice,
but these will suffice for this example. A simple example of creating and using
some coordinates and related objects is as follows:

# Create a coordinate
c1 << coordinate.new(5, 5)

# Create another coordinate
c2 << coordinate.new(5, 15)

# Create a bounding box
bb << bounding_box.new(0, 0, 10, 10)

# Check if c1 is inside the box
write(c1.inside?(bb))

# Check if c2 is inside the box
write(c2.inside?(bb))

# Calculate the distance from c1 to c2
write(c1.distance_to(c2))

This would produce the following output:

True
False
10

We can now define a subclass of coordinate called 3d_coordinate which inherits
from coordinate and has an additional slot called z. This will immediately inherit
all the methods we have defined on coordinate, so the operations we have defined
above will still work in the same way, accessing the x and y coordinate of the
3d_coordinate and ignoring the z coordinate. We could define a new method to calculate
the 3d distance between two 3d coordinates as follows:

method 3d_coordinate.3d_distance_to(another_3d_coordinate)
  dx << self.x - another_3d_coordinate.x
  dy << self.y - another_3d_coordinate.y
  dz << self.z - another_3d_coordinate.z
  return sqrt(dx*dx + dy*dy + dz*dz)
endmethod
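
For readers who want to try this out, the same subclassing pattern can be sketched in
Python (chosen only because it is easy to run; the names Coordinate3D and
distance_3d_to are this sketch's own, since Python identifiers cannot begin with a
digit):

```python
import math


class Coordinate:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def distance_to(self, other):
        # 2-D distance, using only x and y.
        return math.hypot(self.x - other.x, self.y - other.y)


class Coordinate3D(Coordinate):
    def __init__(self, x, y, z):
        super().__init__(x, y)         # reuse the parent's slot set-up
        self.z = z                     # the one additional slot

    def distance_3d_to(self, other):
        dx = self.x - other.x
        dy = self.y - other.y
        dz = self.z - other.z
        return math.sqrt(dx * dx + dy * dy + dz * dz)


p = Coordinate3D(0, 0, 0)
q = Coordinate3D(3, 4, 12)

flat = p.distance_to(q)        # inherited method, ignores z
full = p.distance_3d_to(q)     # new method defined on the subclass
print(flat, full)
```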

In this way we can easily extend the behaviour of existing classes. We can also
modify the behaviour of a subclass relative to its parent by overriding methods.
We will look at a different example to illustrate this. Magik also
allows multiple inheritance, i.e. inheritance from more than one parent class.
It is possible to define special classes called mixins, which do not have any slots
but are just used to define behaviour (methods) which can be inherited by other
classes.
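
Overriding and mixins can both be seen in a short sketch. It is in Python, which also
supports multiple inheritance, and all of the names in it (DescribableMixin,
PointObject, Pole, Transformer) are invented for illustration and are not part of any
Smallworld class library.

```python
class DescribableMixin:
    """Behaviour only: this class defines no data of its own."""

    def describe(self):
        return f"{type(self).__name__} at {self.location}"


class PointObject:
    def __init__(self, location):
        self.location = location

    def orientation(self):
        return 0.0                     # generic behaviour for point objects


class Pole(DescribableMixin, PointObject):
    pass                               # inherits everything unchanged


class Transformer(DescribableMixin, PointObject):
    def __init__(self, location, angle):
        super().__init__(location)
        self.angle = angle

    def orientation(self):             # overrides PointObject.orientation
        return self.angle


pole = Pole((1, 2))
transformer = Transformer((0, 0), 90.0)
print(pole.describe(), pole.orientation(), transformer.orientation())
```

Both subclasses pick up describe() from the mixin and their slot from PointObject,
while only Transformer replaces the generic orientation behaviour.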

The example we will consider is a data conversion application. We will look at
defining methods which specify how objects are interactively created. Within Smallworld
GIS, there is standard functionality provided to allow the user to create and manipulate
an object called a trail, which is just a general piece of geometry. The trail
is a multi-point line, and functions are provided to add points to the trail, move
and delete them, generate points by raster line following, etc. The geometry in
the trail is used to define the geometry of objects which are added to the system,
such as cables or poles. Point objects can be defined either with a single point
trail, for an object with fixed orientation, or with a two point trail for an object
with variable orientation, where the first point defines the centre of the object
and the direction from the first to the second point defines the orientation of
the object. This is illustrated in the following diagram:

[Figure not available at this time]

There are various types of behaviour common to point objects in this data capture
application, so we define a class called point_object, on which we will define
behaviour common to point objects which can be inherited by application objects
such as joints, poles and transformers.

We will define a general method for creating the main geometry of a point object
from a trail which will cover both of the cases above. It turns out to be useful
to allow a point object to be added at the end of a long trail in certain situations,
for example when digitising linear objects such as cables. We will therefore specify
that point objects without orientation will be added at the location of the last
point in the trail with an orientation of zero, whilst point objects with orientation
will be added at the last but one point in the trail, and the orientation of the
last trail segment will define the orientation of the object. This is illustrated
in the following diagram:

[Figure not available at this time]

In this application, when the user presses the insert button, a new object of
the current type is created with no geometry, and then this object is sent the
message create_geometry_from_trail(), so that the appropriate default geometry
will be created from the current trail. Since we have two different sets of behaviour,
point objects with and without orientation, we can define two new classes called
point_object_with_orientation and point_object_without_orientation on which we
can define the appropriate behaviour to create geometry from the trail. Both of
these classes inherit from point_object, so any behaviour which applies to
every point object (with or without orientation) can be defined on the point_object
class, and it will automatically be inherited by these two subclasses.

We now define our methods as follows:

method point_object_without_orientation.create_geometry_from_trail(grs)
  new_point << point.new_at(grs.trail.coords.last)
  self.default_geometry << new_point
endmethod

method point_object_with_orientation.create_geometry_from_trail(grs)
  trail << grs.trail
  if trail.size > 1 then
    new_point << point.new_at(trail.coords[trail.size - 1])
    new_point.orientation << trail.segment_angle
  else
    new_point << point.new_at(trail.coords.last)
  endif
  self.default_geometry << new_point
endmethod

The first method creates a new point at the last coordinate in the trail. This
is done by sending the graphics system object, grs, the message trail which returns
a trail object. This in turn is sent the message coords, which returns a vector
(array) of coordinates, and this is sent the message last, which returns the last
element of any ordered collection. So we now have a coordinate, and we create a
new point at this coordinate (a point has more information than a coordinate, such
as an orientation, and information on other geometries which are connected to that
point). No orientation is specified for the point here, since the default orientation
is zero, which is what we want in this case. We then assign the default geometry
of the new object to the point we have created. This assignment causes user-definable
rules to be invoked to connect this geometry to other specified geometries within
a given tolerance, as appropriate.

The second method is similar, but in this case we define the location of the point
to be at the last but one point of the trail, provided that the trail has more
than one point. To do this we use the indexing method [n], which accesses the nth
element of any ordered collection. We also assign an orientation to the point,
which we obtain by sending the trail the standard message segment_angle, which
returns the angle of the last segment in the trail. If there is only one point
in the trail, we create the new point in the same way as for a point object without
orientation.
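
The two methods can be mimicked in a few lines of Python, purely as an illustrative analogue. The Trail and Point classes below are hypothetical stand-ins, not the Smallworld classes; the trail is passed in directly rather than obtained from a graphics system object; and the angle convention (radians, anticlockwise from the x axis) is an assumption. Note that Python indexes from zero, so the last but one coordinate is written coords[-2].

```python
import math


class Trail:
    """Hypothetical stand-in for a trail: an ordered list of (x, y) coordinates."""

    def __init__(self, coords):
        self.coords = list(coords)

    @property
    def size(self):
        return len(self.coords)

    def segment_angle(self):
        # Angle of the last segment (assumed convention: radians,
        # anticlockwise from the x axis).
        (x1, y1), (x2, y2) = self.coords[-2], self.coords[-1]
        return math.atan2(y2 - y1, x2 - x1)


class Point:
    """A coordinate plus an orientation, which defaults to zero."""

    def __init__(self, coord, orientation=0.0):
        self.coord = coord
        self.orientation = orientation


class PointObject:
    """Behaviour common to every point object would be defined here."""

    default_geometry = None


class PointObjectWithoutOrientation(PointObject):
    def create_geometry_from_trail(self, trail):
        # Last point of the trail, default orientation of zero.
        self.default_geometry = Point(trail.coords[-1])


class PointObjectWithOrientation(PointObject):
    def create_geometry_from_trail(self, trail):
        if trail.size > 1:
            # Last but one point, oriented along the last trail segment.
            new_point = Point(trail.coords[-2], trail.segment_angle())
        else:
            new_point = Point(trail.coords[-1])
        self.default_geometry = new_point


trail = Trail([(0, 0), (10, 0), (10, 5)])

joint = PointObjectWithoutOrientation()
joint.create_geometry_from_trail(trail)   # placed at (10, 5)

pole = PointObjectWithOrientation()
pole.create_geometry_from_trail(trail)    # placed at (10, 0), facing (10, 5)
```

The same message sent to the two objects produces different geometry from the same trail, which is the polymorphism that the insert button relies on.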

When we define application point objects like joints, poles and transformers,
each of them will inherit either from point_object_with_orientation or point_object_without_orientation.
We could also define a new create_geometry_from_trail() method on any of these
specific objects if we wished its behaviour to be different in terms of how its
geometry was created from the trail. For example, we might wish to regard a substation
as a point object with orientation, since like the other point objects we have
considered it is a valid end point for a cable, so it shares behaviour in this
respect. However, we wish to represent the primary geometry of a substation as
a rectangular area geometry, of a size which depends on the voltage level of the
substation.

We would like to define the location of the substation by placing a point at its
bottom left corner and placing a second point to indicate its angle. This could
be done with the following method:

method substation.create_geometry_from_trail(grs)
  trail << grs.trail

  # Define the bottom left corner and the angle
  # from the trail
  if trail.size > 1 then
    base_coord << trail.coords[trail.size - 1]
    orientation << trail.segment_angle
  else
    base_coord << trail.coords.last
    orientation << 0
  endif

  # Set the substation size (in mm) depending
  # on the voltage
  if self.voltage = "LV" then
    xsize << 5000
    ysize << 3000
  else
    xsize << 12000
    ysize << 8000
  endif

  # Now create the relevant area geometry
  new_area << area.new_rectangle(base_coord, xsize, ysize, orientation)
  self.default_geometry << new_area
endmethod

So now our substation object has all the behaviour of a point object with orientation,
except for the way in which its geometry is created from the trail. It can be seen
that in this way inheritance gives us a very powerful technique for sharing code
between object classes – we only need to write additional code for a new object
class where its behaviour differs from its parent class. This example also illustrates
the flexibility of a “real world object centred” data model rather than a
“geometry centred” data model: we can define a range of objects to be regarded
as “point objects” for the purposes of this application, even though they have
different geometry types.
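
The override can be sketched in Python as follows. This is an analogue under stated assumptions, not the Smallworld implementation: rectangle_corners is our guess at what a call such as area.new_rectangle() computes (a rectangle rotated about its bottom left corner), the helper base_and_angle is a hypothetical refactoring that the Magik method above does not use, and trail coordinates are passed in as a plain list.

```python
import math


def rectangle_corners(base, xsize, ysize, angle):
    """Corners of an xsize by ysize rectangle with its bottom left corner at
    `base`, rotated anticlockwise by `angle` radians about that corner."""
    bx, by = base
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    unrotated = [(0, 0), (xsize, 0), (xsize, ysize), (0, ysize)]
    return [(bx + x * cos_a - y * sin_a, by + x * sin_a + y * cos_a)
            for x, y in unrotated]


class PointObjectWithOrientation:
    """Simplified parent class; trail coordinates are passed in directly."""

    def is_valid_cable_end(self):
        return True

    def base_and_angle(self, coords):
        # Last but one point and the angle of the last segment, or the only
        # point and an angle of zero for a single point trail.
        if len(coords) > 1:
            (x1, y1), (x2, y2) = coords[-2], coords[-1]
            return coords[-2], math.atan2(y2 - y1, x2 - x1)
        return coords[-1], 0.0

    def create_geometry_from_trail(self, coords):
        self.default_geometry = self.base_and_angle(coords)


class Substation(PointObjectWithOrientation):
    def __init__(self, voltage):
        self.voltage = voltage

    def create_geometry_from_trail(self, coords):
        # Only this method is overridden; everything else is inherited.
        base, angle = self.base_and_angle(coords)
        # Sizes in mm, as in the Magik method above.
        xsize, ysize = (5000, 3000) if self.voltage == "LV" else (12000, 8000)
        self.default_geometry = rectangle_corners(base, xsize, ysize, angle)


substation = Substation("LV")
substation.create_geometry_from_trail([(100, 200)])
```

With a single point trail the angle is zero, so the LV substation becomes an unrotated 5000 by 3000 rectangle whose bottom left corner is at (100, 200), while is_valid_cable_end() is still inherited unchanged from the parent.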

Object-oriented Databases

Whilst there is a reasonable degree of agreement as to what constitutes object-oriented
programming, as described in the previous section, there is less agreement as to
what constitutes an object-oriented database. Some people, including some well
known figures in the GIS industry, seem to use the term for any database which
can store “blobs” (binary large objects, such as images), in addition to
traditional data types such as numbers and character strings.

An alternative definition is that it is a system which provides a persistent store
for objects in an object-oriented programming environment, so that objects continue
to exist when a program finishes running. To be regarded as a proper database management
system (DBMS), such a system should also support multi-user access to the data,
and handle the associated issues of concurrent update, and also provide other standard
database functions such as security, backup and recovery. The interface to objects
in such an object-oriented database, from a programming point of view, is usually
exactly the same as the interface to non-persistent objects.
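
The persistent store idea can be illustrated with Python's standard shelve module, which keeps objects on disk behind the same interface as an ordinary in-memory dictionary. This is only a single-user sketch with hypothetical data: it provides none of the concurrency control, security or recovery functions which a proper DBMS would need.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "pipes")

    # First "program run": store an object exactly as we would in a dictionary.
    with shelve.open(path) as store:
        store["pipe_1"] = {"material": "steel", "length": 12.5}

    # Second "program run": the object still exists after the store was closed.
    with shelve.open(path) as store:
        restored = store["pipe_1"]
```

The code which reads and writes the store is indistinguishable from code which uses a non-persistent dictionary, which is the property described above.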

The advantages of an object-oriented DBMS are essentially an extension of those
for object-oriented programming: with such a DBMS it is possible to use all the
same data modelling techniques on objects which need to be stored in the database.

However, whilst there is general agreement in the computer industry that object-oriented
programming is a good thing, database experts seem to be divided over the virtues
of object-oriented databases. There is a well-developed theory behind relational
databases, a lot of experience has been gained with them, and there are established
standards such as SQL. No such formal theory has been developed for object-oriented
databases. There are still some unresolved issues with object-oriented databases,
such as providing a general query language and optimising queries. There is therefore
a school of thought which says that rather than regarding object-oriented databases
as something completely independent of current database technology, relational
database systems should be extended to accommodate object-oriented ideas, and to
allow them to be used within an object-oriented programming environment.

Smallworld has taken an approach which uses a version-managed relational database
management system, with an object-oriented programming interface added to it. Accessing
objects in the database is very similar to accessing non-database objects, but
with a couple of restrictions. The first restriction is that slots in database
objects have to have a fixed type (class) which is declared in advance, as with
any relational database. This is in contrast to non-database objects in Magik,
whose slots can be used to store any object of any class. The second restriction
is that in the current version of the system, behavioural inheritance (inheritance
of methods) is supported on database objects, but structural inheritance (inheritance
of slots) is not. However, exploratory work has been done on structural inheritance
and it is planned to support this in a future release of the product.
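
The first restriction can be pictured with the following Python sketch, in which a hypothetical TypedSlot descriptor rejects values of the wrong class, while an ordinary object accepts anything. It illustrates the contrast only; it is not how Smallworld enforces slot types, and all names in it are our own.

```python
class TypedSlot:
    """Descriptor for a slot whose class is declared in advance."""

    def __init__(self, slot_type):
        self.slot_type = slot_type

    def __set_name__(self, owner, name):
        self.name = name
        self.private_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return getattr(instance, self.private_name)

    def __set__(self, instance, value):
        if not isinstance(value, self.slot_type):
            raise TypeError(
                f"slot {self.name} expects {self.slot_type.__name__}, "
                f"got {type(value).__name__}"
            )
        setattr(instance, self.private_name, value)


class PipeRecord:
    """Database-style object: every slot has a fixed, declared type."""

    material = TypedSlot(str)
    length = TypedSlot(float)


class Scratch:
    """Ordinary object: a slot may hold any object of any class."""


record = PipeRecord()
record.material = "steel"
record.length = 12.5

rejected = None
try:
    record.length = "twelve and a half"   # wrong class for this slot
except TypeError as error:
    rejected = str(error)

scratch = Scratch()
scratch.anything = 12.5
scratch.anything = "now a string"         # accepted without complaint
```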

A database table is regarded as a collection in Magik. A collection is a general
class which stores a group of objects. There are many standard subclasses of collection
which are provided with the system, such as sets, arrays, ordered collections,
etc. Database tables form a class called ds_collection (datastore collection).
There are various standard methods which apply to all collections, for example
size, which returns the number of elements in a collection, and elements(), which
is an iterator method which returns all the elements of the collection in turn.

The following example gives a brief flavour of how database objects can be accessed.
Suppose that we have a cost attribute in pipes, which we wish to set in all pipes
based on other attributes in the pipe (in reality we would probably set this interactively,
triggered by any change in the pipe, but this sort of batch update is a reasonable
illustration of access to the database).

for p over pipe_table.elements()
loop
  material_cost << material_table.at(p.material).unit_cost
  p.cost << p.length * material_cost
endloop

In this example we loop over each of the objects (records) in the pipe table.
For each one we obtain the material cost by looking in another table. The method
at() returns the database object at a specific primary key value in a table, which
is a very efficient means of accessing a record. We can also access records using
generic SQL-like predicates. The record object returned has slots like any other
object, so we access the unit_cost slot of the material record returned. We then
assign the cost attribute of the current pipe database record to the pipe length
multiplied by the material cost. Since this is a database object, this value is
automatically stored in the database.
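
For readers unfamiliar with Magik, the same loop can be imitated in Python. The Table and Record classes below are hypothetical in-memory stand-ins for database tables, offering the size, elements() and at() behaviour described above; the data values are invented, and unlike real database objects nothing here is written to a database when a slot is assigned.

```python
class Record:
    """A record whose slots are ordinary attributes."""

    def __init__(self, **slots):
        self.__dict__.update(slots)


class Table:
    """Hypothetical in-memory stand-in for a table with a primary key."""

    def __init__(self, key, records):
        self._by_key = {getattr(record, key): record for record in records}

    @property
    def size(self):
        return len(self._by_key)

    def elements(self):
        # Iterator which returns each record of the collection in turn.
        return iter(self._by_key.values())

    def at(self, key_value):
        # Direct lookup of the record with the given primary key value.
        return self._by_key[key_value]


material_table = Table("name", [
    Record(name="steel", unit_cost=40.0),
    Record(name="pvc", unit_cost=15.0),
])
pipe_table = Table("id", [
    Record(id=1, material="steel", length=10.0, cost=None),
    Record(id=2, material="pvc", length=4.0, cost=None),
])

for p in pipe_table.elements():
    material_cost = material_table.at(p.material).unit_cost
    p.cost = p.length * material_cost
```

After the loop, pipe 1 has a cost of 400.0 (10.0 multiplied by 40.0) and pipe 2 a cost of 60.0 (4.0 multiplied by 15.0).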

Summary

We have discussed a number of uses of the term object-oriented in relation to
GIS. The following is a summary set of questions which you should ask of anyone
who calls their system object-oriented in order to clarify what they mean:

  1. Does it store objects as a fundamental unit in the database, with no need
    to split objects across tile boundaries or partitions? This is what we called
    an object-based system: we would not call such a system object-oriented.
  2. Does it have a “real world object centred” data model rather than a
    “geometry centred” model, as described above? The answer to this question
    has no bearing on whether or not a system is object-oriented.
  3. Does it provide an object-oriented programming environment which supports
    the following:
    a) Encapsulation
    b) Polymorphism
    c) Inheritance
  4. Does it provide a set of standard class libraries which can be extended by
    the customer?
  5. Does it provide a database system which supports each of the previous concepts?

We will not be pedantic about trying to specify a precise set of answers to these
questions which mean that a system is object-oriented or not, since this seems
a rather pointless exercise.

Conclusion: So What?

Even if you can obtain answers to these questions, what does all this mean? To
an end user of the system, it really makes little direct difference. When sitting
in front of a system, you cannot tell whether or not it is object-oriented. The
primary benefits of object-orientation are in ease of customisation and maintenance
of the system, so the person who really sees the benefits of object-orientation
is the application developer. In turn, this of course benefits the end user, who
can expect to see applications delivered, and bugs fixed, much more quickly in
an object-oriented system. It is Smallworld’s experience after several years’ work
developing a GIS using an object-oriented programming environment that this approach
is significantly more productive than traditional approaches to development.

References

1. Richard G. Newell. Practical experiences of using object-orientation to implement
a GIS, Proceedings of GIS/LIS 92.

2. Arthur Chance, Richard G. Newell and David G. Theriault. An Overview of Smallworld
Magik, Smallworld Technical Paper no. 9.
