Technical Paper 13 – The Wide Area Connection

by John Rowland, Grampian Regional Council

Abstract

GIS data is voluminous, demanding upon bandwidth and therefore normally requires
high speed network links. This has served to constrain “real time” wide area distribution
of GIS data. In conjunction with British Telecom, Gandalf Digital Communications
Ltd and Smallworld Systems Ltd, Grampian Regional Council believes it has been
able to implement a realistic solution to this problem using Smallworld’s recently
developed “Persistent Cache” functionality running over British Telecom “Kilostream” links.

This paper:

  • briefly explains Grampian Regional Council’s requirement for wide area GIS;
  • overviews wide area communication options;
  • explains the basic concept of intelligent bridging;
  • describes the key features of the Smallworld System which have been used to
    implement wide area connections;
  • reviews experience to date;
  • briefly considers what the future may hold.

Grampian Regional Council & its Corporate GIS

Grampian Regional Council administers a land area of approximately 8,000 km² which
is home to a population of 530,000, half of whom live in the City of Aberdeen.
As with other Scottish Regional Councils, its responsibilities include the provision
of water, drainage, roads, economic development, strategic planning, fire, police,
education and social services.

The Council’s main headquarters is Woodhill House in Aberdeen, although some departments
also operate from a number of divisional and other offices located throughout the
Region.

In 1992 the Council commenced implementation of a Corporate GIS which was installed
in Woodhill House for use by four departments (Economic Development and Planning,
Property, Roads and Water Services). The Council selected Smallworld GIS running
under UNIX as its core system. At present all departments share a single corporate
GIS database which is managed by a Sun MP630 file server. With ongoing data capture
this database continues to increase in size; at the time of writing it held 4GB
(gigabytes) of GIS data.

Responsibility for maintaining this database and ongoing implementation of the
system on behalf of user departments is vested in a six person team called the “GIS
Unit”. To date the Council has acquired a total of twenty nine GIS “seats” from
Smallworld with more on the way. Ten of these seats have recently been acquired
by the Department of Water Services for installation at six different office locations
remote from Woodhill House (see figure 1).

Until recently it had not been viable for the Council to operate their Corporate
GIS over a wide area network. However, Smallworld’s recently developed Persistent
Cache database management software combined with “state of the art” network bridge
technology has enabled the Council to implement wide area connections using leased
British Telecom Kilostream lines. At the time of writing two of the Department
of Water Services’ remote offices have been connected to the main file server in
Woodhill House.

[ Figure 1 not available ]

Wide Area Connection Components

The wide area connection has four key components: a physical communication link
(British Telecom 64Kbps Kilostream in the first instance), intelligent bridging
(Gandalf LANLine), Smallworld version managed GIS database and Smallworld Persistent
Cache software (2).

Physical communication links

A 500m x 500m Ordnance Survey vector tile of an urban area typically contains
around 250Kbytes of uncompressed data. In order to pass such a tile over a network
and display it in a total elapsed time of less than 45 seconds the network has
to pass data at a speed of in excess of 44Kbps (Kilo bits per second). In order
to view 1 km² of similar data (four such tiles) in the same time the speed would have to
increase to around 180Kbps.

This should not be a problem over a local area network with a bandwidth of 10Mbps
(Mega bits per second). However, if all that there is between office locations
is a public telephone network, a couple of high speed modems operating at 14.4Kbps
and the inherent “dial up” delay of analogue communications, then there clearly
is a problem.
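
The arithmetic behind these figures can be checked with a short sketch. It assumes 1 Kbyte = 1,000 bytes and 1 Kbit = 1,000 bits and ignores protocol overhead and display time; the function names are illustrative only.

```python
def required_kbps(kbytes, seconds):
    """Line speed in kilobits per second needed to pass `kbytes` of data in `seconds`."""
    return kbytes * 8 / seconds


def transfer_seconds(kbytes, line_kbps):
    """Time in seconds to pass `kbytes` of data over a line running at `line_kbps`."""
    return kbytes * 8 / line_kbps


TILE_KBYTES = 250    # typical 500m x 500m urban tile (figure from the text)
TARGET_SECONDS = 45  # elapsed time target (figure from the text)

one_tile_kbps = required_kbps(TILE_KBYTES, TARGET_SECONDS)
one_km2_kbps = required_kbps(4 * TILE_KBYTES, TARGET_SECONDS)  # four tiles per square kilometre

print(f"one tile in 45s: {one_tile_kbps:.1f} Kbps")
print(f"1 sq km in 45s: {one_km2_kbps:.1f} Kbps")
print(f"one tile over a 14.4Kbps modem: {transfer_seconds(TILE_KBYTES, 14.4):.0f} s")
print(f"one tile over a 10Mbps LAN: {transfer_seconds(TILE_KBYTES, 10_000):.1f} s")
```

On these assumptions a single tile needs a little over 44Kbps and a square kilometre a little under 180Kbps, while a 14.4Kbps modem would take well over two minutes per tile even before any dial up delay.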

There is no alternative but to seek a digital communications link. Depending
upon what you are prepared to pay, digital links can provide effective line speeds
of 64Kbps up to in excess of 8Mbps with minimal “dial up” delay. They can either
be ISDN (“pay when you use”) dial up links or dedicated “Kilostream”
or “Megastream” leased lines.

ISDN

In the United Kingdom ISDN (Integrated Services Digital Network) is available
either as ISDN2 providing an effective 128Kbps line speed using two 64Kbps channels
or ISDN30 providing an effective 1.92Mbps using thirty 64Kbps channels. At the
time of writing British Telecom ISDN2 socket installations were being charged at
approx £400 per site, line rent at £84 per quarter and transmission at normal telephone
call rate charges.

Leased Lines

Leased lines normally incur an initial installation charge and a subsequent annual
rental charge which varies according to distance from the nearest digital exchange.
At the time of writing British Telecom were charging £900 per site to install 64Kbps “Kilostream”.
The annual line rent of a link between two sites varies according to distance and
proximity to BT exchanges; some indicative figures are quoted in the ISDN2 v Kilostream
comparison below.

In contrast 2Mbps “Megastream2” currently costs £6,200 per site plus £750 per
link for a first installation and 8Mbps “Megastream8” £9,734 per site plus £2,625
per link. Line rents vary according to distance between BT exchanges; for example,
if two exchanges were 50km apart, Megastream2 would currently cost £15,740 per
annum to rent and Megastream8 £55,108 per annum. Even the most optimistic GIS cost
benefit analysis may have difficulty in justifying expenditure of this magnitude!

Despite current talk of information super highways it is of little surprise that
many multi site GIS installations are still reliant upon using tapes, discs and
couriers to transfer data between individual sites.

The wide area connections to Grampian Regional Council’s six Water Service remote
offices are being implemented using a single 64Kbps Kilostream channel to each
office.

[ Figure 2 not available ]

Intelligent Bridging

Bridge or gateway devices are needed to connect the physical wide area communication
link between two remote sites to the local area networks (LANs) at those sites.

A bridge is effectively a filter which joins two network segments such that data
will only pass through the bridge to a second segment if it is destined for a device
connected to it. Bridges are commonly used to segment local area ethernets so that
unwanted data packets are not allowed to flow along segments where they are not
needed.

In a UNIX environment bridging is achieved using the IP (Internet Protocol) part
of the TCP/IP protocol (1). Every device connected to an ethernet has its own unique
IP address. A data packet being transmitted from one device to another always carries
with it the IP address of the device to which it is being sent. In the case of
a data packet which is broadcast to all devices on a network the IP address is
coded so as to indicate that it needs to be delivered to every device.
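
The filtering behaviour described above can be illustrated with a minimal sketch. It is not a description of how the Gandalf bridges are implemented: the `Bridge` class, the static address table and the example addresses (taken from a range reserved for documentation) are all assumptions made for illustration, and packets for unknown destinations are simply not forwarded.

```python
BROADCAST = "broadcast"  # stand-in for an address coded for delivery to every device


class Bridge:
    """Joins segment A and segment B and passes only the traffic that needs to cross."""

    def __init__(self, segment_a, segment_b):
        # Static table recording which segment each known device sits on.
        self.segment_of = {address: "A" for address in segment_a}
        self.segment_of.update({address: "B" for address in segment_b})

    def forwards(self, source, destination):
        """Return True if a packet from `source` to `destination` must cross the bridge."""
        if destination == BROADCAST:
            return True  # broadcasts are delivered to every device on both segments
        source_segment = self.segment_of.get(source)
        destination_segment = self.segment_of.get(destination)
        # Cross only when the destination is known and lies on the other segment.
        return destination_segment is not None and destination_segment != source_segment


bridge = Bridge(segment_a={"192.0.2.1", "192.0.2.2"}, segment_b={"192.0.2.9"})
print(bridge.forwards("192.0.2.1", "192.0.2.2"))  # local traffic stays on its own segment
print(bridge.forwards("192.0.2.1", "192.0.2.9"))  # traffic for the other segment crosses
print(bridge.forwards("192.0.2.1", BROADCAST))    # broadcasts cross
```

The third case is worth noting: broadcast packets always cross the bridge, which is why broadcast traffic matters when the far side of the bridge is a slow or chargeable link.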

Gateways are special devices for transferring data between two different networks
which adhere to different network protocols. As such they actually have to restructure
the data packets which pass through them and are therefore inherently slower than
bridges.

Wide area physical communication links between sites are nearly always slower
than the local area networks they connect together, hence bridge or gateway devices
are needed to prevent unwanted local area traffic from escaping to and causing
congestion on the physical wide area link. Bridges supplied by Gandalf and other
vendors for this purpose incorporate a number of intelligent features to enhance
their performance.

Data Compression

Data Compression algorithms are used to compress transferred data, so as to achieve
actual throughput which exceeds the quoted bandwidth of the physical wide area
communication link. The degree of compression depends upon the extent to which data
is already compressed. For example tests at Grampian Regional Council indicate
that their Gandalf “LANLine” bridges operating over 64Kbps Kilostream are able
to compress raw NTF files by ratios in excess of 3:1 and already compressed TIFF
files by ratios of around 2:1, thus achieving effective throughput of data in excess
of 192Kbps for raw NTF and 128Kbps for TIFF. Even higher compression ratios of
up to 8:1 can be achieved with these devices.
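
The throughput figures quoted above follow from multiplying the line speed by the compression ratio, as the sketch below shows. The transfer time for a 250Kbyte tile is an illustration only: it assumes that such a tile compresses as well as the raw NTF files tested, which has not been measured here, and it ignores protocol overhead.

```python
LINE_KBPS = 64  # single Kilostream channel


def effective_kbps(line_kbps, compression_ratio):
    """Throughput of uncompressed data when the bridge compresses by `compression_ratio`."""
    return line_kbps * compression_ratio


def transfer_seconds(kbytes, line_kbps, compression_ratio=1.0):
    """Seconds to pass `kbytes` of uncompressed data over the compressed link."""
    return kbytes * 8 / effective_kbps(line_kbps, compression_ratio)


for label, ratio in [("raw NTF", 3), ("TIFF", 2), ("best case", 8)]:
    print(f"{label}: {effective_kbps(LINE_KBPS, ratio)} Kbps effective")

# Assumed case: a 250Kbyte tile compressing at the 3:1 ratio seen for raw NTF.
print(f"250Kbyte tile at 3:1: {transfer_seconds(250, LINE_KBPS, 3):.1f} s")
```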

[ Figure 3 not available ]

Transparent Automatic Dial Up

Bridges built specifically for connecting local
area networks to “dial up” links such as ISDN embody an “automatic dial up” facility
whereby (for UNIX networking) the bridge is configured with a table which maps
different network IP addresses to the phone numbers to which they are connected.
Thus packets emanating from a “departure” site will cause their interconnecting
bridge to automatically dial up the phone number of the “destination” site.

ISDN bridges will normally also have a configurable “time out” connection period
which specifies how long an ISDN connection should remain connected for after a
packet has been transmitted. For example if the time out were set to 30 seconds
then the connection will close every time there is a break of 30 seconds between
transmitted packets. Given that ISDN connection dial up can be made in as little
as 5 seconds it is quite feasible to make several very short connections during
the course of the working day and only incur a relatively small phone bill.

Automatic dial up and subsequent timed out disconnection are totally transparent
to the user; thus the ISDN bridge provides a virtual permanent connection.
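
The effect of the time out setting on the number and length of calls can be explored with a small simulation. This is a sketch under stated assumptions rather than a model of any particular bridge: packet times are given in seconds, the dial up delay is ignored, and the function name is illustrative.

```python
def summarise_calls(packet_times, timeout_s=30.0):
    """Return (number of calls, total connected seconds) for a list of packet times.

    The line is dialled when a packet arrives and no call is open, and it is
    dropped once `timeout_s` seconds have passed without a further packet.
    """
    calls = 0
    connected = 0.0
    call_started = None
    line_closes_at = None
    for t in sorted(packet_times):
        if line_closes_at is None or t >= line_closes_at:
            if line_closes_at is not None:
                connected += line_closes_at - call_started  # bill the call that timed out
            calls += 1
            call_started = t
        line_closes_at = t + timeout_s  # each packet restarts the idle timer
    if line_closes_at is not None:
        connected += line_closes_at - call_started  # bill the final call
    return calls, connected


# Two bursts of traffic separated by more than the 30 second time out.
print(summarise_calls([0, 10, 20, 100, 110], timeout_s=30))
```

In this example the two bursts give rise to two separate calls. A longer time out would merge them into one longer call, so the setting is a trade between the number of call set ups and the total chargeable connection time.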

Bandwidth On Demand

ISDN2 incorporates two individual 64Kbps channels which can
either be used in parallel to achieve an effective 128Kbps bandwidth (with compression
actual throughput will be even faster), or separately to send data to two different
destinations at the same time. Similarly Kilostream can be installed in multiples
of 64Kbps channels and used in much the same way.

“Bandwidth on demand” characteristics of local to wide area bridges enable individual
ISDN and Kilostream channels to be automatically opened and closed to different
destinations according to actual traffic volumes. With the Gandalf “LANLine” bridges
it is also possible to mix and match Kilostream and ISDN together such that an
ISDN connection can be opened when a single permanent Kilostream channel becomes
overloaded.

Virtual Extended Local Area Networks

The net effect of state of the art intelligent bridging used in conjunction with
digital wide area communication links such as ISDN and Kilostream is to create
a virtual extended local area network. In a UNIX environment client workstations
located at one site can access server devices at another site several kilometres
away as if both devices were connected to the same local area network, albeit with
degraded performance if the volume of data being transferred between sites exceeds
the available bandwidth of the physical wide area link.

Not only does this permit remote offices to access main office data, but also
to output data to peripheral devices, such as expensive large format electrostatic
plotters, located in the main office.

Database Version Management

In order to understand how Persistent Cache is being used to provide Grampian
Regional Council’s “wide area connection” it is first necessary to provide a brief
explanation of their implementation of Smallworld’s version managed database.

Smallworld Version Management permits several versions of the database to exist
simultaneously. In Grampian’s case these versions are organised hierarchically
as illustrated by figure 2. There is a single definitive “top alternative” which
is normally never written to directly. Each department is then provided with its
own version of the “top alternative” which again is normally never written to
directly; instead all users who are required to write to the database are each
provided with their own “personal writable alternative”.

For routine data capture work users are usually asked to update their departmental
alternative on a daily basis by “posting up” their own personal alternative to it. This
has to be preceded by a “merge down” of all changes which have already been posted to
their departmental alternative. Once all personal versions have been “merged and posted”
a departmental administrator then ensures that their own department’s alternative is
“merged and posted” to the “top” definitive alternative, thereby inheriting changes and
updates made by other departments.

Grampian Regional Council’s GIS Unit is responsible for maintaining the Ordnance
Survey map base and other shared corporate datasets such as a number of different
gazetteers. Within the alternative structure the GIS Unit is treated as another
department; thus departments, and in turn end users, have their map base maintained
for them by virtue of the “merge and post” procedures.

Within the UNIX file system the GIS database is held in a set of files storing
different types of data (eg geometrical points, lines, areas, associated attributes
etc). Database alternatives can be created so as either to be located totally within
a file set held in a single directory or to reside in a separate
sub directory with the same file structure. Thus the UNIX file system can if desired
be configured so as to totally or partially mirror the database alternative structure
(figure 3). This in turn implies that different alternative versions of the database
can be stored on different storage devices on the same network.

Persistent Cache

Smallworld’s Persistent Cache software (2) enables all or a subset of a GIS database
to be cached to a local disc attached to a client workstation which is in turn
configured to be a local cache server to both itself and other clients. By maintaining
a copy of frequently accessed data in the local cache, it is an elegant and transparent
way of providing large systems with high performance over low speed communication
links.

In figure 4, workstation A is a local cache server located at a remote site along
with client workstation B. GIS read transactions generated by workstations A and
B look first to the local cache to retrieve data. If the requested data has not
been cached it is retrieved from the main file server via the wide area connection
and then cached.

The local cache has a configurable operating capacity; once this capacity has
been filled, old cached data is deleted from the cache on a “least recently used” basis.
The cache capacity can be set to be large or small depending upon the size of the
required database subset. If need be (local disc space permitting) it could be
set to be large enough to replicate the original database.
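
The read path and the “least recently used” deletion described above can be sketched as follows. This illustrates the general technique only and is not Smallworld’s implementation: the `LocalCache` class, the `fetch_remote` callback standing in for a read over the wide area connection, and the measurement of capacity in blocks rather than bytes are all assumptions.

```python
from collections import OrderedDict


class LocalCache:
    """Read-through cache which discards the least recently used block when full."""

    def __init__(self, capacity, fetch_remote):
        self.capacity = capacity          # maximum number of blocks held locally
        self.fetch_remote = fetch_remote  # called on a miss, i.e. a wide area read
        self.blocks = OrderedDict()       # ordered from least to most recently used
        self.remote_reads = 0

    def read(self, key):
        if key in self.blocks:
            self.blocks.move_to_end(key)  # mark the block as most recently used
            return self.blocks[key]
        data = self.fetch_remote(key)     # slow path over the wide area connection
        self.remote_reads += 1
        self.blocks[key] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # delete the least recently used block
        return data


cache = LocalCache(capacity=2, fetch_remote=lambda key: f"data for {key}")
for block in ["a", "b", "a", "c", "a"]:
    cache.read(block)
print(cache.remote_reads, list(cache.blocks))
```

In the usage example block “b” is the one discarded when “c” arrives, because “a” had been read more recently, so the final read of “a” is served locally.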

When using Persistent Cache, remote site users are able to retrieve cached data
very quickly and uncached data at the speed of the wide area connection. Hence
if a subset of the main database is cached there will be occasions when read transactions
may suddenly appear to slow down as data is retrieved over the wide area connection.

Write transactions write directly to the user’s alternative every time a database
record is inserted, updated or deleted; the changes are then subsequently copied back to the
local cache.

At appropriate periods of time, remote site users initiate merging and posting
of their changed data with higher order alternative versions. The merge and post
processes are run on whichever machine holds the various alternatives. The local
cache is updated where new “merged down” change data is located in a geographical
area that is already held in cache.

By virtue of the ability to map alternative versions of the database
onto different UNIX directories (see figures 2 and 3) users’ alternatives can either
be held on the main server back at headquarters or somewhere locally at the remote
site. This provides organisations with a high degree of flexibility as to how they
operate over wide area connections.

Holding Remote Site Alternatives on Main Server

If users’ alternatives are located at headquarters then all write data is passed
over the wide area connection whenever a database record is inserted or updated.
In a data capture environment this implies that relatively small amounts of data
are passed frequently over the wide area connection.

Database commits and alternative version posting are processed back on the main
server and therefore no data is passed over the wide area connection. Similarly
the merge process (merging down of changed data from higher order alternatives)
is also undertaken back on the main server, however the amount of changed data
passed back across the wide area connection will depend upon the volume of merged
down changed data which maps onto currently cached “geography”. By holding all
remote site alternative change data on the main server the remote site users do
not need to be concerned with data backup and other routine system administration
tasks which can all be undertaken back at headquarters.

[ Figure (diagram) not available ]

Holding Remote Site Alternatives Locally

By holding users’ alternative change data locally, no write data is passed over
the wide area connection until the locally held alternative versions are merged
and posted with and to higher order versions located back on the main file server.
If daily posting and merging is undertaken then this implies a daily transfer of
a larger volume of change data over the wide area connection.

The volume of changed data merged back down to the locally held alternatives is
entirely dependent upon the amount of data which has been recently posted to the
top (definitive) version of the database by other users. This could be considerable
if say a new batch of Ordnance Survey maps had been recently loaded.

Populating the Local Cache

The local cache is essentially an extended reflection of the data which a local
client workstation holds in memory. It is therefore composed of a subset of object
class layers for “blocks” of geographical extent. For example Grampian’s Water
Service divisional offices cache background map and water supply object class layers
for all or part of their divisional areas of operation.

Upon initial creation the local cache is “empty” and must be populated. Users
can be left to do this during the course of natural usage; upon first access all
data is “hauled” over the wide area connection and then cached. This could be a
little tedious if two or more users at the local site are simultaneously hauling
data over a 64Kbps line. They could therefore instead organise to “zoom out” to
a large extent of geography as they leave for home so that the area in which they
wish to work the following day has been cached upon return to work the following
morning.

Alternatively initial cache data can be written to tape by staff back at headquarters
and then copied into the local cache in order to “kick start” it.

Grampian Regional Council’s Wide Area Connection

Kilostream v. ISDN2

Although the Council already had some operational wide area communication links
it was decided that the Corporate GIS would have its own dedicated links because
of difficulties in extending heavily subscribed existing facilities to sites where
GIS was required. The lowest cost option able to provide acceptable performance
was therefore sought. This turned out to be a choice between ISDN2 and single channel
Kilostream. Capital installation costs were very similar for both (approx £2,500
per site); however, in the case of ISDN2 ongoing running costs varied considerably
according to degree of use.

For total daily connection times of less than about four hours per working day
ISDN2 is cheaper to operate than fixed fee Kilostream as illustrated below for
a notional 247 working days per year at current British Telecom day rate call charges:

[ Figure (cost notes) not available ]

The above costings indicate that the most cost effective option is dependent upon
the nature of GIS use at the remote site. If there is a low level of write transaction
at a site where a significant proportion of the database is held on the local cache
then ISDN2 provides a very flexible and potentially inexpensive wide area link.
However, if there is a high level of regular write transaction or considerable
regular “hauling” of uncached data throughout the working day then Kilostream is
going to be the more viable.
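
The break even point between the two options can be expressed as a simple formula, sketched below. The ISDN2 line rent (£84 per quarter) and the 247 working days come from this paper; the hourly call charge and the Kilostream annual rent are placeholders chosen purely to exercise the arithmetic and are not British Telecom tariffs. The sketch also assumes that a single channel is charged at a flat hourly rate.

```python
WORKING_DAYS = 247          # notional working days per year (from the text)
ISDN2_ANNUAL_RENT = 4 * 84  # pounds, from the quarterly line rent quoted earlier


def isdn2_annual_cost(hours_per_day, call_rate_per_hour):
    """Annual ISDN2 running cost: line rent plus call charges for the connected time."""
    return ISDN2_ANNUAL_RENT + hours_per_day * WORKING_DAYS * call_rate_per_hour


def break_even_hours(kilostream_annual_rent, call_rate_per_hour):
    """Daily connection time at which ISDN2 costs the same as fixed fee Kilostream."""
    return (kilostream_annual_rent - ISDN2_ANNUAL_RENT) / (WORKING_DAYS * call_rate_per_hour)


# Placeholder figures only (hypothetical, not actual tariffs).
PLACEHOLDER_KILOSTREAM_RENT = 3300.0  # pounds per annum
PLACEHOLDER_CALL_RATE = 3.0           # pounds per connected hour

hours = break_even_hours(PLACEHOLDER_KILOSTREAM_RENT, PLACEHOLDER_CALL_RATE)
print(f"break even at about {hours:.1f} hours per working day")
```

Below the break even figure ISDN2 is the cheaper option and above it Kilostream is, so the useful input from each site is an estimate of its daily connection time.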

Because it was known that the first two Water Service offices to be connected
were “heavy” GIS users (they had been previously using GIS in a standalone capacity)
and there still appeared to be technical problems handling broadcast messages over
ISDN it was decided to adopt Kilostream for the first wide area connections.

Experience to date

Initial use indicates that the successful operation of the wide area links is
more dependent upon operational management than technical factors. The two remote
sites connected to date comprise two locally networked GIS workstations currently
used for data capture work. By its very nature data capture work does not involve
frequent extended panning across the map base, hence “hauling” of uncached data
has not been a problem with a relatively large capacity cache which was pre-populated
prior to installation.

Data transfer across the wide area connection performs rather like a motorway
contraflow, inasmuch as if there is very little traffic on the motorway then,
ignoring speed limits, traffic flow is virtually as quick as if there were no
contraflow. However as the volume of traffic increases the actual throughput speed decreases
in almost exponential proportion.

1 km² of inner city water data takes only slightly longer to display
when retrieved over the wide area connection than when retrieved straight from cache.
However 1 km² of inner city water data plus all Landline OS data takes
significantly longer to display.

Grampian’s two Water Service offices have been configured so that local users’
alternative versions are stored back on the main server, consequently data is passed
over the Kilostream every time a record is inserted or updated. Users have noticed
a degradation of write transaction time when they both write simultaneously. The
degree of degradation is acceptable but does indicate that sites with a number
of writing users may need to either store their alternative versions locally or
be provided with access to additional communication channels over the wide area
link.

The conclusion to date is that the nature of GIS usage needs to be understood
in order to specify and configure a wide area connection for optimum performance.

[ Figure 4 not available ]

What Of The Future

Grampian Regional Council believes that it has been able to implement wide area
networked GIS at realistic cost using technology which is available today. It has
been proven that a single channel Kilostream link operating at 64Kbps is adequate
for the scale of present implementation. Furthermore this has been achieved with
a great deal of “behind the scenes” activity which is totally transparent to the
user.

The computer press makes great play of cheap high speed local and wide area ATM
(Asynchronous Transfer Mode) networks being the way of the future (3); however
the technology is not yet available and until it is, it is difficult to see how
GIS data can be viably transferred between different systems in anything like real
time.

In the longer term the Council is keen to reduce the cost of providing wide area
connections to more marginal GIS users by using ISDN2 instead of Kilostream. It
is also keen to exploit the potential for transfer of data between different organisations
using ISDN. The cost of operating ISDN2 between locations over 35 miles apart is
the same no matter whether they are 36 or 500 miles apart. Unlike “fixed” Kilostream
links, ISDN connections can be made between any two locations which can dial to
one another.

Persistent Cache has also been seen as a way of relieving congestion on heavily
used local area networks. The Council is currently planning a 6 seat GIS sub network
in its headquarters which will use Persistent Cache to reduce the volume of GIS
data over the building’s main backbone LAN.

Acknowledgements

The authors wish to thank British Telecom, Gandalf Digital Communications Limited,
Grampian Regional Council and Smallworld Systems Limited for their support and
assistance in compiling this paper. Particular thanks go to Alistair Reid, Andrew
Swanson and George Wallace of Grampian Regional Council for their part in installing
wide area connection components and Andrew Reid of Gandalf for his enthusiastic
support, also to the staff of the Department of Water Services for acting as “test
drivers”.

References

1. SOUTHERTON A. Modern UNIX, Chapter 4, Wiley 1992.

2. NEWELL R.G. BATTY P.M. GIS databases are different. Proceedings of the AGI
93 Conference Part 3.

3. UNIX NEWS No 56 October 1993, ATM is the wave of the future p63-65.

Copyright © 1996 Smallworld Systems, Inc. All rights reserved
