Int J Health Geogr. 2004; 3: 1.
Published online Jan 28, 2004. doi:  10.1186/1476-072X-3-1
PMCID: PMC343292

Towards evidence-based, GIS-driven national spatial health information infrastructure and surveillance services in the United Kingdom

Abstract

The term "Geographic Information Systems" (GIS) was added to MeSH in 2003, a step reflecting the importance and growing use of GIS in health and healthcare research and practices. GIS have much more to offer than the obvious digital cartography (map) functions. From a community health perspective, GIS could potentially act as powerful evidence-based practice tools for early problem detection and solving. When properly used, GIS can: inform and educate (professionals and the public); empower decision-making at all levels; help in planning and tweaking clinically and cost-effective actions, in predicting outcomes before making any financial commitments and ascribing priorities in a climate of finite resources; change practices; and continually monitor and analyse changes, as well as sentinel events. Yet despite all these potentials for GIS, they remain under-utilised in the UK National Health Service (NHS). This paper has the following objectives: (1) to illustrate with practical, real-world scenarios and examples from the literature the different GIS methods and uses to improve community health and healthcare practices, e.g., for improving hospital bed availability, in community health and bioterrorism surveillance services, and in the latest SARS outbreak; (2) to discuss challenges and problems currently hindering the wide-scale adoption of GIS across the NHS; and (3) to identify the most important requirements and ingredients for addressing these challenges, and realising GIS potential within the NHS, guided by related initiatives worldwide. The ultimate goal is to illuminate the road towards implementing a comprehensive national, multi-agency spatio-temporal health information infrastructure functioning proactively in real time. The concepts and principles presented in this paper can also be applied in other countries, and on regional (e.g., European Union) and global levels.

Introduction

"A new wave of technological innovation is allowing us to capture, store, process and display an unprecedented amount of information about our planet and a wide variety of environmental and cultural phenomena. Much of this information will be 'geo-referenced' – that is, it will refer to some specific place on the Earth's surface. The hard part of taking advantage of this flood of geospatial information will be making sense of it, turning raw data into understandable information."

– Former American Vice President Al Gore [1]

The need for an evidence-based, spatio-temporal approach to public health

Geography plays a major role in understanding the dynamics of health, and the causes and spread of disease [2]. The classic public health triad composed of man, agent/vehicle and environment emphasises the importance of geographic location (environment or space where we live) in health and disease. Interactions within this triad can also change with time.

Today's health planners aim at developing health policy and services that address geographical and social inequalities in health, and therefore should benefit from evidence-based approaches that can be used to investigate spatial aspects of health policy and practice, and evaluate geographical equity (or inequity) in health service provision [3].

Besides policy development, and provision and management of health services, public health practitioners have other important and related tasks including prioritisation of interventions and programmes, responding to health alerts and concerns, intersectoral engagement, and community development initiatives. In all these tasks, they should strive to incorporate searching and using best evidence in their everyday decision-making processes in order to minimise investment of efforts and funds in areas where there is solid evidence of no effect, or evidence of harm, or of poor cost-effectiveness. Evidence-based approaches can also highlight areas where the evidence may be less than reliable, requiring further assessment before expending large funds and efforts. Ideally, the tools to achieve this goal should be accessible and usable by mainstream practitioners, transparently embedded into routine workflows, and seamlessly incorporated into existing busy work environments [4].

On geo-information and GIS

According to the US Federal Geographic Data Committee (FGDC), geographic location is a key feature of 80–90% of all government data [5]. The same can also be said about government data in other countries, including data generated by the health sector in different countries. This locational or spatial reference is a "main key" in the transformation of data into information, and for linking and integrating different datasets covering the same and contiguous locations [6].

Spatial data are a resource on a par with employees, funds, etc. Use of spatial information opens up the possibility to increase efficiency in the public and private sectors. Unlike other resources, spatial data do not suffer any wear and tear from repeated use. On the contrary, reusing data increases the possibilities for improving the content quality of data collections. The real benefit of investments in spatial data increases dramatically with the multiple use of data [6].

In 2003, the US National Library of Medicine added the term "Geographic Information Systems" to its controlled vocabulary thesaurus known as MeSH (Medical Subject Headings – see http://www.nlm.nih.gov/cgi/mesh/2003/MB_cgi?term=GEOGRAPHIC+INFORMATION+SYSTEMS), a step reflecting the importance and growing use of GIS in health and healthcare research and practices.

The US FGDC defines GIS as "computer systems for the input, storage, maintenance, management, retrieval, analysis, synthesis, and output of geographic or location-based information. In the most restrictive usage, GIS refer only to hardware and software. In common usage (by organisations), they include hardware, software, and data. For some, GIS also imply the people and procedures involved in GIS operation" (cited in [7]).

The inclusion of "people" (properly trained staff with adequate work time to spend on GIS activities) and "procedures" as part of the above definition is essential for GIS applications in a public health context, given the need to link the science and methods of epidemiology to GIS output to avoid producing invalid or misleading results [7].

GIS are potentially powerful resources for community health for many reasons including their ability to integrate data from disparate sources to produce new information, and their inherent visualisation (mapping) functions, which can promote creative problem solving and sound decisions with lasting, positive impacts on people's lives [8,9].

Our experience in applying GIS to health issues has increased considerably over the last decade. However, GIS have usually been applied to time-limited, single, isolated aetiological research or surveillance issues processing mainly retrospective data rather than to ongoing, broad efforts and wide-scale applications processing real-time or near-real-time data for health planning, promotion and protection. This may be due to the problems encountered in identifying, acquiring and integrating a wide range of geo-referenced data relevant to community health in order to support decision-making and problem solving in community health planning, service delivery, and health promotion [8].

On spatial data infrastructures, and spatial information and knowledge management

In the early 1990s much attention was focused on GIS as a basis for spatial information systems. Soon it became obvious that the pure technical approach had to be replaced by a more holistic approach encompassing organisational, political and technical matters at the different local, national, regional, and global levels. The concept of "Spatial Data Infrastructure" became a reality [6].

Spatial information management is a discipline for the individual organisation, administration or enterprise, the micro level, and for society in general, the macro level. On the micro level there will be a technical approach whereas on the macro level political and organisational issues will be highlighted [6]. Both levels are interdependent and complementary.

Spatial information management is based on the idea that data, people, software and hardware interact, and that it is practicable to obtain synergy by coordinating changes and development to help users have a better overview of both simple and complex problems, and give them the possibility to create comprehensible, acceptable solutions and/or compromises. The concept covers various disciplines such as capture, storing, maintenance and upgrading of data and information, information technology, organisational issues and spatial data infrastructure [6].

Spatial information managers and responsible politicians will become the main catalysts in the development, implementation and maintenance of the necessary Spatial Data Infrastructures [6].

How the rest of this paper is organised

To conclude this introduction, we indicate how the rest of this paper is organised. In the next section on "GIS methods and technologies", we cast some light on the richness of GIS toolbox, which goes far beyond the mere production of simple maps (or digital cartography). The section that comes after, titled "GIS applications in health and healthcare", examines with examples the main uses of GIS in the health sector (apart from real-time GIS applications, which are covered in a separate section near the end). We then discuss the current state of GIS affairs in the UK NHS in the section titled "On the under utilisation of geo-information and GIS in the UK NHS: problems and challenges". This is followed by a section on "Geo-information and real-time GIS infrastructure requirements" in which we review the most important technical and organisational elements that are required for a successful implementation of a national geo-information infrastructure that can also support real-time GIS applications in public health. The section that follows, titled "Problematic issues and solutions", is a direct continuation of the one preceding it, and discusses tricky issues like data confidentiality and data/analysis errors, together with solutions that can address them. We next present examples of Spatial Data Infrastructures (SDIs) at different levels of development from around the world in the section on "Existing SDIs and SDI initiatives worldwide". Then in the section on "Proactive, real-time, GIS-enabled health and environmental surveillance services", we describe a wide-scale vision for, and some early real-world applications of, real-time GIS in emergency management, and in health and environmental surveillance. Such applications currently involve limited SDI-like arrangements, and would certainly benefit from the development of mature SDIs in their respective regions. 
The final section titled "Discussion, recommendations and concluding remarks" very briefly reiterates and wraps up the main points made in this paper, and provides some final recommendations and directions for future work.

GIS methods and technologies

This section on GIS methods and the following one discussing GIS applications are complementary to our review of the subject published in 2001 [10]. Freier also provides one of the most thorough, yet concise and easy-to-follow, descriptions of the main GIS methods available today for emergency management. These methods, which also apply to other types of health-related analyses using GIS, include overlay analysis of thematic data and spatial intersection, buffer generation, neighbourhood analysis, vector-based grid generation, network analysis, and (raster) surface modelling. These GIS methods should be coupled with proper spatio-temporal statistical methods to ensure valid analyses and robust conclusions [11,12].

GIS offer powerful features not available to users of either paper-drawn or electronic map images. In GIS, geographic boundaries of study areas can be accessed and modified, data class intervals and symbologies restructured, map layers (variables) vertically overlaid and integrated, new independent map variables added for multivariate spatial statistical analysis, spatial weights computed, spatial autocorrelation on predictor variables assessed, and probability scenarios of mapped variables explored based on modelled changes in regression coefficients over time, with unparalleled computational speed and ease. GIS also enable multi-dimensional surface images to be drawn to scale, a feature important in studies involving elevation or subsurface shape. The mathematical treatment of topographic or surface statistical values can be used as a filter against other variables or other surfaces. A range of statistical techniques have evolved that are well suited to GIS analysis, including kernel density estimation, grid and probability estimation, and kriging (see "Smoothed maps" below) [13].

Rushton suggests that GIS provide the capability to perform two types of spatial analysis that could not be performed without GIS: finding areas of high disease incidence that can be labelled as statistically significant and worthy of further investigation, and examining the spatial relationship between disease incidence and information that is geo-referenced differently from the disease data [14].

Rushton also argues that GIS are useful for exploratory spatial analysis but are less useful for confirmatory analysis [14], although it is clearly possible to integrate confirmatory statistical methods with GIS.

The use of small area measures of socio-economic conditions to examine variations in health status and health services provision

By combining health datasets with other sources, such as census data for small areas, GIS can be used to investigate spatial patterns in health outcomes in relation to socio-economic characteristics of areas, in identifying gaps in healthcare provision, as well as in monitoring the impacts of changes in policy [3].

GIS point-in-polygon analysis, which overlays points on area features, can be used to attach census data relating to small areas such as enumeration districts (in the UK) to individual point level data such as patient postcodes [3].
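The point-in-polygon overlay described above can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses the standard ray-casting test on simple polygons, and the district outlines, deprivation scores and patient coordinates are hypothetical rather than taken from any real register or census product.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Polygon = List[Point]


def point_in_polygon(pt: Point, polygon: Polygon) -> bool:
    """Ray-casting test: count how many polygon edges a ray going right from pt crosses."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through pt?
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def attach_area_attribute(points: Dict[str, Point], areas: List[dict],
                          attribute: str) -> Dict[str, Optional[float]]:
    """For each point, return the attribute of the first area containing it (None if no area does)."""
    result: Dict[str, Optional[float]] = {}
    for pid, pt in points.items():
        result[pid] = None
        for area in areas:
            if point_in_polygon(pt, area["polygon"]):
                result[pid] = area[attribute]
                break
    return result


# Hypothetical data: two square "enumeration districts" with made-up scores.
areas = [
    {"name": "ED_A", "polygon": [(0, 0), (10, 0), (10, 10), (0, 10)], "deprivation": 2.5},
    {"name": "ED_B", "polygon": [(10, 0), (20, 0), (20, 10), (10, 10)], "deprivation": -1.0},
]
patients = {"p1": (3.0, 4.0), "p2": (15.0, 5.0), "p3": (25.0, 5.0)}
scores = attach_area_attribute(patients, areas, "deprivation")
```

Here patient p3 falls outside both districts and receives no score, which mirrors what happens in practice when a postcode cannot be matched to any census area.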

Higgs and Richards used GIS to examine the socio-demographic profiles of patients on a dental practice register in Wales. By working at the individual patient (point) level, they have demonstrated the potential for GIS to work with spatially disaggregate data to address key concerns of policy makers towards, for example, equity of healthcare provision. Their study also highlighted the importance of maintaining high quality (i.e., up-to-date, complete, accurate, fully postcoded, and one could also add clinically-coded) health registers and records [3].

Higgs and Richards explored the use of two different deprivation indices in their study, namely the Index of Multiple Deprivation (IMD) and the Townsend index. Deprivation indices are frequently used in relation to health needs assessment and in resource allocation. Different deprivation indices have different points of strength and weakness, and can yield different results in some studies [3,15].

Choropleth maps

Most traditional analyses of disease patterns examine disease rates at a given level of spatial resolution defined by spatial entities developed for administrative and other purposes. Choropleth maps are commonly used to depict the patterns of disease rates. Disease incidence and other spatio-temporal epidemiological events are portrayed on these maps as shaded polygons (each representing an administrative area). Each of these polygons contains a numerical value of the mapped disease incidence represented as a shaded value within the covered national framework. Visual communication of disease risk is oversimplified since all values appear evenly distributed within a polygon. Moreover, values among contiguous areas (polygons) in a choropleth map can differ abruptly at adjoining borders, while in reality disease incidence and most other spatio-temporal events and phenomena such as deprivation levels are continuous variables distributed continuously across space and do not change abruptly at arbitrarily defined administrative, census and political boundaries (Figure 1). Other limitations of the choropleth design include the visual dominance of larger areas over smaller ones [14,16].

Figure 1
A simple choropleth (graduated colour) map of Townsend Deprivation Score distribution in Bath City Electoral Wards (UK). Abbey and Twerton are the most deprived ...

Yet, despite all these limitations, the choropleth design remains in many cases the method of choice to communicate estimated spatial density of reported disease incidence, being quite easy and straightforward to construct compared to the use of geostatistics like kriging (see "Smoothed maps" below), which requires more complex computational choices [16].
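To illustrate how little computation a choropleth requires, the sketch below assigns areas to graduated colour classes using quantile class intervals. Quantile classification is only one of several possible schemes and is an assumption made here for illustration; the ward names and scores are hypothetical.

```python
def quantile_classes(values, n_classes):
    """Assign each area a class 0..n_classes-1 so that classes hold roughly equal numbers of areas.

    `values` maps area name -> mapped value (e.g. a deprivation score).
    Class 0 holds the lowest values. Tied values may fall into different classes.
    """
    ranked = sorted(values, key=values.get)  # area names, lowest value first
    classes = {}
    for rank, area in enumerate(ranked):
        classes[area] = rank * n_classes // len(ranked)
    return classes


# Hypothetical ward-level scores (higher = more deprived).
ward_scores = {
    "Ward A": 5.1, "Ward B": 4.8, "Ward C": 1.2,
    "Ward D": -0.5, "Ward E": -2.0, "Ward F": -3.3,
}
ward_classes = quantile_classes(ward_scores, 3)
```

Each class index would then be mapped to one shade of the graduated colour ramp. The abrupt jumps between neighbouring polygons discussed above arise from this step, since every location inside a ward inherits the same class.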

The choropleth map could be considered a filtered map using a non-overlapping, variable-size, spatial filter with filter shapes selected from available political or administrative regions (hence its limitations – see "Smoothed maps" below). Rushton mentions three factors to explain why data is commonly made available for such odd-shaped and different sized regions: (1) data for such areas can be easily encoded from the information provided; (2) information is often requested for such areas as people are familiar with them and use them to convey the spatial limits of their interest, and also to enable comparisons between different administrative regions, e.g., regarding success in implementing a particular directive, health promotion programme or other intervention; and (3) aggregating health data to areas is one easy method to reduce the risk of disclosure and protect privacy of individuals [14].

Smoothed maps

To meet the purpose of exploratory spatial analysis, health data are better examined by methods that assume that disease rates are spatially continuous [14]. One can display data collected at smaller geographic areas (with fewer individuals) and still maintain the stability of the estimated rates by constructing a smoothed map. One way to do this is to use Bayesian or empirical Bayes methods, calculating the estimated rates for each smaller area by incorporating information about the observed data from neighbouring areas together with priors concerning the spatial variation of the rates [17].
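As a rough illustration of empirical Bayes smoothing, the sketch below shrinks each area's raw rate towards the overall rate, with less shrinkage for areas with larger populations. It assumes the simple global method-of-moments estimator and therefore does not use information from neighbouring areas or spatially structured priors, which the approaches cited above do. The counts and populations are hypothetical.

```python
def empirical_bayes_rates(cases, populations):
    """Global empirical Bayes (method-of-moments) smoothing of area rates.

    Each raw rate is pulled towards the overall rate m; the weight given to
    the raw rate grows with the area's population.
    """
    total_pop = sum(populations)
    m = sum(cases) / total_pop                      # overall rate (prior mean)
    raw = [c / p for c, p in zip(cases, populations)]
    # Population-weighted variance of the raw rates around m
    s2 = sum(p * (r - m) ** 2 for r, p in zip(raw, populations)) / total_pop
    mean_pop = total_pop / len(populations)
    a = max(s2 - m / mean_pop, 0.0)                 # estimated prior variance, floored at 0
    smoothed = []
    for r, p in zip(raw, populations):
        w = a / (a + m / p) if a > 0 else 0.0       # shrinkage weight in [0, 1)
        smoothed.append(m + w * (r - m))
    return smoothed


# Hypothetical small, medium and large areas.
cases = [2, 30, 100]
populations = [100, 1000, 10000]
overall = sum(cases) / sum(populations)
raw_rates = [c / p for c, p in zip(cases, populations)]
eb_rates = empirical_bayes_rates(cases, populations)
```

In this example the smallest area, whose raw rate rests on only two cases, is moved most of the way towards the overall rate, while the largest area keeps a rate close to its raw value.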

Another approach is to use a spatial filter or ratio smoother [17]. The principal reason to filter disease data spatially is to examine the spatial pattern of disease at different levels of spatial resolution and to compute disease rates that are not dependent on the specific boundaries of the areas used in spatially aggregated data [14]. A spatial filter can be applied to individual point data, as well as to data aggregated into small census areas. In its simplest form, the estimated rate at a particular location, or grid point, is defined as the observed rate within a fixed distance from the grid point. The circles of neighbouring grid points are set to overlap to allow neighbouring grid points to share observations. After assigning estimated rates to each grid point, contouring software is used to create isarithmic maps in which regions with a constant range of values can be recognised. This enables the creation of a continuous smoothed map of the data [17].
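The simplest form of the spatial filter described above can be sketched as follows. This is a minimal illustration, assuming small-area data represented by centroids; the coordinates, counts, grid spacing and filter radius are all hypothetical, and the contouring step is left to mapping software.

```python
import math


def filtered_rate(grid_pt, areas, radius):
    """Observed rate among all small areas whose centroid lies within `radius` of grid_pt.

    `areas` is a list of (x, y, cases, population) tuples.
    Returns None when no population falls inside the circle.
    """
    cases = 0
    pop = 0
    for x, y, c, p in areas:
        if math.hypot(x - grid_pt[0], y - grid_pt[1]) <= radius:
            cases += c
            pop += p
    return cases / pop if pop else None


# Hypothetical small areas: (x, y, cases, population).
areas = [
    (0, 0, 5, 1000), (1, 0, 8, 1200), (0, 1, 4, 900),
    (10, 10, 1, 100), (14, 10, 0, 150), (10, 15, 2, 200), (16, 16, 1, 300),
]

# Grid spacing (1.0) smaller than the radius (1.5) makes neighbouring circles overlap,
# so adjacent grid points share observations.
grid = [(gx, gy) for gx in range(0, 3) for gy in range(0, 3)]
surface = {pt: filtered_rate(pt, areas, 1.5) for pt in grid}
```

The resulting `surface` dictionary holds one estimated rate per grid point and is the input a contouring routine would need to draw an isarithmic map.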

Talbot et al propose a modified spatial filter for creating smoothed disease maps, where the spatial filter is defined in terms of constant or near constant population size rather than constant geographic size. This means that the circles will usually be larger in the rural areas (lower population density) compared to urban areas (higher population density) [17].
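The idea of a filter defined by population size rather than distance can be sketched as below. This is an assumption-laden illustration, not the published algorithm of Talbot et al: it simply adds the nearest small areas one at a time until a chosen minimum population is reached, and the data and threshold are hypothetical.

```python
import math


def population_filtered_rate(grid_pt, areas, min_population):
    """Grow the filter outwards from grid_pt until it covers at least `min_population` people.

    `areas` is a list of (x, y, cases, population) tuples.
    Returns (rate, radius), where radius is the distance to the furthest area included.
    If the threshold is never reached, all areas are used.
    """
    def distance(area):
        return math.hypot(area[0] - grid_pt[0], area[1] - grid_pt[1])

    cases = 0
    pop = 0
    radius = 0.0
    for area in sorted(areas, key=distance):
        cases += area[2]
        pop += area[3]
        radius = distance(area)
        if pop >= min_population:
            break
    return (cases / pop if pop else None), radius


# Hypothetical small areas: a dense urban cluster and sparse rural areas.
areas = [
    (0, 0, 5, 1000), (1, 0, 8, 1200), (0, 1, 4, 900),
    (10, 10, 1, 100), (14, 10, 0, 150), (10, 15, 2, 200), (16, 16, 1, 300),
]
urban_rate, urban_radius = population_filtered_rate((0, 0), areas, 2000)
rural_rate, rural_radius = population_filtered_rate((10, 10), areas, 2000)
```

With these made-up data the urban grid point reaches the population threshold within a short distance, whereas the rural grid point needs a much larger circle, which is the behaviour described above.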

Kriging can also be used to produce continuous map surfaces from sample points. Croner and Cola provide some good examples from the literature of using the geostatistical procedure of kriging in disease epidemiology and public health. They also describe their own experience in using the procedure to model and forecast the underlying spatial structure of reported Lyme disease incidence in the US. Kriged smoothed maps may strengthen our ability to visually communicate event patterns, especially over time (also possibly through the combined use of kriging and animation). As a geostatistical modelling technique, kriging takes into account the existing underlying spatial structure of georeferenced information (distances among samples or observations). Statistically optimal estimates and their standard errors for locations with missing data (unsampled locations) may be derived, and the actual and estimated data represented together as a smoothed surface or raster data structure. Kriging can also take into consideration associative covariates when producing the final smoothed surface. However, the accuracy of kriging results depends on the aggregation level of the data used (e.g., state-level vs. finer county-level data in the US) [16].
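How a kriging estimate and its error variance are obtained at an unsampled location can be sketched as follows. This is a minimal illustration of ordinary kriging, not the procedure used by Croner and Cola: the exponential semivariogram and its sill and range values are assumptions chosen for illustration (in practice they would be fitted to the data), no covariates are included, and the sample coordinates and values are hypothetical.

```python
import math


def variogram(h, sill=1.0, range_=10.0):
    """Exponential semivariogram without nugget (assumed here, not fitted to data)."""
    return sill * (1.0 - math.exp(-h / range_))


def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ordinary_kriging(samples, target):
    """Return (estimate, kriging variance, weights) at `target`.

    `samples` is a list of (x, y, value) tuples at distinct locations.
    """
    n = len(samples)

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # Kriging system: semivariances between samples, bordered by the
    # unbiasedness constraint (weights sum to 1) with a Lagrange multiplier.
    a = [[variogram(dist(samples[i], samples[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(dist(s, target)) for s in samples] + [1.0]
    solution = solve(a, b)
    weights, mu = solution[:n], solution[n]
    estimate = sum(w * s[2] for w, s in zip(weights, samples))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return estimate, variance, weights


# Hypothetical incidence values at the four corners of a square.
samples = [(0, 0, 10.0), (10, 0, 20.0), (0, 10, 30.0), (10, 10, 40.0)]
centre_est, centre_var, centre_w = ordinary_kriging(samples, (5, 5))
corner_est, corner_var, _ = ordinary_kriging(samples, (0, 0))
```

Evaluating the function over a regular grid of target locations would give the smoothed surface, and the accompanying variances give the standard errors mentioned above.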

Trend surface analysis is another technique for producing smoothed maps. Trend surface maps are commonly used to report the spatial diffusion process of disease epidemics (the movement of epidemics across geographical space). In their GIS-driven Drug Incidence and Prevalence Estimation Program (DIPEP), Field et al used trend surface maps to overcome the drawbacks of administrative boundary choropleth maps (e.g., ward-based maps in the UK). They also used animated sequences of trend surface maps to study the waves of diffusion of problematic drug misuse across time. Animated trend surface maps could be considered as illustrating a more accurate picture of the spatio-temporal characteristics of mapped events and phenomena, when compared to administrative boundary maps, since populations are distributed continuously across space [18].
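A trend surface is, in essence, a polynomial in the map coordinates fitted to the observed values by least squares. The sketch below fits only the simplest case, a first-order (planar) surface z = a + b*x + c*y. It is an illustration under that assumption and is not the DIPEP implementation; the sample values are hypothetical.

```python
def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_trend_surface(samples):
    """Least-squares fit of z = a + b*x + c*y to (x, y, z) samples; returns [a, b, c]."""
    rows = [(1.0, x, y) for x, y, _ in samples]
    zs = [z for _, _, z in samples]
    # Normal equations: (A^T A) coef = A^T z
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atz = [sum(r[i] * z for r, z in zip(rows, zs)) for i in range(3)]
    return solve(ata, atz)


def predict(coef, x, y):
    """Evaluate the fitted surface at location (x, y)."""
    return coef[0] + coef[1] * x + coef[2] * y


# Hypothetical observations lying exactly on the plane z = 1 + 2x + 3y.
samples = [(0, 0, 1.0), (1, 0, 3.0), (0, 1, 4.0), (1, 1, 6.0), (2, 1, 8.0)]
coef = fit_trend_surface(samples)
```

Fitting one such surface per time period and displaying the results in sequence would give the kind of animated view of diffusion described above; higher-order surfaces add squared and cross-product terms to the same least-squares fit.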

It is noteworthy that the interpolation tools in ESRI ArcGIS 3D Analyst, Spatial Analyst, and Geostatistical Analyst extensions support kriging among other methods for the production of continuous surfaces from sampled points, while ESRI ArcGIS Tracking Analyst extension enables the visualisation and analysis of temporal data (including real-time data feeds) by defining events including time, location, and attribute information. (ArcGIS 3D Analyst also supports Triangulated Irregular Networks (TINs) and three-dimensional (3D) data visualisation giving users completely new perspectives about their data. For example, adding 3D to attribute data such as population growth allows better viewing of trends and changes.) For detailed information about the complete range of ArcGIS Extensions, see http://www.esri.com/software/arcgis/arcgisxtensions/index.html

Testing for spatio-temporal disease clustering

Many different test statistics are also available to test for spatial disease clustering, with different powers for detecting different kinds of clustering. These tests include Besag-Newell's R, Cuzick and Edwards' k-Nearest Neighbours (k-NN), Moran's I, the spatial scan statistic (SaTScan), Tango's Maximised Excess Events Test (MEET), Bonetti and Pagano's nonparametric M statistic, Swartz' entropy test, and Whittemore's test [19].
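Of the statistics listed above, Moran's I is among the simplest to compute. The sketch below evaluates the statistic itself, I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where S0 is the sum of all weights. The binary adjacency weights and the rates for four areas arranged in a row are hypothetical, and the significance test that would normally accompany the statistic is omitted.

```python
def morans_i(values, weights):
    """Moran's I for `values` given a square spatial weights matrix `weights`."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)           # sum of all weights
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))  # spatially weighted cross-products
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


# Four areas in a row; each area is a neighbour of the one(s) next to it.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
i_gradient = morans_i([1, 2, 3, 4], adjacency)      # similar values side by side
i_alternating = morans_i([1, 4, 1, 4], adjacency)   # dissimilar values side by side
```

A smooth gradient gives a positive value (similar rates cluster together), while the alternating pattern gives a negative value.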

Rogerson's spatial pattern surveillance technique is a surveillance method for detecting changes in spatial pattern in cases over time relative to the population-at-risk. The location of new cases is monitored as they occur with the objective of detecting emerging clusters shortly after they occur. The method represents a cumulative sum statistic and procedure for the monitoring of changes in spatial pattern for observations processed sequentially [20].
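The cumulative sum idea underlying such surveillance can be sketched with a generic one-sided CUSUM chart. The example below is not Rogerson's spatial statistic: it monitors a single series of counts against expected counts, and the standardisation, the reference value k, the threshold h and the weekly counts are all assumptions made for illustration.

```python
import math


def cusum_alarms(observed, expected, k=0.5, h=4.0):
    """One-sided CUSUM on standardised counts; returns the time indices at which an alarm is raised.

    Each count is standardised as (observed - expected) / sqrt(expected),
    i.e. assuming Poisson-like variation. The sum is reset after each alarm.
    """
    s = 0.0
    alarms = []
    for t, (o, e) in enumerate(zip(observed, expected)):
        z = (o - e) / math.sqrt(e)
        s = max(0.0, s + z - k)   # accumulate only excesses beyond k
        if s > h:
            alarms.append(t)
            s = 0.0
    return alarms


# Hypothetical weekly case counts with a run of excess cases from week 4 onwards.
expected_counts = [4] * 8
observed_counts = [4, 3, 5, 4, 9, 10, 11, 4]
alarm_weeks = cusum_alarms(observed_counts, expected_counts)
```

Because evidence is accumulated across successive observations, a sustained moderate excess can trigger an alarm even when no single week would look exceptional on its own.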

For a comprehensive discussion of prospective statistical public health surveillance methods, the reader is urged to consult the recently published excellent review of the topic by Sonesson and Bock [21].

Spatial data mining

The Amsterdam Police department uses spatial data mining technology from Sentient http://www.sentient.nl/ and MapInfo in a cutting edge crime analysis and prediction system, able to detect patterns in a wide range of data, including criminal records, weather measurements, and socio-demographic information. This leads to better strategic insight, input for state and government policy and programmes, information for more effectively assigning finite resources and last but not least: more crimes being solved [22].

Related technologies: remote sensing and global positioning systems

The growing uses of remotely sensed imagery and satellite facilitated global positioning systems (GPS) are contributing to unprecedented surveillance of the environment. High-resolution satellite imagery provides timely and detailed digital representations of existing landscapes and land covers, which can be spectrally classified and statistically correlated with disease host and vector habitats. Remotely sensed data are being used both in historical and real-time modes to assess and model catastrophic health events [13].

Automated change detection applied to a sequence of digital imagery from satellites or aerial photos for a small area of interest can be used to observe changes over time, such as the addition of housing developments, roads, and landfills and other changes in land use and land cover. All these changes have implications in public health and are necessary to properly establish and revise community health priorities and plans [7].

In the US, NASA's CHAART (Centre for Health Applications of Aerospace Related Technologies) facilitates the use of remote sensing technology in public health research. Examples of projects carried out at CHAART, as well as examples of GPS applications in health, are described in [10] and [13].

Telegeoprocessing and mobile GIS

Xue et al define telegeoprocessing as a new discipline revolving around real-time spatial databases that are updated regularly by means of telecommunications systems in order to support problem solving and decision-making at any time and any place. It involves the integration of remote sensing, GIS, GPS and telecommunications [23].

Mobile phones and other digital devices are rapidly gaining location awareness and Web connectivity, promising new spatial technology applications that will yield vast amounts of spatial information [24]. Examples of such applications include in-the-field data entry and access, and many useful location-based services [25]. However, according to RSA Security Inc. http://www.rsasecurity.com/, wireless and mobile telecommunications also pose the following security challenges: more connectivity resulting in more points of vulnerability; information is more easily intercepted; and devices, being more portable, are more easily lost or stolen.

GIS applications in health and healthcare

Through multivariate spatial statistical modelling of disease processes, GIS enable the evaluation of potentially true disease outbreaks and a more effective allocation of sparse remedial resources towards their containment and prevention. GIS also assist users in better understanding the potential harmful effects of environmental pollutants, e.g., toxic waste sites, and even in understanding the occurrence of pedestrian and other injuries, and crimes. Today, environmental monitors measure air and water quality, solar irradiation, radon gas levels, and other exposures potentially deleterious to human health. These measurements can be brought into GIS, spatially referenced and integrated analytically with other health predictor variables and outcome data. In fact, any adverse (or positive) health-related phenomenon that can be defined spatially (atmospheric, aquatic or terrestrial) can lead to GIS analysis [13].

The determination of effective response time zones for the provision of emergency care services is another application already benefiting from the unique capabilities of GIS in calculating travel time isochrones [13,26].
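A travel time isochrone can be derived from a road network with a shortest-path search. The sketch below is a minimal illustration using Dijkstra's algorithm on a small graph whose nodes, links and travel times in minutes are hypothetical; a production GIS would work on a full road dataset with turn restrictions and time-dependent speeds.

```python
import heapq


def travel_times(graph, source):
    """Shortest travel time from `source` to every reachable node (Dijkstra's algorithm).

    `graph` maps node -> list of (neighbour, minutes) links.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        t, node = heapq.heappop(heap)
        if t > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, minutes in graph.get(node, []):
            new_t = t + minutes
            if new_t < best.get(neighbour, float("inf")):
                best[neighbour] = new_t
                heapq.heappush(heap, (new_t, neighbour))
    return best


def isochrone(graph, source, max_minutes):
    """Set of nodes reachable from `source` within `max_minutes`."""
    return {n for n, t in travel_times(graph, source).items() if t <= max_minutes}


# Hypothetical road network around an ambulance station (times in minutes).
roads = {
    "station": [("A", 4), ("B", 7)],
    "A": [("station", 4), ("B", 2), ("C", 5)],
    "B": [("station", 7), ("A", 2), ("D", 3)],
    "C": [("A", 5)],
    "D": [("B", 3)],
}
times = travel_times(roads, "station")
within_8_min = isochrone(roads, "station", 8)
```

Drawing a boundary around the nodes returned for a given threshold yields the response time zone; repeating the calculation for several thresholds gives nested isochrones.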

GIS can also help promote healthy behaviours by documenting where the populations are located that have the greatest need of improved information, then using GIS-enabled Internet sites as an outreach vehicle for community health education [27]. For this reason, the public should always be considered one of the main beneficiaries of any national spatial health information infrastructure (see later), and should be offered full access to data and information (subject to appropriate confidentiality and national security safeguards). The Bradford Community Statistics Project http://www.communitystats.org.uk/ provides a good example of a public participation GIS project, and aims at empowering residents to develop their own policy initiatives and funding proposals.

Richards et al describe the advantages of GIS technology using some excellent public health example scenarios: a childhood lead poisoning prevention programme; mapping of motor vehicle injuries and fatalities in a community; and using data collected by marketing firms about consumer spending patterns and lifestyle segmentation profiles to identify the best target populations for prevention interventions, e.g., anti-smoking programmes, and to select the best media channel(s) and times of the day to communicate a particular message to a given population [7].

Richards et al also describe a feasible scenario for geographically enabled electronic medical records wherein all electronic inpatient and outpatient medical records in a given community are regularly scanned to map asthma cases (in the example given) and compare current week maps with those for prior time periods. In this way, any unusual case clusters or patterns in the community can be easily identified, e.g., an increase in asthma hospitalisations. Such patterns can be further and more closely investigated and appropriate actions taken. In the same asthma scenario described by Richards et al, most affected individuals in the hospital with the highest rate happened to work at the same factory. Using GIS technology linked to a database about workplace chemical exposures, the potential exposures at the factory in question were reviewed and the agents associated with asthma-related hospital admissions identified. An appropriate action was then initiated in the form of a request that an industrial hygienist visit the plant in question the same day [7].

Gavin and her colleagues provide examples of how developing African countries are currently using geo-information to produce enhanced capacity for emergency response, more effective and efficient government operations, increased transparency of public decision-making and better addressing of social inequalities. They mention a famine early warning system in Burkina Faso that uses climate, agricultural, and population data to provide timely, accurate projections of crop shortfalls, enabling the government to take corrective action. They also describe how geo-information used in a poverty mapping initiative in South Africa was combined with information on sanitation and safe water supplies to create a strategy for containing a cholera outbreak in KwaZulu Natal province. Data on illiteracy rates, dwelling types, and lack of basic services formed the basis for an effective, targeted health education campaign. The resulting fatality rate for this outbreak, 0.22%, was among the world's lowest ever recorded [9].

The World Health Organisation's HealthMapper application (http://www.who.int/csr/mapping/tools/healthmapper/healthmapper/en/) and the Pan American Health Organisation's Sig-Epi (GIS in Epidemiology and Public Health – http://ais.paho.org/sigepi/) have already been described in our previous review [10].

Real-time GIS applications in health and environmental surveillance, and in emergency and epidemics management are presented later in this paper (see section titled "Proactive, real-time, GIS-enabled health and environmental surveillance services").

Traditionally, two broad types of GIS applications can be distinguished which also reflect the two traditions in health geography (geography of disease and geography of healthcare systems), namely health outcomes and epidemiology applications and healthcare delivery applications. There are also studies at the interface (overlap) between epidemiological and healthcare delivery applications, for example in relation to healthcare commissioning and needs assessment [10,28].

Health outcomes and epidemiology applications

A number of studies have used GIS to study disease patterns (e.g., to identify leukaemia clusters) and spatio-temporal variations in health outcomes, and to identify possible causes of mapped patterns (e.g., the relationship between cancer incidence and various environmental factors). These generally involve the linkage of health information with environmental and socio-economic data. GIS can also be used to target resources for disease prevention by highlighting areas with significantly high rates, and to predict which areas might be at future risk and which may benefit most from future local population screening [28].

Examples of health outcomes and epidemiology applications using GIS include research carried out in the UK at the West Midlands Cancer Intelligence Unit and the Small Area Health Statistics Unit (SAHSU) [10], and also the work published by Dunn et al, in which they examined the association between asthma incidence and proximity to industrial sites in North East England and suggested relationships with prevailing wind patterns [29].

Field et al describe an interesting application of GIS in modelling drug misuse. Current methods for estimating the incidence, prevalence, and spread of drug misuse tend to be retrospective (delivering information about past events) and are not capable of forecasting spatio-temporal trends. Field et al developed a GIS drug misuse system to create a dynamic model for forecasting and displaying spatio-temporal trends and linking environment with behaviour. It includes a range of parameters to model drug misuse and its geographic spread across a population using UK data as a basis for developing a European-wide forecasting system. Their approach provides the basis for examining more complex geographic diffusion scenarios such as the introduction of new practices by new users, the development of education and remedial initiatives, impacts of tourism and migration, cross-border contact, drug transportation, and increasing opportunities for economic and international contact [18].

The World Health Organisation (WHO) Regional Office for Europe has also produced an Atlas of Health in Europe, a statistical atlas that presents key health figures for the WHO European Region. It covers basic demographic data, mortality and morbidity, lifestyles and environmental indicators such as alcohol consumption and road traffic accidents, and types and levels of healthcare. Most indicators are presented as a map to show overall regional variations, a bar chart to indicate country rankings, and a time chart to show trends over time in three main country groupings [30].

The WHO's Atlas of Health in Europe offers static information about retrospective events and data. Even if the WHO keeps publishing updated versions of this atlas, it will always lack (in its current form) the interactivity, real-time or near-real-time processing of current data, and the proactive features desirable in a true regional/community public health surveillance and spatial decision support system.

In Sweden, the development of spatial analysis has started with a focus on health determinants at community and regional levels. GIS have been introduced for both presentation and analysis. In March 2003, an atlas presenting health status and health determinants for all municipalities in Sweden was published on the NIPH (National Institute of Public Health) Web site http://www.fhi.se/nyheter/data.asp?id=984. It is planned for this atlas to be expanded and updated regularly with an increasing number of determinants, and to cover a larger time frame [31]. Like the WHO's Atlas of Health in Europe, this Swedish atlas remains a collection of pre-drawn, static maps (still very valuable, but limited in many aspects).

A network of researchers and practitioners from various institutions in Sweden is preparing a training course on spatial analysis research, to be conducted in 2004. A textbook on spatial analysis in Sweden is also planned. Cooperation on comparative analyses has been initiated with the University of Massachusetts (US) [31].

Healthcare delivery applications

These applications involve using GIS to plan healthcare delivery, study service need, accessibility and utilisation, and aid resource allocation. For such applications to be truly integrated into the strategic decision-making process, they should incorporate task-appropriate statistical and modelling techniques, e.g., spatial interaction models (allowing proposed health services/centres to be added interactively to assess their impact), and location-allocation models in order to forecast and evaluate the implication of modifying the configuration of existing services (i.e., "what if" scenarios), and thus play a proactive role in the healthcare planning process. GIS have been used in a number of studies to estimate the best/optimal location for a new clinic, hospital or GP surgery to minimise distances potential patients need to travel taking into account existing facilities, transport provision, hourly variations in traffic volumes, and population density. A number of these studies using the networking capabilities of GIS have been concerned with the concept of potential accessibility. Others, who have had access to spatially disaggregated data, have been concerned with (actual) revealed accessibility patterns of service utilisation [28].
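A "what if" location-allocation question of the kind described above can be illustrated with a very small sketch. The code below is an assumption-laden toy, not a method taken from the cited studies: it uses straight-line distance instead of network travel time, hypothetical coordinates and populations, and an exhaustive search that is only practical for a handful of candidate sites.

```python
from itertools import combinations
from math import hypot


def weighted_distance(demand_points, facilities):
    """Total population-weighted distance to each point's nearest facility."""
    total = 0.0
    for x, y, population in demand_points:
        nearest = min(hypot(x - fx, y - fy) for fx, fy in facilities)
        total += population * nearest
    return total


def best_new_sites(demand_points, existing, candidates, n_new=1):
    """Try every combination of n_new candidate sites alongside the existing
    facilities and keep the one with the lowest weighted distance."""
    best_sites, best_cost = None, float("inf")
    for chosen in combinations(candidates, n_new):
        cost = weighted_distance(demand_points, list(existing) + list(chosen))
        if cost < best_cost:
            best_sites, best_cost = chosen, cost
    return best_sites, best_cost


# Hypothetical data: (x, y, population) for demand areas, (x, y) for sites.
demand = [(0, 0, 1000), (10, 0, 3000), (10, 10, 500)]
existing_clinics = [(0, 0)]
candidate_sites = [(5, 5), (10, 0), (10, 10)]

sites, cost = best_new_sites(demand, existing_clinics, candidate_sites)
print(sites, cost)
```

In a real GIS the distance function would be replaced by travel time over the road network, and factors such as transport provision and hourly traffic variation would enter through that function rather than through the search itself.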

Another remarkable application involves the use of GIS to improve hospital bed availability and avoid access block. Access block (hospital bed shortage) occurs when a patient in the emergency department (ED) requiring inpatient care is unable to gain access to an appropriate bed within a reasonable time. It is measured by the percentage of all patients admitted, transferred to another hospital for admission, or dying in the ED whose total ED time exceeds eight hours. Access block is the prime symptom of supply/demand mismatch in hospital bed stock in Australia. This is also true in many other parts of the world today, including the UK. Access block may result in ambulance bypass, increased ED waiting time and casualty queues, increased frequency of adverse events, increased patient complaints, and adverse media attention. Ashby describes a number of ameliorating strategies to avoid access block that have been implemented by the Royal Brisbane Hospital in Queensland, Australia where he works as Executive Director of Medical Services. Among these strategies is a method to improve hospital bed availability through precision bed management using integrated demand, utilisation and PAS (Patient Administration System) data, flow models and advanced GIS to map the geography of the hospital against variables such as patient numbers, staff numbers, and nurse dependency. GIS are used in patient flow modelling to look firstly at opportunities for flow reversal of ED, outpatients and secondary level inpatients, and secondly to improve efficiency through optimum distribution of patients according to a number of variables including nurse dependency, projected length of stay, projected time of discharge and infection status. GIS are proving superior to conventional patient activity systems in informing the organisation about bed management opportunities on an hour-by-hour basis [32,33].
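The access block measure defined above is straightforward to compute from emergency department episode records. The following sketch assumes a hypothetical record layout (the field names `arrival`, `departure` and `outcome`, and the outcome labels, are invented for illustration and do not come from any particular Patient Administration System).

```python
from datetime import datetime, timedelta

ACCESS_BLOCK_THRESHOLD = timedelta(hours=8)
# Outcomes counted in the denominator, following the definition in the text.
COUNTED_OUTCOMES = {"admitted", "transferred", "died_in_ed"}


def access_block_rate(episodes):
    """Percentage of counted ED episodes whose total ED time exceeds 8 hours."""
    eligible = [e for e in episodes if e["outcome"] in COUNTED_OUTCOMES]
    if not eligible:
        return 0.0
    blocked = sum(
        1 for e in eligible
        if e["departure"] - e["arrival"] > ACCESS_BLOCK_THRESHOLD
    )
    return 100.0 * blocked / len(eligible)


def episode(hours_in_ed, outcome):
    """Build a hypothetical episode starting at a fixed arrival time."""
    arrival = datetime(2003, 6, 1, 8, 0)
    return {
        "arrival": arrival,
        "departure": arrival + timedelta(hours=hours_in_ed),
        "outcome": outcome,
    }


episodes = [
    episode(9, "admitted"),      # exceeds 8 hours: counts as access block
    episode(3, "admitted"),
    episode(5, "transferred"),
    episode(2, "died_in_ed"),
    episode(12, "discharged"),   # not admitted, so excluded from the measure
]
rate = access_block_rate(episodes)
print(rate)
```

Computing the same figure per ward or per hour of the day is what would allow it to be mapped against the geography of the hospital, as in the bed management approach described by Ashby.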

It is noteworthy that Downey Regional Medical Centre (DRMC) in California, US, is currently using a large, multi-layered, GIS-enabled patient care and room management system that leverages digital floor plans, workflow analysis, and data visualisation for a better solution to how DRMC assigns patients to rooms, monitors the discharge process, and prepares rooms for new patients. The system captures, logs, manages, and analyses a vast array of information about patients checking in, switching rooms, checking out, and moving from in-hospital to outpatient care [34].

On the under-utilisation of geo-information and GIS in the UK NHS: problems and challenges

In a recent review paper, Higgs and Gould highlighted the gap between academic health-related applications of GIS and their everyday use within the UK National Health Service (NHS). They argue for closer collaboration between GIS academics and NHS professionals to advance health-based GIS work [28]. GIS have been used in the UK health sector for over a decade, but their greatest contribution so far has been in low-level operational tasks (see "surveys of levels of GIS use in the NHS" below). There is little evidence that GIS are being formally considered or regularly used in strategic decision-making, e.g., major healthcare planning within the NHS [28,35].

Spatial data and GIS are not mentioned in UK health information strategy documents

The NHS Executive landmark report "Information for Health" published in 1998 sets out a strategy for improving the availability, reliability, management, analysis, and dissemination of digital data and information in the NHS over the coming years [36]. As outlined in this key strategy document, data-rich records kept by GPs remain a relatively untapped resource particularly in the areas of local health surveillance, service audit and resource targeting. Information for Health also calls for data sharing between NHS and non-NHS organisations in order to tackle health issues. Linkages between, for example, poor health and unemployment, housing, crime, and education are major drivers for partnership approaches between such organisations. The Acheson Report published in 1998 also recognised the need to adopt cross-governmental approaches to address health concerns [37]. Considering all of this, and given the recent media attention to geographical variations in healthcare service provision, which often revolve around the so-called "postcode lottery" in treatment levels, the fact that a considerable majority of the datasets used in UK primary and secondary care are geo-referenced, and the recent increase in the number of articles (e.g., [10]), books (e.g., [38-41]), and conferences (e.g., [42]) about the potentials and use of GIS in health applications, it is surprising there has been no mention in Information for Health or other more recent follow-up documents (e.g., "Building the Information Core: Implementing the NHS Plan" published in 2001 – [43]) of the role that spatial data and GIS could play in the new NHS [28,35].

The role of spatial information in the health sector in relation to, for example, local health improvement programmes or performance management is not identified in any of the core UK national strategy and policy documents, although the potential for using information from primary care systems to support needs assessment and resource targeting is one of the principal action points. There is also no mention of the potential for GIS to support partnership approaches for providing and exchanging information on such issues at either national or local scales [35].

The NHS Information Authority (NHSIA), established as a special Health Authority in 1999, states as one of its strategic objectives the need "to contribute to the implementation of Information for Health by establishing, maintaining, developing and supporting a national information infrastructure, national products, national standards, national services and working with the NHS and others to make effective use of these products and services" [44]. Again it is astonishing that there is no explicit mention of the potential for geo-information and GIS in addressing these aims. Neither are GIS included in any of the policy documents produced by the Information Policy Unit (IPU – http://www.doh.gov.uk/ipu/), which has overall responsibility for delivering the Information for Health strategy [28].

However, this author was able to find several local implementation documents on the Web that mention the use of GIS. One of these documents, published on the NHSIA Electronic Records Development and Implementation Programme (ERDIP) Web site, mentions the use of GIS techniques, mapping to deprivation indices, and linkage to non-medical data in the context of electronic patient records. It also refers to using GIS in matching demand to the location of surgery provision [45]. Two other documents published on the IPU Web site also cite GIS [46,47].

It is also noteworthy that a GIS special interest group was set up in 2003 within the NHS Online Health Informatics Community Portal (http://www.informatics.nhs.uk/) to disseminate information and provide support to users of GIS within the UK health industry.

Geographically enabling the electronic patient record offers a powerful advantage in visualising unfolding epidemiological events and patterns hidden in aggregated patient records. Unlike the UK national strategy documents and plans, the US National Health Information Infrastructure Strategy document (also known as "Information for Health") refers explicitly to GIS and real-time health and disease monitoring and states that "public health will need to include in its toolkit integrated data systems; high-quality community-level data; tools to identify significant health trends in real-time data streams; and geographic information systems" [48]. GIS are also explicitly included in the National Electronic Disease Surveillance System (NEDSS) specifications and systems architecture of the US Centres for Disease Control and Prevention (CDC) [49].

Earlier surveys of levels of GIS use in the NHS

During the 1990s there were three significant surveys of the take-up of GIS in the UK NHS. These studies pointed to sporadic levels of GIS use in mostly low-level, non-strategic tasks within the NHS, and have consistently flagged common sets of factors that are hindering the implementation and use of GIS within the NHS [28,35].

First, a study commissioned by the Association for Geographic Information, and carried out by Cummins and Rathwell (1991 – cited in [35]) noted that GIS uptake was being hindered by the low level of awareness of the value of GIS in spatially representing population and health needs-based information. This was attributed to a lack of spatial data-handling skills within the NHS and the failure to realise the value of geographical information, which was often being collected as part of existing operational tasks. Cummins and Rathwell also found that there was little in the way of infrastructure to enable staff, training, resource management, and financial budgets to implement GIS. The small size of many IT departments and high staff turnover, attributed to the higher wage levels of staff with spatial handling skills in the private sector, also hindered the effective implementation of GIS.

Gould (1992 – cited in [28,35]) found high levels of GIS awareness by Directors of Public Health and health authority information/IT officers in England and Wales, though respondents did not perceive any differences between GIS and computer assisted cartography software, and were not really using the software in anything more than simple mapping and low-level operational tasks. Typically, maps were being included in the annual reports of the Directors of Public Health to illustrate the health priorities of individual health authorities, with very little emphasis on using GIS in strategic tasks. Gould drew attention to a number of factors limiting the use of GIS for strategic tasks within the NHS, such as a low skills base in the handling of spatial information and the lack of functions for spatial analysis and spatial modelling in commercial packages (at that time; this is only marginally true in today's GIS packages).

These findings were replicated in a study conducted in the later part of the 1990s by Smith and Jarvis (1998 – cited in [28,35]). Smith and Jarvis surveyed changes in the use of GIS within the NHS following the reforms of the early 1990s and found that GIS use had again tended to be uncoordinated and low-level in nature, because of a lack of policy directives concerning appropriate systems, as well as a general lack of high quality data. Moreover, they noted a particular lack of collaboration between GIS academics and healthcare practitioners/managers. Research undertaken in academia has certainly highlighted the benefits of spatial statistics and GIS approaches in mapping disease and in healthcare planning, but still needs to respond to NHS needs on the ground.

One area where GIS have debatably made less impact is that of measuring and monitoring NHS performance. A number of dimensions could be measured such as improving the health of the general population, ensuring fair access to services, maintaining the effective delivery of appropriate care and analysing the outcomes of NHS care. GIS have a potential role in evaluating performance and could be used to enable comparisons to be made between health authorities and NHS trusts. GIS are being used by a relatively small number of authorities to assess the effectiveness and impact of health interventions and health education campaigns [28].

More recently, Cooper (2000) surveyed thirteen health authorities in the West Midlands. Two thirds of the surveyed health authorities stated that GIS was being under-utilised and cited the main reasons as the high costs of digital geographical data and the lack of resources for training and work-time constraints for NHS personnel, because of the low priority given to GIS within the organisations by senior management. Staff turnover, especially in organisations where only one member of staff is responsible for GIS work (this was the case in 30% of surveyed organisations), left such organisations vulnerable in terms of their GIS capabilities. Cooper advocates the setting up of regional health GIS centres, integrated with the newly formed public health observatories to provide GIS support for local health services and to coordinate training programmes to improve the GIS skills base. These regional services could be responsible for coordinating data collection at the regional level, and preventing any duplication of efforts in spatial data collection or processing [50].

Public health observatories

Public health observatories (PHOs), as proposed in the government White Paper "Saving Lives: Our Healthier Nation" [51], have been set up in each NHS region to draw information together from a range of sources with which to monitor health trends and to identify gaps in information [10]. Looking at their objectives and published agenda, PHOs could have easily undertaken the tasks suggested by Cooper in [50] (see above). However, after almost three years in existence, it seems PHOs have failed to fulfil this task (or never thought of fulfilling it), though there is certainly some very good, but sporadic, GIS activity within PHOs. For example, the North West PHO and the Institute for Health Research at Lancaster University have developed an introductory guide to GIS for North West Primary Care Trust Public Health teams (http://www.nwpho.org.uk/gistraining/), while the West Midlands PHO has developed MAIGIS, a pilot Multi-Agency Internet Geographic Information Service (http://maigis.wmpho.org.uk/). MAIGIS is a 3-year pilot project funded by the Public Health Development Fund to establish an interactive map-based Web site for sharing health, and related socio-economic and environmental, data from different organisations for the West Midlands Region [52] (Figure 2).

Figure 2
Screenshot of MAIGIS (Multi-Agency Internet Geographic Information Service – http://maigis.wmpho.org.uk/) showing an interactive map of the incidence rates by PCG (Primary Care Group) of prostate cancer in the West Midlands ...

Most recent published survey of levels of GIS use in the NHS

The research agenda of Higgs et al for understanding the (under) utilisation of spatial data within the NHS (executed as an Economic and Social Research Council – ESRC-funded project in 2001) included the following tasks: (1) to update previous surveys of GIS utilisation in the NHS; (2) to review the types of GIS applications currently being used; (3) to explore the reasons for variations in the use and wider implementation of GIS, and also consider technical and organisational barriers that influence wider application; and (4) to examine the nature and extent of data exchanges within, and external to, the NHS [28]. The results of this study were published in 2003 [35].

The study involved a postal questionnaire survey of all health authorities and trusts in the UK, and follow-up semi-structured interviews with selected key respondents from the NHS. Higgs et al found that 84% of health authorities and 29% of the health trusts that responded were using GIS in June-July 2001 (the time the study was conducted). However, only 54% of health authorities and 56% of health trusts within this active subset reported having fully operational GIS. Trusts using GIS include ambulance services NHS trusts that are using GIS in real-time emergency service despatching and control, and cancer intelligence units that are using GIS to examine spatial patterns of disease incidence. Factors such as historical precedent, the presence of dedicated GIS-able individuals or teams, and the presence of an effective infrastructure of GIS advice, guidance, and support available to NHS organisations (e.g., in West Midlands and Trent – for some examples, see http://www.sheffield.nhs.uk/healthdata/gis.htm and http://gis.sheffield.ac.uk/) could explain the observed patterns of health organisations that are GIS users or nonusers, and those that show higher degrees of collaboration with local authorities [35].

The production of maps was undertaken in 96% of the health authorities and 67% of the health trusts that reported using GIS in 2001. The same active subset of health authorities and trusts was also found to be using GIS to undertake geographically based analysis (75% and 72%), data manipulation (46% and 51%), and analysis of statistical data (36% and 44%) [35].

Health authorities were found to be making greater use of GIS for policy-related tasks, e.g., to produce health profiles of local populations, and in epidemiological research, assessing health needs for the purchase of health services, determining catchment areas for local services and planning the location of health facilities. In health trusts, the dominant reported use of GIS was for targeting resources towards local population groups. Using GIS in monitoring performance was undertaken in 30% of health authorities and 39% of health trusts in the active subset. Internet and Intranet GIS were found to be still rare within the NHS [35].

Higgs et al also attempted to measure the levels of joined-up working within NHS organisations and with external agencies (e.g., Police, local authorities, utilities, and other central government departments), which has the potential to address a wider range of cross-departmental or cross-governmental issues (e.g., health, poverty and social exclusion). They found limited exchange of data between health trusts and other organisations. By contrast, a more significant number of health authorities reported exchanging data with external organisations, with significant collaboration with local authorities in over 30% of cases in the active subset (still not a very high figure) [35].

Despite these uses of GIS in operational and policy-related tasks, many respondents identified factors they perceived to be hindering the wider use of GIS within their organisation, and data exchange and collaboration with other organisations and local authorities. These obstacles included work-time constraints, insufficient staff and financial resources to implement systems fully and undertake data exchange duties with other organisations (e.g., establishing networks and contacts), lack of skills and insufficient training or guidance, lack of digital data in appropriate formats and problems of ensuring data quality, data confidentiality issues, lack of demand for GIS from within some organisations (directors being unaware of the value of GIS rather than uncommitted to it), and lack of a clear GIS strategy [35].

The lack of a clear organisational policy for exchanging data was among the most significant data exchange constraints identified by health authorities and trusts. This was compounded by the currently ambiguous criteria for conforming to data confidentiality requirements, and the lack of a service-level agreement with Ordnance Survey (http://www.ordnancesurvey.co.uk/) or other providers such as ESRI and MapInfo for the purchase of base digital data for organisations within the NHS. Another important problem reported was that of organisations not being aware of data held by other organisations [35].

Higgs et al suggest raising awareness of the benefits of joined-up working arrangements, and introducing significant organisational and cultural changes to facilitate enabling contexts for enhanced collaborative use of GIS between NHS organisations and local authorities, in order to support the wider joined-up government agenda currently being promoted in the UK [35].

It should be noted that Higgs et al carried out their study, and reported differences between health authorities and trusts, in 2001, some time before the start of the current changes in the UK health system, in which Primary Care Trusts are taking over many of the classical functions of health authorities and a smaller number of Strategic Health Authorities are taking an increasingly strategic role in the performance management of trusts. Higgs et al's survey could be used as a baseline with which to monitor the impacts of current and future organisational restructuring on the uses of GIS within the NHS [35,53].

Geo-information and real-time GIS infrastructure requirements

In this section, we start by reviewing some of the recipes and recommendations provided by various specialist groups and researchers from around the world for a successful implementation of a national geo-information infrastructure that can also support real-time GIS applications in public health. We then present a Canadian case study that emphasises the importance of data modelling and community/university collaboration among other elements involved in the development of community health geo-information systems. The reader will notice that there are many recurring themes, and elements and ingredients which are common to all these recipes and the presented case study. The section concludes with a detailed discussion of some of these elements and others that are crucial for properly building a national spatial health information infrastructure.

The Nairobi statement – International Federation of Surveyors and the United Nations-Habitat, 2002

The International Conference on Spatial Information for Sustainable Development held in Nairobi, Kenya, in 2001 recognised that development and implementation of a National Spatial Data Infrastructure (NSDI) is a prerequisite for promoting sustainable development in any country. The conference also recognised that although every NSDI is different due to a variety of cultural, social and economic factors unique within each country, there are a significant number of common elements that can be shared, and which countries should avoid re-inventing; these elements include [6]:

(1) Fostering a culture of data sharing that considers spatial information an asset: A key success factor of NSDI implementation is the management of information (including spatial information) as an asset, e.g., only capture data that are needed and can be maintained, as is the case with finance and human resources. An NSDI requires a culture of data sharing to exist within a country. The benefits associated with data sharing should be researched to encourage wide participation [6].

(2) Education, training, and capacity building: The sharing of education and training resources and experiences by organisations is important for capacity building. Universities should be encouraged to work with local organisations in the provision of Continuing Professional Development [6]. (Of interest in this context is UNIGIS International http://www.unigis.org/, a worldwide network of educational institutions offering distance-learning courses in GIS.)

(3) Addressing crucial legal issues: Experience has shown that issues associated with national security, data privacy and associated liability are potential obstacles for NSDI initiatives. Unambiguous legal frameworks to address these crucial legal issues must be established as early as possible. Ordinary citizens must be considered one of the main NSDI beneficiaries and allowed access to NSDI information and services (where appropriate) [6].

(4) Development of effective partnerships, and involvement of all stakeholders and users: Mature NSDIs are complex solutions involving many stakeholders (including the health sector with all its organisations). NSDIs are underpinned by effective partnerships and cooperation amongst a wide variety of multi-disciplinary stakeholders in the public and private sectors and end-user communities. Appropriate business models must be agreed to support these partnerships at an early stage. The success of an NSDI depends upon delivering products and services that are acceptable and desired by end users (within the government and the private sector, and also citizens). It is essential that all users are involved when defining (user) requirements and testing the associated products and services. NSDI policy must be flexible to address rapidly changing user needs and adapt to changing technologies. NSDI Steering Groups (with end-user representation) should be formed to formulate appropriate policy and institutional frameworks and facilitate multi-stakeholder cooperation. However, complete policy and institutional frameworks need not be in place before implementation of an NSDI can begin. Roles and responsibilities among stakeholders must be clarified at an early stage, including the lead role – this should be an initial activity of an NSDI Steering Group [6].

(5) Adopting common standards and data models: ISO http://www.iso.org/ and the Open GIS Consortium http://www.opengis.org/ produce data and interoperability standards that should be adopted by NSDI stakeholders (see later). To be able to integrate and share data we need to understand and resolve different semantics in data. All NSDI datasets from different sources should adopt the same overarching philosophy and same/compatible data models to achieve multi-purpose data integration, both vertically and horizontally (within organisations, and across organisations and different administrative levels) [6].

(6) A combined top-down and bottom-up incremental implementation approach: It is recommended that a top-down approach is combined with a pragmatic bottom-up approach. A mature NSDI can only be achieved through simpler and smaller solutions that start with realistic and clear short-term objectives, and grow incrementally through political and market needs. Short-term bottom-up projects will provide valuable experience that can feed into the formulation of NSDI policy and strategy. By creating "proof of concept and benefits applications", these projects can also be used to gain and sustain political support, and to secure further funding for the NSDI [6].

(7) Do not just focus on data; develop applications: Varied applications and services through a project-oriented approach will bring reality to the NSDI. An overemphasis on data acquisition, without a market-linked application, will not provide any momentum for further development. Visualisation, modelling and analysing activities will be the focus of value-added services in the coming years [6].

EIS-AFRICA – Gavin et al, 2002

EIS-AFRICA is a network for the cooperative management of environmental information in Africa. It is a pan-African, non-profit, non-governmental organisation, registered in South Africa, and born out of the World Bank's Environmental Information Systems in Sub-Saharan Africa programme (EIS-SSA). Building infrastructure for geo-information use is becoming as important to African countries as the building of roads and telecommunication networks. As with the investment in other basic infrastructures, investing in a Spatial Data Infrastructure (SDI) underpins the provision of many essential services. In a recent EIS-AFRICA position paper published in 2002, Gavin et al describe the following SDI components [9]:

(1) Up-to-date core digital geo-datasets: A country's ability to use geo-information effectively depends on the existence of, and proper investment in the provision and maintenance of, up-to-date core digital geo-datasets, e.g., locations of river networks, roads, land cover, administrative boundaries, and populated places. The existence of these commonly used datasets facilitates the use of other geodata, such as demographic, socio-economic, epidemiological, environmental, and water quality data, which must also be available, accessible and up-to-date [9].

(2) Standards: Geodata must adhere to accepted standards to enable the unambiguous interpretation, integration, and comparison of related datasets from different sources. Stakeholders should work together locally and with international bodies to develop/adopt standards for geodata collection and documentation [9]. (Adopting international standards will also ensure that future collaboration is possible at regional and global levels.)

(3) Metadata: The accessibility of proper documentation (metadata) about existing geodata is also extremely important. The mere existence of geodata is not sufficient. Information about datasets is needed for the purposes of generating awareness of the data's existence among potential users, and for helping these users assess the reliability and/or relevance of available datasets for selected uses. This in turn requires that data providers publicise their metadata to the public and targeted users in a suitable catalogue, directory or clearinghouse to enable searching and retrieving documentation on available datasets. Metadata, too, should be standardised [9] (Figure 3).

Figure 3
Screenshot of ArcCatalog and the built-in ISO Metadata Wizard/editor (part of ArcGIS Desktop 8.3 from ESRI). ArcCatalog enables users to view their GIS data holdings, and create, manage, and ...

(4) Policies and practices actively promoting the exchange and reuse of information, and greater public access to geodata are also needed. Policies should start by removing barriers to access, e.g., excessive costs to use an information product or lack of clarity concerning copyright. The absence of a policy concerning data access and sharing can often be as handicapping as the presence of an inhibiting policy. Existing policies need to be revised and new policies developed as necessary. Broad-based national committees of data producers, users, and other stakeholders should be created to oversee the development of geo-information policy and standards and ensure compliance [9].

(5) Appropriate human and other resources: Sufficient human and technical resources are required to collect, manipulate, interpret, and distribute geo-information. Without appropriate human resources, geo-information will remain unexploited. Sufficient financial resources must be available to invest in training people. Retaining technical expertise should also be a priority within institutions using geo-information. Adequate investments must also be made in technologies for digital data management and storage, and in improving communications infrastructure [9].

(6) Coordination between various stakeholders: The cost-effective development of SDI requires the coordinated harnessing of resources and expertise in many different government agencies, the private sector, universities, non-governmental organisations, and regional and international bodies. Collaborative frameworks (partnerships) are required to prevent duplication of effort (which would occur if various institutions pursue singular, uncoordinated agendas), and ensure that all captured and generated data and information conform to common standards, so that they can be easily combined and effectively analysed. Such frameworks should specify which organisations are gathering which kinds of information, how the information will be captured, and arrangements for data sharing [9].

(7) Raising awareness: Establishing a formal national programme can help heighten awareness and generate support. Policymakers need to be engaged in the process through awareness training, briefings, and policy dialogue. Organising conferences on geo-information, conducting studies on implementing SDI, and supporting professional development are all important ingredients [9].
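The catalogue or clearinghouse described under component (3) above can be pictured with a minimal Python sketch. The record fields, function name and sample entries below are all hypothetical and chosen for illustration only; they are not drawn from Gavin et al's paper or from any particular metadata standard.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass(frozen=True)
class MetadataRecord:
    """Hypothetical, simplified description of one geo-dataset."""
    title: str
    provider: str
    keywords: Tuple[str, ...]
    last_updated: date
    access_conditions: str


def search_catalogue(catalogue, keyword, updated_since: Optional[date] = None):
    """Return records tagged with `keyword`, optionally only recent ones."""
    wanted = keyword.lower()
    hits = []
    for record in catalogue:
        if wanted not in (k.lower() for k in record.keywords):
            continue  # record is not about the requested theme
        if updated_since is not None and record.last_updated < updated_since:
            continue  # record is older than the user is willing to accept
        hits.append(record)
    return hits


# Invented sample entries, for illustration only.
catalogue = [
    MetadataRecord("National road network", "Survey Department",
                   ("roads", "transport"), date(2001, 6, 1), "free"),
    MetadataRecord("District boundaries", "Statistics Office",
                   ("administrative boundaries",), date(1998, 3, 15), "licensed"),
]

road_datasets = search_catalogue(catalogue, "Roads")
```

Even a record this simple lets a potential user discover that a dataset exists and judge its currency and access conditions before requesting it.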

WHO-AFRO – Briggs, 2000

In a 140-page report commissioned by the World Health Organisation – Regional office for Africa (WHO-AFRO), Briggs proposes a programme of action to advance environmental health hazard mapping in Africa that includes the following elements [54]:

(1) Data modelling: This involves developing and adapting health indicators according to specific local user needs [54] (see "Common semantics, data models and health indicators" below).

(2) Awareness-raising campaigns: These should be based on real-world examples and demonstrations of environmental health hazard mapping, and aimed at key decision makers in concerned organisations [54].

(3) Joint working in partnerships: This involves adoption of a multi-sectoral approach to environmental health hazard mapping, encouragement and support for the sharing of experience and facilities, and support for training and long-term capacity-building, e.g., by building up expert/national networks (partnerships) and organising workshops, seminars and study visits [54].

(4) An incremental approach aimed at making the best possible use of available data and expertise to address local needs [54].

Richards et al, 1999

In a paper published in 1999 in Public Health Reports, Richards et al stress the following points [7]:

(1) There is a need for intelligent tools specifically designed for public health, and seamlessly integrated into routine workflows [7].

(2) Training, its costs, and the time needed for it should all be considered: Training should cover epidemiological methods to ensure appropriate use of GIS technology in public health. The cost of training programmes offered by commercial GIS vendors and solution providers can be a financial burden, and GIS training programmes specifically designed for public health professionals are still relatively limited. The time required for training can also be a challenge for organisations in which demands on personnel are already high. Training materials should be offered in a variety of formats to facilitate distance learning (e.g., CD-ROMs and self-instruction Web-based courses). Public health professional specialties/bodies need to recognise continuing education credit for individuals participating in GIS software training [7].

(3) Current and accurate base data must be made available [7].

(4) Software and data acquisition, maintenance and upgrade costs should be secured [7]. (In the case of the UK, reaching an agreement to enable the whole NHS for example to access Ordnance Survey (OS) geographic information would be economically much better than asking each NHS organisation to strike a separate deal with OS. It is noteworthy that the business case outlining a proposed pilot agreement between OS and the NHS was approved by the NHSIA board in September 2003, and it now remains for the NHSIA and OS to determine the scope and funding of the pilot agreement, which is expected soon.)

(5) Confidentiality issues must be addressed [7] (see "Individual privacy, national security, and data confidentiality issues" below).

(6) Standards must be adopted and partnerships promoted at all levels [7].

(7) An incremental approach is needed: Longer-term solutions usually require a series of small successes, carefully built upon in incremental fashion over time [7].

In fact, much of the wider vision of a national public health spatial data infrastructure can be gradually and incrementally achieved through disparately funded and managed short-term projects, as long as we can ensure that these short-term projects make a useful and lasting contribution towards this wider vision.

Davenhall, 2002

In a recent ArcUser Online article, William F. Davenhall, ESRI Health and Human Service Solutions Manager http://www.esri.com/industries/health/index.html, describes an ambitious vision of a Community Health Surveillance System (CHSS – see later) spanning wide geographic areas, and mentions the following success factors [2]:

(1) Community data sharing must be systematic and regular [2].

(2) Adopting data standards and sharing agreements will ensure a CHSS works effectively in real time [2].

(3) Data have to be collected uniformly and include specifications for update frequency and allowed dissemination in different emergency and non-emergency situations, and for purposes other than those for which they were originally collected [2].

(4) CHSS also requires robust, epidemiologically sound analytical software, well-trained staff, full system redundancy/fault tolerance, standardised database replication and off-site backups, among other ingredients for success [2].

RODS – Tsui et al, 2003 and Wagner et al, 2003

RODS, the Real-time Outbreak and Disease Surveillance system, is a computerised public health surveillance system for early detection of disease outbreaks, including those caused by bioterrorism. RODS processes clinical encounter data from participating hospitals and sales data of over-the-counter (OTC) healthcare products from participating stores and pharmacies. The system was used during the 2002 Winter Olympics and currently operates in two US states – Pennsylvania and Utah (more details about RODS are presented below under "Proactive, real-time, GIS-enabled health and environmental surveillance services") [55,56]. RODS researchers identified the following key elements for success:

(1) Data-sharing agreements: These were executed in the case of RODS with every participating health system and OTC healthcare product retailer, and addressed confidentiality and other concerns. Data sharing agreements should allow redistribution of data to any public health authority and permit data to be used in research [55,56].

(2) National data utilities/services: Data sources that are amenable to a "national" approach should be formed into industry-based data utilities (services independent of any particular user interface) [56].

(3) A deep understanding of data and industry: Wagner et al found that a key element for success included the deep understanding of the OTC healthcare products industry provided to them by an industry expert [56].

(4) Official/governmental support: Equally key was a personal invitation, sent to the CEO of relevant corporations, for participation (sharing of otherwise proprietary data), authored by a highly respected government or public health official [56].

(5) Development of an interdisciplinary team with expertise in medical informatics, computer science, law, and engineering [56].

Morris and Henton, 2003

Morris and Henton list several key factors for progress and success of an environmental health surveillance system for Scotland (see "Large-scale environmental surveillance projects in the UK" below), including ensuring joint ownership of the project, successful partnerships and shared commitment among the disparate agencies that are involved in the project, adopting a phased approach, reaching a consensus on inputs and outputs, and having realistic expectations [57].

Higgs et al, 2003

In a recent study published in 2003 (see "Most recent published survey of levels of GIS use in the NHS" above), Higgs et al point to the following ingredients [35]:

(1) Establishing networks of GIS users from both the NHS and local authorities at local and higher levels to encourage more joined-up working, share expertise and experiences, as well as establish contacts and trust, and raise the awareness of the types of data that are held by different organisations [35].

(2) Raising awareness: A substantial proportion of respondents in Higgs et al's study from health authorities (90%) and trusts (74%) stated that a dedicated Web site giving advice on GIS matters for NHS organisations would be helpful in providing a forum or virtual network on the Web for the exchange of information and experiences, as well as in promoting and disseminating good practice examples of GIS use in healthcare, and identifying other suitable Web resources. Successful examples of collaborative projects between NHS and local authorities that have involved the use of GIS should also be highlighted. Other factors considered important in raising awareness include an annual GIS conference aimed at professionals from NHS organisations and local authorities, and the provision of seminars, workshops, and road shows. According to Higgs et al, such "raising awareness" activities are vital given the need to build business cases for the development of GIS within NHS organisations and to show the capabilities and "business benefits" of GIS to directors [35].

Croner, 2003

Croner points to several elements and tasks required to develop a nation's public health geospatial infrastructure and realise comprehensive Internet geospatial readiness in public health; these elements include [58]:

(1) Vision and leadership at the highest levels (e.g., departments of health): This is necessary to ensure national public health geospatial mobilisation and readiness. A suitable policy and funding must be established, including the provision of support to organisations lacking the resources to join in a common, coherent national initiative [58].

(2) Assessing the current state of geospatial readiness to respond to normal and emergency community health needs, and identifying beacon sites as resources for guidance and other forms of assistance to those agencies and departments that are not yet involved or are in the early formative stages of involvement [58].

(3) Technology introduction; training and education programmes: This implies the provision of necessary budgets for these activities [58].

(4) Promoting collaboration with and between all sectors to share data, applications and expertise [58] (see "On partnerships" below).

(5) Moving to the Web and building all necessary critical connectivity/geospatial infrastructure that should not be independently recreated by all [58].

(6) Geospatial readiness also requires that geospatial data holdings be identified, described and made Web-searchable in a standardised manner forming a truly uniform, integrated, navigable and shareable national inventory of existing public health geospatial data resources. Best standards, rules, designs, and practices must be created/agreed upon and published (covering spatial metadata, geocoding, accessibility for visually and manually impaired data users, and data access restrictions among other things) for uniform Internet-enabled GIS services, in which standards, definitions, and look-and-feel of the data and Web-based technology are the same throughout the nation [58].

Case study – Buckeridge et al, 2002

Community/university research collaboration is a relatively new research paradigm that has recently become a major strategic theme of health funding agencies in Canada and elsewhere. Buckeridge and his colleagues present their experience in conducting a collaborative community/university respiratory health GIS research project in Canada. Their experience is a good example of research into the kind of partnerships (community/academia) that are also required to realise the envisaged national spatial health information infrastructure in the UK. Their specific project objectives were to: (1) develop and iteratively refine via active community/university collaboration a GIS for ready access to routinely collected health data (focusing on respiratory health), and study logistical, conceptual and technical problems encountered during system development; and (2) conduct a qualitative ethnographic study to document and analyse issues that can emerge in the process of community/university research collaboration [8].

Buckeridge et al adopted user-centred and rapid prototyping/iterative design methods. User feedback was gathered via questionnaires and discussions [8].

In an initial step, university and community partners jointly developed a conceptual data model (or ontology) to facilitate data integration and enable participants from different backgrounds to share a common vocabulary and dialogue (see also "Common semantics, data models and health indicators" below). The data model described by Buckeridge et al was based on a "determinants of health" model that explicitly acknowledges the influence of non-medical determinants (e.g., income, occupation, and environment) on population health status, and qualitatively relates these determinants to health outcomes [8]. Such models have been used successfully as the basis for other population health information system approaches, e.g., POPULIS (see "Caring for population demographics and socio-economic factors" below) [59,60].

The next steps involved identifying, evaluating, and acquiring potentially relevant datasets based on data needs identified from the data model. Data describing determinants of respiratory health included census data, cartographic files, land use, traffic volume, air monitoring and emissions, and consumer spending patterns. Data describing outcomes of respiratory health included hospital separations (similar to Hospital Episode Statistics in England – see http://www.doh.gov.uk/hes/), ambulatory physician visits and procedures, and prescription drug sales. Once acquired, the data were integrated into the GIS using the developed data model and the enumeration area as the spatial unit for relating datasets to one another. The enumeration area is a Canadian census sampling area with a median population of 400 in the study area (Southeast Toronto); it acted as a common high-resolution geographical unit for linkage, and the data model facilitated data integration around it. The limited and inconsistent descriptions (metadata) of existing data were partially addressed by adopting a "standard" ad hoc metadata model within the system to represent available descriptions in an organised manner [8].
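The idea of linking disparate datasets around a common geographical unit can be illustrated with a minimal Python sketch. It is not Buckeridge et al's implementation: the function name, the area identifiers (`EA001`, etc.), the field names and all values are hypothetical, and a real system would work with a GIS or database rather than in-memory dictionaries.

```python
def link_by_area(*datasets):
    """Merge several {area_id: {field: value}} datasets on the shared area id.

    Only areas present in every dataset are kept (an inner join), so each
    linked record carries a complete set of determinant and outcome fields.
    """
    if not datasets:
        return {}
    common_ids = set(datasets[0])
    for dataset in datasets[1:]:
        common_ids &= set(dataset)
    linked = {}
    for area_id in sorted(common_ids):
        record = {}
        for dataset in datasets:
            record.update(dataset[area_id])
        linked[area_id] = record
    return linked


# Invented values keyed by a hypothetical enumeration-area identifier.
census = {
    "EA001": {"population": 410},
    "EA002": {"population": 395},
    "EA003": {"population": 420},
}
air_quality = {
    "EA001": {"mean_no2": 21.5},
    "EA002": {"mean_no2": 30.1},
}
outcomes = {
    "EA001": {"hospital_separations": 4},
    "EA002": {"hospital_separations": 7},
    "EA003": {"hospital_separations": 2},
}

linked = link_by_area(census, air_quality, outcomes)
```

In this sketch `EA003` is dropped because no air quality data exist for it. Whether to drop such areas or keep them with missing values is a design choice that depends on the analysis intended.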

Buckeridge et al highlighted some important issues that they encountered during the development of their system, and which are also generalisable to other community health information systems [8]:

(1) Early and continued involvement of users in system development is important, if not essential. However, maintaining and coordinating consistent user involvement, especially across a number of organisations, is a difficult and resource-intensive task that should be well planned [8].

(2) All relevant system stakeholders should be involved in the development of a data model or ontology to facilitate data selection and integration, and support a common understanding of data by people [8].

(3) Challenges met while bringing disparate data together included lack of directories or catalogues for locating existing data, generally poor descriptions (metadata) for existing data, non-standard encoding of data, and concern over data "ownership" and/or privacy issues. Web accessible directories of data would greatly facilitate identifying data sources. In addition, action should be taken to improve data documentation (metadata), develop data standards, and enhance compliance with existing standards. Many data holders did not have an established protocol for access to their data, or a clearly identified person with the authority to release data. In the absence of these, data holders were reluctant to release data, and acquisition of some data required a considerable amount of negotiation and follow-up. The difficulties encountered in acquiring data indicate that privacy concerns present a serious barrier to system development. A wide range of stakeholders in society must collectively address the issues of privacy and stewardship of population health data [8].

(4) The potential for data display to be misleading and for misinterpretation of data was addressed by providing users with descriptions (metadata) of datasets and constraining map types by data types. Methods to allow only valid visualisation and analysis of data from a variety of sources across space and time must be developed and evaluated [8].

(5) Problems from an interface design perspective included the need to constantly change the interface to accommodate a refined understanding of user needs and changes in the underlying data structure (because of the iterative nature of the development process). Standard software engineering methods, such as design models and modular programming, helped to address these problems [8].

(6) Users of community health information systems will nearly always have variable skills and organisational contexts. The range of user skills and knowledge was partially addressed by developing a graphical user interface with multiple levels, each supporting a different user level. Another approach would be to use artificial intelligence, as employed in decision support systems, to facilitate user control of information visualisation [8].

(7) Community/university collaboration issues: As organisations and individuals are brought together to form research partnerships, differences in their organisational/institutional cultures become apparent. Community partners tend to see potential conflicts between service provision and research demands, while university partners tend to see the collaboration as threatening research rigour and control over the research process, and as constraining publication opportunities [8].

Leadership style, vision, commitment to the idea of community/university collaboration, at least small amounts of "seed funding", and the willingness to learn from failures all appear to be significant features in successful collaborations. Issues that shaped and influenced the collaborative process and partnership that developed during the course of Buckeridge et al's project revolved around three major themes: separate cultures (differences in expectations, values, outcomes, reward systems and work styles), time, and uncertainty/ambiguity. These issues are neither positive nor negative. Rather, they represent challenges which, depending on how they are met, have the potential to shape the collaborative process in either positive or negative ways [8].

Sub-themes within the "separate cultures" category included issues around language, trust, and power. Language differences (different knowledge backgrounds and ways of understanding the world) occurred as frequently between university partners from different academic disciplines as between university and community partners. Trust developed gradually with time, as co-investigators came to recognise the strengths, commitment and knowledge of each other, and as the group worked to make joint decisions and solve conflicts. Issues of power arose from differences in status, resources, skills, and personal commitment to the project [8].

Time was a burden for individuals, but an asset to the collaborative project as a whole, as it supported the development of trust, mutual understanding and effective working relationships [8].

Many co-investigators pointed out that the most difficult aspect of their collaboration was to learn to accept and work with the uncertainty and ambiguity about where the project was going as it developed and unfolded (despite clear project goals and objectives). Nevertheless, uncertainty and ambiguity were found to be essential to the shared positive experience of exploration, debate, and reflection, and also created the space to ask critical questions [8].

Community partners engaged in collaborative research with universities should see themselves as equal partners. This could be achieved in part by making an organisational commitment to research (e.g., supporting staff involved in research and advocating with funding agencies for research resources). On the other hand, universities should foster community/university research partnerships by developing university structures that support such collaboration, and inducing positive changes in the current academic culture, which places more value on individual rather than collaborative research [8].

On partnerships

In the case study presented above, Buckeridge et al stressed the importance of community/university collaboration when developing a community health geo-information service [8]. Public health also needs to be an integral part of a larger structural, multi-agency whole, where government and other relevant agencies at all levels are brought together to build, integrate, leverage through sharing and partnerships, and optimise spatial information, both vertically within and horizontally across organisations, for comprehensive routine as well as emergency planning and response services. Intranet and Internet environments can help facilitate public health spatial data accessibility and integration at local, national and regional levels, and can support a physical and virtual "situation room" for both emergency and day-to-day management of operations for safeguarding the environment and protecting human health [58].

A San Diego Association of Governments report titled "Guidelines for Data Development Partnership Success" is based on many years of GIS partnering experience and cites guidelines that may help other agencies develop successful partnership activities [61].

A good example of successful partnerships is the online GIS service known as "Window to My Environment" (WME – http://www.epa.gov/enviro/wme/), which is offered by the US Environmental Protection Agency (EPA). WME is designed to provide public accessibility to a wide range of federal, state, and local geospatial data about environmental conditions and features in any US location. The data available in the WME application are distributed and reside at their respective agency servers. Thus each participating agency manages its own data and its timeliness, which can be current and even real-time. There is no limit on the number of WME partners. Any agency can participate by adding its own data layer(s) to existing ones. Participants can also create a reciprocal interface on their home server with WME connectivity. Public health databases are not yet included in WME, but there are no specific barriers to inclusion [58].

Common semantics, data models and health indicators

As information systems increase in complexity, models of the relationships between data elements become increasingly important. Data models, more correctly called ontologies, explicitly define how concepts within data sources relate to each other. They are conceptual models that facilitate integration of data by information systems and support a common understanding of data by people [8].

To explain the importance of adopting common semantics when developing health geo-information services that span administrative boundaries, Richards et al provide the example of two neighbouring public health departments that are addressing a common infectious disease problem and would like to join their independently developed GIS maps into a common map for both jurisdictions. Doing so requires consensus on a range of technical, GIS-related issues and public health-related issues. The latter for example include case definitions, sources for case reports, and the time period for the study [7].

In his report commissioned by WHO-AFRO, Briggs classifies environmental health hazards into eight categories: land/climate-related hazards, atmospheric hazards (outdoor air pollution), water-related hazards, food-borne hazards, vector-borne hazards, domestic hazards, occupational hazards, and infrastructural hazards. Briggs' report stresses the importance of indicators as essential tools for environmental health hazard mapping. Indicators provide the means of describing, monitoring, managing, and comparing hazards in terms that are relevant to information users. Three types of indicator are proposed [54]:

(1) Hazard indicators: define the hazard in terms of its extent, magnitude, duration, frequency or probability of occurrence, without reference either to the exposed population or health effect;

(2) Risk indicators: describe the hazard in terms of the number or percentage of people exposed; and

(3) Health impact indicators: describe the hazard in terms of the actual health outcome, measured as either morbidity or mortality [54].

Which type of indicator is most appropriate is likely to depend on the specific question being asked. Natural hazards, for example, can be readily described by hazard indicators, while hazards like suicides and domestic violence are more easily described by health impact indicators [54].
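The distinction between the three indicator types can be made concrete with a small Python sketch. The formulas below are simple illustrative choices (an exceedance count, a percentage exposed, and a rate per thousand), not definitions taken from Briggs' report; the function names, threshold and figures are all hypothetical.

```python
def hazard_indicator(daily_concentrations, threshold):
    """Hazard frequency: number of days on which a threshold was exceeded.

    Describes the hazard itself, without reference to who is exposed.
    """
    return sum(1 for value in daily_concentrations if value > threshold)


def risk_indicator(exposed_population, total_population):
    """Percentage of the population exposed to the hazard."""
    if total_population <= 0:
        raise ValueError("total_population must be positive")
    return 100.0 * exposed_population / total_population


def health_impact_indicator(cases, total_population, per=1000):
    """Morbidity or mortality expressed as cases per `per` head of population."""
    if total_population <= 0:
        raise ValueError("total_population must be positive")
    return per * cases / total_population


# Invented figures for a single hypothetical district.
days_over_limit = hazard_indicator([35, 52, 48, 61, 40], threshold=50)
percent_exposed = risk_indicator(exposed_population=1200, total_population=4800)
cases_per_thousand = health_impact_indicator(cases=36, total_population=4800)
```

The three functions need progressively more data: the first only environmental measurements, the second population data as well, and the third health outcome data.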

Unfortunately, there are no one-size-fits-all indicators that suit all users. Indicators need to be customised according to specific and local user circumstances and needs, the specific hazard of interest, the type of question being asked, the scale of analysis, and data availability and quality. For this reason, the emphasis in Briggs' report was not on providing a core or generic set of environmental health hazard indicators, but on providing indicator profiles that show, for a sample of indicators, how they can be constructed/customised and used [54].

An indicator profile specifies the environmental health hazard(s) to which the indicator relates, the indicator's rationale and role, any alternative methods and definitions, any related indicator sets, sources of further information, and a listing of involved agencies. Each indicator must be clearly defined alongside all underlying terms and concepts involved in describing and constructing it. Data needed to construct an indicator must be identified and assessed regarding availability, quality, and characteristics in terms of the indicator in question. The ways in which the indicator is computed (e.g., a mathematical formula) and the units of measurement used in presenting it (e.g., percentage or number per thousand head of population) must also be specified. The area across which the indicator can be used (scale of application or aggregation level) must be determined. Finally, the ways in which the indicator may be interpreted in relation to the hazard(s) it covers must be described. This includes determining what inferences can be made from apparent trends or patterns in the indicator, and any constraints on the interpretation of the indicator, due for example to data limitations or complexities in the relationships implied by the indicator [54,62].
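One way to picture an indicator profile is as a structured record with a completeness check. The Python sketch below is only an illustration under stated assumptions: the class, field names, the split between required and optional elements, and the sample profile are all hypothetical and do not reproduce the template used in the cited reports.

```python
from dataclasses import dataclass, field, replace
from typing import List


@dataclass
class IndicatorProfile:
    """Hypothetical container for the elements of an indicator profile."""
    name: str
    hazards: List[str]
    rationale: str
    definition: str
    data_needs: List[str]
    computation: str
    units: str
    scale_of_application: str
    interpretation: str
    interpretation_constraints: List[str] = field(default_factory=list)
    alternative_methods: List[str] = field(default_factory=list)
    related_indicator_sets: List[str] = field(default_factory=list)
    further_information: List[str] = field(default_factory=list)
    agencies: List[str] = field(default_factory=list)


# Assumption: these elements are treated as mandatory in this sketch.
REQUIRED_FIELDS = (
    "name", "hazards", "rationale", "definition", "data_needs",
    "computation", "units", "scale_of_application", "interpretation",
)


def missing_fields(profile):
    """List required elements that are empty, in REQUIRED_FIELDS order."""
    return [name for name in REQUIRED_FIELDS if not getattr(profile, name)]


# Invented example profile.
profile = IndicatorProfile(
    name="Households without safe drinking water",
    hazards=["water-related hazards"],
    rationale="Tracks population exposure to unsafe water sources",
    definition="Share of households lacking access to a safe water source",
    data_needs=["household survey", "water source inventory"],
    computation="100 * households_without_access / total_households",
    units="percentage",
    scale_of_application="district",
    interpretation="Higher values suggest greater exposure",
)

incomplete = replace(profile, units="", data_needs=[])
```

Here `missing_fields(profile)` returns an empty list, while `missing_fields(incomplete)` reports the two elements that still need to be supplied.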

Indicators are not limited to environmental health hazard mapping. In 2000, the US National Association of County and City Health Officials (NACCHO) produced a comprehensive list of core and extended health indicators as part of their Community Health Status Assessment (CHSA) Toolbox. CHSA collects data under eleven indicator categories (listed in parentheses below) to answer three main questions [63]:

(1) Who are we and what do we bring to the table? (demographic characteristics; socio-economic characteristics; and health resource availability)

(2) What are the strengths and risks in our community that contribute to health? (quality of life; behavioural risk factors, e.g., substance abuse, lifestyle, and screening programmes; and environmental health indicators, e.g., air and water quality, workplace hazards, food safety, etc.)

(3) What is our health status? (social and mental health; maternal and child health; death, illness and injury; infectious disease; and sentinel events)

CHSA also calls for establishing a system to monitor these indicators over time, e.g., to detect sentinel events. The latter are cases of unnecessary disease, disability, or untimely death that could be avoided if appropriate and timely preventive services or medical care were provided. These include vaccine-preventable illness, avoidable hospitalisations (those patients admitted to the hospital in advanced stages of disease which potentially could have been detected or treated earlier), late stage cancer diagnosis, and unexpected syndromes or infections. Sentinel events may alert the community to health system problems such as inadequate vaccine coverage or lack of primary care and/or screening, a bioterrorist event, or the introduction of globally transmitted infections [63].

The CDC National Public Health Performance Standards Programme (NPHPSP – http://www.phppo.cdc.gov/nphpsp/index.asp) is a more current partnership effort to improve the practice of public health, the performance of public health systems, and the infrastructure supporting public health actions in the US. To achieve its goals, NPHPSP developed performance standards and matching assessment instruments for state and local public health systems, and for public health governing bodies. (NACCHO developed and tested the Local Public Health System Performance Assessment Instrument for NPHPSP – http://www.phppo.cdc.gov/nphpsp/Documents/Local_v_1_OMB_0920-0555.pdf) NPHPSP describes ten "Essential Public Health Services" that provide the fundamental framework for NPHPSP instruments by defining public health activities that should be undertaken in all communities http://www.phppo.cdc.gov/nphpsp/10EssentialPHServices.asp.

The Health Data Model (HDM) is a conceptually related collaborative project to develop a generic data model for health applications, using ESRI software. ESRI staff and researchers at the University of California at Santa Barbara (UCSB) are leading this consortium. The user members represent public health planning and research organisations, public health consulting firms, and GIS coordinators from medical centres around the US. The current phase of this work has assigned top priority to service site selection, emergency response, facility emergency response, campus facility management, regional environmental health, and disease surveillance. The outcome will be a basic data model with three components [64]:

(1) A conceptual object model of health application features, building relationships between health application geographies and users;

(2) UML (Unified Modelling Language) code which is easily transformed into an ESRI geo-database. The average user can immediately begin to populate the geo-database rather than to design it, and the inherent commonality between users and sites adopting the resultant geo-database(s) should facilitate exchange of data; and

(3) Documentation in the form of a book on GIS Health Applications [64].

In October 2003, this author contacted Dr. Mike Goodchild, HDM project leader, and asked him how their conceptual object model does or will relate to health indicators, e.g., those produced by NACCHO as part of their Community Health Status Assessment (CHSA) Toolbox, and those produced by WHO-AFRO as part of their consultation on environmental health hazard mapping for Africa. Goodchild replied that he thinks they should include health indicators, and that they will start investigating NACCHO's and WHO-AFRO's indicators to see if they can come up with a suitable way of including them in the HDM (Mike Goodchild, HDM project leader at the University of California at Santa Barbara, personal communication – October 2003).

Caring for population demographics and socio-economic factors

More beds, more physicians and nurses, and more procedures do not always translate into better community health. Departing from this premise, the Manitoba Centre for Health Policy (MCHP – http://www.umanitoba.ca/centres/mchp/) has developed POPULIS, a POPULation health Information System, to answer questions like: "What factors – beyond access to medical care – determine the health of populations?" and "Would healthcare money have a greater impact on health if some were spent in other areas such as education, housing, nutrition, job creation and training?" [59,65]

POPULIS reports on the health of a population, and the relationship between health and the use of healthcare services. It also relates these to socio-economic factors like education, unemployment, housing, and single parent households. These factors are key components of the Socio-Economic Risk Index (SERI), a measure developed by MCHP. The higher a region's score on this index, the higher the death rate is among its residents – death rate being a key and rather obvious indicator of a population's health status [59,65].

POPULIS has been conceived to help policy makers avoid a "knee-jerk" reaction to one set of negative indicators or to pressures generated by one-sided media stories. It builds on data that are available but somewhat underused in today's healthcare systems, e.g., vital statistics, census, and healthcare service utilisation data, to provide healthcare decision makers with the continuously updated and localised detail essential for planning and managing a more effective and efficient healthcare system [59,65].

However, POPULIS has missed a lot by not being a GIS-enabled system. The original POPULIS (based on Statistical Analysis System (SAS), a very popular statistical package) proved to be hard to maintain and not scalable, and a more recent publication by Roos in 1999 [65] has moved from describing POPULIS as a SAS front-end or software program to presenting it as a framework or approach of concepts, methods, procedures and databases. GIS are excellent integrative, multidisciplinary knowledge management tools capable of linking and spatio-temporally analysing disparate, continuously changing datasets, and as such could have helped POPULIS achieve its vision in far better ways.

Demographic shifts, e.g., the forecast rise in the number of elderly people in developed countries over the next decades, also have their impacts on healthcare services and expenditure, and must be carefully considered and modelled [66].

Integration and interoperability issues: GML and other technologies

Aggregating disparate data sources to a common geography has always been a strength of GIS. The challenge of nationwide, regional and global coordinated efforts in case of natural or man-made disasters, however, calls for aggregating the aggregates on short notice. For instance, if a disaster hits at the border of two cities or two EU countries, will their two information silos be able to work together, sharing and combining data instantaneously? Today, many systems are based on closed or proprietary interfaces and formats, and are difficult to integrate with brands and platforms in use by other organisations. Embracing open standards is the key to interoperability [67].

Interoperability allows spatial data silos distributed anywhere on the Web to be searched, located, retrieved and compiled, either by a Web GIS service provider or at an individual's desktop. The OpenGIS Consortium (OGC – http://www.opengis.org/) develops specifications to accommodate any operational differences and allow disparate Web GIS clients and desktop users to fully integrate Web accessible spatial data resources [58]. OGC's ultimate goal is to enable the "spatial Web" with products that plug and play across different processing platforms, vendor brands, networks, and programming languages [67].

Founded in 1994, OGC is an international industry consortium of 258 companies, government agencies and universities participating in a consensus process to develop publicly available geoprocessing specifications.

Geography Markup Language (GML) is the base language developed by OGC. GML is becoming the world standard for eXtensible Markup Language (XML) encoding of geographic features and geoprocessing service requests. The relevance of Web Services to spatial integration of disparate data sources is also obvious. XML encoding of geodata, using GML and Web Services http://www.opengis.org/initiatives/?iid=7 specifications and recommendations, makes it possible to display, overlay, and analyse geodata on any Web browser, even if the browser obtains views of different map layers from different remote map servers. For example, layering Web Services from two politically/administratively separate but geographically contiguous cities or regions would allow the integration of their independent data silos to answer questions about an emergency involving both (provided that issues of common semantics, data models and case definitions have been resolved) [58,67,68].
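To make the idea of XML-encoded geodata more concrete, the sketch below parses a small GML 2-style fragment using only the Python standard library. The `gml` namespace URI is the one used by GML; the `ex` namespace, the `Clinic` feature type, the clinic names and the coordinates are hypothetical and serve only as an illustration.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"
NS = {"gml": GML_NS, "ex": "http://example.org/health"}  # "ex" is hypothetical

# Hypothetical feature collection with two point features.
SAMPLE_GML = """
<ex:FeatureCollection xmlns:gml="http://www.opengis.net/gml"
                      xmlns:ex="http://example.org/health">
  <gml:featureMember>
    <ex:Clinic>
      <ex:name>Clinic A</ex:name>
      <gml:pointProperty>
        <gml:Point srsName="EPSG:27700">
          <gml:coordinates>530000,180000</gml:coordinates>
        </gml:Point>
      </gml:pointProperty>
    </ex:Clinic>
  </gml:featureMember>
  <gml:featureMember>
    <ex:Clinic>
      <ex:name>Clinic B</ex:name>
      <gml:pointProperty>
        <gml:Point srsName="EPSG:27700">
          <gml:coordinates>531500,181250</gml:coordinates>
        </gml:Point>
      </gml:pointProperty>
    </ex:Clinic>
  </gml:featureMember>
</ex:FeatureCollection>
"""


def parse_points(gml_text):
    """Return a list of (name, x, y) tuples for point features."""
    root = ET.fromstring(gml_text.strip())
    features = []
    for member in root.iter(f"{{{GML_NS}}}featureMember"):
        for feature in member:
            name = feature.findtext("ex:name", default="", namespaces=NS)
            coords = feature.findtext(
                ".//gml:Point/gml:coordinates", namespaces=NS
            )
            if coords is None:
                continue  # skip features without point geometry
            x, y = (float(v) for v in coords.strip().split(","))
            features.append((name, x, y))
    return features


for name, x, y in parse_points(SAMPLE_GML):
    print(name, x, y)
```

Because both the geometry and the attributes travel as plain XML, any client that understands the schema can read data served by a different organisation without access to that organisation's internal storage format.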

XML is also used for encoding spatial metadata (metadata are essential to aid the discovery of spatial data in a distributed environment) [58]. Standards also exist for metadata (see "Existing SDIs and SDI initiatives worldwide" below).

One of the keys to GML deployment is a companion specification, the OGC Web Feature Service (WFS). To get GML data, users query a Web server with an OGC Web Service Interface, collectively known as a Web Feature Server. The OGC interface enables standardised access to a feature store and enables users to add, update or retrieve GML data locally or across the Internet. Any data store can be used – users no longer need to care whether the underlying store is from ESRI, Oracle or IBM [69].
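A WFS query of the kind described above can be expressed as an ordinary HTTP request. The following sketch builds a GetFeature request URL using the keyword-value parameters defined for WFS 1.0.0 (SERVICE, VERSION, REQUEST, TYPENAME, BBOX, MAXFEATURES). The server address and the feature type name are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode


def build_getfeature_url(base_url, type_name, bbox=None, max_features=None):
    """Build a WFS 1.0.0 GetFeature URL (keyword-value pair encoding)."""
    params = {
        "SERVICE": "WFS",
        "VERSION": "1.0.0",
        "REQUEST": "GetFeature",
        "TYPENAME": type_name,
    }
    if bbox is not None:
        # bbox is (minx, miny, maxx, maxy) in the coordinate system of the layer
        params["BBOX"] = ",".join(str(v) for v in bbox)
    if max_features is not None:
        params["MAXFEATURES"] = str(max_features)
    return base_url + "?" + urlencode(params)


# Hypothetical server and feature type, for illustration only.
url = build_getfeature_url(
    "http://example.org/wfs",
    "ex:Clinic",
    bbox=(529000, 179000, 532000, 182000),
    max_features=50,
)
print(url)
```

The response to such a request would be a GML document, regardless of which vendor's data store sits behind the server.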

GML brings an alternative to expensive proprietary software, and an increasing number of companies have already joined the GML bandwagon. Ordnance Survey (OS), the UK's national mapping agency, has adopted GML as the only geospatial data format for its MasterMap of Great Britain http://www.ordnancesurvey.co.uk/oswebsite/products/osmastermap/. OS MasterMap boasts about 400 million geographic features in GML format. Each feature within OS MasterMap is assigned a unique 16-digit "topographic identifier" (TOID) that can be used by OS or its customers to reference any given feature in the database. This makes it much easier for users to associate other information to the spatial feature, to refer unambiguously to a particular feature, and, therefore, to share spatial information with other users [24,69].

By separating presentation from content, powerful maps can be made that offer enhanced functionality for users. GML contains map "content" only (e.g., where features are, their geometry, type and attributes), but it does not provide any information about how that map data should be displayed. This is actually a benefit because different "stylesheets" can be applied to the geographic data to make it appear however the user wishes [70,71]. By combining a selected map stylesheet with a WFS query, users are presented with a fully interactive and editable vector map that can be viewed in any Web browser [69] (Figure 4).

Figure 4
GML map making Diagram showing the main steps involved in GML (Geography Markup Language) map making. GML contains map "content" only (e.g., where features are, their geometry, type and attributes), but does not provide any information about how that ...
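The separation of content from presentation can be illustrated with a short sketch in which the same features are rendered twice using two different style definitions. This is a simplified stand-in: the style dictionaries play the role of stylesheets, the output is a small SVG string, and the feature data, style names and `render_svg` function are all hypothetical.

```python
# Map "content" only: feature type and position, no display information.
FEATURES = [
    {"type": "hospital", "x": 20, "y": 30},
    {"type": "clinic", "x": 60, "y": 45},
]

# Two interchangeable "stylesheets" (hypothetical).
STYLE_DEFAULT = {
    "hospital": {"fill": "red", "r": 6},
    "clinic": {"fill": "blue", "r": 4},
}
STYLE_HIGH_CONTRAST = {
    "hospital": {"fill": "black", "r": 8},
    "clinic": {"fill": "black", "r": 5},
}


def render_svg(features, style, width=100, height=100):
    """Render point features as SVG circles using the given style."""
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
    ]
    for feat in features:
        rule = style[feat["type"]]
        parts.append(
            f'<circle cx="{feat["x"]}" cy="{feat["y"]}" '
            f'r="{rule["r"]}" fill="{rule["fill"]}"/>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


print(render_svg(FEATURES, STYLE_DEFAULT))
print(render_svg(FEATURES, STYLE_HIGH_CONTRAST))
```

The feature list is never modified; only the style passed to the renderer changes, which is the property that allows one data source to serve many differently styled maps.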

Another key feature of GML is its ability to be "self describing" through the use of XML schema. Thanks to this feature, tools have been developed to model and load proprietary databases, e.g., Oracle Spatial databases, with geographic data supplied in GML formats [69].

GML 2 lacked some important features like metadata support and several other geographic information prerequisites [69]. The latest GML version, GML 3.0, was approved by OGC in 2003 and addresses the limitations of GML 2. GML 3 is backwards compatible with GML 2. New additions in GML 3 include support for metadata, units of measure, complex geometries, spatial and temporal reference systems (time information is essential in tracking applications like monitoring ambulance locations and in exploring the movement and growth of natural disasters), topology (the relationships between features, e.g., for use by routing applications popular in location-based services), gridded data, and default styles for feature and coverage visualisation. The new release is modular, allowing users to pick out only the schemas or schema components that apply to their work, which simplifies and minimises the size of implementations [72,73].

However, it should be noted that GML and Web Services are only part of the solution to integration and interoperability. Other health-related standards like HL7 (Health Level 7 – http://www.hl7.org/) and clinical coding schemes like SNOMED (Systematised Nomenclature of Medicine – http://www.snomed.org/), LOINC (Logical Observation Identifiers Names and Codes – http://www.loinc.org/), and ICD (International Classification of Diseases – http://www.cdc.gov/nchs/icd9.htm) are equally important. For example, RODS, the Real-time Outbreak and Disease Surveillance system, uses the HL7 message protocol to receive clinical encounter data from participating hospitals in real time [55], while the US Department of Defense Global Emerging Infections System is basing its seven syndromic surveillance categories on groups of related ICD codes http://www.geis.ha.osd.mil/GEIS/SurveillanceActivities/ESSENCE/ICD9May02.xls (see also "Proactive, real-time, GIS-enabled health and environmental surveillance services" below).
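The principle of basing syndromic categories on groups of related ICD codes can be sketched as a simple prefix lookup. The two categories and the code groupings below are hypothetical illustrations chosen for brevity; they are not the groupings actually used by the Global Emerging Infections System.

```python
from collections import Counter

# Hypothetical syndrome groupings keyed by ICD-9 code prefixes.
SYNDROME_PREFIXES = {
    "respiratory": ("486", "487", "786.2"),
    "gastrointestinal": ("009", "787.91"),
}


def classify(icd9_code):
    """Return the syndrome category for a code, or None if ungrouped."""
    code = icd9_code.strip()
    for syndrome, prefixes in SYNDROME_PREFIXES.items():
        if code.startswith(prefixes):  # str.startswith accepts a tuple
            return syndrome
    return None


def count_syndromes(codes):
    """Tally encounters per syndrome category."""
    return Counter(classify(c) or "unclassified" for c in codes)


daily_codes = ["487.1", "786.2", "787.91", "V70.0"]
print(count_syndromes(daily_codes))
```

Daily counts per category, rather than individual diagnoses, are what a surveillance system would then monitor over time and space.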

Lowe also stresses the fact that technologies like XML and SOAP (Simple Object Access Protocol – involved in Web Services) are only part of the integration issue, and points to integrating geoprocessing and databases at other levels, and the related issues of optimisers and federated databases. Industry professionals now manage very large spatial databases. Often, client programs will pull a copy of the database spatial data into their own environment to process it instead of asking the database to do the processing. If the client program request happens to involve a very large database table, the copy-and-exchange process may drag on endlessly or even fail because of overload. This same potential problem awaits users of multiple feature-streaming map services [67].

Alternatively, if the spatial processing remains within the database environment, an optimiser program common to all professional databases will internally organise a response to the query that returns results in the fastest possible time. A query from the larger integrated system goes into the database and only the results come out, taking advantage of the database optimiser, reducing processing loads on the client that generated the question, and also reducing transmission loads [67].
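The difference between the two approaches can be sketched with an in-memory SQLite database from the Python standard library. SQLite is used here only as a convenient stand-in for a professional spatial database, and the table name, column names and coordinates are hypothetical. The first function copies every row to the client before filtering; the second sends the question to the database and receives only the answer.

```python
import sqlite3


def build_db(rows):
    """Create an in-memory table of case locations (x, y)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cases (id INTEGER PRIMARY KEY, x REAL, y REAL)"
    )
    conn.executemany("INSERT INTO cases (x, y) VALUES (?, ?)", rows)
    conn.execute("CREATE INDEX idx_cases_xy ON cases (x, y)")
    return conn


def count_in_box_client_side(conn, xmin, ymin, xmax, ymax):
    """Pull every row to the client, then filter locally."""
    rows = conn.execute("SELECT x, y FROM cases").fetchall()
    return sum(
        1 for x, y in rows if xmin <= x <= xmax and ymin <= y <= ymax
    )


def count_in_box_in_database(conn, xmin, ymin, xmax, ymax):
    """Let the database filter and count; only one number comes back."""
    (n,) = conn.execute(
        "SELECT COUNT(*) FROM cases "
        "WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ?",
        (xmin, xmax, ymin, ymax),
    ).fetchone()
    return n


conn = build_db([(1.0, 1.0), (2.0, 2.0), (5.0, 5.0), (9.0, 9.0)])
print(count_in_box_client_side(conn, 0, 0, 3, 3))
print(count_in_box_in_database(conn, 0, 0, 3, 3))
```

Both functions return the same count, but with a very large table the second transfers a single value instead of the whole table and leaves the choice of access path to the database's optimiser.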

Each database vendor's optimiser works best within its own specific database environment. A potential problem arises in case one wants to optimise the use of multiple databases when a query joins data from several different databases (from different vendors) at the same time. In the same spirit as the Web Services model, agencies can keep their existing heterogeneous database technology, and use a federated database technology to unite the mix. IBM, for example, offers a federated database technology that simulates views of any other database tables in IBM DB2 database, offering a master view of all data holdings. Furthermore, the federated technology's optimiser is aware of the available processing resources in other databases and organises query responses appropriately [67,74].

Grid-based real-time distributed collaborative geoprocessing could also form the basis of a next-generation solution to data and computationally intensive geoprocessing applications that are extremely difficult to execute on conventional systems and networks [75]. Grid computing allows non-collocated computers to work on and process data together, not just communicate and exchange data between each other. It is already a reality with many ongoing projects (see for example http://mapcenter.in2p3.fr/datagrid-s/).

Automated geocoding

Automated (even "on-the-fly") geocoding is one of the most essential spatial infrastructure-building tasks [58]. Higgs and Richards mention how different geocoding methods (used to geo-reference UK postcodes) have different levels of accuracy, which could affect study results [3]. Researchers need to determine if the level of error caused by a chosen method of geocoding may affect the results of their particular project [76].
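One way of checking whether a geocoding method is accurate enough for a given project is to compare its output against a sample of records with known locations. The sketch below assumes the simplest method, a lookup of postcode centroids, and measures the positional error in metres on a projected grid. The postcodes, coordinates and function names are hypothetical and do not correspond to real addresses.

```python
import math

# Hypothetical postcode centroids (easting, northing in metres).
POSTCODE_CENTROIDS = {
    "ZZ99 1AA": (530000.0, 180000.0),
    "ZZ99 1AB": (530400.0, 180300.0),
}


def normalise(postcode):
    """Upper-case and collapse whitespace; a deliberately simple rule."""
    return " ".join(postcode.upper().split())


def geocode(postcode):
    """Return the centroid for a postcode, or None if it is unmatched."""
    return POSTCODE_CENTROIDS.get(normalise(postcode))


def positional_errors(reference_records):
    """Compare geocoded points with known true locations.

    reference_records is a list of (postcode, (true_x, true_y)).
    Returns (list of errors in metres, list of unmatched postcodes).
    """
    errors, unmatched = [], []
    for postcode, (true_x, true_y) in reference_records:
        estimate = geocode(postcode)
        if estimate is None:
            unmatched.append(postcode)
            continue
        errors.append(math.hypot(estimate[0] - true_x, estimate[1] - true_y))
    return errors, unmatched


sample = [
    (" zz99  1aa ", (530030.0, 180040.0)),
    ("ZZ99 1AB", (530400.0, 180300.0)),
    ("ZZ99 9ZZ", (531000.0, 181000.0)),
]
errors, unmatched = positional_errors(sample)
print(errors, unmatched)
```

The distribution of errors and the proportion of unmatched records can then be compared with the spatial scale of the analysis being planned.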

Also of relevance in this context are the North American Association of Central Cancer Registries GIS Handbook http://www.naaccr.org/Standards/GIS Handbook PDF 6-3-03.pdf, which discusses (in its second section) the importance of address geocoding for the spatial analysis of cancer data, and ProADDRESS, an ArcGIS extension that has been made available by ESRI UK for geocoding UK addresses and/or postcodes http://www.esriuk.com/products/ProAddress_products.asp?pid=55.

Automated conflation of geospatial databases

Conflation is the ability to precisely geo-reference variant data layers compiled into one view. This can be crucial in emergency situations such as terrorist and bioterrorist attacks. The need currently exists for the development of automated conflation techniques transparent to the user. Croner gives the example of New York City where lack of automated conflation methods following the fall 2001 World Trade Centre attack resulted in time-consuming problems for emergency response teams. New York City is now building automated conflation capability by modifying all city planning spatial databases to include standardised "hooks" for matching and seamless linkage [58].
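The role of standardised "hooks" can be sketched as a key-based linkage between two layers. Full conflation also involves geometric matching, which is not shown here; this fragment covers only the linkage step, and the layer contents, the `hook_id` field and the `conflate` function are hypothetical.

```python
def conflate(layer_a, layer_b, key="hook_id"):
    """Link features from two layers that share the same hook value.

    Returns (merged features, hook values in layer_a with no match).
    Attributes from layer_a take precedence on name clashes.
    """
    index_b = {feature[key]: feature for feature in layer_b}
    merged, unmatched = [], []
    for feature in layer_a:
        other = index_b.get(feature[key])
        if other is None:
            unmatched.append(feature[key])
        else:
            merged.append({**other, **feature})
    return merged, unmatched


# Hypothetical layers maintained by two different agencies.
buildings = [
    {"hook_id": "B001", "floors": 12},
    {"hook_id": "B002", "floors": 3},
    {"hook_id": "B003", "floors": 40},
]
utilities = [
    {"hook_id": "B001", "gas_shutoff": "north wall"},
    {"hook_id": "B003", "gas_shutoff": "basement"},
]
merged, unmatched = conflate(buildings, utilities)
print(merged)
print(unmatched)
```

Reporting the unmatched identifiers explicitly matters in an emergency setting, since a silent gap in a combined view could be mistaken for an absence of features.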

Adequate telecommunications infrastructure and bandwidth for spatial data transmission

For public health, a variety of rapidly developing emergency-related events, including floods, fires, chemical spills and earthquakes, necessitate timely Web delivery of large spatial databases for responsive disaster intervention and control. Bandwidth is not only a problem of developing countries, but of developed ones as well. Again, in the emergency response to the fall 2001 terrorist attack, lack of bandwidth in some areas of New York City resulted in delays in providing processed and urgently needed data for the Emergency Mapping and Data Centre (EMDC). Because of low bandwidth Internet connections, large data files had to be written to CD-ROM and driven by state Police twice daily for delivery to the agencies that needed them. Bandwidth is a key component of the transmission process of spatial data and is rapidly increasing in developed countries, promising improved spatial data transmission speeds in the near future [58].

Seamless integration into routine workflows of tools that are easy-to-use by mainstream public health practitioners

Richards et al call for GIS technology to be linked with community health planning tools through data entry forms and automated procedures (e.g., automated geocoding for vital statistics data) to help public health practitioners map and plan interventions at community level. GIS software tools are needed that are specifically custom-designed for use in public health, especially by organisations with limited staff and resources. Richards et al anticipate that GIS technology may one day become embedded and so deeply "buried" in public health practice to the extent that it is invisible to workers. Future health GIS applications will "know" which data silos are needed and where they are located. After loading the appropriate data and performing relevant analyses, they will offer alternative courses of action ranging from informing other people in the public health system to issuing health advisories [7].

It is noteworthy that Epi Info Version 3 developed by the CDC in the US already fulfils part of this vision. Epi Info Version 3 has been released as public domain software for Microsoft Windows, and is available free of charge on the Internet for anyone to download http://www.cdc.gov/epiinfo/. The program has some GIS functionality allowing public health practitioners to import, utilise, and display map boundary files and data, but there is still room for further improvements. The ultimate system will be one that is fault-tolerant and capable of analysing and presenting assembled data in ways that facilitate only appropriate interpretations of integrated data. This can be achieved by using some form of user friendly, "intelligent", goal-oriented health GIS wizards (based on robust statistical methods where appropriate), so that only valid results and maps are produced, even when users attempt to select inappropriate settings or datasets for a particular analysis. To maximise their utility, these wizards should also be fully integrated into everyday public health workflows and decision-making process. Such seamless integration would let users focus and spend most of their time on what they want to achieve rather than on learning and overcoming the limitations of the tools they are supposed to use to achieve their goals.

User interface accessibility requirements

In the US, Internet-based health GIS services must ensure Section 508 compliance with the Rehabilitation Act Amendments http://www.usdoj.gov/crt/508/508law.html and http://www.section508.gov/ to make complex graphical and mapping files accessible to visually impaired users [58] (see also http://www.esri.com/software/section508/index.html). The UK/EU equivalents of these accessibility requirements can be consulted online [77,78].

The Web interactive cancer mortality maps developed by the National Cancer Institute (NCI) and the National Institutes of Health (NIH) in the US are a good example of Section 508-compliant GIS services http://www3.cancer.gov/atlasplus/index.html. These maps offer users choices about type of cancer, age, race, sex, geography (e.g., state or county), and selection of class intervals, colour shading and scaling. Charts and graphs associated with the maps translate graphical data into a comparison form accessible by screen readers and are thus compliant with Section 508 for those with visual or manual impairment [58] (Figure 5).

Figure 5
Screenshot of Section 508-compliant NCI cancer mortality maps and graphs Screenshot of the customisable cancer mortality maps and graphs developed by the US National Cancer Institute (NCI – http://www3.cancer.gov/atlasplus/index.html). These maps ...

Also of relevance in this context is Cynthia Brewer's ColorBrewer http://www.personal.psu.edu/faculty/c/a/cab38/ColorBrewerBeta.html, a free-to-use online tool available from Pennsylvania State University Web site and designed to help people select good colour schemes for maps and other graphics.

(For other examples of interactive Web maps of health conditions, the reader is referred to CDC's Oral Health Maps http://apps.nccd.cdc.gov/gis/doh/, Heart Disease and Stroke Maps http://www.cdc.gov/cvh/maps/statemaps.htm, and Atlas of Reproductive Health http://www.cdc.gov/reproductivehealth/GISAtlas/.)

Adequate protection measures against cyber terrorism

As the value of our information and computing infrastructure increases, so too does the value of disrupting it. Critical information infrastructures are potentially vulnerable to cyber terrorist attacks. A cyber terrorist attack could also be used in support of a physical attack to cause further confusion and possible delays in proper response with greater losses. Securing any spatial health information infrastructure we build against such attacks is thus extremely important. Kevin Coleman suggests several measures that can be taken for thwarting cyber terrorism; interested readers are urged to refer to his article [79].

Problematic issues and solutions

This section discusses some of the more problematic issues associated with the implementation of a spatial health information infrastructure and real-time public health GIS services. Individual privacy, national security, and data confidentiality issues, as well as a range of data/analysis errors and problems are covered below, together with an array of solutions (currently available or under development) that address them.

Individual privacy, national security, and data confidentiality issues

In public health worldwide, any public identification of an individual's health status and residence, regardless of level of contagion or risk, is usually prohibited with very few exceptions, e.g., Megan's Law in the US, which allows the release of residential information on registered child sex offenders to the public by local government [58]. SARS (Severe Acute Respiratory Syndrome) mapping in Hong Kong using disaggregate case data at individual building level in near real time was another noticeable exception to this well-established public health confidentiality rule, and also a unique and rare GIS opportunity that resulted in some very comprehensive public Internet mapping services (see "Real-time/near-real-time GIS for epidemics management" below) [80].

Spatial data confidentiality is a complex issue. Even if a single database may appear to have effective confidentiality safeguards, when several databases are linked within GIS, the "sum" may be less well protected than the "parts". A false identification may be just as damaging to an individual as a correct identification that is not kept confidential [7].

On the other hand, confidentiality constraints often preclude the release of disaggregate data about individuals, which limits the types and accuracy of the results of the analyses that could be done [81]. Individual agencies holding micro-data (small population/individual health and environmental data) often impose restrictions on the level of geography that can be reported. In the US, for example, the National Centre for Health Statistics (NCHS – http://www.cdc.gov/nchs/) requires that for all micro-data that are released outside of NCHS, geographic identification must be deleted for all areas below the State level that contain fewer than some predetermined number of inhabitants. Traditional ecological analysis based on choropleth mapping and the analysis of aggregate data for administrative areas has been heavily criticised. It is increasingly becoming clear in the field of public health that individual-level health information aggregated to pre-existing political or other administrative areas to protect individual privacy often destroys information needed for geographical analyses, making it impossible to address many important public health concerns, e.g., accident risk of particular environments, hazards of living close to hazardous waste sites, exposure risk from lead associated with urban highways, etc. Such concerns can only be addressed using micro-data. The lack of spatially-disaggregate data on healthcare utilisation and clinical activity also limits the types and power of healthcare delivery studies that can be carried out [13,28,82,83].

Using aggregated data instead of address-level data (when the latter is required) produces what Jacquez calls "spatial uncertainty". Moreover, using area centroids instead of exact locations can yield misleading results [12]. According to Armstrong et al, when data are spatially aggregated to large areas, the ability of researchers to detect disease clusters or to investigate suspected relationships between environmental exposures and disease events is affected in four ways: (1) absolute and relative locations within the geographical extent of each area are unobservable making it impossible to perform tests of clustering, except for those designed to operate specifically on data aggregated to areas; (2) the effect of the geographic scale of the aggregation with respect to the geographic scale of the clusters means that the aggregation level used in an analysis limits the size of clusters that could be detected; (3) the shape and placement of aggregation areas in relation to the real-world distribution of the disease or clusters under study, e.g., when a disease cluster straddles two or more aggregation areas, may result in ambiguous or negative results; and (4) accurate analyses are only possible when health data are spatially encoded to the boundaries of areas with common levels of environmental exposure, which is usually not the case since exposure assessment data are generally collected for different areas than health and demographic data [83].

Fortunately, solutions exist that can preserve data confidentiality while still enabling fine-level analyses and reliable results. These solutions involve (1) the use of statistical and epidemiological methods to mask the geographic location of data in a way that can still permit meaningful analysis, e.g., special types of spatial and temporal aggregation of data; (2) the creation of secure (networked) environments with limited and multiple levels of access (to confidential data) in which public health researchers can be carefully monitored to ensure protection of individual and household confidentiality; and (3) the development, publication and strict enforcement of appropriate, unambiguous policies and regulations [7,58]. The three solution groups are discussed below.

(1) Statistical and epidemiological methods: Armstrong et al describe different promising types of geographical masks to encode the geography of health records. These masks not only preserve the confidentiality of individual health records, but also preserve, to the maximum degree possible, the geographic properties of the data, thus permitting the investigation of questions that can be validly answered only with some (adequate) knowledge about the location of health events [83].

The geographic coordinates of data collected at discrete locations can be subjected to a family of affine point transformations that move these locations deterministically to a new set of locations. Another technique is random perturbation, in which each point is displaced (within the range of a constant maximum magnitude of displacement) by a randomly determined amount, and in a randomly determined direction, specific to its original location. A third class of geographic masks is aggregation. Areal aggregation involves enumerating the total that exists within a region. Point aggregation uses a single, surrogate location to represent the location of several individual-level events. In the latter case, regions could be represented by their geographic centroids, or surrogate locations could be computed that are optimised regarding some defined relationship to the original locations (location-allocation methods). Other point aggregation methods include microaggregation and blurring. It is also possible to aggregate for non-conterminous "regions" of interest like releasing health data for all areas within a given distance of a specified hazard, e.g., all children's accidents within 20 metres of stop signs. Another possible approach to limiting disclosure is to remove all explicit geographic identifiers from the health record and replace them with contextual information of specific interest to the data user [83].
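Three of the masks described above can be sketched in a few lines of Python: an affine transformation (scaling, rotation and translation), random perturbation within a constant maximum displacement, and point aggregation to a centroid. This is a minimal sketch and not the implementation of Armstrong et al; the function names and sample coordinates are hypothetical, and drawing the displacement distance from a uniform distribution is an assumption made here for simplicity.

```python
import math
import random


def affine_mask(points, scale=1.0, angle_deg=0.0, dx=0.0, dy=0.0):
    """Deterministically scale, rotate (about the origin) and translate."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    return [
        (scale * (x * cos_a - y * sin_a) + dx,
         scale * (x * sin_a + y * cos_a) + dy)
        for x, y in points
    ]


def random_perturbation_mask(points, max_displacement, rng=None):
    """Move each point a random distance (up to the maximum) in a random
    direction. Distance is drawn uniformly; this is an assumption."""
    if rng is None:
        rng = random.Random()
    masked = []
    for x, y in points:
        r = rng.uniform(0.0, max_displacement)
        theta = rng.uniform(0.0, 2.0 * math.pi)
        masked.append((x + r * math.cos(theta), y + r * math.sin(theta)))
    return masked


def centroid_mask(points):
    """Point aggregation: one surrogate location for several events."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


# Hypothetical case locations (metres on a projected grid).
cases = [(100.0, 200.0), (300.0, 400.0)]
print(affine_mask(cases, dx=1000.0, dy=-500.0))
print(random_perturbation_mask(cases, 50.0, rng=random.Random(0)))
print(centroid_mask(cases))
```

A seeded generator is used in the example only so that the run is repeatable; in practice the seed and the transformation parameters would themselves need to be kept confidential, since knowledge of them could allow the original locations to be recovered.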

The best approach will depend on the purpose of the data user as well as the degree of disclosure risk that the data custodian wishes to tolerate. Preliminary research suggests that random perturbation of data, up to some limit, is superior to affine and aggregation masks for many analytical purposes [83].

Areal aggregation is perhaps the most commonly adopted approach among those suggested by Armstrong et al [83]. Health organisations are always looking for finer levels of boundary geography to aggregate their data to. The US South Carolina Department of Health and Environmental Control (SCDHEC) used to aggregate spatial health information at county level. While county-level data provide a wealth of information, information at this macro scale does not allow local health officials to adequately identify, analyse, and monitor health problems at the community level. Laymon describes SCDHEC's more recent approach to managing geo-referenced vital health records by geocoding them, then aggregating them at census tract level instead of county level. A US census tract is a small statistical subdivision of a county designed to be relatively homogenous with respect to demographics, socio-economic characteristics, and living conditions, and to contain between 2,500 and 8,000 residents. SCDHEC chose census tracts because they contain useful socio-economic data that could be combined with the aggregated vital records. Census tracts were also chosen because these geographic boundaries are updated once every decade (stable) [84].

The resultant system, SCDHEC's Vital Health and Census Data Integration System, automates geocoding and aggregation of vital records data (births and deaths), while a Health Data Query System provides easy access to the aggregated data. By joining aggregate vital records health data with existing socio-economic census data, the system provides a good tool for developing surveillance and intervention strategies while preserving residents' confidentiality. The point data resulting from the geocoding process (before aggregation) is treated with all the confidentiality of paper certificate data, and stored for future use in very high-resolution studies [84].

In England, Wales and Northern Ireland, Output Areas (OAs) were introduced at the 2001 Census. OAs form a new level of boundary geography for reporting purposes. From April 2004 they will also be used as the finest level of reporting geography for Neighbourhood Statistics http://www.neighbourhood.statistics.gov.uk/, which includes crime, education, economic deprivation, work deprivation, and health. Output Areas are built from postcodes, and have been designed for homogeneity and to be static. Each OA contains approximately 125 households. Due to their smaller size (compared to Enumeration Districts), OAs allow for a finer resolution of data analysis while still ensuring data confidentiality [85].

Armstrong et al also mention another possible solution to data confidentiality problems based on software agents. Software agents are emerging as an important computing paradigm. If an agent were designed to support the analysis of public health data, users would not be required to have access to confidential health records. Rather, they would submit a request to an intelligent analysis agent that would assess the request, and if found appropriate, would complete the analysis and return a result to the data user without exposing any individual-level health data [83].

It is noteworthy that the Health System Resident Component (HSRC), part of RODS, the Real-time Outbreak and Disease Surveillance system (described below), is based on similar concepts. HSRC is located within the firewall of a health system, and its purpose is to provide RODS with additional public health surveillance functions that would not be possible if it were located outside of the firewall due to restrictions on the release of identifiable clinical data. It functions as a case detector in a distributed public health surveillance scheme linking laboratory and radiology data to increase the specificity of case detection. HSRC removes identifiable information before transmitting any data to RODS (outside the health system's firewall) [55].

(2) Secure networks and multiple levels of access: Croner describes a solution to data confidentiality problems consisting of multiple levels of access to data classified according to its nature, ranging from confidential/protected data to public/open access data according to user credentials. Access to confidential data can be accommodated for qualified users in secure Intranet or Internet settings [58].

There are many specifications and standards involved in designing and implementing secure networks, e.g., BS 7799/ISO 17799 http://www.riskserver.co.uk/bs7799/whatisit.htm. EPAL, the Enterprise Privacy Authorisation Language http://www.zurich.ibm.com/security/enterprise-privacy/epal/, and XACML, the eXtensible Access Control Markup Language http://sunxacml.sourceforge.net/, are also very relevant developments in this context.

It is noteworthy that Digital Rights Management (DRM), conventionally used for protecting Internet music and films, is now available for other types of digital documents. The latest Microsoft Windows Server 2003 Rights Management Services (RMS) technology offers the possibility to create multiple levels of data detail categorised according to sensitivity, and match them to multiple levels of access according to user credentials (see http://www.microsoft.com/rm). There are also other DRM solution providers today besides Microsoft, e.g., Macrovision http://www.macrovision.com/.

Davenhall introduces the concept of Private Secured Geography Networks (PSGN) built on geoservers, and capable of analysing geographic queries and distributing information with geographic relevance. Each participating organisation in a community health surveillance system (CHSS – see below) can run its own PSGN and geoserver behind its firewall, and directly control information content and access by internal and external entities and maintain the confidentiality of its data. While each participating organisation maintains its data securely, perhaps generating/holding different classes of data/levels of detail (e.g., anonymised vs. personal identifiable information) at a variety of security levels, all data can be automatically and quickly integrated when required, e.g., in the event of outbreak or epidemic, and released to only those who have proper access authorisation [2].

The US NCI GIS for Health (GIS-H) developed as part of the Long Island Breast Cancer Study Project (LIBCSP) provides a good example of a successful implementation of multiple levels of data access according to user credentials. A "Researchers" area of the LIBCSP Web site provides applications necessary for access and use of non-public resources that are subject to privacy and licensing restrictions http://www.healthgis-li.com/researchers/researchers.htm. On the other hand, data, information, maps and software that have been approved for public dissemination are available to anyone.

Similarly, the Washington State Health Department's online developmental Epidemiologic Query and Mapping System (EpiQMS – http://epiqms.doh.wa.gov/) incorporates three levels of security in order to accommodate citizens, public health and medical practitioners, and public health agency investigators access to state and regional health data. This security model allows different levels of access to the data depending on the likelihood that an individual's privacy could be compromised [58].
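
The general idea of matching levels of data detail to user credentials can be sketched as follows. This is a hypothetical illustration only and does not describe how EpiQMS, GIS-H or any PSGN is actually implemented; the three access levels, the field names and the sensitivity assigned to each field are assumptions.

```python
from enum import IntEnum


class Access(IntEnum):
    """Hypothetical access levels, ordered from least to most privileged."""
    PUBLIC = 1
    PRACTITIONER = 2
    INVESTIGATOR = 3


# Assumed minimum access level needed to see each field of a health record.
FIELD_LEVEL = {
    "region": Access.PUBLIC,
    "condition": Access.PUBLIC,
    "census_tract": Access.PRACTITIONER,
    "age_band": Access.PRACTITIONER,
    "postcode": Access.INVESTIGATOR,
    "date_of_birth": Access.INVESTIGATOR,
}


def view_for(record, user_level):
    """Return only the fields the user is cleared to see.

    Fields missing from FIELD_LEVEL are treated as the most sensitive,
    so an unclassified field is never released by accident.
    """
    return {
        field: value
        for field, value in record.items()
        if FIELD_LEVEL.get(field, Access.INVESTIGATOR) <= user_level
    }


record = {
    "region": "North West",
    "condition": "asthma",
    "census_tract": "T-0042",
    "age_band": "5-9",
    "postcode": "XX1 1XX",
    "date_of_birth": "1996-03-14",
}
public_view = view_for(record, Access.PUBLIC)
```

Defaulting unclassified fields to the most restrictive level is a deliberate design choice in this sketch: the likelihood of compromising an individual's privacy, rather than convenience, decides what is released.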

(3) Legislations, policies, and regulations: It is also important for public health agencies to develop unambiguous, standardised confidentiality guidelines and security rules for their database holdings, and Web sharing and use of their spatial data. Lack of sufficient or clear laws regarding privacy, and variations in protections of health data across different organisations and agencies may preclude or delay data sharing across regional lines and organisational boundaries, or involve unacceptable risks to the privacy of data that are transmitted [13,58].

Confidentiality guidelines and accessibility restrictions to the public and research community should be Web documented in searchable metadata that describe essential elements of the database. Through metadata all public health agencies can inform others of their spatial data holdings and any limitations associated with their use [58].

In the UK, the implications of recent legislation, such as the 1998 Data Protection Act [86], which came into force in March 2000, on the use of geocoded patient information in medical research are somewhat unclear and need to be closely examined. Potential changes in the provision of patient data to cancer registries such as the ethical requirement to obtain patient consent prior to information being passed to registries could, for example, have major implications for researchers examining spatial patterns in cancer incidence [28].

On the positive side, Section 60 of the Health and Social Care Act 2001, which applies to England and Wales, allows the Secretary of State to make regulations enabling disclosure of information for specified purposes, without consent, but without breaching common law requirements of confidentiality. This covers the processing of confidential patient information that relates to the present or past geographical locations of patients (including where necessary information from which patients may be identified) which is required for medical research into the locations at which disease or other medical conditions may occur [87].

Other legislation documents to be considered in a UK/European Union (EU) context include EU Data Protection Directives and the related CEN/TC 251 (European Standardisation of Health Informatics) guidance document [88,89]. The UK General Medical Council, the Department of Health Information Policy Unit, the NHS Information Authority, and the Department for Constitutional Affairs also publish important documents covering confidentiality, data protection and data sharing issues [90-94].

In the US, privacy rules in the Health Insurance Portability and Accountability Act (HIPAA) of 1996 and its new DHHS (Department of Health and Human Services – http://www.hhs.gov/ocr/hipaa/) privacy provisions contain extensive exemptions if the identification information is used for treatment, payment, research, or national priority activities that are carried out in the interest of public health and safety [58].

On another level, following the September 2001 events in the US, many federal and local spatial databases, e.g., "critical infrastructure" spatial data, were assessed by their holding agencies as a potential liability to national security and withdrawn from the Internet or public dissemination. The current concern is to find an appropriate balance between public access to spatial information and protection of information considered a priority for national security (this is another important aspect of data security and confidentiality) [58].

Maps that lie: the "gee whiz" effect and visual bias

GIS integration of complex data into visually easy-to-understand pictures can sometimes be a setup for misunderstanding and misuse. Richards et al call for sound epidemiological principles and methods to provide the foundation for the data analyses to be displayed on GIS maps. To avoid drawing false conclusions from maps, GIS users need to understand and apply epidemiological principles and methods in formulating study questions, testing hypotheses about cause-and-effect relationships, and critically evaluating how the chosen dataset(s) and GIS method(s), data quality, confounding factors, and bias may influence the interpretation of results, and hence any decisions based on them [7].

According to Monmonier, it is not just easy but also essential to lie with maps. The cartographer's paradox is that to avoid hiding critical information in a fog of detail, the map must offer a selective, incomplete view of reality [95]. Public health practitioners need to be alert for "lies" that can range from legitimate and appropriate suppression of some details selectively to help the user focus on what needs to be seen to more serious distortions in which the visual image suggests conclusions that would not be supported by careful epidemiological analysis. For example, when some geographic units of analysis have small denominators, disease rates computed for these areas may appear extremely high if any cases have occurred in these areas (the "small numbers" problem). When the rates for these geographic locations are displayed on a map, readers may incorrectly conclude that these are "hot spots", high priority locations for targeted interventions. More appropriately, these areas should be labelled to indicate that rates are statistically unstable due to small numbers and therefore not shown [7].
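
The "small numbers" problem can be made concrete with a short Python sketch. The threshold of 20 cases, the Poisson-based relative standard error and the example figures are illustrative assumptions, not values taken from Richards et al.

```python
import math


def rate_table(areas, per=100_000, min_cases=20):
    """Compute a crude rate per `per` population for each area and flag
    rates that rest on too few cases to be statistically stable.

    `areas` is a list of (name, cases, population) tuples. The relative
    standard error assumes the case count follows a Poisson distribution.
    """
    rows = []
    for name, cases, population in areas:
        rate = cases * per / population
        rse = 1.0 / math.sqrt(cases) if cases > 0 else None
        rows.append({
            "area": name,
            "cases": cases,
            "rate": rate,
            "rse": rse,
            "unstable": cases < min_cases,
        })
    return rows


def map_label(row):
    """Text to display on the map: the rate, or a warning instead of it."""
    if row["unstable"]:
        return "unstable (small numbers), rate not shown"
    return f"{row['rate']:.1f} per 100,000"


# Hypothetical figures: 2 cases among 400 residents against 150 among 60,000.
rows = rate_table([("Hamlet", 2, 400), ("Town", 150, 60_000)])
labels = [map_label(r) for r in rows]
```

In this made-up example the hamlet's crude rate (500 per 100,000) is twice the town's (250 per 100,000), yet it rests on only two cases, so the sketch labels it as unstable rather than mapping it as a "hot spot".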

Along similar lines, in 1998, Jacquez defined the "gee whiz" effect as "the formulation of hypotheses to explain an apparent (visual) pattern whose existence has not been confirmed", and stressed the importance that appropriate and robust statistical methods be used to support the thematic data layers being displayed and analysed in order to avoid the consequences of visual bias in GIS processes, in which spatial patterns might seem to appear where none actually exists, and inferences might sometimes be made on invalid assumptions [12].

In a personal e-mail communication with Dr. Geoffrey Jacquez five years after his original definition of the "gee whiz" effect, he affirmed that he still stands by the idea that pattern recognition (both spatial and spatio-temporal) requires objective approaches that transcend the subjectivity of the human eye. According to Jacquez, these approaches play a role in both confirmatory and exploratory analyses. He continues: "Especially within the exploratory framework, one must be able to discriminate true patterns from apparent patterns that could be explained by chance. In the absence of such capability, both confirmatory and exploratory analyses spin their wheels because they lack an objective mechanism for identifying and quantifying relationships in the data." (Geoffrey M. Jacquez, personal communication – July 2003)
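
One widely used objective check of this kind is a permutation test on a spatial autocorrelation statistic such as Moran's I. The sketch below is a generic textbook formulation, not the method used in ClusterSeer or NetSurv; the chain of eight areas, the binary adjacency weights and the data values are assumptions chosen to keep the example small.

```python
import random


def morans_i(values, weights):
    """Moran's I for `values` given a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(weights[i][j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    if s0 == 0 or den == 0:
        raise ValueError("weights must be non-zero and values must vary")
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    return (n / s0) * (num / den)


def permutation_test(values, weights, n_permutations=999, rng=None):
    """Return (observed I, one-sided pseudo p-value).

    The values are shuffled across areas to simulate "no spatial pattern";
    the p-value is the share of shuffles whose I is at least as large as
    the observed one (counting the observed arrangement itself).
    """
    rng = rng or random.Random()
    observed = morans_i(values, weights)
    shuffled = list(values)
    at_least = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if morans_i(shuffled, weights) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_permutations + 1)


def chain_weights(n):
    """Binary adjacency for n areas arranged in a line (an assumption)."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


w = chain_weights(8)
clustered = [10, 10, 10, 10, 1, 1, 1, 1]   # high rates grouped together
alternating = [10, 1, 10, 1, 10, 1, 10, 1]  # high and low rates interleaved
i_obs, p_value = permutation_test(clustered, w, rng=random.Random(42))
```

A small pseudo p-value indicates that the observed clustering would rarely arise by chance under random relabelling of areas, which is the kind of evidence the eye alone cannot supply.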

Jacquez also mentioned in his e-mail to this author that they have several initiatives underway in the development of surveillance systems at BioMedware http://www.biomedware.com/, the company he leads. One of these projects, NetSurv, will link diverse databases in real time, will support dynamic visualisation (linked windows and cartographic and statistical brushing), and will include surveillance and pattern recognition statistics for separating true signal from noise. This will enable prospective analysis of incoming health data (the continuous monitoring of health data, combining historical data with new information as it is received). The ultimate aim of the NetSurv project is to provide decision support and monitoring tools that will enhance existing disease surveillance systems and support timely analysis, policy formulation, and public health actions. An early version of the architecture, but one that is linked only to static cancer mortality outcomes, has been developed for the US NCI and may be downloaded from https://www.terraseer.com/atlasviewer.html (Figure 6). They also have Rogerson's spatial pattern surveillance method in their ClusterSeer software (see also "Testing for spatio-temporal disease clustering" above). Help for the method, as well as for the entire ClusterSeer software project, is available online at https://www.terraseer.com/csr/clusterseer_help.html (Geoffrey M. Jacquez, personal communication – July 2003).

Figure 6
Screenshot of TerraSeer's Cancer Atlas Viewer showing a US states diverging gradient map of Z-score standardised version of "R(ALL, ACC, BF, 7094)" numeric dataset, where R = the mortality rate per 100,000 ...

Three other software tools/visualisation projects are worth mentioning in this context. The first of these tools is GeoDa http://agec221.agecon.uiuc.edu/csiss/geoda.html. GeoDa was designed by Luc Anselin and co-workers at the University of Illinois at Urbana-Champaign to implement techniques for exploratory spatial data analysis on lattice data (points and polygons). It is intended to provide a user friendly graphical interface to methods of descriptive spatial data analysis, such as spatial autocorrelation statistics and indicators of spatial outliers. The second tool is GeoVISTA Studio http://www.geovistastudio.psu.edu/jsp/index.jsp. Developed at Pennsylvania State University Department of Geography, GeoVISTA Studio is a programming-free, open software development environment designed for geospatial data. It allows users to quickly build applications for geocomputation and geographic visualisation, and is freely distributed over the Web at no cost to academic and non-commercial users. The third project is Daniel Carr's micromap plots on the NCI/CDC State Cancer Profiles Web site http://statecancerprofiles.cancer.gov/micromaps/.

In the future, it may become possible to incorporate BioMedware's disease trend monitoring techniques and novel visualisation approaches that are currently being developed within the NetSurv project (as well as tools like GeoDa) as analytic components in other surveillance systems. However, early NetSurv pilot results showed that its Web-based interface was difficult, slow, and not user friendly [96]. Though we definitely need rigorous, "objective approaches that transcend the subjectivity of the human eye", we also equally need easy and reliable tools suitable for use by non-expert statisticians (mainstream public health practitioners and informaticians).

The ecologic fallacy and the atomistic fallacy

Users, including policy makers, may be tempted to infer causation from correlation and to make inferences about individuals from population data (the ecologic fallacy). While conclusions based on an analysis at the aggregate level are likely to be limited by aggregation bias and by the ecologic fallacy (failing to identify the true nature of cause-effect relationships at the level of the individual), conclusions based on analysis at the individual level may also be limited by the atomistic fallacy (failing to consider the broader context in which individual behaviour occurs). A balanced approach is needed. GIS technology could be used to link data for an individual (individual predictors) with contextual information and ecologic predictors aggregated at a variety of geographic (community) levels, enabling the preparation of multi-level spatial models to better evaluate and distinguish biological, contextual, and ecological effects [7].
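
A small numerical illustration may help to show how the ecologic fallacy can arise. The data below are entirely made up (three areas with three individuals each, with hypothetical exposure and outcome scores) and are constructed so that the association between area averages has the opposite sign to the association among individuals within every area.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical (exposure, outcome) pairs for individuals in three areas.
AREAS = {
    "A": [(1, 3), (2, 2), (3, 1)],
    "B": [(4, 6), (5, 5), (6, 4)],
    "C": [(7, 9), (8, 8), (9, 7)],
}


def ecological_correlation(areas):
    """Correlation between area-level mean exposure and mean outcome."""
    mean_x = [sum(x for x, _ in people) / len(people) for people in areas.values()]
    mean_y = [sum(y for _, y in people) / len(people) for people in areas.values()]
    return pearson(mean_x, mean_y)


def within_area_correlations(areas):
    """Individual-level correlation computed separately inside each area."""
    return {
        name: pearson([x for x, _ in people], [y for _, y in people])
        for name, people in areas.items()
    }


aggregate_r = ecological_correlation(AREAS)
individual_r = within_area_correlations(AREAS)
```

In this constructed example the area averages are perfectly positively correlated, while within each area individuals with higher exposure have lower outcome scores. An analyst who saw only the aggregated data would draw the wrong conclusion about individuals, which is why linking individual and contextual data in multi-level models is recommended.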

Activity spaces/time geography

The potential discrepancy between the place of diagnosis and that of the exposure to environmental variables influencing the particular health outcome(s) in question must be taken into account. We need to consider the daily activity spaces of patients. Understanding the individual's time-space history can provide important (aetiological) information not only for the epidemiologist, but also for the clinician, and should be considered in order to address the effect of individuals' high mobility/activity space on any identified disease patterns, and to avoid erroneous aetiological hypotheses and conclusions. The problem is particularly acute for diseases that have a long lag or latency period. This requires the availability of disaggregated longitudinal databases containing the residential histories of patients. Clearly, complete datasets of this nature are currently rare [97].

Data problems and errors

Back in 1992, Openshaw (cited in [81]) identified the following sources of GIS data error: errors in the positioning of objects, errors in the attributes associated with objects, and errors in modelling spatial variation (e.g., by assuming spatial homogeneity between objects). Other GIS experts also include errors resulting from GIS operations on spatial data (transformation and interpolation), the effects of generalisation operations (aggregation), errors due to differences of a temporal nature, and representational errors [81].

The scale level should be appropriate for the issues being investigated in an analysis, otherwise the results will not be meaningful and may even be misleading [11]. Different diseases have patterns that are interesting at different spatial scales, and the optimum scale is the one that reveals the most interesting pattern [14].

Moreover, because accuracy is scale-dependent, users should always determine if any resultant error at the currently selected scale is acceptable for a given application. Users also need to be continually aware of the errors that could arise when map data compiled for different purposes, and frequently, at different scales are merged into one application [81].

Oppong describes variations between different locations in data collection methods and standards, in the recorded items, particularly data on patient residence, and in diagnostic standards and case definitions. Such variations are often encountered in cancer research (to give an example), and can result in serious problems when pooling data from different locations for a common analysis [81].

Oppong gives examples of GIS data problems in HIV/AIDS research in developing countries. National data reported to WHO is problematic because of differences between countries in adequacy of testing facilities and reporting practices, varying definitions of what constitutes a case of AIDS, and political distortions of data. Besides, due to the location of biomedical facilities in urban areas, available data tend to over-represent these areas at the expense of rural areas. Paucity of biomedical facilities in rural areas usually means many health conditions there pass unreported [81].

Since it is impossible (in practice) to perform error-free spatial analysis, users must develop increased sensitivity to and awareness of the various types of data errors and uncertainty, as well as competency in techniques for recognising and reducing their negative impact on conclusions drawn from spatial analysis. For example, the MARA (Mapping Malaria Risk in Africa – http://www.mara.org.za/) project resorted to establishing a malaria risk atlas instead of an incidence atlas due to the lack of reliable data for determining the level of malaria incidence and mortality in African countries [81].

Existing SDIs and SDI initiatives worldwide

Spatial data are strategically important to decision makers at all levels and thus should be an indispensable part of the basic infrastructure in the individual country, in line with roads, hospitals, schools, etc. By infrastructure we mean the basic structures and facilities necessary for a country or an organisation to function efficiently. An infrastructure has the following characteristics: (1) users are aware that "somebody" maintains the infrastructure, but do not regard this maintainer as an owner; (2) users expect it to always be available, even if there is a fee or other requirement for its use; (3) the delivery or provision of the service is largely standardised, and as a result of this, users take it for granted because of the ease of use; and (4) an infrastructure is expensive to develop and maintain, and the returns from the investment are usually long term [6].

Distributed geolibraries – a different hat for the same concepts

The vision and concepts behind a spatial information infrastructure are sometimes described in the literature under different hats. Back in 1999, the US National Research Council published a document on distributed geolibraries. A spatial information infrastructure and a distributed geolibrary share closely related concepts, and face a similar set of problems [98].

Distributed geolibraries are modelled on the operations of a traditional library, updated to a digital networked world (e.g., the Web), and focused on the supply of information in response to a geographically defined need (using GIS and related technologies). The contents of a distributed geolibrary are not limited to information normally associated with location maps or images of the Earth's surface, but also include any other information that can be associated with a geographic location. A geolibrary is distributed if its users, services, metadata, and information assets can be integrated among many distinct locations [98].

A distributed geolibrary would support collaborative work, such as multidisciplinary research by teams, and decision-making by groups of stakeholders. It should be also possible to access a distributed geolibrary right in the field where information is needed most (especially in emergency management) using portable systems and wireless communications. Moreover, specialised sensors may be brought to the field, supplying new data that will have to be integrated with existing data in the library [98].

The success of a distributed geolibrary is largely dependent on the ability to integrate information available about a place. This in turn depends on finding appropriate solutions to problems of indexing, visualisation, scaling, automated search and abstracting, formats and standards, and data conflation. In addition, there are a variety of social and organisational issues, privacy concerns and intellectual property rights that also need to be catered for [98].

To demonstrate how important the concept of geolibraries is, reference [98] provides some very realistic example scenarios (see http://www.nap.edu/html/geolibraries/ch1.html), including one about a public health researcher who wants to analyse the complex associations of environment and disease in a particular urban area, and another one dealing with a chemical spill emergency response. Information resources through distributed geolibraries could greatly assist rapid response to such emergencies and longer-term efforts aimed at prevention and mitigation.

US NSDI

Geo-referenced data form a significant part of the US National Information Infrastructure. In 1994, former US President Clinton issued an Executive Order covering: (1) the establishment of an electronic national spatial data clearinghouse; (2) standardisation of metadata; (3) improvement of public access to spatial data; and (4) the requirement of federal agencies to use the clearinghouse to locate existing data before expending tax dollars to collect more data [13].

In the same year (1994), the US Federal Geographic Data Committee (FGDC – http://www.fgdc.gov/) defined a National Spatial Data Infrastructure (NSDI) as "the technology, policies, standards, and human resources necessary to acquire, process, store, distribute, and improve utilisation of geospatial data" [99]. FGDC, which has lead responsibility for the orderly deployment of the US NSDI, works to coordinate federal activities, in conjunction with state and local government and the private sector, regarding the collection, documentation, and dissemination of spatial data. This interagency committee is responsible for coordinating the development of standards and partnerships for data description and exchange throughout the US government [13,58].

Today more than any time before, the US federal government is fully supporting the premise that digital spatial data constitute a federal capital asset. The return on spatial investment can be highly cost effective through the one-time development of spatial data, and the subsequent sharing of that data among many users, at all levels of government and all sectors, over time ("build once, use many times"). One of the most recent NSDI-related US e-government initiatives, Geospatial One-Stop, is intended to revolutionise electronic government by providing a geographic component for use in all Internet-based government activities across all government levels. This will enable immediate discovery and "one-stop" access to spatial metadata and data via a single Internet location/interface for different kinds of analyses and improved decision-making, and will eliminate the redundancies of costs associated with (duplicate efforts of) spatial data collection, conversion between formats, production and dissemination [58,100].

To achieve its vision, the Geospatial One-Stop initiative has launched Geodata.gov http://www.geodata.gov/, a Web-based portal for one-stop access to maps, data and other spatial services that will simplify the ability of all levels of government, private sector, academia and citizens to find spatial data and learn more about spatial projects underway.

The US health sector is rapidly becoming a responsive and integral part of the NSDI. The Department of Health and Human Services (DHHS) is already a member of FGDC, which also has representatives from CDC and NIH [13,58]. Web-enabled public health GIS developments under the umbrella of NSDI are to be guided by OpenGIS Consortium Web interoperability and GML/XML spatial Web content standards, and FGDC-endorsed spatial metadata standards, e.g., the Content Standard for Digital Geospatial Metadata, FGDC-STD-001-1998, approved by FGDC in 1998 [58,101]. However, it is expected that all current national metadata specifications, e.g., the US FGDC-STD-001-1998 and the UK GIgateway Discovery Metadata Specifications (see below), will ultimately converge to ISO 19115/19139 in the near future [102-104].

UK GIgateway

In the UK, observation of the development of the US NSDI led in 1995 to what became the UK National Geospatial Data Framework (NGDF). The askGIraffe Data Locator was then launched in July 2000, and has now been superseded by GIgateway http://www.gigateway.org/moreinformation/history.html. GIgateway is an information service providing access to spatial metadata in the UK. At the heart of the service is the Data Locator (http://www.gigateway.org.uk/datalocator/default.html – Figure 7), which is capable of carrying out real-time searches across a number of distributed metadata-bases (held at their respective data providers' local sites that have registered with GIgateway), in addition to querying GIgateway's own catalogue. A query of the Data Locator using the keyword "health" yielded 426 records (on 26 November 2003). However, many of the returned metadata records had incomplete/empty fields, and no instant access over the Internet to the actual datasets they are describing, or to a license agreement/payment form to access these datasets, as one would expect from a comprehensive "one-stop" Web-based clearinghouse (e-mail contact details are usually provided instead). Moreover, the returned records did not include the latest Census 2001 Key Statistics for health areas in England and Wales http://www.statistics.gov.uk/census2001/cn_61.asp, which was a notable deficiency. GIgateway's metadata creation tool, a JAVA-based application called MetaGenie http://www.gigateway.org.uk/datalocator/metadatatool.html, enables the creation of geographic metadata compliant with GIgateway's Discovery Metadata Specifications. MetaGenie will be rewritten to be fully compliant with the new international standards, ISO 19115/19139, in the near future. GIgateway is funded by the UK Government and run by the UK Association for Geographic Information – AGI (Judith Jerome, GIgateway Information Services Manager, personal communication – October 2003).

Figure 7
Screenshot of GIgateway Data Locator search form http://www.gigateway.org.uk/datalocator/default.html. GIgateway Data Locator is intended to help users find and use up-to-date and accurate geographic information ...

INSPIRE ESDI

The equivalent of the US FGDC NSDI in Europe is INSPIRE, the INfrastructure for SPatial InfoRmation in Europe, a recent initiative launched by the European Commission http://inspire.jrc.it/. INSPIRE intends to trigger the creation of a European Spatial Data Infrastructure (ESDI) that delivers to the users integrated spatial information services. INSPIRE is founded on the following principles: (1) data should be collected once and maintained at the level where this can be done most effectively; (2) it must be possible to combine seamlessly spatial data from different sources across the EU and share it between many users and applications; (3) it must be possible for spatial data collected at one level of government to be shared between all levels of government; (4) spatial data needed for good governance should be available on conditions that are not restricting its extensive use; and (5) it should be easy to discover which spatial data are available, to evaluate their fitness for purpose and to know which conditions apply for their use [105].

A common infrastructure for spatial information in Europe can only be realised in the long run. Therefore, an extensible, step-by-step approach is being developed [105]. It is noteworthy that the US NSDI development activities, which started nearly ten years ago, are not yet complete with some serious gaps still needing to be addressed [5].

Other national SDIs

These include the Canadian Geospatial Data Infrastructure (CGDI – http://www.geoconnections.org/CGDI.cfm/fuseaction/home.welcome/gcs.cfm) and the Australian Spatial Data Infrastructure (ASDI – http://www.ga.gov.au/nmd/asdi/ and http://www.anzlic.org.au/infrastructure.html). The Australian Spatial Data Directory (ASDD – http://www.ga.gov.au/asdd/) provides search interfaces to discover spatial metadata throughout Australia.

GSDI

On a global scale, a Global Spatial Data Infrastructure (GSDI – http://www.gsdi.org/) is being advanced through the leadership of many nations and organisations represented by a GSDI Steering Committee. This multi-national Steering Committee includes representatives from all continents, and all sectors – government, academia, and the private sector. GSDI Web site provides the following definition: "GSDI supports ready global access to geographic information. This is achieved through the coordinated actions of nations and organisations that promote awareness and implementation of complementary policies, common standards and effective mechanisms for the development and availability of interoperable digital geographic data and technologies to support decision making at all scales for multiple purposes. These actions encompass the policies, organisational remits, data, technologies, standards, delivery mechanisms, and financial and human resources necessary to ensure that those working at the global and regional scale are not impeded in meeting their objectives." [106]

A GSDI brochure published in 2002 stresses that SDIs provide a basis for spatial data discovery, evaluation, and application, and mentions the following SDI elements [107]:

(1) Geographic data: the actual digital geographic data and information [107].

(2) Metadata: the data describing the data (content, quality, condition, and other characteristics). They permit structured searches and comparison of data in different clearinghouses and give the user adequate information to find data and use it in an appropriate context [107].

(3) Framework: includes base layers, which will probably differ from location to location. It also includes mechanisms for identifying, describing, and sharing the data using features, attributes, and attribute values, as well as mechanisms for updating the data without complete re-collection [107].

(4) Services: to help discover and interact with data [107].

(5) Clearinghouse: to actually obtain the data. Clearinghouses support uniform, distributed search through a single user interface; they allow the user to obtain data directly, or they direct the user to another source [107].

(6) Standards: created and accepted at local, national, and global levels [107].

(7) Partnerships: the glue that holds it together. Partnerships reduce duplication and cost of collection and leverage local/national/global technology and skills [107].

A free how-to book, "Developing Spatial Data Infrastructures: the SDI Cookbook", is also available for downloading from GSDI Web site in several languages; the English version is available from http://www.gsdi.org/pubs/cookbook/cookbook0515.pdf. The Cookbook gives geographic information providers and users the necessary background information to evaluate and implement existing components of SDI. It includes recommended existing and emerging standards and specifications, as well as business case examples of best practice. (See also http://www.gsdi.org/docs1997/97_ggdiwp2b.html.)

Proactive, real-time, GIS-enabled health and environmental surveillance services

The vision and services presented in this section involve SDI-like structures and arrangements or rely on early "small-scale" SDI implementations, and would certainly benefit from the presence of mature SDIs covering the regions where these services operate.

Public health surveillance and the need for real-time, proactive services

The US CDC define public health surveillance as "the ongoing systematic collection, analysis, and interpretation of health data essential to planning, implementation, and evaluation of public health practice, closely integrated with the timely dissemination of these data to those who need to know. The final link in the surveillance chain is the application of these data to prevention and control. A surveillance system includes a functional capacity for data collection, analysis, and dissemination linked to public health programmes." [108]

The systematic collection, analysis, and dissemination of health information are critical aspects of public health. Surveillance is the problem-finding/monitoring process. This should ideally be linked to public health action, which is the problem-solving process. Traditionally, surveillance was used for acute infectious diseases, but over the past decades there has been a significant expansion of surveillance into new areas of public health concern including injuries, environmental health, occupational safety and health, and chronic diseases.

One of the main problems of conventional public health disease surveillance, which relies on physician and laboratory reporting and manual off-line analysis of surveillance data, is that it is ill equipped for the timely detection of bioterrorist attacks [2,55]. It is unlikely that without an event or alert to raise his or her index of suspicion, a physician will attribute the early symptoms and signs of disease in a bioattack victim appropriately and report the case. A key limitation of the current system is that the lone physician is blind to the cases his or her colleagues in a nearby hospital are seeing – knowledge that might lead the physician to consider uncommon diseases more strongly in his or her diagnostic reasoning [55].

In fact, without a continuous (in real or near real time) and comprehensive health monitoring system covering a wide geographical scope, the public health community will never have enough advance warning of bioterrorist attacks to be able to abort them at an early stage and limit their negative effects [2]. The question remains: if we build such systems (some early examples already exist – see below), what data should we monitor in real or near real time in order to be able to identify a covert bioterrorist attack?

Syndromic surveillance methods that can detect disease at an earlier stage are increasingly becoming an important research direction for public health surveillance. Because the data used by syndromic surveillance systems cannot be used to establish a specific diagnosis in any particular individual, syndromic surveillance systems must be designed to detect signature patterns of disease in a population to achieve sufficient specificity. For example, it would be irrational to use only the symptom of fever to attempt to establish a working diagnosis of inhalational anthrax in an individual, but it would be very sensible to consider anthrax release in a community if we were to observe a pattern of 1,000 individuals with fever distributed in a linear streak across an urban region consistent with the prevailing wind direction two days earlier [55].

A recent paper by Mandl et al provides a comprehensive review of syndromic surveillance systems, and is intended to serve as a guide for informaticians, public health managers, and practitioners who are currently planning deployment of such systems in their regions. The paper also includes detailed discussions of the different outbreak detection methods that work with temporal and spatial data, as well as the metrics for measuring surveillance system quality. An interesting point raised by Mandl and his colleagues is the need for a truly unique person identifier, so that individuals are not double-counted as they move between healthcare institutions (sometimes during the same day). Mandl et al also suggest using data already collected for other purposes whenever this is possible, since implementing new data collection processes can have prohibitive costs, and healthcare workers have repeatedly demonstrated poor compliance with additional data collection and administrative tasks. They also recommend designing "dual use" systems, rather than focusing only on the detection of bioterrorism or very rare outbreaks, in order to boost the sustainability and long-term funding viability of such systems [109]. (See also the pages titled "Syndromic Surveillance: an Applied Approach to Outbreak Detection" and published by the CDC's Division of Public Health Surveillance and Informatics – http://www.cdc.gov/epo/dphsi/syndromic.htm.)
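The double-counting problem raised by Mandl et al can be illustrated with a minimal Python sketch. It assumes that every visit record already carries a reliable unique person identifier; the field names (person_id, date, syndrome, site) and the sample records are hypothetical and are not taken from any of the systems reviewed here.

```python
from collections import defaultdict


def count_unique_patients(visits):
    """Count distinct persons per (date, syndrome), ignoring repeat visits.

    Each visit is a dict with the hypothetical keys 'person_id', 'date'
    and 'syndrome'. A person seen at two institutions on the same day
    for the same syndrome is counted once.
    """
    seen = defaultdict(set)
    for visit in visits:
        seen[(visit["date"], visit["syndrome"])].add(visit["person_id"])
    return {key: len(person_ids) for key, person_ids in seen.items()}


# Illustrative records: person P1 attends two hospitals on the same day.
visits = [
    {"person_id": "P1", "date": "2003-11-26", "syndrome": "respiratory", "site": "Hospital A"},
    {"person_id": "P1", "date": "2003-11-26", "syndrome": "respiratory", "site": "Hospital B"},
    {"person_id": "P2", "date": "2003-11-26", "syndrome": "respiratory", "site": "Hospital A"},
]
counts = count_unique_patients(visits)
```

Without the identifier, the three visits would be read as three cases; with it, the count for that day and syndrome is two.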

The dream now is to develop a universal multivariate surveillance system that can collect, analyse and interpret health-related information worldwide using modern information infrastructures for the global prevention of a wide range of health problems, or at least the early detection of such problems in order to mitigate their effects. GIS technologies and services that can function proactively in real time are extremely and critically important to realise this global public health surveillance vision (and indeed any smaller-scale surveillance services). Such surveillance services also require a sound and comprehensive spatial health information infrastructure to be built and maintained in a coherent way at all operation levels.

Real-time GIS for emergency management

Much of the information that underpins emergency preparedness, response, recovery, and mitigation is geospatial in nature [110]. According to FGDC, "without the real-time ability to quickly visualise activity patterns, map locations, and understand the multi-layered geospatial context of emergency situations, US homeland security will not be achieved." (The same principle also applies to other countries.) Geographic information technologies, combined with appropriate sets of timely, accurate and shareable geospatial information, provide an invaluable tool for the prevention of, protection against, timely detection of, preparedness, response to, and recovery from natural and man-made disasters [5]. These issues have now become more important after the September 2001 attacks.

Freier describes an organisational structure for emergency management operations using GIS based on the approach of FEMA (US Federal Emergency Management Agency – http://www.fema.org/). This approach recognises four stages: planning, preparedness, response, and recovery. Although Freier's paper is about using GIS to manage animal disease outbreaks and is thus primarily targeting animal health professionals, the GIS emergency management operations and methods it describes also apply largely to human health [11].

FEMA's Mapping and Analysis Centre (MAC) runs an integrated, state-of-the-art enterprise GIS (E-GIS) for the Agency. A GIS-based Consequence Assessment Tool Set (CATS), developed by SAIC (Science Applications International Corporation – http://cats.saic.com/) for FEMA and the US Defense Threat Reduction Agency (DTRA), provides a powerful disaster prediction, analysis, planning, and response system. MAC maintains an extensive array of datasets (more than 150 databases/map layers in CATS alone) so that it can provide its customers (federal and government agencies) with the information they need in the form of GIS maps, tables, and analyses for planning, preparedness, response, and mitigation in relation to disasters and emergencies. MAC can produce GIS maps from important prediction model outputs, e.g., a hurricane wind model, a toxic plume model or an earthquake model, coupled with real-time data to provide estimates for projected damages in affected regions. It can also generate maps from damage assessment data after a disaster has occurred to visualise actual damages by analysing collected aerial reconnaissance and ground truth data. This can help emergency managers appreciate the spatial extent of damage, learn who was affected by the disaster and which resources were affected, and make timely, informed decisions accordingly (e.g., a plume model can help determine those areas requiring evacuation; early informed interventions almost always result in mitigation of disaster effects) [111,112].

Real-time/near-real-time GIS for epidemics management

Johnson and Johnson provide a good, easy-to-read general introduction to GIS application areas, advantages and methods in public health and healthcare, with some emphasis on GIS uses in epidemiological surveillance and epidemics management [113].

The WHO has developed a comprehensive Event Management System to manage critical information about outbreaks and to ensure accurate and timely communications between key international public health professionals, including WHO Regional Offices, Country Offices, collaborating centres and partners in the Global Outbreak Alert and Response Network [114]. During outbreak response, the WHO uses a custom-made geographic mapping technology, which forms part of its existing system for outbreak alert and response, to assist in the location of cases and rapid analysis of an epidemic's dynamics. The WHO also uses this epidemiological mapping technology to predict environmental and climatic conditions conducive for some outbreaks [115]. The WHO aims to link the Event Management System to its Global Atlas of Infectious Diseases http://globalatlas.who.int/globalatlas/interactivemap/rmm/ for real-time mapping and tracking of new outbreaks [114].

Web-based maps allow for real-time or near-real-time map updates based on the latest datasets, for interactivity to be incorporated into the maps (desktop GIS-like functionality, e.g., drill-down and zooming), and for wider and more rapid dissemination of information (compared to other publishing media). Some of the best examples of Web-based maps were produced during the latest SARS outbreak, which is considered the first major new infectious disease of the 21st century and the Internet age that took full advantage of the opportunities for rapid spread along international air routes. Kamel Boulos reviewed several geographic mapping efforts of SARS on the Internet, including very detailed Hong Kong SARS distribution maps provided by Hong Kong Yellow Pages http://www.ypmap.com/en/viewer.asp?mapService=SARSMap; "SARS GIS" http://www.esrihk.com/SARS/Eng/sars_eng_main.htm, a service built by ESRI China (Hong Kong) Limited; and SarsNet http://rhone.b3e.jussieu.fr/sarsnet/www/activity.html, an online SARS database and GIS that was inspired from WHO/FluNet http://rhone.b3e.jussieu.fr/flunet/www/ and developed in collaboration with the WHO centre for electronic surveillance of diseases and the Institute for Medical Research and Health (INSERM Unit 444), Paris, France. The reviewed maps employed a variety of techniques like choropleth rendering, graduated circles, graduated pie charts, buffering, thematic mapping, overlay analysis and animation to allow public health decision makers, travellers and local populations at risk to visually monitor and appreciate at a glance changes, trends and patterns buried in different online SARS datasets that were continuously varying with time. Some of the mapping services presented provided very detailed information down to individual street/building level (in Hong Kong). 
This kind of support is vital for improving global vigilance and awareness at all levels, and for making well-informed decisions when designing and following up epidemic control strategies or issuing and updating travel advisories [80].

Davenhall's vision of a community health surveillance system

Davenhall defines a community health surveillance system (CHSS) as a network that constantly gathers, integrates, and analyses data on health indicators, occurrences, and transmissions of disease in a population; monitors the capabilities of the health system/level of health protection in that population; and spatially relates all this information using GIS. This proactive, geographically based approach can deal more effectively with and provide early warnings of health threats and disease outbreaks, particularly those caused by bio-weapons [2].

Davenhall distinguishes between a health surveillance system and a disease surveillance system. The former features a lower threshold for action than the latter. By the time someone is admitted to an acute care hospital with a communicable disease, that person may have been symptomatic for days or weeks, may have already been seen by healthcare professionals repeatedly, and may have already spread the disease to large numbers of persons. For example, smallpox, which begins with a rash that becomes more painful and extensive, is often initially treated with an anti-inflammatory antihistamine drug such as Diphenhydramine HCL (Benadryl). When this treatment proves ineffective, laboratory work is ordered and the diagnosis then made. From a community health perspective, a spike in the number of prescriptions for Benadryl in one area could be used as an indicator of a possible smallpox outbreak. A CHSS should be able to automatically detect such a spike and raise an alarm early enough to contain the outbreak [2].

CHSS will have a GIS-based incident tracking system. Human intervention should not be required until pre-established critical levels – in the number and/or clustering of occurrences – are reached. The rule-based CHSS will use data interpretations made by epidemiologists and other public health officials [2].
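The rule-based behaviour described above, in which no human intervention is needed until a pre-established critical level is reached, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the area names, counts and critical levels are invented, the function name is hypothetical, and the clustering of occurrences, which a real CHSS would also have to assess, is not modelled.

```python
def areas_reaching_critical_level(counts_by_area, critical_levels):
    """Return (area, count, level) for every area whose count has reached
    its pre-established critical level.

    Areas without a configured level are never flagged; in this sketch the
    levels stand in for the interpretations agreed by epidemiologists.
    """
    flagged = []
    for area, count in counts_by_area.items():
        level = critical_levels.get(area)
        if level is not None and count >= level:
            flagged.append((area, count, level))
    return sorted(flagged)


# Hypothetical daily counts of an indicator (e.g., prescriptions) per area.
critical_levels = {"Area A": 40, "Area B": 25}
todays_counts = {"Area A": 18, "Area B": 31, "Area C": 50}
alerts = areas_reaching_critical_level(todays_counts, critical_levels)
```

Only "Area B" is flagged for human review here; "Area C" has no rule defined and is therefore ignored, which shows why the rule set itself has to be kept complete and current.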

However, the transition from episodic investigation to ongoing monitoring using GIS requires more robust data collection and analysis. CHSS relies on a continuous stream of clinical data that are gathered automatically across geographical boundaries. CHSS data include clinical data, such as symptoms, diagnostic results, and procedures (all coded using a suitable terminology or classification) and geographic information such as the locations of patients, medical personnel and assets, and outbreaks. To be reliable for the purposes of a CHSS, population-based data must also describe relatively small geographical areas. Data that reflect the level of "wellness" of a population in a geographic area are necessary to draw inferences about changes in health levels and exposure to disease [2].

The Real-time Outbreak and Disease Surveillance system (RODS)

RODS is a NEDSS-compliant, GIS-enabled (using ESRI ArcIMS 4.0) public health surveillance system for early detection of disease outbreaks, including those caused by bioterrorism. Hospitals send RODS data from clinical encounters over virtual private networks and leased lines in real time using the HL7 message protocol. RODS automatically classifies the free-text registration chief complaint from the visit into one of seven syndrome categories (constitutional, respiratory, gastrointestinal, neurological, botulinic, rash, and haemorrhagic) or a residual "other" category using Bayesian classifiers. It stores the data in a relational database, aggregates the data for analysis using data warehousing techniques, applies univariate and multivariate statistical detection algorithms to the data, and alerts users when the algorithms identify anomalous patterns in the syndrome counts [49,55]. RODS processes sales of over-the-counter (OTC) healthcare products in a similar manner, but currently receives such data in batch mode on a daily basis. It also groups sales data of OTC products into analytic product categories relevant to public health surveillance (e.g., bronchial remedies, diarrhoea remedies, etc.) [55,56].
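To make the idea of Bayesian classification of free-text chief complaints more concrete, the following Python sketch implements a generic multinomial naive Bayes classifier with add-one smoothing. It is not the RODS implementation, whose internals are not described here; the class name, the tiny training set and the restriction to three syndrome categories are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class ChiefComplaintClassifier:
    """Toy multinomial naive Bayes classifier for free-text complaints."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word frequencies
        self.class_counts = Counter()            # category -> number of examples
        self.vocabulary = set()

    def train(self, examples):
        for text, label in examples:
            words = text.lower().split()
            self.class_counts[label] += 1
            self.word_counts[label].update(words)
            self.vocabulary.update(words)

    def classify(self, text):
        words = text.lower().split()
        total = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            # log prior + sum of log likelihoods with add-one smoothing
            score = math.log(n / total)
            denom = sum(self.word_counts[label].values()) + vocab_size
            for word in words:
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training complaints, for illustration only.
classifier = ChiefComplaintClassifier()
classifier.train([
    ("cough and shortness of breath", "respiratory"),
    ("difficulty breathing", "respiratory"),
    ("vomiting and diarrhoea", "gastrointestinal"),
    ("nausea vomiting", "gastrointestinal"),
    ("skin rash on arms", "rash"),
    ("itchy rash", "rash"),
])
```

With this training set, classifier.classify("cough") selects the respiratory category and classifier.classify("vomiting") selects the gastrointestinal one. A production classifier would need far more training data and would have to cope with misspellings and abbreviations in registration text.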

Real-time (continuous stream) transfer of data is to be preferred to batch transfer of data, as the latter may delay detection of suspicious events by as long as the time interval (periodicity) between batch transfers. For example, a surveillance system with daily batch transfer may delay by one day the detection of an outbreak [55]. Time intervals as small as hours can make a difference when a large cohort is exposed to rapidly progressing diseases such as anthrax. Furthermore, the challenge of merging similar data arriving from multiple sources with different time latencies is now a focus of attention in new surveillance approaches [56].

Preliminary studies suggest that sales of OTC healthcare products can be used for the early detection of outbreaks. People often engage in self-care with OTC medications such as cough syrups before seeking professional medical care. RODS' National Retail Data Monitor (NRDM) receives data daily from 10,000 stores/pharmacies that sell healthcare products. These stores belong to national chains that process sales data centrally and utilise Universal Product Bar Codes (UPC codes) and scanners to collect sales information at the cash register. The high degree of retail sales data automation enables NRDM to collect information from thousands of store locations in near real time for use in public health surveillance. Algorithms monitor the data automatically every day to detect unusual sales patterns. The current niche for NRDM is early detection of a mass exposure of a large number of people through air, food, or water contamination (a cohort exposure). Soon after such an exposure, the cohort will become symptomatic, and, depending on the symptoms, may begin self-treatment and then either recover or seek medical care. If the cohort is large enough, sales of OTC healthcare products will increase significantly above the normal, background sales level. The announced longer-term project plans include the expansion of monitoring to the level of selected prescription medications based on another standard coding system that is used in industry data systems [56].

Wagner et al cite the following desiderata for systems like RODS' NRDM: (1) collection and analysis of data in as near as real time as possible; (2) completeness of sales data collection (>=70% is considered an adequate figure) for both early detection and sensitivity to smaller outbreaks; (3) availability of precise spatial information like individual store locations, or at least store Zip Codes to support adequate spatial analysis of sales data; (4) collection of supplemental data, e.g., about retailers' promotions or how day of the week affects local sales volumes; (5) a system for maintaining UPC code masters and mappings to analytic categories (as new product codes are assigned); (6) an effective link with the intended users of the system (public health authorities) to effect the desired actions (e.g., order quarantine); and (7) as most large urban population centres cross jurisdictional health boundaries, a centralised national approach is recommended to provide a complete picture of the health of contiguous regions and prevent any redundant data collection for overlapping nearby jurisdictions [56].

Being linked to public health authorities and response also allows system developers to learn from prospective experience, to validate their data sources and algorithms in real-world settings, and to improve systems' ability to differentiate true infectious disease clusters from false alarms [109]. RODS also has a Web-based user interface that supports temporal and spatial analyses. RODS' password-protected, encrypted Web site allows users to review healthcare registration and sales of OTC healthcare products on epidemic plots and maps. When a user logs in, RODS will check the user's profile and will display data only for his or her health department's jurisdiction [55,56]. Because populations and market share coverage for sales of OTC healthcare products differ between Zip Codes, plotting raw sales counts is uninformative. NRDM maps represent a novel approach to presenting surveillance data. They plot for each Zip Code – using the colours green, blue, yellow, orange, and red to indicate increasing levels of concern – how "unusual" sales were for the day in question relative to historical patterns of sales for that Zip Code. In particular, the colours represent the number of standard deviations by which the observed sales of a product category in a Zip Code deviate from the expected counts. In presenting the data in this fashion, the map serves as a device to focus the user's attention on the degree(s) of anomaly. A user can quickly spot whether the map is predominantly green with a scattering of blue Zip Codes as would be expected, or whether there are confluent or linear patterns of blue, yellow, orange, or red indicating "unusual" sales activities. The map monitor computes the number of standard deviations relative to a residual signal that has zero mean and constant variation after removal of weekly and longer trends in the data by wavelet transformation. 
This procedure is intended to produce a "normalised" map that is very sensitive to sudden increases in product counts as would be the case in a medium- to large-scale air, food, or water contamination. Alternative data transformations are possible using different signal processing approaches focused on detecting more gradual increases. RODS researchers plan in the near future to screen the maps automatically with spatial scan statistics to identify those with anomalies suggesting a need for human review [56].
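The colour-coding principle behind the NRDM maps can be sketched as follows. The sketch expresses the observed sales for a Zip Code as a number of standard deviations from the historical mean and maps that number to one of the five colours. Two simplifications are assumed and should be stressed: the band boundaries (1, 2, 3 and 4 standard deviations) are invented, since the actual cut-offs are not given here, and the plain mean and standard deviation of a short history stand in for the wavelet-based removal of weekly and longer trends used by the real map monitor.

```python
import statistics

# Assumed upper bounds (in standard deviations) for each colour band;
# anything at or above the last bound is shown in red.
COLOUR_BANDS = [(1.0, "green"), (2.0, "blue"), (3.0, "yellow"), (4.0, "orange")]


def sales_colour(history, observed):
    """Map today's sales for one Zip Code to a level-of-concern colour.

    'history' is a list of past daily sales counts for the same product
    category and Zip Code (hypothetical data; no detrending is applied).
    """
    expected = statistics.mean(history)
    spread = statistics.stdev(history)
    if spread == 0:
        # Degenerate history: any rise above a constant baseline is flagged.
        return "green" if observed <= expected else "red"
    z = (observed - expected) / spread
    for upper_bound, colour in COLOUR_BANDS:
        if z < upper_bound:
            return colour
    return "red"


history = [10, 12, 11, 9, 10, 12, 11, 9]  # invented daily counts
```

For this history, a day with 11 sales falls in the green band, 13 sales in the yellow band, and 20 sales in the red band. Normalising by each Zip Code's own history is what makes areas with very different populations and market coverage comparable on one map.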

Some of the RODS software has been bundled into downloadable ready-to-use packages that are available from http://www.health.pitt.edu/rods/sw/. However, deployment of such systems requires skilled network engineers, Oracle database administrators, and interface engineers. An application service provider model for RODS (and similar services) seems better suited to health organisations with no access to that skill set. Such organisations can form coalitions to share the costs of such services [55].

Large-scale environmental surveillance projects in the UK

The relationship between physical environment and health is now accepted as complex, with environment acting not just directly but indirectly and in association with other influences to affect health and well-being. Indicators of health-relevant environmental exposures are invariably also indicators of social justice/inequalities [116].

The Environmental Health Surveillance System for Scotland (EHS3) is an ongoing project, funded by Scottish Neighbourhood Statistics, that aims to provide the evidence base for better decision-making in environmental health in Scotland. EHS3, in its completed form, will be an ongoing multi-agency collaboration involving NHS Board Areas, local authorities, the Scottish Environment Protection Agency (SEPA), Water Authorities and other relevant agencies. Its purpose will be to collect, hold and, as appropriate, analyse and interpret temporally and spatially tagged environmental and related health data throughout Scotland (e.g., attempt to correlate environmental exposures and health outcomes). EHS3 will also disseminate these data, much of which are currently available but under-utilised [57,116].

EHS3 developers need to determine what information is currently available to begin with, and also need to address the problems of incomplete health and environmental data. EHS3 database will combine information obtained via ad hoc reporting of events, with a systematic active surveillance system. It will include environmental parameters like air quality, water quality, radiation, noise, mobile phone masts, and landfill sites. EHS3 will also incorporate health information from the SMR (Scottish Morbidity Record) hospital discharge data for a range of ICD 10 coded conditions, e.g., respiratory conditions, cerebrovascular disease, circulatory system disease, and malignant neoplasm. Other EHS3 health data sources include CMR (Continuous Morbidity Recording) data, and data from death record fields. The database thus created will be used to derive spatio-temporal trends in health and environmental exposure, which will be presented in tabular and geographical formats. In conformity with surveillance principles, data gathering will be ongoing and regular outputs will be agreed which will inform policy and action (as an evidential basis for action) to promote improved environmental standards and public health. With appropriate development, the system will also have potential as a predictive tool for managing environmentally occasioned (including weather-related) fluctuations in demand for NHS services. A further important characteristic of EHS3 will be its dynamic character with an ability to change emphasis and enhance outputs in response to circumstances as they emerge [57,116].

Another environmental project, the London Air Quality Network (LAQN), was launched in 1993 to coordinate and improve air pollution monitoring in London. By the end of 1999, twenty-nine London Boroughs were supplying data to the LAQN. Increasingly, these data are being supplemented by measurements from local authorities surrounding London, thereby providing an overall perspective of air pollution in South East England. The data are used to generate the daily updated London urban air pollution maps, which are published on LAQN Web site http://www.erg.kcl.ac.uk/london/asp/home.asp. The core LAQN activities are funded, operated and managed by the Environmental Research Group (ERG) at King's College London, with support and funds from the Department of Environment, Food and Rural Affairs (DEFRA) [117]. For all environmental health projects like those presented above, the importance of data quality, currency, completeness and fitness to the purpose at hand cannot be overemphasised. Accurate and statistically representative locational information along with standardised quality-controlled measurements of environmental exposures, over time, are essential if one is to perform robust spatial statistical analyses of suspected associations between the environment and human diseases [13].

Discussion, recommendations and concluding remarks

GIS offer a very rich toolbox of methods and technologies that goes far beyond the mere production of simple maps (or digital cartography). From a community health perspective, GIS could potentially act as powerful evidence-based practice tools for early problem detection and solving. When properly used, GIS can: inform and educate (professionals and the public); empower decision-making at all levels; help in planning and tweaking clinically and cost-effective actions, in predicting outcomes before making any financial commitments and ascribing priorities in a climate of finite resources; change practices; and continually monitor and analyse changes, as well as sentinel events.

However, although multiple novel spatial statistical and GIS methods are potentially available, we still need to unambiguously determine which method(s) specifically should be used by practitioners for each specific health condition of interest, and whether the proposed methods are cost-effective and scalable. A critical review is needed of the evidence for GIS for specific preventable, mitigable and treatable health conditions. A good starting point may be the CDC "Guide to Community Preventive Services" http://www.thecommunityguide.org/. Topics identified in this guide (e.g., alcohol abuse, cancer, diabetes, mental health, motor vehicle occupant injury, oral health, physical activity, sexual behaviour, social environment, tobacco product use, vaccine preventable diseases, violence) could be addressed one by one by conducting a focused review of GIS literature on each topic, and then categorising the "nature of the scientific evidence" documenting whether GIS add any value to our understanding and management of the reviewed topic and/or the evidence that it would be feasible and cost-effective for the respective public health programmes tackling the reviewed topic to adopt GIS. This could inform the development of successful GIS business plans for the health conditions under consideration. A good example that comes to mind in this context is the 73-page "GIS for cancer" handbook titled "Using Geographic Information Systems Technology in the Collection, Analysis, and Presentation of Cancer Registry Data: A Handbook of Basic Practices" that was published by the North American Association of Central Cancer Registries [118]. (However, as is the case with any country-specific GIS research and publications, care should be exercised when extending findings and recommendations to other countries with different health and healthcare system settings.)

In reviewing GIS literature for the above mentioned purposes, this author appreciates the fact that the set of definitions and criteria for reviewing evidence as used in the CDC Community Guide is not directly usable for reviewing currently available GIS literature due to the nature of the latter; a modified set of definitions and criteria first needs to be developed. Focus groups that bring together programme administrators, practitioners and the public are also required to compensate for the expected gaps and deficiencies in current GIS literature, to define the key questions that decision makers would want to be able to answer with GIS for any health condition under review, and to think explicitly about what data and methods should be used to answer those questions.

Traditionally, two broad types of GIS applications can be distinguished which also reflect the two traditions in health geography (geography of disease and geography of healthcare systems), namely health outcomes and epidemiology applications and healthcare delivery applications. The use of GIS for improving hospital bed availability is among the most notable applications under the latter category. There are also studies at the interface (overlap) between epidemiological and healthcare delivery applications, for example in relation to healthcare commissioning and needs assessment.

However, despite all these potentials for GIS, they remain very much under-utilised in the UK NHS, where they are applied mostly to low-level, non-strategic tasks and in a largely fragmented and uncoordinated way. Spatial data and GIS are still not mentioned in any main UK health information strategy or policy document (the US seems to be somewhat ahead of the UK in this respect). Table 1 summarises the main factors hindering the wider use of GIS within NHS organisations, and precluding adequate spatial data exchange and collaboration between the NHS and other organisations and local authorities. Researchers have come to the conclusion that more networking is needed of people, skills, expertise and data. This can be achieved by establishing networks of GIS users from both the NHS and local authorities at local and higher levels to encourage more joined-up working, share expertise and experiences, as well as establish contacts and trust, and raise the awareness of the types of data that are held by different organisations. A dedicated Web site acting as forum or virtual network on the Web is one way to realise these networks of GIS users. However, this author thinks that a common coherent UK initiative is urgently needed to build a comprehensive national, multi-agency spatio-temporal health information infrastructure functioning proactively in real time.

Table 1
Factors hindering the wider use of GIS and the exchange of geo-information within the NHS. Summary of the main factors hindering the wider use of GIS within NHS organisations, and precluding adequate spatial data exchange and collaboration between the ...

The NHS should start by carefully defining the purpose(s) of a nation-wide, coherent GIS implementation across its organisations, and by developing a clear "GIS business plan". For each health condition amenable to GIS processing within the NHS, the desired information output and ways of using it must also be determined. Tomlinson's methodology is targeted at people who have been charged with launching or implementing GIS for their organisation, and is thus strongly recommended in this regard [119]. Perhaps the NHS should also take a closer look at the three sets of standards published by the US CDC National Public Health Performance Standards Programme (NPHPSP), and their associated assessment instruments [120] and implementation toolkit [121], as well as NPHPSP's Essential Public Health Services fundamental framework for NPHPSP instruments [122]. Another project worth learning from in this context is the US Primary Care Service Area Project (PCSA – http://pcsa.hrsa.gov/). The PCSA Project builds on the Hospital Service Area approach that has been successfully employed by Dr. John Wennberg and his Dartmouth colleagues to produce the Dartmouth Atlas of Health Care series http://www.dartmouthatlas.org/. The PCSA database contains nationwide data of interest to US health policymakers at all jurisdictional levels as well as researchers about US primary healthcare resources, populations, utilisation, and associated outcomes compiled and presented in newly developed units of analysis, the Primary Care Service Areas (PCSAs), and related to other geopolitical regions [123].

Our experience with the health and healthcare applications of GIS has markedly increased over the last decade. However, GIS have been usually applied to time-limited, single, isolated aetiological research or surveillance issues processing mainly retrospective data rather than to ongoing, broad efforts and wide-scale applications processing real-time or near-real-time data for health planning, promotion and protection.

Moreover, in the early 1990s much attention was focused on GIS as a basis for spatial information systems. But soon it became clear that the pure technical approach had to be replaced by a more holistic approach comprising organisational, political and technical matters at the different local, national, regional, and global levels. The concept of "Spatial Data Infrastructure" (SDI) became a reality.

SDI principles originated in two US National Research Council reports in the early 1990s [124,125]. SDIs first developed outside the health sector, and then belatedly health began to discover their importance in many applications. SDIs contain the people and institutions that make, maintain, and make accessible, the foundation data layers that permit the custodians of other data layers to attach their data to the foundation layers. It must be stressed that the contents of a national health spatial data infrastructure are not just any georeferenced health data but, in addition, the foundation spatial data to which health data can be attached. The foundation layers for a health spatial data and information infrastructure are best exemplified in the US by PCSAs and Hospital Service Area data layers provided by the Dartmouth Project (see above). In a personal e-mail communication, Professor Gerard Rushton argues that PCSAs and Hospital Service Area data layers are spatial data foundation layers because other US health data, often collected and maintained locally, are more valuable after they have been linked to these layers (Gerard Rushton, Department of Geography, University of Iowa, personal communication – December 2003).

In a workshop paper presented in 2001, Professor David Rhind counts about 40 countries developing their national SDIs and highlights the problems that have been faced and the lessons learned. The latter include ensuring the involvement of the private sector as a central SDI player from the outset, having a realistic vision, securing political leadership and support, and coordinating between the many SDI players [126].

Table 2 presents a summary of the recipes and main recommendations provided by various specialist groups and researchers from around the world for a successful implementation of a national/regional/global spatial data and information infrastructure that can also support real-time GIS public health applications.

Table 2
Requirements for a successful implementation of a national/regional/global geo-information infrastructure. Summary of the recipes and main recommendations provided by various specialist groups and researchers from around the world for a successful implementation ...

Awareness-raising activities and campaigns are much needed and should put strong emphasis on real-world, practical GIS scenarios and examples to reach out to policy and strategy makers in the health and other sectors.

Training is also one of the most important elements listed in Table 2. Training should cover epidemiological methods to ensure appropriate use of GIS technology in public health. Public health professional specialties/bodies need to recognise continuing education credit for individuals who participate in GIS software training (perhaps the recently established NHSU, the corporate university for the NHS – http://www.nhsu.nhs.uk/, could play a role in this regard).

Some excellent Web-based training material and courses are already available free of charge, but there is still an urgent need for many more training modules to be developed and most importantly to be thoughtfully and coherently integrated in sensible ways. Existing material includes Rushton's (2003) Short Course on Geocoding http://www.uiowa.edu/~gishlth/giswkshp/, the University of Iowa Global Urban Indicators Training Programme (2002 – http://www.uiowa.edu/~gishlth/ui_index.html), Kulldorff's (2003) Short Course on Spatial Statistics http://www.satscan.org/presentation, and Lawson's (2003) Introduction to Bayesian Mapping Methods http://www.sph.sc.edu/alawson/teaching/Introduction to BMM_Part1.ppt.

It is not uncommon for GIS research to include very practical and useful gems, but these often remain confined to the closed circles of researchers and hidden from the larger communities of GIS professionals and users. A good example of such gems that should be exposed and disseminated are Boscoe and Pickle's recently published guidelines for choosing geographic units for choropleth rate maps in the context of public health applications [127]. The best, current evidence derived from GIS research should be always embedded (and regularly updated) in all training programmes. This is one important way of linking the academia and research communities to real-world practice.

Sufficient financial resources must be available to invest in training people and retaining technical expertise. Adequate investments must be also made in technologies for digital data management and storage, and in improving communications and networking infrastructures.

Reliable intranet and Internet environments with adequate bandwidth can support a physical and virtual "situation room" for both emergency and day-to-day management of operations for safeguarding the environment and protecting human health.

The tricky issues of data security and confidentiality must be properly addressed. Today, solutions exist that can preserve data confidentiality while still enabling fine-level analyses and reliable results. These solutions involve: (1) the use of statistical and epidemiological methods to mask the geographic location of data in a way that can still permit meaningful analysis, e.g., special types of spatial and temporal aggregation of data; (2) the development and use of software agents and health system resident components that can process an analysis request and return a result to the data user without exposing any individual-level health data; (3) the creation of secure networked environments with limited and multiple levels of access (to confidential data) in which public health researchers can be carefully monitored to ensure protection of individual and household confidentiality; and (4) the development, publication and strict enforcement of appropriate, unambiguous policies and regulations.
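As an illustration of solution (1) above, the following minimal Python sketch aggregates individual case locations to grid cells and suppresses cells with small counts, so that only masked, aggregated data leave the secure environment. The function name, the 1 km cell size and the threshold of five cases are assumptions chosen for illustration only; they are not taken from any NHS or other official disclosure-control policy, and real masking schemes are typically more elaborate.

```python
from collections import Counter


def aggregate_and_suppress(cases, cell_size_m=1000, min_count=5):
    """Aggregate point cases to square grid cells and suppress small counts.

    cases: iterable of (easting, northing) pairs in metres
           (hypothetical projected coordinates).
    cell_size_m: side length of each grid cell in metres (assumed value).
    min_count: cells with fewer cases than this are suppressed (assumed value).

    Returns a dict mapping (column, row) cell index to the case count,
    or to None where the count was suppressed.
    """
    counts = Counter(
        (int(x // cell_size_m), int(y // cell_size_m)) for x, y in cases
    )
    # Replace small counts with None so that they cannot be used to
    # single out individuals or households.
    return {cell: (n if n >= min_count else None) for cell, n in counts.items()}


# Six cases fall in one cell and two in a neighbouring cell.
example = [(100, 200)] * 6 + [(1500, 200)] * 2
print(aggregate_and_suppress(example))
```

Larger cells and higher thresholds give stronger protection at the cost of spatial detail, which is the trade-off between confidentiality and fine-level analysis discussed above.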

Best standards, specifications, rules, designs, and practices (covering spatial metadata, geocoding, accessibility for visually and manually impaired data users, and data access restrictions among other things) must be created/agreed upon and published for uniform Internet-enabled GIS services.

All relevant infrastructure and systems stakeholders should be involved in the development of appropriate data models (or ontologies) for their various applications to facilitate data selection and integration, and ensure a common understanding of data. This author also predicts even more exciting developments in the coming months and years with the rapid advances in geospatial Semantic Web research and technologies [128,129].

Data/analysis problems and errors are not uncommon and include scale issues, the "small numbers" problem, issues of the atomistic and ecologic fallacies, changing activity spaces of mapped subjects, and the frequent variations between different locations in data collection methods and standards, in the recorded items, particularly data on patient residence, and in diagnostic standards and case definitions. Users must develop increased sensitivity to and awareness of the various types of data errors and uncertainty, as well as competency in techniques for recognising and reducing their negative impact on conclusions drawn from spatial analysis. There is also a need for intelligent tools specifically designed for public health, and seamlessly woven into everyday public health workflows and decision-making processes to enable users to focus and spend the larger part of their work time on what they want to achieve rather than on learning and overcoming the limitations of tools they are supposed to use to achieve their goals. The tools must be able to convey meaningful, bottom-line conclusions that can support the decision maker rather than just outputting bunches of facts. The ideal tools also need to be fault-tolerant and capable of analysing and presenting assembled data in ways that facilitate only appropriate interpretations of integrated data. This can be achieved by using some form of user friendly, "intelligent", goal-oriented health GIS wizards (based on robust statistical and epidemiological methods where appropriate), so that only valid results and maps are produced, even when users attempt to select inappropriate settings for a particular analysis.
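The kind of safeguard such a wizard might apply against the "small numbers" problem can be sketched as follows. This is a hypothetical Python example, not a description of any existing tool: the function name and the reliability threshold of 20 cases are assumptions made for illustration, and a real tool would more likely rely on confidence intervals or smoothing methods.

```python
def crude_rate(cases, population, per=100_000, min_cases=20):
    """Return (rate, reliable) for one area.

    rate: crude rate per `per` population.
    reliable: False when the rate rests on fewer than `min_cases` cases
              (an assumed, illustrative threshold), in which case a
              wizard could grey out the area instead of mapping it.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    if cases < 0:
        raise ValueError("cases must not be negative")
    rate = cases * per / population
    return rate, cases >= min_cases


# Three cases in a population of 1,500 give a high but unstable rate.
print(crude_rate(3, 1500))
# The same rate based on 300 cases in 150,000 people is flagged as reliable.
print(crude_rate(300, 150_000))
```

Both areas have the same crude rate, but only the second would be presented as dependable, which shows how a tool can steer users away from over-interpreting rates based on very few events.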

The tools are also best designed and built to work in modular and nested fashions, so that they may be reused, linked and combined in different ways as needed to serve different scenarios and compound situations with little or no modifications (of the tools).

Along similar lines, Professor Stan Openshaw thinks that GIS need to adopt and link to technologies that go beyond data collection, management and ownership, standards, simple mapping, and trivial analysis. According to Openshaw, the ideal spatial analysis methods should be safe and user friendly for use by people with no higher degrees in statistical or spatial sciences. The methods should also respond to user needs on the ground, be highly automated, explicitly handle spatial data imprecision, and produce self-evident results that can be mapped and communicated to non-experts. Openshaw's proposed typology of methods includes among others "pattern spotters and testers" and "relationship seekers and provers" [130].

Community data sharing must be systematic and regular. Data-sharing agreements are needed that address confidentiality and other concerns, allow redistribution of data to any public health authority, and permit data to be used in research. Data have to be collected uniformly and include specifications for update frequency and allowed dissemination in different emergency and non-emergency situations, and for purposes other than those for which they were originally collected. It is recommended that a combined top-down and bottom-up incremental (phased) implementation approach be adopted. Longer-term solutions usually require a series of small successes, carefully built upon in incremental fashion over time. In fact, much of the wider vision of a national/regional/global public health spatial data and information infrastructure can be gradually and incrementally achieved through disparately funded and managed short-term projects, as long as we can ensure that these short-term projects make a useful and lasting contribution towards this wider vision. Short-term bottom-up projects can feed valuable experience into the formulation and revision of the relevant policies and strategies. Moreover, by creating "proof of concept and benefits applications", these projects can be also used to gain and continue political support for the wider vision, and secure further funding towards achieving it.

We also quickly reviewed existing SDIs and SDI initiatives at different levels of development worldwide, including the US National Spatial Data Infrastructure (NSDI) and the related Geospatial One-Stop initiative with its Web-based service, Geodata.gov; the UK GIgateway; INSPIRE, the INfrastructure for SPatial InfoRmation in Europe, which intends to trigger the creation of a European Spatial Data Infrastructure (ESDI); other national SDIs; and the Global Spatial Data Infrastructure (GSDI) activities.

Finally, we discussed public health surveillance and syndromic surveillance methods (especially in the context of bioterrorism). We reviewed the use of real-time/near-real-time GIS for emergency and epidemics management, with examples from the 2003 SARS outbreak, and somewhat detailed reviews of the Real-time Outbreak and Disease Surveillance system (RODS) from the US and two large-scale environmental surveillance projects from the UK. Such applications currently involve limited SDI-like arrangements, and would certainly benefit from the development of mature SDIs in their respective regions.

The dream remains to develop a universal multivariate surveillance system that can collect, analyse and interpret health-related information worldwide using modern information infrastructures for the global prevention of a wide range of health problems, or at least the early detection of such problems in order to mitigate their effects. GIS technologies and services that can function proactively in real time are extremely and critically important to realise this global public health surveillance vision (and indeed any smaller-scale surveillance services). Such surveillance services also require a sound and comprehensive spatial health data and information infrastructure to be built and maintained in a coherent way at all operation levels.

As the reader might have noticed, there are many requirements, e.g., standards and security, and ingredients of success common to both the nation-wide implementation of integrated electronic health and social care records and the building of a national spatial health information infrastructure. Both development directions are closely interrelated. In fact, properly implemented electronic health and social care records are always required (in aggregated form) as a key data source within a national spatial health information infrastructure.

References


Articles from International Journal of Health Geographics are provided here courtesy of BioMed Central