- GIS redirects here. For other meanings, see GIS (disambiguation).
A geographic information system (GIS) is a system for managing spatial data and associated attributes. In the strictest sense, it is a computer system capable of integrating, storing, editing, analyzing, and displaying geographically referenced information. In a more generic sense, a GIS is a "smart map" tool that allows users to create interactive queries (user-created searches), analyze spatial information, and edit data.
Geographic information systems technology can be used for scientific investigations, resource management, asset management, development planning, cartography, and route planning. For example, a GIS might allow emergency planners to easily calculate emergency response times in the event of a natural disaster, or a GIS might be used to find wetlands that need protection from pollution.
History of development
About 17,000 years ago, on the walls of caves near Lascaux, France, Cro-Magnon hunters drew pictures of the animals they hunted. Associated with the animal drawings are track lines and tallies thought to depict migration routes. These early records followed the two-element structure of modern geographic information systems: a graphic file linked to an attribute database.
In the 1700s modern surveying techniques for topographic mapping were implemented, along with early versions of thematic mapping, e.g. for scientific or census data.
The early 20th century saw the development of "photo lithography" where maps were separated into layers. Computer hardware development spurred by nuclear weapon research would lead to general purpose computer "mapping" applications by the early 1960s.
The year 1967 saw the development of the world's first true operational GIS in Ottawa, Ontario, by the federal Department of Energy, Mines and Resources. Developed by Roger Tomlinson, it was called "Canadian GIS" (CGIS) and was used to store, analyse, and manipulate data collected for the Canada Land Inventory (CLI), an initiative to determine the land capability for rural Canada by mapping various information on soils, agriculture, recreation, wildlife, waterfowl, forestry, and land use at a scale of 1:250,000. A rating classification factor was also added to permit analysis.
CGIS was the world's first "system" and an improvement over "mapping" applications: it provided capabilities for overlay, measurement, and digitizing/scanning; supported a national coordinate system that spanned the continent; coded lines as "arcs" having a true embedded topology; and stored the attribute and locational information in separate files. Its developer, geographer Roger Tomlinson, has become known as the "father of GIS."
CGIS lasted into the 1990s and built the largest digital land resource database in Canada. It was developed as a mainframe-based system in support of federal and provincial resource planning and management. Its strength was continent-wide analysis of complex data sets. CGIS was never available in a commercial form. Its initial development and success stimulated various commercial mapping applications sold by vendors such as Intergraph. The development of microcomputer hardware spurred vendors such as ESRI and CARIS to successfully incorporate many of the CGIS features, combining the first-generation approach of separating spatial and attribute information with a second-generation approach of organizing attribute data into database structures. Industry growth in the 1980s and 1990s was spurred on by the growing use of GIS on UNIX workstations and the personal computer. By the end of the 20th century, the rapid growth in various systems had been consolidated and standardized on relatively few platforms, and users were beginning to explore the concept of viewing GIS data over the Internet, which requires data format and transfer standards.
Techniques used in GIS
Relating information from different sources
If you could relate information about the rainfall of your state to aerial photographs of your county, you might be able to tell which wetlands dry up at certain times of the year. A GIS, which can use information from many different sources in many different forms, can help with such analyses. The primary requirement for the source data is that the locations of the variables are known. Location may be annotated by x, y, and z coordinates of longitude, latitude, and elevation, or by other geocode systems such as ZIP Codes or highway mile markers. Any variable that can be located spatially can be fed into a GIS. Government agencies and non-government organizations produce several computer databases that can be entered directly into a GIS. Different kinds of data in map form can also be entered into a GIS.
A GIS can also convert existing digital information, which may not yet be in map form, into forms it can recognize and use. For example, digital satellite images generated through remote sensing can be analyzed to produce a map-like layer of digital information about vegetative covers.
Likewise, census or hydrologic tabular data can be converted to map-like form, serving as layers of thematic information in a GIS.
GIS data represent real-world objects (roads, land use, elevation) with digital data. Real-world objects can be divided into two abstractions: discrete objects (a house) and continuous fields (rainfall amount or elevation). There are two broad methods used to store data in a GIS for both abstractions: raster and vector.
The raster data type consists of rows and columns of cells, with a single value stored in each cell. Most often, raster data are images, but besides just color, the value recorded for each cell may be a discrete value, such as land use, a continuous value, such as rainfall, or a null value if no data are available. While a raster cell stores a single value, it can be extended by using raster bands to represent RGB (red, green, blue) colors, colormaps (a mapping between a thematic code and RGB value), or an extended attribute table with one row for each unique cell value. The resolution of the raster dataset is its cell width in ground units. For example, one cell of a raster image may represent one meter on the ground. Usually cells represent square areas of the ground, but other shapes can also be used.
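The raster structure described above can be sketched in a few lines. This is a minimal illustration, not how any particular GIS stores rasters; the land-use codes, cell size, and origin coordinates are all hypothetical values chosen for the example.

```python
# A tiny raster: rows and columns of cells, one value per cell.
# None marks a cell with no data. Codes are hypothetical land-use classes.
land_use = [
    [1, 1, 2, 2],
    [1, 3, 3, 2],
    [None, 3, 3, 2],
]

CELL_SIZE = 30.0                           # resolution: cell width in ground units (assumed metres)
ORIGIN_X, ORIGIN_Y = 500000.0, 4100000.0   # upper-left corner (assumed)

# Extended attribute table: one row for each unique cell value.
LAND_USE_NAMES = {1: "forest", 2: "water", 3: "urban"}


def value_at(x, y):
    """Return the cell value at ground coordinate (x, y)."""
    col = int((x - ORIGIN_X) // CELL_SIZE)
    row = int((ORIGIN_Y - y) // CELL_SIZE)   # rows count downward from the top edge
    if not (0 <= row < len(land_use) and 0 <= col < len(land_use[0])):
        raise ValueError("coordinate lies outside the raster")
    return land_use[row][col]


def class_area(code):
    """Ground area covered by one class: cell count times cell area."""
    cells = sum(row.count(code) for row in land_use)
    return cells * CELL_SIZE ** 2
```

Looking up a coordinate is plain arithmetic on the origin and cell size, which is part of why raster operations tend to be simple to implement.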
The vector data type uses geometries such as points, lines (series of point coordinates), or polygons, also called areas (shapes bounded by lines), to represent objects. Examples include property boundaries for a housing subdivision represented as polygons and well locations represented as points. Vector features can be made to respect spatial integrity through the application of topology rules such as "polygons must not overlap". Vector data can also be used to represent continuously varying phenomena. Contour lines and triangulated irregular networks (TINs) are used to represent elevation or other continuously changing values. TINs record values at point locations, which are connected by lines to form an irregular mesh of triangles. The faces of the triangles represent the terrain surface.
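As a rough illustration of vector features, the sketch below represents a property parcel as a polygon and a well as a point, then computes the parcel's area and tests whether the well lies inside it. The coordinates and attribute names are invented for the example, and the shoelace and ray-casting routines are standard textbook methods rather than anything prescribed by a specific GIS.

```python
# A polygon is a list of (x, y) vertices; the last vertex connects back to the first.
parcel = {
    "geometry": [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)],
    "attributes": {"parcel_id": 17, "owner": "hypothetical"},
}
well = {"geometry": (1.0, 1.0), "attributes": {"depth_m": 40}}


def polygon_area(vertices):
    """Area of a simple polygon by the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_contains(vertices, point):
    """Ray-casting test: count edge crossings of a ray going right from the point."""
    x, y = point
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        if (y1 > y) != (y2 > y):                      # edge spans the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside
```

Points lying exactly on a boundary are not treated specially here; production code would need an explicit rule for that case.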
There are advantages and disadvantages to using a raster or vector data model to represent reality. Raster datasets record a value for every point in the area covered, which may require more storage space than a vector format that stores data only where needed. Raster data also allow easy implementation of overlay operations, which are more difficult with vector data. Vector data can be displayed as the vector graphics used on traditional maps, whereas raster data appear as an image in which object boundaries may look blocky.
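The ease of raster overlay comes from the fact that two rasters with the same extent and cell size line up cell by cell. A minimal sketch, using made-up wetland and rainfall grids and an arbitrary 500 mm threshold:

```python
# Two hypothetical rasters covering the same area with the same cell size.
wetland = [            # 1 = wetland, 0 = not wetland
    [0, 1, 1],
    [0, 1, 0],
    [1, 1, 0],
]
rainfall_mm = [        # annual rainfall per cell; None = no data
    [900, 400, 350],
    [800, 450, None],
    [300, 700, 600],
]


def overlay(a, b, combine):
    """Combine two rasters cell by cell with the given function."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("rasters must have the same shape")
    return [[combine(va, vb) for va, vb in zip(ra, rb)] for ra, rb in zip(a, b)]


def dry_wetland(is_wetland, rain):
    """1 where a wetland cell receives under 500 mm (threshold is an assumption)."""
    if rain is None:
        return None            # propagate missing data
    return 1 if is_wetland == 1 and rain < 500 else 0


at_risk = overlay(wetland, rainfall_mm, dry_wetland)
```

The equivalent vector overlay would require intersecting polygon boundaries, which is considerably more involved.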
Additional non-spatial data can also be stored besides the spatial data represented by the coordinates of a vector geometry or the position of a raster cell. In vector data, the additional data are attributes of the object. For example, a forest inventory polygon may also have an identifier value and information about tree species. In raster data the cell value can store attribute information, but it can also be used as an identifier that can relate to records in another table.
Data capture - entering information into the system - consumes much of the time of GIS practitioners. There are a variety of methods used to enter data into a GIS where it is stored in a digital format.
Existing data printed on paper or mylar maps can be digitized or scanned to produce digital data. A digitizer produces vector data as an operator traces points, lines, and polygon boundaries from a map. Scanning a map results in raster data that could be further processed to produce vector data.
Survey data can be directly entered into a GIS from digital data collection systems on survey instruments. Positions from a global positioning system (GPS), another survey tool, can also be directly entered into a GIS.
Remotely sensed data also play an important role in data collection. Remote sensing systems consist of sensors attached to a platform. Sensors include cameras, digital scanners, and LIDAR, while platforms are usually aircraft and satellites.
The majority of digital data currently comes from photo interpretation of aerial photographs. Soft copy workstations are used to digitize features directly from stereo pairs of digital photographs. These systems allow data to be captured in 2 and 3 dimensions, with elevations measured directly from a stereo pair using principles of photogrammetry. Currently, analog aerial photos are scanned before being entered into a soft copy system, but as high quality digital cameras become cheaper this step will be skipped.
Satellite remote sensing provides another important source of spatial data. Here satellites use different sensor packages to passively measure the reflectance from parts of the electromagnetic spectrum, or to measure radio waves that were sent out from an active sensor such as radar. Remote sensing collects raster data that can be further processed to identify objects and classes of interest, such as land cover.
In addition to collecting and entering spatial data, attribute data is also entered into a GIS. For vector data this includes additional information about the objects represented in the system.
After data are entered into a GIS, they usually require editing to remove errors, or further processing. Vector data must be made "topologically correct" before they can be used for some advanced analysis. For example, in a road network, lines must connect with nodes at an intersection. Errors such as undershoots and overshoots must also be removed. For scanned maps, blemishes on the source map may need to be removed from the resulting raster. For example, a fleck of dirt might connect two lines that should not be connected.
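One simple way to look for connection errors of this kind is to find line endpoints that touch no other line yet sit suspiciously close to an existing node. The sketch below is only a heuristic under stated assumptions: the road coordinates and the tolerance are invented, and a real GIS would also test against line interiors, not only endpoints.

```python
import math
from collections import Counter

# Each line is a list of (x, y) vertices; only the end vertices act as nodes here.
roads = [
    [(0.0, 0.0), (5.0, 0.0)],
    [(5.0, 0.0), (5.0, 5.0)],
    [(5.0, 0.0), (10.0, 0.0)],
    [(4.9, -5.0), (4.9, -0.1)],   # stops just short of the intersection at (5, 0)
]


def probable_undershoots(lines, tolerance):
    """Return (dangling_endpoint, nearby_node) pairs closer than tolerance."""
    counts = Counter(pt for line in lines for pt in (line[0], line[-1]))
    nodes = list(counts)
    suspects = []
    for pt in nodes:
        if counts[pt] != 1:          # shared by several lines: properly connected
            continue
        for other in nodes:
            if other != pt and math.dist(pt, other) <= tolerance:
                suspects.append((pt, other))
                break
    return suspects
```

Legitimate dead ends are left alone because they have no other node within the tolerance; flagged pairs are candidates for snapping, to be confirmed by an operator.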
Data restructuring can be performed by a GIS to convert data into different formats. For example, a GIS may be used to convert a satellite image map to a vector structure by generating lines around all cells with the same classification, while determining the cell spatial relationships, such as adjacency or inclusion.
Since digital data are collected and stored in various ways, two data sources may not be entirely compatible. A GIS must therefore be able to convert geographic data from one structure to another.
Projections, coordinate systems and registration
A property ownership map and a soils map might show data at different scales. Map information in a GIS must be manipulated so that it registers, or fits, with information gathered from other maps. Before the digital data can be analyzed, they may have to undergo other manipulations - projection and coordinate conversions, for example - that integrate them into a GIS.
The earth can be represented by various models, each of which may provide a different set of coordinates (e.g., latitude, longitude, elevation) for any given point on the earth's surface. The simplest model is to assume the earth is a perfect sphere. As more measurements of the earth have accumulated, the models of the earth have become more sophisticated and more accurate. In fact, there are models that apply to different areas of the earth to provide increased accuracy (e.g., North American Datum, 1983 - NAD83 - works well in North America, but not in Europe).
Projection is a fundamental component of map making. A projection is a mathematical means of transferring information from a model of the Earth, which represents a three-dimensional curved surface, to a two-dimensional medium - paper or a computer screen. Different projections are used for different types of maps because each projection particularly suits certain uses. For example, a projection that accurately represents the shapes of the continents will distort their relative sizes.
Since much of the information in a GIS comes from existing maps, a GIS uses the processing power of the computer to transform digital information, gathered from sources with different projections and/or different coordinate systems, to a common projection and coordinate system.
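The trade-off between shape and size can be made concrete with one familiar projection. The sketch below uses the spherical Mercator projection, which the text above does not name; it is chosen here only as an example of a shape-preserving projection, and the sphere radius is an assumed mean value rather than any particular datum.

```python
import math

EARTH_RADIUS_M = 6371000.0   # assumed mean radius; the Earth is treated as a perfect sphere


def mercator(lon_deg, lat_deg, radius=EARTH_RADIUS_M):
    """Project longitude/latitude in degrees to planar x, y in metres."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    x = radius * lam
    y = radius * math.log(math.tan(math.pi / 4.0 + phi / 2.0))
    return x, y


def mercator_scale_factor(lat_deg):
    """Linear stretching at a given latitude; areas are inflated by its square."""
    return 1.0 / math.cos(math.radians(lat_deg))
```

At 60 degrees latitude the scale factor is about 2, so areas appear roughly four times larger than at the equator even though local shapes are preserved.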
Spatial analysis with GIS
It is difficult to relate wetlands maps to rainfall amounts recorded at different points such as airports, television stations, and high schools. A GIS, however, can be used to depict two- and three-dimensional characteristics of the Earth's surface, subsurface, and atmosphere from information points.
For example, a GIS can quickly generate a map with lines that indicate rainfall amounts.
Such a map can be thought of as a rainfall contour map. Many sophisticated methods can estimate the characteristics of surfaces from a limited number of point measurements. A two-dimensional contour map created from the surface modeling of rainfall point measurements may be overlaid and analyzed with any other map in a GIS covering the same area.
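The text does not say which estimation method is used. One of the simplest is inverse distance weighting (IDW), sketched below as an illustration only; the station coordinates and rainfall values are hypothetical, and the exponent of 2 is a common default rather than a recommendation.

```python
import math


def idw(stations, x, y, power=2.0):
    """Estimate a value at (x, y) from (sx, sy, value) point measurements.

    Each station's weight is 1 / distance**power, so nearer stations count more.
    """
    numerator = 0.0
    denominator = 0.0
    for sx, sy, value in stations:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value              # exactly on a station: use its measurement
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Hypothetical rain gauges: (x, y, rainfall in mm)
gauges = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0)]
```

Evaluating `idw` on a regular grid of locations yields a surface from which contour lines can then be drawn.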
In the past 35 years, were there any gas stations or factories operating next to the swamp? Any within two miles and uphill from the swamp? A GIS can recognize and analyze the spatial relationships that exist within digitally stored spatial data. These topological relationships allow complex spatial modelling and analysis to be performed. Topological relationships between geometric entities traditionally include adjacency (what adjoins what), containment (what encloses what), and proximity (how close something is to something else).
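A proximity question like the one above reduces to a distance test plus an attribute comparison. The following sketch assumes sites stored as latitude, longitude, and elevation, uses the haversine great-circle formula on a spherical Earth, and treats "uphill" simply as "higher elevation than the swamp"; all site data are invented.

```python
import math

EARTH_RADIUS_MILES = 3958.8   # assumed mean radius of a spherical Earth


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def uphill_within(sites, target, radius_miles):
    """Names of sites within the radius of the target and at higher elevation."""
    return [
        s["name"]
        for s in sites
        if haversine_miles(s["lat"], s["lon"], target["lat"], target["lon"]) <= radius_miles
        and s["elev_m"] > target["elev_m"]
    ]


swamp = {"name": "swamp", "lat": 37.00, "lon": -122.00, "elev_m": 10}
sites = [
    {"name": "gas station A", "lat": 37.01, "lon": -122.00, "elev_m": 50},
    {"name": "factory B", "lat": 37.02, "lon": -122.00, "elev_m": 5},
    {"name": "factory C", "lat": 37.10, "lon": -122.00, "elev_m": 80},
]
```

Adding a time condition, such as years of operation, would be one more attribute comparison in the same filter.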
If all the factories near a wetland were accidentally to release chemicals into the river at the same time, how long would it take for a damaging amount of pollutant to enter the wetland reserve? A GIS can simulate the routing of materials along a linear network. Values such as slope, speed limit, and pipe diameter can be incorporated into network modelling in order to represent the flow of the phenomenon more accurately. Network modelling is commonly employed in transportation planning, hydrology modelling, and infrastructure modelling.
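A very reduced version of such routing treats the river system as a directed graph whose edges carry a length and a flow speed, and finds the shortest travel time with Dijkstra's algorithm. The network, lengths, and speeds below are invented, and real pollutant transport involves dilution and dispersion that this sketch ignores.

```python
import heapq


def shortest_travel_times(network, source):
    """Dijkstra's algorithm with edge cost = length / speed (hours)."""
    times = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        elapsed, node = heapq.heappop(queue)
        if elapsed > times.get(node, float("inf")):
            continue                          # stale queue entry
        for neighbour, length_km, speed_kmh in network.get(node, []):
            arrival = elapsed + length_km / speed_kmh
            if arrival < times.get(neighbour, float("inf")):
                times[neighbour] = arrival
                heapq.heappush(queue, (arrival, neighbour))
    return times


# node -> list of (downstream node, reach length in km, flow speed in km/h)
river = {
    "factory": [("junction", 2.0, 4.0), ("wetland", 10.0, 4.0)],
    "junction": [("wetland", 3.0, 2.0)],
    "wetland": [],
}
```

Attributes such as slope or pipe diameter would enter through the speed assigned to each edge.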
Powerful analysis techniques with raster data.
Using geostatistics to predict fields from points.
Point pattern analysis.
Geocoding is the calculation of spatial locations (X,Y coordinates) from street addresses. A reference theme, such as a road centerline file with address ranges, is required to geocode individual addresses. The individual address locations are interpolated, or estimated, by examining address ranges along a road segment. These are usually provided in the form of a table or database. The GIS will then place a dot approximately where that address belongs along the segment of centerline. For example, an address point of 500 will be at about the midpoint of a line segment that starts with address 1 and ends with address 1000.
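The interpolation step can be written out directly. This is a minimal sketch assuming a straight road segment stored with its two end coordinates and its address range; the field names are hypothetical.

```python
def geocode(house_number, segment):
    """Estimate (x, y) for a house number by linear interpolation along a segment."""
    low, high = segment["from_address"], segment["to_address"]
    if not low <= house_number <= high:
        raise ValueError("house number outside the segment's address range")
    # Fraction of the way along the segment; a single-address range maps to the middle.
    fraction = 0.5 if high == low else (house_number - low) / (high - low)
    (x1, y1), (x2, y2) = segment["start"], segment["end"]
    return x1 + fraction * (x2 - x1), y1 + fraction * (y2 - y1)


# Hypothetical centerline segment with addresses 1 to 1000
main_street = {
    "from_address": 1,
    "to_address": 1000,
    "start": (0.0, 0.0),
    "end": (999.0, 0.0),
}
```

With this segment, house number 500 lands 499 units along a 999-unit line, which is the near-midpoint position described above.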
Various algorithms are used to help with address matching when the spellings of addresses differ. Address information held by a particular entity or organization, such as the post office, may not entirely match the reference theme. There could be variations in street name spelling, community name, etc. Consequently, the user generally has the ability to make matching criteria more stringent, or to relax those parameters so that more addresses will be mapped. Care must be taken to review the results so that overzealous matching parameters do not map addresses incorrectly.
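The effect of stricter or looser matching criteria can be illustrated with Python's standard `difflib` module, used here as a stand-in for whatever matching algorithm a given GIS applies. The street names are invented, and the `cutoff` similarity threshold plays the role of the adjustable matching parameter.

```python
import difflib

# Street names from a hypothetical reference theme
REFERENCE_STREETS = ["MAIN STREET", "MAPLE STREET", "MILL ROAD"]


def match_street(name, reference_names, cutoff=0.8):
    """Return the best-matching reference name, or None if nothing is similar enough."""
    candidates = difflib.get_close_matches(
        name.upper(), reference_names, n=1, cutoff=cutoff
    )
    return candidates[0] if candidates else None
```

With a cutoff of 0.8 the misspelling "Main Stret" is matched to "MAIN STREET"; raising the cutoff to 0.99 rejects it, and lowering the cutoff too far risks matching unrelated streets.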
Reverse geocoding is the process of returning an estimated street address number as it relates to a given coordinate. For example, a user can click on a road centerline theme (thus providing a coordinate) and have information returned that reflects the estimated house number. This house number is interpolated from a range assigned to that road segment. If the user clicks at the midpoint of a segment that starts with address 1 and ends with 100, the returned value will be somewhere near 50. Note that reverse geocoding does not return actual addresses, only estimates of what should be there based on the predetermined range.
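Reverse geocoding can be sketched as the inverse calculation: project the clicked coordinate onto the segment, then interpolate within the address range. As before, this assumes a straight segment with hypothetical field names.

```python
def reverse_geocode(point, segment):
    """Estimate the house number nearest to a clicked (x, y) coordinate."""
    (x1, y1), (x2, y2) = segment["start"], segment["end"]
    px, py = point
    dx, dy = x2 - x1, y2 - y1
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        fraction = 0.0
    else:
        # Project the point onto the segment and clamp to its two ends.
        fraction = ((px - x1) * dx + (py - y1) * dy) / length_sq
        fraction = max(0.0, min(1.0, fraction))
    low, high = segment["from_address"], segment["to_address"]
    return round(low + fraction * (high - low))


# Hypothetical segment with addresses 1 to 100
elm_street = {
    "from_address": 1,
    "to_address": 100,
    "start": (0.0, 0.0),
    "end": (100.0, 0.0),
}
```

The result is only an estimate derived from the range; nothing in the calculation checks that a building with that number exists.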
Data output and cartography
Cartography is the design and production of maps, or visual representations of spatial data. The vast majority of modern cartography is done with the help of computers, usually using a GIS. Most GIS software gives the user substantial control over the appearance of the data.
Cartographic work serves two major functions:
First, it produces graphics on the screen or on paper that convey the results of analysis to the people who make decisions about resources. Wall maps and other graphics can be generated, allowing the viewer to visualize and thereby understand the results of analyses or simulations of potential events. Web Map Servers facilitate distribution of generated maps via the web.
Second, other database information can be generated for further analysis or use, for instance a list of all addresses within 1 mile of a toxic spill.
Graphic display techniques
Traditional maps are abstractions of the real world, a sampling of important elements portrayed on a sheet of paper with symbols to represent physical objects. People who use maps must interpret these symbols. Topographic maps show the shape of land surface with contour lines; the actual shape of the land can be seen only in the mind's eye.
Today, graphic display techniques such as shading based on altitude in a GIS can make relationships among map elements visible, heightening one's ability to extract and analyze information. For example, two types of data were combined in a GIS to produce a perspective view of a portion of San Mateo County, California.
- The digital elevation model, consisting of surface elevations recorded on a 30-meter horizontal grid, shows high elevations as white and low elevations as black.
- The accompanying Landsat Thematic Mapper image shows a false-color infrared image looking down at the same area in 30-meter pixels, or picture elements, for the same coordinate points, pixel by pixel, as the elevation information.
A GIS was used to register and combine the two images to render the three-dimensional perspective view looking down the San Andreas Fault, using the Thematic Mapper image pixels, but shaded using the elevation of the landforms. The GIS display depends on the viewing point of the observer and time of day of the display, to properly render the shadows created by the sun's rays at that latitude, longitude, and time of day.
- GRASS is the largest and most comprehensive Free Software GIS package.
- FreeGIS maintains a comprehensive list of Free Software GIS applications.
- PostGIS is GIS software that works at the database-server level, allowing geospatial queries on a database.
- MapServer is a web-based mapping server with many features and supported formats.
Commercial or Proprietary
- Caris is a software company that produces focused GIS software for specific markets, particularly hydrographic and cadastral systems.
- MapInfo is a software company that integrates GIS software, data, and services.
- ESRI is a software company located in Redlands, CA, USA. Its software includes ArcGIS, ArcSDE, ArcIMS, and ArcWeb services. It is best known for the ESRI shapefile format, which is often used to supply or transfer GIS data.
- TerraLib is a library of GIS classes and functions, available on the Internet as open source, that provides a collaborative environment for the development of multiple GIS tools.
- Intergraph's Mapping and Geospatial Solutions division, based in Huntsville, AL, USA, develops multiple GIS tools.
The future of GIS
Many disciplines can benefit from GIS techniques. An active GIS market has resulted in lower costs and continual improvements in the hardware and software components of GIS. These developments will, in turn, result in a much wider application of the technology throughout science, government, business, and industry.
Open Geospatial Consortium (OGC)
Global change and climate history program
Maps have traditionally been used to explore the Earth and to exploit its resources. GIS technology, as an expansion of cartographic science, has enhanced the efficiency and analytic power of traditional mapping. Now, as the scientific community recognizes the environmental consequences of human activity, GIS technology is becoming an essential tool in the effort to understand the process of global change. Various map and satellite information sources can combine in modes that simulate the interactions of complex natural systems.
Through a function known as visualization, a GIS can be used to produce images - not just maps, but drawings, animations, and other cartographic products. These images allow researchers to view their subjects in ways that have never been seen before. The images are often equally helpful in conveying the technical concepts of GIS study subjects to non-scientists.
Adding the element of time
The condition of the Earth's surface, atmosphere, and subsurface can be examined by feeding satellite data into a GIS. GIS technology gives researchers the ability to examine the variations in Earth processes over days, months, and years.
As an example, the changes in vegetation vigor through a growing season can be animated to determine when drought was most extensive in a particular region. The resulting graphic, known as a normalized vegetation index, represents a rough measure of plant health. Working with two variables over time would then allow researchers to detect regional differences in the lag between a decline in rainfall and its effect on vegetation.
GIS technology and the availability of digital data on regional and global scales enable such analyses. The satellite sensor output used to generate a vegetation graphic is produced by the Advanced Very High Resolution Radiometer (AVHRR). This sensor system detects the amounts of energy reflected from the Earth's surface across various bands of the spectrum for surface areas of about 1 square kilometer. The satellite sensor produces images of a particular location on the Earth twice a day. AVHRR is only one of many sensor systems used for Earth surface analysis. More sensors will follow, generating ever greater amounts of data.
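The text does not give the formula behind the vegetation index. Assuming it refers to the widely used normalized difference vegetation index (NDVI), the value for each cell is computed from the reflectance in the near-infrared and red bands, as in this minimal sketch with made-up reflectance values:

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one cell.

    nir and red are reflectances in the near-infrared and red (visible) bands.
    Returns None when both are zero, since the ratio is then undefined.
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


def ndvi_raster(nir_band, red_band):
    """Apply ndvi cell by cell to two bands with the same shape."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]
```

Healthy vegetation reflects strongly in the near-infrared and weakly in the red, so values near 1 suggest vigorous growth, while values near 0 or below suggest bare ground, water, or stressed plants. Computing the raster for each date in a season gives the frames of the animation described above.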
GIS and related technology will help greatly in the management and analysis of these large volumes of data, allowing for better understanding of terrestrial processes and better management of human activities to maintain world economic vitality and environmental quality.
References and further reading
See also: cartography, remote sensing, Open GIS Consortium, GRASS GIS, geoinformation, geodesy, geoinformatics
- Berry, J.K. 1993. "Beyond Mapping: Concepts, Algorithms and Issues in GIS". Fort Collins, CO: GIS World Books.
- Heywood, I., Cornelius, S., and Carver, S. 2002. An Introduction to Geographical Information Systems. Addison Wesley Longman. 2nd edition.
- Longley, P.A., Goodchild, M.F., Maguire, D.J. and Rhind, D.W. (2005): Geographic Information Systems and Science. Chichester: Wiley. 2nd edition.
- Wise, S. 2002. "GIS Basics". London: Taylor & Francis.
- Worboys, Michael, and Matt Duckham. 2004. GIS: a computing perspective. Boca Raton: CRC Press. 
Last updated: 08-18-2005 03:28:30