The following video provides a very brief introduction to what GIS is and why it is a useful tool for biologists to use in their research. More detailed information is provided in the text below.
What Is GIS?
A Geographical Information System (GIS for short) is a database that allows the user to explore spatial relationships within and between data sets. The basic concept of GIS is very simple. Normal databases primarily consist of a series of tables which can be linked together so that the data within them can be extracted, compared or manipulated based on the values in their fields or columns. However, normal databases have great difficulty manipulating data in a spatial context. For example, while you can use a normal database to link data on species occurrence to the temperature for the month in which it was recorded (if you have such data available), it is not as easy to link it to temperature data from a nearby location because the database cannot easily work out which temperature data point is closest to the location where the species was recorded. In contrast, a GIS, as well as consisting of a series of tables, also contains information on the spatial distribution of the data. As a result, in a GIS, it is much easier to link data within the tables based on their spatial relationships (figure 1). For example, the nearest temperature data point to any point where a species was recorded can be identified and linked to it. It is this power to compare and manipulate data based on their spatial relationships which makes GIS such a powerful tool for biologists.
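To make the nearest-point idea concrete, here is a minimal sketch in plain Python of linking each species record to its closest temperature data point. The sighting names, coordinates and temperatures are invented for illustration, and the great-circle (haversine) distance is only one reasonable choice of distance measure; GIS software provides this kind of join as a built-in tool.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Hypothetical species records: (record name, latitude, longitude)
sightings = [("seal_01", 57.10, -2.05), ("seal_02", 57.60, -1.80)]

# Hypothetical temperature data points: (latitude, longitude, temperature in deg C)
stations = [(57.00, -2.00, 9.5), (57.70, -1.90, 8.1), (56.50, -2.50, 10.2)]

def nearest_station(lat, lon, stations):
    """Return the temperature data point closest to the given location."""
    return min(stations, key=lambda s: haversine_km(lat, lon, s[0], s[1]))

# Link each record to the temperature at its nearest data point
linked = [(name, nearest_station(lat, lon, stations)[2]) for name, lat, lon in sightings]
print(linked)  # [('seal_01', 9.5), ('seal_02', 8.1)]
```

An ordinary database has no notion of "closest", which is why this link needs the coordinates as well as the tables.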
How Does A GIS Database Work?
The information on spatial distribution is held in the GIS as a series of data layers (analogous to individual tables within a normal database). These data layers not only hold the attributes for each record in them, but also hold information which defines the area of the Earth that each layer represents, and the size, shape and position of any features within it. Based on this information, specialist GIS software can work out how features in different data layers relate to each other. For example, a GIS could be used to work out the values of different habitat or environmental variables, such as water depth, altitude or temperature, at the locations where a particular species was seen.
As a result, while tables are the key component of normal databases, data layers are the key component of a GIS. Each data layer holds a specific source of data represented in a specific way. For example, one data layer may contain information on the locations where a particular species was recorded during a survey, which might be represented by a series of points, while another may contain information about the route of the survey itself, represented as a line. Others might then contain information about the local environment, such as elevation, temperature and land cover type, represented in a variety of other ways.
By adding data layers that contain the specific information you are interested in, you can start investigating the spatial relationships between them. For example, if you wish to know the altitudes at which different species are found, you can add one data layer which contains information on the locations where each species was recorded, and another which contains information on altitude. You can then compare the spatial relationships of these two sources of data to see which species occur at which altitudes.
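The altitude example can be sketched as a point layer being compared with a gridded altitude layer. Everything below is an illustrative assumption rather than how any particular GIS package stores its data: the grid values, the 100 m cell size, the corner coordinates and the species records are all made up.

```python
# Hypothetical altitude layer: rows run north to south, columns west to east.
altitude = [
    [120, 180, 260],
    [90, 150, 220],
    [40, 80, 130],
]
X_MIN, Y_MAX = 0.0, 300.0  # coordinates of the top-left corner, in metres
CELL = 100.0               # cell size in metres

def value_at(x, y):
    """Return the altitude of the grid cell containing (x, y), or None if outside."""
    col = int((x - X_MIN) // CELL)
    row = int((Y_MAX - y) // CELL)
    if 0 <= row < len(altitude) and 0 <= col < len(altitude[0]):
        return altitude[row][col]
    return None

# Hypothetical species layer: (species, x, y) in the same coordinate system
records = [
    ("species_a", 50.0, 250.0),
    ("species_a", 250.0, 50.0),
    ("species_b", 150.0, 150.0),
]

# Collect the altitude found under each record, grouped by species
altitudes_by_species = {}
for species, x, y in records:
    altitudes_by_species.setdefault(species, []).append(value_at(x, y))

print(altitudes_by_species)  # {'species_a': [120, 130], 'species_b': [150]}
```

The two layers never share a key column; they are related only through their positions, which is the information a normal database lacks.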
A GIS is generally created using specialist GIS software, and such software usually provides a series of tools which allow you not only to create, manipulate and edit data layers, but also to investigate the spatial relationships between them in a variety of ways. Therefore, the GIS software that you use is a key component of any GIS project. Different GIS software packages may contain different tools, and some are better at some tasks than others. As a result, it is important that, where possible, you choose GIS software which is appropriate to your requirements. For novice biological GIS users, we recommend starting with a package called QGIS (you can find out more about why we recommend starting with QGIS here).
Why Is GIS Useful In Biological Research?
While you might only be interested in learning how to create nice maps and figures for presentations, reports and publications (and there is nothing wrong with that), it is worth remembering that many biological research projects inherently have a spatial component that is worth exploring. This can range from the distribution of sampling sites to survey tracks, capture locations, movements of individual animals and information about the distribution of specific habitat types.
As a result, many biological research projects would benefit from the creation of a GIS to explore spatial relationships within and between the data. In particular, while some projects can be done without using a GIS, many will be greatly enhanced by using it (click here for some examples of research projects which have used GIS). The very act of creating a GIS will make you think about the spatial relationships within your data, and will help you formulate hypotheses to test or suggest new ones to explore.
In addition, thinking about your data in a spatial manner will help you identify potential spatial issues and/or biases in your data. For example, plotting the spatial distribution of sampling sites may help you see whether sites which are closer to each other are more similar than those which are further apart. This is something which may not be clear if you only look at your data in a simple spreadsheet or database format. If such patterns exist in your data, and are driven by factors other than those you are studying, this may mean that you have something called spatial autocorrelation, which violates the assumption of independent samples made by many statistical techniques. In this case, you will need to account for it in your analysis. Similarly, by plotting the locations where a specific species of animal has been recorded over different habitat variables, you can start to develop ideas about what is important for determining the distribution of that species. While this has always been easy to do for things like water depth or altitude by plotting the locations on a paper chart, a GIS lets you look at a wider range of variables, including ones which are not as clearly represented on paper maps or charts (such as the gradient or aspect of the local topography).
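One common way to put a number on whether nearby sites are more similar than distant ones is Moran's I. The text above does not prescribe a particular statistic, so treat the following as a rough sketch under stated assumptions: the site coordinates and values are invented, and two sites are counted as neighbours simply when they lie within an assumed 1.5 km of each other.

```python
import math

# Hypothetical sampling sites: (x, y, measured value), coordinates in km
sites = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (2.0, 0.0, 3.0), (3.0, 0.0, 4.0)]
NEIGHBOUR_KM = 1.5  # assumed distance within which two sites count as neighbours

def morans_i(sites, max_dist):
    """Moran's I with binary weights: 1 for site pairs within max_dist, else 0."""
    n = len(sites)
    mean = sum(v for _, _, v in sites) / n
    dev = [v - mean for _, _, v in sites]
    cross = 0.0    # sum over neighbouring pairs of dev_i * dev_j
    total_w = 0.0  # sum of all weights
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if math.dist(sites[i][:2], sites[j][:2]) <= max_dist:
                cross += dev[i] * dev[j]
                total_w += 1.0
    # Raises ZeroDivisionError if no sites are neighbours or all values are equal
    return (n / total_w) * (cross / sum(d * d for d in dev))

i_value = morans_i(sites, NEIGHBOUR_KM)
print(round(i_value, 3))  # 0.333
```

Values well above zero suggest that neighbouring sites tend to have similar values, which is the pattern to watch for before applying statistical tests that assume independent samples.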
GIS can also be used to make measurements and carry out calculations which would otherwise be very difficult. For example, a GIS can be used to work out how much of your study area consists of a specific habitat type, or how much of it is over 1,000 m high, or has a gradient greater than 5°, and so on. Similarly, a GIS can be used to calculate the size of the home range of an individual, the total area occupied by a specific species, how long your survey tracks are, or how much survey effort was put into different parts of your study area.
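Two of these measurements are simple enough to sketch by hand. The example below assumes a made-up elevation grid in which every cell covers 1 km², and a survey track whose vertices are already in projected coordinates measured in km, so that straight-line distances are meaningful.

```python
import math

# Hypothetical elevation grid (metres); each cell is assumed to cover 1 km^2
elevation = [
    [850, 990, 1120],
    [930, 1040, 1210],
    [1005, 1150, 1300],
]
CELL_AREA_KM2 = 1.0

# How much of the study area is over 1,000 m high?
cells = [value for row in elevation for value in row]
high_cells = sum(1 for value in cells if value > 1000)
area_high_km2 = high_cells * CELL_AREA_KM2
fraction_high = high_cells / len(cells)

# How long is the survey track? Vertices are (x, y) in km.
track = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
track_length_km = sum(math.dist(a, b) for a, b in zip(track, track[1:]))

print(area_high_km2, round(fraction_high, 2), track_length_km)  # 6.0 0.67 11.0
```

With latitude and longitude coordinates the track length would need a great-circle distance instead of `math.dist`, which is one reason GIS software keeps track of the coordinate system of every layer.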
In addition, you can calculate new variables within a GIS. For example, you can use a GIS to calculate slope and aspect of the local topography from elevation or altitude information. Similarly, in the marine environment you can estimate where fronts are by analysing how similar the water temperatures are at one location to those that surround it.
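As a rough illustration of deriving slope and aspect from elevation, the sketch below uses simple central differences on a small invented grid with an assumed 10 m cell size. GIS packages typically use more elaborate neighbourhood methods, so this shows the principle rather than reproducing any package's results.

```python
import math

# Hypothetical elevation grid (metres): rows run north to south, columns west to east.
# The surface rises steadily towards the east.
elevation = [
    [100.0, 110.0, 120.0],
    [100.0, 110.0, 120.0],
    [100.0, 110.0, 120.0],
]
CELL = 10.0  # assumed cell size in metres

def slope_aspect(grid, row, col, cell):
    """Slope (degrees) and downslope compass bearing (degrees) for an interior cell."""
    dz_dx = (grid[row][col + 1] - grid[row][col - 1]) / (2 * cell)  # change towards east
    dz_dy = (grid[row - 1][col] - grid[row + 1][col]) / (2 * cell)  # change towards north
    slope = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    if slope == 0.0:
        return slope, None  # aspect is undefined on flat ground
    # Bearing of the downhill direction, measured clockwise from north
    aspect = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0
    return slope, aspect

slope_deg, aspect_deg = slope_aspect(elevation, 1, 1, CELL)
print(round(slope_deg, 1), round(aspect_deg, 1))  # 45.0 270.0
```

Here the ground climbs 10 m for every 10 m travelled east, giving a 45° slope that faces west (a bearing of 270°).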
GIS can also be used to link data together in the way that is needed for statistical analysis. For example, many statistical packages require all your data to be in a single table, with one row per sample and the information about that sample, and the location where it came from, in different columns or fields. A GIS provides you with a way to easily create such tables and populate them with information from other data sets, such as the altitude at each location, the gradient of the slope and the direction it faces. This makes preparing your data for statistical analysis much simpler.
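The kind of single table described above can be sketched as follows. The sample records, the three environmental layers and their values are all hypothetical, and the layers are simplified to dictionaries keyed by 100 m grid cell; in practice the GIS would do this lookup against real data layers.

```python
import csv
import io

# Hypothetical samples: one dictionary per sample, with its location in metres
samples = [
    {"sample_id": "S1", "x": 50.0, "y": 250.0, "count": 4},
    {"sample_id": "S2", "x": 250.0, "y": 50.0, "count": 0},
]

# Hypothetical layers, keyed by grid cell (column, row) counted from the bottom-left corner
layers = {
    "altitude_m": {(0, 2): 120, (2, 0): 130},
    "slope_deg": {(0, 2): 6.5, (2, 0): 2.1},
    "aspect_deg": {(0, 2): 180, (2, 0): 95},
}
CELL = 100.0  # assumed cell size in metres

def cell_of(x, y):
    """Grid cell (column, row) containing the point (x, y)."""
    return (int(x // CELL), int(y // CELL))

# Build one row per sample, adding one column per layer
rows = []
for sample in samples:
    row = dict(sample)
    for name, layer in layers.items():
        row[name] = layer.get(cell_of(sample["x"], sample["y"]))
    rows.append(row)

# Write the table as CSV text, ready to load into a statistical package
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
table_csv = buffer.getvalue()
print(table_csv)
```

Each row now carries both the biological measurement and the environmental values at that location, which is the layout most statistical packages expect.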
Finally, while GIS is mostly used for displaying and/or analysing data, a GIS can provide important information when deciding where and how to collect your data in the first place. Given that collecting biological data is often complex and expensive, it is usually important to get your data collection right the first time. A bit of planning can go a long way to ensuring that your research is successful. There is nothing worse than spending all your time and money collecting lots of data only to find out when you come to analyse it that you are missing a vital bit of information or coverage. For example, you may find that you have not sampled the right range of altitudes, or that you have not sampled a specific part of your study area properly, or that you missed out a location which has a specific combination of different variables (click here for a case study on using GIS to investigate survey coverage), or that your sampling locations were too close together or too far apart. While creating a GIS at the planning stage will not banish such possibilities completely, if done correctly, it will help you reduce, to a minimum, the risk of these issues cropping up at a later date.
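As a small planning example, the sketch below checks whether a set of planned sampling sites covers the range of altitudes present in the study area. The altitude values, the planned sites and the 100 m band width are all assumptions chosen for illustration.

```python
# Hypothetical altitudes (metres), one value per grid cell of the study area
study_area_altitudes = [40, 80, 130, 90, 150, 220, 120, 180, 260]

# Hypothetical altitudes at the planned sampling sites
planned_site_altitudes = [45, 85, 125, 140]

BAND = 100  # assumed width of each altitude band in metres

def bands(values):
    """Set of altitude bands (identified by their lower bound) that contain a value."""
    return {int(v // BAND) * BAND for v in values}

# Bands that exist in the study area but contain no planned sampling site
unsampled = sorted(bands(study_area_altitudes) - bands(planned_site_altitudes))
print(unsampled)  # [200]
```

In this made-up case nothing between 200 m and 300 m would be sampled, which is far cheaper to discover before the fieldwork than after it.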