By Wesley J. English; Committee: Evan Harrington, Chair; Nancy Zarse, Second Reader

Abstract

The current study documents the development of GeoProfile, a new geographic profiling software system, and tests its reliability against CrimeStat and Dragnet. A set of 55 serial offenders from Baltimore County, MD, was analyzed with GeoProfile, CrimeStat, and Dragnet using search cost and error distance as measures of accuracy. GeoProfile was found to be non-inferior to CrimeStat and Dragnet when measured by search cost and non-inferior to Dragnet when measured by error distance. The hypothesis that GeoProfile is as accurate as CrimeStat and Dragnet was supported by the finding that GeoProfile was non-inferior to at least one program in each category of accuracy measures.

Introduction

Although the field of geographic profiling exists to help law enforcement officers locate serial offenders, investigators have yet to widely embrace geographic profiling. This could be due to the novelty of the concepts. Because the field is fairly new, the jury is still out on whether geographic profiling can significantly contribute to an investigation; nevertheless, the only way to answer this question is through increased use of geographic profiling by law enforcement (Paulsen, 2006a). Investigators do not have to master the concepts of geographic profiling to benefit from its findings. Geographic profiling software distills the research into a practical tool that provides pragmatic strategies for focusing an investigation. However, a very small percentage of law enforcement agencies use one of the three available software applications. Part of the problem may be the applications themselves. Each of the existing programs exhibits one of two limitations: either the application is (a) too costly to purchase and implement or (b) too difficult and time-consuming for everyday use.

The same problems may also limit research. Although Rossmo makes his commercial application, Rigel Analyst, available at no cost to researchers, he decides which projects are eligible (P. MacLaren, personal communication, April 8, 2008). As a result, Rigel Analyst is unavailable for use in research that Rossmo does not consider valuable. Conversely, research using the two non-commercial applications, CrimeStat and Dragnet, may be limited by their unnecessary complexity. One of the most common critiques of geographic profiling studies is their small sample sizes (Paulsen, 2006a). Although this is due in part to the difficulty of finding good data, the time and effort required to run large data sets through these complicated programs may also restrict sample sizes.

Finally, research is limited by having only a few geographic profiling applications in existence. Research into the more advanced geographic profiling strategies can be divided into two categories. The first type of research deals with how geographic profiling strategies, as implemented by software, perform and can be applied to actual law enforcement investigations. The second type of research deals with the geographic profiling strategies themselves. As long as researchers have access to geographic profiling software, they can test hypotheses in the first category. However, only researchers who have developed their own applications are able to experiment with geographic profiling strategies. Consequently, a new geographic profiling system affords an additional researcher the opportunity to experiment with current and new approaches that are too complex to perform without computer assistance.

To address these issues, a new geographic profiling software system was developed around the principles of affordability and simplicity. The purpose of this study is to document the development of GeoProfile and establish its reliability through comparison with two existing applications: CrimeStat and Dragnet.


Literature Review

Geographic profiling comes out of the larger context of location theory, which seeks to find the optimal location for a given distribution of events or demographics over a particular area (Levine, 2004). Location theory negotiates the demand side of human behavior with the supply side of resources. For instance, customers seek to maximize value while minimizing travel costs. Retail chains use this principle of location theory to determine the best distribution of stores across a region (Levine, 2004). Several similar applications of location theory have been used over the last century, and researchers applied the theory to the ecology of criminal behavior as early as 1927 (Levine, 2004). Since then, researchers have noticed patterns in distributions of crime scenes that paralleled the principles of location theory. In this view, criminals are seen as customers shopping for targets of their particular crime; offenders seek to maximize the value of their crime while minimizing travel costs and the risk of apprehension. However, criminals may not think in these terms. Instead, this potentially unconscious thinking may be manifested in terms of convenience and familiarity with the area (Brantingham & Brantingham, 1984).

Turner (1969) found that delinquent burglars in Philadelphia tended to commit crimes close to their residences. Furthermore, he noticed that the frequency of their offenses decreased as the distance from their homes increased (Turner, 1969). This pattern, called the distance decay model because the probability of committing an offense “decays” with distance, played a major role in the development of the predictive models of geographic profiling. He also recognized that the pattern could be expressed mathematically in a number of ways. LeBeau (1987) also discovered consistent patterns in spatial analyses of rapists. Specifically, he found that serial rapists repeatedly used the same geographic areas and tended to commit a subsequent rape within half a mile of the last assault (LeBeau, 1987).

Turner (1969) also found that although the frequency of burglaries peaked a block from an offender’s residence, very few occurred in the intermediate area. He referred to this area as a “safety zone,” also called a “buffer zone,” and hypothesized it existed because offenders feared being recognized and caught close to their homes (Turner, 1969). Researchers later generalized the buffer zone to other types of crimes during the development of predictive models (Rossmo, 2000). However, Levine (2004) found this generalization premature, hypothesizing that the lack of crime scenes close to an offender’s home could be due to a lack of opportunity in the intervening area. Turner’s study involved burglars who skipped over the houses closest to their own, but some other types of crime do not have targets in residential areas. Unless a bank robber lives next door to a number of banks, he necessarily has to travel a certain distance away from home before finding a bank to rob. Furthermore, the buffer zone may not always be limited to a single area around an offender’s current residence. After the resolution of a serial rapist case, Strangeland (2005) found that a buffer zone around a former home had distorted the shape of the crime scene distribution; the offender stated in an interview that he avoided the area out of fear of being recognized by his former neighbors.

Brantingham and Brantingham (1981) demonstrated the cognitive processes behind the patterns observed in crime distributions. They applied the concept of cognitive mapping to criminals’ interaction with their environments. Cognitive mapping is the process by which people construct and use subjective mental representations of the objective surrounding geography (Brantingham & Brantingham, 1981). Cognitive maps can be understood as a hierarchy of subjective “spaces,” or subjective representations of specific areas. Within a person’s total geographic knowledge are “awareness spaces,” or areas of which the individual has more detailed geographic knowledge (Paulsen & Robinson, 2004). Within awareness spaces are “activity spaces,” or areas that an individual frequents during routine activities (Paulsen & Robinson, 2004). The interaction of awareness or activity spaces with areas containing potential targets forms “opportunity spaces,” which explain the crime distribution for any given serial offender (Brantingham & Brantingham, 1984).

Although offenders live within their activity spaces by definition, they do not necessarily live within their opportunity spaces. Offenders may become aware of potential targets in areas around other “anchor points,” or frequent centers of activity, in their activity space (Warren, Reboussin, Hazelwood, Cummings, Gibbs, & Trumbetta, 1998). (A serial offender’s home is considered an anchor point, but is often distinguished from other anchor points through the terms “home base” and “home location.”) Consequently, each crime scene is a sample of the offender’s opportunity space, which allows for a partial reconstruction of the offender’s activity or awareness space (Brantingham & Brantingham, 1984).

Canter and Larkin (1993) observed that the crime distribution of some serial offenders seemed to have no relationship to their anchor points. As a result, they proposed another space called the “criminal range.” They created a binary classification to describe how serial offenders operate within their criminal range. They defined “marauders” as offenders who committed crimes outward from their anchor points as described by the Brantingham and Brantingham research (Canter & Larkin, 1993). Conversely, they defined “commuters” as offenders who travel a significant distance to commit crimes over an area that has little to no overlap with their activity space (Canter & Larkin, 1993).

Predictive Models

The above research described patterns observed in crime distributions and the cognitive process that led to their formation. Given a serial offender’s home location and anchor points, researchers could identify which areas the offender would likely target. Researchers took this descriptive model and inverted it to create the predictive models that make up geographic profiling (Canter, 2003; Rossmo, 2000). A number of models were developed, ranging from the very simple to the very complex, and are covered below. The models are divided into two categories based on their overall strategy. Spatial strategies use the geometric characteristics of the crime distribution to make a single-point prediction. Probability strategies use distance decay functions to calculate a contoured density map with prioritized search areas.

Spatial Strategies

Canter and Larkin (1993) found that 87% of a sample of serial rapists lived inside a circle with a diameter equal to the distance between the two crime scenes farthest apart. However, this pattern, which they called the “circle hypothesis,” barely narrows the search area. Moreover, investigators were already in the practice of looking within the “convex hull,” the polygon created by connecting the outermost crime scenes, for the offender’s residence. The technique was modified to make a single-point prediction by designating the center of the circle as the anchor point (Snook, Canter, Bennell, & Taylor, 2002). Because the circle hypothesis incorporates only a fraction of the knowledge about serial offender behavior, it remains the simplest model in geographic profiling and is used more often as a control method.
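The circle hypothesis and its single-point variant can be illustrated with a minimal Python sketch. The function names and the sample coordinates are hypothetical and are not taken from any of the software discussed here; crime scenes are assumed to be planar (x, y) pairs in kilometers.

```python
from itertools import combinations
from math import dist


def circle_hypothesis(scenes):
    """Return (center, radius) of the circle whose diameter joins the
    two crime scenes that are farthest apart."""
    a, b = max(combinations(scenes, 2), key=lambda pair: dist(*pair))
    center = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
    return center, dist(a, b) / 2


def inside_circle(point, center, radius):
    """True if the point lies within or on the circle."""
    return dist(point, center) <= radius


# Hypothetical crime series (x, y) in kilometers.
scenes = [(0.0, 0.0), (4.0, 0.0), (1.0, 2.0), (3.0, -1.0)]
center, radius = circle_hypothesis(scenes)
print(center, radius)
```

The center returned by the sketch is the single-point prediction; the circle itself is the search area implied by the original hypothesis.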

The next two models are more complex in the sense that they predict an anchor point based on the shape and density of the crime distribution. The “centroid hypothesis” uses the average value of the crime scene coordinates, placing the anchor point in the weighted center of the crime scenes (Canter, Coffey, Huntley, & Missen, 2000). Consequently, if a crime distribution is denser in one area, the anchor point will be closer to that area. The “center of minimum distances” produces a similar result but is more sensitive to density and distance. Each point on the map is given a value equal to the sum of the distances to all crime scenes. The anchor point is the coordinate with the smallest sum. As a result, the point closest to all the crime scenes is designated the anchor point (Snook, Zito, Bennell, & Taylor, 2005). By using a “center of gravity” methodology to find the anchor point, both strategies model the hypothesis that serial offenders are more likely to commit crimes in their activity spaces (Canter et al., 2000).
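The two center of gravity strategies can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used by any existing program: the center of minimum distances is approximated by a brute-force search over a grid covering the bounding box of the crime scenes, and the grid step of 0.1 is an arbitrary choice.

```python
from math import dist


def centroid(scenes):
    """Mean of the crime scene coordinates."""
    n = len(scenes)
    return (sum(x for x, _ in scenes) / n, sum(y for _, y in scenes) / n)


def total_distance(point, scenes):
    """Sum of the straight-line distances from a point to every scene."""
    return sum(dist(point, s) for s in scenes)


def center_of_minimum_distance(scenes, step=0.1):
    """Grid point with the smallest summed distance to all scenes.
    Assumption: the search is limited to the scenes' bounding box."""
    xs = [x for x, _ in scenes]
    ys = [y for _, y in scenes]
    nx = int(round((max(xs) - min(xs)) / step))
    ny = int(round((max(ys) - min(ys)) / step))
    candidates = (
        (min(xs) + i * step, min(ys) + j * step)
        for i in range(nx + 1)
        for j in range(ny + 1)
    )
    return min(candidates, key=lambda p: total_distance(p, scenes))


# Hypothetical series: a tight cluster plus one distant scene.
scenes = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
c = centroid(scenes)
m = center_of_minimum_distance(scenes)
print(c, m)
```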

Probability Strategies

The strength of spatial models in offering a single-point prediction is also their greatest weakness: they only provide information for one coordinate of the map. If the predicted anchor point is not the actual anchor point, the model leaves investigators on their own to interpret how that single point influences the search of the surrounding area. Probability strategies solve this problem by combining a center of gravity method with a distance decay function to provide information for every coordinate of the map.

For each point in a reference area containing the crime distribution, a likelihood score, also called a density or hit score, is calculated by a two-step process. First, the distance between the point and each crime scene is evaluated by the distance decay function. Second, the resulting values are summed to find the likelihood score. The higher the score, the more likely the point is to be the serial offender’s primary anchor point (Canter et al., 2000). Buffer zones are sometimes incorporated into the probability strategy. The reference area is then separated into a contoured density map to create prioritized search areas. The top contour containing the highest likelihood scores is called the “top profile region.”
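The two-step scoring process can be sketched in a few lines of Python. The sketch assumes a square grid of evaluation points and a negative exponential decay of the form A * e^(-C * d) with A = C = 1; the function names, grid spacing, and coordinates are hypothetical.

```python
from math import dist, exp


def neg_exp(d, a=1.0, c=1.0):
    """Assumed decay form: likelihood falls exponentially with distance."""
    return a * exp(-c * d)


def likelihood_surface(scenes, x_range, y_range, step, decay=neg_exp):
    """Map each grid point to its likelihood score."""
    nx = int(round((x_range[1] - x_range[0]) / step))
    ny = int(round((y_range[1] - y_range[0]) / step))
    surface = {}
    for i in range(nx + 1):
        for j in range(ny + 1):
            point = (x_range[0] + i * step, y_range[0] + j * step)
            # Step 1: evaluate the decay function for each scene's distance.
            # Step 2: sum the resulting values into one score.
            surface[point] = sum(decay(dist(point, s)) for s in scenes)
    return surface


# Hypothetical series on a 3 km by 3 km reference area.
scenes = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0)]
surface = likelihood_surface(scenes, (0.0, 3.0), (0.0, 3.0), 0.5)
peak = max(surface, key=surface.get)
print(peak, round(surface[peak], 3))
```

Sorting the grid points by score and cutting the ranking into bands would give the prioritized search areas; the highest band corresponds to the top profile region.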

Although the output of probability strategies is most often referred to as a probability map, it is technically a density map because its likelihood scores do not fall between 0 and 1 (Levine, 2004). However, likelihood scores can be converted to probability scores by applying a normalization procedure, which allows for comparison between cases. Because serial offenders commit crimes in series that range in the number of scenes and distances traveled, the individual likelihood scores of density maps cannot be compared unless a normalization procedure is used. Canter et al. (2000) identify two methods of normalizing a density map. The first uses the mean interpoint distance between all crime scenes. The second uses the QRange, an index specifically developed for normalizing density maps. The QRange also takes into account the potential importance of major routes to crime scenes from the anchor point instead of assuming all distances are of equal weight (Canter et al., 2000). The QRange looks for “arterial pathways” by performing a linear regression on the crime scenes within the distribution and giving more weight to coordinates closer to any discovered axes (Canter et al., 2000). Regardless of whether the map reflects probability or density scores, the output appears the same.
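The first normalization method can be illustrated with a short sketch. The text above does not spell out how the mean interpoint distance enters the calculation, so dividing each distance by it is an assumption made here for illustration only; the function names are hypothetical.

```python
from itertools import combinations
from math import dist


def mean_interpoint_distance(scenes):
    """Average straight-line distance over every pair of crime scenes."""
    pairs = list(combinations(scenes, 2))
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


def normalized_distance(point, scene, scenes):
    """Assumption: express a distance in units of the series' mean
    interpoint distance so that series of different sizes are comparable."""
    return dist(point, scene) / mean_interpoint_distance(scenes)


# Two hypothetical series with the same shape at different scales.
compact = [(0.0, 0.0), (3.0, 4.0), (3.0, 0.0)]
spread = [(10 * x, 10 * y) for x, y in compact]

print(mean_interpoint_distance(compact), mean_interpoint_distance(spread))
print(normalized_distance((3.0, 0.0), compact[0], compact))
print(normalized_distance((30.0, 0.0), spread[0], spread))
```

Under this assumption, a compact series and a widely spread series with the same shape yield the same normalized distances, which is the property that makes cases comparable.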

Distance Decay Functions

As Turner (1969) predicted, researchers have developed several different distance decay functions (Levine, 2004). The curve defined by each function represents how the likelihood of offending changes as distance from the anchor point increases. Each function includes one or more variables, either arbitrarily chosen or empirically derived, that affect the size, gradient, and slope of the curve. Each distance decay function is a function of dij, the distance between the point under evaluation and the current crime scene. The formulas for the following distance decay functions are from Levine (2004).

The negative exponential function reflects the “friction effect” or the limitations that money, time, and energy place on traveling increasingly further from the anchor point (Canter & Hammond, 2006). The likelihood of offending at any given location decreases exponentially with distance from the anchor point. The function is expressed mathematically as:

f(dij) = A * e^(-C * dij)

The truncated negative exponential function features an initial linear increase in the likelihood of offending with distance from the anchor point until it peaks at a specified interval of distance. From this point, the likelihood of offending at any given location decreases exponentially with distance. As a result, a sort of buffer zone is incorporated into the distance decay function. The first part of the function is expressed as:

f(dij) = B * dij

When dij is greater than the peak likelihood distance, the negative exponential function is applied as expressed in the following equation:

f(dij) = A * e^(-C * dij)

The normal distance decay function also features an initial increase in likelihood of offending that subsequently decreases with distance. However, this function uses the smoother transition of a normal curve. Like the truncated negative exponential function, the normal function assumes a peak likelihood exists at some set distance from the anchor point. The function is expressed as:

Zij = (dij - MeanD) / Sd

f(dij) = A * (1 / (Sd * sqrt(2 * pi))) * e^(-0.5 * Zij^2)

Here MeanD is the mean distance and Sd is the standard deviation of the distances.

The quadratic distance decay function departs from the other hypotheses. The previously discussed functions predict that the likelihood of offending always decreases with distance outside of any applied buffer zone. However, in the quadratic function, the probability of offending drops rapidly close to the home base, levels off, and then increases slightly again. Despite its difference from the other distance decay functions, researchers have found crime distributions that fit this pattern (Canter & Hammond, 2006). Canter and Hammond (2006) hypothesize that this may be due to a need to find new crime opportunities after using up all the available opportunities around the home base.

In the negative linear function, the likelihood of offending decreases at the same rate with distance from the anchor point. While some crime series have fit this pattern, the likelihood of offending rarely decreases linearly. As a result, the negative linear function is more often used as a control function in research (Canter & Hammond, 2006). The function is expressed as:

f(dij) = A + B * dij

While other distance decay functions exist, these are the expressions most often used. Researchers continue to develop new functions in an attempt to better fit the behavior of serial offenders.
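The behavior of these functions can be compared with a minimal Python sketch. The parameter names and default values are illustrative assumptions, not calibrated values from any study. Two further assumptions are made: the constant A in the truncated function is chosen so that its two pieces meet at the peak distance, and the negative linear function is clipped at zero so that it never returns a negative likelihood. The quadratic function is omitted because its formula is not given above.

```python
from math import exp, pi, sqrt


def negative_exponential(d, a=1.0, c=1.0):
    """Likelihood declines exponentially with distance."""
    return a * exp(-c * d)


def truncated_negative_exponential(d, peak_distance=1.0, b=1.0, c=1.0):
    """Linear rise up to the peak distance, exponential decline after it."""
    if d <= peak_distance:
        return b * d
    # Assumption: pick A so both pieces give the same value at the peak.
    a = b * peak_distance * exp(c * peak_distance)
    return a * exp(-c * d)


def normal_decay(d, mean_d=1.0, sd=0.5, a=1.0):
    """Normal curve centered on the peak likelihood distance."""
    z = (d - mean_d) / sd
    return a * exp(-0.5 * z * z) / (sd * sqrt(2 * pi))


def negative_linear(d, a=1.0, b=-0.25):
    """Likelihood declines at a constant rate (B is negative).
    Assumption: values are clipped at zero."""
    return max(0.0, a + b * d)


for d in (0.0, 0.5, 1.0, 2.0, 4.0):
    print(
        d,
        round(negative_exponential(d), 3),
        round(truncated_negative_exponential(d), 3),
        round(normal_decay(d), 3),
        round(negative_linear(d), 3),
    )
```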

Software Implementation of Probability Strategies

The vast number of calculations required by probability strategies makes the use of computers the only practical method for executing such an approach. Although four researchers have developed major geographic profiling software systems, only three applications are available. A program by the name of Predator has been mentioned on rare occasions in the literature, but little is known about the application beyond that Godwin developed the program for his own research purposes and does not make it available (Rich & Shively, 2004).

Rossmo (2000) developed a comprehensive and user-friendly system called Rigel. He created two manifestations of the software, the much costlier Rigel Profiler and the less expensive and more widely used Rigel Analyst (Rich & Shively, 2004). The main difference between the applications is that Rigel Profiler is designed for full-time geographic profilers while Rigel Analyst is designed for less frequent use (Rich & Shively, 2004). Rossmo also requires an extensive training program in order to be certified in use of Rigel Analyst (Rich & Shively, 2004).

Although Canter (2003) initially developed Dragnet for research purposes, he has made it available to law enforcement agencies that have requested it. He has also personally assisted in investigations using the software and reported that the investigators found Dragnet’s analysis helpful in apprehending the serial offenders. Dragnet is available from Canter at no cost. Levine (2004) added geographic profiling functionality to CrimeStat, the criminal spatial statistics software he developed under a grant from the National Institute of Justice (NIJ). He named the module “journey-to-crime estimation” to distinguish it from geographic profiling, which he characterizes as including much more than a predictive pattern analysis. However, most researchers use the term “geographic profiling” to refer to the predictive models employed by the software.

Although the existing programs vary in their particulars, they all implement the probability strategy through three common components: input, calculation, and output.

Input

For input, geographic profiling software requires a series of crime scene locations linked to a single unknown serial offender. In CrimeStat, the user specifies a file containing the latitude-longitude coordinates of the crime scenes. A major strength of CrimeStat is that it accepts a wide range of formats for the input file, from simple text files to complex database files. As a result, investigators and researchers can import crime series without having to convert the data to a different file format. Dragnet also allows the user to specify a file with the crime scene locations, but requires the locations to be in positive X-Y coordinates using units of kilometers. Dragnet also requires the file to follow a specific format and be saved as a text document with a particular extension. Additionally, Dragnet allows the user to enter crime scenes onto a positive X-Y coordinate grid by clicking on the screen. The points are then saved to a file that can be accessed at a later time. Rigel Analyst allows the user to enter crime scenes in one of three ways (Rossmo, n.d.). First, the user can manually enter the street addresses, which are then converted to latitude-longitude coordinates through a process known as “geocoding.” Second, a user can scan in a printed map that is subsequently digitized. Third, a user can manually enter the latitude-longitude coordinates.

Some of the geographic profiling applications also require the user to define a reference area. A reference grid can be created in CrimeStat by specifying the minimum-maximum latitude-longitude coordinates as well as the grid’s resolution. CrimeStat also allows the user to specify a reference area file that contains a list of latitude-longitude coordinates. When manually creating an input file, Dragnet requires the user to define a reference area by specifying the length of the X and Y axes in kilometers. Because Dragnet defines the reference area as a rectangle with an aspect ratio of 133:100, the X-Y axes must retain this ratio. If a user is inputting the crime scenes by clicking on the screen, Dragnet initially requires the user to enter the length of the Y axis in kilometers; the application then calculates the X axis using the 133:100 ratio. Rigel Analyst uses an algorithm based on the average distance between points to define its reference area (Levine, 2004). The average X interpoint distance is added to the maximum X ordinate and subtracted from the minimum X ordinate. The process is repeated for the Y values. Rigel Analyst also allows for a third type of input. Geographic layers, such as maps emphasizing different features, can be imported to appear with the density map.
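The two reference area rules described above can be sketched as follows. The sketch is an interpretation, not the vendors' code: the "average X interpoint distance" is read here as the mean absolute difference in X over every pair of crime scenes, which is an assumption, and all function names are hypothetical.

```python
from itertools import combinations


def mean_axis_interpoint(values):
    """Mean absolute difference over every pair of ordinates (assumed
    reading of the average interpoint distance along one axis)."""
    pairs = list(combinations(values, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def padded_reference_area(scenes):
    """Bounding box of the scenes, padded on each axis by that axis's
    average interpoint distance."""
    xs = [x for x, _ in scenes]
    ys = [y for _, y in scenes]
    pad_x = mean_axis_interpoint(xs)
    pad_y = mean_axis_interpoint(ys)
    return (min(xs) - pad_x, max(xs) + pad_x), (min(ys) - pad_y, max(ys) + pad_y)


def x_axis_from_y(y_length_km):
    """X axis length for a rectangle with a 133:100 aspect ratio."""
    return y_length_km * 133 / 100


# Hypothetical crime series (x, y) in kilometers.
scenes = [(0.0, 0.0), (3.0, 0.0), (6.0, 6.0)]
x_bounds, y_bounds = padded_reference_area(scenes)
print(x_bounds, y_bounds)
print(x_axis_from_y(10))
```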

Calculations

After the required input has been entered, geographic profiling software performs the calculations according to the particular probability strategy employed by the application. Dragnet adds a normalization parameter to convert density scores to probability scores to allow for case comparisons. CrimeStat and Dragnet default to a negative exponential function, but allow the user to set a different distance decay function and the values of any variables associated with that function. Rigel Analyst does not use any of the distance decay functions as expressed in the equations discussed above. Instead, the application uses a patented set of mathematical functions, which Rossmo calls “criminal geographic targeting,” that includes an inverse distance function and buffer zone (Levine, 2004).

Dragnet and CrimeStat both allow the user to specify whether distance is measured by the Euclidean or Manhattan method. Euclidean distance is the straight-line measurement between two points. Manhattan distance accounts for the indirect routes caused by city blocks in urban areas, and is defined as the sum of the vertical and horizontal distances traveled between two points on a grid made up of city block-sized cells (Levine, 2004). Rigel Analyst uses Manhattan distances exclusively (Levine, 2004).
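The difference between the two measures is easiest to see in a small example. The sketch below assumes planar coordinates in kilometers and a street grid aligned with the axes; the function names are hypothetical.

```python
from math import hypot


def euclidean(p, q):
    """Straight-line distance between two points."""
    return hypot(q[0] - p[0], q[1] - p[1])


def manhattan(p, q):
    """Sum of the horizontal and vertical distances between two points."""
    return abs(q[0] - p[0]) + abs(q[1] - p[1])


home = (0.0, 0.0)
scene = (3.0, 4.0)
print(euclidean(home, scene))
print(manhattan(home, scene))
```

For any pair of points the Manhattan distance is at least as large as the Euclidean distance, so the choice of measure changes the likelihood score assigned by a distance decay function.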

Output

After each point in the reference area has a probability or density score, a map can be output. Dragnet assigns each cell a color based on its probability score and displays the map on the screen. CrimeStat does not have built-in graphic capability and instead outputs the density map in one of a variety of geographic information systems (GIS) file formats. Consequently, an independent GIS software system is needed to view the output. Rigel Analyst displays the density map overlaid on a street map in both two and three dimensions. Rigel Analyst also allows for suspect prioritization by statistically comparing the location of the suspects’ anchor points in relation to the density map. None of the current applications are able to print maps or reports that are amenable to case files.

If the anchor point for a serial offender is known, geographic profiling software can output an accuracy measure. Dragnet and Rigel Analyst display the “search cost,” or the percentage of the reference area searched before finding the actual anchor point. Dragnet can also indirectly display “error distance,” or the straight-line distance from the predicted to the actual anchor point. The user can hover the mouse pointer over each cell in the top profile region to find the point with the highest probability score. The distance can then be measured between that point and the actual anchor point using an on-screen tool.
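Both accuracy measures can be computed directly from a scored reference area. The sketch below is a minimal illustration under several assumptions: the surface is a dictionary that maps grid cells to scores, cells are searched from the highest score to the lowest, the actual anchor point is assigned to its nearest cell, and cells that tie with the anchor point's cell count as searched. The names and values are hypothetical.

```python
from math import dist


def nearest_cell(surface, point):
    """Grid cell closest to an arbitrary point."""
    return min(surface, key=lambda cell: dist(cell, point))


def search_cost(surface, home):
    """Percentage of cells searched, highest score first, up to and
    including the cell that contains the actual anchor point."""
    home_score = surface[nearest_cell(surface, home)]
    searched = sum(1 for score in surface.values() if score >= home_score)
    return 100.0 * searched / len(surface)


def error_distance(surface, home):
    """Straight-line distance from the highest-scoring cell to the
    actual anchor point."""
    peak = max(surface, key=surface.get)
    return dist(peak, home)


# Hypothetical four-cell surface and known home location.
surface = {(0, 0): 0.1, (1, 0): 0.4, (0, 1): 0.3, (1, 1): 0.2}
home = (0.1, 0.9)
print(search_cost(surface, home))
print(round(error_distance(surface, home), 3))
```

The example shows why the two measures can disagree: the anchor point falls in the second-ranked cell, giving a search cost of half the area, while the error distance is measured from the top-ranked cell on the opposite corner.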

The Human versus Machine Controversy

Researchers have questioned the assumption that geographic profiling software systems perform better than people briefly “trained” in the principles of geographic profiling. While comparing various journey-to-crime methods used in geographic profiling software for his manual on CrimeStat, Levine (2004) noticed they seemed to perform equally well. Because the simpler spatial models, which can be performed by people without the aid of a computer, appeared to perform as well as the more complex probability strategies, Levine questioned Rossmo’s extensive training requirements (Rich & Shively, 2004).

Snook, Canter, and Bennell (2002) first tested Levine’s suggestion that humans might perform as well as geographic profiling software. They divided participants into a control and an experimental group, each with 21 students. In the first phase of the study, participants were presented with ten spatial distributions, pieces of A4 paper with five crime scenes marked. They were then asked to mark where they thought the offender lived. Before phase two, the experimental group was taught two heuristics, or cognitive shortcuts. They were told that (a) most offenders committed crimes close to home and (b) most offenders lived within a circle defined by a diameter the length of the distance between the two furthermost crimes. The former describes a simplified distance decay strategy while the latter describes the much simpler circle hypothesis. In phase two, both groups were given the same ten spatial distributions and asked to predict the home location. Additionally, the point of highest probability for each distribution was calculated by Dragnet and printed to the same scale as the papers given to the participants. The error distance was then measured in millimeters for every distribution.

As expected, the control group showed no significant difference in error distance between the two phases. However, the experimental group significantly reduced its mean error distance after learning the two heuristics. Moreover, its error distance was not significantly different from Dragnet’s error distance. The study supported the hypothesis that humans can perform as accurately as geographic profiling software and created the template for subsequent research into the question.

Snook, Taylor, and Bennell (2004) ran a study with the same basic structure but increased the sample size (N=215) and separated out the two heuristics. In addition to the control group, they used a separate group for each heuristic, one for the circle hypothesis and one for the distance decay strategy. They also asked participants how they made their decisions at each step in order to explore the cognitions actually used to make the predictions. Finally, CrimeStat was used in place of Dragnet.

As in the previous study, Snook et al. (2004) found that (a) participants significantly improved after learning either of the heuristics and (b) the software did not perform significantly better than the participants. The questions regarding how participants made a prediction revealed that 49% of the participants used appropriate heuristics before training. Of the participants who reported using the assigned heuristic after training, 69% of the circle group and 78% of the distance decay group improved their accuracy. Finally, the study found that the participants using the distance decay heuristic performed marginally better than those using the circle hypothesis.

Snook et al. (2005) set aside the question of human performance to study whether the complexity of a geographic profiling strategy corresponded to accuracy. They mathematically expressed the complexity of 11 strategies from both the spatial and probability categories. Complexity was defined in terms of the number of mathematical and computational tasks needed to perform the strategy. Each strategy was then used to analyze a sample of 16 UK burglars with ten or more crimes in their series. They found that complex strategies were not significantly more accurate than simpler strategies.

Other studies comparing performance of humans and spatial strategies to geographic profiling software support the same general findings (Bennell, Snook, Taylor, Corey, & Keyton, 2007; Paulsen, 2006a, 2006b). However, Rossmo (2005b) criticized the studies for not representing the realities of law enforcement investigations. He pointed out that the studies often used cases that do not meet the assumptions of geographic profiling, such as including series with too few crime scenes or series committed by commuters. Rossmo (2005b) also argued that the linear nature of error distance does not represent the true error, which is non-linear.

When presented with two separate methods of applying the same model, it is logical to use the method most faithful to the model. Geographic profiling software is always objective and never distracted by unusual patterns. Humans, on the other hand, do not apply the model with computer precision and objectivity, and they may not know how to handle odd patterns. Additionally, humans introduce a factor absent from software: the skill level of a particular investigator. The above studies use mean accuracy measures for the group. Although variability was reduced by training, certain individuals still performed worse than others (Snook, Canter, & Bennell, 2002). Because geographic profiling is often done at the individual level in law enforcement, the investigator making the predictions may happen to be particularly bad at the task. Conversely, geographic profiling software makes the same prediction for a series no matter who operates the program.

Moreover, the human method only provides a single-point prediction, whereas geographic profiling software provides prioritized search areas for law enforcement officers to use in their investigation. The single-point prediction provided by humans cannot offer a search area because the probability does not decrease at an equal rate in all directions; concentric circles cannot be drawn around the point to provide search areas congruent to those created by software. Furthermore, the human method cannot be calibrated. Most of the studies investigating the accuracy of geographic profiling software use generalized distance decay functions. However, accuracy may be improved by calibrating the variables in the distance decay functions on data available for that jurisdiction as outlined by Levine (2004).

Finally, refined and improved strategies may significantly increase the accuracy of software. Although Snook et al. (2005) demonstrated that the more complex strategies used by geographic profiling software do not correspond with greater accuracy, this finding may not hold as the current strategies are improved upon and new ones are developed to incorporate new findings in the field. In one sense, even the probability strategies are purely spatial since they do not consider land features and the layout of the roads. Adapting current models to account for the geography of an area may improve accuracy.

Therefore, if given an equal choice between using geographic profiling software and humans employing heuristics, the software is the logical choice, the personal preference of individual investigators notwithstanding. However, this choice is often not equal. Rigel Analyst is expensive; the cost to purchase the software and pay for training may be prohibitive, especially for smaller police departments. Although free, CrimeStat and Dragnet require an initial investment of time for training and an ongoing investment of time and effort, because both programs can be difficult and time-consuming to use. If a geographic profiling application were affordable in terms of cost, effort, and time, then the controversy of human versus machine would become much less relevant from a research perspective. However, investigators may still prefer the human approach regardless of what research has to say on the subject. A study that introduced law enforcement officers to geographic profiling heuristics and software and then asked which they preferred would help answer this question.

Evaluating the Comparative Accuracy of Geographic Profiling Software

Researchers evaluating the accuracy of geographic profiling models make two methodological decisions, consciously or otherwise, that have a major impact on the results: they decide which series to include and what type of accuracy measures to use for their results. The NIJ convened a panel of experts from the various roles involved with geographic profiling to address these two methodological considerations among others (Rich & Shively, 2004). A final report summarized the discussion and suggested a framework for evaluating geographic software. Rossmo (2005a) took issue with a number of their conclusions and offered an alternative methodology. The report and Rossmo’s response provide a framework for discussion on the issue of evaluation.

Inclusion of Cases

The NIJ expert panel recommended comparing the accuracy of geographic profiling software under a wide range of conditions that law enforcement officers may experience during an investigation (Rich & Shively, 2004). To accomplish this, they advocated constructing a data set with series from a variety of crime and offender types. They recommended collecting these cases from several law enforcement agencies from jurisdictions that range in size, layout of roads, and major obstructions to travel. Furthermore, they recommended only using series with three or more crimes. Finally, they stated that researchers could also include commuter-type offenders in order to simulate the reality that investigators have no way of knowing the classification of an unknown serial offender. However, they suggested that researchers note that geographic profiling applications are not designed to handle commuter-type offenders.

The expert panel based their recommendations on the assumption that comparing geographic profiling software under conditions most closely resembling actual working conditions produces the most relevant results (Rich & Shively, 2004). However, Rossmo (2005a) objected to including cases that geographic profiling software was never meant to handle. He advocated evaluating the software using recommended operating conditions so that the applications are ranked only on their ability to perform the task for which they were designed (Rossmo, 2005a). Rossmo (2005a) provided a set of criteria to use when deciding whether or not a case is appropriate for geographic profiling software. First, the series has at least five crimes. Second, the offender operates from a single anchor point that remains the same for the duration of the series. Third, the offender is a marauder who finds his targets during routine activity in familiar areas. Fourth, the series takes place over an area of relative uniformity in travel and target opportunity.

As Rossmo (2005a) noted, a distinction between research and evaluation is made when considering which cases to include in a particular study. Research into the overall accuracy of geographic profiling software may be made more meaningful using data sets that represent the nature of cases investigated by law enforcement officers. However, researchers have two options when evaluating the relative performance of different geographic profiling applications. They can test the applications either under the field conditions experienced by investigators or on how well they execute the geographic profiling model. The former compares applications on applied terms, the latter on theoretical terms.

Accuracy Measures

The second methodological decision necessary for evaluating geographic profiling software addresses how the accuracy of an application is measured. Five major accuracy measures have been widely used in the field (Rich & Shively, 2004):

  1. Error distance is the straight-line distance between the predicted and actual home location.
  2. Profile error distance is the straight-line distance between the actual home location and the nearest point on the edge of the top profile region.
  3. Top profile area is the ratio of the top profile region area to the total reference area.
  4. Profile accuracy is a binary test of whether the actual home location falls within the top profile region and is reported as either “yes” or “no.”
  5. Search cost is the percentage of the reference area searched before the actual home location is found.
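The two most widely used measures in this list, error distance and search cost, can be illustrated with a minimal Python sketch. The function names and the dictionary-of-cells representation are hypothetical and are not taken from any of the applications discussed; the sketch also assumes planar coordinates and counts cells that tie with the home cell as already searched, which is a conservative choice.

```python
import math


def error_distance(predicted, actual):
    """Straight-line distance between the predicted and actual home location.

    Both points are (x, y) pairs in the same planar units (an assumption).
    """
    return math.hypot(predicted[0] - actual[0], predicted[1] - actual[1])


def search_cost(scores, home_cell):
    """Percentage of the reference area searched before the home cell is found.

    scores maps every cell (row, col) in the reference area to its likelihood
    score. Cells are searched from highest to lowest score; cells tied with
    the home cell are counted as searched (a conservative assumption).
    """
    home_score = scores[home_cell]
    searched = sum(1 for s in scores.values() if s >= home_score)
    return 100.0 * searched / len(scores)


# Hypothetical 2-by-2 reference area
grid = {(0, 0): 0.9, (0, 1): 0.5, (1, 0): 0.7, (1, 1): 0.1}
cost = search_cost(grid, (1, 0))
dist = error_distance((0.0, 0.0), (3.0, 4.0))
```

Because search cost is expressed as a percentage of the reference area, it is normalized across series, whereas error distance stays in the units of the input coordinates.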

Error distance and search cost are the most widely used. They also allow for better comparison between cases and software because the top profile region is an output characteristic of the program, not an inherent part of the probability strategy. For example, the same profile could be displayed in five contours just as easily as ten or twenty contours. The top profile region for the five-contour map would be much larger than the top profile region of a ten-contour map. Arguably, including more contours gives law enforcement more flexibility in how to use the profile. If they are restricted in resources, more contours result in a smaller top profile region, which may fit an agency’s budget better than the top profile region of a less contoured output.

The NIJ expert panel suggested that researchers decide which accuracy measure or measures to use based on their needs (Rich & Shively, 2004). However, Rossmo (2005a) argued that search cost is the only appropriate measure to use in evaluation. He presented three major problems with using error distance. First, only search cost reflects the way law enforcement agencies use predictions made by geographic profiling systems. The software provides investigators with a map of prioritized search areas, delineated by contours (Rossmo, 2005a). Search cost accounts for the shape of these lines; error distance does not.

Second, search cost better reflects the actual error produced by probability strategies (Rossmo, 2005a). Because investigators use a geographic profile to determine search areas, the subsequent error is based on area, which increases as a function of the square of the radius. If population density, and subsequently the pool of suspects, remains uniform over the reference area, then as the radius doubles, the pool of suspects quadruples (Rossmo, 2005a). The linear nature of error distance underestimates the error at an ever increasing rate as the distance from the anchor point increases.
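The quadrupling argument can be checked with a short calculation. The following sketch assumes a uniform population density over a circular search area; the function name and the density value are hypothetical and serve only to illustrate the arithmetic.

```python
import math


def suspects_within(radius_km, density_per_km2):
    """Expected suspect pool inside a circular search area of uniform density."""
    return density_per_km2 * math.pi * radius_km ** 2


# Doubling the radius quadruples the area, and with it the suspect pool,
# while the linear error distance only doubles.
ratio = suspects_within(2.0, 100.0) / suspects_within(1.0, 100.0)
```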

Third, error distance precludes comparison between cases because it does not take into account the scale of the distribution (Rossmo, 2005a). A large crime scene distribution with relatively little error could have the same error distance as a small crime scene distribution with a relatively large error; however, the actual error would be very different. Furthermore, because error distance does not take into account the scale of the distribution, measures of central tendency are meaningless for a set of error distances. Conversely, because search cost is a normalized score, a measure of central tendency can be obtained from a set of search costs. However, error distance can be used to compare an identical series calculated by two or more different geographic profiling applications because the scale, as well as other distribution characteristics, remains the same regardless of the program. Despite its limitations, error distance is commonly used for its research value. Error distance allows for comparison of different geographic profiling models ranging from the simplest circle method to the most complex of probability strategies. Only when a probability strategy produces multiple peaks with equal likelihood scores does error distance not work as an accuracy measure.

Limitations of Geographic Profiling Software

Because geographic profiling is based on the concepts of cognitive mapping and location theory, the predictive models assume the offender is operating from a stable anchor point. Consequently, geographic profiling software will produce extremely erroneous predictions when applied to a commuter-type offender. However, there is currently no reliable way to classify an unknown serial offender as a commuter or marauder, and very little research has been done in this area (Paulsen, 2007). Paulsen (2007) ran a logistic regression on a number of spatial and temporal variables for a set of crime series and was able to improve predictions of offender classification from a 60% best guess to 81%. Three variables contributed significantly to the increased predictability: the degree of crime scene clustering, the area within the convex hull, and the average number of days between crimes (Paulsen, 2007). Further research needs to be conducted to confirm these variables as reliable predictors and to find ways to improve the accuracy of commuter-marauder prediction.

Some serial offenders do not fall into either classification. The Washington, D.C. snipers, John Allen Muhammad and Lee Boyd Malvo, committed crimes in a marauder fashion, but did so from their car, which acted as a moving home base. The result was a large criminal range without a static center (Canter, 2003). Other serial offenders may live on the perimeter of their cluster of crimes, blurring the distinction between marauder and commuter. Because investigators are not able to reliably classify unknown serial offenders, geographic profiling may be inadvertently applied to an invalid case.

Geographic profiling also requires a series of crimes to be correctly linked to a single offender. However, investigators are not always able to accurately link crimes and may sometimes incorrectly attribute crimes to a particular serial offender (Canter, Coffey, Huntley, & Missen, 2000). Incorrectly linked crimes corrupt the results, while too few linked crimes decrease the sample size of the known opportunity space. Geographic profiling models also do not work well with mentally ill offenders who are driven by irrational delusions or with spree killers who do not behave in a manner typical of other serial offenders (Rossmo, 2000). The hypotheses of geographic profiling are founded on rational decision-making on the part of the serial offender. Furthermore, the type of crime or choice of victim can influence where an offender commits crimes. Restricted target opportunities do not offer clean samples of an offender’s activity space and result in faulty predictions (Rossmo, 2005a). For example, a serial killer who murders prostitutes will most likely stick to very limited portions of the city in order to find his victims. Paulsen (2006b) also found that crime type influenced the accuracy of geographic profiling software.

Finally, O’Leary (2006) argued that the current spatial and probability strategies themselves contain a major flaw: the anchor point and high probability areas identified by the models always lie within the convex hull. However, geographic features, such as large bodies of water, can cause a serial offender to commit crimes in a commuter pattern out of necessity and in a way that excludes his home base from the center of the crimes. Furthermore, investigators are often not aware of all the crime scenes associated with a particular serial offender when an analysis is needed. As a result, the actual anchor point could be located outside of the convex hull of the partial crime data. O’Leary (2006) has developed a mathematical model to allow for the actual anchor point to reside outside the convex hull. He is currently refining it on a grant from the National Institute of Justice.

The Design and Development of GeoProfile

Critique of Existing Geographic Profiling Software

After reading literature on Rigel Analyst and using CrimeStat and Dragnet, it became clear that each of the existing programs exhibits one of two limitations: either the application is (a) too cost-prohibitive to purchase and implement or (b) too difficult and time-consuming for everyday use. Rigel Analyst falls into the former category while CrimeStat and Dragnet fall into the latter. These limitations have implications for both research and law enforcement.

CrimeStat and Dragnet limit research by requiring a large amount of time and effort to use. Although CrimeStat can quickly import data sets of latitude-longitude coordinates, a separate reference file must be created for each series in the data set. CrimeStat also lacks a function to calculate accuracy measures; a researcher has to program a separate tool to find the search cost and error distance for each series. Although Dragnet’s intuitive user interface makes the program easy to learn, converting a data set to its X-Y coordinate system requires a complicated conversion procedure. Researchers must either program a conversion tool or manually convert every series in the data set.

Dragnet makes it fast and easy to find the search cost for an entire data set using its batch analysis tool. However, the lack of an error distance function requires the researcher to spend a great deal of time manually calculating this measure. Because Dragnet does not mark the cell with the highest probability, a researcher has to manually check the probability value for each cell in the top profile region in order to measure the distance to the anchor point. The considerable cost of time and effort required by CrimeStat and Dragnet limits the scope of studies researchers can conduct with their finite schedules. Software that is easier and faster to use would allow researchers to spend more time testing hypotheses.

The effort and time requirements of CrimeStat and Dragnet may also limit research by discouraging use by law enforcement. If the goal of geographic profiling research is to create an effective tool for investigators, then the only way to measure the success of that goal is through feedback from law enforcement. Furthermore, the practical experience of investigators may generate new ideas and directions for geographic profiling research. However, if law enforcement officers find the software too difficult or time-consuming to use, then geographic profiling researchers may not address the needs of the investigators they hope to help.

The same expense of time and effort that restricts researchers also limits the number of investigators who use the software. Investigators may have even less time and patience for needlessly complicated software than researchers who are accustomed to the tedium of research. A simple and quick geographic profiling application with the same accuracy as existing software may make investigators more likely to use geographic profiling software in their work.

Both CrimeStat and Dragnet have a specific weakness that becomes more relevant in a law enforcement setting. CrimeStat requires GIS software, such as ArcView or MapInfo, to view the results of the journey-to-crime module. Although a crime analyst is likely to have and know how to use GIS software, the average investigator does not. As a result, the use of CrimeStat is mostly limited to crime analysts. Although crime analysts use geographic profiling software to help investigators solve cases (Rich & Shively, 2004), making the software available to investigators may increase its use. Dragnet does not require any additional software to use. However, the lack of embedded mapping software and the inability to export the results to mapping software make it difficult for investigators to overlay the probability map onto a street map.

The Design of GeoProfile

A new geographic profiling software system was designed to address these issues. GeoProfile, the new software system, was designed around four core concepts: simplicity, speed, affordability, and accuracy.

Simplicity

GeoProfile’s simplicity begins with its accessibility. Designed as a web application, GeoProfile simplifies use of geographic profiling software in two ways. First, researchers and investigators can use the application on any operating system as long as they have an internet connection. Although Rigel Analyst is operating system independent, CrimeStat and Dragnet only work on a few versions of Microsoft Windows. Second, GeoProfile is updated in one central location, on the web server. As a result, users do not have to download and install a new version of the application with every new release. Bugs can be fixed and features added without the time and effort required to distribute the new version. As a web application, GeoProfile also simplifies file handling. The program automatically saves the profile with each change so loss of work is minimized. Additionally, an unlimited number of profiles can be created and saved to a user’s account. Because the profiles are saved in a database on the web server, users do not have to store or keep track of any files on their computer. This allows users to access their profiles across multiple computers without having to transfer files.

Furthermore, GeoProfile was designed with a simple user interface. The screen is divided into two main parts: (1) a column on the left contains the list of the crime scenes and (2) a large area to the right contains the map. A header at the top of the screen displays the name and description of the profile. GeoProfile was also designed with a simplified workflow. The user begins by clicking on the “New” link in the header of the application and entering a name and optional description for the new profile. The user then adds crime scenes by clicking on the “New” link in the crime scene list and entering the street address or latitude-longitude coordinates as well as optional case information such as identification number, crime type, date, and notes. A marker immediately appears on the street map. When more than one crime scene exists, the density map automatically recalculates to account for the new location. If the case is solved, the user can add an anchor point by adding a crime scene with “Anchor” selected from the crime type drop-down menu. The search cost and error distance can then be viewed by clicking on the “Edit” link in the header. GeoProfile displays the details for the series, including the accuracy measures if an anchor point has been added.

GeoProfile also makes it easy for researchers to add large data sets. A data set with multiple crime series can be imported from a comma separated value (CSV) file containing a list of crime scenes and anchor points in latitude-longitude coordinates. Most database programs used to store crime data can export the data to this format. Mapping is also simplified. With an embedded map, no additional GIS software is needed. Additionally, GeoProfile uses Google Maps. Because many people are already familiar with Google Maps, there is a higher chance a user will already know how to use the map. Even without prior experience, Google Maps uses an intuitive interface that makes it easy to navigate, zoom, and switch between street, satellite, and hybrid street-satellite views.
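A CSV import of this kind might be sketched as follows. The column names (series_id, type, lat, lon) and the use of "Anchor" in the type column are assumptions made for illustration; the source does not specify the file layout that GeoProfile's importer expects.

```python
import csv
import io
from collections import defaultdict


def load_series(csv_text):
    """Group crime scenes and anchor points by series.

    Assumes hypothetical columns: series_id, type, lat, lon. A row whose
    type is "Anchor" is stored as the home location of its series.
    """
    series = defaultdict(lambda: {"crimes": [], "anchor": None})
    for row in csv.DictReader(io.StringIO(csv_text)):
        point = (float(row["lat"]), float(row["lon"]))
        if row["type"].strip().lower() == "anchor":
            series[row["series_id"]]["anchor"] = point
        else:
            series[row["series_id"]]["crimes"].append(point)
    return dict(series)


# Hypothetical two-series file
sample = """series_id,type,lat,lon
A,Burglary,39.30,-76.60
A,Burglary,39.35,-76.55
A,Anchor,39.32,-76.58
B,Robbery,39.40,-76.70
"""
loaded = load_series(sample)
```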

Speed

The simplified workflow and user interface substantially reduce the amount of time needed to geographically profile a crime series. Dragnet requires crime scene locations to be first geocoded and then converted to X-Y coordinates while CrimeStat requires the creation of a separate reference grid. If a serial offender commits new crimes outside of the initial reference area, these processes will have to be repeated for each program. GeoProfile eliminates these extra steps because they are performed automatically every time a new crime scene is added or deleted. By eliminating these steps, GeoProfile saves the researcher and investigator time and effort. GeoProfile automatically recalculates the reference area for each crime scene, allowing users to create or modify series in the time it takes to enter the street addresses. Furthermore, the importer saves researchers from having to add each series individually.

Affordability

An efficient and easy-to-use geographic profiling application will not encourage further research or use by investigators if the financial cost is prohibitive. For this reason, GeoProfile was designed using open source software components. Without licensing fees, the cost of developing and deploying GeoProfile was significantly reduced to the point where a small grant could allow for use of GeoProfile by researchers and investigators at no cost to them.

Accuracy

GeoProfile was designed to address the above issues without sacrificing accuracy. By using the standard probability strategy employed by the established software, GeoProfile is expected to perform as accurately as CrimeStat and Dragnet. Specifically, GeoProfile uses the same negative exponential equation for its default distance decay function as CrimeStat and Dragnet. However, each program uses slightly different values for the variables in the distance decay function, and Dragnet adds a normalization parameter, which GeoProfile and CrimeStat do not. In theory, all three programs would produce the same output if they use the same equations. However, when two applications are developed using different programming languages, slight differences in output occur due to how the languages handle numerical precision. Furthermore, variations between two algorithms lead to variations in the output. For example, GeoProfile uses a grid with 562,500 cells compared to Dragnet’s 13,300. As a result, GeoProfile’s accuracy needed to be verified after development.

The GeoProfile Model

Whereas CrimeStat and Dragnet calculate the density map only after all crime scenes have been entered, GeoProfile updates the density map every time a new scene is added or deleted. GeoProfile runs through the following algorithm for the first time when two crime scenes are present. Then the algorithm is repeated with each addition or deletion of a crime scene. If the crime scenes are entered as street addresses, they are geocoded once, the first time they are entered, and saved as latitude-longitude coordinates for future calculations. Latitude-longitude coordinates are taken as entered.

  1. The reference area is defined as a square 50% larger than the area defined by the minimum and maximum X-Y coordinates of the crime scenes. A 750-by-750 grid of cells is overlaid on the reference area.
  2. For each cell, the distance between itself and each crime scene is evaluated with the distance decay function and summed to produce the likelihood score. The algorithm defaults to the following negative exponential distance decay function:
    Negative Exponential Function
  3. The range of likelihood scores is divided into ten equal bands to form the contours of the density map. Each cell is assigned a colored pixel according to the band in which the likelihood score falls. The resultant density map is displayed over the street map.
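The three steps above can be sketched in Python. This is an illustration under stated assumptions rather than GeoProfile's actual code: the negative exponential is assumed to take the form f(d) = A * exp(B * d), using the default values A = 1.89 and B = -0.2 reported in the Procedure section; "50% larger" is read as a square whose side is 1.5 times the longer side of the bounding box, centered on it; coordinates are treated as planar; and the function names are hypothetical.

```python
import math

A, B = 1.89, -0.2  # GeoProfile defaults as reported in the Procedure section


def decay(distance, a=A, b=B):
    """Assumed negative exponential form: f(d) = a * exp(b * d)."""
    return a * math.exp(b * distance)


def density_map(crimes, n=750, bands=10):
    """Return (scores, band_index) for an n-by-n grid over the reference area.

    crimes is a list of planar (x, y) points. The reference square is assumed
    to be 1.5 times the longer bounding-box side, centered on the box.
    """
    xs = [x for x, _ in crimes]
    ys = [y for _, y in crimes]
    side = 1.5 * max(max(xs) - min(xs), max(ys) - min(ys))
    x0 = (min(xs) + max(xs)) / 2 - side / 2
    y0 = (min(ys) + max(ys)) / 2 - side / 2
    cell = side / n

    # Step 2: sum the decay function over all crime scenes for each cell center
    scores = []
    for r in range(n):
        y = y0 + (r + 0.5) * cell
        row = []
        for c in range(n):
            x = x0 + (c + 0.5) * cell
            row.append(sum(decay(math.hypot(x - px, y - py)) for px, py in crimes))
        scores.append(row)

    # Step 3: divide the score range into equal bands (0 = lowest, bands - 1 = highest)
    lo = min(min(row) for row in scores)
    hi = max(max(row) for row in scores)
    width = (hi - lo) / bands or 1.0
    band_index = [[min(int((s - lo) / width), bands - 1) for s in row] for row in scores]
    return scores, band_index


# A small grid keeps the example fast; n=750 gives the 562,500 cells described above
scores, band_index = density_map([(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)], n=20)
```

Mapping each band index to a color and drawing the result over the street map is left out, since the rendering depends on the mapping component in use.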

GeoProfile uses a much higher resolution than the other applications for two reasons. First, it is hoped that including more information might offer a slight improvement in accuracy. Second, the embedded mapping software allows for a high degree of zoom; the higher resolution allows for smoother contour lines at greater levels of zoom. These benefits would not be worth a significantly slower processing time. However, the language used to program GeoProfile allowed for the evaluation of 562,500 cells without compromising speed.

The Development of GeoProfile

After the design for GeoProfile was established, Jim English, an applications developer for Taylor University at the time, was recruited to develop the current prototype. English chose the programming language and software components used to develop GeoProfile. He also tweaked the layout and functionality of the user interface to fit the standards used by the software industry. English’s role remained that of developer; he did not make any significant changes to the design of GeoProfile. However, developing a program to fit the often incompatible goals of simplicity, speed, and accuracy while maintaining affordability presented a significant challenge; English worked extremely hard and found several creative solutions to meet these goals.

English made another major contribution to the project by coding the software’s architecture using a flexible framework. He developed GeoProfile so that the various components can be easily switched out for others without modifying the core code. For example, if Google changes their licensing in a way that causes GeoProfile to violate their terms, a different open source mapping software could be implemented without having to change any of the code at the core of the application. The same applies for the database, geocoding, and user interface components. This flexibility allows GeoProfile to adapt to changes in the software industry as well as new research in the field of geographic profiling.

Method

Sample

The data used in this study were crime scene and home locations of 55 serial offenders from Baltimore County, Maryland. The series were selected from a sample data set distributed with CrimeStat that included data on 88 serial offenders. Levine confirmed that the data originate from police records and have been used by other researchers to analyze geographic profiling systems (N. Levine, personal communication, January 18, 2008).

This study follows the evaluation methodology outlined by Rossmo (2005a). Of the 88 series in the data set, 33 were removed for violating the assumptions of geographic profiling. Rossmo (2005a) recommends only including series with five or more crimes. Accordingly, series with fewer than five crimes were removed from the data set. Geographic profiling software also assumes the offender is a marauder (Rossmo, 2005a). Therefore, series with home locations outside the convex hull of crimes were removed. Lastly, Rossmo (2005a) argues that the geographic profiling model assumes a relatively uniform area for travel and target opportunity. Consequently, series were removed where large bodies of water or sections of forest disrupted the crime distribution.
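The first two exclusion criteria lend themselves to automation. The sketch below is one possible implementation, not the procedure actually used in this study: it builds the convex hull with Andrew's monotone chain algorithm and keeps a series only if it has at least five crimes and the home location lies inside or on the hull. Function names are hypothetical and coordinates are treated as planar. The third criterion, disruption by water or forest, requires inspecting a map and is not covered.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside_hull(point, hull):
    """True if point lies inside or on the boundary of a counter-clockwise hull."""
    if len(hull) < 3:
        return False
    return all(
        cross(hull[i], hull[(i + 1) % len(hull)], point) >= 0
        for i in range(len(hull))
    )


def keep_series(crimes, home, min_crimes=5):
    """Apply the crime-count and marauder criteria to one series."""
    return len(crimes) >= min_crimes and inside_hull(home, convex_hull(crimes))


# Hypothetical series: four corners of a square plus one interior crime
crimes = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2)]
```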

Apparatus

Comparing GeoProfile to all three of the major geographic profiling applications would have produced a fuller picture of where GeoProfile ranks among the others in terms of accuracy. However, Rossmo denied a request to use Rigel Analyst in this study (P. MacLaren, personal communication, April 8, 2008). The purpose of establishing the reliability of GeoProfile is to allow researchers and investigators to confidently use the system. Furthermore, one of the major reasons for developing GeoProfile was to afford this researcher the opportunity to adapt the algorithm to test new hypotheses suggested by future literature. Although Rossmo was made aware of these goals in a research proposal, he denied the request on grounds that developing a new system, even one that could be modified to experiment with new ideas, would add nothing new to the field (P. MacLaren, personal communication, April 8, 2008). Rossmo feels that Rigel Analyst’s algorithm “has proven to be sufficiently robust for real world problems” (P. MacLaren, personal communication, April 8, 2008). However, researchers continue to suggest new directions that may improve the accuracy and practicality of geographic profiling software (Paulsen, 2007). Even Rossmo devotes a “significant ongoing research and development effort” to his program (P. MacLaren, personal communication, April 8, 2008).

Levine and Canter take a different approach to geographic profiling research. Each provided a copy of his program and discussed his research and this study through email correspondence. As a result, this study compares GeoProfile to CrimeStat 3.1 and Dragnet-K.

Procedure

The data set of 55 crime series was analyzed by CrimeStat, Dragnet, and GeoProfile. Because each application uses a different input format, computer programs were created to convert the data into the correct format for each program. GeoProfile’s importer directly added the data set in its original form. Because the data set is distributed as sample data with the software, CrimeStat also imported it without modification. However, CrimeStat requires an additional reference grid to tell the software which coordinates to use in its calculations. A computer program was created to output the same 750-by-750 reference grid used by GeoProfile for each series. Dragnet does not take latitude-longitude coordinates as input, but uses positive X-Y coordinates with an axis originating at 0, 0. A computer program was created to convert the data set to the X-Y coordinates in a file format readable by Dragnet.
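A conversion from latitude-longitude to positive X-Y coordinates could look like the sketch below. The source does not describe the conversion that was actually used; this version assumes an equirectangular approximation, which is reasonable over an area the size of a county, places the origin at the minimum latitude and longitude of the series so that all values are non-negative, and reports kilometers. The function name and the sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def to_positive_xy(points):
    """Convert (lat, lon) pairs in degrees to non-negative planar (x, y) in km.

    Uses an equirectangular approximation (an assumption) with the origin at
    the south-west corner of the series' bounding box.
    """
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    lat0, lon0 = min(lats), min(lons)
    # Scale longitude by the cosine of the mid-latitude of the series
    cos_mid = math.cos(math.radians((min(lats) + max(lats)) / 2))
    return [
        (
            EARTH_RADIUS_KM * math.radians(lon - lon0) * cos_mid,
            EARTH_RADIUS_KM * math.radians(lat - lat0),
        )
        for lat, lon in points
    ]


# Hypothetical coordinates in the Baltimore area
xy = to_positive_xy([(39.3, -76.6), (39.4, -76.5), (39.35, -76.55)])
```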

After inputting the data, the series were analyzed by each application using its default settings. Dragnet defaults to a negative exponential distance decay function with a normalization procedure. CrimeStat also defaults to a negative exponential function with the default variables defined as follows: B=-0.06; A=1.89. Likewise, GeoProfile defaults to a negative exponential function with the default variables defined as follows: B=-0.2; A=1.89. Next, the search cost was calculated for the 55 crime series on each of the software systems. Because differences in the size and aspect ratio of the reference area affect the search cost measurement, a standardized definition for the reference area was used on all three applications. When calculating search cost, the reference area was redefined as the area falling within the smallest possible rectangle containing all the crime scenes in a series. This definition allowed for consistent measurements between software systems and also resulted in conservative search costs. The algorithm built into GeoProfile already calculates search cost according to this definition. However, Dragnet calculates search cost using the entire reference area. A program was created to calculate the search cost for the series analyzed by Dragnet. CrimeStat does not feature a search cost function, so a similar program was created to calculate the search cost for CrimeStat.
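The standardized search cost might be computed as in the following sketch, which restricts the calculation to cells whose centers fall inside the smallest rectangle containing all crime scenes. The data layout, the function name, and the rule of taking the cell nearest the home location as the home cell are assumptions; the programs written for this study are not reproduced in the text.

```python
def bounded_search_cost(cells, crimes, home):
    """Search cost (%) over the bounding rectangle of the crime scenes.

    cells:  list of (x, y, score) tuples giving each cell center and score
    crimes: list of (x, y) crime scene locations
    home:   (x, y) home location, assumed to lie inside the rectangle
    """
    min_x, max_x = min(x for x, _ in crimes), max(x for x, _ in crimes)
    min_y, max_y = min(y for _, y in crimes), max(y for _, y in crimes)
    inside = [
        (x, y, s) for x, y, s in cells
        if min_x <= x <= max_x and min_y <= y <= max_y
    ]
    # Treat the cell whose center is nearest the home location as the home cell
    home_score = min(
        inside, key=lambda c: (c[0] - home[0]) ** 2 + (c[1] - home[1]) ** 2
    )[2]
    # Cells tied with the home cell count as searched (conservative)
    searched = sum(1 for _, _, s in inside if s >= home_score)
    return 100.0 * searched / len(inside)


# Hypothetical 4-by-4 grid whose scores peak at the cell centered on (1.5, 1.5)
cells = [
    (c + 0.5, r + 0.5, 10 - abs(c + 0.5 - 1.5) - abs(r + 0.5 - 1.5))
    for r in range(4) for c in range(4)
]
cost = bounded_search_cost(cells, [(1.0, 1.0), (3.0, 3.0)], (2.4, 1.6))
```

Only four of the sixteen cells fall inside the rectangle in this example, so the resulting percentage is larger than it would be over the full grid, which is the sense in which the definition is conservative.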

Rossmo (2005a) makes a strong case for using search cost as an exclusive accuracy measure. However, most researchers in the field also use error distance. For this reason, error distance was also calculated in order to strengthen the results. The error distance was calculated in kilometers for each of the 55 crime series on the three applications. GeoProfile features an error distance function. Because CrimeStat lacks this capability, a computer program was created to find its error distances. Although Dragnet does not report error distance, it can measure the distance between two points. First, the cell of highest probability was found. Occasionally, Dragnet produces a probability map with multiple peaks of equal probability. In these cases, the most central cell was selected as the apex of the map. If no central point presented itself, the cell closest to the home location was used. After finding the cell with the highest probability, the distance was measured between that point and the home location using Dragnet’s distance tool.

Results

Table 1: Non-Inferiority Tests
Accuracy Measure   Application   Non-Inferior   P-Value
Search Cost        CrimeStat     Yes            .00001
Search Cost        Dragnet       Yes            .0023
Error Distance     CrimeStat     No             .0556
Error Distance     Dragnet       Yes            .000001

Because the aim of this study is to establish the reliability of GeoProfile by testing whether or not it performs as accurately as existing geographic profiling software, standard significance testing cannot be used to provide support for this research question. Failing to reject a null hypothesis of no difference between groups is not evidence for their equivalence (Rogers, Howard, & Vessey, 1993). Researchers in the field of pharmacology face the same problem when attempting to demonstrate that a new drug works as well as the current drug on the market. Biostatisticians use a procedure known as equivalence testing to establish that no practical difference exists between two drugs.

Rogers et al. (1993) introduced equivalence testing to psychology. They argue that the procedure could provide further analysis of negative or borderline results. In cases where a traditional significance test failed to show a difference, an equivalence test could do one of two things. If the equivalence test demonstrates no practical difference between the two groups, it would suggest that the negative result indicates a similarity between the groups (Rogers et al., 1993). However, if the test failed to show equivalence, the results remain ambiguous; the groups are neither significantly different nor statistically equivalent (Rogers et al., 1993). The converse of this method applies to this study. Equivalence testing is used to evaluate the hypotheses while traditional significance testing is used to analyze any negative results.

Figure 2: Cumulative Search Cost

In equivalence testing, the mean of the test subject is compared to the mean of the reference subject. If the test mean falls within a predetermined range of the reference mean, then the test subject is considered equivalent to the reference subject. This is done in one of two ways, using either the null hypothesis approach or the confidence interval approach (Rogers et al., 1993). In the null hypothesis approach, two one-sided t-tests, one for each side of the mean, are used to calculate a p-value (Rogers et al., 1993). The null hypothesis states that the difference between the means of the two groups is as great as or greater than the predetermined difference. The difference determined a priori by the researcher is expressed as a percentage of the mean. The FDA requires that the mean effect of the new drug be within ±20% of that of the reference drug (Luzar-Stiffler & Stiffler, 2002). The confidence interval approach uses the standard t-test for the difference of means to find the (1 − 2α) confidence interval (Rogers et al., 1993). The test subject is considered equivalent to the reference subject if the confidence interval for the test mean falls entirely within the equivalence range defined by the predetermined percentage of the reference mean (Rogers et al., 1993).
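The null hypothesis approach can be written out formally as follows. The notation is introduced here for illustration and is not taken from the cited sources: μ_T and μ_R denote the test and reference means, and δ denotes the predetermined difference, shown here at the 20% level.

```latex
\begin{aligned}
H_{01}&: \mu_T - \mu_R \ge \delta   &\quad\text{vs.}\quad H_{11}&: \mu_T - \mu_R < \delta \\
H_{02}&: \mu_T - \mu_R \le -\delta  &\quad\text{vs.}\quad H_{12}&: \mu_T - \mu_R > -\delta \\
\delta &= 0.20\,\mu_R
\end{aligned}
```

Equivalence is concluded only when both one-sided null hypotheses are rejected at level α, which corresponds to the (1 − 2α) confidence interval for μ_T − μ_R lying entirely between −δ and δ.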

Figure 3: Cumulative Error Distance

In certain circumstances, a test subject that performed significantly better than the reference subject would be considered an acceptable, even desirable, result; however, it would fail the equivalence test. In these cases, a non-inferiority test, which only looks at one side of the equivalence range, is used. That is the situation in this study. If GeoProfile performs significantly better than CrimeStat or Dragnet on an accuracy measure, it would fail an equivalence test but still support the hypothesis that GeoProfile performs as accurately as existing applications. As a result, non-inferiority tests were used to evaluate only the upper limit of the equivalence range.

Equivalence testing requires normality, an assumption the data produced by this study fails. According to the Anderson-Darling test for normality, both measures of accuracy resulted in non-normal data (p < .005) for all three software systems. To correct the problem, a natural log transformation was applied to the data. The transformed data meets the assumption of normality.

Figure 4: Scaled Cumulative Search Cost
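The normality check and log transformation can be sketched with the Python standard library. This is an illustration, not the statistical package used in this study. It computes the small-sample adjusted Anderson-Darling statistic for a normal distribution with estimated mean and standard deviation, and compares it to 0.752, the commonly tabulated 5% critical value for that case, instead of reporting a p-value. The function names and the synthetic sample are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def anderson_darling(values):
    """Adjusted Anderson-Darling statistic A*^2 for normality,
    with the mean and standard deviation estimated from the sample."""
    n = len(values)
    m, s = mean(values), stdev(values)
    z = sorted((v - m) / s for v in values)
    std = NormalDist()
    total = 0.0
    for i in range(n):
        # 1 - cdf(z) is computed as cdf(-z) to stay accurate in the upper tail
        total += (2 * i + 1) * (
            math.log(std.cdf(z[i])) + math.log(std.cdf(-z[n - 1 - i]))
        )
    a2 = -n - total / n
    return a2 * (1 + 0.75 / n + 2.25 / n ** 2)

def is_normal(values, critical=0.752):
    """True when normality is not rejected at the 5% level."""
    return anderson_darling(values) < critical

# Synthetic right-skewed sample of 55 values (not study data)
raw = [math.exp(NormalDist().inv_cdf((i + 0.5) / 55)) for i in range(55)]
logged = [math.log(v) for v in raw]  # natural log transformation
print(is_normal(raw), is_normal(logged))
```

The natural log requires strictly positive values, so a search cost or error distance of exactly zero would need separate handling before transformation.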

It was hypothesized that GeoProfile would perform as accurately as Dragnet and CrimeStat as measured by search cost and error distance. To evaluate that hypothesis, four non-inferiority tests were performed using Schuirmann’s one-sided test to obtain a p-value. The upper equivalence range was defined a priori as 20% worse than the reference mean. Because the data was converted to a natural log scale, the upper limit of the equivalence range was also converted to the same scale. As a result, the upper equivalence limit was defined as 125% of the reference mean of the transformed data (Luzar-Stiffler & Stiffler, 2002).
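A one-sided non-inferiority test of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis run for this study. It assumes a paired design (the same series analyzed by both applications), lower scores indicating better accuracy, and log-transformed inputs. The margin is passed in as the allowed excess over the reference mean; the example sets it to 25% of the reference mean to mirror the 125% limit described above. The function names and data are hypothetical, and the t distribution is integrated numerically to avoid third-party dependencies.

```python
import math
from statistics import mean, stdev

def t_cdf(t, df, steps=2000):
    """Student's t CDF via Simpson integration of the density."""
    coef = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    coef /= math.sqrt(df * math.pi)

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(upper)
    for k in range(1, steps):
        area += (4 if k % 2 else 2) * pdf(k * h)
    area *= h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def noninferiority_p(test, ref, margin):
    """p-value for H0: mean(test - ref) >= margin (test is inferior).

    A small p-value rejects H0 and supports non-inferiority.
    """
    diffs = [t - r for t, r in zip(test, ref)]
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)
    t_stat = (mean(diffs) - margin) / se
    return t_cdf(t_stat, n - 1)

# Illustrative log-scale scores (not study data)
ref = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
test = [1.1, 1.9, 3.1, 3.9, 5.1, 5.9]
margin = 0.25 * mean(ref)  # upper limit at 125% of the reference mean
print(noninferiority_p(test, ref, margin))
```

Because only the upper limit is tested, an application that performs much better than the reference still passes, which is the behavior the hypothesis of this study calls for.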

GeoProfile was non-inferior to CrimeStat (p = .00001) and Dragnet (p = .0023) when search cost was used as a measure of accuracy. GeoProfile was non-inferior to Dragnet (p = .000001) when error distance was used as a measure of accuracy. GeoProfile was not non-inferior to CrimeStat (p = .0556) when error distance was used as a measure of accuracy. However, a significant difference between the mean error distances of GeoProfile and CrimeStat was not found (p = .3372). In this instance, even the established applications were not equivalent. Dragnet was also not non-inferior to CrimeStat (p = .6332) when error distance was used as a measure of accuracy, and a significant difference between the mean error distances of Dragnet and CrimeStat was found (p = .0022). Table 1 contains a summary of all four non-inferiority tests.

Table 2: Search Costs (%)
Application   Median   Mean    SD
Dragnet       4.52     23.50   40.99
GeoProfile    4.99     24.55   44.90
CrimeStat     8.82     26.25   42.21

Table 3: Error Distance (km)
Application   Median   Mean   SD
GeoProfile    1.26     3.18   3.98
CrimeStat     1.33     2.69   3.16
Dragnet       1.56     3.28   3.91

Figure 2 illustrates the software systems’ comparative performance by graphing the percentage of the cumulative sample with a particular search cost. Figure 3 demonstrates the same principle but with the first five kilometers of error distance. If GeoProfile’s results are comparable to those of CrimeStat and Dragnet, then its line would be expected to closely follow the plots of the other two software systems, which is what both graphs demonstrate. Three-quarters of the sample has a search cost of 25% or less for all three software systems. Consequently, a great deal of information is lost when looking at the graphs at full scale. Figure 4 plots the cumulative sample for search costs of 25% and below. As demonstrated by this graph, the pattern seen in the previous search cost graph persists when the graph is rescaled to recover the lost detail.
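The cumulative curves plotted in these figures can be computed as in the following sketch, which returns the percentage of series whose score falls at or below each threshold. The function name and the sample values are hypothetical and are not drawn from the study data.

```python
def cumulative_percent(values, thresholds):
    """Percentage of the sample with a score at or below each threshold."""
    n = len(values)
    return [100.0 * sum(v <= t for v in values) / n for t in thresholds]

# Illustrative search costs in percent (not study data)
costs = [1, 5, 10, 30]
print(cumulative_percent(costs, thresholds=[5, 25, 100]))  # [50.0, 75.0, 100.0]
```

Restricting the thresholds to 25% and below gives the rescaled view used for Figure 4.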

Because the data is non-normal, the median serves as the relevant measure of central tendency for the untransformed data. Table 2 lists the medians, as well as the mean and standard deviation, for the search cost produced by each software system. Table 3 shows the same measures for error distance.

Discussion

The purpose of this study was to develop and establish the reliability of a new geographic profiling software system. In a comparison of search costs, GeoProfile was non-inferior to both CrimeStat and Dragnet. In a comparison of error distances, GeoProfile was found to be non-inferior to Dragnet. Although GeoProfile was not shown to be non-inferior to CrimeStat when error distances were used as an accuracy measure, a significant difference was not found either; the findings therefore remain ambiguous in this instance. GeoProfile was shown to be non-inferior to at least one of the major geographic profiling applications in each category of accuracy measures. Consequently, the hypothesis that GeoProfile performs as accurately as the established software systems is supported by these results. Furthermore, the performance graphs illustrate that GeoProfile follows the same pattern of performance at each increment of the data set as CrimeStat and Dragnet on both sets of accuracy measures.

Including Rigel Analyst in the study would have provided a clearer picture of where GeoProfile stands in terms of accuracy in relation to all three of the major geographic profiling applications. Although Rossmo denied the request to use his software for this study, he may grant future requests once these results have been published; a follow-up study with Rigel Analyst included could then be conducted to confirm this study’s results. A sample that included series from multiple jurisdictions around the country would have strengthened the results by eliminating any confounding variables related to the geography and demographics of Baltimore County.

The introduction of GeoProfile to the collection of existing applications has three potential effects. First, GeoProfile may lessen the time and effort required to perform geographic profiling research. Second, the simplified user interface and workflow may appeal to investigators who would be willing to use a geographic profiling tool only if it were fast and easy to use. As the intended users of geographic profiling research, investigators could offer valuable feedback on their experience using the software in the context of actual investigations. The continuing goal for GeoProfile is to remain responsive to the needs of investigators. Additional features could be added at the request of investigators to make it a more powerful and helpful tool. For instance, investigators may wish to include GeoProfile’s analysis in their reports; a feature that prints maps in a format that meets the requirements of case files could be added to fulfill this need. Third, GeoProfile makes it possible for other researchers to experiment with geographic profiling strategies. Researchers who have not developed their own application are limited in the types of research questions they can answer because they cannot modify existing software.

Contrary to Rossmo’s suggestion that the introduction of another application has no research value, enough problems remain in the field to occupy researchers for some time to come. For example, researchers still do not have a reliable way to classify an unknown serial offender as a commuter or a marauder; Paulsen (2007) has only begun exploration into this much-needed area of research. Additionally, the current probability strategies apply a generalized model to local areas that differ widely from each other. The model does not take into account road networks, population densities, and empty areas of land or water. Even Levine’s (2004) methods of calibration and kerneling, which use data on the distances travelled from the anchor point to the crime scene, do not directly consider local features. Developing a probability strategy that responds to local characteristics might significantly increase the accuracy of geographic profiling predictions. At least one profiler has asked for such functionality (Strangeland, 2005). With this area of research still unexplored, it seems premature for Rossmo to suggest no further need for research into the probability strategies (P. MacLaren, personal communication, April 8, 2008).

Neural networks may provide a way for geographic profiling software to learn local features of a reference area. Modeled after the way neurons collectively operate in the brain, neural networks have been used in many fields to find solutions to problems (Krose & Smagt, 1996). They work especially well where patterns are involved. A neural network trained with crime data from a particular jurisdiction could adjust the hit score initially calculated by the probability strategy. Alternatively, a neural network might recognize the pattern well enough on its own without the guidance of a probability strategy. In addition to recognizing local features, neural networks may also discern patterns in how offender characteristics affect crime scene locations. Neural networks may also help classify an offender as either marauder or commuter. Training a neural network on the variables Paulsen (2007) found to have predictive value would be a good starting point. Future research will determine if neural networks have anything to contribute to geographic profiling.

GeoProfile’s web-based approach could potentially offer an additional resource for geographic profiling researchers. With the application hosted on a centralized web server, voluntary sharing of data sets could be facilitated through GeoProfile. Researchers who use GeoProfile to analyze crime series could choose to share their data by selecting an option in the settings menu. The series would then post to a listing through which other researchers could browse. An exporter tool could then be added to GeoProfile to allow researchers to use the shared data sets with other software. Likewise, investigators using GeoProfile could also choose to share their data with researchers by selecting a similar option in the settings menu. GeoProfile would then replace identifying information with unique IDs.

However, the web-based nature of GeoProfile may discourage investigators, and possibly some researchers, from using the application due to the sensitive nature of the data. The advantage of desktop applications is that the data is saved to that particular computer, which is usually in a secure location. To help overcome this fear, each account on GeoProfile would be password protected and the database would be encrypted so that even the site’s operators could not access the data. If investigators still feared for the privacy of the data, an option could be added to allow users to save their series to their desktop instead of the database on the web server.

GeoProfile could also be modified to work on mobile devices such as smartphones. Most contemporary phones have the advantage of built-in global positioning system (GPS) capability. Investigators would then be able to walk right up to the body of a homicide victim and mark the exact location in GeoProfile using their cell phone. This would be especially helpful for crime scenes without an address, such as a body dump in a forest.

With GeoProfile’s reliability supported by the findings of this study, it is hoped that the software can now contribute to the field of geographic profiling. The next step is to introduce the application to researchers through publication, presentation, and personal communication.

References

  • Bennell, C., Snook, B., Taylor, P.J., Corey, S., & Keyton, J. (2007). It’s no riddle, choose the middle: The effect of number of crimes and topographical detail on police officer predictions of serial burglars’ home locations. Criminal Justice and Behavior, 34, 119-132.
  • Brantingham, P., & Brantingham, P. (1981). Environmental criminology. Beverly Hills, CA: Sage.
  • Brantingham, P., & Brantingham, P. (1984). Patterns in crime. New York: Macmillan.
  • Canter, D., & Larkin, P. (1993). The environmental range of serial rapists. Journal of Environmental Psychology, 13, 63–69.
  • Canter, D., Coffey, T., Huntley, M., & Missen, C. (2000). Predicting serial killers’ home base using a decision support system. Journal of Quantitative Criminology, 16, 457–478.
  • Canter, D., & Lundrigan, S. (2001). Spatial patterns of serial murder: An analysis of disposal site location choice. Behavioral Sciences and the Law, 19, 595-610.
  • Canter, D. (2003). Mapping murder: Walking in killers’ footsteps. London: Virgin Books.
  • Canter, D., & Hammond, L. (2006). A comparison of the efficacy of different decay functions in geographical profiling for a sample of US serial killers. Journal of Investigative Psychology and Offender Profiling, 3, 91-103.
  • Cribbie, R.A., Gruman, J.A., & Arpin-Cribbie, C.A. (2004). Recommendations for applying tests of equivalence. Journal of Clinical Psychology, 60, 1-10.
  • Godwin, M. (n.d.) Retrieved May 31, 2008 from http://www.investigativepsych.com/predator.htm
  • Krose, B., & Smagt, P. (1996). An introduction to neural networks. Amsterdam: The University of Amsterdam.
  • LeBeau, J. (1987). Patterns of stranger and serial rape offending: Factors distinguishing apprehended and at large offenders. Journal of Criminal Law and Criminology, 78, 309-326.
  • Levine, N. (2004). Crimestat: A spatial statistics program for the analysis of crime incident locations (version 3). Washington, DC: National Institute of Justice.
  • Lundrigan, S., & Canter, D. (2001). Spatial patterns of serial murder: An analysis of disposal site location choice. Behavioral Sciences & the Law, 19, 595-610.
  • Luzar-Stiffler, V., & Stiffler, C. (2002). Equivalence testing the easy way. Journal of Computing and Information Technology, 3, 233-239.
  • O’Leary, M. (2006, July). A new mathematical technique for geographic profiling. Poster session presented at the NIJ Conference, Washington, DC.
  • Paulsen, D. J., & Robinson, M. B. (2004). Spatial aspects of crime: Theory and practice. Boston: Allyn and Bacon.
  • Paulsen, D. (2006a). Connecting the dots: Assessing the accuracy of geographic profiling software. Policing: An International Journal of Police Strategies and Management, 29, 306-334.
  • Paulsen, D. (2006b). Human versus machine: A comparison of the accuracy of geographic profiling methods. Journal of Investigative Psychology and Offender Profiling, 3, 77-89.
  • Paulsen, D. (2007). Improving geographic profiling through commuter/marauder prediction. Police Practice and Research, 8, 347-357.
  • Rogers, J.L., Howard, K.I., & Vessey, J.T. (1993). Using significance tests to evaluate equivalence. Psychological Bulletin, 113, 553-565.
  • Rossmo, D. K. (2000). Geographic profiling. Boca Raton, FL: CRC Press.
  • Rossmo, D.K. (2005a). An evaluation of NIJ’s evaluation methodology for geographic profiling software. San Marcos, TX: Center for Geospatial Intelligence and Investigation.
  • Rossmo, D.K. (2005b). Commentary: Geographic heuristics or shortcuts to failure?: Response to Snook et al. Applied Cognitive Psychology, 19, 651-654.
  • Rossmo, D.K. (n.d.). Rigel Quick Feature Tour. Retrieved May 31, 2008 from http://www.geographicprofiling.com/rigel/tour.html
  • Rich, T., & Shively, M. (2004). A methodology for evaluating geographic profiling software. Cambridge, MA: Abt Associates.
  • Snook, B., Canter, D. V., & Bennell, C. (2002). Predicting the home location of serial offenders: A preliminary comparison of the accuracy of human judges with a geographic profiling system. Behavioral Sciences & the Law, 20, 1–10.
  • Snook, B. (2004). Individual differences in distance travelled by serial burglars. Journal of Investigative Psychology and Offender Profiling, 1, 53-66.
  • Snook, B., Taylor, P. J., & Bennell, C. (2004). Geographic profiling: The fast, frugal and accurate way. Applied Cognitive Psychology, 18, 105–121.
  • Snook, B., Cullen, R., Mokros, A., & Harbort, S. (2005). Serial murders’ spatial decisions: Factors that influence crime location choice. Journal of Investigative Psychology and Offender Profiling, 2, 147-164.
  • Snook, B., Zito, M., Bennell, C., & Taylor, P. J. (2005). On the complexity and accuracy of geographic profiling strategies. Journal of Quantitative Criminology, 21, 1–26.
  • Strangeland, P. (2005). Catching a serial rapist: Hits and misses in criminal profiling. Police Practice and Research, 6, 453-469.
  • Turner, S. (1969). Delinquency and distance. In Wolfgang, M. E., and Sellin, T. (eds.), Delinquency: Selected studies, John Wiley, New York.
  • Warren, J., Reboussin, R., Hazelwood, R., Cummings, A., Gibbs, N. & Trumbetta, S. (1998). Crime scene and distance correlates of serial rape. Journal of Quantitative Criminology, 14, 35-59.