Friday, April 7, 2017

Exercise 6: Data Normalization, Geocoding, and Error Assessment

Introduction

The goals of this assignment were to normalize a data table containing addresses and PLSS locations for mines in western Wisconsin, geocode the addresses of the mines using the address locator tool in ArcMap, and compare the locations I picked for the mine sites to the locations picked by my classmates and the official locations from the DNR. This assignment provided insight into how a geocoding process might look in the professional world.

Methods

The first step in this assignment was to normalize the provided data table. This was done in Excel by breaking up the provided addresses into separate fields: PLSS location, street address, city, city/town/village, county, and state. This ensured that no single field held too much information and made the table more organized and easier to navigate when geocoding.
Figure 1: The highlighted fields indicate normalized data fields.
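The normalization itself was done by hand in Excel, but the idea can be sketched in a few lines of Python. This is only an illustration: the layout of the raw address string, the field names, and the sample record are all assumptions, since the original spreadsheet is not reproduced in this post.

```python
# Hypothetical raw record. The real spreadsheet's layout is not shown in this
# post, so this string and the field names below are assumptions.
RAW = "S12 T24N R10W; 123 County Road A, Town of Example, Example County, WI"

FIELDS = ("plss", "street", "municipality", "county", "state")


def normalize_address(raw):
    """Split one combined address string into one value per field."""
    # Everything before the first semicolon is treated as the PLSS location.
    plss, _, rest = raw.partition(";")
    # The remainder is assumed to be four comma-separated parts.
    parts = [p.strip() for p in rest.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 comma-separated parts, got {len(parts)}")
    return dict(zip(FIELDS, [plss.strip()] + parts))


record = normalize_address(RAW)
```

Raising an error on an unexpected number of parts keeps oddly formatted rows from silently landing in the wrong columns, which is the same problem normalization is meant to prevent.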
Once the table was normalized, the next step was to geocode the addresses of the mines in ArcMap. First, the table was imported into the geocoding toolbar and the input fields were established.

Figure 2: The sheet containing the normalized address data was used and the highlighted fields were selected from the normalized data table.
Once the tool was set up, ArcMap automatically matched the addresses and added the results to the map as a shapefile.
Figure 3: Matching locations in ArcMap.
From there, the Rematch Addresses function of the geocoding tool was used to review the candidate locations for each address. Each candidate could then either be verified by the user or replaced with a manually selected point, which became the new location for that mine in the shapefile.
Figure 4: Editing and verifying address points in the mines shapefile. The mine in this photo is highlighted by a red circle.
Upon completion of editing and verifying the locations of mine addresses, the changes were saved as a shapefile layer. Next, the shapefiles of three classmates who were assigned some of the same mines were imported into my map, along with the DNR's official mine locations shapefile. The merge tool was used to join my classmates' attribute tables and my own together based on the mine unique ID field.
Figure 5: Merge tool inputs.
After the merge tool completed, a composite table was created and added to the map as a feature layer, and its "Mine_Uniqu" field was sorted in ascending order. I then worked through the table, using the measure tool to measure the distance from each of my mine locations to the matching location from my classmates. These values were recorded in a separate table and used to analyze the error in my locations versus the official ones.
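The distances in this exercise were measured by hand with the measure tool, but the same comparison can be sketched in Python. The sketch below pairs points that share a mine ID and computes a great-circle (haversine) distance between them. The coordinates and IDs are hypothetical, and the haversine formula is a stand-in that may not match exactly what the measure tool reports.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; a spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Hypothetical (lat, lon) points keyed by mine unique ID; not the real data.
my_points = {101: (44.80, -91.50), 102: (44.95, -91.30)}
classmate_points = {101: (44.80, -91.50), 102: (44.96, -91.30), 103: (44.60, -91.90)}


def distances_by_id(a, b):
    """Distance in meters for every mine ID present in both point sets."""
    shared = sorted(a.keys() & b.keys())
    return {mine_id: haversine_m(*a[mine_id], *b[mine_id]) for mine_id in shared}


distances = distances_by_id(my_points, classmate_points)
```

Only IDs that appear in both sets are compared, which mirrors sorting the merged table by the mine ID field and measuring between rows that share an ID.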

Results

After following all of the procedures described in the methods section, a map of the locations I used, a screen grab of my classmates' locations, and a comparison table were created.

Figure 6: Map of my mine locations.
Figure 7: Screen grab of my mine locations (red) and my classmates' mine locations.
Figure 8: Comparison table.
Discussion

The quality and accuracy of the resulting data can be examined using the results above and the fourth chapter of C.P. Lo's Concepts and Techniques in Geographic Information Systems. As figure 8 shows, the average distance between my mine locations and my classmates' locations was over 8,000 meters, and the average distance between my locations and the DNR's was over 12,000 meters. This is an enormous difference, especially considering the sheer amount of sand mining that occurs in western Wisconsin. A difference that large could mean that a point marks a completely different mine, and in some cases it did. These differences could be the result of gross errors: large errors that can be easily detected and are usually caused by inadequate training or failure to adhere to standard procedures. In this assignment, gross errors likely arose from uncertainty in the data and from confusion over using the PLSS addresses to locate the mines. Because so many mines are visible in the imagery of western Wisconsin, there could be multiple mines within the same section and on the same road.

Other sources of error were misuse of, or blunders with, the geocoding tool. Both my classmates and I ran into errors or confusion when selecting a point from the map. Sometimes a point would be selected and it appeared that the tool didn't recognize the selection, so the user clicked again, creating two points for the address without realizing it. At other times the tool wouldn't use the point the user selected, but would place the point a few meters off.
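Since gross errors are by definition large and easy to detect, one way to separate them from small placement differences is to flag any distance above a cutoff. The Python sketch below does that for a handful of made-up values. The distances are hypothetical (they are not the values from figure 8), and the 1,000 meter threshold is an arbitrary assumption rather than a standard.

```python
from statistics import mean

# Hypothetical distances in meters keyed by mine ID; NOT the figure 8 values.
distances_m = {101: 35.0, 102: 120.0, 103: 15400.0, 104: 60.0}

# Arbitrary cutoff chosen for illustration only.
GROSS_ERROR_THRESHOLD_M = 1000.0


def summarize(distances, threshold):
    """Return the average distance and the IDs whose distance exceeds threshold."""
    average = mean(distances.values())
    flagged = sorted(mine_id for mine_id, d in distances.items() if d > threshold)
    return average, flagged


average_m, flagged_ids = summarize(distances_m, GROSS_ERROR_THRESHOLD_M)
```

In this made-up sample a single mismatched mine pulls the average into the thousands of meters even though the other three points are close, which is one way averages as large as those in this exercise can come about.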

Determining the accuracy of locations in this exercise was difficult. For example, the DNR dataset containing the "actual mine locations" had a few points that clearly were inaccurate (e.g., in the middle of a crop field 200 meters away from the mine). Obviously, some of the points that I selected were also inaccurate, so it is difficult to tell which source is correct. The only way to do so would be to consult multiple sources and find the most recurring point, or to go to the location yourself and record the site's geographic coordinates. Collecting that information, however, takes time and resources.

Conclusion

Overall, this dataset would need another comb-through before it could be used for further analysis, because of the inaccuracies and uncertainties associated with it. To ensure data accuracy, the geographer would need to visit these locations and record the geographic coordinates, given the clear lack of precision within the datasets used. On the other hand, I walked away from this assignment with a better understanding of the issues associated with imperfect datasets and of how to correct inaccurate data. I also feel more comfortable using the geocoding and address locator tools in ArcMap after having done this assignment.
