ArcMap Text Points and Census Tables Lab

Due in two weeks

Objective: The objective of this lab is to keep you going with ArcMap. You will: import text files with point coordinates as maps; convert them to shapefiles; join tabular data in text files to shapefiles using both spatial and attribute joining; and use selection, sorting, summarizing, and descriptive statistics to answer some more questions. You will also learn a bit about the US Census Bureau's Census2010 data and web-access mechanisms.

Part One: Text files of point data to maps and analysis

Idea: It is really handy to be able to map and analyze data from text files containing coordinate points and attributes. That is what this part is about.

Data:

(Here's a hint for an upcoming "Write"... Notice that these are just comma-separated ASCII text files that you can 'type' and edit with editors like 'vi', 'notepad' or 'edit', with just a few formatting conventions. Notice especially the quoted strings used as column headings and the ".txt" extension. To see how close two points can be in ArcMap, try typing a file of points that get very, very, very close to each other and then seeing what "ArghGIS" does with them.)
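As a rough illustration of the format described in the hint, here is a minimal Python sketch that parses comma-separated text with quoted column headings. The column names (ID, LON, LAT, NAME) and the two sample points are made up for the example; the actual lab files have their own columns, so check them in a text editor first.

```python
import csv
import io

# Hypothetical sample in the same general style as the lab files:
# quoted column headings on the first line, one point per line after that.
SAMPLE = (
    '"ID","LON","LAT","NAME"\n'
    '1,-157.8583,21.3069,"Point A"\n'
    '2,-157.8000,21.3000,"Point B"\n'
)

def read_points(text):
    """Parse comma-separated text into a list of dicts with numeric coordinates."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for row in reader:
        # Coordinates arrive as strings; convert them so they can be used as numbers.
        row["LON"] = float(row["LON"])
        row["LAT"] = float(row["LAT"])
        rows.append(row)
    return rows

points = read_points(SAMPLE)
for p in points:
    print(p["NAME"], p["LON"], p["LAT"])
```

To try it on a real file, replace io.StringIO(text) with an open file handle and adjust the column names to match the file's first line.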

Pre-set-up: First, examine the three text files with a text editor, word processor, or the DOS/CMD "more" or "type" commands, to see what's in them and how they are laid out.

Set-up: Start with an empty ArcMap project. Optionally, set the data frame properties to use world geographic coordinates (WGS84). Add the three text files to the project (with the Add Data button).

Display each as a table (right-click, view table).

Display each as a map:

Convert each to a shapefile in your (U:/) directory:

Then add the shapefiles to your project and remove the text files.

Part One Questions:

  1. How many census blocks are in the data?
  2. How many hospital points are in the data?
  3. How many school points are in the data file?
  4. How well do these data match your understanding of those phenomena on Oahu? (Are things in the right places? Are the data complete? How would/do you know?)
  5. What is the mean population for Oahu blocks?
  6. What proportion of Oahu's blocks have a population above that mean?
  7. For the following consider a degree of arc to be approximately 110 kilometers. Use the buffer tool to create 5 km buffers (merged) around the schools and hospitals. Then use spatial selection to answer the next two questions.
  8. What proportion of Oahu's population lives within 5 km of a hospital?
  9. What proportion of Oahu's population lives within 5 km of a school?
  10. Now use spatial joins of the point sets to calculate each block's distance to hospitals and schools.
  11. What is the mean distance of an Oahu census block to a hospital?
  12. What is the mean distance of an Oahu census block to a school?
  13. Now, for a little thinking... blocks aren't people, so it might be more useful to weight blocks' distances by their populations for that type of analysis. (table, new field, calculate, distance × POP100)
  14. What is the mean distance of a person on Oahu to a hospital?
  15. What is the mean distance of a person on Oahu to a school?
  16. What assumptions about representing people's locations are we messing with to answer these last few questions?
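The arithmetic behind questions 7 and 11 through 15 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three blocks and their distances are invented, "dist" is a hypothetical field name for the distance produced by the spatial join, and POP100 is the population field named in question 13. With geographic coordinates the joined distances would come out in degrees, so the same 110 km per degree approximation would apply to them.

```python
KM_PER_DEGREE = 110.0  # the lab's approximation for a degree of arc

def km_to_degrees(km):
    """Convert a distance in kilometers to approximate degrees of arc."""
    return km / KM_PER_DEGREE

def mean_distance(blocks):
    """Unweighted mean: every block counts the same, whatever its population."""
    return sum(b["dist"] for b in blocks) / len(blocks)

def weighted_mean_distance(blocks):
    """Population-weighted mean: sum(dist * POP100) / sum(POP100)."""
    total_pop = sum(b["POP100"] for b in blocks)
    return sum(b["dist"] * b["POP100"] for b in blocks) / total_pop

# Invented example values, not lab data.
blocks = [
    {"dist": 1.0, "POP100": 300},
    {"dist": 4.0, "POP100": 100},
    {"dist": 10.0, "POP100": 0},   # an unpopulated block far from anything
]

print(km_to_degrees(5))                # buffer distance in degrees for a 5 km buffer
print(mean_distance(blocks))           # mean distance of a block
print(weighted_mean_distance(blocks))  # mean distance of a person
```

In this toy data the distant, empty block pulls the per-block mean up but has no effect on the per-person mean, which is the point of the weighting.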

Some Census Notes:

Census geography. The US census is collected and reported on a hierarchical spatial partitioning scheme. From top to bottom the hierarchy (with SUMLEV codes) is:

Nation (010) > Region > Division > State (040) > County (050) > Tract (140) > Block Group (150) > Block (101)

Census Summary Files. (Old School) SF1, SF2, SF3, and SF4 refer to particular releases of Census data, that differ in the topics and levels of detail included, and in the time required to release them. To understand them you should read their documentation, an example of which, for the 2010 SF1 data, can be found here: www.census.gov/prod/cen2010/doc/sf1.pdf.

The up-shot of that document is that the census data are distributed across a set of files (segments). The segments each contain a couple of hundred columns of data comprising several census tables, each of which might contain several variables, and report the variables at several levels of aggregation (blocks up to state). One of the segments contains geographic reference information. The data for each geographic unit can be linked across segments by Logical Record Numbers. This is the "old school" approach.

Part Two: Joining tabular data to shapefiles

Idea: Joining tables to shapefiles lets one access tabular data when needed without having to have it stored "in" the shapefile. I.e., one shapefile can be used with many different tables without having to become a huge single shapefile nor multiple shapefiles. Joining depends on two tables each having columns that contain matching "keys", so that records whose keys match can be joined together. The columns can have different names. It is the data in the columns that are used to make the join.
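The key-matching idea can be sketched in plain Python. This is a toy illustration, not what ArcMap does internally: the records and the key column name "RECNO" are invented to show that the two key columns may have different names as long as their values match.

```python
def attribute_join(left_rows, left_key, right_rows, right_key):
    """Attach columns from right_rows to left_rows wherever the key values match.

    Every left row is kept; rows without a match simply gain no new columns.
    """
    # Index the right-hand table by its key value for fast lookup.
    lookup = {row[right_key]: row for row in right_rows}
    joined = []
    for row in left_rows:
        merged = dict(row)
        match = lookup.get(row[left_key], {})
        for name, value in match.items():
            if name != right_key:      # do not duplicate the key column
                merged[name] = value
        joined.append(merged)
    return joined

# Invented example records.
geography = [
    {"LOGRECNO": "0000101", "NAME": "Tract A"},
    {"LOGRECNO": "0000102", "NAME": "Tract B"},
]
attributes = [
    {"RECNO": "0000101", "TOTAL": "1200"},
]

joined = attribute_join(geography, "LOGRECNO", attributes, "RECNO")
for row in joined:
    print(row)
```

Note that the keys here are compared as text, so "0000101" and the number 101 would not match. If a join returns no matches, mismatched key types or stripped leading zeros are worth checking first.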

In this case, you'll join the two datasets below by the logical record numbers, an attribute join.

Data

hi00042010sf1.txt is "Segment 4" of the 2010 SF1 data, containing Census tables P10, P11, P12, P13 and P14 for Hawaii. There are 33,600 rows (33,599 data rows plus one row of column names) and 244 columns of data. These census tables cover some race and age characteristic variables documented around page 6-30 in the pdf documentation noted above.

higeo2010sf1-extract.txt contains several columns of the geographic reference segment of the 2010 SF1 release, converted to csv with column labels added. NB: in Spring 2013 one quotation mark is missing in the first line of the file... add it back in!

Before set-up: Skim through the sf1.pdf description of the census data, especially Chapter 5 and Chapter 7. In Chapter 7, the Table (Matrix) Section is especially useful for locating particular data. It points to the segment file and gives the reference variable name for the particular data. Page 6-30 gets to the data tables for this exercise.

Set-up: Start with a blank ArcMap project, set the frame coordinate system, and add the higeo2010sf1-extract.txt file to the project. Select the SUMLEV that you want (probably 140 tract level, but for other uses 150=block-group and 101=block might be good; see page 4-1 of the documentation for all the options) and convert those points to a shapefile (right-click the name, Data, Export, selected features). Add that shapefile to the project and remove the cruft.
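Selecting one summary level amounts to filtering rows on the SUMLEV column. Here is a minimal Python sketch of that filter. The sample rows are invented, and the coordinate column names (LAT, LON) are placeholders; the extract's real column labels are on its first line.

```python
import csv
import io

# Invented sample rows; SUMLEV and LOGRECNO are named in the lab text,
# the LAT/LON column names are placeholders.
SAMPLE = (
    '"SUMLEV","LOGRECNO","LAT","LON"\n'
    '"050","0000010",21.46,-157.97\n'
    '"140","0000020",21.31,-157.86\n'
    '"140","0000021",21.29,-157.82\n'
    '"150","0000030",21.30,-157.85\n'
)

def select_sumlev(text, code):
    """Return only the rows whose SUMLEV matches the given code (as text)."""
    reader = csv.DictReader(io.StringIO(text))
    return [row for row in reader if row["SUMLEV"] == code]

tracts = select_sumlev(SAMPLE, "140")
print(len(tracts), "tract rows selected")
```

The code is compared as text ("140"). If a program reads the column as a number instead, the comparison value would need to be a number too.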

(Attribute) Join the hi00042010sf1.txt file to it.

The steps in a tabular join are:

The example above should add the attribute data to the points. (NB, if you had a shapefile of polygons it would work the same way.)

Part Two Questions:

  1. Which Census data table and variable contain the total population count?
  2. What data variable contains the total number of females?
  3. Which Oahu Census block groups have the most disparate ratios of males to females? Is this a pattern that you can explain?
  4. What data variables contain the number of people between 18 and 25 years old?
  5. Describe the distribution of 18 to 25 year olds. Where are they concentrated? Where are they missing?
  6. What ages are in the oldest cohort in the data?
  7. Describe the distribution pattern of that oldest cohort.

Part Three: New-School Census access to shapefiles

Objectives: Get shapefile data from the Census Bureau. Load the shapefile. "Attribute join" to link the csv file.

This is a fairly easy way to get a shapefile of census geometry, but seems to limit you to only one attribute. And the default file naming ("thematicdata.csv"?) that they use complicates keeping data straight. (You can download several variables sequentially, and should rename each .csv file with something more descriptive before the next download over-writes it.)

Background. The Census Bureau is changing the way it does things. There will be no more decennial data dumps three years after collection. Instead, American Community Survey (ACS) data are collected continuously and released for 1-, 3-, and 5-year periods. Data winnowing access...

Using American FactFinder 2... look for the "ADVANCED SEARCH" tab at the top of the page and then...

In the Blue Tabs to the left:
1. select geographies
   Census tracts
   Hawaii state
   Honolulu county
   All tracts       
2. select topics
   e.g. DP02 (social), DP03 (economic), or DP04 (housing)

3. View Tab

4. Create a Map Tab
   select a variable to map (click on the data in the table)
   show map
   download the shapefile in zip format

5. extract the files into your directory

6. check what is in the CSV file.  

7. open the shapefile and link the data in the CSV file to it.

8. symbolize the data.  (quantitative)

9. print the map.

Part Three Questions

  1. What data did you pick to map?
  2. What were the names of the "Key fields" to link the files?
  3. Was all of the 'geography' (tracts) that you expected actually in the shapefile?

Part Four: Experimental Census KML Access with Spatial Join

The census is experimenting with distributing the geometry side of their data as "kml" files. They seem to be doing it oddly --- devoid of links (like logical record numbers) to the attribute side of the data. We can use "spatial join" to join them based on location.
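Joining by location comes down to asking which polygon each point falls in. The sketch below uses a standard ray-casting point-in-polygon test in plain Python to show the idea. It is an illustration only, not ArcGIS's implementation; the two square "tracts" and three points are invented, and real tract polygons may have holes or multiple parts that this simple version ignores.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: count edge crossings of a ray going right from (x, y)."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def spatial_join(polygons, points):
    """For each named polygon, list the points that fall inside it."""
    return {
        name: [p for p in points if point_in_polygon(p["x"], p["y"], ring)]
        for name, ring in polygons.items()
    }

# Invented example geometry: two adjacent unit squares.
polygons = {
    "west": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "east": [(1, 0), (2, 0), (2, 1), (1, 1)],
}
points = [
    {"id": "a", "x": 0.5, "y": 0.5},
    {"id": "b", "x": 1.5, "y": 0.5},
    {"id": "c", "x": 3.0, "y": 3.0},   # falls in neither polygon
]

joined = spatial_join(polygons, points)
for name, hits in joined.items():
    print(name, [p["id"] for p in hits])
```

Once each point is matched to a polygon this way, the point's attributes can be carried over to the polygon, which is what makes up for the missing key fields.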

Objectives: Get KML (Keyhole Markup Language) data from the Census Bureau. Load KML data into ArcGIS. Use Spatial join to relate data to census geography units.

See www.census.gov/geo/maps-data/data/tiger-kml.html and www2.census.gov/geo/tiger/KML/2010_Proto/Readme.txt for an overview, and download the detailed Hawaii census tracts from: www2.census.gov/geo/tiger/KML/2010_Proto/2010tract_dt/ (these are detailed 2010 Census tracts in KML for each state; Hawaii is state "15").

Download "2010dttract_15.kml".


Import the kml file into Arc...
  Turn on the "Data Interoperability" extension license...
  ToolBox -> Conversion Tools -> From KML -> KML to Layer
   (it will want to make a geodatabase, let it)

Notice that Census has provided neither LOGRECNO nor FIPS
codes to link data to this geometry.

Spatial join the tract centroid points to these tract polygons.
(You may want to select them first... SUMLEV=???)

Symbolize a data variable from the P10, P11, P12, P13, or P14
tables of your choice.

Part Four Questions

  1. What variable are you mapping?
  2. Does it make more sense to map the population characteristics as points or as polygons?
  3. Was this a variable that should be normalized by area or by population?
  4. Print and submit a copy of the map.