acs.R example: downloading all the tracts in a county or state

Posted by Ezra Glenn on July 03, 2013
Census, Code

An acs.R user asks:

 
> How do I use acs to download all the census tracts? is there
> some handy command to do that?

Here’s some help:

All the tracts in a single county

You can’t download all the tracts for the whole country (or even for an entire state) with a single wildcard, although the later sections show some ways around this. If you just need all the tracts in a single county, it’s really simple: use the “*” wildcard for the tract number when creating your geo.set.

The example below creates a geo.set for all the tracts in Middlesex County, Massachusetts, and then downloads data from ACS table B01003 on Total Population for them.

> my.tracts=geo.make(state="MA", county="Middlesex", tract="*") 
> acs.fetch(geography=my.tracts, table.number="B01003")

All the tracts in a state

If you happen to have a vector of the names (or FIPS codes) of all the counties in a given state (or just the ones you want), you could do something like this to get all the tracts in each of them:

> all.tracts=geo.make(state="MA", county=list.of.counties, 
  tract="*")
> acs.fetch(geography=all.tracts, table.number="B01003")

As an added bonus, if you don’t happen to have a list of counties, but want to use the package to get one, you could do something like this:

> mass=acs.fetch(geography=geo.make(state=25, county="*"), 
  table.number="B01003")

#  mass is now a new acs object with data for each county in
#  Massachusetts.  The "geography" function returns a dataframe of the
#  geographic metadata, which includes FIPS codes as the third
#  column.  So you can use it like this:

> all.tracts=geo.make(state="MA", 
  county=as.numeric(geography(mass)[[3]]), 
  tract="*", check=T)
> acs.fetch(geography=all.tracts, table.number="B01003")

All the tracts in the entire country

In theory, you could even use this to get all the tracts from all the 3,225 counties in the country:

> all.counties=acs.fetch(geography=geo.make(state="*", county="*"),
  table.number="B01003")
> all.tracts=geo.make(state=as.numeric(geography(all.counties)[[2]]),
  county=as.numeric(geography(all.counties)[[3]]), tract="*", check=T)

Unfortunately (or perhaps fortunately), this is just too much for R to download without changing some of the internal variables that limit this sort of thing — if you try, R will complain with “Error: evaluation nested too deeply: infinite recursion…” To prove to yourself that it works, you could limit the number of counties to just the first 250, and try that — it will get you from Autauga County, Alabama to Bent County, Colorado.

> some.counties=all.counties[1:250]
> some.tracts=geo.make(state=as.numeric(geography(some.counties)[[2]]), 
  county=as.numeric(geography(some.counties)[[3]]), tract="*", check=T)
> lots.of.data=acs.fetch(geography=some.tracts, table.number="B01003")

This is really a lot of data — on my machine, this took about 18 seconds, resulting in a new acs object containing population data on 11,872 different tracts. I haven’t checked to see what the upper limits are, but I imagine it wouldn’t take much to figure out a way to get tract-level data from all 3,225 counties. (But remember: with great power comes great responsibility — don’t be too rough on downloading stuff from the Census, even if it is free and easy.)
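One way to work up to all 3,225 counties is to repeat the 250-county request in batches. The following is only a rough sketch, not something built into acs.R: make.batches and fetch.tracts.in.batches are hypothetical helper names, the batch size of 250 is simply the size that worked above, and the five-second pause is an arbitrary courtesy to the Census servers. The result is a plain list with one acs object per batch.

```r
# Split the row indices 1..n into consecutive batches of at most batch.size.
make.batches <- function(n, batch.size = 250) {
  idx <- seq_len(n)
  split(idx, ceiling(idx / batch.size))
}

# Hypothetical helper (not part of acs.R): fetch tract-level data one batch
# of counties at a time, pausing before each request.
# Assumes, as in the examples above, that columns 2 and 3 of
# geography(counties) hold the state and county FIPS codes.
fetch.tracts.in.batches <- function(counties, batch.size = 250, pause = 5) {
  geo <- geography(counties)
  lapply(make.batches(nrow(geo), batch.size), function(rows) {
    tracts <- geo.make(state = as.numeric(geo[rows, 2]),
                       county = as.numeric(geo[rows, 3]),
                       tract = "*")
    Sys.sleep(pause)
    acs.fetch(geography = tracts, table.number = "B01003")
  })
}

# The batching itself needs no network access:
batches <- make.batches(3225)
length(batches)        # 13 batches
range(batches[[13]])   # rows 3001 to 3225

# With the acs package loaded and all.counties from above:
# batched.data <- fetch.tracts.in.batches(all.counties)
```

Keeping each batch as its own list element means a failed request only costs you one batch rather than the whole download.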

Using the built-in FIPS data

An alternative approach to the state-wide and nationwide examples above would be to use the FIPS datasets that we’ve built into the acs.R package. For example, the “fips.county” dataset includes the names of each county, by state. Feed this (or part of this) to your geo.make command and you can do all sorts of neat things.

> head(fips.county)
  State State.ANSI County.ANSI    County.Name ANSI.Cl
1    AL          1           1 Autauga County      H1
2    AL          1           3 Baldwin County      H1
3    AL          1           5 Barbour County      H1
4    AL          1           7    Bibb County      H1
5    AL          1           9  Blount County      H1
6    AL          1          11 Bullock County      H1
> 

So instead of the 250-county block above, you could do something like this:

> random.counties=sample(x=3225,size=20, replace=F)
> some.tracts=geo.make(state=fips.county[random.counties,1], 
  county=fips.county[random.counties,3], tract="*", check=T)
Testing geography item 1: Tract *, Ponce Municipio, Puerto Rico .... OK.
Testing geography item 2: Tract *, Alleghany County, North Carolina .... OK.
Testing geography item 3: Tract *, Wayne County, Pennsylvania .... OK.
Testing geography item 4: Tract *, Comerio Municipio, Puerto Rico .... OK.
Testing geography item 5: Tract *, Lafayette County, Wisconsin .... OK.
Testing geography item 6: Tract *, Hartford County, Connecticut .... OK.
Testing geography item 7: Tract *, Real County, Texas .... OK.
Testing geography item 8: Tract *, Costilla County, Colorado .... OK.
Testing geography item 9: Tract *, Sarpy County, Nebraska .... OK.
Testing geography item 10: Tract *, McLennan County, Texas .... OK.
Testing geography item 11: Tract *, Donley County, Texas .... OK.
Testing geography item 12: Tract *, McIntosh County, Georgia .... OK.
Testing geography item 13: Tract *, Chilton County, Alabama .... OK.
Testing geography item 14: Tract *, Richland County, Montana .... OK.
Testing geography item 15: Tract *, Mitchell County, Kansas .... OK.
Testing geography item 16: Tract *, Muscogee County, Georgia .... OK.
Testing geography item 17: Tract *, Martin County, Indiana .... OK.
Testing geography item 18: Tract *, Naguabo Municipio, Puerto Rico .... OK.
Testing geography item 19: Tract *, Aguas Buenas Municipio, Puerto Rico .... OK.
Testing geography item 20: Tract *, Washington County, Arkansas .... OK.

> # you may get different counties in your random set
>
> acs.fetch(geography=some.tracts, table.number="B01003")

This will return population data for all the tracts in a random set of 20 counties.
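The same dataset can also stand in for a hand-made list of counties when you want every tract in one state. Here is a minimal sketch of that idea. counties.in.state is a hypothetical helper, not an acs.R function, and fips.sample is a stand-in copied from the six rows of head(fips.county) printed earlier so that the filtering step runs without the package installed.

```r
# Stand-in for the package's fips.county dataset, copied from the six
# rows of head(fips.county) shown earlier in this post.
fips.sample <- data.frame(
  State       = c("AL", "AL", "AL", "AL", "AL", "AL"),
  State.ANSI  = c(1, 1, 1, 1, 1, 1),
  County.ANSI = c(1, 3, 5, 7, 9, 11),
  County.Name = c("Autauga County", "Baldwin County", "Barbour County",
                  "Bibb County", "Blount County", "Bullock County"),
  ANSI.Cl     = "H1",
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the county codes for one state abbreviation.
counties.in.state <- function(fips, state) {
  fips$County.ANSI[fips$State == state]
}

al.counties <- counties.in.state(fips.sample, "AL")
al.counties

# With the acs package loaded, the same idea on the real dataset:
# ma.tracts <- geo.make(state = "MA",
#                       county = counties.in.state(fips.county, "MA"),
#                       tract = "*")
# acs.fetch(geography = ma.tracts, table.number = "B01003")
```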
