
acs.R: a worked example using blockgroup-level data

Posted by Ezra Glenn on March 11, 2013

A very nice user wrote the following in an email to me about the latest version of the acs.R package:

 
> Thanks for providing such a wonderful package in R. I'm having
> difficulty defining a geo at the block group level. Would you mind
> sharing an example with me?

I responded via email, but thought that my answer (which took the form of a short worked example) might be helpful to others, so I am posting it here as well. Here’s what I said:

To showcase how the package can create new census geographies from smaller units like block groups, let’s look at my home state of Massachusetts, in Middlesex County. If I wanted to get info on all the block groups for tract 387201 (see footnote 1), I could create a new geo like this:

> my.tract=geo.make(state="MA", county="Middlesex", 
  tract=387201, block.group="*", check=T)
Testing geography item 1: Tract 387201, Blockgroup *, 
  Middlesex County, Massachusetts .... OK.
> 

(This might be a useful first step, especially if I didn’t know how many block groups there were in the tract, or what they were called. Also, note that check=T is not required, but can often help ensure you are dealing with valid geos.)

If I then wanted to get very basic info on these block groups – say, table number B01003 (Total Population), I could type:

> total.pop=acs.fetch(geo=my.tract, table.number="B01003")
> total.pop
ACS DATA: 
 2007 -- 2011 ;
  Estimates w/90% confidence intervals;
  for different intervals, see confint()
              B01003_001  
Block Group 1 2681 +/- 319
Block Group 2 952 +/- 213 
Block Group 3 1010 +/- 156
Block Group 4 938 +/- 214 
> 

Here we can see that block.group="*" has yielded the four actual block groups in the tract.
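
The header in the output above notes that the margins are 90% confidence intervals. As a rough sketch of what that means numerically, here is some plain R arithmetic (not a call into the package). The helper name moe.to.level is made up for illustration, and it assumes the usual normal approximation, in which a 90% margin is about 1.645 standard errors:

```r
# Hypothetical helper (not part of acs.R): rescale a 90% margin of error
# to another confidence level, assuming a normal approximation.
moe.to.level <- function(moe90, level = 0.95) {
  se <- moe90 / qnorm(0.95)          # 90% two-sided interval -> z of about 1.645
  se * qnorm(1 - (1 - level) / 2)    # z for the requested two-sided level
}

# Block Group 1 above: 2681 +/- 319 at 90%
moe95 <- moe.to.level(319, level = 0.95)
c(lower = 2681 - moe95, upper = 2681 + moe95)
```

For real work, the confint() function mentioned in the header is the better route; this is only meant to show where the numbers come from.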

Now, if instead of wanting all of them, we only wanted the first two, we could just type:

> my.bgs=geo.make(state="MA", county="Middlesex", 
  tract=387201, block.group=1:2, check=T)
Testing geography item 1: Tract 387201, Blockgroup 1, 
  Middlesex County, Massachusetts .... OK.
Testing geography item 2: Tract 387201, Blockgroup 2, 
  Middlesex County, Massachusetts .... OK.
> 

And then:

> bg.total.pop=acs.fetch(geo=my.bgs, table.number="B01003")
> bg.total.pop
ACS DATA: 
 2007 -- 2011 ;
  Estimates w/90% confidence intervals;
  for different intervals, see confint()
              B01003_001  
Block Group 1 2681 +/- 319
Block Group 2 952 +/- 213 
> 

Now, if we wanted to add in some block groups from tract 387100 (a.k.a. “tract 3871”, but remember: we need those trailing zeroes), say block groups 2 and 3, we could enter:

> my.bgs=my.bgs+geo.make(state="MA", county="Middlesex", 
  tract=387100, block.group=2:3, check=T)
Testing geography item 1: Tract 387100, Blockgroup 2, 
  Middlesex County, Massachusetts .... OK.
Testing geography item 2: Tract 387100, Blockgroup 3, 
  Middlesex County, Massachusetts .... OK.

And then:

> new.total.pop=acs.fetch(geo=my.bgs, table.number="B01003")
> new.total.pop
ACS DATA: 
 2007 -- 2011 ;
  Estimates w/90% confidence intervals;
  for different intervals, see confint()
              B01003_001  
Block Group 1 2681 +/- 319
Block Group 2 952 +/- 213 
Block Group 2 827 +/- 171 
Block Group 3 1821 +/- 236
> 

Note that the short rownames can be confusing, as in this example, but if you type:

> geography(new.total.pop)
           NAME state county  tract blockgroup
1 Block Group 1    25     17 387201          1
2 Block Group 2    25     17 387201          2
3 Block Group 2    25     17 387100          2
4 Block Group 3    25     17 387100          3
> 

you can see that the two entries for “Block Group 2” are actually in different tracts. (Also note: you can combine block groups and other levels of geography, all in a single geo object.)
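
If the duplicate names bother you, one possible workaround is to build longer labels from the geography columns. The sketch below is plain R working on a hand-typed copy of the table above (so it runs without the package or a download); the variable names geo and long.names are just for illustration:

```r
# Hand-typed copy of the geography() table shown above (illustration only).
geo <- data.frame(
  NAME = c("Block Group 1", "Block Group 2", "Block Group 2", "Block Group 3"),
  state = 25, county = 17,
  tract = c(387201, 387201, 387100, 387100),
  blockgroup = c(1, 2, 2, 3),
  stringsAsFactors = FALSE
)

# Build labels that include the tract, so that no two rows share a name.
long.names <- paste0("Tract ", geo$tract, ", ", geo$NAME)
long.names
anyDuplicated(long.names)   # 0 means every label is unique
```

With a real fetched object, the same idea should work on the data frame returned by geography(), though I have only sketched it here on the copied table.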

And now, to show off the coolest part! Let’s say I don’t just want to get data on the four blockgroups, but I want to combine them into a single new geographic entity. Before downloading, I could simply say:

> combine(my.bgs)=T
> combine.term(my.bgs)="Select Blockgroups"
> new.total.pop=acs.fetch(geo=my.bgs, table.number="B01003")
> new.total.pop
ACS DATA: 
 2007 -- 2011 ;
  Estimates w/90% confidence intervals;
  for different intervals, see confint()
                   B01003_001               
Select Blockgroups 6281 +/- 481.733328720362
>

And see – voila! – it sums the estimates and deals with the margins of error, so you don’t need to get your hands dirty with square roots and standard errors and all that messy stuff.
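
For the curious, the combined figures above can be reproduced by hand. The sketch below is plain R, not the package's own code, and it assumes the standard approach for sums of independent estimates: add the estimates, and take the square root of the sum of the squared margins of error. The variable names are mine:

```r
# Estimates and 90% margins of error for the four block groups, copied
# from the uncombined output earlier in this post.
est <- c(2681, 952, 827, 1821)
moe <- c(319, 213, 171, 236)

combined.est <- sum(est)            # estimates simply add
combined.moe <- sqrt(sum(moe^2))    # margins combine as root-sum-of-squares

combined.est
combined.moe
```

These agree with the 6281 +/- 481.73 reported for “Select Blockgroups” above, which suggests this is the arithmetic being done behind the scenes.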

You can even create interesting nested geo.sets, where some of the lower levels are combined, like this:

> combine.term(my.bgs)="Select Blockgroups, Tracts 387100 and 387201"
> more.bgs=c(my.bgs, geo.make(state="MA", 
  county="Middlesex", tract=370300, block.group=1:2, check=T), 
  geo.make(state="MA", county="Middlesex", tract=370400, 
  block.group=1:3, combine=T, 
  combine.term="Select Blockgroups, Tract 3704", check=T)) 
Testing geography item 1: Tract 370300, Blockgroup 1, 
  Middlesex County, Massachusetts .... OK.
Testing geography item 2: Tract 370300, Blockgroup 2, 
  Middlesex County, Massachusetts .... OK.
Testing geography item 1: Tract 370400, Blockgroup 1, 
  Middlesex County, Massachusetts .... OK.
Testing geography item 2: Tract 370400, Blockgroup 2, 
  Middlesex County, Massachusetts .... OK.
Testing geography item 3: Tract 370400, Blockgroup 3, 
  Middlesex County, Massachusetts .... OK.
> more.total.pop=acs.fetch(geo=more.bgs, table.number="B01003")
> more.total.pop
ACS DATA: 
 2007 -- 2011 ;
  Estimates w/90% confidence intervals;
  for different intervals, see confint()
                                             B01003_001               
Select Blockgroups, Tracts 387100 and 387201 6281 +/- 481.733328720362
Block Group 1                                315 +/- 132              
Block Group 2                                1460 +/- 358             
Select Blockgroups, Tract 3704               2594 +/- 487.719181496894
> 

In closing: I hope this helps, and be sure to contact me if you have other questions/problems about using the package.

Footnotes:

1 Note that tracts are often referred to in a strange “four-digit+decimal extension” shorthand, so “tract 387201” may also be known as “tract 3872.01”. When working with this package, always use six-digit tract numbers without the decimal point. If the tract number seems to be only four digits long, add two “trailing” zeroes at the end.
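
If you have a list of tracts in the shorthand form, a small helper can do the conversion for you. This is a minimal sketch in plain R; the function name tract.six.digit is hypothetical and not part of the package:

```r
# Hypothetical helper (not part of acs.R): turn tract shorthand such as
# "3872.01" or "3871" into the six-digit form, as an integer.
tract.six.digit <- function(x) {
  # Multiply by 100 to shift the two-digit extension left of the decimal
  # point; round() guards against floating-point noise.
  as.integer(round(as.numeric(x) * 100))
}

tract.six.digit(c("3872.01", "3871"))
```

The result is an integer, which matches how the tract numbers are passed to geo.make() in the examples above. Note that tracts numbered below 1000 will come out with fewer than six digits, since integers carry no leading zeroes.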
