A very nice user wrote the following in an email to me about the latest version of the acs.R package:
> Thanks for providing such a wonderful package in R. I'm having difficulty defining a geo at the block group level. Would you mind sharing an example with me?
I responded via email, but thought that my answer, which took the form of a short worked example, might be helpful to others, so I am posting it here as well. Here’s what I said:
To showcase how the package can create new census geographies from existing units such as block groups, let’s look at Middlesex County in my home state of Massachusetts. If I wanted to get info on all the block groups for tract 387201 [1], I could create a new geo like this:
    > my.tract=geo.make(state="MA", county="Middlesex", tract=387201, block.group="*", check=T)
    Testing geography item 1: Tract 387201, Blockgroup *, Middlesex County, Massachusetts .... OK.
(This might be a useful first step, especially if I didn’t know how many block groups there were in the tract, or what they were called. Also, note that check=T is not required, but can often help ensure you are dealing with valid geos.)
If I then wanted to get very basic info on these block groups (say, table number B01003, Total Population), I could type:
    > total.pop=acs.fetch(geo=my.tract, table.number="B01003")
    > total.pop
    ACS DATA:
     2007 -- 2011 ;
      Estimates w/90% confidence intervals;
      for different intervals, see confint()
                  B01003_001
    Block Group 1 2681 +/- 319
    Block Group 2 952 +/- 213
    Block Group 3 1010 +/- 156
    Block Group 4 938 +/- 214
Here we can see that block.group="*" has yielded all four block groups in the tract.
Now, if instead of wanting all of them, we only wanted the first two, we could just type:
    > my.bgs=geo.make(state="MA", county="Middlesex", tract=387201, block.group=1:2, check=T)
    Testing geography item 1: Tract 387201, Blockgroup 1, Middlesex County, Massachusetts .... OK.
    Testing geography item 2: Tract 387201, Blockgroup 2, Middlesex County, Massachusetts .... OK.
And then:
    > bg.total.pop=acs.fetch(geo=my.bgs, table.number="B01003")
    > bg.total.pop
    ACS DATA:
     2007 -- 2011 ;
      Estimates w/90% confidence intervals;
      for different intervals, see confint()
                  B01003_001
    Block Group 1 2681 +/- 319
    Block Group 2 952 +/- 213

Now, if we wanted to add in some block groups from tract 387100 (a.k.a. “tract 3871”, but remember: we need those trailing zeroes), say block groups 2 and 3, we could enter:

    > my.bgs=my.bgs+geo.make(state="MA", county="Middlesex", tract=387100, block.group=2:3, check=T)
    Testing geography item 1: Tract 387100, Blockgroup 2, Middlesex County, Massachusetts .... OK.
    Testing geography item 2: Tract 387100, Blockgroup 3, Middlesex County, Massachusetts .... OK.
And then:
    > new.total.pop=acs.fetch(geo=my.bgs, table.number="B01003")
    > new.total.pop
    ACS DATA:
     2007 -- 2011 ;
      Estimates w/90% confidence intervals;
      for different intervals, see confint()
                  B01003_001
    Block Group 1 2681 +/- 319
    Block Group 2 952 +/- 213
    Block Group 2 827 +/- 171
    Block Group 3 1821 +/- 236

Note that the short rownames can be confusing, as in this example, but if you type:

    > geography(new.total.pop)
               NAME state county  tract blockgroup
    1 Block Group 1    25     17 387201          1
    2 Block Group 2    25     17 387201          2
    3 Block Group 2    25     17 387100          2
    4 Block Group 3    25     17 387100          3

you can see that the two entries for “Block Group 2” are actually in different tracts. (Also note: you can combine block groups and other levels of geography, all in a single geo object…)
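If you want labels that stay unambiguous in a report, one option is to build them from the tract and block group columns. The sketch below uses only base R on a data frame typed in by hand from the geography() output above; it does not call the acs package, and the names geo.table and unique.label are made up for this illustration.

```r
# Hand-copied from the geography() output shown above
# (assumption: in a real session this would come from geography(new.total.pop)).
geo.table <- data.frame(
  NAME       = c("Block Group 1", "Block Group 2", "Block Group 2", "Block Group 3"),
  state      = 25,
  county     = 17,
  tract      = c(387201, 387201, 387100, 387100),
  blockgroup = c(1, 2, 2, 3),
  stringsAsFactors = FALSE
)

# Build labels that include the tract, so the two "Block Group 2" rows differ.
geo.table$unique.label <- paste0("Tract ", geo.table$tract, ", ", geo.table$NAME)

print(geo.table$unique.label)
```

The longer labels are only for display; the tract and blockgroup columns remain the reliable way to tell rows apart.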
And now, to show off the coolest part! Let’s say I don’t just want to get data on the four block groups, but want to combine them into a single new geographic entity. Before downloading, I could simply say:
    > combine(my.bgs)=T
    > combine.term(my.bgs)="Select Blockgroups"
    > new.total.pop=acs.fetch(geo=my.bgs, table.number="B01003")
    > new.total.pop
    ACS DATA:
     2007 -- 2011 ;
      Estimates w/90% confidence intervals;
      for different intervals, see confint()
                       B01003_001
    Select Blockgroups 6281 +/- 481.733328720362
And see, voilà! It sums the estimates and deals with the margins of error, so you don’t need to get your hands dirty with square roots and standard errors and all that messy stuff.
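For the curious, the arithmetic can be checked by hand. The sketch below uses base R and the four block group figures from the earlier output. It assumes the simple root-sum-of-squares rule for the margin of error of a sum, which reproduces the number printed above; the variable names are just for illustration.

```r
# Estimates and 90% margins of error copied from the four-block-group output above.
estimates <- c(2681, 952, 827, 1821)
moes      <- c(319, 213, 171, 236)

# Sum of the estimates.
combined.estimate <- sum(estimates)

# Margin of error for the sum: square root of the sum of squared margins
# (assumed rule; it matches the figure shown in the combined output above).
combined.moe <- sqrt(sum(moes^2))

cat(combined.estimate, "+/-", combined.moe, "\n")
# prints: 6281 +/- 481.7333
```

Note that the combined margin (about 482) is much smaller than the plain sum of the four margins (939), which is why adding margins directly would overstate the uncertainty.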
You can even create interesting nested geo.sets, where some of the lower levels are combined, like this:
    > combine.term(my.bgs)="Select Blockgroups, Tracts 387100 and 387201"
    > more.bgs=c(my.bgs,
    +   geo.make(state="MA", county="Middlesex", tract=370300, block.group=1:2, check=T),
    +   geo.make(state="MA", county="Middlesex", tract=370400, block.group=1:3, combine=T,
    +            combine.term="Select Blockgroups, Tract 3704", check=T))
    Testing geography item 1: Tract 370300, Blockgroup 1, Middlesex County, Massachusetts .... OK.
    Testing geography item 2: Tract 370300, Blockgroup 2, Middlesex County, Massachusetts .... OK.
    Testing geography item 1: Tract 370400, Blockgroup 1, Middlesex County, Massachusetts .... OK.
    Testing geography item 2: Tract 370400, Blockgroup 2, Middlesex County, Massachusetts .... OK.
    Testing geography item 3: Tract 370400, Blockgroup 3, Middlesex County, Massachusetts .... OK.
    > more.total.pop=acs.fetch(geo=more.bgs, table.number="B01003")
    > more.total.pop
    ACS DATA:
     2007 -- 2011 ;
      Estimates w/90% confidence intervals;
      for different intervals, see confint()
                                                 B01003_001
    Select Blockgroups, Tracts 387100 and 387201 6281 +/- 481.733328720362
    Block Group 1                                315 +/- 132
    Block Group 2                                1460 +/- 358
    Select Blockgroups, Tract 3704               2594 +/- 487.719181496894
In closing: I hope this helps, and be sure to contact me if you have other questions or problems using the package.
Footnotes:
[1] Note that tracts are often referred to in a strange “four-digit plus decimal extension” shorthand, so “tract 387201” may also be known as “tract 3872.01”. When working with this package, always use six-digit tract numbers without the decimal point. If the tract number seems to be only four digits long, add two “trailing” zeroes at the end.
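As a rough illustration of this rule, here is a small base R sketch that turns the shorthand into the six-digit form. The function tract.to.six.digit is a hypothetical helper written for this post, not part of the acs package, and it assumes the shorthand is passed as text (so that an extension like ".10" keeps its trailing zero).

```r
# Hypothetical helper (not part of the acs package): convert the
# "four-digit plus decimal extension" shorthand into a six-digit tract number.
tract.to.six.digit <- function(x) {
  # Split each entry on the literal decimal point.
  parts <- strsplit(as.character(x), ".", fixed = TRUE)
  vapply(parts, function(p) {
    # Assumption: an entry with no decimal point and more than four digits
    # is already in the long form, so it is returned unchanged.
    if (length(p) == 1 && nchar(p[1]) > 4) return(as.numeric(p[1]))
    extension <- if (length(p) > 1) p[2] else ""
    # Pad the extension on the right to exactly two digits.
    extension <- substr(paste0(extension, "00"), 1, 2)
    as.numeric(paste0(p[1], extension))
  }, numeric(1))
}

tract.to.six.digit(c("3872.01", "3871"))
# returns 387201 387100
```

The result can then be passed as the tract argument to geo.make(), as in the examples above.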