Posted by Ezra Glenn
on March 29, 2012
Census, Code
As noted elsewhere here on CityState, I’ve developed a package for working with data from the American Community Survey in the R statistical computing language. The most recent official version of the package is 0.8, which can be found on CRAN. Since the package is still in active development, I’ve decided to provide development snapshots here, for users who are looking to work with the latest code as I develop it.
I’m hoping that the next major release will be version 1.0, due out sometime this spring. As I work towards that, here is version 0.8.1, which can be considered the first “snapshot” headed toward this release.
acs_0.8.1.tar.gz
To install, simply download, start R, and type:
> install.packages("path/to/file/acs_0.8.1.tar.gz", repos=NULL, type="source")
> library(acs)
Updates include:
read.acs can now accept either a csv or a zip file downloaded directly from the FactFinder site, and it does a much better job (a) guessing how many rows to skip, (b) figuring out how to generate intelligent variable names for the columns, and (c) dealing with arcane non-numeric symbols used by FactFinder for some estimates and margins of error.
plot now includes a true.min= option, which allows you to specify whether you want to allow error bars to span into negative values (true.min=T, the default), or to bound them at zero (true.min=F) or at some other numeric value. This seemed necessary because it looks silly to say “The number of children who speak Spanish in this tract is 15, plus or minus 80…” At the same time, if the variable turns out to be something like the difference between the income of males and the income of females in the geography, a negative value may make a lot of sense, and should be plotted as such.
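The effect of true.min can be sketched in a few lines of base R. This is not the package's plotting code: the helper name error_bar_bounds and its arguments are hypothetical, and the sketch only illustrates how a lower bound might be applied to an estimate and its margin of error before drawing the bars.

```r
# Hypothetical helper (not part of the acs package): compute the lower and
# upper ends of error bars, optionally bounding the lower end.
#   true.min = TRUE    -> let bars extend wherever estimate - moe falls
#   true.min = FALSE   -> bound the lower end at zero
#   true.min = numeric -> bound the lower end at that value
error_bar_bounds <- function(estimate, moe, true.min = TRUE) {
  lower <- estimate - moe
  upper <- estimate + moe
  if (is.numeric(true.min)) {
    lower <- pmax(lower, true.min)
  } else if (!isTRUE(true.min)) {
    lower <- pmax(lower, 0)
  }
  data.frame(estimate = estimate, lower = lower, upper = upper)
}

# The example from the text: 15 children, plus or minus 80.
error_bar_bounds(15, 80)                    # lower end is -65
error_bar_bounds(15, 80, true.min = FALSE)  # lower end is bounded at 0
```

For a count that cannot be negative, the second call is the sensible one; for a difference between two estimates, the first is.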
Tags: acs, census, data, R, software
Posted by Ezra Glenn
on March 18, 2012
Census, Code
I’ve just released a new version of my acs package for working with the U.S. Census American Community Survey data in R, available on CRAN. The current version 0.8 includes all the original version 0.6 code, plus a whole lot more features and fixes. Some highlights:
- An improved read.acs function for importing data downloaded from the Census American FactFinder site.
- rbind and cbind functions to help create larger acs objects from smaller ones.
- A new sum method to aggregate rows or columns of ACS data, dealing correctly with both estimates and standard errors.
- A new apply method to allow users to apply virtually any function to each row or column of an acs data object.
- A snazzy new plot method, capable of plotting both density plots (for estimates of a single geography and variable) and multiple estimates with error bars (for estimates of the same variable over multiple geographies, or vice versa). See sample plots below.


- New functions to deal with adjusting the nominal values of currency from different years for the purpose of comparing between one survey and another. (See currency.convert and currency.year in the documentation.)
- A new tract-level dataset from the ACS for Lawrence, MA, with dollar value currency estimates (useful to show off the aforementioned new currency conversion functions).
- A new prompt method to serve as a helper function when changing geographic rownames or variable column names.
- Improved documentation on the acs class and all of these various new functions and methods, with examples.
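To give a sense of what "dealing correctly with both estimates and standard errors" involves when aggregating, here is a minimal base R sketch. It is not the package's sum method: the helper name aggregate_estimates is hypothetical, the tract numbers are made up, and it uses the common approximation that treats the estimates as independent (so the standard error of a sum is the square root of the sum of squared standard errors).

```r
# Hypothetical helper (not the acs package's own sum method).
# Assumes the estimates are independent, so covariance terms are ignored.
aggregate_estimates <- function(estimate, se) {
  c(estimate = sum(estimate), se = sqrt(sum(se^2)))
}

# Made-up numbers for three tracts, for illustration only.
tract.estimates <- c(120, 80, 200)
tract.se        <- c(30, 40, 120)

aggregate_estimates(tract.estimates, tract.se)  # estimate 400, se 130
```

Note that the combined standard error (130) is well below the simple sum of the three (190), which is why adding margins of error directly overstates the uncertainty.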
With this package, once you’ve found and downloaded your data from FactFinder, you can read it into R with a single command, aggregate multiple tracts into a neighborhood with another, generate a table of estimates and confidence intervals for your neighborhood with a third command, and produce a print-ready plot of your data (complete with error bars for the margins of error) with a fourth:
my.data=read.acs("some_data.csv")
my.neighborhood=apply(my.data, FUN="sum", MARGIN=1, agg.term="My.Neighborhood")
confint(my.neighborhood, conf.level=.95)
plot(my.neighborhood, col="blue", err.col="violet", pch=16)
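For readers curious about the arithmetic behind the confint step, the following base R sketch shows one way to get a confidence interval from an estimate and a published margin of error. The helper names moe_to_se and normal_confint are hypothetical and the input numbers are made up; the sketch assumes a normal approximation and the 90 percent confidence level at which ACS margins of error are published, and is not taken from the package's source.

```r
# Hypothetical helpers, not the acs package's implementation.
# ACS margins of error are published at the 90 percent confidence level,
# so dividing by 1.645 recovers an approximate standard error.
moe_to_se <- function(moe) moe / 1.645

# Normal-approximation confidence interval at an arbitrary level.
normal_confint <- function(estimate, se, conf.level = 0.95) {
  z <- qnorm(1 - (1 - conf.level) / 2)
  c(lower = estimate - z * se, upper = estimate + z * se)
}

se <- moe_to_se(329)  # made-up margin of error; standard error of about 200
normal_confint(1000, se, conf.level = 0.95)  # roughly 608 to 1392
```

Raising conf.level widens the interval, so a 95 percent interval will always be wider than the 90 percent margin of error printed in the FactFinder tables.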
Already this package has come a long way, in large part thanks to the input of R users, so please check it out and let me know what you think — and how I can make it better.
Tags: acs, census, data, R, software
Posted by Ezra Glenn
on February 11, 2012
Census, Good Causes
On March 14, 2012, I’ll be working again with the Mel King Institute for Community Building to offer a half-day training in “Making Use of Local Census Data.” We designed the class for planners and community development practitioners working at the neighborhood scale, and we’ll talk about ways to access the latest data from the U.S. Census American Community Survey (and how to use it responsibly).
Unlike earlier versions of the training, we’ll be working exclusively with the New American FactFinder (previously discussed in this post) to download data. We’ve also moved the class to one of MIT’s computer labs, and added an hour at the end as a “clinic,” so participants will get some hands-on time to dig up data on their own community.
For more information about the Mel King Institute, or to register for the training, see this page. See you there!
Tags: CDCs, census, factfinder, MKI
Posted by Ezra Glenn
on November 09, 2011
Census, Data, News/Commentary
Update: a few weeks ago I posted this article calling attention to yet more delays in the rollout of the long-awaited Supplemental Poverty Measure from the Census Bureau. As it turns out, they have recently announced that this new index is in fact ready for prime time (see, for example, this press release).
More analysis and thoughts later after I am able to take a look, but I wanted to file something quick just to acknowledge the effort to get something out there.
Tags: census, data, poverty
Posted by Ezra Glenn
on October 17, 2011
Census, News/Commentary
Last year, I was excited to learn that the Census Bureau was beginning work to establish a new, supplemental poverty measure, to address long-standing problems with the official statistic. The Bureau was quick to assure us that the official poverty measure would continue to be used to establish eligibility for government programs, and “will remain the definitive statistical measure,” but based on their elegant description of the new measure as “a more complex and refined statistic,” both the data fiends and the poverty scholars started to get excited.
I was reminded of a great story by Barry Bluestone, Director of the Dukakis Center at Northeastern University: as he tells it, while trying to explain different ways to measure unemployment to a reporter, he became frustrated at the press’s unwillingness to delve into the complexity of these numbers. When the reporter explained, “I can’t print three different numbers for unemployment; people won’t follow that,” Bluestone retorted, “Have you ever read your paper?” and went on to point out how almost every section had multiple measures: weather (temperature, wind-chill, humidity index), business (high, low, 52-week average), sports (batting average, slugging percentage, RBIs, OBP, and so on).
Unfortunately, it appears that the supplemental poverty measure is the latest good idea to fall victim to budget cuts: in a recent update (which received significantly less press than the original release), the Bureau reports:
Since the FY 2011 federal budget did not include the funding requested by the President for the Supplemental Poverty Measure (SPM) initiative, the Census Bureau and the Bureau of Labor Statistics do not currently have the resources necessary to move the Supplemental Poverty Measure from research mode to production mode. Without these additional resources, the September 2011 release date for the Supplemental Poverty Measure estimates suggested in the Interagency Technical Working Group document is not feasible.
The update goes on to note the useful groundwork that was undertaken over the past 18 months on the topic, including a few conferences and some very useful reports (see the Census Bureau page collecting Working Papers and Conference Presentations), and promises some modified approach to yield at least partial results in the near future, but overall you gotta figure it’s pretty bad when we can’t even afford to measure how poor we’ve become.
Tags: census, measurement, poverty
Posted by Ezra Glenn
on October 07, 2011
Census, Self-promotion
The Census Bureau continues to roll out the latest data from the American Community Survey, last month announcing the availability of the first real nationwide data for 2010 in the form of the 1-year ACS Estimates. I’ve been conducting some trainings on how to get and use ACS data for local-level community planning (self promotion: check out the Mel King Institute’s training in Boston, or wait for us to offer it again), which has prompted me to pay some more attention to the new “American FactFinder” platform, which has prompted me to write this post.
Tags: census, data, factfinder, planning