Monthly Archives: March 2012

Constantly Improving: acs development versions

Posted by Ezra Glenn on March 29, 2012
Census, Code / No Comments

As noted elsewhere here on CityState, I’ve developed a package for working with data from the American Community Survey in the R statistical computing language. The most recent official version of the package is 0.8, which can be found on CRAN. Since the package is still in active development, I’ve decided to provide development snapshots here, for users who are looking to work with the latest code as I develop it.

I’m hoping that the next major release will be version 1.0, due out sometime this spring. As I work towards that, here is version 0.8.1, which can be considered the first “snapshot” headed toward this release.

acs_0.8.1.tar.gz

To install, simply download, start R, and type:

 

> install.packages("path/to/file/acs_0.8.1.tar.gz", repos=NULL, type="source")
> library(acs)

Updates include:

  • read.acs can now accept either a csv or a zip file downloaded directly from the FactFinder site, and it does a much better job (a) guessing how many rows to skip, (b) figuring out how to generate intelligent variable names for the columns, and (c) dealing with arcane non-numeric symbols used by FactFinder for some estimates and margins of error.
  • plot now includes a true.min= option, which allows you to specify whether you want to allow error bars to span into negative values (true.min=T, the default), or to bound them at zero (true.min=F, or some other numeric value). This seemed necessary because it looks silly to say “The number of children who speak Spanish in this tract is 15, plus or minus 80…” At the same time, if the variable turns out to be something like the difference between the income of males and the income of females in the geography, a negative value may make a lot of sense, and should be plotted as such.
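To make the idea behind true.min= concrete, here is a minimal base-R sketch of bounding an error bar at a floor value. It is not the package's actual plotting code: the helper error_bar_limits and its floor argument are hypothetical names invented for this illustration, and the numbers simply echo the "15, plus or minus 80" example above.

```r
# Illustrative numbers only: an estimate of 15 with a margin of error of 80.
estimate <- 15
moe <- 80

# Hypothetical helper (not part of the acs package): compute error-bar limits,
# optionally bounding the lower limit at a floor value.
error_bar_limits <- function(estimate, moe, floor = NULL) {
  lower <- estimate - moe
  upper <- estimate + moe
  if (!is.null(floor)) {
    lower <- pmax(lower, floor)  # never let the bar drop below the floor
  }
  c(lower = lower, upper = upper)
}

unbounded <- error_bar_limits(estimate, moe)             # bar spans below zero
bounded   <- error_bar_limits(estimate, moe, floor = 0)  # bar stops at zero
```

For a count such as the number of children, the bounded version is probably the sensible one to plot; for a difference between two incomes, the unbounded version keeps the legitimately negative range.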


acs Package Updated: version 0.8 now on CRAN

Posted by Ezra Glenn on March 18, 2012
Census, Code / 2 Comments

I’ve just released a new version of my acs package for working with the U.S. Census American Community Survey data in R, available on CRAN. The current version 0.8 includes all the original version 0.6 code, plus a whole lot more features and fixes. Some highlights:

  • An improved read.acs function for importing data downloaded from the Census American FactFinder site.
  • rbind and cbind functions to help create larger acs objects from smaller ones.
  • A new sum method to aggregate rows or columns of ACS data, dealing correctly with both estimates and standard errors.
  • A new apply method to allow users to apply virtually any function to each row or column of an acs data object.
  • A snazzy new plot method, capable of plotting both density plots (for estimates of a single geography and variable) and multiple estimates with error bars (for estimates of the same variable over multiple geographies, or vice versa). See sample plots below.

 

  • New functions to deal with adjusting the nominal values of currency from different years for the purpose of comparing between one survey and another. (See currency.convert and currency.year in the documentation.)
  • A new tract-level dataset from the ACS for Lawrence, MA, with dollar value currency estimates (useful to show off the aforementioned new currency conversion functions).
  • A new prompt method to serve as a helper function when changing geographic rownames or variable column names.
  • Improved documentation on the acs class and all of these various new functions and methods, with examples.
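The general idea behind adjusting currency between survey years can be sketched in a few lines of base R. This is not the code behind currency.convert: the helper convert_dollars is a hypothetical name, and the price index values and dollar amounts are invented purely for illustration (a real analysis would use published index values for each year).

```r
# Hypothetical price index values, invented for illustration only.
price_index <- c("2005" = 100, "2010" = 112)

# Hypothetical helper (not the acs package's currency.convert function):
# rescale a dollar value from one year's prices to another's.
convert_dollars <- function(value, from_year, to_year, index = price_index) {
  ratio <- index[[as.character(to_year)]] / index[[as.character(from_year)]]
  value * ratio
}

income_2005    <- 40000  # made-up estimate, in 2005 dollars
income_2005_se <- 1500   # made-up standard error, in 2005 dollars

# The estimate and its standard error are both scaled by the same ratio.
income_in_2010_dollars <- convert_dollars(income_2005, 2005, 2010)
se_in_2010_dollars     <- convert_dollars(income_2005_se, 2005, 2010)
```

Because the adjustment is a simple multiplication by a constant, the standard error scales by the same ratio as the estimate, which is why both go through the same helper here.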

With this package, once you’ve found and downloaded your data from FactFinder, you can read it into R with a single command, aggregate multiple tracts into a neighborhood with another, generate a table of estimates and confidence intervals for your neighborhood with a third command, and produce a print-ready plot of your data (complete with error bars for the margins of error) with a fourth:

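# read a csv downloaded from FactFinder into an acs object
my.data=read.acs("some_data.csv")
# aggregate the tracts into a single neighborhood
my.neighborhood=apply(my.data, FUN="sum", MARGIN=1, agg.term="My.Neighborhood")
# table of estimates with 95% confidence intervals
confint(my.neighborhood, conf.level=.95)
# plot the estimates with error bars
plot(my.neighborhood, col="blue", err.col="violet", pch=16)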
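For readers curious about the arithmetic involved in aggregating tracts, here is a rough base-R sketch using made-up numbers. It assumes the common approximation in which estimates are added and standard errors are combined as the square root of the sum of squares (ignoring any covariance between tracts), and it assumes the margins of error are at the 90% level, as the ACS publishes them. It is only an illustration of the idea, not the package's internal code.

```r
# Illustrative tract-level data (made-up numbers, not real ACS estimates).
tracts <- data.frame(
  estimate = c(120, 85, 240),
  moe90    = c(40, 35, 60)   # assumed to be 90% margins of error
)

# Convert 90% margins of error to standard errors.
z90 <- qnorm(0.95)           # roughly 1.645
tracts$se <- tracts$moe90 / z90

# Aggregate: estimates add; standard errors combine as root-sum-of-squares
# (an approximation that ignores covariance between tracts).
total_estimate <- sum(tracts$estimate)
total_se <- sqrt(sum(tracts$se^2))

# 95% confidence interval for the aggregated estimate.
z95 <- qnorm(0.975)
ci95 <- c(lower = total_estimate - z95 * total_se,
          upper = total_estimate + z95 * total_se)
```

Note that the combined standard error is smaller than the simple sum of the three tract-level standard errors, which is the main reason to aggregate errors this way rather than just adding them up.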

Already this package has come a long way, in large part thanks to the input of R users, so please check it out and let me know what you think — and how I can make it better.


TEDx + inTeractive Somerville

Posted by Ezra Glenn on March 05, 2012
Good Causes / No Comments

Yesterday I was pleased to be a part of the first-ever TEDx Somerville event. I only had four minutes on stage (which, if you’ve ever heard me speak, is barely enough time to get through a few opening wisecracks), but it did provide a great platform to plug SCC’s new inTeractive Somerville website. In an attempt to showcase how the site can encourage and enable sharing (of ideas, data, meeting notes, news and personal stories, and more), I snapped a quick photo of the TEDx audience and by the end of the talk had it uploaded, geo-tagged, and posted to create a new discussion thread. (Full disclosure: there were actually some shifty behind-the-scenes driving tricks going on, thanks to Christian Spanring, who was hidden just offstage.)

The videos aren’t posted yet, but when they do go live, be sure to skip right past me and look for Somerville’s own “Alex the Jester” playing different tunes on three recorders all at the same time.

PS: If you are interested in learning more about inTeractive Somerville, and possibly adapting the platform for use in your own community, you should know that the code for the site is all open-source and available on GitHub.
