Examining Historical Growth I: Basic trends

Posted by Ezra Glenn on April 11, 2012
Data, Missions, Shape Your Neighborhood, Simulation / No Comments

The nature of predictions

To paraphrase John Allen Paulos, author of A Mathematician Reads the Newspaper, all expert predictions can be essentially restated in one of two ways: “Things will continue roughly as they have been until something changes”; and its corollary, “Things will change after an indeterminate period of stability.” Although these statements are both true and absurd, they contain a kernel of wisdom: simply assuming a relative degree of stability and painting a picture of the future based on current trends is the first step of scenario planning. The trick, of course, is to never completely forget the “other shoe” of Paulos’s statement: as the disclaimer states on all investment offerings, “Past performance is not a guarantee of future results”; at some point in the future our present trends will no longer accurately describe where we are headed. (We will deal with this as well, with a few “safety valves.”)

From the second stage of the Rational Planning Paradigm (covered in the background sections of the book) we should have gathered information on both past and present circumstances related to our planning effort. If we are looking at housing production, we might have data on annual numbers of building permits and new subdivision approvals, mortgage rates, and housing prices; if we are looking at public transportation we might need monthly ridership numbers, information on fare changes, population and employment figures, and even data on past weather patterns or changes in vehicle ownership and gas prices. The first step of projection, therefore, is to gather relevant information and get it into a form that you can use.

Since we will be thinking about changes over time in order to project a trend into the future, we’ll need to make sure that our data has time as an element: a series of data points with one observation for each point or period of time is known as a time series. The exact units of time are not important (they could be days, months, years, decades, or something different), but it is customary (and important) to obtain data where points are regularly spaced at even intervals. Essentially, time series data is a special case of multivariate data in which we treat time itself as an additional variable and look for relationships as it changes. Luckily, R has some excellent functions and packages for dealing with time-series data, which we will cover below in passing. For starters, however, let’s consider a simple example to begin thinking about what goes into projections. Continue reading…
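
To make the idea of a regularly spaced time series concrete, here is a minimal sketch in base R. The permit counts are invented purely for illustration, and the straight-line fit is just one simple way to extend a trend; it is not necessarily the example the full post goes on to develop.

```r
# Hypothetical annual building-permit counts, invented for illustration only.
permits <- ts(c(120, 132, 128, 141, 150, 158, 163, 171),
              start = 2004, frequency = 1)

# Treat time itself as a variable: pull the year index out of the series.
permit_data <- data.frame(year = as.numeric(time(permits)),
                          permits = as.numeric(permits))

# Fit a straight-line trend to the historical observations.
trend <- lm(permits ~ year, data = permit_data)

# Extend that trend three years past the end of the data.
future <- data.frame(year = 2012:2014)
projection <- predict(trend, newdata = future)
print(round(projection))
```

Because ts() stores the start date and frequency rather than a date for every observation, it only makes sense for evenly spaced data, which is one reason the regular-interval convention matters.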


acs Package at Upcoming Conference: UseR! 2012

Posted by Ezra Glenn on April 09, 2012
Census, Code, Self-promotion / No Comments

I’m happy to report that I’ll be presenting a paper on my acs package at the 8th annual useR! conference, coming June 12-15 to Vanderbilt University in Nashville, TN. The paper is titled “Estimates with Errors and Errors with Estimates: Using the R acs Package for Analysis of American Community Survey Data.” Here’s the abstract:


"Estimates with Errors and Errors with Estimates: Using the R acs
Package for Analysis of American Community Survey Data"
Ezra Haber Glenn

Over the past decade, the U.S. Census Bureau has implemented the
American Community Survey (ACS) as a replacement for its traditional
decennial ``long-form'' survey.  Last year—for the first time
ever—ACS data was made available at the census tract and block group
level for the entire nation, representing geographies small enough to
be useful to local planners; in the future these estimates will be
updated on a yearly basis, providing much more current data than was
ever available in the past.  Although the ACS represents a bold
strategy with great promise for government planners, policy-makers,
and other advocates working at the neighborhood scale, it will require
them to become comfortable with statistical techniques and concerns
that they have traditionally been able to avoid.

To help with this challenge the author has been working with
local-level planners to determine the most common problems associated
with using ACS data, and has implemented these functions as a package
in R.  The package—currently hosted on CRAN in version 0.8—defines
a new ``acs'' class object (containing estimates, standard errors, and
metadata for tables from the ACS), with methods to deal appropriately
with common tasks (e.g., combining subgroups or geographies,
mathematical operations on estimates, tests of significance, plots of
confidence intervals, etc.).

This paper will present both the use and the internal structure of the
package, with discussion of additional lines of development.

Hope to see you all there!


A Richer Neighborhood Profile, Part I: Getting tract-level data

Posted by Ezra Glenn on April 08, 2012
Census, Missions, Reconnaissance, Shape Your Neighborhood / No Comments

In a previous mission (see Finding Obama in the smallest Census geography) we delved down to see what data was available at the level of individual blocks. Unfortunately, as we noted there, the Census doesn’t provide a whole lot of useful data at the block level, since the results will exclude sample data from the SF3 “long form” (or, post-2000, the American Community Survey). If we want to know more about a neighborhood we will need to think in slightly larger geographies, and seek data at the tract level or higher.

For this mission, we’ll be zooming in to the Park Slope neighborhood of Brooklyn, and gathering data on income, race, education, and the breakdown of owners and renters for a single census tract. Since it’s often helpful to be able to view data like this in the context of the surrounding neighborhood, subsequent missions will explore ways to make comparisons with this sort of data, either to other tracts or to larger geographies.

But for starters, our target: although defining the exact edges of a neighborhood is never easy – especially ones in dense, diverse areas, where even residents disagree over terminology and the continual processes of gentrification, urban decline, migration, and other demographic shifts continually redefine the categories – most observers would agree that the neighborhood extends roughly north and west from Bartel Pritchard Square, at the lower corner of Prospect Park, with both 15th Street and Prospect Park itself providing something of an “edge.” Since edges are often exciting places to observe change, we will select an address along 15th Street, near the corner of 5th Avenue. Continue reading…


Master Plan (Robert Todd, 2011)

Posted by Ezra Glenn on April 03, 2012
Film / No Comments

Yesterday we screened Robert Todd’s Master Plan at MIT’s Department of Urban Studies & Planning, and I’m still thinking about it. It’s a beautiful documentary of the best kind: one that presents stirring images and thought-provoking juxtapositions, but once stirred and provoked the viewer’s thoughts are allowed to marinate a while. The film shies away from any pat conclusions, seeming much more comfortable presenting a landscape of places, ideas, and lines of inquiry for us to wander and ponder along with Todd, rather than a single “punch line” he wants us to “get”; I was reminded of the line from Zen and the Art of Motorcycle Maintenance, where Pirsig talks about the importance of thinking about “what things are,” and not just “what things mean.”

Indeed, the film had a certain Zen-like quality, both in its attention to small details and quietly “just being” in the places it explores, as well as its non-attachment to a single-purpose narrative. Although described as “a feature length film about housing,” its scope extends far beyond simply looking at physical housing: its subject is homes, habitats, communities, neighborhoods, buildings, landscapes, and the ways people interact in, around, and with them; the bulk of the footage presents a wonderfully rich portrait — or perhaps nonstop pan — of the ways humans live in places. Beyond all this — and the luxuriously decompressed pace takes plenty of time meandering before arriving at this point — the focal point of the film finally settles on a prolonged meditation on the homes and communities of incarcerated individuals, which is apparently a longer-term project for Todd. (An earlier film, In Loving Memory, explored the experiences of prisoners on death row; his next major project will examine ways that former prisoners are re-integrated into their home communities.)

Continue reading…


Building Blocks: Finding Obama in the smallest Census geography

Posted by Ezra Glenn on April 02, 2012
Missions, Reconnaissance, Shape Your Neighborhood / 1 Comment

The most basic unit of the U.S. Census is the individual household — that’s who fills out the surveys – but the Census won’t report data at the household level: in order to deliver on its promise of privacy and confidentiality (and thereby ensure our willingness to be enumerated), the Census always aggregates data before releasing it. This is important, and should become something of a mantra for would-be data analysts: all Census data is summary data. That said, we can still learn quite a lot at these micro-geographies, especially when we know what we are looking for.

Finding Barack

As an example of how to work with the building blocks of Census summary data – the individual “blocks” – let’s go back a bit in time and look at a very particular neighborhood in Chicago. At the time of the 2000 Census, President Obama was serving as an Illinois state senator, living at 5429 S. Harper Avenue in Chicago. Starting with just an address, you can easily find how it fits into the census geography on the “American FactFinder” site: just visit the main Census site, click the menu-bar for Data, and select the link for American FactFinder.

Continue reading…


Constantly Improving: acs development versions

Posted by Ezra Glenn on March 29, 2012
Census, Code / No Comments

As noted elsewhere here on CityState, I’ve developed a package for working with data from the American Community Survey in the R statistical computing language. The most recent official version of the package is 0.8, which can be found on CRAN. Since the package is still in active development, I’ve decided to provide development snapshots here, for users who are looking to work with the latest code as I develop it.

I’m hoping that the next major release will be version 1.0, due out sometime this spring. As I work towards that, here is version 0.8.1, which can be considered the first “snapshot” headed toward this release.

acs_0.8.1.tar.gz

To install, simply download, start R, and type:

 

> install.packages("path/to/file/acs_0.8.1.tar.gz", repos = NULL, type = "source")
> library(acs)

Updates include:

  • read.acs can now accept either a csv or a zip file downloaded directly from the FactFinder site, and it does a much better job (a) guessing how many rows to skip, (b) figuring out how to generate intelligent variable names for the columns, and (c) dealing with arcane non-numeric symbols used by FactFinder for some estimates and margins of error.
  • plot now includes a true.min= option, which allows you to specify whether you want to allow error bars to span into negative values (true.min=T, the default), or to bound them at zero (true.min=F, or some other numeric value). This seemed necessary because it looks silly to say “The number of children who speak Spanish in this tract is 15, plus or minus 80…” At the same time, if the variable turns out to be something like the difference between the income of Males and the income of Females in the geography, a negative value may make a lot of sense, and should be plotted as such.
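
The arithmetic behind bounding an error bar is easy to sketch in base R. The helper below is hypothetical and is not the package's plot method; it only illustrates the difference between letting a bar run negative and flooring it at some value, using the “15, plus or minus 80” case from above and an invented income-difference case.

```r
# Hypothetical helper (not part of the acs package): returns the lower and
# upper ends of an error bar, optionally bounding the lower end.
error_bar <- function(estimate, moe, floor = NULL) {
  lower <- estimate - moe
  upper <- estimate + moe
  if (!is.null(floor)) lower <- max(lower, floor)
  c(lower = lower, upper = upper)
}

# A count: 15 plus or minus 80 children. Bounding at zero avoids a negative count.
error_bar(15, 80)             # lower = -65, upper = 95
error_bar(15, 80, floor = 0)  # lower =   0, upper = 95

# A difference in incomes (invented numbers): negative values are meaningful.
error_bar(-1200, 900)         # lower = -2100, upper = -300
```

Leaving the bound off by default mirrors the reasoning above: clipping is a choice the analyst should make only when negative values cannot occur.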


acs Package Updated: version 0.8 now on CRAN

Posted by Ezra Glenn on March 18, 2012
Census, Code / 2 Comments

I’ve just released a new version of my acs package for working with the U.S. Census American Community Survey data in R, available on CRAN. The current version 0.8 includes all the original version 0.6 code, plus a whole lot more features and fixes. Some highlights:

  • An improved read.acs function for importing data downloaded from the Census American FactFinder site.
  • rbind and cbind functions to help create larger acs objects from smaller ones.
  • A new sum method to aggregate rows or columns of ACS data, dealing correctly with both estimates and standard errors.
  • A new apply method to allow users to apply virtually any function to each row or column of an acs data object.
  • A snazzy new plot method, capable of plotting both density plots (for estimates of a single geography and variable) and multiple estimates with error bars (for estimates of the same variable over multiple geographies, or vice versa). See sample plots below.

 

  • New functions to deal with adjusting the nominal values of currency from different years for the purpose of comparing between one survey and another. (See currency.convert and currency.year in the documentation.)
  • A new tract-level dataset from the ACS for Lawrence, MA, with dollar value currency estimates (useful to show off the aforementioned new currency conversion functions).
  • A new prompt method to serve as a helper function when changing geographic rownames or variable column names.
  • Improved documentation on the acs class and all of these various new functions and methods, with examples.
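
For readers wondering what “dealing correctly with both estimates and standard errors” involves when summing, here is a rough sketch in base R. It is not the package's own code. The tract figures are invented, and it assumes the usual approximation in which margins of error published at the 90% level are converted to standard errors by dividing by 1.645, and the standard error of a sum is the root sum of squares of the component standard errors.

```r
# Hypothetical tract estimates with published 90% margins of error (invented).
tracts <- data.frame(
  tract    = c("A", "B", "C"),
  estimate = c(1200, 950, 1430),
  moe      = c(150, 120, 200)
)

# ACS margins of error are published at the 90% level: SE = MOE / 1.645.
tracts$se <- tracts$moe / 1.645

# Estimates add directly; standard errors combine as the root sum of squares
# (an approximation that ignores any covariance between the tracts).
neighborhood_estimate <- sum(tracts$estimate)
neighborhood_se <- sqrt(sum(tracts$se^2))

print(c(estimate = neighborhood_estimate, se = neighborhood_se))
```

The combined standard error comes out smaller than the simple sum of the tract standard errors, which is why adding margins of error directly overstates the uncertainty of an aggregate.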

With this package, once you’ve found and downloaded your data from FactFinder, you can read it into R with a single command, aggregate multiple tracts into a neighborhood with another, generate a table of estimates and confidence intervals for your neighborhood with a third command, and produce a print-ready plot of your data (complete with error bars for the margins of error) with a fourth:

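# 1. Read the FactFinder download into an acs object
my.data=read.acs("some_data.csv")
# 2. Aggregate the tracts into a single neighborhood
my.neighborhood=apply(my.data, FUN="sum", MARGIN=1, agg.term="My.Neighborhood")
# 3. Table of estimates with 95% confidence intervals
confint(my.neighborhood, conf.level=.95)
# 4. Plot the estimates with error bars
plot(my.neighborhood, col="blue", err.col="violet", pch=16)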
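
As a rough guide to what a 95% confidence interval means here, the sketch below computes one by hand in base R from an estimate and its standard error, using a normal approximation. The numbers are invented, and the package's confint method may differ in its details; this only shows the underlying arithmetic.

```r
# Hypothetical aggregated estimate and standard error (invented numbers).
estimate <- 3580
se <- 170

conf_level <- 0.95
# Two-sided critical value from the standard normal distribution (about 1.96).
z <- qnorm(1 - (1 - conf_level) / 2)

# The interval is the estimate plus or minus z standard errors.
interval <- c(lower = estimate - z * se, upper = estimate + z * se)
print(round(interval))
```

Changing conf_level to 0.90 gives a critical value of about 1.645, which is the level at which the ACS publishes its margins of error.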

Already this package has come a long way, in large part thanks to the input of R users, so please check it out and let me know what you think — and how I can make it better.


TEDx + inTeractive Somerville

Posted by Ezra Glenn on March 05, 2012
Good Causes / No Comments

Yesterday I was pleased to be a part of the first-ever TEDx Somerville event. I only had four minutes on stage (which, if you’ve ever heard me speak, is barely enough time to get through a few opening wise-cracks), but it did provide a great platform to plug SCC’s new inTeractive Somerville website. In an attempt to showcase the ability to use this site to encourage and enable sharing (of ideas, data, meeting notes, news and personal stories, and more), I snapped a quick photo of the TEDx audience and by the end of the talk had it uploaded, geo-tagged, and posted to create a new discussion thread. (Full disclosure: there were actually some shifty behind-the-scenes driving tricks going on, thanks to Christian Spanring, who was hidden just offstage.)

The videos aren’t posted yet, but when they do go live, be sure to skip right past me and look for Somerville’s own “Alex the Jester” playing different tunes on three recorders all at the same time.

PS: If you are interested in learning more about inTeractive Somerville, and possibly adapting the platform for use in your own community, you should know that the code for the site is all open-source and available on GitHub.


org2blog

Posted by Ezra Glenn on February 18, 2012
Code / 1 Comment

For those of you who’ve noticed that I’ve started being a more active blogger over the last few weeks, there’s a good explanation: I’ve discovered org2blog.

Given that I try to live as much of my life as possible in emacs (or at least as much of my virtual life as possible), org2blog is a godsend. Using emacs’ excellent org-mode has already revolutionized my writing, coding, and the way I organize my time and my projects, and now, through this intuitive and clever extension, it is helping organize my blog activity as well.

Others (for example, here and here) have already written extensively on the how and the why of org2blog: basically, you install org-mode (already built in to most modern emacsen), load a few more special .el files, and with a little customization you’re good to go.

The real magic, however, comes in the use of org-mode to bring order to the chaos of your thoughts, so that blog posts are planned, scheduled, and reflective – and the resulting blog is actually organized and structured (as opposed to the random “shopping lists of my thoughts” model).

Continue reading…


World on a Wire (Fassbinder, 1973)

Posted by Ezra Glenn on February 13, 2012
Film / No Comments

Janus/Criterion has just re-released a beautiful print of Rainer Werner Fassbinder’s 1973 two-part film, World on a Wire, and I was fortunate enough to have 210 minutes free on a Saturday afternoon to go to the Brattle to watch it. It’s great.

Plot-wise, the film covers much of the same ground as The Matrix and Inception – although it was made 30 years earlier – but this aspect is covered pretty well by other reviews. That said, the themes of living in the dream-like reality of a world of simulacra – and the ultimate dream of escape to a higher reality – take on a special richness in Fassbinder’s work, infused with the pathos of counter-cultural 1970s Germans.

Visually, the entire film (originally shot in square 16mm for television, like an instamatic photograph) is beautifully fake, presenting the veneer of the world that was the 1970s: plastic molded offices full of plastic molded furniture and plastic molded people with plastic, blank faces – with the exception of our hero, Fred Stiller, the new Director of the Simulacron Project at the Institute for Cybernetics and Futurology. Stiller’s work, known as Simulacron 1, is the most sophisticated computer simulation ever made, a massive program modeling a world of 10,000 “identity units” for the purpose of making accurate scientific and government projections. It’s a planner’s dream: a simulated world where real life plays out for the purposes of forecasting future conditions and testing various alternatives (“How much steel production will the economy require in 30 years?”; “Should we build more housing units in Baden-Württemberg or Schleswig-Holstein?”; and so on).

Continue reading…
