2010 U.S. Census Data for Postgresql

Census data published by the U.S. Census Bureau is not readily transformed for Relational Database Management System (RDMS) insertion. Here we provide an archive file to create a United States 2010 Census Short Form 1 (SF1) relational database for my favorite open source RDMS PostgreSQL. Check back here for 2020 Census updates, coming soon.

Database Features

Census Bureau instructions pertain to Microsoft Access 2007 database creation but then only to separate databases for each state. Here we create data tables that combine all 50 states (plus District of Columbia and Puerto Rico). Tables are inked with a PRIMARY KEY defined on stusab (state) and logrecno. An example table definition appears below.

create table sf1_00001 (
fileid varchar(510),
stusab varchar(510),
chariter varchar(510),
cifsn varchar(510),
logrecno integer,
p0010001 integer,
PRIMARY KEY (logrecno, stusab)
);

Restoring the DB

Download link (11GB): Census_2010_sf1_pg This 2010 Census PostgreSQL database archive was created with the following Pg commands:

pg_dump -Fc -f /archive/Census_2010_sf1_pg -d sf1_2010r &

The pg_dump custom format (-Fc) allows flexibility to restore one or more objects. The database user is not saved so you can easily restore your own user. The following Pg commands will restore the database.

createdb sf1_2010r
pg_restore -Fc -d sf1_2010r -v < Census_2010_sf1_pg

The size of the final database is 230GB, so be prepared.

Testing

Try this query at the psql prompt:

select sf1_00003.stusab as “State”,
sf1_00003.p0030001 as “Population”,
geo_header_sf1.geocomp,
geo_header_sf1.logrecno
from geo_header_sf1,sf1_00003
where geo_header_sf1.sumlev=’040′ and
geo_header_sf1.geocomp = ’00’ and
geo_header_sf1.logrecno = sf1_00003.logrecno and
geo_header_sf1.stusab = sf1_00003.stusab
order by “Population” desc;

If everything is working, you should see all 50 states (+DC, PR)

State | Population | geocomp | logrecno
——-+————+———+———-
CA | 37253956 | 00 | 1
TX | 25145561 | 00 | 1
NY | 19378102 | 00 | 1
FL | 18801310 | 00 | 1
IL | 12830632 | 00 | 1
PA | 12702379 | 00 | 1
OH | 11536504 | 00 | 1
MI | 9883640 | 00 | 1
GA | 9687653 | 00 | 1
NC | 9535483 | 00 | 1
NJ | 8791894 | 00 | 1
VA | 8001024 | 00 | 1
WA | 6724540 | 00 | 1
MA | 6547629 | 00 | 1
IN | 6483802 | 00 | 1
AZ | 6392017 | 00 | 1
TN | 6346105 | 00 | 1
MO | 5988927 | 00 | 1
MD | 5773552 | 00 | 1
WI | 5686986 | 00 | 1
MN | 5303925 | 00 | 1
CO | 5029196 | 00 | 1
AL | 4779736 | 00 | 1
SC | 4625364 | 00 | 1
LA | 4533372 | 00 | 1
KY | 4339367 | 00 | 1
OR | 3831074 | 00 | 1
OK | 3751351 | 00 | 1
PR | 3725789 | 00 | 1
CT | 3574097 | 00 | 1
IA | 3046355 | 00 | 1
MS | 2967297 | 00 | 1
AR | 2915918 | 00 | 1
KS | 2853118 | 00 | 1
UT | 2763885 | 00 | 1
NV | 2700551 | 00 | 1
NM | 2059179 | 00 | 1
WV | 1852994 | 00 | 1
NE | 1826341 | 00 | 1
ID | 1567582 | 00 | 1
HI | 1360301 | 00 | 1
ME | 1328361 | 00 | 1
NH | 1316470 | 00 | 1
RI | 1052567 | 00 | 1
MT | 989415 | 00 | 1
DE | 897934 | 00 | 1
SD | 814180 | 00 | 1
AK | 710231 | 00 | 1
ND | 672591 | 00 | 1
VT | 625741 | 00 | 1
DC | 601723 | 00 | 1
WY | 563626 | 00 | 1
(52 rows)

References

  1. Census Bureau data set
  2. 2010 Census Summary File 1 technical documentation.