Catherine Devlin

Databases taste better with Python.

auto-generate SQLAlchemy models

Mon, 2014-07-28 15:30

PyOhio gave my lightning talk on ddlgenerator a warm reception, and Brandon Lorenz got me thinking, and PyOhio sprints filled me with py-drenaline, and now ddlgenerator can inspect your data and spit out SQLAlchemy model definitions for you:

$ cat merovingians.yaml
-
  name: Clovis I
  reign:
    from: 486
    to: 511
-
  name: Childebert I
  reign:
    from: 511
    to: 558
$ ddlgenerator --inserts sqlalchemy merovingians.yaml

from sqlalchemy import create_engine, Column, Integer, MetaData, Table, Unicode
engine = create_engine(r'sqlite:///:memory:')
metadata = MetaData(bind=engine)

merovingians = Table('merovingians', metadata,
  Column('name', Unicode(length=12), nullable=False),
  Column('reign_from', Integer(), nullable=False),
  Column('reign_to', Integer(), nullable=False),
)

conn = engine.connect()
inserter = merovingians.insert()
conn.execute(inserter, **{'name': 'Clovis I', 'reign_from': 486, 'reign_to': 511})
conn.execute(inserter, **{'name': 'Childebert I', 'reign_from': 511, 'reign_to': 558})

Brandon's working on a pull request to provide similar functionality for Django models!
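
If you're curious how inspecting data can turn into column definitions, here's a minimal sketch of the idea. It is not ddlgenerator's actual code: `infer_column` and `infer_columns` are hypothetical names made up for illustration, and it only knows about integers and text. It looks at every value for a field, picks `Integer()` if they're all integers, and otherwise falls back to `Unicode` sized to the longest value seen.

```python
def infer_column(name, values):
    """Return a SQLAlchemy-style Column definition (as text) for one field.

    Hypothetical helper for illustration; not part of ddlgenerator.
    """
    present = [v for v in values if v is not None]
    # A field is nullable if any row lacks a value for it
    nullable = len(present) < len(values)
    if present and all(isinstance(v, int) and not isinstance(v, bool)
                       for v in present):
        col_type = "Integer()"
    else:
        # Fall back to text, sized to the longest value seen
        longest = max((len(str(v)) for v in present), default=1)
        col_type = "Unicode(length=%d)" % longest
    return "Column(%r, %s, nullable=%r)" % (name, col_type, nullable)


def infer_columns(rows):
    """Infer one Column definition per field, in first-seen order."""
    names = []
    for row in rows:
        for key in row:
            if key not in names:
                names.append(key)
    return [infer_column(n, [row.get(n) for row in rows]) for n in names]


rows = [
    {"name": "Clovis I", "reign_from": 486, "reign_to": 511},
    {"name": "Childebert I", "reign_from": 511, "reign_to": 558},
]
for line in infer_columns(rows):
    print(line)
```

A real tool has to cope with far more (dates, decimals, booleans, keys, dialect quirks), which is where most of the complexity lives.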


Tue, 2014-07-01 10:37

Yesterday was my first day at 18F!

What is 18F? We're a small, little-known government organization that works outside the usual channels to accomplish special projects. It involves black outfits and a lot of martial arts.

Kidding! Sort of. 18F is a new agency within the GSA that does citizen-focused work for other parts of the U.S. Government, working small, quick projects to make information more accessible. We're using all the tricks: small teams, agile development, rapid iteration, open-source software, test-first, continuous integration. We do our work in the open.

Sure, this is old hat to you, faithful blog readers. But bringing it into government IT work is what makes it exciting. We're hoping that the techniques we use will ripple out beyond the immediate projects we work on, popularizing them throughout government IT and helping efficiency and responsiveness throughout. This is a chance to put all the techniques I've learned from you to work for all of us. Who wouldn't love to get paid to work for the common good?

Obviously, this is still my personal blog, so nothing I say about 18F counts as official information. Just take it as my usual enthusiastic babbling.


Fri, 2014-05-23 15:09

I've had it on GitHub for a while, but I finally released ddlgenerator to PyPI.

I've been frustrated for years that there was no good open-source way to set up RDBMS tables from flat data files. Sure, you could import the data - after setting up the DDL by hand. ddlgenerator handles that; in fact, you can go from zero, setting up and populating a table in a single line. Nothing up my sleeve:

$ psql -c "SELECT * FROM knights"
ERROR: relation "knights" does not exist
LINE 1: SELECT * FROM knights
$ ddlgenerator --inserts postgresql knights.yaml | psql
$ psql -c "SELECT * FROM knights"
    name    |         dob         |   kg    | brave
------------+---------------------+---------+-------
 Lancelot   | 0471-01-09 00:00:00 | 82.0000 | t
 Gawain     |                     | 69.2000 | t
 Robin      | 0471-01-09 00:00:00 |         | f
 Reepacheep |                     |  0.0691 | t
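
To give a feel for the "zero to populated table" step, here's a rough sketch using only Python's standard library and an in-memory SQLite database. Everything in it is an assumption for illustration: `sql_type` and `generate_sql` are hypothetical helpers rather than ddlgenerator's API, the rows are my guess at what the input might look like based on the query output above, dates are simply kept as text, and column names are trusted rather than quoted.

```python
import sqlite3


def sql_type(values):
    """Pick a simple SQLite type that fits every non-missing value."""
    present = [v for v in values if v is not None]
    if present and all(isinstance(v, bool) for v in present):
        return "BOOLEAN"
    if present and all(isinstance(v, int) and not isinstance(v, bool)
                       for v in present):
        return "INTEGER"
    if present and all(isinstance(v, (int, float)) and not isinstance(v, bool)
                       for v in present):
        return "REAL"
    return "TEXT"


def generate_sql(table, rows):
    """Return (create_statement, insert_statement, parameter_tuples)."""
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    defs = []
    for col in columns:
        values = [row.get(col) for row in rows]
        # Only demand NOT NULL when every row supplies the column
        not_null = "" if any(v is None for v in values) else " NOT NULL"
        defs.append("%s %s%s" % (col, sql_type(values), not_null))
    create = "CREATE TABLE %s (%s)" % (table, ", ".join(defs))
    insert = "INSERT INTO %s (%s) VALUES (%s)" % (
        table, ", ".join(columns), ", ".join("?" for _ in columns))
    params = [tuple(row.get(col) for col in columns) for row in rows]
    return create, insert, params


# Assumed input rows, reconstructed from the query output for illustration
knights = [
    {"name": "Lancelot", "dob": "0471-01-09 00:00:00", "kg": 82.0, "brave": True},
    {"name": "Gawain", "kg": 69.2, "brave": True},
    {"name": "Robin", "dob": "0471-01-09 00:00:00", "brave": False},
    {"name": "Reepacheep", "kg": 0.0691, "brave": True},
]

create, insert, params = generate_sql("knights", knights)
conn = sqlite3.connect(":memory:")
conn.execute(create)
conn.executemany(insert, params)
print(create)
print(conn.execute("SELECT name, kg FROM knights ORDER BY name").fetchall())
```

The missing values are what make this interesting: a column can only be declared NOT NULL after every row has been checked.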

This is a fairly complex tool so I'm sure you'll be using the bug tracker. But I hope you'll enjoy it nonetheless!


Wed, 2014-05-21 10:50

I went down a refactoring rabbit hole on ddl-generator and ended up pulling out the portion that pulls in data from various file formats. Perhaps it will be useful to others.

>>> from data_dispenser.sources import Source
>>> for row in Source('animals.csv'):
...     print(row)
OrderedDict([('name', 'Alfred'), ('species', 'wart hog'), ('kg', '22'), ('notes', 'loves turnips')])
OrderedDict([('name', 'Gertrude'), ('species', 'polar bear'), ('kg', '312.7'), ('notes', 'deep thinker')])
OrderedDict([('name', 'Emily'), ('species', 'salamander'), ('kg', '0.3'), ('notes', '')])

Basically, I wanted a consistent way to consume rows of data - no matter where those rows come from. Right now, JSON, CSV, YAML, etc. all require separate libraries, each with its own API. This abstracts all that out, for reading purposes; now each data source is just a Source.
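
The underlying idea fits in a few lines. This is only a sketch with the standard library, not data_dispenser's implementation: `rows_from` is a hypothetical function, it handles just CSV and JSON, and it picks the parser from the file extension. Whatever the format, the caller gets the same thing back: one OrderedDict per row.

```python
import csv
import json
import os
import tempfile
from collections import OrderedDict


def rows_from(path):
    """Yield each record in a CSV or JSON file as an OrderedDict.

    Hypothetical helper for illustration; not data_dispenser's API.
    """
    ext = os.path.splitext(path)[1].lower()
    if ext == ".csv":
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                yield OrderedDict(row)
    elif ext == ".json":
        with open(path) as f:
            data = json.load(f, object_pairs_hook=OrderedDict)
        # Accept either a list of records or a single record
        for row in (data if isinstance(data, list) else [data]):
            yield row
    else:
        raise ValueError("unsupported file type: %r" % ext)


# Demo: write one small file per format, then read both the same way
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "animals.csv")
    with open(csv_path, "w", newline="") as f:
        f.write("name,species,kg\nAlfred,wart hog,22\n")
    json_path = os.path.join(tmp, "animals.json")
    with open(json_path, "w") as f:
        json.dump([{"name": "Gertrude", "species": "polar bear", "kg": 312.7}], f)
    for path in (csv_path, json_path):
        for row in rows_from(path):
            print(dict(row))
```

Note that CSV values come back as strings while JSON keeps its numbers, so anything consuming the rows still has to decide how to interpret types.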

I'd love bug reports, and sample files to test against. And feel free to contribute patches! For example, it wouldn't be hard to add MS Excel as a data source.

G+ Public Hangout Fail

Tue, 2014-05-06 21:09
tl;dr: Do not use public Google+ Hangouts under any circumstances, because people suck.

Before the PyCon 2014 CFP came due, PyLadies hosted several G+ hangouts for talk proposal brainstorming. Potential speakers could talk over and flesh out their ideas with each other, producing better talk proposals. More importantly, it was a nice psychological stepping stone on the way to filling out that big, scary CFP form all alone. I thought they went great.

I wanted to emulate them for Postgres Open and PyOhio, which both have CFPs open now. The PyLadies hangouts had used EventBrite to preregister attendees, and I unfortunately did not stop to consider why. Instead, I just scheduled hangouts, made them public, and sent out invitations with the hangout URLs, encouraging people to forward the invites onward. Why make participating any harder than it has to be?

The more worldly of you are already shaking your heads at my naiveté. It turns out that the world's exhibitionists have figured out how to automatically detect and join public hangouts. For several seconds I tried kicking out and banning them as they joined, but new ones kept arriving, faster than one per second. Then I hung up - which unfortunately did not terminate the hangout. It took me frantic minutes to find how to delete a hangout in progress. I dearly hope that no actual tech community members made it to the hangout during that time.

I had intended to create a place where new speakers, and women especially, would feel safe increasing their community participation. The absoluteness of my failure infuriates me.

Hey, Google: public G+ hangouts have been completely broken, not by technical failure, but by the degraded human condition. You need to remove them immediately. The option can only cause harm, as people accidentally expose themselves and others to sexual harassment.

In the future, a "public" hangout URL should actually take you to a page where you request entrance from the organizer by text message (which should get the same spam filtration that an email would). But fix that later. Take the public hangouts away now.

Everybody else, if you had heard about the hangouts and were planning to participate, THANK YOU - but I've cancelled the rest of them. You should present anyway, though! I'd love to be contacted directly to talk over your ideas for proposals.