approaches for embedding a data language in a general purpose language

From: Marshall <marshall.spight_at_gmail.com>
Date: 9 Oct 2006 08:12:37 -0700
Message-ID: <1160406756.523915.48510_at_m73g2000cwd.googlegroups.com>



Hello all,

There are various approaches one can take for embedding a domain-specific language into a general-purpose programming language. Common examples are regular expression libraries in languages that don't directly support regular expressions and, more to our purpose, SQL inside Java or C/C++.

The three main approaches I can think of are:

1) A library that accepts text written in the language as string parameters
ex.: JDBC, ODBC

2) Code generation
ex.: Hibernate, any one of ~1000 O/R mappers

3) Direct embedding using a preprocessor
ex.: SQL-J, embedded SQL (for C), etc.
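
To make approach 1) concrete, here is a minimal sketch using Python's standard sqlite3 module as a stand-in for JDBC/ODBC. The emp table and its columns are invented for illustration. The point is that the host language sees the query only as a string, so a misspelled column name goes unnoticed until the statement actually runs.

```python
import sqlite3

# In-memory database standing in for a real server (illustrative schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO emp (name, salary) VALUES (?, ?)",
    [("Ann", 120.0), ("Bob", 80.0), ("Cy", 95.0)],
)

# The query is an opaque string as far as the host language is concerned.
rows = conn.execute(
    "SELECT name, salary FROM emp WHERE salary > ? ORDER BY name", (90.0,)
).fetchall()

# A misspelled column is only detected when the string reaches the database.
try:
    conn.execute("SELECT nmae FROM emp")
    typo_caught_at = "never"
except sqlite3.OperationalError:
    typo_caught_at = "run time"
```

A preprocessor as in approach 3) could, in principle, check the same statement against the schema before the program ever runs.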

I've used the first two extensively, but never the third. I've got a nagging suspicion that it's the one I would like best. (Of course, one must immediately suspect grass-is-greener syndrome here.)

An issue is that general purpose languages typically need to know the types of things up front, and that means that in the code generation approach, it's necessary to regenerate the code every time a query with a new result set type is needed. That's a bit inconvenient, and means that your modification will necessarily be far away from the point in the code that's motivating it.
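
As a rough illustration of that regeneration cycle, the sketch below uses a hypothetical generate_row_class helper (not taken from Hibernate or any real O/R mapper) that emits one typed class per result set shape. The class and column names are made up. Adding a single column to a query changes the result type, so the generator has to be run again, in a place separate from the query itself.

```python
def generate_row_class(class_name, columns):
    """Emit source text for a dataclass with one typed field per result column."""
    lines = [
        "from dataclasses import dataclass",
        "",
        "@dataclass",
        f"class {class_name}:",
    ]
    lines += [f"    {name}: {type_name}" for name, type_name in columns]
    return "\n".join(lines) + "\n"

# Result type for: SELECT name, salary FROM emp
src_v1 = generate_row_class("EmpNameSalary", [("name", "str"), ("salary", "float")])

# A query that also returns dept has a different result type,
# so a second class must be generated for it.
src_v2 = generate_row_class(
    "EmpNameSalaryDept", [("name", "str"), ("salary", "float"), ("dept", "str")]
)

# A real tool would write this output to a file at build time;
# exec is used here only to show that the generated text is valid code.
namespace = {}
exec(src_v1, namespace)
exec(src_v2, namespace)
row = namespace["EmpNameSalaryDept"]("Ann", 120.0, "R&D")
```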

One thing particularly pernicious about the code generation approach is that it really pushes the programmer in the direction of row-at-a-time thinking. This leads to horrific performance.
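
A small sketch of the difference, again with Python's sqlite3 module and an invented dept/emp schema. The run helper is hypothetical and only exists to count the statements sent. The row-at-a-time version issues one query for the departments and then one more per department; the set-at-a-time version gets the same totals from a single statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emp (id INTEGER PRIMARY KEY, dept_id INTEGER, salary REAL);
    INSERT INTO dept VALUES (1, 'eng'), (2, 'ops');
    INSERT INTO emp VALUES (1, 1, 100.0), (2, 1, 120.0), (3, 2, 90.0);
""")

sent = []  # every statement passed to run() is recorded here

def run(sql, params=()):
    """Hypothetical helper: execute one statement and remember that it was sent."""
    sent.append(sql)
    return conn.execute(sql, params).fetchall()

# Row-at-a-time: fetch each department, then query its employees separately.
totals_rowwise = {}
for dept_id, name in run("SELECT id, name FROM dept ORDER BY id"):
    (total,) = run("SELECT SUM(salary) FROM emp WHERE dept_id = ?", (dept_id,))[0]
    totals_rowwise[name] = total
rowwise_count = len(sent)

sent.clear()

# Set-at-a-time: let the database compute all totals in one statement.
totals_setwise = dict(run(
    "SELECT d.name, SUM(e.salary) "
    "FROM dept d JOIN emp e ON e.dept_id = d.id "
    "GROUP BY d.name"
))
setwise_count = len(sent)
```

With two departments that is three statements against one, and the gap grows with the number of rows in the outer loop.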

Anyway, I'd be interested in a discussion of the merits and deficits of the various approaches, and particularly in anything anyone has to say about 3). I can't help but feel there must be a better way than what I've been doing.

Marshall