Data Manipulation Programming Languages

From: Robert <>
Date: Tue, 27 Nov 2007 14:59:27 -0800 (PST)
Message-ID: <>

DATA MANIPULATION PROGRAMMING LANGUAGES

For data transformation and data preparation, there are several choices. All of them are useful, but none is perfect for every data transformation problem. They fall into two broad categories: specialized and all-purpose. The family of data transformation languages is fuzzy and overlaps heavily with the family of statistical programming languages and with the database management language(s):

SPECIALIZED PROGRAMMING LANGUAGES - less development and maintenance time, and targeted at solving particular kinds of problems:

SQL - Primarily for data query and database management. Also useful for some types of data transformation. SQL, of course, is an open standard, which can be implemented by anyone.

PARAGRAPH-STYLE STATISTICAL PROGRAMMING LANGUAGES: SAS, SPSS, and Vilno. SAS is by far the best known language in this category. Vilno is by far the newest; it is neither well known nor mature, but it is quite versatile for those who choose to look at it. SAS and SPSS are both proprietary, and neither language has a name other than that of the corporation that developed it. Most SPSS users do not use the programming language underneath the user interface.

To be an open standard, a language has to be both non-proprietary and widely used, so there is no open standard paragraph-style statistical programming language at this time. Also, the third famous statistical programming language, S, is not paragraph-style, because its structure and mentality are very different from those of SAS/SPSS.

Paragraph-style statistical programming languages are quite useful for many different types of data transformation, including those where SQL is the most elegant solution. This is because para-stat languages have an absorbing philosophy: if a feature is useful, it can be incorporated into the next revision as an additional procedure. (For example, Vilno data processing paragraphs can optionally begin with a many-to-many join with a where clause, just like SQL SELECT, and SAS has added PROC SQL.) Para-stat languages are the Swiss Army knives of data transformation.
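For readers who have not seen one, here is a minimal sketch of a many-to-many join with a where clause, written as a plain SQL SELECT and run through Python's standard sqlite3 module. The table names, column names, and data are made up for illustration. This is not Vilno or SAS syntax, only the SQL construct those languages borrow.

```python
import sqlite3


def join_with_where(min_amount):
    """Join two small tables on a shared key and filter with WHERE.

    Both tables can hold several rows per patient, so the join is
    many-to-many: every visit row pairs with every dose row for the
    same patient. All names and values here are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE visits (patient TEXT, visit_day INTEGER);
        CREATE TABLE doses  (patient TEXT, amount INTEGER);
        INSERT INTO visits VALUES ('A', 1), ('A', 8), ('B', 1);
        INSERT INTO doses  VALUES ('A', 10), ('A', 20), ('B', 5);
        """
    )
    rows = conn.execute(
        """
        SELECT v.patient, v.visit_day, d.amount
        FROM visits AS v
        JOIN doses AS d ON v.patient = d.patient
        WHERE d.amount >= ?
        ORDER BY v.patient, v.visit_day, d.amount
        """,
        (min_amount,),
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in join_with_where(10):
        print(row)
```

Patient A has two visits and two doses, so the join yields four rows for A. Patient B's single dose falls below the threshold and is dropped by the where clause.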

MDX - highly specialized for querying, and for some analysis of, a multidimensional cube. Sometimes implemented under a GUI (as with SPSS).

ALL-PURPOSE PROGRAMMING LANGUAGES - you can do anything you want with all-purpose languages, if you have the time for both development and maintenance. The three that come to mind are Perl, Python, and Ruby.

These three are generally considered more productive than C or Java. But if you have a data transformation project for which you believe a specialized programming language will not suffice, you had better set aside the time.
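As a rough illustration of where that time goes, here is a sketch of a many-to-many join with a row filter written by hand in plain Python. The function name, the (key, value) row layout, and the sample data are assumptions made for this example. A specialized language expresses the same step in a single statement, while here the indexing, matching, and filtering are all the programmer's job.

```python
from collections import defaultdict


def join_by_hand(left, right, keep):
    """Many-to-many join of (key, value) pairs, with a row filter.

    `left` and `right` are lists of (key, value) tuples; `keep` is a
    predicate applied to each joined (key, left_value, right_value) row.
    All names here are hypothetical and chosen for illustration.
    """
    # Index the right-hand rows by key so each left row can find its matches.
    index = defaultdict(list)
    for key, value in right:
        index[key].append(value)

    joined = []
    for key, value in left:
        # .get avoids creating empty entries for keys with no match.
        for other in index.get(key, []):
            row = (key, value, other)
            if keep(row):
                joined.append(row)
    return joined


if __name__ == "__main__":
    visits = [("A", 1), ("A", 8), ("B", 1)]
    doses = [("A", 10), ("A", 20), ("B", 5)]
    for row in join_by_hand(visits, doses, lambda r: r[2] >= 10):
        print(row)
```

Sorting, null handling, and multi-column keys would each add more hand-written code, which is the maintenance cost the paragraph above warns about.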

Also note that volunteers (open source hackers) are largely driving innovation in all-purpose programming languages, whereas innovation in specialized languages depends more on the private sector and happens much more slowly. Hackers tend not to fall in love with specialized languages, not even SQL. But there are thousands of hackers out there who dream of being the next Larry Wall.

The three most famous statistical programming languages (SAS, SPSS, and S), along with SQL, are at least 30 years old. And yet all four of these languages, under certain scenarios, have productivity problems.

Received on Tue Nov 27 2007 - 16:59:27 CST
