Re: Joins vs. Sub-Selects

From: David Cressey <cressey73_at_verizon.net>
Date: Thu, 19 Jul 2007 16:47:59 GMT
Message-ID: <3BMni.6296$4J4.5811_at_trndny05>


<perryv_at_gmail.com> wrote in message
news:1184858324.825144.238180_at_x40g2000prg.googlegroups.com...
> Which relational operation perform faster in general, JOINs or SELECT
> subqueries?
>
> Suppose we have these tables:
>
> create table departments (
> dept_id int not null primary key auto_increment,
> dept varchar(50) not null unique
> );
>
> create table employees (
> emp_id int not null primary key auto_increment,
> name varchar(50) not null unique,
> dept_id int not null,
> foreign key (dept_id) references departments (dept_id)
> );
>
> create table websites (
> emp_id int not null primary key,
> url varchar(50) not null,
> foreign key (emp_id) references employees (emp_id)
> );
>
> An employee belongs to exactly one department, but not all employees
> have a website.
>
> Suppose I want to know the name, department and website of all
> employees. There are many ways to formulate this query in SQL. In
> particular, one possible query is this:
>
> select name, dept, url
> from employees inner join departments
> on employees.dept_id = departments.dept_id
> left join websites
> on employees.emp_id = websites.emp_id;
>
> Another possible one is this:
>
> select name,
> (select dept from departments where dept_id = employees.dept_id) as
> dept,
> (select url from websites where emp_id = employees.emp_id) as url
> from employees
>
> Which of these two queries would execute faster?
>
> Here are some sample data:
>
> "departments" table:
> +---------+------------+
> | dept_id | dept |
> +---------+------------+
> | 1 | Marketing |
> | 2 | Accounting |
> | 3 | MIS |
> +---------+------------+
>
> "employees" table:
> +--------+---------+---------+
> | emp_id | name | dept_id |
> +--------+---------+---------+
> | 1 | Anne | 1 |
> | 2 | Tim | 3 |
> | 3 | Brad | 3 |
> | 4 | Shirley | 2 |
> +--------+---------+---------+
>
> "websites" table:
> +--------+-------------------+
> | emp_id | url |
> +--------+-------------------+
> | 1 | www.anne-site.com |
> | 3 | www.bradguru.com |
> +--------+-------------------+
>
> Sample query result:
> +---------+------------+-------------------+
> | name | dept | url |
> +---------+------------+-------------------+
> | Anne | Marketing | www.anne-site.com |
> | Tim | MIS | NULL |
> | Brad | MIS | www.bradguru.com |
> | Shirley | Accounting | NULL |
> +---------+------------+-------------------+
>

It depends.

It depends on which RDBMS package you are using, on which version of that package, and how the database is set up.

It depends on the cardinalities of the tables, on the load, and on the resources used.

So... don't try to generalize on this one. Play around with your enviornment and discover the answer for yourself. In the environment where I was working in 1994, joins tended to outperform subqueries. But not always. Received on Thu Jul 19 2007 - 18:47:59 CEST

Original text of this message