Comparison between join and IN clause [message #355451] |
Fri, 24 October 2008 02:24 |
xokas11
Messages: 28 Registered: January 2008 Location: Guadalajara, Spain
|
Junior Member |
|
|
Hi! I have a question regarding performance of 2 sql statements.
I have 2 tables: RESULTADOS and ALUMNOS_CURSOS they are related by the ID_MATRICULA column.
RESULTADOS(
ID NUMBER,
id_elemento number,
fecha_examen date,
xml_resultado blob,
nota number,
id_matricula NUMBER)
and ALUMNOS_CURSOS(
ID_MATRICULA NUMBER,
ID_USUARIO NUMBER,
ID_CURSO NUMBER,
FECHA_ALTA DATE,
BAJA NUMBER)
I want to know the number of entries in resultado that have nota >= 0.5 for every ALUMNOS_CURSOS entry that has a defined id_curso. I implemented this this way:
select count(*) num_aprobados
from resultados
where nota >= 0.5
and id_matricula in (
select id_matricula
from alumnos_cursos
where id_curso = &id_curso_1
)
I know there is the possibility of implementing it trough a join like this:
SELECT SUM(tests_aprobados) num_tests_aprobados
FROM alumnos_cursos ac
JOIN
(SELECT COUNT(*) tests_aprobados,
resultados.id_matricula
FROM resultados
WHERE resultados.nota >= 0.5
GROUP BY resultados.id_matricula
) r
ON (ac.id_matricula=r.id_matricula)
WHERE ac.id_curso = &id_curso
what's the most optimal way of getting this information? I assume it's the first one but maybe in the future if there are hundreds of thousands of rows it may change. What's your opinion? How do you usually do this kind of select (getting some info from some table that's present on another table)?
Regards, Joaquin
|
|
|
Re: Comparison between join and IN clause [message #355482 is a reply to message #355451] |
Fri, 24 October 2008 07:09 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
Modern versions of Oracle will generally unnest the IN subquery and perform it as a join. So there is no appreciable difference.
Both methods have about the potential to achieve a similar optimal plan. When you run them, one or both may choose a sub-optimal plan, so neither one is "best" from a performance perspective.
My golden rule is to use the version that makes sense. If you want rows from A that match the rows in B, use IN. If you want rows from A and B combined, use a join.
Pick one and then start tuning if it is sub-optimal.
Ross Leishman
|
|
|