| Oracle FAQ | Your Portal to the Oracle Knowledge Grid | |
Home -> Community -> Usenet -> comp.databases.theory -> Re: A better SQL implementation?
>From "3.3 Horror scenario 1", Dr. Harry McKame indicates that under a
b-tree implimitation, 2,000 disk access would be required for this SQL:
select * from employees where job = 'programmer' and dept =
'dept1'
and we assume that there are only 10 rows in the result.
This is incorrect, as the actual number of IO is 21 using SQL Server. The IO ratio is then 21/12 or 175% not 2 orders of magnitude.
Dr McKame is also incorrect on the data range values when he states "Assume now 1,000,000 employees in the table, among them 1,000 programmers. Assume also 1,000 different departments in this company (for symmetry). "
With this senario, the combination of Job and Department would be unique. As an Alternative, have 1,000 different jobs, 100 departments and each department has each of the 1,000 jobs ten times.
Finally, Dr McKame states that a set system will do one access for each of the indicies. Since the undelying physical implimitation model has not been provided, it is not possible to validate this statements.
However, assumming that a fixed storage unit size of 8K is used and the
record id is 8 bytes, I have recalculated the number of disk storage
units and then the disk I/O:
For each of the 1000 JobId, there would 1,000 employees, the set index
would then have a size of 1000 * 8 bytes, or about 8K, which is 1
storage unit.
For each of the 100 DeptIds, there would be 10,000 employees, the set
index would then have a size of about 80K, which is 10 storage unit.
Based on one I/O for each storage unit, the set based index I/O would be 21 based on:
1. 1 I/O to read the JobId index entry 2. 10 I/O to read the DeptId index entry 3. 10 I/O to get the data rows
This totals 21 I/O for the set based index which is exactly equal to the 21 I/O B-Tree index solution used by SQL Server.
The SQL Server execution plan is:
Here is the SQL to reproduce:
Table master.dbo.sequences constains 32768 rows with seq having values
from 0 to 32767. For how to populate this tables see
http://www.aspfaq.com/show.asp?id=2516
create table Employees
(EmployeeId integer not null identity(1,1) ,JobId integer not null ,DeptId integer not null
master.dbo.sequences Depts
cross join
master.dbo.sequences EDJ where Jobs.seq between 1 and 1000 and Depts.seq between 1 and 100 and EDJ.seq between 1 and 10
select * from Employees where JobId = 345 and DeptId = 67
select * from Employees where JobId = 345 and DeptId = 67
![]() |
![]() |