How to optimise aggregate queries with distinct rows?

From: Dirk Gomez <usenet_at_dirkgomez.de>
Date: 31 Aug 2001 07:21:11 +0200
Message-ID: <m34rqo4xfc.fsf@colorado.arsdigita.de>

How to do good aggregate queries on n:m-mapped objects.

This is my schema:

create table sn_objects (

        object_id
          integer
            constraint sn_objects_id_pk
              primary key,
	...

);

create table sw_object_category_map (

       category_id     integer
                       constraint sw_object_category_map_cate_fk
                       references categories on delete cascade,
       object_id       integer
                       constraint sw_object_category_map_obj_fk
                       references acs_objects on delete cascade,
       constraint swc_object_category_map_ids_un
                       primary key (object_id, category_id)

);

create table categories(

category_id integer
constraint categories_category_id_pk primary key, ...
);

So each object can belong to arbitrarily many categories. I have a webpage where I want to show the number of objects that belong to a particular set of categories. It looks like this:

supercategory name a (32 objects)
supercategory name b (123 objects)
...

As each object can belong to many categories, it could be in category a-a and a-b, but should be counted only once (it is only one "physical thing").

The query to retrieve that count looks like this:

select count(unique obj.object_id)
from sn_objects obj, sw_object_category_map cm where obj.object_type_id=:object_type_id ...
and cm.object_id = obj.object_id
and cm.category_id in (
select d.object_id from sw_flat_cat, sw_category_dim d where parent=:parent_id
and d.node_id = child)

With 813 rows as the result, I get 885990 consistent reads. Removing the unique gives me 855 - wrong! - , but requires only 1515 reads.

I know that unique or distinct incur a sort operation and this is why the difference is so big. Maybe I could rewrite the query, so that the sort overhead isn't that gigantic.

Any ideas?

cheers Dirk Received on Fri Aug 31 2001 - 00:21:11 CDT