Re: Getting a grip on nested intervals and matrix encoding

From: Tegiri Nenashi <tegirinenashi_at_gmail.com>
Date: Fri, 2 Jul 2010 10:21:38 -0700 (PDT)
Message-ID: <061efadc-c246-4868-87a5-894c51d6c002_at_x1g2000prc.googlegroups.com>


On Jul 1, 2:05 pm, creecode <creec..._at_gmail.com> wrote:
> Hello all,
>
> I'm attempting to use Vadim Tropashko's nested intervals and matrix
> encoding technique.  As described in chapter 5 of his "SQL Design
> Patterns" book and also in various online articles and discussion
> groups.
>
> I thought I was making some progress but I ran into a snag when I
> tried a descendants query.  In the book there is some confusion
> ( printing errors or typos ) which I found some corrections for on
> Vadim's weblog <http://vadimtropashko.wordpress.com/%E2%80%9Csql-design-patterns%E2%8...>.  So I don't know if I've run into an error in the book or I've made
>
> an error somewhere along the line.  The later is more likely! :-)
>
> Math is not my strong point so the heavy ( to me ) math explanations
> don't work well for me.  I do better with step by step examples that I
> can tinker with.  I realize that the book is "not suitable for
> beginners" but I'm forging ahead trying to get some working routines
> figured out.
>
> I have a good sized discussion group in a database table that has over
> 243,000 entries.  If I can get Vadim's technique working I'd like to
> try and apply it to my discussion group.
>
> What I have so far could be all wrong so hopefully folks can set me in
> the right direction.
>
> I'm using MySQL v5.0.45.  Here is what I have so far...
>
> /* create table */
>
> CREATE TABLE `MatrixTreeNodes` (
>   `name` varchar(32) default NULL,
>   `materialized_path` varchar(32) default NULL,
>   `a11` int(11) default NULL,
>   `a12` int(11) default NULL,
>   `a21` int(11) default NULL,
>   `a22` int(11) default NULL,
>   UNIQUE KEY `uk_node` (`a11`,`a21`),
>   KEY `fk_adjacency` (`a12`,`a22`)
> ) ENGINE=MyISAM DEFAULT CHARSET=utf8;
>
> ** The materialized_path column is for reference only.  I don't plan
> to use a materialized path column in my final table. **
>
> /* create insert node procedure */
>
> DELIMITER //
> CREATE PROCEDURE insert_node ( IN parent_name VARCHAR ( 32 ), IN
> child_name VARCHAR ( 32 ), child_materialized_path VARCHAR ( 32 ) )
>     BEGIN
>         SELECT a11, a12, a21, a22 INTO _at_parent_a11, @parent_a12,
> _at_parent_a21, @parent_a22 FROM MatrixTreeNodes WHERE name =
> parent_name;
>         SELECT CASE WHEN COUNT( * ) = 0 THEN 0 ELSE MAX( FLOOR( a11 /
> a12 ) ) END + 1 FROM MatrixTreeNodes WHERE a12 = _at_parent_a11 AND a22 =
> _at_parent_a21 INTO @N;
>         INSERT INTO MatrixTreeNodes ( name, materialized_path, a11,
> a12, a21, a22 ) VALUES ( child_name, child_materialized_path,
> _at_parent_a11 * ( @N + 1 ) - @parent_a12, @parent_a11, @parent_a21 *
> ( _at_N + 1 ) - @parent_a22, @parent_a21 );
>     END;
> //
> DELIMITER ;
>
> /* initialize the table with root node */
>
> INSERT INTO `MatrixTreeNodes`
> (`name`,`materialized_path`,`a11`,`a12`,`a21`,`a22`) VALUES
> ('KING','1','2','1','1','0');
>
> ** This could be my first big problem.  Did I use the wrong numbers
> for the root node? **
>
> Now lets do some inserts to fill in the table...
>
> /* insert JONES */
>
> CALL insert_node ( 'KING', 'JONES', '1.1' );
>
> /* insert SCOTT */
>
> CALL insert_node ( 'JONES', 'SCOTT', '1.1.1' );
>
> /* insert ADAMS */
>
> CALL insert_node ( 'SCOTT', 'ADAMS', '1.1.1.1' );
>
> /* insert FORD */
>
> CALL insert_node ( 'JONES', 'FORD', '1.1.2' );
>
> /* insert SMITH */
>
> CALL insert_node ( 'FORD', 'SMITH', '1.1.2.1' );
>
> /* insert BLAKE */
>
> CALL insert_node ( 'KING', 'BLAKE', '1.2' );
>
> /* insert ALLEN */
>
> CALL insert_node ( 'BLAKE', 'ALLEN', '1.2.1' );
>
> /* insert WARD */
>
> CALL insert_node ( 'BLAKE', 'WARD', '1.2.2' );
>
> /* insert MARTIN */
>
> CALL insert_node ( 'BLAKE', 'MARTIN', '1.2.3' );
>
> /* select the data */
>
> SELECT * FROM MatrixTreeNodes;
>
> And we get back...
>
> name    materialized_path   a11 a12 a21 a22
> KING    1                   2   1   1   0
> JONES   1.1                 3   2   2   1
> SCOTT   1.1.1               4   3   3   2
> ADAMS   1.1.1.1             5   4   4   3
> FORD    1.1.2               7   3   5   2
> SMITH   1.1.2.1             11  7   8   5
> BLAKE   1.2                 5   2   3   1
> ALLEN   1.2.1               8   5   5   3
> WARD    1.2.2               13  5   8   3
> MARTIN  1.2.3               18  5   11  3
>
> ** Hopefully the formatting doesn't get too messed up.  If it does I
> could follow up with a CSV version. **
>
> Now let's try some direct descendant queries...
>
> /* direct descendants JONES */
> SELECT child.name FROM MatrixTreeNodes AS parent, MatrixTreeNodes AS
> child WHERE parent.name = 'JONES' AND child.a12 = parent.a11 AND
> child.a22 = parent.a21;
>
> name
> SCOTT
> FORD
>
> /* direct descendants BLAKE */
>
> SELECT child.name FROM MatrixTreeNodes AS parent, MatrixTreeNodes AS
> child WHERE parent.name = 'BLAKE' AND child.a12 = parent.a11 AND
> child.a22 = parent.a21;
>
> name
> ALLEN
> WARD
> MARTIN
>
> /* direct descendants ADAMS */
>
> SELECT child.name FROM MatrixTreeNodes AS parent, MatrixTreeNodes AS
> child WHERE parent.name = 'ADAMS' AND child.a12 = parent.a11 AND
> child.a22 = parent.a21;
>
> ** no results, as expected **
>
> /* direct descendants KING */
>
> SELECT child.name FROM MatrixTreeNodes AS parent, MatrixTreeNodes AS
> child WHERE parent.name = 'KING' AND child.a12 = parent.a11 AND
> child.a22 = parent.a21;
>
> name
> JONES
> BLAKE
>
> Those queries seem to have gone well.  Now lets try some parent
> queries.
>
> /*  parent JONES */
>
> SELECT parent.name FROM MatrixTreeNodes AS parent, MatrixTreeNodes AS
> child WHERE child.name = 'JONES' AND child.a12 = parent.a11 AND
> child.a22 = parent.a21;
>
> name
> KING
>
> /* parent KING */
>
> SELECT parent.name FROM MatrixTreeNodes AS parent, MatrixTreeNodes AS
> child WHERE child.name = 'KING' AND child.a12 = parent.a11 AND
> child.a22 = parent.a21;
>
> ** no results, as expected **
>
> /* parent BLAKE */
>
> SELECT parent.name FROM MatrixTreeNodes AS parent, MatrixTreeNodes AS
> child WHERE child.name = 'BLAKE' AND child.a12 = parent.a11 AND
> child.a22 = parent.a21;
>
> name
> KING
>
> /* parent MARTIN */
>
> SELECT parent.name FROM MatrixTreeNodes AS parent, MatrixTreeNodes AS
> child WHERE child.name = 'MARTIN' AND child.a12 = parent.a11 AND
> child.a22 = parent.a21;
>
> name
> BLAKE
>
> Those seem to have gone OK as well.  Now let's try that pesky
> descendants query...
>
> /* descendants JONES */
>
> SELECT descendant.name FROM MatrixTreeNodes AS descendant,
> MatrixTreeNodes AS node WHERE descendant.a11 * node.a21 >=
> descendant.a21 * node.a11 AND descendant.a11 * node.a22 >=
> descendant.a21 * node.a12 AND node.name = 'JONES';
>
> name
> KING
>
> ** that isn't right **
>
> So I tried a version from the comment section of above errata page
> that Vadim made.
>
> SELECT descendant.name FROM MatrixTreeNodes descendant,
> MatrixTreeNodes node
> WHERE descendant.a11 * node.a21 >= descendant.a21 * node.a11
> AND descendant.a11 * node.a21 - descendant.a11 * node.a22 <=
> node.a11 * descendant.a21 - node.a12 * descendant.a21
> AND node.name = 'JONES';
>
> ** no results, that isn't right **
>
> Any help much appreciated!
>
> Toodle-loooooooo..........
> creecode

create table hierarchy (
name varchar2(10),

a11 integer,
a12 integer,
a21 integer,

a22 integer
);
insert into hierarchy values ('KING', 2,   1,   1,   0);
insert into hierarchy values('JONES',  3,   2,   2,   1);
insert into hierarchy values('SCOTT', 4,   3,   3,   2);
insert into hierarchy values('ADAMS', 5,   4,   4,   3);
insert into hierarchy values('FORD', 7,   3,   5,   2);
insert into hierarchy values('SMITH', 11,  7,   8,   5);
insert into hierarchy values('BLAKE', 5,   2,   3,   1);
insert into hierarchy values('ALLEN',  8,   5,   5,   3);
insert into hierarchy values('WARD', 13, 5, 8, 3); insert into hierarchy values('MARTIN', 18, 5, 11, 3); commit;

select descendant.name
from hierarchy node, hierarchy descendant where node.name = 'JONES'
and descendant.a11*node.a21-descendant.a11*node.a22 >= node.a11*descendant.a21-node.a12*descendant.a21
and descendant.a11 * node.a21 <= descendant.a21 * node.a11;

NAME



JONES
SCOTT
ADAMS
FORD
SMITH Received on Fri Jul 02 2010 - 19:21:38 CEST

Original text of this message