Getting a grip on nested intervals and matrix encoding

From: creecode <creecode_at_gmail.com>
Date: Thu, 1 Jul 2010 14:05:39 -0700 (PDT)
Message-ID: <dc59e647-aa9c-43ea-aa5b-7e6134e64a45_at_w15g2000pro.googlegroups.com>



Hello all,

I'm attempting to use Vadim Tropashko's nested intervals and matrix encoding technique. As described in chapter 5 of his "SQL Design Patterns" book and also in various online articles and discussion groups.

I thought I was making some progress but I ran into a snag when I tried a descendants query. In the book there is some confusion ( printing errors or typos ) which I found some corrections for on Vadim's weblog < http://vadimtropashko.wordpress.com/%E2%80%9Csql-design-patterns%E2%80%9D-book/errata/ >. So I don't know if I've run into an error in the book or I've made an error somewhere along the line. The later is more likely! :-)

Math is not my strong point so the heavy ( to me ) math explanations don't work well for me. I do better with step by step examples that I can tinker with. I realize that the book is "not suitable for beginners" but I'm forging ahead trying to get some working routines figured out.

I have a good sized discussion group in a database table that has over 243,000 entries. If I can get Vadim's technique working I'd like to try and apply it to my discussion group.

What I have so far could be all wrong so hopefully folks can set me in the right direction.

I'm using MySQL v5.0.45. Here is what I have so far...

/* create table */

CREATE TABLE `MatrixTreeNodes` (

  `name` varchar(32) default NULL,
  `materialized_path` varchar(32) default NULL,
  `a11` int(11) default NULL,
  `a12` int(11) default NULL,
  `a21` int(11) default NULL,
  `a22` int(11) default NULL,

  UNIQUE KEY `uk_node` (`a11`,`a21`),
  KEY `fk_adjacency` (`a12`,`a22`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
  • The materialized_path column is for reference only. I don't plan to use a materialized path column in my final table. **

/* create insert node procedure */

DELIMITER //
CREATE PROCEDURE insert_node ( IN parent_name VARCHAR ( 32 ), IN child_name VARCHAR ( 32 ), child_materialized_path VARCHAR ( 32 ) )

    BEGIN
        SELECT a11, a12, a21, a22 INTO _at_parent_a11, @parent_a12, _at_parent_a21, @parent_a22 FROM MatrixTreeNodes WHERE name = parent_name;

        SELECT CASE WHEN COUNT( * ) = 0 THEN 0 ELSE MAX( FLOOR( a11 / a12 ) ) END + 1 FROM MatrixTreeNodes WHERE a12 = _at_parent_a11 AND a22 = _at_parent_a21 INTO @N;

        INSERT INTO MatrixTreeNodes ( name, materialized_path, a11, a12, a21, a22 ) VALUES ( child_name, child_materialized_path, _at_parent_a11 * ( @N + 1 ) - @parent_a12, @parent_a11, @parent_a21 * ( _at_N + 1 ) - @parent_a22, @parent_a21 );

    END;
//
DELIMITER ;
/* initialize the table with root node */

INSERT INTO `MatrixTreeNodes`
(`name`,`materialized_path`,`a11`,`a12`,`a21`,`a22`) VALUES ('KING','1','2','1','1','0');

  • This could be my first big problem. Did I use the wrong numbers for the root node? **

Now lets do some inserts to fill in the table...

/* insert JONES */

CALL insert_node ( 'KING', 'JONES', '1.1' );

/* insert SCOTT */

CALL insert_node ( 'JONES', 'SCOTT', '1.1.1' );

/* insert ADAMS */

CALL insert_node ( 'SCOTT', 'ADAMS', '1.1.1.1' );

/* insert FORD */

CALL insert_node ( 'JONES', 'FORD', '1.1.2' );

/* insert SMITH */

CALL insert_node ( 'FORD', 'SMITH', '1.1.2.1' );

/* insert BLAKE */

CALL insert_node ( 'KING', 'BLAKE', '1.2' );

/* insert ALLEN */

CALL insert_node ( 'BLAKE', 'ALLEN', '1.2.1' );

/* insert WARD */

CALL insert_node ( 'BLAKE', 'WARD', '1.2.2' );

/* insert MARTIN */

CALL insert_node ( 'BLAKE', 'MARTIN', '1.2.3' );

/* select the data */

SELECT * FROM MatrixTreeNodes;

And we get back...

name materialized_path a11 a12 a21 a22

KING    1                   2   1   1   0
JONES   1.1                 3   2   2   1
SCOTT   1.1.1               4   3   3   2
ADAMS   1.1.1.1             5   4   4   3
FORD    1.1.2               7   3   5   2
SMITH   1.1.2.1             11  7   8   5
BLAKE   1.2                 5   2   3   1
ALLEN   1.2.1               8   5   5   3
WARD    1.2.2               13  5   8   3
MARTIN  1.2.3               18  5   11  3

  • Hopefully the formatting doesn't get too messed up. If it does I could follow up with a CSV version. **

Now let's try some direct descendant queries...

/* direct descendants JONES */

SELECT child.name FROM MatrixTreeNodes AS parent, MatrixTreeNodes AS child WHERE parent.name = 'JONES' AND child.a12 = parent.a11 AND child.a22 = parent.a21;

name
SCOTT
FORD
/* direct descendants BLAKE */

SELECT child.name FROM MatrixTreeNodes AS parent, MatrixTreeNodes AS child WHERE parent.name = 'BLAKE' AND child.a12 = parent.a11 AND child.a22 = parent.a21;

name
ALLEN
WARD
MARTIN
/* direct descendants ADAMS */

SELECT child.name FROM MatrixTreeNodes AS parent, MatrixTreeNodes AS child WHERE parent.name = 'ADAMS' AND child.a12 = parent.a11 AND child.a22 = parent.a21;

  • no results, as expected **

/* direct descendants KING */

SELECT child.name FROM MatrixTreeNodes AS parent, MatrixTreeNodes AS child WHERE parent.name = 'KING' AND child.a12 = parent.a11 AND child.a22 = parent.a21;

name
JONES
BLAKE Those queries seem to have gone well. Now lets try some parent queries.

/* parent JONES */

SELECT parent.name FROM MatrixTreeNodes AS parent, MatrixTreeNodes AS child WHERE child.name = 'JONES' AND child.a12 = parent.a11 AND child.a22 = parent.a21;

name
KING
/* parent KING */

SELECT parent.name FROM MatrixTreeNodes AS parent, MatrixTreeNodes AS child WHERE child.name = 'KING' AND child.a12 = parent.a11 AND child.a22 = parent.a21;

  • no results, as expected **

/* parent BLAKE */

SELECT parent.name FROM MatrixTreeNodes AS parent, MatrixTreeNodes AS child WHERE child.name = 'BLAKE' AND child.a12 = parent.a11 AND child.a22 = parent.a21;

name
KING
/* parent MARTIN */

SELECT parent.name FROM MatrixTreeNodes AS parent, MatrixTreeNodes AS child WHERE child.name = 'MARTIN' AND child.a12 = parent.a11 AND child.a22 = parent.a21;

name
BLAKE Those seem to have gone OK as well. Now let's try that pesky descendants query...

/* descendants JONES */

SELECT descendant.name FROM MatrixTreeNodes AS descendant, MatrixTreeNodes AS node WHERE descendant.a11 * node.a21 >= descendant.a21 * node.a11 AND descendant.a11 * node.a22 >= descendant.a21 * node.a12 AND node.name = 'JONES';

name
KING

  • that isn't right **

So I tried a version from the comment section of above errata page that Vadim made.

SELECT descendant.name FROM MatrixTreeNodes descendant, MatrixTreeNodes node
WHERE descendant.a11 * node.a21 >= descendant.a21 * node.a11 AND descendant.a11 * node.a21 - descendant.a11 * node.a22 <= node.a11 * descendant.a21 - node.a12 * descendant.a21 AND node.name = 'JONES';

  • no results, that isn't right **

Any help much appreciated!

Toodle-loooooooo..........
creecode Received on Thu Jul 01 2010 - 23:05:39 CEST

Original text of this message