Re: Mixing OO and DB

From: Bob Badour <bbadour_at_pei.sympatico.ca>
Date: Wed, 14 May 2008 23:51:45 -0300
Message-ID: <482ba547$0$4046$9a566e8b_at_news.aliant.net>


Bob Badour wrote:

> topmind wrote:
>

>> Robert Martin wrote:
>>
>>> On 2008-03-09 01:02:47 -0600, Marshall <marshall.spight_at_gmail.com> said:
>>>
>>>
>>>> On Mar 8, 6:07 pm, Robert Martin <uncle..._at_objectmentor.com> wrote:
>>>>
>>>>> On 2008-03-06 15:37:56 -0600, topmind <topm..._at_technologist.com> said:
>>>>>
>>>>>
>>>>>>> Each small group of classes becomes a little roll-your-own data 
>>>>>>> access
>>>>>>> and manipulation scheme that is perfectly tuned for it's very 
>>>>>>> specific
>>>>>>> purpose.
>>>>>
>>>>>
>>>>>> Which is over-kill for the task-level.
>>>>>
>>>>>
>>>>> Do you have proof that it's overkill?  Do you have any objective
>>>>> measurements that it's overkill?  Or it is just your own opinion.  I
>>>>> mean, if it works for you that's great, but don't force your own
>>>>> opinions on everyone else  <grin>
>>>>
>>>>
>>>> This is a fallacious argument. You're proposing extra effort without
>>>> justification. The idea that, in the absence of evidence either way,
>>>> topmind's proposal of not putting in that effort is on equal footing
>>>> with yours doesn't hold. Extra effort requires justification. What
>>>> you are saying is, "hey, we don't know if this work has any value
>>>> or not, so doing it is just as justified as not doing it."
>>>
>>>
>>> Go back to the root of the argument.  You'll see that the initial
>>> premise is that the programmer organizes the data into a form that is
>>> more convenient for him to get his computational job done.  So there
>>> *is* justification.
>>
>>
>> Depending on how "convenient" is measured.
>>
>> Note that the effort to wrap SQL in methods is only one of the issues
>> against it.
>>
>>
>>>>> It is very common for programmers to manipulate data into forms that
>>>>> are particularly convenient for the application they are writing.
>>>>> Databases are seldom in that form since (for one thing) they must
>>>>> usually serve many different and competing applications.
>>>>
>>>>
>>>> (I'm going to just label the above as bogus without justification.
>>>> It's late and I'm lazy.)
>>>
>>>
>>> That's fine.  Consider, for example, an algorithm that finds the
>>> minimum spanning distance of a graph.  (e.g. cheapest network route, or
>>> cheapest travel itinerary, etc).  The node and edges of the graph are
>>> stored in database tables.
>>>
>>> Shall we execute that algorithm by doing thousands of tiny queries as
>>> we walk from node to node through the edges?  Or shall we query all the
>>> nodes and edges in one gulp, arrange them into a graph of objects, and
>>> walk through them that way?

>
>
> It's interesting that the self-aggrandizing ignorant should mention
> minimum spanning trees. Creating a generic procedure for calculating
> minimum spanning trees in SqlServer is on my to-do list as I write.
>
> If one studies the algorithms for minimum spanning trees, one quickly

> sees the task involves no traversals whatsoever. In fact, one generally
> creates the MST as a precursor to some sort of traversal, and the
> algorithms themselves are specified in terms of sets, which makes them
> ideal for implementing relationally.
>
> At the moment, I lean toward Kruskal's algorithm. Mostly, I just don't
> understand Chazelle's algorithm well enough, and I don't have the
> patience to hunt down the remaining details. Plus, it is difficult to
> tease out how Chazelle's approximations might interact with dynamic cost
> functions.
>
> The algorithm relies on a disjoint-set structure, which is just a tuple
> really. One would start by initializing a set of these structures with
> an element for each vertex. Hmmmm... a set of tuples... hmmmm... I
> wonder what sort of variable I could use to hold a set of tuples... I
> think I will use a relvar.
>
> For a general solution we need to associate a cost with each edge.
> Hmmmm... an edge and a cost... hmmmm... I think I will use a tuple for
> each of those and a relvar for the entire set once again.
>
> For the result, we need a set of edges. I think I will use a relvar for
> that too.
>
> Once we initialize these relvars, the algorithm is quite simple:
>
> While our edge-cost relation has at least one tuple:
> Choose one of the edges with the minimum cost.
> Perform a disjoint-set union on the two vertexes joined by the edge.
> Insert the edge to the MST result relvar.
> Delete all edges from the edge-cost relvar
> where the disjoint-set find places them in the same set.
> Done.
>
> The general solution on my to-do list is a little more complicated
> because the cost function can be dynamic. There has to be another step
> in the loop to re-evaluate the costs for any edges where the cost might
> change.
>
> [topmind's somewhat less informed response snipped]

The Delaunay Triangulation code was more complex (proper domain support, which SqlServer lacks, would have helped), but the Voronoi Tessellation code could be as easy as a view. Lloyd's algorithm for k-means clustering was rather straightforward in t-sql once I had the delaunay and voronoi code in place.

My to-do list is now void of computational geometry stuff for the time being. Sigh. Received on Thu May 15 2008 - 04:51:45 CEST

Original text of this message