Distributed joins, are they REALLY NEEDED?

From: Jay <li_at_news.cs.columbia.edu>
Date: 9 Feb 1995 23:33:42 -0500
Message-ID: <CMM.0.90.2.792390818.li_at_ground.cs.columbia.edu>


Hi all,

Forgive me for the cross-posting. I am currently working on my doctoral thesis on distributed join processing algorithms and their performance (e.g., how to join N relations residing at M distributed sites so that the network cost and join processing cost are minimized). To avoid doing something purely academic, I would like to get some insight from hands-on practitioners regarding the need for distributed join handling. In particular, the following questions are of great interest to me:

  • Before you think about performing ad-hoc distributed join queries, won't replicating the referenced relations locally solve your problem better?
  • What is the nature of the application that requires distributed join queries? Are the queries highly selective, that is, do they produce small join results?
  • How large is the data volume involved?

Please reply by email. I will post a summary if there is enough interest.

Regards,

  • Jay Li

Department of Computer Science
Columbia University
New York, NY 10027

Received on Fri Feb 10 1995 - 05:33:42 CET
