Scalable parallel (?) read instances and one write instance in a dot-com environment...

From: Todd and Mary Ryder <tryder2_at_home.com>
Date: 1999/11/07
Message-ID: <Bi5V3.20317$Ec1.209778_at_news1.frmt1.sfba.home.com>#1/1


Have what I think is a neat idea for an N-tier scalable (read) database architecture for a high volume (10M users, 30K simultaneous connects) Web site:

1.middle-tier java app servers have "affinity" for specific Oracle instances (search, sort, fetch) for all GET/QUERY form actions - i.e. all GET and QUERY acts are distributed (perhaps by modulus against connected uid) equally across N "read only" servers - the read only part is logical - i.e. we wouldn't use read only tablespaces or anything - you'll see why in a minute

2. all (above) GET/QUERY instances, each run on a separate Unix host, but mount just one database between them, in parallel, using Oracle 8.0.x or 8.1.x Parallel Server Option

3. all POST actions, from all java app servers, goto a SINGLE write-only instance, which does NOT run OPS.

4. setup one of the parallel GET/QUERY instances as a replication slave of the all-write (POST) master DB so that all the GET/QUERY instances periodically synchronize themselves, albeit with some (acceptable) latency resulting from periodic (as opposed to row-level) replication refreshes

Questions:

  1. Having experience with the replication, but not OPS, I am operating under the following assumptions, derived from researching usenet:

Would appreciate any/all feedback regarding my plan/assumptions or even the validity of partitioning reads (but not writes) across N Oracle instances, for scalable performance under massive load in a Web environment

Worst case, I could bag the whole OPS idea, and run stand-alone databases for each of the N, get/query instances, but that would require each to participate in Advanced Replication (as slave) to refresh itself from the lone, all-write master. This would still be acceptable (but more hassle to setup/maintain) I think, because the snapshot refreshes could be restricted at each slave to include only data for logged-in (affinity) users. Thus, total replication network/disk disk IOs would remain X, where X represents ALL writes to the master, instead of N * X, where N is the number of GET/QUERY slaves. Does that make sense? I think it does, as long as each GET/QUERY slave keeps a log table containing uid's of each connection it is supporting.

Sorry for being so wordy, but I just want anyone that might consider responding to really understand my questions/concerns. All feedback is greatly appreciated.

Please respond directly as well as to the group. Thanks in advance.
Todd Ryder
tryder_at_captech.com
tryder2_at_home.com
tryder_at_best.com Received on Sun Nov 07 1999 - 00:00:00 CET

Original text of this message