
Re: (long) Sniffing redo logs to maintain cache consistency?

From: Telemachus <tollg_at_tendwa.rns.net>
Date: Thu, 27 Feb 2003 12:10:27 -0000
Message-ID: <Uen7a.12163$V6.16434@news.indigo.ie>


Been through this before, Nuno?
(grin)

I have to say that post should be stapled to developers' foreheads ...

I'm going to copy it to a couple of ours who are thinking about this kind of thing.

YDM !
"Noons" <nsouto_at_optusnet.com.au.nospam> wrote in message news:Xns932FE0A252F12Tokenthis_at_210.49.20.254...
> Following up on Andrej Gabara, 27 Feb 2003:
>
> > Could not find a good definition on the web,
>
> Bingo! Precisely. But you think you need to
> "put it out of the database". Put out what, if you
> can't see a definition anywhere? See what I mean?
>
> Business logic is the set of operations the application
> code has to perform to respond to a business transaction
> request.
>
> Data access logic is the set of data constraints and data
> manipulation logic that defines how data interacts together,
> how it is stored and how it is retrieved to be used by consumers.
>
>
>
> > My concern is on (1), because it causes objects in a cache
> > to become invalid. I don't worry about (2). Also, I'm talking only
> > about caching object states, not caching any calculated data.
> > Some of (2) can be done just by a simple DB query, which
> > could bypass the cache.
> >
>
> Transactional model is what you want. As soon as you use
> the expression "object states", that is EXACTLY what you are
> dealing with. It's a very old field and most of its problems
> have been worked out now.
>
> There are many solutions. One that is very dear to the OO brigade
> (but is impossible to implement) is the concept that all data can
> be cached and serialization should only be done at startup and
> shutdown. I can't even begin to describe how insane this
> concept really is. It used to be used at the beginning of any
> discussion on transactional models to describe why transactions
> themselves were needed!
>
> It has however the advantage that there is no data refresh conflict
> with caches. Of course, it is so impractical that it's much better
> to look at other techniques and forget the OO dream upfront,
> instead of wasting time chasing it. That gets costly...
> All you have to do is make sure that the user is aware of the
> compromise.
>
> An example solution is presented below (there are many more!). It is
> dramatically more efficient for keeping your object cache and your
> data cache in sync, and it imposes only minor restrictions on user
> friendliness. It works and is dirt easy to implement.
>
>
> > Do you agree that (1) is a problem for effective caching of
> > those java objects?
>
> No, not at all. An object cache is NOT the same as a data cache.
>
> > The problem is with stored procedures
> > that update data in the tables affecting cached objects,
> > without being obvious to the app server what those changes
> > are when calling the stored procedure.
> >
>
> There you go again. The stored procedures do NOT update
> data that is in cached objects out of their own initiative.
> Stored procedures have no willpower of their own!
>
> All they can do is respond to a call from Java to go and do something.
> That's all. You initiate the procedure call from a business
> object in the cache, through the DAO.
>
>
>
> Use this transactional model:
>
>
> you create your objects in your cache. In Java. Do NOT concern
> yourself yet with the db side of them. Just make sure you got
> your object model sorted out. Do NOT, I repeat, do NOT include
> hierarchical data dependencies in your object model. Those
> are data dependencies, not object dependencies!
>
> That is by FAR the most common error in mapping objects to
> relational. Yes, I know, all the books say that OO supports
> hierarchical very well. True, if all it has to do is work
> with the 3-table-models you see in all the examples AND it
> doesn't have to serialise objects.
>
> Otherwise, you are in deep caca if you try to include hierarchy
> handling in your object model. It will all become clear with an
> example.
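
A minimal sketch of what such a "flat" business object might look like. All names here are illustrative, not from the original post: the point is that the dependency is carried as a plain key, not as a reference to another cached object.

```java
// A flat business object: the hierarchy is a plain key (parentTaskId),
// NOT a Java reference to another Task. The data layer owns the
// hierarchy; the object just carries the key and its row timestamp.
public class Task {
    private final long id;
    private final Long parentTaskId;   // data reference only; may be null
    private String state;              // e.g. "ready", "completed"
    private final long dbTimestamp;    // row version read from the db

    public Task(long id, Long parentTaskId, String state, long dbTimestamp) {
        this.id = id;
        this.parentTaskId = parentTaskId;
        this.state = state;
        this.dbTimestamp = dbTimestamp;
    }

    public long getId() { return id; }
    public Long getParentTaskId() { return parentTaskId; }
    public String getState() { return state; }
    public long getDbTimestamp() { return dbTimestamp; }
    public void setState(String state) { this.state = state; }
}
```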
>
> Let's look at the example you gave:
>
> > A user opens up a project, looks up his task,
> > and sets the task state to "completed". The business logic will
> > change the task's state to "completed" and updates any
> > dependent tasks, setting those to "ready".
>
>
> OK, there are steps here, aren't there? There is a series of steps
> that need to be handled by the Java logic IN the object cache.
> And data consequences of those steps that are handled by the
> data layer.
>
> Let's assume you have a tasks table. The tasks table
> needs to support hierarchical dependencies. Ie, a task may
> have dependent tasks. But that is a data rule. We handle
> that with an FK constraint back to the same table.
>
> So, you NEED something that will read/update/re-write/delete
> from that data model in a consistent manner such that
> it acts like a simple task table for your workflow app.
>
> The process that does that is data access logic and has NOTHING
> to do with business logic. It is this process that you should be using
> your stored procedures for. And it should be handled (called, results
> verified, etc) through a Data Access Object in your DAO layer.
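
One hypothetical shape for that DAO contract (names are mine, not from the post): the stored procedure behind it does the timestamp check and any cascade, and the DAO just reports the outcome back to the cached object.

```java
// Hypothetical DAO contract for the scheme described above.
// The stored procedure does the real work; the DAO only relays
// the outcome so the cached object can react.
public interface TaskDao {
    enum Result {
        OK,        // update applied, no dependents touched
        CASCADED,  // update applied AND dependent tasks changed
        STALE      // timestamps differ: invalidate the cached object
    }
    Result completeTask(long taskId, long cachedTimestamp);
}
```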
>
> The steps in your use case above are handled by methods in
> objects cached in your Java cache.
>
> They initiate a transaction by grabbing a transaction context
> out of J2EE. This will now be in action until the
> business code has terminated the transaction. Nothing new
> here.
>
> Now, updating the task's state is trivial. The problem is
> when you have to "cascade" to other dependent tasks.
> How do you handle that WITH the Java cache?
>
> Answer: you don't. You let the database handle that. It's
> already defined there, you don't need to duplicate it in
> your objects!
>
> In fact, if you did, you'd have to duplicate the
> ENTIRE data model in your Java objects, and cache ALL data
> at ALL times. I can GUARANTEE you such an application would
> NEVER scale to 10000 users, because hardware that can
> cost-efficiently handle that sort of cache does not exist.
> Today or in the near or far future.
>
> So, stop right there: all you are doing is buying yourself
> a bucket-load of problems in trying to convince someone
> that their Tera-byte db has to live in memory all the time.
> You will NOT manage to do it, believe me!
>
>
> OK, so how do you handle the problem?
>
> Let's follow the steps of a possible solution.
>
> User asks for a task to be marked as complete.
>
> Your object code does whatever it has to do to make sure
> the screen shows that task as complete. And it
> eventually calls the DAO to go and make a change to the
> task table in the db. The DAO now calls the stored procedure
> to go and change the task to complete. It passes to it
> the data for the table that populated the object instance
> in the object cache.
>
> The first thing the stored procedure does is compare the
> timestamps of the data in the object cache (obtained from the db
> last time the data was read into the object cache) and the
> timestamp of the row in the db now. If they are different, the
> update is IMMEDIATELY REJECTED and the DAO passes an "invalidate
> object" back to the cached object.
>
> Which now reacts by sending a message to the user that
> "the data in your screen is stale, please re-query".
> User obediently re-queries and all is well, he gets the
> real data now. The task he wanted to complete is now
> complete, so he doesn't have to do anything, just go home.
> (not really, but you get the picture)
>
> If the timestamps are the same, the code in the stored
> procedure marks the task as complete and updates the
> timestamp. AND it marks any other dependent tasks as
> complete as well, with changes to THEIR timestamps.
> Ie, it cascades. This is all done in the db, by the stored
> procedure, with no intervening JDBC anywhere.
>
> When finished, it returns to the DAO a status saying:
> "I have cascaded!".
> The DAO returns that to the object in cache and a method
> somewhere in there is activated to send a little blink to
> the user that "this has caused dependent tasks to be changed".
> Just a nice warning. Possibilities in there too for other
> handling, but I won't go into it now.
>
> Now the user keeps on working and eventually selects one of the tasks
> that was cached in the object cache, was changed in the db as a result
> of a cascade, but the object cache knows nothing about it.
>
> He then tries to set it to complete. Object code calls the DAO,
> this one calls the stored proc, this one checks the timestamps
> and BANG!, we are at step one of this little exercise above.
>
> See? Bingo, there is your object cache to data cache synch
> done on the cheap, efficiently and with minimal impact to
> the user.
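
The whole round trip above can be simulated in plain Java. This is only a sketch: an in-memory map stands in for the tasks table and for the stored procedure's check-and-cascade logic, and every name is illustrative.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for the tasks table plus the stored procedure's
// optimistic timestamp check and cascade. Illustrative only.
public class TaskStore {
    public enum Result { OK, CASCADED, STALE }

    static class Row {
        String state;
        long timestamp;
        List<Long> dependents;   // ids of tasks that cascade along
        Row(String state, long timestamp, List<Long> dependents) {
            this.state = state; this.timestamp = timestamp; this.dependents = dependents;
        }
    }

    private final Map<Long, Row> table = new HashMap<>();
    private long clock = 1000;   // stands in for the db's timestamp source

    public void insert(long id, String state, List<Long> dependents) {
        table.put(id, new Row(state, clock++, dependents));
    }

    public long timestampOf(long id) { return table.get(id).timestamp; }
    public String stateOf(long id)   { return table.get(id).state; }

    // What the "stored procedure" does: reject a stale update outright,
    // otherwise complete the task and cascade to dependents, giving
    // every changed row a NEW timestamp (so THEIR cached copies go stale).
    public Result completeTask(long id, long cachedTimestamp) {
        Row row = table.get(id);
        if (row.timestamp != cachedTimestamp) {
            return Result.STALE;          // "your screen is stale, re-query"
        }
        row.state = "completed";
        row.timestamp = clock++;
        if (row.dependents.isEmpty()) {
            return Result.OK;
        }
        for (long dep : row.dependents) {
            Row d = table.get(dep);
            d.state = "completed";
            d.timestamp = clock++;        // invalidates any cached copy
        }
        return Result.CASCADED;           // "I have cascaded!"
    }
}
```

Completing task 1 cascades to task 2; a later attempt to complete task 2 with its old timestamp is rejected, and a fresh read then succeeds — exactly the flow described above.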
>
>
> > For a business logic case like (2) a stored procedure would
> > probably outperform Java, because the stored procedure is
> > always closer to the data it is accessing. And a lot of
> > those accesses can be multiple complex queries.
>
> It will ALWAYS outperform Java, by several orders
> of magnitude. For many reasons that don't come into play here,
> but mainly to do with how JDBC works and J2EE performs
> I/O via EJB's.
>
>
>
> > (1) it may not be, because of cache coherency. But if most
> > objects are cached in the app server and are valid, then
> > Java for case (2) could also outperform PL/SQL;
>
>
> Of course! You wouldn't even GO to the db if the objects
> were ALWAYS in the cache, would you?
>
> You will find however that in the real world and with
> 10000 users on the system, the chances of you having
> sufficient hardware that will cache "most objects" like you
> say above is very remote! And all you need is one to not be
> cached and all crumbles...
>
>
> > model and not the data model. That is why I favor a model
> > where most business logic is implemented in Java and none
> > or very little in PL/SQL.
>
>
> One thing should not have anything to do with the other,
> as I hope I have explained.
>
> And a model that needs all object instances to be permanently
> in cache in order for it to be workable is to me something
> that is seriously flawed.
>
> Not to use the correct expression, which should be "deranged"!
> That is the problem with the "everything in Java" model: it
> simply cannot work with real world hardware and real world volumes.
>
> Not to mention that little problem called TCO: there is no
> way you can convince anyone to spend the moolah for that
> sort of "scalable" solution...
>
> Of course, it works PERFECTLY in the 3-table examples with two
> rows each. That is not however the real world, I'm afraid!
>
>
> > To be able to take advantage of
> > an effective java object cache, you have to make good
> > decisions of where you put the business logic.
> >
>
>
> But you also have to define and separate what is
> business logic and what is data access logic.
> There is NO WAY in the world you can reliably scale a
> model that relies on having an entire database in memory!
> That is just "Java-guru lah-lah-land".
>
> It's total bull: no commercial software/hardware combo ANYWHERE
> operates that way and is scalable to any significant number
> of users while being reliable, resilient and cost effective.
>
> Don't waste your time with snake-oil solutions: they have the
> bad habit of turning around and biting your hip-pocket.
> EVERY SINGLE TIME!
>
>
>
> --
> Cheers
> Nuno Souto
> nsouto_at_optusnet.com.au.nospam
Received on Thu Feb 27 2003 - 06:10:27 CST
