Re: Design question regarding event log table processing

From: Niall Litchfield <niall.litchfield_at_gmail.com>
Date: Fri, 25 Jan 2013 07:06:17 +0000
Message-ID: <CABe10sYGvs-GcQNb-OeOUSx2JypbnVhVryjD7Zontry91=w52g_at_mail.gmail.com>



Hi
I can think of at least 4 patterns (hate that word) for this type of problem. Two of these you already have:

1. Staging table with a processed flag
2. Oracle's AQ event processing technology
3. Partition load (ELT) on a per-source, per-period basis
4. Batch inserts (either DB- or app-based)

You don't say what happens to processed rows, how complex the processing is, what response time criteria you have, and what, if any, processing time goals you have. Options 1 and 4 are likely the simplest and most easily maintained; 2 and 3 are probably the most scalable.
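
A minimal sketch of option 1, with all names invented for illustration:

  create table event_log (
    event_id     number         primary key,
    event_ts     timestamp      not null,
    payload      varchar2(4000),
    processed_ts timestamp      -- null until the batch job claims the row
  );

  -- the batch job reads the rows where processed_ts is null,
  -- aggregates them, and stamps them in the same transaction so
  -- that each row is counted exactly once
  update event_log
     set processed_ts = systimestamp
   where processed_ts is null;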
On Jan 24, 2013 6:05 PM, "Bruno Lavoie" <bl_at_brunol.com> wrote:

> Hello,
> In our application, we are going to record some specific user-triggered
> events. These events are typically accesses to specific pages and user
> clicks on certain web page controls.
>
> At first, these events will be buffered on the app servers and the
> inserts batched into a table to limit the contention that excessive
> commits would cause. Each app server, five at the moment, will have its
> own buffer and an async process that flushes it at a regular interval,
> or adaptively based on the current workload. As you can see, this part
> of the architecture works much like the redo log buffer and the LGWR
> process.
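>
> A sketch of the flush we have in mind (PL/SQL; the table, type, and
> variable names are placeholders):
>
>   declare
>     type event_tab is table of event_log%rowtype;
>     l_events event_tab := event_tab();  -- filled from the app buffer
>   begin
>     -- one round trip and one commit for the whole batch
>     forall i in 1 .. l_events.count
>       insert into event_log values l_events(i);
>     commit;
>   end;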
>
> Once inserted, each event needs to be processed to derive some base
> statistics. These statistics are generally cumulative, e.g. counters
> and sums. Using a regular Oracle batch job, we want to process only the
> event records that are new since the last run. So we want the best
> possible design (simple, well performing, scalable, minimal redo, etc.)
> for this incremental event log processing.
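>
> As a sketch, assuming the null-until-processed timestamp column
> described below (the statistics table and its columns are invented),
> each run could look like this:
>
>   -- fold the pending rows into cumulative counters ...
>   merge into page_stats s
>   using (select page_id, count(*) as hits
>            from event_log
>           where processed_ts is null
>           group by page_id) e
>   on (s.page_id = e.page_id)
>   when matched then
>     update set s.hits = s.hits + e.hits
>   when not matched then
>     insert (page_id, hits) values (e.page_id, e.hits);
>
>   -- ... then stamp the same rows; note that at read committed these
>   -- two statements can see different rows, so either run serializable
>   -- or collect the processed ids first
>   update event_log
>      set processed_ts = systimestamp
>    where processed_ts is null;
>   commit;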
>
> We are thinking that a functional index containing only the rows
> waiting for processing is the best way to advance through the event
> log. Several implementations of this functional index are possible:
> keep a per-row processing timestamp and index only the rows where it is
> null, or simply use a less verbose flag. We also know not to use bitmap
> indexes, as they do not scale under concurrent inserts/updates. We also
> know that timestamp pointers can be problematic given the way read
> consistency and multi-versioning work: a record carrying an earlier
> timestamp can become visible after the current timestamp pointer has
> already moved past it, because its transaction committed late.
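>
> The null-timestamp variant would look something like this (index name
> invented):
>
>   -- a b-tree index stores no entry when the whole key is null, so
>   -- this index contains only the rows still waiting for processing
>   create index event_pending_ix
>     on event_log (case when processed_ts is null then event_id end);
>
>   -- the batch job must repeat the same expression to use the index
>   select event_id
>     from event_log
>    where case when processed_ts is null then event_id end is not null;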
>
> I cannot see any other traps to avoid or any other way to do this. One
> size doesn't fit all, but is there a feature or proven pattern for this
> type of process?
>
> We investigated and loaded access logs from our app servers. During
> peak periods we got around 140 events per second (counting only those
> we are interested in, across all app server nodes). A workload increase
> will push these numbers higher. To be rock solid, I think our solution
> needs to sustain hypothetical peaks of 500 events per second. And it is
> not only about peak seconds: on current peak days we get about
> 2,000,000 events a day, and this number will be pushed up too.
>
> Other details:
> - Database 11g
> - 32 GB RAM (and easily more)
> - ASM; we'll probably separate the operational and this «warehouse»
> disk I/O onto different diskgroups
> - Windows 64-bit
> - no partitioning yet, but I think that option will become necessary
> - no OLAP option yet; the current needs are very specific, but we want
> to log the maximum detail so that, maybe, we can later do more advanced
> things like OLAP, mining, etc.
>
> If any other details are needed, ask me.
>
> Regards
> Bruno Lavoie, OCA 11g
> bl_at_brunol.com
>

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Jan 25 2013 - 08:06:17 CET
