Re: Design question regarding event log table processing

From: mohamed houri <mohamed.houri_at_gmail.com>
Date: Fri, 25 Jan 2013 08:39:35 +0100
Message-ID: <CAJu8R6hko7ha=3Mnmvg3HmLT2f7QxD7Yf6Aiei2HifT9oMTSsQ_at_mail.gmail.com>



I will suppose that the events monitoring table, in which you will be collecting your events, is inserted into only by the application. I will also suppose that the Oracle batch job is the only job which can update this monitoring table in order to signal the event processing status. In such a situation I would consider 3 extra columns (a small DDL sketch follows the list):

· processed_status

       0 = event is ready to be processed

       1 = event is currently being processed

       2 = event has been successfully processed

       3 = event was in error the last time it was processed

· processed_date

NULL when the event is ready to be processed, not NULL for the other statuses

· processed_error

Not NULL when the event was processed with an error
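
For illustration only, here is a minimal DDL sketch of such a table. The table name event_log, the datatypes and the payload column are my assumptions, not something imposed by your design:

  -- Hypothetical events monitoring table; names and datatypes are assumptions.
  CREATE TABLE event_log (
    event_id         NUMBER          NOT NULL,             -- whatever identifies an event in your design
    event_payload    VARCHAR2(4000),                       -- the collected event details
    created_date     DATE            DEFAULT SYSDATE NOT NULL,
    processed_status NUMBER(1)       DEFAULT 0 NOT NULL,   -- 0 ready, 1 in progress, 2 done, 3 in error
    processed_date   DATE,                                 -- NULL until the batch job touches the row
    processed_error  VARCHAR2(4000),                       -- NULL unless the last attempt failed
    CONSTRAINT event_log_pk PRIMARY KEY (event_id),
    CONSTRAINT event_log_status_chk CHECK (processed_status IN (0, 1, 2, 3))
  );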

As such, the Oracle job will start by

  1. updating the events having processed_status in (0, 3) to processed_status = 1, signalling to any other batch job its intention to treat these specific events
  2. then selecting the events having processed_status = 1 (the events updated above) and processing them event by event
  3. each time an event is successfully processed, updating its status from 1 to 2 and also setting its processed_error to NULL; conversely, when an event fails, setting its status to 3 and processed_error to the appropriate error (a PL/SQL sketch of this flow follows the list)
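
Purely as an illustration of these three steps, here is a minimal PL/SQL sketch. The event_log table and the process_event procedure are assumptions of mine, and a real job would also have to think about restartability, locking and commit frequency:

  -- Hypothetical sketch; event_log and process_event are assumptions, not your actual objects.
  DECLARE
    l_error VARCHAR2(4000);
  BEGIN
    -- Step 1: claim the events that are ready (0) or were in error (3)
    UPDATE event_log
       SET processed_status = 1
     WHERE processed_status IN (0, 3);
    COMMIT;

    -- Step 2: process the claimed events one by one
    FOR r IN (SELECT event_id FROM event_log WHERE processed_status = 1) LOOP
      BEGIN
        process_event(r.event_id);        -- your business processing, assumed here
        -- Step 3a: success
        UPDATE event_log
           SET processed_status = 2,
               processed_date   = SYSDATE,
               processed_error  = NULL
         WHERE event_id = r.event_id;
      EXCEPTION
        WHEN OTHERS THEN
          -- Step 3b: failure, keep the error for the next run
          l_error := SUBSTR(SQLERRM, 1, 4000);
          UPDATE event_log
             SET processed_status = 3,
                 processed_date   = SYSDATE,
                 processed_error  = l_error
           WHERE event_id = r.event_id;
      END;
      COMMIT;
    END LOOP;
  END;
  /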

If, during the batch processing, a new record is created in the events monitoring table, it will have processed_status = 0, allowing it to be considered by the next run of the batch.

I have to warn you that if you create an index on processed_status, this index could become less efficient over its lifecycle because the column has transient values: a row starts with status = 0, then swaps very quickly to 1, and finishes at 2 in the majority of cases.
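
If you still want an index to find the rows waiting for processing, one classic mitigation, close to the functional index you mention below, is to index only the statuses 0 and 3 so that processed rows are simply absent from the index. A sketch, assuming the event_log table above:

  -- Function-based index containing only rows with status 0 or 3;
  -- rows with status 1 or 2 map to NULL and are not stored in the index.
  CREATE INDEX event_log_todo_ix
    ON event_log (CASE WHEN processed_status IN (0, 3) THEN processed_status END);

  -- The batch job must use the same expression for the index to be usable:
  -- WHERE CASE WHEN processed_status IN (0, 3) THEN processed_status END IS NOT NULL

Because fully processed rows are not stored, the index stays small even when the table keeps growing.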

There are nevertheless a few other questions to clarify. What identifies a record in your event monitoring table? Is it possible to process duplicate records? What will “processing an event” actually do? Will it go back and take information from the database corresponding to the event identifier being treated? If so, have you considered the effect of read consistency and of treating the same information twice by two different runs of the job?

Best Regards

Mohamed Houri

www.hourim.wordpress.com

2013/1/25 Niall Litchfield <niall.litchfield_at_gmail.com>

> Hi
> I can think of at least 4 patterns (hate that word) for this type of
> problem. Two of these you already have. These are
> 1. Staging table with processed flag
> 2. Oracle's AQ event processing technology
> 3. Partition load (ELT) on a per source, per period basis
> 4. Batch inserts (either DB or app based)
>
> You don't say what happens to processed rows, how complex the processing
> is, what response time criteria you have, and what, if any, processing time
> goals you have. 1 and 4 are likely the simplest and most easily maintained,
> 2 and 3 probably the most scalable.
> On Jan 24, 2013 6:05 PM, "Bruno Lavoie" <bl_at_brunol.com> wrote:
>
> > Hello,
> > In our application, we are going to record some specific user-triggered
> > events. These triggered events are often specific page accesses and user
> > clicks on some web page controls.
> >
> > At first, these events will be buffered on the app servers and the event
> > inserts batched to a table to limit contention from excessive commits. Each
> > app server, five at the moment, will have its own buffer and async process
> > for batching at a regular interval, or intelligent batching based on the
> > current workload. As you can see, this architectural part will be much like
> > the redo log buffer and the lgwr process.
> >
> > Once inserted, each event needs to be processed to derive some base
> > statistics. These statistics are generally cumulative, e.g. counters and
> > sums. Using a regular Oracle batch job we only want to process the new event
> > records since the last run. So we want to have the best possible design
> > (simple, performing well, scalable, minimal redo, etc.) regarding this
> > incremental event log processing.
> >
> > We are thinking that having a specific functional index containing only
> > the rows waiting for processing is the best way to advance in the event log.
> > Many implementations can be made to have this functional index: keep a row
> > processing timestamp and index only the rows with NULL timestamps, OR simply
> > a less verbose flag. We also know not to use bitmap indexes as they are not
> > scalable under concurrent insert/update. We also know that timestamp
> > pointers can be problematic with the way read consistency and
> > multi-versioning work: a record with a past timestamp can appear
> > after the current timestamp pointer due to possible commit delays.
> >
> > I cannot see any other traps to avoid or any other way to do this. One
> > size doesn't fit all, but is there a feature or proven pattern for this
> > type of process?
> >
> > We investigated and loaded the access logs from our app servers. During peak
> > periods we got around 140 events (only those we are interested in, for all
> > app server nodes) per second. We'll have a workload increase that will bring
> > these numbers higher. To be rock solid I think that our solution needs to
> > sustain hypothetical peaks of 500 events per second. Overall this is
> > not only about peak seconds: on current peak days we got 2,000,000 events
> > a day, and this number will be pushed up too.
> >
> > Other details:
> > - Database 11g
> > - 32 GB RAM (and easily more)
> > - ASM, we'll probably calibrate operational and this «warehouse» disk IOs
> > on different diskgroups.
> > - Windows 64 bits
> > - no partitioning yet, but I think that this option will be necessary
> > - no OLAP option yet; the current needs are very specific, but we want to
> > log the maximum detail in order to, maybe, have more advanced things like
> > OLAP, mining, etc.
> >
> > If any other details are needed, ask me.
> >
> > Regards
> > Bruno Lavoie, OCA 11g
> > bl_at_brunol.com
> >
> >
> >
> >
>
>
>
>

-- 
Respectfully yours
Mohamed Houri

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Jan 25 2013 - 08:39:35 CET
