Re: measuring performance of scan listeners

From: Martin Berger <martin.a.berger_at_gmail.com>
Date: Tue, 8 Apr 2014 10:09:06 +0200
Message-ID: <CALH8A90CcZVdjnh0rBZh22FLKW22ShqrmwpuJRAcN5GgHvrJ_w_at_mail.gmail.com>



Please be very careful what you want to measure and how you measure it. SCAN listeners do not create a server process themselves. They "only" redirect to one of the local_listeners (exception here: shared servers).

So, to compare the way a client connects to a RAC DB in SCAN mode vs. with VIP listeners only
(this is only one of many scenarios):

1) the client reads the connection definition (let's say tnsnames.ora)

2s) client resolves the SCAN name via DNS
2v) client chooses one of the VIP names and resolves it via DNS

3) client connects to one of the IPs it got in step 2) and asks for the service

4s) SCAN listener tells the client the VIP name of the least loaded server
4v) VIP listener decides whether it is itself the least loaded server; if yes => creates the circuit; if no => tells the client the VIP name of the least loaded server

5) client resolves (via DNS) the name it got back

6) client connects to that IP and gets the circuit created by the VIP listener.

So the big difference here is at point 4: with a probability of 1/(number_of_clusternodes), the VIP-only config does not need the additional round trip of points 5/6.

In the sequence described above, the difference you should expect to measure is (1 x DNS lookup + 1 x network round trip + 1 x listener_redirect_answer + 1 x client_code_to_ask_listener) x (1 / number_of_clusternodes); the 1/n factor is the probability that the VIP-only configuration gets to skip the redirect that SCAN always performs.
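
To give a rough idea of how two of those parts could be isolated (the DNS lookup and the network round trip of a plain TCP connect), a minimal Python sketch might look like this; the host name and port are placeholders, and it deliberately ignores the listener redirect and the client-side driver code:

# Minimal sketch (not Oracle-specific): time the DNS lookup and the plain
# TCP connect separately. HOST and PORT are placeholders for your SCAN or
# VIP name and listener port.
import socket
import time

HOST = "myscan.example.com"   # placeholder SCAN or VIP name
PORT = 1521                   # default listener port

t0 = time.perf_counter()
addrinfo = socket.getaddrinfo(HOST, PORT, socket.AF_INET, socket.SOCK_STREAM)
t1 = time.perf_counter()

family, socktype, proto, _, sockaddr = addrinfo[0]
sock = socket.socket(family, socktype, proto)
sock.connect(sockaddr)
t2 = time.perf_counter()
sock.close()

print("DNS lookup : %.2f ms" % ((t1 - t0) * 1000))
print("TCP connect: %.2f ms" % ((t2 - t1) * 1000))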

So please, if you try to measure the difference, make sure you know about the different parts involved and do sanity checks to see if the result is reasonable.
Otherwise you might end up testing the performance of the test generator (e.g. the DBD implementation in Perl) and not your SCAN listeners.

I always found it very hard to get proper logging with synchronized timing info on such setups. But it is a lot of fun to trace it down.

hth
 Martin

On Mon, Apr 7, 2014 at 6:43 PM, Ryan January <rjjanuary_at_multiservice.com> wrote:

> Could your solution be as simple as a script looping over n connections
> (possibly hundreds or thousands) and a basic select, timing the connection from
> the client end? This is made pretty simple using libraries such as
> cx_Oracle (Python), DBD::Oracle (Perl), or even a shell script wrapping
> sqlplus with the 'time' command.
>
> Run the script against the standard listener and the SCAN listener, comparing
> the results. This gives you something to point back to showing the end-user
> impact it did (or did not) have. If it's only finger pointing you're
> concerned with, providing numbers of actual measured impact should suffice.
>
> We've run similar tests in the past, reporting the max, min, and average
> connection times to show the effects of TNS changes.
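>
> A minimal Python sketch of such a loop (assuming cx_Oracle is installed;
> user, password, DSN, and iteration count are placeholders) might look like
> this; run it once against the SCAN alias and once against the per-database
> listener alias and compare the numbers:
>
> import time
> import cx_Oracle
>
> DSN = "myscan.example.com:1521/MYSERVICE"   # placeholder connect string
> USER = "scott"                              # placeholder credentials
> PASSWORD = "tiger"
> N = 100                                     # number of connections to time
>
> times = []
> for _ in range(N):
>     t0 = time.perf_counter()
>     conn = cx_Oracle.connect(USER, PASSWORD, DSN)   # time connect + basic select
>     cur = conn.cursor()
>     cur.execute("select 1 from dual")
>     cur.fetchone()
>     times.append(time.perf_counter() - t0)
>     cur.close()
>     conn.close()
>
> print("min %.1f ms  max %.1f ms  avg %.1f ms" % (
>     min(times) * 1000, max(times) * 1000, sum(times) / len(times) * 1000))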
>
>
> On 04/07/2014 10:48 AM, Dba DBA wrote:
>
> Yeah. I didn't think there was much to go on. However, I am getting client
> pushback.
>
> Thanks. I don't think it will be an issue. My bigger concern is finger
> pointing. If something is slow for some reason, I'll get "see, I told you
> about SCAN listeners", even though the events or whatever I look at will
> point to something else.
>
>
> On Mon, Apr 7, 2014 at 10:48 AM, Andy Wattenhofer <watt0012_at_umn.edu> wrote:
>
>> I haven't seen any replies to this, so I'll take a stab at it. And I'll
>> admit up front, I think it is bizarre that SCAN would be suspected as a
>> performance hit as opposed to having one listener per database. I'll keep
>> the discussion to answering your question about monitoring listener
>> performance, however.
>>
>> Somewhere in here there needs to be a discussion about what is expected
>> of listener performance. Maybe that is something you can take back to your
>> customer. Is it number of connections established per minute? Is it number
>> of bytes sent and received? There are many ways to frame it.
>>
>> Beware also that it will be impossible to produce anything useful
>> without testing both configurations. You'll have to set up SCAN listeners
>> before you can run tests. That effort alone makes this somewhat moot, as I
>> expect the performance (or suspected lack thereof) with SCAN will be
>> immediately realized as soon as you spin up the SCAN listeners.
>>
>> A quick and dirty method of monitoring performance would be to grep the
>> listener logs for the number of instance name occurrences. Something like this:
>> "grep <instance name> listener.log | wc -l". That will tell you how many
>> times activity is logged for each instance. You'll want to further filter
>> the listener log entries for your testing timeframe.
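>>
>> A sketch of that idea with the timeframe filter added (in Python rather
>> than grep; the log path, name, and window are placeholders, and it assumes
>> text-format listener.log lines begin with a timestamp like
>> "08-APR-2014 10:09:06", so adjust the format if your log differs):
>>
>> from datetime import datetime
>>
>> LOG = "/path/to/listener.log"           # placeholder path to listener.log
>> NAME = "ORCL"                           # instance or service name to count
>> START = datetime(2014, 4, 7, 10, 0, 0)  # test window start
>> END = datetime(2014, 4, 7, 11, 0, 0)    # test window end
>>
>> count = 0
>> with open(LOG, errors="replace") as f:
>>     for line in f:
>>         if NAME not in line:
>>             continue
>>         try:
>>             ts = datetime.strptime(line[:20], "%d-%b-%Y %H:%M:%S")
>>         except ValueError:
>>             continue                    # line without a leading timestamp
>>         if START <= ts <= END:
>>             count += 1
>>
>> print("%d listener.log entries for %s in the window" % (count, NAME))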
>>
>> I'll admit I think that is a dumb way to test this, but maybe it is
>> enough to prove the point. It depends on the audience, I guess.
>>
>> For a real test, I think you would have to turn on listener tracing and
>> then use trcasst to generate performance stats. You would need to do this
>> both before and after setting up SCAN. It will be a lot of work. Looking at
>> the documentation for 11.2 <http://docs.oracle.com/cd/E11882_01/network.112/e41945/trouble.htm#NETAG442>,
>> it appears that *trcasst -s* can give you lots of stats on bytes sent
>> and received, sessions served, and so on.
>>
>> Andy
>>
>>
>> On Wed, Apr 2, 2014 at 11:51 AM, Dba DBA <oracledbaquestions_at_gmail.com> wrote:
>>
>>> Clusterware (2 nodes): 11.2.0.4
>>> DB Homes: 10.2.0.4, 11.2.0.2, 11.2.4 (all use the same clusterware)
>>> OS: Red Hat 6.
>>>
>>> Migrating existing DBs from a customer to us. Our standard is to use the
>>> SCAN listener. The customer uses 1 listener per DB, primarily because the
>>> DBs were upgraded from earlier versions. It's a much larger project to
>>> require TNS changes in many locations if you change to SCAN. In our case
>>> all TNS entries need to change because we are moving to different servers.
>>>
>>> We're getting pushback about 'performance issues with just the SCAN
>>> listener'. I don't know how to measure this. It's outside the DB, so the event
>>> system won't work. I don't think it matters since it's the same set of CPUs,
>>> but I'd rather get some numbers instead of guessing.
>>>
>>> We are also required to use SSL connections and plan to follow the note
>>> below. Not sure if this has any impact on performance.
>>>
>>> *Using Class of Secure Transport (COST) to Restrict Instance
>>> Registration in Oracle RAC (Doc ID 1340831.1)*
>>>
>>> Issues:
>>> 1. Some clusters have 20+ DBs. The concern (not my concern) is that a SCAN
>>> listener alone can't handle this many DBs.
>>>
>>> 2. SSL connections: I think the customer actually set this up wrong.
>>> They had TCPS configured in the listener, but gave their users the TCP
>>> connection, so they are not really using SSL. For security reasons, I made
>>> the decision to use SSL across the board and do it correctly. I don't know
>>> if this has any impact on performance, and I don't know how to measure it.
>>>
>>> 3. Some DBs have very high PROCESSES parameter settings (over 1100). This
>>> implies these DBs have applications that don't use connection pooling, so
>>> this would mean more listener usage. I don't know for sure. All I have to go
>>> on is the init.ora file. We are in the early stages. I would also be
>>> surprised if this matters for SCAN vs. lots of listeners.
>>>
>>> I don't think going to SCAN will impact performance. It's the same
>>> number of CPUs. I think lots of listeners may be a little worse since it's
>>> more processes that need to be spawned. That being said, I have no idea how
>>> to measure this. It's outside the DB, so the DB event system won't work.
>>>
>>> Any suggestions on how to measure 'listener performance'? It sounds
>>> bizarre, but I don't know how to respond.
>>>
>>

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Apr 08 2014 - 10:09:06 CEST
