Re: ORDS Restart requirement

From: Ruan Linehan <ruandav_at_gmail.com>
Date: Wed, 20 Oct 2021 01:11:56 +0100
Message-ID: <CAP0kZ-3Gv36oy-k_obsh5GagetHyKU221+c3X5py+aQPK0RM1g_at_mail.gmail.com>



Hi Kris,

We are using Oracle REST Data Services (ORDS) version 20.3.0. I've scanned through the Release Notes for subsequent versions, but I've not yet been able to identify an obvious match to what you describe below within the issues fixed or changes.

Regards,
Ruan

On Tue, Oct 19, 2021 at 3:43 PM kris rice <kris.rice_at_jokr.net> wrote:

> Ruan,
> There was an issue a while back where a db that was not available at
> startup was flagged and never revisited. That was changed, but I cannot
> recall in what version. There were basically 2 issues that got addressed:
> 1) what you see, in that a down db is listed as bad, and 2) connecting to
> every defined db at startup. What version are you using?
>
> -kris
>
> On Mon, Oct 18, 2021 at 7:24 PM Ruan Linehan <ruandav_at_gmail.com> wrote:
>
>> Hi Kris/Tim,
>>
>> After quite a few various scenario based attempts at "breaking" the
>> connectivity between ORDS and the DB endpoints, I think I have observed
>> what is really going on. Trying to recreate my original suggested symptom
>> by creating a service availability disconnect, even for long periods of
>> time, was unsuccessful; insofar as ORDS always then successfully resumed
>> work once the DB connection was re-established (Which is good). So
>> apologies for mis-representing the issue in my original mail.
>>
>> I investigated some of the older historical APEX logs. What is more
>> likely happening in our situation is that restarts of the ORDS services
>> (which we do frequently to introduce new endpoint configs) on the load
>> balanced VMs sometimes cross over with the periodic maintenance windows
>> of individual pluggable databases.
>>
>> Therefore, sometimes it will happen that for 1 in 100 endpoints, there
>> will be an ORDS complaint with respect to the initial startup validation of
>> the pool config. e.g. "WARNING: The pool named: |apex|pu| is missing and
>> will be ignored: The database service named: |apex|pu| does not exist."
>> These startup errors were not being properly trapped / flagged as part of
>> our processes, so I can easily fix that.
>>
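
A minimal sketch of how these startup warnings might be trapped, assuming the ORDS startup output is captured to a text file and that each warning appears on one line worded like the one quoted above. The pattern is modelled only on that single quoted message; other ORDS versions may word it differently, and the function names are hypothetical.

```python
import re
from typing import Iterable, List

# Assumption: modelled on the one warning line quoted in this thread.
IGNORED_POOL = re.compile(
    r"The pool named: (\S+) is missing and will be ignored"
)


def ignored_pools(lines: Iterable[str]) -> List[str]:
    """Return the distinct pool names reported as ignored, in first-seen order."""
    found: List[str] = []
    for line in lines:
        match = IGNORED_POOL.search(line)
        if match and match.group(1) not in found:
            found.append(match.group(1))
    return found


def scan_log(path: str) -> List[str]:
    """Scan a captured startup log (hypothetical path) for ignored pools."""
    with open(path, encoding="utf-8", errors="replace") as log:
        return ignored_pools(log)
```

A startup wrapper could call `scan_log` on the captured output once ORDS has finished starting and raise an alert (or schedule a later restart) whenever the returned list is not empty.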
>> Testing this scenario, with an ORDS startup whilst a PDB database
>> service is down, does result in the connection never being established,
>> even once the PDB service is eventually started. So I believe this is
>> actually what is occurring for us: not a disconnect of the database from
>> ORDS, but the initial startup verification which, if unsuccessful for a
>> particular endpoint, means that the pool entry is literally "ignored" from
>> then on.
>>
>> I assume this is the expected behaviour and if so, is there any work
>> around beyond a restart of ORDS at that point?
>>
>> Kind regards,
>> Ruan
>>
>> On Tue, Oct 12, 2021 at 9:58 PM Ruan Linehan <ruandav_at_gmail.com> wrote:
>>
>>> *"and it never requires a restart. The pools should reestablish
>>> themselves as needed"*
>>>
>>> Thanks Tim and Kris for taking the time to reply.
>>>
>>> Well, that is quite puzzling to me that our 'broken' connection issue
>>> seems unexpected to you both; but it also makes me hopeful that this is
>>> some mis-config or error in implementation on our part. I'm certainly not
>>> proficient in configuring ORDS as I'm usually investigating from the other
>>> side of the (database) fence. I'm intrigued now, as you mentioned Tim, that
>>> maybe we have some sequence of events or triggers leading to our particular
>>> issue. Yes, we have multiple ORDS installs on separate VMs behind a
>>> load-balancer.
>>>
>>> I've just "broken" a non-production environment ORDS web services pool
>>> connection in the last hour, by stopping the PDB services and killing
>>> existing ORDS sessions to test. I'm going to leave this down now for a few
>>> different periods of hours to see what happens and will reply back here
>>> with some specifics.
>>>
>>> Kind regards
>>> Ruan
>>>
>>> On Tue, Oct 12, 2021 at 1:57 PM kris rice <kris.rice_at_jokr.net> wrote:
>>>
>>>> I'd suggest, as always, upgrade. We have ords nodes on some databases
>>>> that go up/down, active/readonly,... and it never requires a restart. The
>>>> pools should reestablish themselves as needed. For example, I have to
>>>> manage all the ords nodes in Autonomous DB and those things come and go on
>>>> the whim of customers kicking tires or shutting down to save money when not
>>>> in use. These ords nodes run somewhere around 2k connection pools each and
>>>> never need a restart, even when the pdb is relocated out from under us to
>>>> another CDB.
>>>>
>>>> Happy to jump on a zoom or medium of choice and chat more if you'd like.
>>>>
>>>> -kris
>>>>
>>>> On Tue, Oct 12, 2021 at 7:43 AM Tim Hall <tim_at_oracle-base.com> wrote:
>>>>
>>>>> Hi.
>>>>>
>>>>> That's interesting. I can't ever remember having to restart ORDS as a
>>>>> result of a database outage. Even prolonged ones. We install in the PDB,
>>>>> not the CDB and have one or more ORDS instances in Docker containers for
>>>>> each PDB. As a result, a problem with one instance doesn't affect
>>>>> everything else. Even so, these are basic connection issues you are having,
>>>>> so I don't think the topology differences can be that relevant. I think
>>>>> this may be a job for the ORDS team. They could certainly tell you what the
>>>>> expected behaviour is.
>>>>>
>>>>> Do you have a non-prod/test setup where you can test some failure
>>>>> scenarios? I wonder if there are specific patterns that cause the issue,
>>>>> rather than a general overarching issue.
>>>>>
>>>>> I guess in the interim I would consider a mitigation. I assume you
>>>>> have multiple ORDS installations behind a load balancer to support this. If
>>>>> so, you could script a restart of all ORDS instances (one at a time of
>>>>> course), and call that at the end of every piece of scheduled maintenance.
>>>>> It would minimise the apparent outage.
>>>>>
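
The one-at-a-time restart suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: `restart` and `is_healthy` are hypothetical callables supplied by the operator (for example, wrappers around whatever service manager and HTTP probe are in use), and nothing here is an ORDS API.

```python
import time
from typing import Callable, Iterable, List


def rolling_restart(
    nodes: Iterable[str],
    restart: Callable[[str], None],      # hypothetical: restarts ORDS on one node
    is_healthy: Callable[[str], bool],   # hypothetical: True once the node serves again
    timeout_s: float = 120.0,
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Restart nodes one at a time, waiting for each to come back.

    Stops at the first node that does not become healthy within
    timeout_s, so the remaining nodes keep serving traffic.
    """
    done: List[str] = []
    for node in nodes:
        restart(node)
        deadline = time.monotonic() + timeout_s
        while not is_healthy(node):
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    f"{node} not healthy after {timeout_s}s; restarted so far: {done}"
                )
            sleep(poll_s)
        done.append(node)
    return done
```

Stopping at the first failure is a deliberate choice here: with the other instances still behind the load balancer, a node that fails to come back is less harmful than continuing to restart the rest.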
>>>>> I'll see if I can get someone from the ORDS team to look at this
>>>>> thread.
>>>>>
>>>>> Cheers
>>>>>
>>>>> Tim...
>>>>>
>>>>> On Tue, Oct 12, 2021 at 9:22 AM Ruan Linehan <ruandav_at_gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I've researched elsewhere but not been able to identify a suitable
>>>>>> solution, so I'm asking here in the hopes that an ORDS aficionado might
>>>>>> provide some direction.
>>>>>>
>>>>>> My issue is around the perception that a restart of ORDS is
>>>>>> required to re-establish a connection to an endpoint which may have
>>>>>> been unavailable for a period of time.
>>>>>>
>>>>>> We run ORDS v20 on a Linux VM as part of a solution accompanying an
>>>>>> Exadata multitenant environment. ORDS is made available to all PDBs,
>>>>>> installed in the CDB. Within the 'conf' directory of ORDS - we stage all
>>>>>> the associated apex_aa.xml, apex_ab.xml, apex_ac.xml etc configuration
>>>>>> mapping files. Periodically, one of the PDB environments may be made
>>>>>> unavailable (i.e. closed, or RAC services stopped, or someone
>>>>>> inadvertently locks the ORDS_PUBLIC_USER account etc.) for maintenance, for
>>>>>> a day or weekend etc. When this takes place, the pluggable's
>>>>>> ORDS_PUBLIC_USER database sessions are terminated and the ORDS connection
>>>>>> cannot be re-established for a period of time to that PDB. So far so good.
>>>>>>
>>>>>> Once the maintenance is complete, and the PDB is re-opened once
>>>>>> again, RAC services restarted, ORDS does not automatically re-establish a
>>>>>> database connection to that same PDB.
>>>>>>
>>>>>> If I need to get the ORDS_PUBLIC_USER connections re-established once
>>>>>> more for that specific PDB, then I need to stop ORDS processes for all
>>>>>> clients and restart.
>>>>>> i.e. this reads the URL mapping XML and validates the associated
>>>>>> apex_aa.xml files etc., and eventually successfully re-establishes ALL the
>>>>>> database connections, and all is good.
>>>>>>
>>>>>> The difficulty is though, that we have literally hundreds of these
>>>>>> PDBs in a CDB, and literally hundreds of accompanying ORDS endpoints. So,
>>>>>> if one of these environments is impacted by a "maintenance" of some sort,
>>>>>> and the database connection is severed for a time, then it requires a full
>>>>>> restart of ORDS for ALL to get it back, which is rather painful.
>>>>>>
>>>>>> There must be something I am missing, right? I understand XML config
>>>>>> changes require a restart of ORDS to be picked up, but I find it troubling
>>>>>> that a full restart is also required when just one client endpoint
>>>>>> connection out of a hundred is impacted.
>>>>>> Is there any way I can force ORDS to 'reinit' and re-read the conf
>>>>>> files to re-establish a single broken connection without restarting?
>>>>>>
>>>>>> Kind regards,
>>>>>> Ruan
>>>>>>
>>>>>

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Oct 20 2021 - 02:11:56 CEST
