RE: Meltdown and spectre

From: Fd Habash <fmhabash_at_gmail.com>
Date: Mon, 8 Jan 2018 17:12:46 -0500
Message-ID: <5a53ecde.0f1a240a.1ce7c.d172_at_mx.google.com>



For those who are on the cutting edge, did you apply the patch on an Oracle RDBMS host? If yes, what overhead did you notice? I know mileage will vary, just looking for some real-life perspective.
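One rough way to put a number on the kernel-entry cost before and after patching is a small syscall microbenchmark. The sketch below is an illustration only: the function name syscall_cost_ns and the choice of getppid() as the probe call are assumptions of this example, and the figure includes Python loop overhead, so only the before/after difference on the same idle host means anything. It says nothing about a real Oracle workload.

```python
import os
import time


def syscall_cost_ns(calls: int = 200_000) -> float:
    """Return the average wall-clock cost of one getppid() call, in nanoseconds.

    The result includes interpreter overhead, so compare runs against each
    other rather than reading the number as an absolute syscall latency.
    """
    start = time.perf_counter_ns()
    for _ in range(calls):
        os.getppid()  # cheap call that still has to enter the kernel
    elapsed = time.perf_counter_ns() - start
    return elapsed / calls


if __name__ == "__main__":
    # Run once before patching and once after, then compare the two figures.
    print(f"avg getppid() cost: {syscall_cost_ns():.0f} ns")
```

Syscall-heavy and I/O-heavy work is where the difference should show up most; CPU-bound user code should barely move.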



Thank you

From: Reen, Elizabeth
Sent: Monday, January 8, 2018 2:41 PM
To: 'Tim Hall'; 'oracle-l_at_freelists.org'
Subject: RE: Meltdown and spectre

The links I was referring to are https://spectreattack.com/spectre.pdf and https://meltdownattack.com/meltdown.pdf. They give an excellent explanation. My master's is in systems engineering. My dissertation was writing an operating system. I wish I had the time to surf the web to read these things, but I have a real job supporting a couple hundred databases. I got my information from the business journals my bosses were reading. What they wrote was dumbed down and lacking the details. Most of the vendor notes were, as usual, vague.

Linus Torvalds' answer was just passing the buck. He depended on the hardware to do his work. Did Intel know that he was outsourcing that to them? It is not that hard to check that the process is not going outside its memory. I will give him that he originally designed this as a single-user o/s. It has morphed into more, but no one went back to look at the design. One thing an o/s must do is protect itself against developers. Leaving this open is like allowing anyone to update the disk map, an invitation for disaster. I also question AWS and Oracle for allowing their users to have the privileges to do this. If you can control your own destiny it is better. I understand that a lot of companies cannot afford their own IT department.

When you buy a Focus, you don’t expect it to be designed like a Rolls Royce. Why people think that cheaper machines/software will do the same thing as the more expensive versions is beyond me.

Liz

From: timseanhall_at_gmail.com [mailto:timseanhall_at_gmail.com] On Behalf Of Tim Hall
Sent: Friday, January 05, 2018 6:18 PM
To: Reen, Elizabeth [ICG-IT]
Cc: knecht.stefan_at_gmail.com; rajendra.pande_at_ubs.com; Andrew Kerber; Mark W. Farnham; oracle-l_at_freelists.org; fmhabash_at_gmail.com; niall.litchfield_at_gmail.com
Subject: Re: Meltdown and spectre

RE: "This is the first piece that makes sense."

This made me lol. Project Zero announced the issue. It was linked or mentioned in pretty much every piece I read on this...

RE: "I don’t see any reason why an O/S would allow a process to read outside of its memory."

I'm not a CPU engineer or OS/hypervisor developer, so excuse my over-simplistic thoughts, and anyone feel free to jump in and correct me... Operating systems and hypervisors will offload as much as possible to the hardware. As an example, back in the day VMware used binary translation for almost all system calls, but that was pretty slow. Once Intel and AMD built their respective virtualization tech into the chips (2006/2007), guess what everyone did? They passed some of this responsibility on to the hardware to make the hypervisors leaner and more efficient. They trust the chip to do what it says it will do. Likewise, if the OS or hypervisor developer believes operations are safe because the chip is not going to do something stupid, then they will definitely remove code paths to improve performance, since all the belt & braces code takes more processing and slows stuff down. At the kernel level, every clock cycle counts, as is evident from some of the performance impacts coming through.

https://access.redhat.com/articles/3307751

Linus Torvalds' response included:

"A *competent* CPU engineer would fix this by making sure speculation doesn't happen across protection domains."

I know bugger all about this stuff, but his post reads to me like he thought the CPU was protecting these calls, so the OS kernel didn't need to.

The picture from VMware doesn't sound so bad initially.

https://blogs.vmware.com/security/2018/01/vmsa-2018-0002.html

I read some other stuff (can't find the links now) which suggested VMware were being very naive and downplaying the problem. I guess we will see how this one plays out over the coming weeks.

I also saw some other stuff, once again can't find the links, that suggested it will need to be a combination of kernel and firmware to mitigate the issues until the chip designs are altered.
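For what it's worth, newer Linux kernels (mainline 4.15 onwards, plus some vendor backports) report per-vulnerability mitigation status under /sys/devices/system/cpu/vulnerabilities. A minimal sketch of reading it follows; the function name mitigation_status is made up for this example, and on a kernel without that directory it simply returns an empty result rather than telling you anything about exposure.

```python
from pathlib import Path
from typing import Dict

# Assumption: the running kernel exposes this sysfs directory.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def mitigation_status(vuln_dir: Path = VULN_DIR) -> Dict[str, str]:
    """Map each entry (e.g. 'meltdown', 'spectre_v2') to the kernel's status line."""
    if not vuln_dir.is_dir():
        return {}  # older kernel or non-Linux host: nothing to report
    status = {}
    for entry in sorted(vuln_dir.iterdir()):
        try:
            status[entry.name] = entry.read_text().strip()
        except OSError:
            status[entry.name] = "unreadable"
    return status


if __name__ == "__main__":
    for name, line in mitigation_status().items():
        print(f"{name}: {line}")
```

An empty result means the kernel does not report status this way, not that the host is safe.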

Cheers

Tim...

On Fri, Jan 5, 2018 at 10:19 PM, Reen, Elizabeth <elizabeth.reen_at_citi.com> wrote:

Thanks Stefan! This is the first piece that makes sense. There is a possibility that this could be fixed in the firmware, but that is going to take time. I can see why an o/s fix would work. I don’t see any reason why an O/S would allow a process to read outside of its memory. When I played with assemblers, this was not allowed by the O/S.
A shared environment such as AWS would be at great risk from this. An environment within a company would be at lesser risk, but odds are they would be hacked in an easier fashion. To give Linux and Windows their due, they did start out as large multi-user O/Ses.
 
Liz
 
 
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Stefan Knecht
Sent: Friday, January 05, 2018 4:58 PM
To: dmarc-noreply_at_freelists.org
Cc: tim_at_oracle-base.com; rajendra.pande_at_ubs.com; Andrew Kerber; Mark W. Farnham; oracle-l_at_freelists.org; fmhabash_at_gmail.com; niall.litchfield_at_gmail.com

Subject: Re: Meltdown and spectre
 
I'm not a CPU engineer - but from my understanding, CPUs try to optimize by "predicting" where they will need to jump to. And apparently that's something that people can abuse.

The very first paragraph kind of has it all:

https://googleprojectzero.blogspot.com/

"We have discovered that CPU data cache timing can be abused to efficiently leak information out of mis-speculated execution, leading to (at worst) arbitrary virtual memory read vulnerabilities across local security boundaries in various contexts."

The key being "mis-speculated". They apparently thought that it's a good idea to execute something ahead of time, just in case we will need to execute it. How no-one imagined the potential abuse is beyond me.
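To make the "mis-speculated" part concrete, here is a toy simulation of the idea in the quoted paragraph. It is not an exploit and does not touch real hardware: the ToyCPU class, the latency numbers (10 vs 200) and the 64-byte stride are all invented for illustration. It only models the one property that matters, namely that a squashed speculative read rolls back its result but leaves a footprint in the cache, which can then be recovered by timing.

```python
CACHE_LINE = 64  # stride between probe slots, in bytes (illustrative value)


class ToyCPU:
    """Toy model: speculation discards its result but not its cache footprint."""

    def __init__(self, memory: bytes, public_len: int):
        self.memory = memory          # public bytes followed by "secret" bytes
        self.public_len = public_len  # limit enforced by the bounds check
        self.cache = set()            # probe addresses currently "cached"

    def flush(self) -> None:
        self.cache.clear()

    def victim_read(self, index: int):
        """Bounds-checked read whose body runs ahead of the check."""
        value = self.memory[index]          # executed speculatively
        self.cache.add(value * CACHE_LINE)  # dependent load warms one probe slot
        if index >= self.public_len:
            return None                     # architectural result is squashed
        return value

    def access_time(self, probe_addr: int) -> int:
        """Simulated latency: cached slots are fast, everything else is slow."""
        return 10 if probe_addr in self.cache else 200


def leak_byte(cpu: ToyCPU, index: int) -> int:
    """Recover memory[index] purely from simulated access timings."""
    cpu.flush()
    cpu.victim_read(index)  # returns None out of bounds, but the cache is warm
    timings = [cpu.access_time(v * CACHE_LINE) for v in range(256)]
    return timings.index(min(timings))


if __name__ == "__main__":
    cpu = ToyCPU(b"public" + b"S3cret", public_len=6)
    print(bytes(leak_byte(cpu, i) for i in range(6, 12)))
```

The victim never returns the out-of-bounds byte, yet the attacker loop reads it back from which probe slot is fast. Real attacks have to deal with noise, branch predictor training and actual cache behaviour, none of which this models.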

Also interesting to see Linus Torvalds' response to all of this: https://lkml.org/lkml/2018/1/3/797
 
Stefan
 
 
On Sat, Jan 6, 2018 at 4:34 AM, Reen, Elizabeth <dmarc-noreply_at_freelists.org> wrote:

            All of that happens in the O/S not on the chip.  One does not log into a processor.  
 
 

Liz
 
Elizabeth Reen
CPB Database Group Manager
718.248.9930  (Office)
Service Now Group: CPB-ORACLE-DB-SUPPORT  
 
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Tim Hall
Sent: Friday, January 05, 2018 4:06 PM
To: rajendra.pande_at_ubs.com
Cc: Andrew Kerber; Mark W. Farnham; oracle-l_at_freelists.org; dmarc-noreply_at_freelists.org; fmhabash_at_gmail.com; niall.litchfield_at_gmail.com

Subject: RE: Meltdown and spectre
 
According to this the RHEL fixes can be overridden if you need performance.  
https://access.redhat.com/articles/3307751  
They say bare-metal and containers have similar overheads, but virtual guests are likely to be hit harder...  
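If you want to see how those overrides are currently set, the sketch below reads them without changing anything. Big assumption: the file names (pti_enabled, ibrs_enabled, ibpb_enabled under /sys/kernel/debug/x86) are how I recall Red Hat's tunables for the patched RHEL kernels, so check them against Red Hat's own documentation for your release. Reading them normally needs root and a mounted debugfs; read_tunables is a name made up for this example.

```python
from pathlib import Path
from typing import Dict, Optional

# Assumption: RHEL-style debugfs tunables; verify the names for your kernel.
DEBUG_DIR = Path("/sys/kernel/debug/x86")
TUNABLES = ("pti_enabled", "ibrs_enabled", "ibpb_enabled")


def read_tunables(debug_dir: Path = DEBUG_DIR) -> Dict[str, Optional[str]]:
    """Return each tunable's current value, or None if it cannot be read."""
    result = {}
    for name in TUNABLES:
        try:
            result[name] = (debug_dir / name).read_text().strip()
        except OSError:
            result[name] = None  # missing file, no debugfs, or not root
    return result


if __name__ == "__main__":
    for name, value in read_tunables().items():
        print(f"{name} = {value if value is not None else 'n/a'}")
```

A value of None here only means the file was not readable, which is the normal outcome on an unpatched or non-RHEL kernel.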
Cheers
 
Tim... (On crappy phone)
 
On 5 Jan 2018 7:04 pm, <rajendra.pande_at_ubs.com> wrote:

The answer (ref Meltdown) apparently is KAISER, which has been shown to be effective against Meltdown, hence (I guess) the updates to the OS.
https://www.reuters.com/article/us-cyber-intel-researcher/how-a-researcher-hacked-his-own-computer-and-found-worst-chip-flaw-idUSKBN1ET1ZR  
 
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Andrew Kerber
Sent: Friday, January 05, 2018 1:58 PM
To: Mark W. Farnham
Cc: ORACLE-L; dmarc-noreply_at_freelists.org; fmh; niall.litchfield_at_gmail.com; tim_at_oracle-base.com
Subject: Re: Meltdown and spectre
 
According to what I am reading, Meltdown affects only Intel, but AMD is affected by Spectre, as is Intel. And Spectre may be a more difficult fix in the long run.

On Fri, Jan 5, 2018 at 11:55 AM Mark W. Farnham <mwf_at_rsiz.com> wrote:

re: 2) was my question:
 
“So, will there be an “insecure” patch to skip the overhead and rely on server access control?” Follow-up: For all the millions of Intel-based systems that are in fact single-user, will there be “insecure” patches? The point being, yes, you will have to do patches outside of lab machines kept for particular vintage reasons. Will you be forced to take the performance penalty?

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Tim Hall
Sent: Friday, January 05, 2018 12:18 PM
To: dmarc-noreply_at_freelists.org
Cc: mwf_at_rsiz.com; niall.litchfield_at_gmail.com; andrew.kerber_at_gmail.com; fmh; ORACLE-L

Subject: Re: Meltdown and spectre
 
Does not compute. 
 
1) This is a problem with Intel chips. It's not a problem with Linux. The OS vendors are putting in patches to fix/mitigate issues so you don't have to scrap your Intel servers and replace them with servers with AMD chips.  
2) Do I need to patch my servers? So you are never going to patch your kernel again? Ever? If you ever do, you will get these fixes. Good luck with never patching stuff again...  
Cheers
 
Tim...
 
On Fri, Jan 5, 2018 at 5:06 PM, Reen, Elizabeth <dmarc-noreply_at_freelists.org> wrote:

Since all of my servers are in house behind numerous firewalls, do I need to patch everything? The performance hit is going to hurt. Do I need to do that for dev and testing servers which run with redacted data? I could need to double the number of servers I own. Yes, they are cheap, but it adds up after a while. What about licenses? Will I need to up them because I need more iron to do the same work?
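As a back-of-envelope check on the server count: if each box loses a fraction p of its throughput, the same work needs 1/(1-p) times as many boxes. The sketch below just does that arithmetic. It assumes the workload scales linearly across servers and that the overhead is a flat fraction, neither of which is guaranteed; servers_needed is a name invented for this example.

```python
import math


def servers_needed(current: int, overhead: float) -> int:
    """Servers required to keep today's throughput after losing `overhead` of it.

    `overhead` is the fraction of per-server throughput lost (0.10 for 10%).
    """
    if not 0 <= overhead < 1:
        raise ValueError("overhead must be in the range [0, 1)")
    return math.ceil(current / (1 - overhead))


if __name__ == "__main__":
    for pct in (10, 30, 50):
        print(f"{pct}% overhead: 10 servers become {servers_needed(10, pct / 100)}")
```

So the doubling only happens at the 50% end of the 10-50% range quoted in this thread; at the 10% end it is closer to one extra box in ten. Licensing follows the same multiplier if it is counted per core or per server.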
 
I agree that you can’t stop a fully privileged account from reading memory under this scenario. It is a bad operating system that lets this happen. Given the way Linux was developed, it is easy for something like this to sneak through. Linux is a great o/s, but you get what you pay for here. The reason it is so popular is that it is so inexpensive. This is not an issue in AIX, SPARC, or HP-UX. They cost money because they have been designed and tested. They did not start out life as an alternative to Windows.

A wrapper on every syscall is probably the fastest fix. It is far from the best fix. Hopefully they will put in the correct fix.
 

Liz
 
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Mark W. Farnham
Sent: Friday, January 05, 2018 8:38 AM
To: niall.litchfield_at_gmail.com; andrew.kerber_at_gmail.com
Cc: 'fmh'; 'ORACLE-L'
Subject: RE: Meltdown and spectre
 
This also poses what I think is a relevant question for folks who place their physical RDBMS server(s) securely and only have privileged logons anyway. (You really can’t stop a fully privileged account from viewing memory or any other resources anyway, and only in-memory encryption can frustrate that if a bad actor has gained privileged access to a server.)
So, will there be an “insecure” patch to skip the overhead and rely on server access control?
Then we can have a fresh round of the debate about whether “physical” or “virtual” is faster with the playing field thus tilted significantly in favor of “physical.”
I also wonder, for “virtual” servers, whether this could be merely a “hypervisor” patch, which in ring security theory dating back to the 1970s could establish a memory-address-bounded area at the privileged account layer (which should be a heckuva lot cheaper than a wrapper on every “syscall”).
DTSS is lookin’ pretty good right now. Still, it was our own fault for not explaining clearly enough to management that 100 million (plus) copies at $39.95 each was more than 12 copies at $10 million each. Sigh.
mwf
 
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Niall Litchfield
Sent: Thursday, January 04, 2018 10:58 AM
To: andrew.kerber_at_gmail.com
Cc: fmh; ORACLE-L
Subject: Re: Meltdown and spectre
 
There absolutely should be an OEL patch for this - for the RH kernel they'll probably take upstream - for UEK I'd expect an Oracle patch. I'd expect Oracle shops to be regression testing to determine the likely impact on RDBMS (and Java app, for that matter) performance.

On Thu, Jan 4, 2018 at 3:40 PM, Andrew Kerber <andrew.kerber_at_gmail.com> wrote:

I was wondering the same thing. But I don't think it's up to Oracle to patch this; it's going to be at the OS and firmware level. But everything I read says that it's going to be a huge performance hit, anywhere from 10-50%, and the higher end will be on IO-bound systems.

On Thu, Jan 4, 2018 at 9:33 AM, Fred Habash <fmhabash_at_gmail.com> wrote:

Checked Oracle security bulletins but didn't find anything related. Did Oracle release an official statement for these vulnerabilities, at least for the RDBMS and OEL?
Thanks 

--

Andrew W. Kerber

'If at first you dont succeed, dont take up skydiving.'

 
--

Niall Litchfield
Oracle DBA
http://www.orawin.info
 

 
--

http://www.freelists.org/webpage/oracle-l
Received on Mon Jan 08 2018 - 23:12:46 CET