Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 


Re: linux memory: limiting filesystem caching

From: Christo Kutrovsky <kutrovsky.oracle_at_gmail.com>
Date: Wed, 13 Jul 2005 17:34:14 +0300
Message-ID: <52a152eb0507130734cf1af08@mail.gmail.com>


Zhu,

How did you measure the performance difference with and without bigpages? What kind of tests did you run? Were they CPU-bound or disk-I/O-bound? Do you have statistics for CPU usage in "user" mode versus "system" mode?

We have at least three cases where converting a production system to bigpages/hugepages was the difference between a hung system and a well-performing system.

As for the memory savings, what counter-arguments do you have? Keep in mind that page-table entries are allocated as needed, so the overhead shows up when your sessions touch the entire SGA, which happens often, for example when they reconnect frequently.
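For reference, the page-table arithmetic from my earlier message (quoted below) can be sanity-checked with a quick shell calculation. This assumes a 2 GB SGA, 4 KB and 2 MB page sizes, and 8 bytes per page-table entry, as in the quoted numbers:

```shell
# Sanity-check the per-process page-table overhead for a 2 GB SGA,
# assuming 8 bytes per page-table entry.
sga=$((2 * 1024 * 1024 * 1024))          # 2 GB SGA in bytes

pages_4k=$((sga / 4096))                 # number of 4 KB pages
pte_4k=$((pages_4k * 8))                 # page-table bytes per process
echo "4 KB pages: $pages_4k pages, $((pte_4k / 1024 / 1024)) MB of PTEs per process"
echo "500 sessions: $((pte_4k * 500 / 1024 / 1024)) MB of PTEs total"

pages_2m=$((sga / (2 * 1024 * 1024)))    # number of 2 MB hugepages
pte_2m=$((pages_2m * 8))                 # page-table bytes per process
echo "2 MB pages: $pages_2m pages, $((pte_2m / 1024)) KB of PTEs per process"
echo "500 sessions: $((pte_2m * 500 / 1024)) KB of PTEs total"
```

The totals come out to roughly 2 GB of page-table entries for 500 sessions with 4 KB pages, versus roughly 4 MB with 2 MB hugepages.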

I agree that it doesn't sound reasonable; that is exactly why hugepages are the solution.
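As a sketch of how we typically enable it: on RHEL 3's 2.4-based kernels the pool is sized in MB via vm.hugetlb_pool, while on 2.6 kernels the equivalent knob is vm.nr_hugepages (a page count). The 2 GB pool size below is illustrative, not a recommendation for every system:

```shell
# Check that the kernel supports hugepages and see the hugepage size:
grep -i huge /proc/meminfo

# Persist across reboots by adding ONE of these to /etc/sysctl.conf:
#   vm.hugetlb_pool = 2048     # 2.4-based kernels (RHEL 3): pool size in MB
#   vm.nr_hugepages = 1024     # 2.6 kernels: count of 2 MB pages

# Apply without rebooting (if memory is fragmented, the pool may only
# allocate fully after a reboot):
sysctl -p
```

Note that Oracle only uses the pool if it is large enough to hold the entire SGA at startup; otherwise it silently falls back to regular pages.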

-- 
Christo Kutrovsky
Database/System Administrator
The Pythian Group



On 7/13/05, zhu chao <zhuchao_at_gmail.com> wrote:

> It is nice to see people using bigpages/hugepages.
>
> I played with it on my production RAC box before (it is gone now), and
> actually I did not see much performance gain. I installed my RAC with no
> hugepage/bigpage (it would have been bigpage, as I was running 2.1 AS),
> then later converted one node to bigpage and left the other node
> unchanged. There was no significant change in CPU usage or VM usage.
> Later I converted both nodes to bigpage; still no significant
> performance change.
>
> The servers were loaded (4-CPU boxes, load average 2-3 most of the time).
>
> As regards your claim that bigpages save that much memory, I don't have
> a theory of my own, but using 2 GB of kernel memory to manage 2 GB does
> not sound reasonable. Maybe other Linux experts can give their opinions.
>
>
> On 7/13/05, Christo Kutrovsky <kutrovsky.oracle_at_gmail.com> wrote:
> > Hello Teehan,
> >
> > You don't mention which RH version you have, so I will assume 3.0 Advanced Server.
> >
> > As zhu chao mentioned, /proc/sys/vm/pagecache is the parameter you need.
> >
> > My recommendation is to give at most 50% to file caching.
> >
> > vi /etc/sysctl.conf
> >
> > and add:
> > vm.pagecache=10 50 50
> >
> > then run "sysctl -p" to apply. That way on next boot those will be in effect.
> >
> > You can monitor with "vmstat 2" in another session to see whether the
> > memory reported under "cache" drops and "free" rises.
> >
> > In addition, on Linux you should always use hugepages (hugetlbpool)
> > for Oracle. That way Oracle's SGA is locked in physical RAM and is
> > almost invisible to the Linux memory manager, which reduces
> > memory-management cost.
> >
> > The big pages (2 MB chunks) will also reduce your overall memory
> > usage, especially if you have a lot of sessions. Some simple math:
> >
> > An SGA of 2 GB in 4 KB pages = 524288 pages * 8 bytes per page-table
> > entry (I think) = 4 MB per process just for the page-table entries. If
> > you have 500 sessions, that's 500 * 4 MB = 2 GB of memory to manage
> > 2 GB of memory.
> >
> > Compare that to an SGA of 2 GB in 2 MB pages = 1024 pages * 8 bytes =
> > 8 KB per process. For the same 500 sessions you will use about 4 MB of
> > memory to manage 2 GB of memory. A significant improvement.
> >
> > Also, the CPU has only so many entries in its virtual-to-physical
> > translation cache (the TLB), so having that many fewer pages will
> > significantly improve the hit ratio of your virtual-to-physical
> > mappings.
> >
> > So simply by using hugepages you:
> > - reduce memory for page-table entries by a factor of 512 (2 GB saved for 500 sessions)
> > - lock Oracle's SGA in physical memory
> > - reduce the memory-management cost for the Linux kernel
> > - improve the CPU's translation-cache hit ratio for virtual-to-physical mappings
> > - reduce the amount of memory you have to touch overall
> >
> > I hope this helps.
> >
> >
> > --
> > Christo Kutrovsky
> > Database/System Administrator
> > The Pythian Group
> >
> > On 7/13/05, Teehan, Mark <mark.teehan_at_csfb.com> wrote:
> > > Hi all
> > > I have several Red Hat blade clusters running 10.1.0.4 RAC on 2.4.9-e.43enterprise. All database storage is OCFS, with ext3 for backups, home dirs, etc. The servers have 12 GB of RAM, of which about 2 GB is allocated to the database, which is fine. Linux, in its wisdom, uses all free memory (10 GB in this case) for filesystem caching on the non-OCFS filesystems (since OCFS uses direct I/O); so every night when I do a backup it swallows up all available memory, foolishly sends itself into a swapping frenzy, and afterwards sometimes cannot allocate enough free memory for background processes. This seems to be worse on e.43; I was on e.27 until recently. Does anyone know how to control filesystem block caching? Or how to get it to de-cache some? For instance, I have noticed that gzipping a file and then Ctrl-C'ing it can free up a chunk of RAM; I assume it de-caches the original uncompressed file. But it's not enough!
> > >
> > > Rgds
> > > Mark
> > >
> > >
> > > --
> > > http://www.freelists.org/webpage/oracle-l
> > >
> >
> >
> > --
> > Christo Kutrovsky
> > Database/System Administrator
> > The Pythian Group
> > --
> > http://www.freelists.org/webpage/oracle-l
> >
>
>
> --
> Regards
> Zhu Chao
> www.cnoug.org
>
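The pagecache tuning steps quoted above can be sketched end to end. This assumes RHEL 3's three-value vm.pagecache knob (min / borrow / max percentages of RAM); the values are the ones suggested in the quoted message:

```shell
# Cap the page cache at 50% of RAM. Persist the setting by appending
# it to /etc/sysctl.conf (requires root):
echo "vm.pagecache = 10 50 50" >> /etc/sysctl.conf

# Apply immediately without a reboot:
sysctl -p

# Watch the effect during the next backup: the "cache" column should
# stop growing past roughly half of RAM, and "free" should rise.
vmstat 2
```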
-- 
Christo Kutrovsky
Database/System Administrator
The Pythian Group
--
http://www.freelists.org/webpage/oracle-l
Received on Wed Jul 13 2005 - 09:36:54 CDT

