Date: Wed, 19 Jun 2002 13:30:06 +0200
From: Craig Kulesa <ckul...@as.arizona.edu>
Subject: [PATCH] (1/2) reverse mapping VM for 2.5.23 (rmap-13b)
Message-ID: <Pine.LNX.4.44.0206181340380.3031-100000@loke.as.arizona.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it (1.20)
NNTP-Posting-Host: a.631.anti-phl.bofh.it
Newsgroups: linux.kernel
Organization: linux.*_mail_to_news_unidirectional_gateway
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!news.mailgate.org!bofh.it!robomod
X-Original-Cc: linux...@kvack.org
X-Original-Date: Wed, 19 Jun 2002 04:18:00 -0700 (MST)
X-Original-Sender: linux-kernel-ow...@vger.kernel.org
X-Original-To: linux-ker...@vger.kernel.org
Lines: 127



Where:  http://loke.as.arizona.edu/~ckulesa/kernel/rmap-vm/

This patch implements Rik van Riel's patches for a reverse mapping VM 
atop the 2.5.23 kernel infrastructure.  The principal sticky bits in 
the port are correct interoperability with Andrew Morton's patches to 
clean up and extend the writeback and readahead code, among other things.  
This patch reinstates Rik's (active, inactive dirty, inactive clean) 
LRU list logic with the rmap information used for proper selection of pages 
for eviction and better page aging.  It seems to do a pretty good job even 
for a first porting attempt.  A simple, indicative test suite on a 192 MB 
PII machine (loading a large image in GIMP, loading other applications, 
heightening memory load to moderate swapout, then going back and 
manipulating the original GIMP image to test page aging, then closing all 
apps to the starting configuration) shows the following:

2.5.22 vanilla:
Total kernel swapouts during test = 29068 kB
Total kernel swapins during test  = 16480 kB
Elapsed time for test: 141 seconds

2.5.23-rmap13b:
Total kernel swapouts during test = 40696 kB
Total kernel swapins during test  =   380 kB
Elapsed time for test: 133 seconds

Although rmap's page_launder evicts a ton of pages under load, it seems to 
swap the 'right' pages, as it doesn't need to swap them back in again.
This is a good sign.  [Recent 2.4-aa kernels work pretty nicely too.]

Various details for the curious or bored:

	- Tested:   UP, 16 MB < mem < 256 MB, x86 arch. 
	  Untested: SMP, highmem, other archs.  

	  In particular, I didn't even attempt to port rmap-related 
	  changes to 2.5's arch/arm/mm/mm-armv.c.  

	- page_launder() is coarse and tends to clean/flush too 
	  many pages at once.  This is known behavior, but seems slightly 
	  worse in 2.5 for some reason. 

	- pf_gfp_mask() doesn't exist in 2.5, nor does PF_NOIO.  I have 
	  simply dropped the call in try_to_free_pages() in vmscan.c, but 
	  there is probably a way to reinstate its logic 
	  (i.e. avoid memory balancing I/O if the current task 
	  can't block on I/O).  I didn't even attempt it.

	- Writeback:  instead of forcibly reinstating a page on the 
	  inactive list when !PageActive, page->mapping, !PageDirty, and 
	  !PageWriteback (see mm/page-writeback.c, fs/mpage.c), I just 
	  let it go without any LRU list changes.  If the page is 
	  inactive and needs attention, it'll end up on the inactive 
	  dirty list soon anyway, AFAICT.  Seems okay so far, but that 
	  may be flawed/sloppy reasoning... We could always look at the 
	  page flags and reinstate the page to the appropriate LRU list 
	  (i.e. inactive clean or dirty) if this turns out to be a 
	  problem...

	- Make shrink_[i,d,dq]cache_memory return the result of 
	  kmem_cache_shrink(), not simply 0.  Seems pointless to waste 
	  that information, since we're getting it for free.  Rik's patch 
	  wants that info anyway...

	- Readahead and drop_behind:  With the new readahead code, we have 
	  some choices regarding under what circumstances we choose to 
	  drop_behind (i.e. only drop_behind if the reads look really 
	  sequential, etc...).  This patch blindly calls drop_behind at 
	  the conclusion of page_cache_readahead().  Hopefully the 
	  drop_behind code correctly interprets the new readahead indices. 
	  It *seems* to behave correctly, but a quick look by another 
	  pair of eyes would be reassuring. 

	- A couple of trivial rmap cleanups for Rik:
		a) Semicolon day!  System fails to boot if rmap debugging 
		   is enabled in rmap.c.  Fix is to remove the extraneous 
		   semicolon in page_add_rmap():

				if (!ptep_to_mm(ptep)); <--

		b) The pte_chain_unlock/lock() pair between the tests for 
		   "The page is in active use" and "Anonymous process 
		   memory without backing store" in vmscan.c seems
		   unnecessary. 

		c) Drop PG_launder page flag, a la current 2.5 tree.

		d) if (page_count(page) == 0)  --->  if (!page_count(page))
		   and things like that...

	- To be consistent with 2.4-rmap, this patch includes a 
	  minimal BIO-ified port of Andrew Morton's read-latency2 patch
	  (i.e. minus the elvtune ioctl stuff) to 2.5, from his patch 
	  sets.  This adds about 7 kB to the patch. 

	- The patch also includes compilation fixes:  
	(2.5.22)
	      drivers/scsi/constants.c (undeclared integer variable)
	      drivers/pci/pci-driver.c (unresolved symbol in pcmcia_core)
	(2.5.23)
	      include/linux/smp.h (define cpu_online_map for UP)
	      kernel/ksyms.c    (export default_wake_function for modules)  
	      arch/i386/i386_syms.c   (export ioremap_nocache for modules)
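
For anyone who hasn't been bitten by it, the stray semicolon in item (a)
is easy to reproduce in user space.  This is only a sketch and not the
rmap.c code: debug_check_buggy() and debug_check_fixed() are made-up
names, and I'm assuming the statement after the if is a BUG()-style
trap, which would explain why a debug build fails to boot.

```c
#include <assert.h>
#include <stddef.h>

/* Buggy form: the ';' ends the if statement, so the block below always runs. */
static int debug_check_buggy(const void *mm)
{
    int bug_hit = 0;

    if (!mm);
    {
        bug_hit = 1;    /* stands in for a BUG()-style trap (assumption) */
    }
    return bug_hit;
}

/* Fixed form: the trap fires only when mm is NULL. */
static int debug_check_fixed(const void *mm)
{
    int bug_hit = 0;

    if (!mm)
        bug_hit = 1;
    return bug_hit;
}
```

With a valid pointer, debug_check_buggy() still reports a hit, while
debug_check_fixed() reports one only for NULL.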


Hope this is of use to someone!  It's certainly been a fun and 
instructive exercise for me so far.  ;)

I'll attempt to keep up with the 2.5 and rmap changes, fix inevitable 
bugs in porting, and will upload regular patches to the above URL, at 
least until the usual VM suspects start paying more attention to 2.5.  
I'll post a quick changelog to the list occasionally if and when any 
changes are significant, i.e. other than boring hand patching and 
diffing.   


Comments, feedback & patches always appreciated!

Craig Kulesa
Steward Observatory, Univ. of Arizona

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Message-ID: <3D10AEBC.176441BA@zip.com.au>
Date: Wed, 19 Jun 2002 18:20:07 +0200
From: Andrew Morton <a...@zip.com.au>
X-Mailer: Mozilla 4.79 [en] (X11; U; Linux 2.4.19-pre9 i686)
X-Accept-Language: en
MIME-Version: 1.0
Subject: Re: [PATCH] (1/2) reverse mapping VM for 2.5.23 (rmap-13b)
References: <Pine.LNX.4.44.0206181340380.3031-100000@loke.as.arizona.edu>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Original-Cc: linux-ker...@vger.kernel.org, linux...@kvack.org,
	Rik van Riel <r...@conectiva.com.br>
X-Original-Date: Wed, 19 Jun 2002 09:18:04 -0700
X-Original-Sender: linux-kernel-ow...@vger.kernel.org
X-Original-To: Craig Kulesa <ckul...@as.arizona.edu>
Lines: 79

Craig Kulesa wrote:
> 
> ...
> Various details for the curious or bored:
> 
>         - Tested:   UP, 16 MB < mem < 256 MB, x86 arch.
>           Untested: SMP, highmem, other archs.
> 
>           In particular, I didn't even attempt to port rmap-related
>           changes to 2.5's arch/arm/mm/mm-armv.c.
> 
>         - page_launder() is coarse and tends to clean/flush too
>           many pages at once.  This is known behavior, but seems slightly
>           worse in 2.5 for some reason.
> 
>         - pf_gfp_mask() doesn't exist in 2.5, nor does PF_NOIO.  I have
>           simply dropped the call in try_to_free_pages() in vmscan.c, but
>           there is probably a way to reinstate its logic
>           (i.e. avoid memory balancing I/O if the current task
>           can't block on I/O).  I didn't even attempt it.

That's OK.  PF_NOIO is a 2.4 "oh shit" for a loop driver deadlock.
That all just fixed itself up.

>         - Writeback:  instead of forcibly reinstating a page on the
>           inactive list when !PageActive, page->mapping, !PageDirty, and
>           !PageWriteback (see mm/page-writeback.c, fs/mpage.c), I just
>           let it go without any LRU list changes.  If the page is
>           inactive and needs attention, it'll end up on the inactive
>           dirty list soon anyway, AFAICT.  Seems okay so far, but that
>           may be flawed/sloppy reasoning... We could always look at the
>           page flags and reinstate the page to the appropriate LRU list
>           (i.e. inactive clean or dirty) if this turns out to be a
>           problem...

The thinking there was this: the 2.4 shrink_cache() code was walking the
LRU, running writepage() against dirty pages at the tail.  Each written
page was moved to the head of the LRU while under writeout, because we
can't do anything with it yet.  Get it out of the way.

When I changed that single-page writepage() into a "clustered 32-page
writeout via ->dirty_pages", the same thing had to happen: get those
pages onto the "far" end of the inactive list.

So basically, you'll need to give them the same treatment as Rik
was giving them when they were written out in vmscan.c.  Whatever
that was - it's been a while since I looked at rmap, sorry.
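
To make the rotation concrete, here is a toy user-space model of it.
This is a sketch under loud assumptions, not kernel code: struct
toy_page, lru_scan_tail() and friends are invented for illustration,
there is a single list instead of rmap's separate inactive dirty and
inactive clean lists, and "writeout" just flips two flags.

```c
#include <assert.h>
#include <stddef.h>

/* Toy page descriptor on a circular doubly linked LRU (hypothetical). */
struct toy_page {
    struct toy_page *prev, *next;
    int dirty;
    int writeback;
};

/* The list head is a dummy node: head->next is the "far" end of the
 * list, head->prev is the tail that the scanner reclaims from. */
static void lru_init(struct toy_page *head)
{
    head->prev = head->next = head;
    head->dirty = head->writeback = 0;
}

static void lru_del(struct toy_page *p)
{
    p->prev->next = p->next;
    p->next->prev = p->prev;
}

static void lru_add_head(struct toy_page *head, struct toy_page *p)
{
    p->next = head->next;
    p->prev = head;
    head->next->prev = p;
    head->next = p;
}

/* Scan up to nr_scan pages from the tail.  A dirty page is "written"
 * (marked under writeback) and rotated to the far end, so later scans
 * do not trip over it while the I/O is in flight.  Returns the number
 * of pages started on writeout. */
static int lru_scan_tail(struct toy_page *head, int nr_scan)
{
    int started = 0;
    struct toy_page *p = head->prev;

    assert(head != NULL);
    while (nr_scan-- > 0 && p != head) {
        struct toy_page *prev = p->prev;    /* remember before moving p */

        if (p->dirty) {
            p->dirty = 0;
            p->writeback = 1;
            lru_del(p);
            lru_add_head(head, p);
            started++;
        }
        p = prev;
    }
    return started;
}
```

The clustered case is the same idea applied to every page in the
cluster rather than to the one page found at the tail.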

> ...
> 
>         - To be consistent with 2.4-rmap, this patch includes a
>           minimal BIO-ified port of Andrew Morton's read-latency2 patch
>           (i.e. minus the elvtune ioctl stuff) to 2.5, from his patch
>           sets.  This adds about 7 kB to the patch.

Heh.   Probably we should not include this in your patch.  It gets
in the way of evaluating rmap.  I suggest we just suffer with the
existing IO scheduling for the while ;)

>         - The patch also includes compilation fixes:
>         (2.5.22)
>               drivers/scsi/constants.c (undeclared integer variable)
>               drivers/pci/pci-driver.c (unresolved symbol in pcmcia_core)
>         (2.5.23)
>               include/linux/smp.h (define cpu_online_map for UP)
>               kernel/ksyms.c    (export default_wake_function for modules)
>               arch/i386/i386_syms.c   (export ioremap_nocache for modules)
> 
> Hope this is of use to someone!  It's certainly been a fun and
> instructive exercise for me so far.  ;)

Good stuff, thanks.


Content-Type: text/plain; charset=US-ASCII
From: Daniel Phillips <phill...@bonn-fries.net>
Subject: Re: [PATCH] (1/2) reverse mapping VM for 2.5.23 (rmap-13b)
Date: Wed, 19 Jun 2002 19:10:10 +0200
X-Mailer: KMail [version 1.3.2]
References: <Pine.LNX.4.44.0206181340380.3031-100000@loke.as.arizona.edu>
In-Reply-To: <Pine.LNX.4.44.0206181340380.3031-100000@loke.as.arizona.edu>
MIME-Version: 1.0
Content-Transfer-Encoding: 7BIT
Message-ID: <E17KipF-0000up-00@starship>
X-Original-Cc: linux...@kvack.org, Linus Torvalds <torva...@transmeta.com>
X-Original-Date: Wed, 19 Jun 2002 19:00:57 +0200
X-Original-Sender: linux-kernel-ow...@vger.kernel.org
X-Original-To: Craig Kulesa <ckul...@as.arizona.edu>,
	linux-ker...@vger.kernel.org
Lines: 24

On Wednesday 19 June 2002 13:18, Craig Kulesa wrote:
> Where:  http://loke.as.arizona.edu/~ckulesa/kernel/rmap-vm/
>
> This patch implements Rik van Riel's patches for a reverse mapping VM 
> atop the 2.5.23 kernel infrastructure...
>
> ...Hope this is of use to someone!  It's certainly been a fun and 
> instructive exercise for me so far.  ;)

It's intensely useful.  It changes the whole character of the VM discussion 
at the upcoming kernel summit from 'should we port rmap to mainline?' to 'how 
well does it work' and 'what problems need fixing'.  Much more useful.

Your timing is impeccable.  You really need to cc Linus on this work, 
particularly your minimal, lru version.

-- 
Daniel


Date: Wed, 19 Jun 2002 19:20:08 +0200
From: Dave Jones <da...@suse.de>
Subject: Re: [PATCH] (1/2) reverse mapping VM for 2.5.23 (rmap-13b)
Message-ID: <20020619191136.H29373@suse.de>
Mail-Followup-To: Dave Jones <da...@suse.de>,
	Daniel Phillips <phill...@bonn-fries.net>,
	Craig Kulesa <ckul...@as.arizona.edu>, linux-ker...@vger.kernel.org,
	linux...@kvack.org, Linus Torvalds <torva...@transmeta.com>,
	rwh...@earthlink.net
References: <Pine.LNX.4.44.0206181340380.3031-100000@loke.as.arizona.edu> 
<E17KipF-0000up-00@starship>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <E17KipF-0000up-00@starship>; 
from phillips@bonn-fries.net on Wed, Jun 19, 2002 at 07:00:57PM +0200
X-Original-Cc: Craig Kulesa <ckul...@as.arizona.edu>,
	linux-ker...@vger.kernel.org, linux...@kvack.org,
	Linus Torvalds <torva...@transmeta.com>, rwh...@earthlink.net
X-Original-Date: Wed, 19 Jun 2002 19:11:36 +0200
X-Original-Sender: linux-kernel-ow...@vger.kernel.org
X-Original-To: Daniel Phillips <phill...@bonn-fries.net>
Lines: 24

On Wed, Jun 19, 2002 at 07:00:57PM +0200, Daniel Phillips wrote:
 > > ...Hope this is of use to someone!  It's certainly been a fun and 
 > > instructive exercise for me so far.  ;)
 > It's intensely useful.  It changes the whole character of the VM discussion 
 > at the upcoming kernel summit from 'should we port rmap to mainline?' to 'how 
 > well does it work' and 'what problems need fixing'.  Much more useful.

Absolutely.  Maybe Randy Hron (added to Cc) can find some spare time
to benchmark these sometime before the summit too[1]. It'll be very
interesting to see where it fits in with the other benchmark results
he's collected on varying workloads.

        Dave

[1] I am master of subtle hints.

-- 
| Dave Jones.        http://www.codemonkey.org.uk
| SuSE Labs

Date: Wed, 19 Jun 2002 19:40:10 +0200
From: Rik van Riel <r...@conectiva.com.br>
X-X-Sender: r...@imladris.surriel.com
Subject: Re: [PATCH] (1/2) reverse mapping VM for 2.5.23 (rmap-13b)
In-Reply-To: <20020619191136.H29373@suse.de>
Message-ID: <Pine.LNX.4.44L.0206191429300.2598-100000@imladris.surriel.com>
X-Spambait: aardv...@kernelnewbies.org
X-Spammeplease: aardv...@nl.linux.org
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
References: <20020619191136.H29373@suse.de>
X-Original-Cc: Daniel Phillips <phill...@bonn-fries.net>,
	Craig Kulesa <ckul...@as.arizona.edu>,
	<linux-ker...@vger.kernel.org>, <linux...@kvack.org>,
	Linus Torvalds <torva...@transmeta.com>, <rwh...@earthlink.net>
X-Original-Date: Wed, 19 Jun 2002 14:35:45 -0300 (BRT)
X-Original-Sender: linux-kernel-ow...@vger.kernel.org
X-Original-To: Dave Jones <da...@suse.de>
Lines: 43

On Wed, 19 Jun 2002, Dave Jones wrote:
> On Wed, Jun 19, 2002 at 07:00:57PM +0200, Daniel Phillips wrote:
>  > > ...Hope this is of use to someone!  It's certainly been a fun and
>  > > instructive exercise for me so far.  ;)
>  > It's intensely useful.  It changes the whole character of the VM discussion
>  > at the upcoming kernel summit from 'should we port rmap to mainline?' to 'how
>  > well does it work' and 'what problems need fixing'.  Much more useful.
>
> Absolutely.  Maybe Randy Hron (added to Cc) can find some spare time
> to benchmark these sometime before the summit too[1]. It'll be very
> interesting to see where it fits in with the other benchmark results
> he's collected on varying workloads.

Note that either version is still untuned and rmap for 2.5
still needs pte-highmem support.

I am encouraged by Craig's test results, which show that
rmap did a LOT less swapin IO and rmap with page aging even
less. The fact that it did too much swapout IO means one
part of the system needs tuning but doesn't say much about
the thing as a whole.

In fact, I have a feeling that our tools are still too
crude; we really need/want some statistics of what's
happening inside the VM ... I'll work on those shortly.

Once we do have the tools to look at what's happening
inside the VM we should be much better able to tune the
right places inside the VM.

regards,

Rik
-- 
Bravely reimplemented by the knights who say "NIH".

http://www.surriel.com/		http://distro.conectiva.com/


Date: Wed, 19 Jun 2002 22:00:10 +0200
From: Ingo Molnar <mi...@elte.hu>
Reply-To: Ingo Molnar <mi...@elte.hu>
Subject: Re: [PATCH] (1/2) reverse mapping VM for 2.5.23 (rmap-13b)
In-Reply-To: <Pine.LNX.4.44L.0206191429300.2598-100000@imladris.surriel.com>
Message-ID: <Pine.LNX.4.44.0206192151390.20865-100000@e2>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
References: <Pine.LNX.4.44L.0206191429300.2598-100000@imladris.surriel.com>
X-Original-Cc: Dave Jones <da...@suse.de>,
	Daniel Phillips <phill...@bonn-fries.net>,
	Craig Kulesa <ckul...@as.arizona.edu>,
	<linux-ker...@vger.kernel.org>, <linux...@kvack.org>,
	Linus Torvalds <torva...@transmeta.com>, <rwh...@earthlink.net>
X-Original-Date: Wed, 19 Jun 2002 21:53:23 +0200 (CEST)
X-Original-Sender: linux-kernel-ow...@vger.kernel.org
X-Original-To: Rik van Riel <r...@conectiva.com.br>
Lines: 21


On Wed, 19 Jun 2002, Rik van Riel wrote:

> I am encouraged by Craig's test results, which show that
> rmap did a LOT less swapin IO and rmap with page aging even
> less. The fact that it did too much swapout IO means one
> part of the system needs tuning but doesn't say much about
> the thing as a whole.

btw., isn't there a fair chance that by 'fixing' the aging+rmap code to
swap out less, you'll ultimately swap in more? [because the extra swapout
likely ended up freeing up RAM as well, which in turn decreases the amount
of thrashing.]

	Ingo


Date: Wed, 19 Jun 2002 22:30:10 +0200
From: Craig Kulesa <ckul...@as.arizona.edu>
Subject: Re: [PATCH] (1/2) reverse mapping VM for 2.5.23 (rmap-13b)
In-Reply-To: <Pine.LNX.4.44.0206192151390.20865-100000@e2>
Message-ID: <Pine.LNX.4.44.0206191310590.4292-100000@loke.as.arizona.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
References: <Pine.LNX.4.44.0206192151390.20865-100000@e2>
X-Original-Cc: Rik van Riel <r...@conectiva.com.br>, Dave Jones <da...@suse.de>,
	Daniel Phillips <phill...@bonn-fries.net>,
	<linux-ker...@vger.kernel.org>, <linux...@kvack.org>,
	Linus Torvalds <torva...@transmeta.com>, <rwh...@earthlink.net>
X-Original-Date: Wed, 19 Jun 2002 13:21:29 -0700 (MST)
X-Original-Sender: linux-kernel-ow...@vger.kernel.org
X-Original-To: Ingo Molnar <mi...@elte.hu>
Lines: 22


On Wed, 19 Jun 2002, Ingo Molnar wrote:

> btw., isn't there a fair chance that by 'fixing' the aging+rmap code to
> swap out less, you'll ultimately swap in more? [because the extra swapout
> likely ended up freeing up RAM as well, which in turn decreases the amount
> of thrashing.]

Agreed.  Heightened swapout (in this rather simplified example) isn't a 
problem in itself, unless it really turns out to be a bottleneck in a 
wide variety of loads, as long as the *right* pages are being swapped 
and don't have to be paged right back in again.   

I'll try a more varied set of tests tonight, with cpu usage tabulated.

-Craig


Date: Wed, 19 Jun 2002 22:30:19 +0200
From: Linus Torvalds <torva...@transmeta.com>
Subject: Re: [PATCH] (1/2) reverse mapping VM for 2.5.23 (rmap-13b)
In-Reply-To: <Pine.LNX.4.44.0206191310590.4292-100000@loke.as.arizona.edu>
Message-ID: <Pine.LNX.4.33.0206191322480.2638-100000@penguin.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
References: <Pine.LNX.4.44.0206191310590.4292-100000@loke.as.arizona.edu>
X-Original-Cc: Ingo Molnar <mi...@elte.hu>,
	Rik van Riel <r...@conectiva.com.br>, Dave Jones <da...@suse.de>,
	Daniel Phillips <phill...@bonn-fries.net>,
	<linux-ker...@vger.kernel.org>, <linux...@kvack.org>,
	<rwh...@earthlink.net>
X-Original-Date: Wed, 19 Jun 2002 13:24:55 -0700 (PDT)
X-Original-Sender: linux-kernel-ow...@vger.kernel.org
X-Original-To: Craig Kulesa <ckul...@as.arizona.edu>
Lines: 21


On Wed, 19 Jun 2002, Craig Kulesa wrote:
> 
> I'll try a more varied set of tests tonight, with cpu usage tabulated.

Please do a few non-swap tests too. 

Swapping is the thing that rmap is supposed to _help_, so improvements in
that area are good (and had better happen!), but if you're only looking at
the swap performance, you're ignoring the known problems with rmap, ie the
cases where non-rmap kernels do really well.

Comparing one but not the other doesn't give a very balanced picture..

		Linus


Date: Thu, 20 Jun 2002 14:20:07 +0200
From: Craig Kulesa <ckul...@as.arizona.edu>
Subject: [PATCH] Updated rmap VM for 2.5.23 (SMP, preempt fixes)
In-Reply-To: <Pine.LNX.4.44.0206181340380.3031-100000@loke.as.arizona.edu>
Message-ID: <Pine.LNX.4.44.0206200451590.4448-100000@loke.as.arizona.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
References: <Pine.LNX.4.44.0206181340380.3031-100000@loke.as.arizona.edu>
X-Original-Cc: linux...@kvack.org
X-Original-Date: Thu, 20 Jun 2002 05:08:37 -0700 (MST)
X-Original-Sender: linux-kernel-ow...@vger.kernel.org
X-Original-To: linux-ker...@vger.kernel.org
Lines: 52



Updated patches have been uploaded that fix significant bugs in the rmap 
implementations posted yesterday.  Please use the NEW patches (with "-2" 
appended to the filename) instead.  ;)

In particular, neither patch was preempt-safe; thanks go to William Irwin 
for catching it.  A spinlocking bug that kept SMP-builds from booting was 
tripped across by Steven Cole; it affects the big rmap13b patch but not 
the minimal one.  That should be fixed now too.  If it breaks for you, I 
want to know about it! :)


Here's the changelog:

	2.5.23-rmap-2:  rmap on top of the 2.5.23 VM

		- Make pte_chain_lock() and pte_chain_unlock() 
		  preempt-safe  (thanks to wli for pointing this out)


	2.5.23-rmap13b-2:  Rik's full rmap patch, applied to 2.5.23

		- Make pte_chain_lock() and pte_chain_unlock() 
		  preempt-safe  (thanks to wli for pointing this out)

		- Allow an SMP-enabled kernel to boot!  Change bogus
		  spin_lock(&mapping->page_lock) invocations to either
		  read_lock() or write_lock().  This alters drop_behind()
		  in readahead.c, and reclaim_page() in vmscan.c. 

		- Keep page_launder_zone from blocking on recently written 
		  data by putting clustered writeback pages back at the 
		  beginning of the inactive dirty list.  This touches 
		  mm/page-writeback.c and fs/mpage.c.  Thanks go to Andrew 
		  Morton for clearing this issue up for me.

		- Back out Andrew's read-latency2 changes at his 
		  suggestion; they distract from the issue of evaluating 
		  rmap.  Thus, we are now using the unmodified 2.5.23 
		  IO scheduler.  
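
For anyone wondering what "preempt-safe" means for a bit lock like the
pte chain lock, here is a user-space illustration.  It is a sketch and
not the patch itself: it assumes the chain lock is a single bit in a
page flags word, and it uses a plain counter (toy_preempt_count) as a
stand-in for the kernel's preempt_disable()/preempt_enable().  All
names below are made up.

```c
#include <assert.h>
#include <stdatomic.h>

#define TOY_PG_CHAINLOCK 0x1UL    /* hypothetical lock bit in the flags word */

struct toy_chain_page {
    atomic_ulong flags;
};

/* Stand-in for the preemption counter; nonzero means "do not preempt". */
static int toy_preempt_count;

static void toy_pte_chain_lock(struct toy_chain_page *page)
{
    /* Disable preemption BEFORE taking the bit lock.  Otherwise the
     * holder could be preempted and another task on the same CPU
     * could spin on the bit with nobody running to release it. */
    toy_preempt_count++;
    while (atomic_fetch_or(&page->flags, TOY_PG_CHAINLOCK) & TOY_PG_CHAINLOCK)
        ;    /* bit was already set: spin until the holder clears it */
}

static void toy_pte_chain_unlock(struct toy_chain_page *page)
{
    atomic_fetch_and(&page->flags, ~TOY_PG_CHAINLOCK);
    toy_preempt_count--;    /* re-enable only after the bit is released */
}
```

The ordering is the point: the counter goes up before the bit is
taken and comes down only after the bit is cleared.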


FYI, these are the patches that I will benchmark in the next email.

-Craig 


Date: Thu, 20 Jun 2002 14:30:15 +0200
From: Craig Kulesa <ckul...@as.arizona.edu>
Subject: VM benchmarks for 2.5 (mainline & rmap patches)
In-Reply-To: <Pine.LNX.4.33.0206191322480.2638-100000@penguin.transmeta.com>
Message-ID: <Pine.LNX.4.44.0206200511450.4448-100000@loke.as.arizona.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
References: <Pine.LNX.4.33.0206191322480.2638-100000@penguin.transmeta.com>
X-Original-Cc: Ingo Molnar <mi...@elte.hu>,
	Rik van Riel <r...@conectiva.com.br>, Dave Jones <da...@suse.de>,
	Daniel Phillips <phill...@bonn-fries.net>,
	<linux-ker...@vger.kernel.org>, <linux...@kvack.org>,
	<rwh...@earthlink.net>
X-Original-Date: Thu, 20 Jun 2002 05:25:41 -0700 (MST)
X-Original-Sender: linux-kernel-ow...@vger.kernel.org
X-Original-To: Linus Torvalds <torva...@transmeta.com>
Lines: 190



Following is a short sample of simple benchmarks that I used to test 
2.5.23 and the two rmap-based variants.  The tests are being run on a 
uniprocessor PII/333 IBM Thinkpad 390E with 192 MB of RAM and using 
ext3 in data=writeback journalling mode.  Randy Hron can do a much 
better job of this on "real hardware", but this is a start.  ;)


Here are the kernels:

2.5.1-pre1:	totally vanilla, from the beginning of the 2.5 tree
2.5.23: 	"almost vanilla", modified only to make it compile
2.5.23-rmap:  	very simple rmap patch atop the 2.5.23 classzone VM logic
2.5.23-rmap13b:	Rik's rmap patch using his multiqueue page-aging VM

Here we go...

-------------------------------------------------------------------

Test 1: (non-swap) 'time make -j2 bzImage' for 2.5.23 tree, config at
	the rmap patch site (bottom of this email).  This is mostly a 
	fastpath test.  Fork, exec, substantive memory allocation and 
	use, but no swap allocation.  Record 'time' output.

2.5.1-pre1:     1145.450u 74.290s 20:58.40 96.9%   0+0k 0+0io 1270393pf+0w
2.5.23:         1153.490u 79.380s 20:58.79 97.9%   0+0k 0+0io 1270393pf+0w
2.5.23-rmap:    1149.840u 83.350s 21:01.37 97.7%   0+0k 0+0io 1270393pf+0w
2.5.23-rmap13b: 1145.930u 83.640s 20:53.16 98.1%   0+0k 0+0io 1270393pf+0w

Comments: You can see the rmap overhead in the system times, but it
	  doesn't really pan out in the wall clock time, at least for
	  rmap13b.  Maybe for minimal rmap.

	  Note that system times increased from 2.5.1 to 2.5.23, but
	  that's not evident on the wall clock.

	  These tests are with ext3 in writeback mode, so we're doing
	  direct-to-BIO for a lot of stuff.  It's presumably not the
	  BIO/bh duplication of effort, at least as much as it has been...
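
If you want to pull the system-time overhead out of lines like the ones
above, something along these lines should do.  It is a rough sketch:
parse_time_line() and cpu_fraction() are names I made up, and it only
handles the "<user>u <sys>s <min>:<sec>" layout shown in these results
(no hours field).

```c
#include <assert.h>
#include <stdio.h>

/* One parsed line of csh-style `time` output, all values in seconds. */
struct time_line {
    double user_s;
    double sys_s;
    double wall_s;
};

/* Returns 0 on success, -1 if the line does not match the layout
 * "<user>u <sys>s <min>:<sec> ...". */
static int parse_time_line(const char *line, struct time_line *out)
{
    double user, sys, sec;
    int min;

    assert(line != NULL && out != NULL);
    if (sscanf(line, "%lfu %lfs %d:%lf", &user, &sys, &min, &sec) != 4)
        return -1;
    out->user_s = user;
    out->sys_s = sys;
    out->wall_s = min * 60.0 + sec;
    return 0;
}

/* Fraction of wall clock time spent on the CPU (user + system). */
static double cpu_fraction(const struct time_line *t)
{
    return (t->user_s + t->sys_s) / t->wall_s;
}
```

Feeding it the 2.5.23 and 2.5.23-rmap13b lines gives a system-time
ratio of 83.640 / 79.380, i.e. roughly 5% more system time for
rmap13b, while cpu_fraction() reproduces the percentage column.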

---------------------------------------------------------------------

Test 2: 'time make -j32 bzImage' for 2.5.23, only building fs/ mm/ ipc/ 
	init/ and kernel/.  Same as above, but push the kernel into swap.  
	Record time and vmstat output.

2.5.23:          193.260u 17.540s 3:49.86 93.5%  0+0k 0+0io 223130pf+0w
		 Total kernel swapouts during test = 143992 kB
		 Total kernel swapins during test  = 188244 kB

2.5.23-rmap:     190.390u 17.310s 4:03.16 85.4%  0+0k 0+0io 220703pf+0w
		 Total kernel swapouts during test = 141700 kB
		 Total kernel swapins during test  = 162784 kB

2.5.23-rmap13b:  189.120u 16.670s 3:36.68 94.7%  0+0k 0+0io 219363pf+0w
		 Total kernel swapouts during test =  87736 kB
		 Total kernel swapins during test  =  18576 kB

Comments:  rmap13b is the real winner here.  Swap access is enormously
	   lower than with mainline or the minimal rmap patch.  The
	   minimal rmap patch swaps a bit less than mainline, but is 
	   definitely wasting its time somewhere (it has the longest
	   wall clock time of the three)...

	   Wall clock times are not as variable as swap access
	   between the kernels, but the trends do hold.
	   
	   It is valuable to note that this is a laptop hard drive
	   with the usual awful seek times.  If swap reads are
	   fragmented all-to-hell with rmap, with lots of disk seeks
	   necessary, we're still coming out ahead when we minimize
	   swap reads! 
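
To put numbers on "enormously lower", the swap figures from Test 2 can
be totalled like this.  Just a sketch: the struct and helper names are
invented, and swapin-per-swapout is only a crude hint at how often an
evicted page was wanted again, not a proper VM statistic.

```c
#include <assert.h>

/* Swap traffic for one kernel during a test run, in kB. */
struct swap_stats {
    const char *kernel;
    long swapout_kb;
    long swapin_kb;
};

/* Total swap I/O in both directions. */
static long swap_total_kb(const struct swap_stats *s)
{
    return s->swapout_kb + s->swapin_kb;
}

/* kB swapped in per kB swapped out. */
static double swapin_per_swapout(const struct swap_stats *s)
{
    assert(s->swapout_kb > 0);
    return (double)s->swapin_kb / (double)s->swapout_kb;
}

/* Figures copied from the Test 2 results above. */
static const struct swap_stats test2[] = {
    { "2.5.23",         143992, 188244 },
    { "2.5.23-rmap",    141700, 162784 },
    { "2.5.23-rmap13b",  87736,  18576 },
};
```

The totals come to 332236 kB for 2.5.23, 304484 kB for the minimal
rmap patch and 106312 kB for rmap13b, so rmap13b moves about a third
of the swap traffic that mainline does in this test.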

---------------------------------------------------------------------

Test 3: (non-swap) dbench 1,2,4,8 ... just because everyone else does...

2.5.1:
Throughput 31.8967 MB/sec (NB=39.8709 MB/sec  318.967 MBit/sec)  1 procs
1.610u 2.120s 0:05.14 72.5%     0+0k 0+0io 129pf+0w
Throughput 33.0695 MB/sec (NB=41.3369 MB/sec  330.695 MBit/sec)  2 procs
3.490u 4.000s 0:08.99 83.3%     0+0k 0+0io 152pf+0w
Throughput 31.4901 MB/sec (NB=39.3626 MB/sec  314.901 MBit/sec)  4 procs
6.900u 8.290s 0:17.78 85.4%     0+0k 0+0io 198pf+0w
Throughput 15.4436 MB/sec (NB=19.3045 MB/sec  154.436 MBit/sec)  8 procs
13.780u 16.750s 1:09.38 44.0%   0+0k 0+0io 290pf+0w

2.5.23:
Throughput 35.1563 MB/sec (NB=43.9454 MB/sec  351.563 MBit/sec)  1 procs
1.710u 1.990s 0:04.76 77.7%     0+0k 0+0io 130pf+0w
Throughput 33.237 MB/sec (NB=41.5463 MB/sec  332.37 MBit/sec)  2 procs
3.430u 4.050s 0:08.95 83.5%     0+0k 0+0io 153pf+0w
Throughput 28.9504 MB/sec (NB=36.188 MB/sec  289.504 MBit/sec)  4 procs
6.780u 8.090s 0:19.24 77.2%     0+0k 0+0io 199pf+0w
Throughput 17.1113 MB/sec (NB=21.3891 MB/sec  171.113 MBit/sec)  8 procs
13.810u 21.870s 1:02.73 56.8%   0+0k 0+0io 291pf+0w

2.5.23-rmap:
Throughput 34.9151 MB/sec (NB=43.6439 MB/sec  349.151 MBit/sec)  1 procs
1.770u 1.940s 0:04.78 77.6%     0+0k 0+0io 133pf+0w
Throughput 33.875 MB/sec (NB=42.3437 MB/sec  338.75 MBit/sec)  2 procs
3.450u 4.000s 0:08.80 84.6%     0+0k 0+0io 156pf+0w
Throughput 29.6639 MB/sec (NB=37.0798 MB/sec  296.639 MBit/sec)  4 procs
6.640u 8.270s 0:18.81 79.2%     0+0k 0+0io 202pf+0w
Throughput 15.7686 MB/sec (NB=19.7107 MB/sec  157.686 MBit/sec)  8 procs
14.060u 21.850s 1:07.97 52.8%   0+0k 0+0io 294pf+0w

2.5.23-rmap13b:
Throughput 35.1443 MB/sec (NB=43.9304 MB/sec  351.443 MBit/sec)  1 procs
1.800u 1.930s 0:04.76 78.3%     0+0k 0+0io 132pf+0w
Throughput 33.9223 MB/sec (NB=42.4028 MB/sec  339.223 MBit/sec)  2 procs
3.280u 4.100s 0:08.79 83.9%     0+0k 0+0io 155pf+0w
Throughput 25.0807 MB/sec (NB=31.3509 MB/sec  250.807 MBit/sec)  4 procs
6.990u 7.910s 0:22.09 67.4%     0+0k 0+0io 202pf+0w
Throughput 14.1789 MB/sec (NB=17.7236 MB/sec  141.789 MBit/sec)  8 procs
13.780u 17.830s 1:15.52 41.8%   0+0k 0+0io 293pf+0w


Comments:  Stock 2.5 has gotten faster since the tree began.  That's
	   good.  Rmap patches don't affect this for small numbers of
	   processes, but do show a small slowdown by the time we
	   reach 'dbench 8'.

---------------------------------------------------------------------

Test 4: (non-swap) cached (first) value from 'hdparm -Tt /dev/hda'

2.5.1-pre1: 	76.89 MB/sec
2.5.23:		75.99 MB/sec
2.5.23-rmap:	77.85 MB/sec
2.5.23-rmap13b:	76.58 MB/sec

Comments:  Within the statistical noise, no rmap slowdown in cached hdparm 
	   scores.  Otherwise not much to see here.

---------------------------------------------------------------------

Test 5: (non-swap) forkbomb test.  Fork() and malloc() lots of times.
	This is supposed to be one of rmap's Achilles' heels.
	The first line results from forking 10000 times with
	10000*sizeof(int) allocations.  The second is from 1 million
	forks with 1000*sizeof(int) allocations.  Average a large
	number of tests for the final results.

2.5.1-pre1:	0.000u 0.120s 0:12.66 0.9%      0+0k 0+0io 71pf+0w
		0.010u 0.100s 0:01.24 8.8%      0+0k 0+0io 70pf+0w

2.5.23:		0.000u 0.260s 0:12.96 2.0%      0+0k 0+0io 71pf+0w
		0.010u 0.220s 0:01.31 17.5%     0+0k 0+0io 71pf+0w

2.5.23-rmap:	0.000u 0.400s 0:13.19 3.0%      0+0k 0+0io 71pf+0w
		0.000u 0.250s 0:01.43 17.4%     0+0k 0+0io 71pf+0w

2.5.23-rmap13b:	0.000u 0.360s 0:13.36 2.4%      0+0k 0+0io 71pf+0w
		0.000u 0.250s 0:01.46 17.1%     0+0k 0+0io 71pf+0w


Comments:  The rmap overhead shows up here at the 2-3% level in the
	   first test, and 9-11% in the second, versus 2.5.23.  
	   This makes sense, as fork() activity is higher in the
	   second test. 
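
The overhead percentages quoted above can be reproduced from the Test 5
timings with a short Python sketch.  It assumes the percentages refer to
elapsed (wall clock) time relative to 2.5.23; the helper names
elapsed_seconds and overhead_pct are hypothetical and are not part of the
forkbomb program itself.

```python
def elapsed_seconds(stamp):
    """Convert an 'm:ss.ss' elapsed time from the time output to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


# Elapsed times copied from the Test 5 results: (first test, second test).
ELAPSED = {
    "2.5.23":         ("0:12.96", "0:01.31"),
    "2.5.23-rmap":    ("0:13.19", "0:01.43"),
    "2.5.23-rmap13b": ("0:13.36", "0:01.46"),
}


def overhead_pct(kernel, test, baseline="2.5.23"):
    """Wall clock increase of `kernel` over `baseline` for test 0 or 1, in percent."""
    base = elapsed_seconds(ELAPSED[baseline][test])
    return 100.0 * (elapsed_seconds(ELAPSED[kernel][test]) - base) / base


if __name__ == "__main__":
    for kernel in ("2.5.23-rmap", "2.5.23-rmap13b"):
        print(f"{kernel}: {overhead_pct(kernel, 0):.1f}% / {overhead_pct(kernel, 1):.1f}%")
```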

	   Strangely, mainline 2.5 also shows an increase (??) in
	   overhead, at about the same level, from 2.5.1 to present.

	   This silly little program is available with the rmap
	   patches at:  
		http://loke.as.arizona.edu/~ckulesa/kernel/rmap-vm/
	
---------------------------------------------------------------------


Hope this provides some useful food for thought.  

I'm sure it reassures Rik that my simple hack of rmap onto the
classzone VM isn't nearly as balanced as the first benchmark suggested
it was. ;) But it might make a good base to start from, and that's 
actually the point of the exercise. :) 


That's all. <yawns>  Bedtime. :)

Craig Kulesa
Steward Observatory, Univ. of Arizona

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Date: Thu, 20 Jun 2002 14:50:10 +0200
From: Craig Kulesa <ckul...@as.arizona.edu>
Subject: Re: [PATCH] Updated rmap VM for 2.5.23 (SMP, preempt fixes)
In-Reply-To: <Pine.LNX.4.44.0206200451590.4448-100000@loke.as.arizona.edu>
Message-ID: <Pine.LNX.4.44.0206200544200.4448-100000@loke.as.arizona.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it (1.20)
NNTP-Posting-Host: a.539.anti-phl.bofh.it
Newsgroups: linux.kernel
Organization: linux.*_mail_to_news_unidirectional_gateway
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!212.177.105.133!news.mailgate.org!
bofh.it!robomod
References: <Pine.LNX.4.44.0206200451590.4448-100000@loke.as.arizona.edu>
X-Original-Cc: linux...@kvack.org
X-Original-Date: Thu, 20 Jun 2002 05:45:15 -0700 (MST)
X-Original-Sender: linux-kernel-ow...@vger.kernel.org
X-Original-To: linux-ker...@vger.kernel.org
Lines: 12


Uhhhh... forgot to mention where to get them:
	http://loke.as.arizona.edu/~ckulesa/kernel/rmap-vm/

<sighs>
-Craig


Date: Mon, 24 Jun 2002 17:40:05 +0200
From: Rik van Riel <r...@conectiva.com.br>
X-X-Sender: r...@imladris.surriel.com
Subject: Re: [PATCH] (1/2) reverse mapping VM for 2.5.23 (rmap-13b)
In-Reply-To: <Pine.LNX.4.44.0206192151390.20865-100000@e2>
Message-ID: <Pine.LNX.4.44L.0206241200310.3937-100000@imladris.surriel.com>
X-Spambait: aardv...@kernelnewbies.org
X-Spammeplease: aardv...@nl.linux.org
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it (1.20)
NNTP-Posting-Host: a.812.anti-phl.bofh.it
Newsgroups: linux.kernel
Organization: linux.*_mail_to_news_unidirectional_gateway
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!212.177.105.133!news.mailgate.org!
bofh.it!robomod
References: <Pine.LNX.4.44.0206192151390.20865-100000@e2>
X-Original-Cc: Dave Jones <da...@suse.de>,
	Daniel Phillips <phill...@bonn-fries.net>,
	Craig Kulesa <ckul...@as.arizona.edu>,
	<linux-ker...@vger.kernel.org>, <linux...@kvack.org>,
	Linus Torvalds <torva...@transmeta.com>, <rwh...@earthlink.net>
X-Original-Date: Mon, 24 Jun 2002 12:02:22 -0300 (BRT)
X-Original-Sender: linux-kernel-ow...@vger.kernel.org
X-Original-To: Ingo Molnar <mi...@elte.hu>
Lines: 35

On Wed, 19 Jun 2002, Ingo Molnar wrote:
> On Wed, 19 Jun 2002, Rik van Riel wrote:
>
> > I am encouraged by Craig's test results, which show that
> > rmap did a LOT less swapin IO and rmap with page aging even
> > less. The fact that it did too much swapout IO means one
> > part of the system needs tuning but doesn't say much about
> > the thing as a whole.
>
> btw., isn't there a fair chance that by 'fixing' the aging+rmap code to
> swap out less, you'll ultimately swap in more? [because the extra swapout
> likely ended up freeing up RAM as well, which in turn decreases the amount
> of thrashing.]

Possibly, but I expect the 'extra' swapouts to be caused
by page_launder writing out too many pages at once and not
just the ones it wants to free.

Cleaning pages and freeing them are separate operations;
what is missing is a mechanism to clean enough pages but
not all inactive pages at once ;)
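
The distinction between cleaning and freeing can be illustrated with a
toy Python model.  This is only a sketch under simplifying assumptions
and is not the kernel's page_launder(): the inactive list is a plain list
of dirty flags, writeback is treated as instantaneous, and the function
name launder and its bounded parameter are made up for the illustration.

```python
def launder(inactive, pages_to_free, bounded):
    """Toy pass over an inactive list; returns (pages_written, pages_freed).

    inactive      -- list of booleans, True meaning the page is dirty
    pages_to_free -- how many pages the caller actually wants freed
    bounded       -- if True, stop as soon as enough pages have been freed;
                     if False, write out every dirty page on the list
    """
    written = freed = 0
    for dirty in inactive:
        if bounded and freed >= pages_to_free:
            break
        if dirty:
            written += 1      # page must be cleaned (written to swap) first
        if freed < pages_to_free:
            freed += 1        # page is clean now, so it can be freed
    return written, freed


if __name__ == "__main__":
    inactive = [True] * 80 + [False] * 20   # hypothetical list: 80 dirty, 20 clean
    print(launder(inactive, 30, bounded=False))
    print(launder(inactive, 30, bounded=True))
```

In this toy case both passes free the same 30 pages, but the unbounded
pass writes all 80 dirty pages while the bounded one writes only 30.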

regards,

Rik
-- 
Bravely reimplemented by the knights who say "NIH".

http://www.surriel.com/		http://distro.conectiva.com/


Date: Mon, 24 Jun 2002 23:40:12 +0200
From: "Martin J. Bligh" <Martin.Bl...@us.ibm.com>
Subject: Re: [PATCH] (1/2) reverse mapping VM for 2.5.23 (rmap-13b)
Message-ID: <6660000.1024954471@flay>
In-Reply-To: <Pine.LNX.4.33.0206191322480.2638-100000@penguin.transmeta.com>
References: <Pine.LNX.4.33.0206191322480.2638-100000@penguin.transmeta.com>
X-Mailer: Mulberry/2.1.2 (Linux/x86)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it (1.20)
NNTP-Posting-Host: a.719.anti-phl.bofh.it
Newsgroups: linux.kernel
Organization: linux.*_mail_to_news_unidirectional_gateway
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!nntp.infostrada.it!
bofh.it!robomod
X-Original-Cc: Ingo Molnar <mi...@elte.hu>,
	Rik van Riel <r...@conectiva.com.br>, Dave Jones <da...@suse.de>,
	Daniel Phillips <phill...@bonn-fries.net>,
	linux-ker...@vger.kernel.org, linux...@kvack.org,
	rwh...@earthlink.net
X-Original-Date: Mon, 24 Jun 2002 14:34:31 -0700
X-Original-Sender: linux-kernel-ow...@vger.kernel.org
X-Original-To: Linus Torvalds <torva...@transmeta.com>,
	Craig Kulesa <ckul...@as.arizona.edu>
Lines: 31

>> I'll try a more varied set of tests tonight, with cpu usage tabulated.
> 
> Please do a few non-swap tests too. 
> 
> Swapping is the thing that rmap is supposed to _help_, so improvements in
> that area are good (and had better happen!), but if you're only looking at
> the swap performance, you're ignoring the known problems with rmap, ie the
> cases where non-rmap kernels do really well.
> 
> Comparing one but not the other doesn't give a very balanced picture..

It would also be interesting to see memory consumption figures for a benchmark 
with many large processes. With this type of load, memory consumption 
through PTEs is already a problem - as far as I can see, rmap triples the 
memory requirement of PTEs through the PTE chain's doubly linked list 
(an additional 8 bytes per entry) ... perhaps my calculations are wrong?  
This is a particular problem for databases that tend to have thousands of
processes attached to a large shared memory area.

A quick rough calculation indicates that the Oracle test I was helping out 
with was consuming almost 10Gb of PTEs without rmap - 30Gb for overhead 
doesn't sound like fun to me ;-(
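
The arithmetic behind the "triples" estimate can be sketched in Python,
taking the figures in this message at face value.  The 4-byte PTE size
is an assumption (32-bit x86 without PAE), the 8 bytes per entry come
from the text above, and the workload numbers (5000 processes, 2 GiB
shared area, 4 kB pages) are purely hypothetical values chosen to land
near 10 GB; they are not the actual Oracle configuration.

```python
PTE_BYTES = 4        # assumption: 32-bit x86 page table entry, no PAE
PTE_CHAIN_BYTES = 8  # from the message: extra bytes per entry for the PTE chain
GiB = 1024 ** 3


def rmap_multiplier(pte_bytes=PTE_BYTES, chain_bytes=PTE_CHAIN_BYTES):
    """Factor by which per-PTE memory grows once the chain entry is added."""
    return (pte_bytes + chain_bytes) / pte_bytes


def pte_bytes_total(processes, shared_bytes, page_bytes=4096, pte_bytes=PTE_BYTES):
    """PTE memory when every process maps the whole shared area with its own PTEs."""
    return processes * (shared_bytes // page_bytes) * pte_bytes


if __name__ == "__main__":
    # Hypothetical workload, for illustration only.
    without_rmap = pte_bytes_total(processes=5000, shared_bytes=2 * GiB)
    with_rmap = without_rmap * rmap_multiplier()
    print(f"without rmap: {without_rmap / GiB:.1f} GiB")
    print(f"with rmap:    {with_rmap / GiB:.1f} GiB")
```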

M.



Date: Mon, 24 Jun 2002 23:50:04 +0200
From: Rik van Riel <r...@conectiva.com.br>
X-X-Sender: r...@imladris.surriel.com
Subject: Re: [PATCH] (1/2) reverse mapping VM for 2.5.23 (rmap-13b)
In-Reply-To: <6660000.1024954471@flay>
Message-ID: <Pine.LNX.4.44L.0206241837400.18418-100000@imladris.surriel.com>
X-Spambait: aardv...@kernelnewbies.org
X-Spammeplease: aardv...@nl.linux.org
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it (1.20)
NNTP-Posting-Host: a.523.anti-phl.bofh.it
Newsgroups: linux.kernel
Organization: linux.*_mail_to_news_unidirectional_gateway
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!nntp.infostrada.it!
bofh.it!robomod
References: <6660000.1024954471@flay>
X-Original-Cc: Linus Torvalds <torva...@transmeta.com>,
	Craig Kulesa <ckul...@as.arizona.edu>, Ingo Molnar <mi...@elte.hu>,
	Dave Jones <da...@suse.de>,
	Daniel Phillips <phill...@bonn-fries.net>,
	<linux-ker...@vger.kernel.org>, <linux...@kvack.org>,
	<rwh...@earthlink.net>
X-Original-Date: Mon, 24 Jun 2002 18:39:51 -0300 (BRT)
X-Original-Sender: linux-kernel-ow...@vger.kernel.org
X-Original-To: "Martin J. Bligh" <Martin.Bl...@us.ibm.com>
Lines: 29

On Mon, 24 Jun 2002, Martin J. Bligh wrote:

> A quick rough calculation indicates that the Oracle test I was helping
> out with was consuming almost 10Gb of PTEs without rmap - 30Gb for
> overhead doesn't sound like fun to me ;-(

10 GB is already bad enough that rmap isn't so much causing
a problem as worsening an already intolerable one.

For the large SHM segment you'd probably want to either use
large pages or shared page tables ... in each of these cases
the rmap overhead will disappear together with the page table
overhead.

Now we just need volunteers for the implementation ;)

kind regards,

Rik
-- 
Bravely reimplemented by the knights who say "NIH".

http://www.surriel.com/		http://distro.conectiva.com/


Date: Tue, 25 Jun 2002 00:00:10 +0200
From: "Martin J. Bligh" <Martin.Bl...@us.ibm.com>
Reply-To: "Martin J. Bligh" <Martin.Bl...@us.ibm.com>
Subject: Re: [PATCH] (1/2) reverse mapping VM for 2.5.23 (rmap-13b)
Message-ID: <752101.1024930574@mbligh.des.sequent.com>
In-Reply-To: <Pine.LNX.4.44L.0206241837400.18418-100000@imladris.surriel.com>
References: <Pine.LNX.4.44L.0206241837400.18418-100000@imladris.surriel.com>
X-Mailer: Mulberry/2.1.2 (Win32)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it (1.20)
NNTP-Posting-Host: a.402.anti-phl.bofh.it
Newsgroups: linux.kernel
Organization: linux.*_mail_to_news_unidirectional_gateway
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!newsfeed.icl.net!
news.mailgate.org!bofh.it!robomod
X-Original-Cc: Linus Torvalds <torva...@transmeta.com>,
	Craig Kulesa <ckul...@as.arizona.edu>, Ingo Molnar <mi...@elte.hu>,
	Dave Jones <da...@suse.de>,
	Daniel Phillips <phill...@bonn-fries.net>,
	linux-ker...@vger.kernel.org, linux...@kvack.org,
	rwh...@earthlink.net
X-Original-Date: Mon, 24 Jun 2002 14:56:14 -0700
X-Original-Sender: linux-kernel-ow...@vger.kernel.org
X-Original-To: Rik van Riel <r...@conectiva.com.br>
Lines: 22

>> A quick rough calculation indicates that the Oracle test I was helping
>> out with was consuming almost 10Gb of PTEs without rmap - 30Gb for
>> overhead doesn't sound like fun to me ;-(
> 
> 10 GB is already bad enough that rmap isn't so much causing
> a problem as worsening an already intolerable one.

Yup, I'm not denying there's a large existing problem there, but 
at least we can fit it into memory right now. Just something to bear
in mind when you're benchmarking.

> Now we just need volunteers for the implementation ;)

We have some people looking at it already, but it's not the world's
most trivial problem to solve ;-)

M.