Path: archiver1.google.com!newsfeed.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!ifi.uio.no!
internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
X-Authentication-Warning: flashdance.cx: iocc owned process doing -bs
Original-Date: 	Sun, 16 Sep 2001 00:43:35 +0200 (CEST)
From: Peter Magnusson <i...@flashdance.nothanksok.cx>
X-X-Sender:  <iocc@flashdance>
To: <linux-ker...@vger.kernel.org>
Subject: broken VM in 2.4.10-pre9
Original-Message-ID: <Pine.LNX.4.33L2.0109160031500.7740-100000@flashdance>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sat, 15 Sep 2001 22:47:15 GMT
Message-ID: <fa.hqeo6fv.95ejog@ifi.uio.no>
Lines: 27

2.4.7: good VM
2.4.8: not good
2.4.9: not good!!!++
2.4.10-pre4: quite ok VM, but put little more on the swap than 2.4.7
2.4.10-pre8: not good
2.4.10-pre9: not good ... Linux didnt had used any swap at all, then i
             unrared two very large files at the same time. And now 104
             Mbyte swap is used! :-( 2.4.7 didnt do like this.
             Best is to use the swap as little as possible.

My cfg:

Real mem: 512684K (512 Mbyte)
Swap    : 257032K
compiled with: gcc version 2.96 20000731 (Linux-Mandrake 8.0 2.96-0.48mdk)


!! remove "nothanksok." from my email if you want to reply to me !!




-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
X-Authentication-Warning: palladium.transmeta.com: 
mail set sender to n...@transmeta.com using -f
To: linux-ker...@vger.kernel.org
From: torva...@transmeta.com (Linus Torvalds)
Subject: Re: broken VM in 2.4.10-pre9
Original-Date: 	Sun, 16 Sep 2001 05:31:11 +0000 (UTC)
Original-Message-ID: <9o1dev$23l$1@penguin.transmeta.com>
Original-References: <Pine.LNX.4.33L2.0109160031500.7740-100000@flashdance>
X-Complaints-To: news@transmeta.com
Original-NNTP-Posting-Date: 16 Sep 2001 05:31:55 GMT
Cache-Post-Path: palladium.transmeta.com!unkn...@penguin.transmeta.com
X-Cache: nntpcache 2.4.0b5 (see http://www.nntpcache.org/)
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Transmeta Corporation
Date: Sun, 16 Sep 2001 05:37:20 GMT
Message-ID: <fa.idm5jpv.92iiqf@ifi.uio.no>
References: <fa.hqeo6fv.95ejog@ifi.uio.no>
Lines: 40

In article <Pine.LNX.4.33L2.0109160031500.7740-100000@flashdance>,
Peter Magnusson  <i...@flashdance.nothanksok.cx> wrote:
>
>2.4.10-pre4: quite ok VM, but put little more on the swap than 2.4.7
>2.4.10-pre8: not good

Ehh..

There are _no_ VM changes that I can see between pre4 and pre8.

>2.4.10-pre9: not good ... Linux didnt had used any swap at all, then i
>             unrared two very large files at the same time. And now 104
>             Mbyte swap is used! :-( 2.4.7 didnt do like this.
>             Best is to use the swap as little as possible.

.. and there are none between pre8 and pre9.

Basically, it sounds lik eyou have tested different loads on different
kernels, and some loads are nice and others are not.

Also note that the amount of "swap used" is totally meaningless in
2.4.x. The 2.4.x kernel will _allocate_ the swap backing store much
earlier than 2.2.x, but that doesn't actuall ymean that it does any of
the IO. Indeed, allocating the swap backing store just means that the
swap pages are then kept track of, so that they can be aged along with
other stores.

So whether Linux uses swap or not is a 100% meaningless indicator of
"goodness".  The only thing that matters is how well the job gets done,
ie was it reasonably responsive, and did the big untars finish quickly.. 

Don't look at how many pages of swap were used. That's a statistic,
nothing more.

		Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!sn-xit-02!supernews.com!
newsfeed.direct.ca!look.ca!newsfeed.icl.net!opentransit.net!news.tele.dk!
small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!ifi.uio.no!
internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Sun, 16 Sep 2001 17:19:43 +0200 (MET)
From: Ricardo Galli <gal...@m3d.uib.es>
To: <linux-ker...@vger.kernel.org>
Subject: Re: broken VM in 2.4.10-pre9
Original-Message-ID: <Pine.LNX.4.33.0109161711430.30482-100000@m3d.uib.es>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sun, 16 Sep 2001 15:23:52 GMT
Message-ID: <fa.gaj3i4v.1k56g3f@ifi.uio.no>
Lines: 23

> So whether Linux uses swap or not is a 100% meaningless indicator of
> "goodness". The only thing that matters is how well the job gets done,
> ie was it reasonably responsive, and did the big untars finish quickly..

I am running 2.4.9 on a PII with 448MB RAM. After listening a couple of
hours MP3 from the /dev/cdrom and KDE started, more than 70MB went to
swap, about 300 MB in cache and KDE takes about 15-20 seconds just for
logging out and showing the greeting console.

Obviously, all apps went to disk to leave space for caching mp3 files that
are read only once. Altough logging out is not a very often process...

Regards,


--ricardo


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Subject: Re: broken VM in 2.4.10-pre9
From: Michael Rothwell <rothw...@holly-springs.nc.us>
To: Ricardo Galli <gal...@m3d.uib.es>
Cc: linux-ker...@vger.kernel.org
In-Reply-To: <Pine.LNX.4.33.0109161711430.30482-100000@m3d.uib.es>
Original-References: <Pine.LNX.4.33.0109161711430.30482-100...@m3d.uib.es>
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
X-Mailer: Evolution/0.12.99 (Preview Release)
Original-Date: 	16 Sep 2001 11:23:54 -0400
Original-Message-Id: <1000653836.2440.0.camel@gromit.house>
Mime-Version: 1.0
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sun, 16 Sep 2001 15:26:24 GMT
Message-ID: <fa.giiogov.1tn0887@ifi.uio.no>
References: <fa.gaj3i4v.1k56g3f@ifi.uio.no>
Lines: 36

Is there a way to tell the VM to prune its cache? Or a way to limit the
amount of cache it uses?



On 16 Sep 2001 17:19:43 +0200, Ricardo Galli wrote:
> > So whether Linux uses swap or not is a 100% meaningless indicator of
> > "goodness". The only thing that matters is how well the job gets done,
> > ie was it reasonably responsive, and did the big untars finish quickly..
> 
> I am running 2.4.9 on a PII with 448MB RAM. After listening a couple of
> hours MP3 from the /dev/cdrom and KDE started, more than 70MB went to
> swap, about 300 MB in cache and KDE takes about 15-20 seconds just for
> logging out and showing the greeting console.
> 
> Obviously, all apps went to disk to leave space for caching mp3 files that
> are read only once. Altough logging out is not a very often process...
> 
> Regards,
> 
> 
> --ricardo
> 
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Sun, 16 Sep 2001 13:33:36 -0300 (BRST)
From: Rik van Riel <r...@conectiva.com.br>
X-X-Sender:  <r...@imladris.rielhome.conectiva>
To: Michael Rothwell <rothw...@holly-springs.nc.us>
Cc: Ricardo Galli <gal...@m3d.uib.es>, <linux-ker...@vger.kernel.org>
Subject: Re: broken VM in 2.4.10-pre9
In-Reply-To: <1000653836.2440.0.camel@gromit.house>
Original-Message-ID: 
<Pine.LNX.4.33L.0109161330000.9536-100000@imladris.rielhome.conectiva>
X-spambait: aardv...@kernelnewbies.org
X-spammeplease: 	aardv...@nl.linux.org
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sun, 16 Sep 2001 16:35:54 GMT
Message-ID: <fa.p871ccv.1t2k5of@ifi.uio.no>
References: <fa.giiogov.1tn0887@ifi.uio.no>
Lines: 30

On 16 Sep 2001, Michael Rothwell wrote:

> Is there a way to tell the VM to prune its cache? Or a way to limit
> the amount of cache it uses?

Not yet, I'll make a quick hack for this when I get back next
week. It's pretty obvious now that the 2.4 kernel cannot get
enough information to select the right pages to evict from
memory.

For 2.5 I'm making a VM subsystem with reverse mappings, the
first iterations are giving very sweet performance so I will
continue with this project regardless of what other kernel
hackers might say ;)

cheers,

Rik
-- 
IA64: a worthy successor to i860.

http://www.surriel.com/		http://distro.conectiva.com/

Send all your spam to aardv...@nl.linux.org (spam digging piggy)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
X-Authentication-Warning: palladium.transmeta.com: 
mail set sender to n...@transmeta.com using -f
To: linux-ker...@vger.kernel.org
From: torva...@transmeta.com (Linus Torvalds)
Subject: Re: broken VM in 2.4.10-pre9
Original-Date: 	Sun, 16 Sep 2001 19:43:25 +0000 (UTC)
Original-Message-ID: <9o2vct$889$1@penguin.transmeta.com>
Original-References: <1000653836.2440.0.ca...@gromit.house> 
<Pine.LNX.4.33L.0109161330000.9536-100...@imladris.rielhome.conectiva>
X-Complaints-To: news@transmeta.com
Original-NNTP-Posting-Date: 16 Sep 2001 19:44:13 GMT
Cache-Post-Path: palladium.transmeta.com!unkn...@penguin.transmeta.com
X-Cache: nntpcache 2.4.0b5 (see http://www.nntpcache.org/)
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Transmeta Corporation
Date: Sun, 16 Sep 2001 19:45:37 GMT
Message-ID: <fa.if2rk2v.fnsi2n@ifi.uio.no>
References: <fa.p871ccv.1t2k5of@ifi.uio.no>
Lines: 37

In article <Pine.LNX.4.33L.0109161330000.9536-100...@imladris.rielhome.conectiva>,
Rik van Riel  <r...@conectiva.com.br> wrote:
>On 16 Sep 2001, Michael Rothwell wrote:
>
>> Is there a way to tell the VM to prune its cache? Or a way to limit
>> the amount of cache it uses?
>
>Not yet, I'll make a quick hack for this when I get back next
>week. It's pretty obvious now that the 2.4 kernel cannot get
>enough information to select the right pages to evict from
>memory.

Don't be stupid.

The desribed behaviour has nothing to do with limiting the cache or
anything else "cannot get enough information", except for the fact that
the kernel obviously cannot know what will happen in the future.

The kernel _correctly_ swapped out tons of pages that weren't touched in
a long long time. That's what you want to happen - the fact that they
then all became active on logout is sad.

The fact that the "use-once" logic didn't kick in is the problem. It's
hard to tell _why_ it didn't kick in, possibly because the MP3 player
read small chunks of the pages (touching them multiple times). 

THAT is worth looking into. But blathering about "reverse mappings will
help this" is just incredibly stupid. You seem to think that they are a
panacea for all problems, ranging from MP3 playback to world peace and
re-building the WTC.

		Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!sn-xit-02!supernews.com!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Sun, 16 Sep 2001 16:57:17 -0300 (BRST)
From: Rik van Riel <r...@conectiva.com.br>
X-X-Sender:  <r...@imladris.rielhome.conectiva>
To: Linus Torvalds <torva...@transmeta.com>
Cc: <linux-ker...@vger.kernel.org>
Subject: Re: broken VM in 2.4.10-pre9
In-Reply-To: <9o2vct$889$1@penguin.transmeta.com>
Original-Message-ID: 
<Pine.LNX.4.33L.0109161655250.21279-100000@imladris.rielhome.conectiva>
X-spambait: aardv...@kernelnewbies.org
X-spammeplease: 	aardv...@nl.linux.org
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sun, 16 Sep 2001 19:58:59 GMT
Message-ID: <fa.q7ainlv.1s56l00@ifi.uio.no>
References: <fa.if2rk2v.fnsi2n@ifi.uio.no>
Lines: 32

On Sun, 16 Sep 2001, Linus Torvalds wrote:

> The desribed behaviour has nothing to do with limiting the cache or
> anything else "cannot get enough information", except for the fact that
> the kernel obviously cannot know what will happen in the future.
>
> The kernel _correctly_ swapped out tons of pages that weren't touched in
> a long long time. That's what you want to happen - the fact that they
> then all became active on logout is sad.

The problem is that a too large cache reliably makes the
system unsuitable for interactive use. In that case its
probably worth it to make evicting pages from that cache
more likely than evicting pages from user processes,
while still giving the truly busy cache pages a chance to
stay resident.

regards,

Rik
-- 
IA64: a worthy successor to i860.

http://www.surriel.com/		http://distro.conectiva.com/

Send all your spam to aardv...@nl.linux.org (spam digging piggy)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Sun, 16 Sep 2001 17:17:37 -0300 (BRST)
From: Rik van Riel <r...@conectiva.com.br>
X-X-Sender:  <r...@imladris.rielhome.conectiva>
To: Linus Torvalds <torva...@transmeta.com>
Cc: <linux-ker...@vger.kernel.org>
Subject: Re: broken VM in 2.4.10-pre9
In-Reply-To: <9o2vct$889$1@penguin.transmeta.com>
Original-Message-ID: 
<Pine.LNX.4.33L.0109161707280.21279-100000@imladris.rielhome.conectiva>
X-spambait: aardv...@kernelnewbies.org
X-spammeplease: 	aardv...@nl.linux.org
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sun, 16 Sep 2001 20:20:10 GMT
Message-ID: <fa.q8aiolv.1t56m05@ifi.uio.no>
References: <fa.if2rk2v.fnsi2n@ifi.uio.no>
Lines: 55

On Sun, 16 Sep 2001, Linus Torvalds wrote:

> The fact that the "use-once" logic didn't kick in is the problem. It's
> hard to tell _why_ it didn't kick in, possibly because the MP3 player
> read small chunks of the pages (touching them multiple times).

It's probably because it used mmap(), all mp3 players seem
to do that. If they also use MADV_SEQUENTIAL, I guess it'd
be easy to also do drop_behind on them, though...

> THAT is worth looking into. But blathering about "reverse mappings
> will help this" is just incredibly stupid. You seem to think that they
> are a panacea for all problems,

Absoluteley not, all reverse mappings allow us is an easier
framework to get the other decisions right. By implementing
_just_ the reverse mappings and leaving the other stuff the
same I've already found my desktop to be more usable. This
seems to be the effect of the fact that reverse mappings
allow us to get page aging right because we see all referenced
bits on a page. If you think we can do this without reverse
mappings I only have to point at linux 1.2, 2.0, 2.2 and 2.4
as a suggestion to the contrary. If it was possible, surely we
would have succeeded in 7 years of trying ?

Add to that the fact that reverse mappings makes it trivial to
do things like defragmenting memory a bit to make sure fork()
succeeds or sparc64 users can allocate page tables or being
able to keep the page tables mapped until the page is cleaned
in page_launder() (reducing soft page faults) or doing a physical
page scan to deal with gross imbalance between memory zones and
we'll have something which, IMHO, is worth experimenting with.

Sure, reverse mappings also have disadvantages, like one pointer
extra in the page_struct or as much as 2 pointers per mapping
for shared pages and a slight complication of the locking, but
I'm not convinced that these disadvantages are so severe we should
continue the VM the same way we failed to make it work right the
last 7 years.

regards,

Rik
-- 
IA64: a worthy successor to i860.

http://www.surriel.com/		http://distro.conectiva.com/

Send all your spam to aardv...@nl.linux.org (spam digging piggy)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!sn-xit-02!supernews.com!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Message-ID: <XFMail.20010916222959.ast@domdv.de>
X-Mailer: XFMail 1.4.6-3 on Linux
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <9o2vct$889$1@penguin.transmeta.com>
Original-Date: 	Sun, 16 Sep 2001 22:29:59 +0200 (CEST)
From: Andreas Steinmetz <a...@domdv.de>
To: <torva...@transmeta.com (Linus Torvalds)>
Subject: Re: broken VM in 2.4.10-pre9
Cc: linux-ker...@vger.kernel.org
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: D.O.M. Datenverarbeitung GmbH
Date: Sun, 16 Sep 2001 20:33:22 GMT
Message-ID: <fa.huf7k5v.17i8q8p@ifi.uio.no>
References: <fa.if2rk2v.fnsi2n@ifi.uio.no>
Lines: 29

> The fact that the "use-once" logic didn't kick in is the problem. It's
> hard to tell _why_ it didn't kick in, possibly because the MP3 player
> read small chunks of the pages (touching them multiple times). 
> 
Then you should have an eye on mmap(). aide uses it. And it causes a real
problem. The basic logic is here:

open(file,O_RDONLY);
mmap(whole-file,PROT_READ,MAP_SHARED);
<do md5sum of mapped file>
munmap();
close();

No matter how you call the thing above (not my code, anyway): I strongly feel
that the use once logic isn't a great idea. What if lots of pages get accessed
twice? Where to set the limit?
How about adding a swapout cost factor? This would prevent swapping until
pressure is really high without any fixed limits. Calculate clean page reuse in
microseconds whereas swapout followed by swapin is going to be milliseconds.
That's a factor of at least 1000 which needs to be applied in page selection.


Andreas Steinmetz
D.O.M. Datenverarbeitung GmbH
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
X-Authentication-Warning: penguin.transmeta.com: torvalds owned process doing -bs
Original-Date: 	Sun, 16 Sep 2001 14:28:48 -0700 (PDT)
From: Linus Torvalds <torva...@transmeta.com>
To: Andreas Steinmetz <a...@domdv.de>
cc: <linux-ker...@vger.kernel.org>
Subject: Re: broken VM in 2.4.10-pre9
In-Reply-To: <XFMail.20010916222959.ast@domdv.de>
Original-Message-ID: <Pine.LNX.4.33.0109161415340.22182-100000@penguin.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sun, 16 Sep 2001 21:32:03 GMT
Message-ID: <fa.n95l5ev.e7ah02@ifi.uio.no>
References: <fa.huf7k5v.17i8q8p@ifi.uio.no>
Lines: 108


On Sun, 16 Sep 2001, Andreas Steinmetz wrote:
>
> > The fact that the "use-once" logic didn't kick in is the problem. It's
> > hard to tell _why_ it didn't kick in, possibly because the MP3 player
> > read small chunks of the pages (touching them multiple times).
> >
> Then you should have an eye on mmap(). aide uses it. And it causes a real
> problem. The basic logic is here:
>
> open(file,O_RDONLY);
> mmap(whole-file,PROT_READ,MAP_SHARED);
> <do md5sum of mapped file>
> munmap();
> close();

Okey-dokey.

I actually started looking at the current Linux page referenced logic, and
it just looks incredibly broken. There's no logic to it, and it's obvious
how some of it has grown over time without people really understanding or
caring about the referenced bit.

It looks very much like part of the VM was done with only "page->age", and
another part was done with the reference bit. So some users will totally
ignore the information that other users use and update. It's not pretty.

> No matter how you call the thing above (not my code, anyway): I strongly feel
> that the use once logic isn't a great idea. What if lots of pages get accessed
> twice? Where to set the limit?

Actually, the once-used approach _should_ work fine for mmap'ed pages too,
but the fact is that the code didn't even try, partly because the mmap
code was the code that used page->age and didn't care about the referenced
bit at all (except it _did_ care about the referenced bit in the page
tables: just not the bit in "struct page". And it's the latter bit that
actually ends up being the best once-used logic).

> How about adding a swapout cost factor? This would prevent swapping until
> pressure is really high without any fixed limits. Calculate clean page reuse in
> microseconds whereas swapout followed by swapin is going to be milliseconds.
> That's a factor of at least 1000 which needs to be applied in page selection.

Well, the thing is, swap-out is often cheaper than read-in, and just
dropping a page is often the cheapest of all. And all of these things are
a bit intertwined.

I actually have a _sane_ generic "used-once" approach that works with
mmap'ed memory and with other kinds too, and right now it doesn't bother
with "page->age" _at_all_. Instead, the aging is done by moving things
from one list to another, which actually seems to be better, but who
knows.

And that automatically gets used-once right - any pages are always added
to the inactive lists, and get bumped up to active only after they are
physically referenced the second time. This is actually incredibly trivial
to do without any aging at all:

	void mark_page_accessed(struct page *page)
	{
	        if (!PageActive(page) && PageReferenced(page)) {
	                activate_page(page);
	                ClearPageReferenced(page);
	                return;
	        }

	        /* Mark the page referenced, AFTER checking for previous usage..  */
	        SetPageReferenced(page);
	}

and the other importan tpart that we got (completely) wrong wrt the
use-once logic is the fact that when we scan the inactive lists and find a
page that is marked "referenced", we should NOT move it to the active list
(that defeats the whole point of use-once), but we should instead just
clear the reference bit and move it to the head of the right inactive
list.

So it actually looks like the use-once logic only worked under some very
specific circumstances, not in general.

Anybody willing to test the simple used-once cleanups? No guarantees, but
at least they make sense (some of the old interactions certainly do not).

(The new code is a simple state machine:

 - touch non-referenced page: set the reference bit

 - touch already referenced page: move it to next list "upwards" (ie the
   active list)

 - age a non-referenced page on a list: move to "next" list downwards (ie
   free if already inactive, move to inactive if currently active)

 - age a referenced page on a list: clear ref bit and move to beginning of
   same list.

which works fine for mmap pages too. I left the age updates, because the
page age may well make sense within the active list).

I'll make a 2.4.10pre10.

		Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Content-Type: 	text/plain; charset=US-ASCII
From: Daniel Phillips <phill...@bonn-fries.net>
To: torva...@transmeta.com (Linus Torvalds), linux-ker...@vger.kernel.org
Subject: Re: broken VM in 2.4.10-pre9
Original-Date: 	Mon, 17 Sep 2001 02:37:53 +0200
X-Mailer: KMail [version 1.3.1]
Original-References: <1000653836.2440.0.ca...@gromit.house> 
<Pine.LNX.4.33L.0109161330000.9536-100...@imladris.rielhome.conectiva> 
<9o2vct$88...@penguin.transmeta.com>
In-Reply-To: <9o2vct$889$1@penguin.transmeta.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 7BIT
Original-Message-Id: <20010917003422Z16197-2757+375@humbolt.nl.linux.org>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Mon, 17 Sep 2001 00:37:32 GMT
Message-ID: <fa.kmepq8v.k0sf0d@ifi.uio.no>
References: <fa.if2rk2v.fnsi2n@ifi.uio.no>
Lines: 37

On September 16, 2001 09:43 pm, Linus Torvalds wrote:
> The fact that the "use-once" logic didn't kick in is the problem. It's
> hard to tell _why_ it didn't kick in, possibly because the MP3 player
> read small chunks of the pages (touching them multiple times). 

Can we confirm that the mp3 player is making subpage accesses? (strace)

The 'partially read/written' state isn't handled properly now.  The 
transition to the 'used-once' state should only occur if the transfer ends at 
the exact end of the page.  Right now it always takes place after the *first* 
transfer on the page which is correct only for full-page transfers.

It's still best to start all pages unreferenced, because otherwise we don't 
have a means of distinguishing between the first and subsequent page cache 
lookups.  The check_used_once logic should set the page referenced if the IO 
transfer ends in the interior of the page or unreferenced if it ends at the 
end of the page.

This straightforward to fix, I'll have a tested patch by Tuesday if nobody 
beats me to it.  I don't think this is the whole problem though, it's just 
exposing a balancing problem.  Even if I did go and randomly access a huge 
file so that all cache pages have high age (the effect we're simulating by 
accident here) I still shouldn't be able to drive all my swap pages out of 
memory.

--
Daniel
  

  

  
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
X-Authentication-Warning: penguin.transmeta.com: torvalds owned process doing -bs
Original-Date: 	Sun, 16 Sep 2001 18:07:34 -0700 (PDT)
From: Linus Torvalds <torva...@transmeta.com>
To: Daniel Phillips <phill...@bonn-fries.net>
cc: <linux-ker...@vger.kernel.org>
Subject: Re: broken VM in 2.4.10-pre9
In-Reply-To: <20010917003422Z16197-2757+375@humbolt.nl.linux.org>
Original-Message-ID: <Pine.LNX.4.33.0109161738110.1054-100000@penguin.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Mon, 17 Sep 2001 01:11:43 GMT
Message-ID: <fa.o9a1e7v.hkm4b6@ifi.uio.no>
References: <fa.kmepq8v.k0sf0d@ifi.uio.no>
Lines: 105


On Mon, 17 Sep 2001, Daniel Phillips wrote:
>
> Can we confirm that the mp3 player is making subpage accesses? (strace)

People claim that they do mmap's, which the old code definitely didn't
handle correctly at all.

I'm not 100% sure that the 2.4.10-pre10 aging is right for anonymous pages
either, and the page-referenced handling at COW time looks suspiciously
broken, for example. It's not something we have ever gotten right, I think
- if the old pre-C-O-W page was marked accessed, we should mark that page
referenced before we break the COW. Otherwise we'll move over to a new
page without crediting the source.

> The 'partially read/written' state isn't handled properly now.  The
> transition to the 'used-once' state should only occur if the transfer ends at
> the exact end of the page.  Right now it always takes place after the *first*
> transfer on the page which is correct only for full-page transfers.

No, it's not as easy as you make it sound.

The problem is that partial accesses are real, and they should be counted
as such - except when they are _linear_ partial accesses, in which case
they should not be counted at all except for the first one.

Having some "if transfer ends at end of page" logic would minimally get
the enf-of-file case wrong, for example, never mind the case of a reader
that is seeking around in the file. The EOF case could be worked around
with yet another hack, but I suspect that the real fix is to try to fix
applications that do bad things.

> It's still best to start all pages unreferenced, because otherwise we don't
> have a means of distinguishing between the first and subsequent page cache
> lookups.  The check_used_once logic should set the page referenced if the IO
> transfer ends in the interior of the page or unreferenced if it ends at the
> end of the page.

See how 2.4.10-pre10 doesn't have any use_once hackery at all, but instead
has a clear path on references:

 prefetching: non-referenced page on inactive list
 after 1st reference: refrenced page on inactive list
 after 2nd reference: non-referenced page on active list
 after 3rd and subsequent accesses: referenced page on active list

while the "age down" logic is the exact reverse of the above. Logical and
easy to implement, and gives four distinct "stages" for all pages (along
with the LRU ordering within each list, of course).

Now, the above _is_ different from what we used to do. For one thing, it's
logical. But it might be different enough that the heuristics we have for
aging may need some tuning again. "Logical" is not enough..

There's also a few issues that I don't like right now wrt reference
handling, notably:

 - COW issue mentioned above. Probably trivially fixed by something like

	diff -u --recursive --new-file pre10/linux/mm/memory.c linux/mm/memory.c
	--- pre10/linux/mm/memory.c     Sun Sep 16 18:01:48 2001
	+++ linux/mm/memory.c   Sun Sep 16 18:00:59 2001
	@@ -955,6 +955,8 @@
	        if (pte_same(*page_table, pte)) {
	                if (PageReserved(old_page))
	                        ++mm->rss;
	+               if (pte_young(pte))
	+                       mark_page_accessed(old_page);
	                break_cow(vma, new_page, address, page_table);

	                /* Free the old page.. */

   which looks right (it basically saves off the referenced bit for the
   old page table entry in the physical page reference count).

 - truly anonymous pages (ie before they've been added to the swap cache)
   are not necessarily going to behave as nicely as other pages. They
   magically appear after VM scanning as a "1st reference", and I have a
   reasonably good argument that says that they'll have been aged up and
   down roughly the same number of times, which makes this more-or-less
   correct. But it's still a theoretical argument, nothing more.

   This could reasonably easily be fixed by adding these anonymous pages
   to the LRU lists anyway (with a bogus "page->mapping" which causes them
   to be re-mapped as _real_ swap cache pages when they need writeout),
   but that's a bit too subtle for my taste. If anybody wants to look into
   this, I'd love to know if it makes a difference in behaviour, though..

 - I don't like the lack of aging in 'reclaim_page()'. It will walk the
   whole LRU list if required, which kind of defeats the purpose of having
   reference bits and LRU on that list. The code _claims_ that it almost
   always succeeds with the first page, but I don't see why it would. I
   think that comment assumed that the inactive_clean list cannot have any
   referenced pages, but that's never been true.

There are probably other issues too, these are the ones I was wondering
about when I walked over the use of the PG_referenced bit..

		Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Mon, 17 Sep 2001 01:11:58 -0400
To: Linus Torvalds <torva...@transmeta.com>
Cc: Daniel Phillips <phill...@bonn-fries.net>, linux-ker...@vger.kernel.org
Subject: Re: broken VM in 2.4.10-pre9
Original-Message-ID: <20010917011157.A22989@cs.cmu.edu>
Mail-Followup-To: Linus Torvalds <torva...@transmeta.com>,
	Daniel Phillips <phill...@bonn-fries.net>,
	linux-ker...@vger.kernel.org
Original-References: <20010917003422Z16197-2757+...@humbolt.nl.linux.org> 
<Pine.LNX.4.33.0109161738110.1054-100...@penguin.transmeta.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <Pine.LNX.4.33.0109161738110.1054-100000@penguin.transmeta.com>
User-Agent: Mutt/1.3.20i
From: Jan Harkes <jahar...@cs.cmu.edu>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Mon, 17 Sep 2001 05:14:59 GMT
Message-ID: <fa.c47jjvv.15j8j3f@ifi.uio.no>
References: <fa.o9a1e7v.hkm4b6@ifi.uio.no>
Lines: 82

On Sun, Sep 16, 2001 at 06:07:34PM -0700, Linus Torvalds wrote:
> See how 2.4.10-pre10 doesn't have any use_once hackery at all, but instead
> has a clear path on references:
> 
>  prefetching: non-referenced page on inactive list
>  after 1st reference: refrenced page on inactive list
>  after 2nd reference: non-referenced page on active list
>  after 3rd and subsequent accesses: referenced page on active list

So it ends up using a 'used_thrice' hack. Yeah, that does solve some of
the used once problems ;)

>  - COW issue mentioned above. Probably trivially fixed by something like

The COW is triggered by a pagefault, so the page will be accessed and
the hardware bits (both accessed and dirty) should get set automatically.

>  - truly anonymous pages (ie before they've been added to the swap cache)
>    are not necessarily going to behave as nicely as other pages. They

I just found a simple example that none of the 2.4.x kernels really like
that much. Create a program that malloc's the available free memory
minus 5-10MB, memset's it to page the memory in as anonymous pages and
then goes to sleep. Then run something like a kernel compile. If there
is enough memory left to catch the allocation spikes to avoid swapping,
the system will be heavily paging with the small amount of "aged memory"
that is left over to work with.

>    but that's a bit too subtle for my taste. If anybody wants to look into
>    this, I'd love to know if it makes a difference in behaviour, though..

pre10 right after booting,
    MemTotal:       127104 kB
    MemFree:         41844 kB
    Active:          11632 kB
    Inact_dirty:     19148 kB
    Inact_clean:         0 kB
    Inact_target:     1004 kB

pre9 with Rik's reverse mapping & delayed swap allocation and my local hacks,
    MemTotal:       126976 kB
    MemFree:         41244 kB
    Active:          80032 kB
    Inact_dirty:         0 kB
    Inact_clean:         0 kB
    Inact_target:      984 kB

Inactive target is interesting, because it is directly related to the
amount of memory pressure we've seen (memory_pressure >> 6). Also as
we're still far from running low on free memory, nothing was pushed into
the inactive lists (yes, there is no used_once, or used_thrice stuff at
all). While pre10 has about 50 MB that is 'lost' to anonymous pages
which don't get aged until we start swapping things out.

Differences are definitely noticeable, but I'm almost sure that is
mostly related to the fact that we have all potentially pageable or
swappable memory on the lists.

>  - I don't like the lack of aging in 'reclaim_page()'. It will walk the
>    whole LRU list if required, which kind of defeats the purpose of having
>    reference bits and LRU on that list. The code _claims_ that it almost
>    always succeeds with the first page, but I don't see why it would. I
>    think that comment assumed that the inactive_clean list cannot have any
>    referenced pages, but that's never been true.

As far as I can understand the _original_ design on which the current VM
is based, aging only occurs to pages on the active 'ring', the inactive
lists are basically LRU-ordered victim caches. Pages are unmapped before
they go to the inactive_dirty list and buffers are flushed before they
can go to inactive_clean.

Ofcourse both the used_once changes and -pre10 sort of flushed these
designs down the toilet by putting mapped pages on the inactive_dirty
list and turning the active list into an LRU.

Jan

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
To: Rik van Riel <r...@conectiva.com.br>
Cc: <linux-ker...@vger.kernel.org>, <linux...@kvack.org>
Subject: Re: broken VM in 2.4.10-pre9
Original-References: 
<Pine.LNX.4.33L.0109161330000.9536-100...@imladris.rielhome.conectiva>
From: ebied...@xmission.com (Eric W. Biederman)
Original-Date: 	17 Sep 2001 02:06:46 -0600
In-Reply-To: <Pine.LNX.4.33L.0109161330000.9536-100000@imladris.rielhome.conectiva>
Original-Message-ID: <m1elp6s0kp.fsf@frodo.biederman.org>
Lines: 	35
User-Agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/20.5
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Mon, 17 Sep 2001 08:16:51 GMT
Message-ID: <fa.kqs0bkv.ni4rgr@ifi.uio.no>
References: <fa.p871ccv.1t2k5of@ifi.uio.no>

Rik van Riel <r...@conectiva.com.br> writes:

> On 16 Sep 2001, Michael Rothwell wrote:
> 
> > Is there a way to tell the VM to prune its cache? Or a way to limit
> > the amount of cache it uses?
> 
> Not yet, I'll make a quick hack for this when I get back next
> week. It's pretty obvious now that the 2.4 kernel cannot get
> enough information to select the right pages to evict from
> memory.

Hmm.  Perhaps or perhaps it is using the information poorly.
There is an alternative approach to have better aging information.

An address_space can be allocated per mm_struct.    And all of the
anonymous pages can be allocated to that address_space.  The
address_space can then have an array or better a tree of extents that
list which indexes correspond to which swap pages.  With some
pages not being backed.

Getting the allocation of indices correct so that merging will work
is a little trickier then now, as is the case of a private writeable
mapping of a file.  But in a lot of other ways the logic becomes
simpler.
 
> For 2.5 I'm making a VM subsystem with reverse mappings, the
> first iterations are giving very sweet performance so I will
> continue with this project regardless of what other kernel
> hackers might say ;)

Do you have any arguments for the reverse mappings or just for some of
the other side effects that go along with them?

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Mon, 17 Sep 2001 09:12:43 -0300 (BRST)
From: Rik van Riel <r...@conectiva.com.br>
X-X-Sender:  <r...@imladris.rielhome.conectiva>
To: "Eric W. Biederman" <ebied...@xmission.com>
Cc: <linux-ker...@vger.kernel.org>, <linux...@kvack.org>
Subject: Re: broken VM in 2.4.10-pre9
In-Reply-To: <m1elp6s0kp.fsf@frodo.biederman.org>
Original-Message-ID: 
<Pine.LNX.4.33L.0109170909270.2990-100000@imladris.rielhome.conectiva>
X-spambait: aardv...@kernelnewbies.org
X-spammeplease: 	aardv...@nl.linux.org
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Mon, 17 Sep 2001 12:14:22 GMT
Message-ID: <fa.pa6lgkv.1q245g6@ifi.uio.no>
References: <fa.kqs0bkv.ni4rgr@ifi.uio.no>
Lines: 37

On 17 Sep 2001, Eric W. Biederman wrote:

> There is an alternative approach to have better aging information.

[snip incomplete description of data structure]

What you didn't explain is how your idea is related to
aging.

> > For 2.5 I'm making a VM subsystem with reverse mappings, the
> > first iterations are giving very sweet performance so I will
> > continue with this project regardless of what other kernel
> > hackers might say ;)
>
> Do you have any arguments for the reverse mappings or just for some of
> the other side effects that go along with them?

Mainly for the side effects, but until somebody comes
up with another idea to achieve all the side effects I'm
not giving up on reverse mappings. If you can achieve
all the good stuff in another way, show it.

regards,

Rik
-- 
IA64: a worthy successor to i860.

http://www.surriel.com/		http://distro.conectiva.com/

Send all your spam to aardv...@nl.linux.org (spam digging piggy)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!news2.google.com!newsfeed.google.com!
sn-xit-02!supernews.com!news.tele.dk!small.news.tele.dk!129.240.148.23!
uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Content-Type: 	text/plain; charset=US-ASCII
From: Daniel Phillips <phill...@bonn-fries.net>
To: Jan Harkes <jahar...@cs.cmu.edu>, Linus Torvalds <torva...@transmeta.com>
Subject: Re: broken VM in 2.4.10-pre9
Original-Date: 	Mon, 17 Sep 2001 14:33:12 +0200
X-Mailer: KMail [version 1.3.1]
Cc: linux-ker...@vger.kernel.org
Original-References: <20010917003422Z16197-2757+...@humbolt.nl.linux.org> 
<Pine.LNX.4.33.0109161738110.1054-100...@penguin.transmeta.com> 
<20010917011157.A22...@cs.cmu.edu>
In-Reply-To: <20010917011157.A22989@cs.cmu.edu>
MIME-Version: 1.0
Content-Transfer-Encoding: 7BIT
Original-Message-Id: <20010917122559Z16382-2758+129@humbolt.nl.linux.org>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Mon, 17 Sep 2001 12:27:51 GMT
Message-ID: <fa.kqvfp7v.mgafgc@ifi.uio.no>
References: <fa.c47jjvv.15j8j3f@ifi.uio.no>
Lines: 49

On September 17, 2001 07:11 am, Jan Harkes wrote:
> As far as I can understand the _original_ design on which the current VM
> is based, aging only occurs to pages on the active 'ring', the inactive
> lists are basically LRU-ordered victim caches. Pages are unmapped before
> they go to the inactive_dirty list and buffers are flushed before they
> can go to inactive_clean.
>
> Ofcourse both the used_once changes and -pre10 sort of flushed these
> designs down the toilet by putting mapped pages on the inactive_dirty
> list and turning the active list into an LRU.

The active list is *supposed* to approximate an LRU.  The inactive lists
are not LRUs but queues, and have always been.

The inactive queues have always had both mapped and unmapped pages on
them. The reason for unmapping a swap cache page page when putting it
on the inactive queue is to give it some time to be rescued, since we
otherwise have no information about its short-term activity because
we have no way of accessing the hardware dirty bit given the physical
page on the lru.  A second reason for unmapping it is, we don't have
any choice.  The point where we place it on the inactive queue is the
last point where we're able to find its userspace page table entry.

<paid advertisement>
We'd be able to avoid unmapping swap cache pages with Rik's rmap
patch because we can easily check the hardware referenced bit before
finally evicting the page.  Plus, and I hope I'm interpreting this
correctly, we can allocate the swap slot and perform swap clustering
at that time, greatly simplifying the swapout code.
</paid advertisment> ;-)

Drifting a little further offtopic.  As far as I can tell, there's no 
fundamental reason why we cannot make the current strategy work as 
well as Rik's rmaps probably will, with some more blood, sweat and
code study.  On the other hand, Matt Dillon, the reigning champion of
virtual memory managment, was quite firm in stating that we should
drop the current virtually scanning strategy in favor of 100%
physical scanning as BSD uses, relying on reverse mapping.

   http://mail.nl.linux.org/linux-mm/2000-05/msg00419.html
   (Matt Dillon holds forth on the design of BSD's memory manager)

--
Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Mon, 17 Sep 2001 09:41:17 -0300 (BRST)
From: Rik van Riel <r...@conectiva.com.br>
X-X-Sender:  <r...@imladris.rielhome.conectiva>
To: Daniel Phillips <phill...@bonn-fries.net>
Cc: Jan Harkes <jahar...@cs.cmu.edu>, Linus Torvalds <torva...@transmeta.com>,
        <linux-ker...@vger.kernel.org>
Subject: Re: broken VM in 2.4.10-pre9
In-Reply-To: <20010917122559Z16382-2758+129@humbolt.nl.linux.org>
Original-Message-ID: <Pine.LNX.4.33L.0109170936010.2990-100000@imladris.rielhome.conectiva>
X-spambait: aardv...@kernelnewbies.org
X-spammeplease: 	aardv...@nl.linux.org
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Mon, 17 Sep 2001 12:42:44 GMT
Message-ID: <fa.p8mhf4v.1ti0405@ifi.uio.no>
References: <fa.kqvfp7v.mgafgc@ifi.uio.no>
Lines: 40

On Mon, 17 Sep 2001, Daniel Phillips wrote:

> Drifting a little further offtopic.  As far as I can tell, there's no
> fundamental reason why we cannot make the current strategy work as
> well as Rik's rmaps probably will, with some more blood, sweat and
> code study.

I don't see any possibility to get that to work without
reverse mapping. Of course, that could be me overlooking
some possibility, but I'm not holding by breath waiting
for somebody to invent this other possibility.

> On the other hand, Matt Dillon, the reigning champion of
> virtual memory managment, was quite firm in stating that we should
> drop the current virtually scanning strategy in favor of 100%
> physical scanning as BSD uses, relying on reverse mapping.
>
>    http://mail.nl.linux.org/linux-mm/2000-05/msg00419.html
>    (Matt Dillon holds forth on the design of BSD's memory manager)

His claims are backed up by FreeBSD's VM performance,
so I'm inclined to believe them. If you think you can
come up with something better, I'll believe you when
you show it...

regards,

Rik
-- 
IA64: a worthy successor to i860.

http://www.surriel.com/		http://distro.conectiva.com/

Send all your spam to aardv...@nl.linux.org (spam digging piggy)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
X-MimeOLE: Produced By Microsoft Exchange V6.0.4712.0
content-class: urn:content-classes:message
Subject: RE: broken VM in 2.4.10-pre9
MIME-Version: 1.0
Content-Type: 	text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
Original-Date: 	Mon, 17 Sep 2001 10:40:45 -0500
Original-Message-ID: <878A2048A35CD141AD5FC92C6B776E4907BB98@xchgind02.nsisw.com>
Thread-Topic: broken VM in 2.4.10-pre9
Thread-Index: AcE/ci/7B9XzedG4TP+9gJs5f6jcQgAHEoVA
From: "Rob Fuller" <rful...@nsisoftware.com>
To: "Rik van Riel" <r...@conectiva.com.br>,
        "Eric W. Biederman" <ebied...@xmission.com>
Cc: <linux-ker...@vger.kernel.org>, <linux...@kvack.org>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Mon, 17 Sep 2001 15:42:41 GMT
Message-ID: <fa.n6t9nbv.t5khh4@ifi.uio.no>
Lines: 34

One argument for reverse mappings is distributed shared memory or
distributed file systems and their interaction with memory mapped files.
For example, a distributed file system may need to invalidate a specific
page of a file that may be mapped multiple times on a node.

This may be a naive argument given my limited knowledge of Linux memory
management internals.  If so, I will refrain from posting this sort of
thing in the future.  Let me know.

> -----Original Message-----
> From: Rik van Riel [mailto:r...@conectiva.com.br]
> Sent: Monday, September 17, 2001 7:13 AM
> To: Eric W. Biederman
> Cc: linux-ker...@vger.kernel.org; linux...@kvack.org
> Subject: Re: broken VM in 2.4.10-pre9
> 
> 
> On 17 Sep 2001, Eric W. Biederman wrote:

<snip>

> > Do you have any arguments for the reverse mappings or just 
> for some of
> > the other side effects that go along with them?
> 
> Mainly for the side effects, but until somebody comes
> up with another idea to achieve all the side effects I'm
> not giving up on reverse mappings. If you can achieve
> all the good stuff in another way, show it.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
To: "Rob Fuller" <rful...@nsisoftware.com>
Cc: <linux-ker...@vger.kernel.org>, <linux...@kvack.org>
Subject: Re: broken VM in 2.4.10-pre9
Original-References: <878A2048A35CD141AD5FC92C6B776E4907B...@xchgind02.nsisw.com>
From: ebied...@xmission.com (Eric W. Biederman)
Original-Date: 	17 Sep 2001 10:03:06 -0600
In-Reply-To: <878A2048A35CD141AD5FC92C6B776E4907BB98@xchgind02.nsisw.com>
Original-Message-ID: <m166ahst39.fsf@frodo.biederman.org>
Lines: 	25
User-Agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/20.5
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Mon, 17 Sep 2001 16:14:24 GMT
Message-ID: <fa.jnblvsv.1j3af85@ifi.uio.no>
References: <fa.n6t9nbv.t5khh4@ifi.uio.no>

"Rob Fuller" <rful...@nsisoftware.com> writes:

> One argument for reverse mappings is distributed shared memory or
> distributed file systems and their interaction with memory mapped
> files.  For example, a distributed file system may need to invalidate a specific
> page of a file that may be mapped multiple times on a node.

To reduce the time for an invalidate is indeed a good argument for
reverse maps.  However this is generally the uncommon case, and it is
fine to leave this kinds of things on the slow path.  From struct page 
we currently go to struct address_space to lists of struct vm_area
which works but is just a little slower (but generally cheaper) than
having a reverse map.

Since Rik was not seeing the invalidate or the unmap case as the
bottleneck this reverse mappings are not needed simply something
with a similiar effect on the VM.  

In linux we have avoided reverse maps (unlike the BSD's) which tends
to make the common case fast at the expense of making it more
difficult to handle times when the VM system is under extreme load and
we are swapping etc.

Eric

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!sn-xit-02!supernews.com!
newsfeed.direct.ca!look.ca!newshub2.rdc1.sfba.home.com!news.home.com!
newshub1-work.home.com!gehenna.pell.portland.or.us!nntp-server.caltech.edu!
nntp-server.caltech.edu!mail2news96
Newsgroups: mlist.linux.kernel
Content-Type: 	text/plain; charset=US-ASCII
From: Daniel Phillips <phill...@bonn-fries.net>
X-To: ebied...@xmission.com (Eric W. Biederman),
        "Rob Fuller" <rful...@nsisoftware.com>
Subject: Re: broken VM in 2.4.10-pre9
Date: 	Wed, 19 Sep 2001 11:45:44 +0200
X-Cc: <linux-ker...@vger.kernel.org>, <linux...@kvack.org>
MIME-Version: 1.0
Message-ID: <linux.kernel.20010919093828Z17304-2759+92@humbolt.nl.linux.org>
Approved: n...@nntp-server.caltech.edu
Lines: 16

On September 17, 2001 06:03 pm, Eric W. Biederman wrote:
> In linux we have avoided reverse maps (unlike the BSD's) which tends
> to make the common case fast at the expense of making it more
> difficult to handle times when the VM system is under extreme load and
> we are swapping etc.

What do you suppose is the cost of the reverse map?  I get the impression you 
think it's more expensive than it is.

--
Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Subject: Re: broken VM in 2.4.10-pre9
To: phill...@bonn-fries.net (Daniel Phillips)
Original-Date: 	Wed, 19 Sep 2001 20:45:55 +0100 (BST)
Cc: ebied...@xmission.com (Eric W. Biederman),
        rful...@nsisoftware.com (Rob Fuller), linux-ker...@vger.kernel.org,
        linux...@kvack.org
In-Reply-To: <20010919093828Z17304-2759+92@humbolt.nl.linux.org> 
from "Daniel Phillips" at Sep 19, 2001 11:45:44 AM
X-Mailer: ELM [version 2.5 PL6]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Original-Message-Id: <E15jnIB-0003gh-00@the-village.bc.nu>
From: Alan Cox <a...@lxorguk.ukuu.org.uk>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Wed, 19 Sep 2001 19:43:08 GMT
Message-ID: <fa.h6ikrgv.10m0j06@ifi.uio.no>
References: <fa.jqc4a0v.74mr83@ifi.uio.no>
Lines: 17

> On September 17, 2001 06:03 pm, Eric W. Biederman wrote:
> > In linux we have avoided reverse maps (unlike the BSD's) which tends
> > to make the common case fast at the expense of making it more
> > difficult to handle times when the VM system is under extreme load and
> > we are swapping etc.
> 
> What do you suppose is the cost of the reverse map?  I get the impression you 
> think it's more expensive than it is.

We can keep the typical page table cost lower than now (including reverse
maps) just by doing some common sense small cleanups to get the page struct
down to 48 bytes on x86
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
To: Alan Cox <a...@lxorguk.ukuu.org.uk>
Cc: phill...@bonn-fries.net (Daniel Phillips),
        rful...@nsisoftware.com (Rob Fuller), linux-ker...@vger.kernel.org,
        linux...@kvack.org
Subject: Re: broken VM in 2.4.10-pre9
Original-References: <E15jnIB-0003gh...@the-village.bc.nu>
From: ebied...@xmission.com (Eric W. Biederman)
Original-Date: 	19 Sep 2001 15:03:21 -0600
In-Reply-To: <E15jnIB-0003gh-00@the-village.bc.nu>
Original-Message-ID: <m1iteegag6.fsf@frodo.biederman.org>
Lines: 	69
User-Agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/20.5
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Wed, 19 Sep 2001 21:14:10 GMT
Message-ID: <fa.kjba9lv.r3etii@ifi.uio.no>
References: <fa.h6ikrgv.10m0j06@ifi.uio.no>

Alan Cox <a...@lxorguk.ukuu.org.uk> writes:

> > On September 17, 2001 06:03 pm, Eric W. Biederman wrote:
> > > In linux we have avoided reverse maps (unlike the BSD's) which tends
> > > to make the common case fast at the expense of making it more
> > > difficult to handle times when the VM system is under extreme load and
> > > we are swapping etc.
> > 
> > What do you suppose is the cost of the reverse map?  I get the impression you
> 
> > think it's more expensive than it is.
> 
> We can keep the typical page table cost lower than now (including reverse
> maps) just by doing some common sense small cleanups to get the page struct
> down to 48 bytes on x86

I have to admit the first time I looked at reverse maps our struct page
was much lighter weight, then now (64 bytes x86 UP).  And our cost per
page was noticeably fewer bytes than the BSDs. average_mem_per_page =
sizeof(struct page) + sizeof(pte_t) + sizeof(reverse_pte_t)*average_user_per_page.
But struct page has grown pretty significantly since then, and could
use a cleanup.  

So I figure it is worth going through and computing the costs of
reverse page tables and not, dismissing them out of hand.  But the
fact that the linux VM could get good performance in most
circumstances without reverse page tables has always enchanted me.  

That added to the fact that last time someone ran the numbers linux
was considerably faster than the BSD for mm type operations when not
swapping.  And this is the common case.

I admit reverse page tables make it easier under a high load to get
good paging performance, as the algorithms are more straigh forward.
But I have not seen the argument that not having reverse maps make it
undoable.  In fact previous versions of linux seem to put the proof
that you can get at least reasonable swapping under load without
reverse page tables.

There is also the cache thrashing case.  While scaning page table
entries it is probably impossible to prevent cache thrashing, but
reverse page tables look like they make it worse.

With respect to the current VM the primary complaint I have heard is
that anonymous pages are not in the page cache so cannot be aged.  At
least that was the complaint that started this thread.  For adding
pages to the page cache we currently have conflicting tensions.  Do we
want it in the page cache to age better or do we not want to allocate
the swap space yet?

So my suggestion was to look at getting anonymous pages backed by what
amounts to a shared memory segment.  In that vein.  By using an extent
based data structure we can get the cost down under the current 8 bits
per page that we have for the swap counts, and make allocating swap
pages faster.  And we want to cluster related swap pages anyway so
an extent based system is a natural fit.

If we loose the requirement that swapped out pages need to be in the
page tables.  It becomes a trivial issue to drop page tables with all of
their pages swapped out.  Plus there are a million other special cases
we can remove from the current VM.

So right now I can see a bigger benefit from anonymouse pages with a
``backing store'' then I can from reverse maps.

Eric



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Subject: Re: broken VM in 2.4.10-pre9
To: ebied...@xmission.com (Eric W. Biederman)
Original-Date: 	Wed, 19 Sep 2001 23:04:10 +0100 (BST)
Cc: a...@lxorguk.ukuu.org.uk (Alan Cox),
        phill...@bonn-fries.net (Daniel Phillips),
        rful...@nsisoftware.com (Rob Fuller), linux-ker...@vger.kernel.org,
        linux...@kvack.org
In-Reply-To: <m1iteegag6.fsf@frodo.biederman.org> 
from "Eric W. Biederman" at Sep 19, 2001 03:03:21 PM
X-Mailer: ELM [version 2.5 PL6]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Original-Message-Id: <E15jpRy-0003yt-00@the-village.bc.nu>
From: Alan Cox <a...@lxorguk.ukuu.org.uk>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Wed, 19 Sep 2001 22:01:24 GMT
Message-ID: <fa.hgjv98v.10meto6@ifi.uio.no>
References: <fa.kjba9lv.r3etii@ifi.uio.no>
Lines: 38

> That added to the fact that last time someone ran the numbers linux
> was considerably faster than the BSD for mm type operations when not
> swapping.  And this is the common case.

"Linux VM works wonderfully when nobody is using it" 

Which is rather like the scheduler works well for one task then by three is
making bad decisions.

> But I have not seen the argument that not having reverse maps make it
> undoable.  In fact previous versions of linux seem to put the proof
> that you can get at least reasonable swapping under load without
> reverse page tables.

The last decent Linx VM behaviour was about 2.1.100 or so - which was
without reverse maps. It's been downhill since then. So yes you may be
right.

> So my suggestion was to look at getting anonymous pages backed by what
> amounts to a shared memory segment.  In that vein.  By using an extent
> based data structure we can get the cost down under the current 8 bits
> per page that we have for the swap counts, and make allocating swap
> pages faster.  And we want to cluster related swap pages anyway so
> an extent based system is a natural fit.

Much of this goes away if you get rid of both the swap and anonymous page
special cases. Back anonymous pages with the "whoops everything I write here
vanishes mysteriously" file system and swap with a swapfs

Reverse mappings make linear aging easier to do but are not critical (we
can walk all physical pages via the page map array). 

Alan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!newsfeed.stanford.edu!
news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!newsfeed.media.kyoto-u.ac.jp!
uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Subject: Re: broken VM in 2.4.10-pre9
To: rful...@nsisoftware.com (Rob Fuller)
Original-Date: 	Wed, 19 Sep 2001 23:30:41 +0100 (BST)
Cc: da...@redhat.com (David S. Miller), ebied...@xmission.com,
        a...@lxorguk.ukuu.org.uk, phill...@bonn-fries.net,
        linux-ker...@vger.kernel.org, linux...@kvack.org
In-Reply-To: <878A2048A35CD141AD5FC92C6B776E4907B7A5@xchgind02.nsisw.com> 
from "Rob Fuller" at Sep 19, 2001 05:15:21 PM
X-Mailer: ELM [version 2.5 PL6]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Original-Message-Id: <E15jprd-00042O-00@the-village.bc.nu>
From: Alan Cox <a...@lxorguk.ukuu.org.uk>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Wed, 19 Sep 2001 22:28:44 GMT
Message-ID: <fa.gd3l40v.57oqg1@ifi.uio.no>
References: <fa.nat3nbv.115ehih@ifi.uio.no>
Lines: 15

> "One argument for reverse mappings is distributed shared memory or
> distributed file systems and their interaction with memory mapped files.
> For example, a distributed file system may need to invalidate a specific
> page of a file that may be mapped multiple times on a node."

Wouldn't it be better for the file system itself to be doing that work. Also
do real world file systems that actually perform usably do this or just zap
the cached mappings like OpenGFS does.

Alan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!newsfeed.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Wed, 19 Sep 2001 20:05:45 -0300 (BRST)
From: Rik van Riel <r...@conectiva.com.br>
X-X-Sender:  <r...@imladris.rielhome.conectiva>
To: Alan Cox <a...@lxorguk.ukuu.org.uk>
Cc: "Eric W. Biederman" <ebied...@xmission.com>,
        Daniel Phillips <phill...@bonn-fries.net>,
        Rob Fuller <rful...@nsisoftware.com>, <linux-ker...@vger.kernel.org>,
        <linux...@kvack.org>
Subject: Re: broken VM in 2.4.10-pre9
In-Reply-To: <E15jpRy-0003yt-00@the-village.bc.nu>
Original-Message-ID: 
<Pine.LNX.4.33L.0109192005170.19147-100000@imladris.rielhome.conectiva>
X-spambait: aardv...@kernelnewbies.org
X-spammeplease: 	aardv...@nl.linux.org
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Wed, 19 Sep 2001 23:07:46 GMT
Message-ID: <fa.q7acolv.1q5sm06@ifi.uio.no>
References: <fa.hgjv98v.10meto6@ifi.uio.no>
Lines: 22

On Wed, 19 Sep 2001, Alan Cox wrote:

> "Linux VM works wonderfully when nobody is using it"

"This OS is optimised for lmbench"


cheers,

Rik
-- 
IA64: a worthy successor to i860.

http://www.surriel.com/		http://distro.conectiva.com/

Send all your spam to aardv...@nl.linux.org (spam digging piggy)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/