[patch] VM stable again?

Hi Linus,

the patch below makes sure processes won't "eat" the pages
another process is freeing and seems to avoid the nasty
out of memory situations that people have seen.

With this patch performance isn't quite what it should be,
but I have some ideas on making performance fly (without
impacting stability, of course).

With this patch kswapd uses extremely little cpu, compared
to other kernel versions. This is probably a sign that the
apps will be able to manage VM by themselves without help
from kswapd ... except for performance of course ;)

The patch works in a very simple way:
- keep track of whether some process is critically low on
  memory and needs to call try_to_free_pages()
- if another allocation starts while the other app is in
  try_to_free_pages(), free some memory ourselves
- (skip point 2 if there is enough free memory, but that's
  just a minor performance optimisation)

This way we won't "eat" the free memory that another process
is in the middle of freeing.
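
The three points above can be sketched as a small userspace model. This
is only an illustration under stated assumptions, not the patch itself:
free_before_allocate and try_to_free_pages() are the names used in this
thread, while alloc_page_model(), nr_free, reclaim_calls and the
PAGES_MIN / PAGES_HIGH / RECLAIM_BATCH values are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace model. Only free_before_allocate and try_to_free_pages()
 * are names from the thread; everything else is hypothetical. */
static atomic_int free_before_allocate; /* tasks currently in reclaim */
static int nr_free = 32;                /* pages on the free list */
static int reclaim_calls;               /* instrumentation for the sketch */

enum { PAGES_MIN = 4, PAGES_HIGH = 16, RECLAIM_BATCH = 8 };

/* Stand-in for the real reclaim path: pretend we freed a batch. */
static void try_to_free_pages(void)
{
	reclaim_calls++;
	nr_free += RECLAIM_BATCH;
}

/* Allocate one page; returns false if the free list is empty. */
static bool alloc_page_model(void)
{
	/* Points 2 and 3: someone else is reclaiming and memory is not
	 * plentiful, so free some pages ourselves before taking one. */
	if (atomic_load(&free_before_allocate) > 0 && nr_free < PAGES_HIGH)
		try_to_free_pages();

	/* Point 1: we are critically low ourselves, so advertise it
	 * for the duration of our own reclaim. */
	if (nr_free <= PAGES_MIN) {
		atomic_fetch_add(&free_before_allocate, 1);
		try_to_free_pages();
		int prev = atomic_fetch_sub(&free_before_allocate, 1);
		assert(prev > 0); /* the counter must stay balanced */
	}

	if (nr_free == 0)
		return false;
	nr_free--;
	return true;
}
```

In this model an allocator only does extra work while the counter is
raised and free memory is below PAGES_HIGH; otherwise the check is a
single read.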

I'd appreciate it if some people could try it and see if
it fixes all the OOM situations.

regards,

Rik
--
The Internet is not a network of computers. It is a network
of people. That is its real strength.

Wanna talk about the kernel?  irc.openprojects.net / #kernelnewbies
http://www.conectiva.com/		http://www.surriel.com/


Patch

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/

Re: [patch] VM stable again?

On Mon, 15 May 2000, Rik van Riel wrote:

> - keep track of whether some process is critically low on
>   memory and needs to call try_to_free_pages()
> - if another allocation starts while the other app is in
>   try_to_free_pages(), free some memory ourselves
> - (skip point 2 if there is enough free memory, but that's
>   just a minor performance optimisation)

yep, this should work. A minor comment:

> +		if (atomic_read(&free_before_allocate))

i believe this needs to be per-zone and should preferably be read within
the zone spinlock - not atomic operations. Updating a global counter is a
big time problem on SMP.

	Ingo

Re: [patch] VM stable again?

On Mon, 15 May 2000, Ingo Molnar wrote:
> 
> yep, this should work. A minor comment:
> 
> > +		if (atomic_read(&free_before_allocate))
> 
> i believe this needs to be per-zone and should preferably be read within
> the zone spinlock - not atomic operations. Updating a global counter is a
> big time problem on SMP.

Nope.

It can't be per zone, because there is no "zone". There is only a generic
balance between different zones.

And the critical path actually only reads the counter, which is fine on
SMP: most of the time the counter should be quiescent, with every CPU just
having a shared copy in their caches. 

However, I do think that it might make sense to make this per-zonelist, so
that if a DMA request (or a request on another node in a NUMA environment)
causes another zone-list to be low-on-memory, that should not affect the
other zone-lists.

(The per-zonelist version should have pretty much the same behaviour as a
global one in the normal cases, it's just that it doesn't have the bad
behaviour in the uncommon cases).
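
A rough sketch of what "per-zonelist" could look like, as a userspace
model only. The struct zonelist_model type and the reclaim_begin(),
reclaim_end() and must_free_first() helpers are hypothetical names made
up for this illustration; only the free_before_allocate counter comes
from the patch under discussion.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model: one "someone is reclaiming" counter per
 * zonelist instead of a single global one. */
struct zonelist_model {
	const char *name;
	atomic_int free_before_allocate;
};

/* A task that is critically low on this zonelist enters reclaim. */
static void reclaim_begin(struct zonelist_model *zl)
{
	atomic_fetch_add(&zl->free_before_allocate, 1);
}

/* The same task leaves reclaim, whether or not it succeeded. */
static void reclaim_end(struct zonelist_model *zl)
{
	atomic_fetch_sub(&zl->free_before_allocate, 1);
}

/* Critical path: a plain read of the counter for the zonelist the
 * caller is allocating from. Other zonelists are not consulted. */
static bool must_free_first(struct zonelist_model *zl)
{
	return atomic_load(&zl->free_before_allocate) > 0;
}
```

With this shape, a DMA allocation that is stuck in reclaim raises only
the DMA zonelist's counter, so allocations from the other zonelists do
not get forced into freeing memory.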

Rik, mind cleaning that up, and fixing the leak? After that it looks
fine..

		Linus

Re: [patch] VM stable again?

On Mon, 15 May 2000, Rik van Riel wrote:
> 
> the patch below makes sure processes won't "eat" the pages
> another process is freeing and seems to avoid the nasty
> out of memory situations that people have seen.

Hmm.. The patch has an obvious leak: if the allocation ever fails, every
single allocator ever afterwards will be forced to try to free stuff,
simply because "free_before_allocate" wasn't decremented correctly. Which
is certainly not the right behaviour.
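
To make the leak concrete, here is a minimal userspace sketch of the
two shapes. It is an illustration only and does not reproduce the
patch: alloc_leaky(), alloc_balanced() and reclaim_ok() are
hypothetical names, and only the free_before_allocate counter is taken
from the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int free_before_allocate;

/* Hypothetical stand-in for reclaim that can be told to fail. */
static bool reclaim_ok(bool succeed)
{
	return succeed;
}

/* Leaky shape: the failure path returns without the decrement, so
 * the counter stays raised and every later allocator is forced to
 * try to free memory. */
static bool alloc_leaky(bool reclaim_succeeds)
{
	atomic_fetch_add(&free_before_allocate, 1);
	if (!reclaim_ok(reclaim_succeeds))
		return false;	/* BUG: counter is never dropped */
	atomic_fetch_sub(&free_before_allocate, 1);
	return true;
}

/* Balanced shape: every exit passes through the decrement. */
static bool alloc_balanced(bool reclaim_succeeds)
{
	bool ok;

	atomic_fetch_add(&free_before_allocate, 1);
	ok = reclaim_ok(reclaim_succeeds);
	atomic_fetch_sub(&free_before_allocate, 1);
	return ok;
}
```

After one failed alloc_leaky() call the counter reads 1 forever, while
alloc_balanced() leaves it at 0 on both success and failure.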

Also, this seems to assume that the regular reason for "out of memory" is
that the free lists emptied up while we were paging stuff out, which I do
not think is necessarily the full truth. I've definitely seen the simpler
case: just an overly eager failure from "try_to_free_pages()", which this
patch does not address.

That said, I think this patch is definitely conceptually the right thing:
it just says that while somebody else (not kswapd) is trying to free up
memory, nobody else should starve him out. I like the concept. 

		Linus

[patch2] pre8 VM stable

Hi Linus,

here is the second version of the patch, with the
leak fixed and per-zonelist status.

regards,

Rik

Patch
