Too little, too slow

An introduction to memory management
Linux memory management today and tomorrow

Rik van Riel
Conectiva S.A.

Generated from the Magicpoint slides 27/nov/2000
(page 1)

Too little, too slow

(page 2)

Memory hierarchy


(page 3)

Memory hierarchy: basics

(page 4)

Memory hierarchy: who does what?

(page 5)

Memory hierarchy: numbers

Below is a table with some "randomly" chosen CPU types and their L1 and L2 cache characteristics. The exact numbers are not very important, but it is good to realise that even the L2 cache is only fast enough to let the CPU core run at about 4% of its maximum speed (roughly one access per ~24-cycle latency)...

Cache size & speed vs. CPU model (latencies in CPU cycles, associativity in ways):

                Celeron A   PIII    Athlon   K6-III
  L1 size       32kB        32kB    128kB    64kB
  L2 size       128kB       512kB   512kB    256kB
  L1 latency    3           3       3        2
  L2 latency    8           24      21       11
  L2 total      11          27      24       13
  L1 assoc.     4           4       2        2
  L2 assoc.     8           4       2        4

As you can see, RAM and hard disk speeds are so ridiculously low that you must avoid RAM and disk accesses at all costs if you want performance.

RAM latency: 50 - 400 cycles
Disk latency: > 3,000,000 cycles

(page 6)

Memory hierarchy: working set

Of course this breaks in applications that don't exhibit good locality of reference. Luckily it works well for most programs.

(page 7)


(page 8)

Cache: how it works

(page 9)

Cache: dirty memory

The fastest way of getting a cache line replaced is by not writing to it, so we CAN just forget about it when we need to cache something else.

(page 10)

Cache: coherency

We must make sure that we won't write stale data to disk or work on stale data from another CPU.

(page 11)

Cache: program optimisation

This example is from the Linux kernel source code.

When all running processes have used their time slice, the scheduler recalculates every counter:

	for_each_task(p) {
		p->counter = (p->counter >> 1) + p->priority;
	}

(page 12)

Cache: program optimisation

The magic trick: check first, and only write when the value would actually change, so unchanged cache lines stay clean:

	for_each_task(p) {
		if (p->counter != (p->counter >> 1) + p->priority)
			p->counter = (p->counter >> 1) + p->priority;
	}

(page 13)


(page 14)

RAM: virtual memory

(page 15)

RAM: virtual memory


(page 16)

RAM: virt -> phys mapping

(page 17)

RAM: page faults

Because a page fault is very Very VERY slow, we really want to minimise the number of page faults.

(page 18)

RAM: page replacement

(page 19)

RAM: LRU approximations

(page 20)

RAM: other replacement tricks

(page 21)


(page 22)

Disk: performance characteristics

(page 23)

Disk: performance optimisations

How do we get good disk performance?

(page 24)

Disk: RAID

(page 25)

HSM: beyond disk

(page 26)

Too little, too slow

Conclusions of the introduction

(page 27)

Linux Memory Management

(page 28)

Linux VM: physical zones

actions like swapin and swapout

On x86 the zones happen to be "inclusive", but this is not the case for some other architectures (eg. with NUMA-like setups).

(page 29)

Linux VM: virtual page use

(page 30)

Linux VM 2.2: try_to_free_pages

You can read the functions in mm/vmscan.c and mm/filemap.c

(page 31)

Linux 2.2 VM: strong / weak points

(we haven't scanned memory in a long time)

(page 32)

Linux VM: changes in 2.4

Not yet in 2.4 as of September 2000

(page 33)

Linux VM: 2.4 / 2.5 page aging

(page 34)

Linux VM: multiqueue VM

(page 35)

Linux VM: balancing aging and flushing

(page 36)

Linux 2.4 VM: try_to_free_pages

(page 37)

Linux 2.4 VM: page_launder

(page 38)

Linux VM: flush callback & pinned pages

(page 39)

Linux VM: plans for 2.5

(page 40)

Linux VM: plans for 2.5

(page 41)

Linux VM: more plans for 2.5

(page 42)

Linux VM: OOM / QoS

(page 43)


Thanks go out to

(page 44)