[patches] kernel timer races

From: Andrew Morton (andrewm@uow.edu.au)
Date: Thu May 25 2000 - 04:57:52 EST

In 2.2 and 2.3 we have a bit of a problem with kernel timers:
 

     A timer handler may still be running after the timer has been
     deleted.
 

This is why del_timer_sync() was introduced in 2.3. But
del_timer_sync() has problems...
 

The kernel has 18 references to del_timer_sync and 668 references to
del_timer. I expect that a great number of the del_timer() calls are
racy, resulting in occasional failures such as:
 

 

 

     mainline()                        handler()
     ==========                        =========
                                      enter xxx_timer()
     del_timer()
     kfree(some_resource)
                                      access(some_resource)
 

The handler is accessing a resource after it has been released.
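
To make the pattern concrete, here is a minimal sketch of the racy close path (hypothetical driver names, 2.2/2.3-era timer API assumed):

#include <linux/timer.h>
#include <linux/slab.h>

struct xxx_private {
        struct timer_list timer;
        char *some_resource;
};

static void xxx_timer(unsigned long data)
{
        struct xxx_private *priv = (struct xxx_private *)data;

        /* may still be running on another CPU when del_timer() returns */
        priv->some_resource[0] = 0;             /* access(some_resource) */
}

static void xxx_close(struct xxx_private *priv)
{
        del_timer(&priv->timer);                /* handler may not have finished... */
        kfree(priv->some_resource);             /* ...so this frees it under the handler */
}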
 

 

 

 

     mainline()                        handler()
     ==========                        =========
                                      enter xxx_timer()
     del_timer()
     disable(some_device)
                                      access(some_device)
 

The handler is touching some device after it has been shut down.
 

 

 

 

     mainline()                        handler()
     ==========                        =========
                                      enter xxx_timer()
     del_timer()
     foo = 1;                          foo = 2;
 

What is the value of 'foo'?
 

 

 

 

     mainline()                        handler()
     ==========                        =========
                                      enter xxx_timer()
     del_timer()                       run...
     MOD_DEC_USE_COUNT                 run...
     sys_delete_module()               run...
                                      run...
 

The handler's text has been unloaded.
 

 

 

     mainline()                        handler()
     ==========                        =========
                                      enter xxx_timer()
     del_timer()
     kfree(timer)
                                      mod_timer(timer)
 

 

The latter is the worst, and unfortunately it is one of the most likely
scenarios. The timer is still ticking over and will continue to do so,
but its controlling timer_list is now in kfree'ed memory. The kernel
is a dead man walking. Many of the net drivers can do this.
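
A sketch of how a driver gets into this worst case (hypothetical names): a periodic handler re-arms its own timer, and the close path kfree()s the structure that embeds the timer_list:

#include <linux/timer.h>
#include <linux/sched.h>        /* jiffies, HZ */
#include <linux/slab.h>

struct nic_private {
        struct timer_list watchdog;
        /* ... */
};

static void nic_watchdog(unsigned long data)
{
        struct nic_private *np = (struct nic_private *)data;

        /* ...poll the hardware... */

        /* re-arm: the mod_timer(timer) in the diagram above */
        mod_timer(&np->watchdog, jiffies + HZ);
}

static void nic_close(struct nic_private *np)
{
        del_timer(&np->watchdog);       /* handler may still be mid-flight */
        kfree(np);                      /* its timer_list is now freed memory, and the
                                           mod_timer() above re-queues a timer which
                                           lives in that freed memory */
}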
 

 

 

I believe this needs fixing. Most of the timer handling in the kernel
assumes that once del_timer() returns, the timer is deleted and the
handler has terminated. I wonder how many developers even know that
this is not true?
 

 

The fix is to convert del_timer() to have synchronous semantics. That
is:
 

     Upon return from del_timer(), the timer is not registered and
     the handler is not running.
 

This implies that del_timer() must spin until the handler completes its
run.
 

This is in fact quite hard to do, because of the current very
permissive timer API:
 

     - The handler may call del_timer()
     - The handler may call add_timer()/mod_timer().
     - The handler may kfree() the timer_list storage.
 

The last one is the killer because it means that run_timer_list()
cannot hold state within the timer_list structure. Note that
run_timer_list() is at present careful to not touch the timer_list
after the handler returns.
 

Alexey tells us that handlers may kfree the timer_list. We would be
very interested in finding out which code does this.
 

 

 

The current del_timer_sync() attempts to give us the desired
synchronous semantics. But it has a few problems:
 

     - The handler must call timer_exit() to tell
       del_timer_sync() that it (the handler) has finished. Very, very
       few handlers currently call timer_exit() (see the usage sketch
       after this list).
 

     - Once timer_exit() has been called, the mainline code can
       do a sys_delete_module() and unload the timer handler's text.
       Unlikely, but possible.
 

     - The current del_timer_sync() spins on timer->running in
       timer_synchronize(). After the handler returns, the timer_list
       could have been kfree'ed, but del_timer_sync reads and writes
       its members.
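
For reference, this is roughly how the existing protocol is meant to be used, as a sketch pieced together from the description above (the exact timer_exit() calling convention is assumed):

#include <linux/timer.h>

struct xxx_private {
        struct timer_list timer;
        /* ... */
};

static void xxx_timer(unsigned long data)
{
        struct xxx_private *priv = (struct xxx_private *)data;

        /* ...do the periodic work... */

        timer_exit(&priv->timer);       /* very, very few handlers do this */
}

static void xxx_close(struct xxx_private *priv)
{
        /* spins (in timer_synchronize()) until the handler calls timer_exit() */
        del_timer_sync(&priv->timer);
}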
 

 

I have put together several kernel modules:
 

     http://www.uow.edu.au/~andrewm/linux/timertest.tar.gz
 

The 'showit.o' module demonstrates the existence of the problem. It
consists of a kernel thread and a periodic timer. The kernel thread
spins until the handler starts, then deletes the timer and then checks
to see if the handler is still running. It shows that under 2.2 and
2.3 it is still running.
 

The 'timertest' module is a bunch of test cases for the proposed patch.
 

The 'kfree' module is a test case for some timer-used-after-kfree
detection code (uses slab poisoning, not included in these patches).
 

The 'deadlock' module forces a del_timer() deadlock (see below) to test
the new deadlock detection/avoidance code.
 

 

This patch uses timerlist_lock to serialise del_timer() and the
handlers. I did consider serialising them with global_bh_lock. It's
much of a muchness, and I didn't use global_bh_lock for several
reasons:
 

     - I expect global_bh_lock will disappear one day, when
       timer handlers are no longer globally serialised.
 

     - 2.2 and 2.3 are quite different in their handling of BH
       locking. But they are quite similar in their top-level handling
       of timers. So patching run_timer_list() and del_timer() is more
       portable to 2.2.
 

     - No point in locking against _all_ BH handlers when we're
       only concerned with timer handlers.
 

     - Cleaner.
 

The patch against 2.3.99-pre10-3 is at
 

     http://www.uow.edu.au/~andrewm/linux/timer-race-6.patch
 

It does the following (a condensed sketch of the logic appears after the list):
 

     - Maintains a pointer to the currently-running handler's
       timer_list.
 

     - In run_timer_list(), note the currently running timer in
       that pointer.
 

     - In del_timer(), check (under timerlist_lock) to see if
       this timer is running. If so, spin until it stops.
 

     - In internal_add_timer(), check to see if the newly added
       timer's handler is currently running. If it is, record that
       fact for del_timer().
 

     - In del_timer(), after the handler has stopped running: if
       the timer was re-added (by either the handler or by another
       CPU), delete the darn thing again.
 

     - del_timer_sync() disappears. It's macroed onto del_timer().
 

     - timer_exit() becomes a no-op.
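
A condensed sketch of that logic, not the actual patch (it omits the internal_add_timer() bookkeeping, the deadlock detection described below, and the case of a handler deleting its own timer; detach_timer(), timer_pending() and timerlist_lock are the existing kernel/timer.c internals):

#include <linux/timer.h>
#include <linux/spinlock.h>

/* the new pointer to the currently-running handler's timer_list */
static struct timer_list * volatile running_timer;

static void run_one_timer(struct timer_list *timer)
{
        /* called from run_timer_list() with timerlist_lock held */
        detach_timer(timer);
        running_timer = timer;
        spin_unlock_irq(&timerlist_lock);
        timer->function(timer->data);
        spin_lock_irq(&timerlist_lock);
        running_timer = NULL;
}

int del_timer(struct timer_list *timer)
{
        unsigned long flags;
        int ret;

        spin_lock_irqsave(&timerlist_lock, flags);
        ret = detach_timer(timer);
        while (running_timer == timer) {
                /* handler is running on another CPU: wait for it
                   (the real patch bounds this spin and reports deadlocks) */
                spin_unlock_irqrestore(&timerlist_lock, flags);
                spin_lock_irqsave(&timerlist_lock, flags);
        }
        if (timer_pending(timer))       /* re-added by the handler or another CPU */
                ret |= detach_timer(timer);
        spin_unlock_irqrestore(&timerlist_lock, flags);
        return ret;
}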
 

 

 

There's also a bit of temp diagnostic/debuggy stuff in this patch.
 

     - Detect the occurrence of CPU0 doing a del_timer() during
       CPU1's execution of the handler. If this does occur, direct
       people to
 

          http://www.uow.edu.au/~andrewm/linux/del_timer.html
 

       Reminder: this scenario is the whole reason for this
       patch (and this long email). We need to gather these statistics
       so we can decide whether or not to patch 2.2. If I don't get
       any emails this whole thing was a waste of time. (But I've
       already seen it happen three times, and I haven't even tried...)
 

     - I was doing some work using slab poisoning to detect
       invalid timer_list usage. To support this I added
       some initialisation macros in timer.h:
 

        struct timer_list t = INIT_TIMER_FND(fn, d);
 

           For initialising a timer, given the handler and
           the 'data' initialiser.
 

        struct timer_list t = INIT_TIMER_FN(fn);
 

           Initialise the timer with the handler only.
 

        struct timer_list t = INIT_TIMER;
 

           Initialise to all zeroes.
 

       In list.h:
 

          #define LIST_HEAD_INIT_NULL { 0, 0 }
 

       For cleaner list_head initialisers.
 

       These changes allow us to centralise all timer_list
        initialisation within init_timer() and INIT_TIMER.
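
        Purely as an illustration, assuming the 2.3-era struct timer_list
        layout of { list, expires, data, function }, the initialisers could
        look something like this (the real definitions are in the patch and
        may differ):

        /* LIST_HEAD_INIT_NULL comes from list.h, as above */
        #define INIT_TIMER_FND(fn, d)   { LIST_HEAD_INIT_NULL, 0, (d), (fn) }
        #define INIT_TIMER_FN(fn)       INIT_TIMER_FND((fn), 0)
        #define INIT_TIMER              INIT_TIMER_FN(0)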
 

     - There are many files in the kernel tree which use
       statically initialised timers but fail to initialise them with
       init_timer().
 

       I have a separate patch which changes all these so that
       stuff like:
 

          static struct timer_list t = { NULL, NULL, 0, 0, some_func };
 

       becomes
 

          static struct timer_list t = INIT_TIMER_FN(some_func);
 

       This change was designed to allow the slab poisoning to
       be effective, but it is a worthwhile tidy-up in any case.
 

       This patch against 56 files is at
 

         http://www.uow.edu.au/~andrewm/linux/timer-tidy-6.patch
 

     - Deadlock detection:
 

 

This del_timer synchronisation introduces a potential deadlock if your
mainline and your timer handler both acquire the same lock:
 

     mainline()                        handler()
     ==========                        =========
 

     spin_lock_irqsave(some_lock)
                                      spin_lock_irqsave(some_lock)
     del_timer()
                                      spin_unlock_irqrestore(some_lock)
     spin_unlock_irqrestore(some_lock)
 

With the current async del_timer() semantics this would not deadlock.
But with the proposed synchronous semantics, del_timer() will spin for
ever. So in this patch the spin is limited to 5e8 loops, after which
it complains, does a stack dump (x86 only), directs the user to
http://www.uow.edu.au/~andrewm/linux/deadlock.html and proceeds.
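
The driver-side pattern which triggers this deadlock looks something like the following (hypothetical names; legal today with the asynchronous del_timer(), deadly with the synchronous one):

#include <linux/timer.h>
#include <linux/spinlock.h>

struct xxx_private {
        struct timer_list timer;
        spinlock_t lock;
};

static void xxx_timer(unsigned long data)
{
        struct xxx_private *priv = (struct xxx_private *)data;
        unsigned long flags;

        spin_lock_irqsave(&priv->lock, flags);  /* handler wants the lock... */
        /* ...periodic work... */
        spin_unlock_irqrestore(&priv->lock, flags);
}

static void xxx_close(struct xxx_private *priv)
{
        unsigned long flags;

        spin_lock_irqsave(&priv->lock, flags);
        del_timer(&priv->timer);        /* ...which we hold while spinning in here */
        spin_unlock_irqrestore(&priv->lock, flags);
}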
 

There is still a little race in this code, identified in del_timer().
This can be fixed by some wild cross-communication between del_timer()
and run_timer_list(), but that fix will only obscure things more.
 

This patch will get confused (and simply fall back to the current
behaviour) if the handler re-adds the timer, then deletes it and then
kfree's the timer's memory. Dumb thing to do anyway. Poisoning will
pick this up.
 

I don't expect that anything currently depends upon del_timer()'s
asynchronous behaviour (how could it?). If this is required it would
be simple to add a new function, del_timer_async().
 

 

----

This fix would (I think) be cleaner and more efficient if we could store state within the timer_list structure, and examine that state within del_timer() and run_timer_list():

del_timer()
{
        detach_timer();
        while (timer->running)
                ;
        if (timer_pending(timer))
                detach_timer();
}

run_timer_list()
{
        detach_timer();
        timer->running = 1;
        timer->function();
        timer->running = 0;
}

Toss in the appropriate locking and this is fine. But if the handler can kfree the timer_list, neither of these functions is allowed to touch the timer_list structure after the handler returns (or, in the case of del_timer(), while it is running).

So to do the above, we must redefine the timer API. Namely, the handler _must_ return with the timer_list structure still validly accessible by run_timer_list() and del_timer(). That is, the handler is not allowed to kfree() or otherwise maliciously diddle with the timer_list. This is why we would be very interested in finding out which code does this...

[ But are there other ways in which the handler could trash the timer_list apart from kfree()? ]

 

-- -akpm-



Re: [patches] kernel timer races

From: Alan Cox (alan@lxorguk.ukuu.org.uk)
Date: Thu May 25 2000 - 07:40:45 EST

There is a really dumb fix perhaps ?
 

        wait_on_timers();
 

That would map to a bh wait on 2.2 and the saner timer wait on 2.3.x
 

 



Re: [patches] kernel timer races

From: Andrew Morton (andrewm@uow.edu.au)
Date: Thu May 25 2000 - 09:07:42 EST

Alan Cox wrote:
>
> There is a really dumb fix perhaps ?
>
> wait_on_timers();
>
> That would map to a bh wait on 2.2 and the saner timer wait on 2.3.x
 

Not quite sure what you mean here. Your emails are shorter than mine :)
 

Are you suggesting a wait_on_bh() within 2.2's del_timer()? If so we'd
have to take a peek at which CPU currently owns the BH to avoid deadlock
if a BH or ISR called del_timer().
 

hmm... wait_on_bh() isn't exported, and synchronize_bh() doesn't work
from BH's or IRQ's. This means that each arch would have to export
something new. I'll confess that the current BH locking makes my head
spin. I suspect this would all end up with similar complexity to the
current patch, if it's indeed what you meant.
 

More hmm... synchronize_bh() doesn't do anything to stop a new BH from
starting as soon as it has returned. So it's safe within __global_cli()
and it can be used after del_timer() to give del_timer_sync() semantics,
but apart from that, what use is it??
 

This patch, BTW, very much assumes that only one CPU can run a timer
handler at a time. It will BUG() if this isn't true. It's pretty easy
(although a bit expensive) to generalise it when handlers can run
simultaneously. (That's what version 4 did...)
 

I don't know how often these nasties can happen - you can do sums
assuming that a handler takes 100 ns and del_timer() is called every
100 ms and everything's stochastic: once every 300 hours. I've observed
it about once every ~5 hours, but this is just with a simple
IDE/NIC/nothing else box. It's guesswork, which is why I put in the
please-send-me-mail printk() when it happens.
 

I wish Alexey were around to comment, but we've seen neither hideski nor
hairski for a couple of weeks.
 

BTW2: is there any technical reason why the timer_struct clients haven't
been migrated over to use timer_list yet?
 

 

-- 
-akpm-



Re: [patches] kernel timer races

From: Alan Cox (alan@lxorguk.ukuu.org.uk)
Date: Thu May 25 2000 - 09:18:09 EST

> More hmm... synchronize_bh() doesn't do anything to stop a new BH from
> starting as soon as it has returned. So it's safe within __global_cli()
> and it can be used after del_timer() to give del_timer_sync() semantics,
> but apart from that, what use is it??
 

I'm thinking
 

        del_timer();
        del_timer();
        blah
        wait_timers();
 

Which happens to avoid the overhead and cost in the normal timer deletion
path.
 

> BTW2: is there any technical reason why the timer_struct clients haven't
> been migrated over to use timer_list yet?
 

Not that I know of
 

 



Re: [patches] kernel timer races

From: Andrew Morton (andrewm@uow.edu.au)
Date: Thu May 25 2000 - 10:02:37 EST

Alan Cox wrote:
>
> > More hmm... synchronize_bh() doesn't do anything to stop a new BH from
> > starting as soon as it has returned. So it's safe within __global_cli()
> > and it can be used after del_timer() to give del_timer_sync() semantics,
> > but apart from that, what use is it??
>
> Im thinking
>
> del_timer();
> del_timer();
> blah
> wait_timers();
 

Yes, that will work in process context but can't be globally macro'ed or
inlined onto the original del_timer because extra checking is needed in
BH/IRQ context.
 

IMO, all ~700 uses of del_timer() should use the synchronous version
until they have been audited for del_timer_async()-safety.
 

> Which happens to avoid the overhead and cost in the normal timer deletion
> path.
 

The cost isn't too high, actually. A memory read in internal_add_timer,
a read and a write in del_timer and four writes in run_timer_list().
Much more cost if the special-case code is triggered, of course. Very
rare.
 

> > BTW2: is there any technical reason why the timer_struct clients haven't
> > been migrated over to use timer_list yet?
>
> Not that I know of
 

Tempting.
 

 

-- 
-akpm-



Re: [patches] kernel timer races

From: Alan Cox (alan@lxorguk.ukuu.org.uk)
Date: Thu May 25 2000 - 10:01:22 EST

> IMO, all ~700 uses of del_timer() should use the synchronous version
> until they have been audited for del_timer_async()-safety.
 

And what about auditing them for del_timer sync safety? One reason del_timer
is async is that the original del_timer for 2.1.x, which waited, did cause
deadlocks.
 

 

 



Re: [patches] kernel timer races

From: Andrew Morton (andrewm@uow.edu.au)
Date: Thu May 25 2000 - 10:34:38 EST

Alan Cox wrote:
>
> > IMO, all ~700 uses of del_timer() should use the synchronous version
> > until they have been audited for del_timer_async()-safety.
>
> And what about auditing them for del_timer sync safety? One reason del_timer
> is async is that the original del_timer for 2.1.x, which waited, did cause
> deadlocks.
 

That's why the patch detects the deadlock situation, breaks it and spits
out diagnostics - so we can hunt down the deadlocks and either fix them
or use del_timer_sync() in those special cases.
 

 

-- 
-akpm-



Re: kernel timer races

From: Andrey Savochkin (saw@saw.sw.com.sg)
Date: Fri May 26 2000 - 07:33:50 EST

Hello,
 

On Thu, May 25, 2000 at 01:40:45PM +0100, Alan Cox wrote:
> There is a really dumb fix perhaps ?
>
> wait_on_timers();
>
> That would map to a bh wait on 2.2 and the saner timer wait on 2.3.x
 

Andrew, there are a lot of possible timer races, but most of them can be very
easily solved by the timer user. For example, setting some flag before the
del_timer call so that the handler will not reinsert the timer solves several
kinds of races. BTW, this flag seems to be necessary with del_timer_sync
anyway.
 

I like Alan's proposal.
Actually, with wait_on_timers() and a reinsert-protection flag we do not need
del_timer_sync for most drivers. wait_on_timers() has a larger overhead
because it waits for all timers, but I think it's acceptable for a device
close call.
 

Best regards
                Andrey
 

--- ./kernel/timer.c.timer Fri May 26 17:55:54 2000
+++ ./kernel/timer.c Fri May 26 20:20:40 2000
@@ -270,9 +270,12 @@
         tv->index = (tv->index + 1) & TVN_MASK;
 }
 
+static volatile int walking_timer_list;
+
 static inline void run_timer_list(void)
 {
         spin_lock_irq(&timerlist_lock);
+ walking_timer_list = 1;
         while ((long)(jiffies - timer_jiffies) >= 0) {
                 struct list_head *head, *curr;
                 if (!tv1.index) {
@@ -304,7 +307,20 @@
                 ++timer_jiffies;
                 tv1.index = (tv1.index + 1) & TVR_MASK;
         }
+ walking_timer_list = 0;
         spin_unlock_irq(&timerlist_lock);
+}
+
+/*
+ * This function ensures that timers running at the moment of the call on other
+ * CPUs (if any) have finished. It doesn't guarantee that upon the exit the
+ * next round of timers hasn't begun.
+ */
+void wait_on_timers(void)
+{
+ do {
+ rmb();
+ } while (walking_timer_list);
 }
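
A sketch of the close-path pattern this enables (hypothetical driver names; a reinsert-protection flag plus the wait_on_timers() from the diff above):

#include <linux/timer.h>
#include <linux/sched.h>        /* jiffies, HZ */

struct xxx_private {
        struct timer_list timer;
        int closing;            /* reinsert-protection flag */
};

static void xxx_timer(unsigned long data)
{
        struct xxx_private *priv = (struct xxx_private *)data;

        if (priv->closing)
                return;                 /* do not reinsert the timer */
        /* ...periodic work... */
        mod_timer(&priv->timer, jiffies + HZ);
}

static int xxx_close(struct xxx_private *priv)
{
        priv->closing = 1;              /* set the flag before del_timer() */
        del_timer(&priv->timer);
        wait_on_timers();               /* handlers running at this moment have finished */
        /* now tear down the device / free resources */
        return 0;
}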
 
 
 



Re: [patches] kernel timer races

From: kuznet@ms2.inr.ac.ru
Date: Fri May 26 2000 - 13:26:28 EST

Hello!
 

> I wish Alexey were around to comment,
 

Well, Andrew, you are not the first to be bothered by this problem.
This bug existed in the same form in 2.1, and it exists in 2.2 as well.
 

Actually, in 2.1 for some time del_timer() contained synchronize_bh()
exactly to avoid fixing these 250 places. I do not even know when
it disappeared (apparently that solution was prone to deadlocks);
it was a surprise for me. Well, it was wrong in any case.
 

 

In 2.2 moral equivalent of del_timer_sync() is:
 

start_bh_atomic();
del_timer();
end_bh_atomic();
 

rather than:
 

del_timer();
synchronize_bh();
 

And wait_on_timers() is meaningless for exactly the same reason: a
periodic timer can be restarted after del_timer().
 

 

 

What about the solution with waiting inside del_timer()...
A synchronous del_timer() is prone to deadlocks, so
you will not avoid auditing all these 256 places. 8)
 

About your analysis: _ALL_ the del_timer()'s in
net/{core,ipv4,ipv6,packet,netlink,sched,unix} are "asynchronous"
in your language and should be replaced with del_timer_async().
I did not quite understand the problems which you found there.
Please explain.
 

Alexey
 



Re: kernel timer races

From: Andrew Morton (andrewm@uow.edu.au)
Date: Fri May 26 2000 - 11:01:43 EST

Andrey Savochkin wrote:
>
> Hello,
 

Hi, Andrey.
 

> On Thu, May 25, 2000 at 01:40:45PM +0100, Alan Cox wrote:
> > There is a really dumb fix perhaps ?
> >
> > wait_on_timers();
> >
> > That would map to a bh wait on 2.2 and the saner timer wait on 2.3.x
>
> Andrew, there are a lot of possible timer races,
 

Over a hundred, many of them fatal. Cheery, aren't I?
 

321 files use timers. I've performed a randomish audit of 35. Eight of
these are correct. http://www.uow.edu.au/~andrewm/linux/timers.txt
 

[ eepro100 should be using del_timer_sync() in speedo_close(),
  and doesn't need to wrap del_timer/del_timer_sync in
  #ifdef CONFIG_SMP in speedo_tx_timeout(), BTW :) ]
 

> but most of them can be very
> easily solved by the timer user. For example, setting some flag before the
> del_timer call so that the handler will not reinsert the timer solves several
> kinds of races.
 

Yes, it solves the timer-running-continuously-after-module-unload
problem, but not the handler-still-running-once-after-close problem. A
number of files do some clever spinlocking and trylocking to get around
this.
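
One shape that workaround typically takes is sketched below (hypothetical names): the close path sets a 'dying' flag under a lock which the handler also takes, so a handler that races with close bails out without touching anything:

#include <linux/timer.h>
#include <linux/spinlock.h>

struct xxx_private {
        struct timer_list timer;
        spinlock_t lock;
        int dying;
};

static void xxx_timer(unsigned long data)
{
        struct xxx_private *priv = (struct xxx_private *)data;
        unsigned long flags;

        spin_lock_irqsave(&priv->lock, flags);
        if (priv->dying) {
                /* close has run (or is running): don't touch the device,
                   don't re-arm */
                spin_unlock_irqrestore(&priv->lock, flags);
                return;
        }
        /* ...touch the hardware, re-arm the timer... */
        spin_unlock_irqrestore(&priv->lock, flags);
}

static void xxx_close(struct xxx_private *priv)
{
        unsigned long flags;

        spin_lock_irqsave(&priv->lock, flags);
        priv->dying = 1;
        spin_unlock_irqrestore(&priv->lock, flags);

        del_timer(&priv->timer);
        /* a handler which raced with us sees 'dying' and does nothing,
           but note that its code may still be executing at this point */
}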
 

> BTW, this flag seems to be necessary with del_timer_sync
> anyway.
 

The present del_timer_sync takes care of that - it will detach the
timer _after_ its handler has completed.
 

But del_timer_sync has a few problems, as I mentioned. Plus: if the
handler kfree's the timer (I've found a couple of these, BTW) then
del_timer_sync() will hang if slab poisoning is turned on.
 

> I like Alan's proposal.
> Actually, with wait_on_timers() and a reinsert-protection flag we do not need
> del_timer_sync for most drivers. wait_on_timers() has a larger overhead
> because it waits for all timers, but I think it's acceptable for a device
> close call.
 

Sure. But del_timer() + wait_on_timers() + dont-re-add-flag ==
del_timer_sync()?
 

It helps a bit; we also need to do something about the fact that the
handler can be executing after the del_timer, even if it's the last time
it'll run. Only a tiny percentage of the clients of the kernel timers
are designed to cope with this.
 

I have completed the audit which Alan suggested (deadlocks caused by
making del_timer synchronous). I reviewed the 92 files which contain
the strings "del_timer|add_timer|mod_timer" and "spin". I patched the
following:
 

include/net/tcp.h
net/ipv4/tcp_timer.c
net/ipv4/igmp.c
net/ipv4/ip_fragment.c
net/decnet/dn_route.c
net/ipv6/addrconf.c
net/ipv6/reassembly.c
net/sunrpc/sched.c
net/sched/sch_generic.c
net/sched/estimator.c
drivers/net/slip.c
drivers/net/wan/syncppp.c
drivers/net/via-rhine.c
drivers/char/rtc.c
drivers/char/synclink.c
drivers/scsi/pci2220i.c
drivers/scsi/scsi_error.c
drivers/sound/sonicvibes.c
drivers/sound/cmpci.c
drivers/sound/esssolo1.c
drivers/video/tdfxfb.c
drivers/video/matrox/matroxfb_DAC1064.c
drivers/usb/uhci.c
drivers/ide/ide.c
drivers/ide/ide-disk.c
 

Detailed analysis at http://www.uow.edu.au/~andrewm/linux/spintimers.txt
Patch at http://www.uow.edu.au/~andrewm/linux/timer-deadlocks-7.patch
 

The problem is, in many of the above cases I simply replaced del_timer
with del_timer_async, so the races which some of the above files have
will remain in place. But the potential problems are identified and I
can now bug people :).
 

I expect the reason why these races haven't caused general disaster is
that they only affect SMP, and a lot of them are in code which doesn't
tend to be used on SMP machines: sound, old NICs, oddball devices, ISDN,
framebuffer, etc:
 

- IDE is solid
- IPv4 is solid (I think - need to revisit)
- SCSI is solid (except for pci2220i.c)
- The popular NIC drivers are solid (except for close())
- IPv6 has problems...
 

 

We have a choice of:
 

1: Give del_timer synchronous semantics and patch the
   above 25 files or
 

2: Fix del_timer_sync and then use it in ~250 files or
 

3: Throw up our hands and run away, as appeared to
   happen in 2.1.x
 

I think you're implying that it should be option 2. That's OK - it's a
lot more work but at least I get to read most of the kernel source, fix
other bugs as they pop up and get to meet lots of maintainers :)
 

But it's a lot more work and options 1 and 2 are equivalent. The only
difference is in the names of the functions!
 

 

 

-- 
-akpm-



Re: [patches] kernel timer races

From: Andrew Morton (andrewm@uow.edu.au)
Date: Sat May 27 2000 - 00:07:03 EST

kuznet@ms2.inr.ac.ru wrote:
>
>
> In 2.2 moral equivalent of del_timer_sync() is:
>
> start_bh_atomic();
> del_timer();
> end_bh_atomic();
 

Yes, except start_bh_atomic() returns immediately if called from BH or
IRQ context.
 

One thing which we must avoid is spinning on completion of ALL BH's.
This is because we would end up spending far too much time waiting for
irrelevant things like netdevice->hard_start_xmit().
 

So to synchronise we need to spin on something which is specific to
timers. Ideally, something which is specific to the single timer in
which we are interested. That's what my proposed patch does.
 

> rather than:
>
> del_timer();
> synchronize_bh();
>
> And wait_on_timers() is meaningless exactly by the same reason:
> periodic timer can be restarted after del_timer().
>
> What's about solution with waiting inside del_timer()...
> Synchronous del_timer() is inclined to deadlocks, so that
> you will not avoid auditing all these 256 places. 8)
 

Correct. I've looked at all the files which contain timer functions as
well as the string "spin". There may be a couple I've missed, but the
deadlock-detection code will trap these the first time they happen to
anyone. The machine will freeze for half a second or so and then shout
out a URL, stack backtrace, etc. We'll find 'em.
 

> About your analysis: _ALL_ the del_timer()'s in
> net/{core,ipv4,ipv6,packet,netlink,sched,unix} are "asynchronous"
> on your language and should be replaced with del_timer_async().
 

Yes. But not bridge, netrom, decnet, ...
 

> I not quite understood the problems, which you found there.
> Please, explain.
 

I assumed that most of this was correct because it has your name all
over it :) So a blanket s/del_timer/del_timer_async/ is in order.
 

So I didn't look closely at core, ipv4, etc. I'll do so, but it's not
urgent.
 

I spotted two potential-I-dont-understand-this-stuff problems in ipv6:
 

inet6_ifa_finish_destroy() does a del_timer() and then kfree's 'ifp'.
But if the handler is still running, it is accessing fields within *ifp
(I think). Not very important, because inet6_ifa_finish_destroy() looks
like a 'cannot happen' function.
 

In reassembly.c, both calls to del_timer() are followed by a
frag_kfree_s(fq), but the handler could still be running, and it
accesses *fq.
 

BTW: I prefer the name "del_timer_async" - it makes people THINK before
using it. I'd like to rename del_timer_sync to del_timer_can_deadlock
as well :)
 

 

 

-- 
-akpm-



Re: [patches] kernel timer races

From: kuznet@ms2.inr.ac.ru
Date: Sat May 27 2000 - 12:06:44 EST

Hello!
 

> > In 2.2 moral equivalent of del_timer_sync() is:
> >
> > start_bh_atomic();
> > del_timer();
> > end_bh_atomic();
>
> Yes, except start_bh_atomic() returns immediately if called from BH or
 

In 2.2 that is exactly what we need.
 

 

> IRQ context.
 

Not even discussed. 8)
 

 

> One thing which we must avoid is spinning on completion of ALL BH's.
 

In 2.3 it is simply impossible to wait on this.
In 2.2 it is normal.
 

 

> So to synchronise we need to spin on something which is specific to
> timers. Ideally, something which is specific to the single timer in
> which we are interested. That's what my proposed patch does.
 

To be honest, I still have not found the patch where the new del_timer()
is defined. 8)
 

 

> Yes. But not bridge, netrom, decnet, ...
 

Bridge and decnet must be clean as well. If they are not,
this code has active maintainers who can be kicked.
 

 

> inet6_ifa_finish_destroy() does a del_timer() and then kfree's 'ifp'.
 

All these things use reference counting. When the destructor is entered,
it is the last user and no timer is running on the object.
 

The del_timer() there is purely a debugging check.
 

 

> In reassembly.c, both calls to del_timer() are followed by a
> frag_kfree_s(fq), but the handler could still be running, and it
> accesses *fq.
 

Yes... It was safe in 2.2 (BH only); now it is a bug. And del_timer_sync()
is dangerous to use here, because the reassembly function could be called
from an obscure context... Reference counting.
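
A generic sketch of the reference-counting idea (not the actual ipv6 reassembly code): the pending timer holds its own reference, so neither the handler nor the unlink path can free the object under the other:

#include <linux/timer.h>
#include <linux/slab.h>
#include <asm/atomic.h>

struct frag_queue {
        struct timer_list timer;
        atomic_t refcnt;
        /* ...fragment list, etc... */
};

static void fq_put(struct frag_queue *fq)
{
        if (atomic_dec_and_test(&fq->refcnt))
                kfree(fq);              /* last user: only now is it safe to free */
}

static void fq_expire(unsigned long data)       /* the timer handler */
{
        struct frag_queue *fq = (struct frag_queue *)data;

        /* ...discard the fragments... */
        fq_put(fq);                     /* drop the reference held by the timer */
}

static void fq_kill(struct frag_queue *fq)
{
        if (del_timer(&fq->timer))
                fq_put(fq);             /* timer was pending: drop its reference here */
        /* otherwise the handler already ran (or is running) and drops it itself */
        fq_put(fq);                     /* drop the table's own reference */
}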
 

 

> BTW: I prefer the name "del_timer_async" - it makes people THINK before
> using it. I'd like to rename del_timer_sync to del_timer_can_deadlock
> as well :)
 

Agreed! The thing which I still do not understand is how you managed
to implement a correct del_timer_sync(). 8)
 

Alexey
 
