From: "Joshua M. Thompson" <in...@optera.com>
Subject: Processes hanging on 2.0.20; not just on Alphas anymore :(
Date: 1996/09/19
Message-ID: <Pine.LNX.3.91.960919191449.25491A-100000@jade.optera.com>#1/1
X-Deja-AN: 184100534
sender: owner-linux-ker...@vger.rutgers.edu
content-type: TEXT/PLAIN; charset=US-ASCII
x-hdr-sender: in...@optera.com
mime-version: 1.0
x-env-sender: owner-linux-kernel-outgo...@vger.rutgers.edu
newsgroups: linux.dev.kernel


I just noticed one of our Pentium servers running 2.0.20 is experiencing 
the same "hung" process problem I first noticed on my Alphas running 
2.0.20: I have an anonymous FTP process that was started about 20 minutes 
ago that is stuck in the "R"unning state and simply will not die -- I've 
done a kill -9 on it at least five times and yet it's still just sitting 
there (and, I think, giggling at me :-) ).

I haven't had much time lately to keep up on every last bit of list 
traffic...is this a known 2.0.20 problem?

-- 
in...@optera.com             | We are Grey
http://www.optera.com/~invid | We stand between the Candle and the Star
                             | Between the Darkness and the Light

From: Jon Lewis <jle...@inorganic5.fdt.net>
Subject: Re: Processes hanging on 2.0.20; not just on Alphas anymore :(
Date: 1996/09/19
Message-ID: <Pine.LNX.3.95.960919222708.14881F-100000@inorganic5.fdt.net>#1/1
X-Deja-AN: 184098179
sender: owner-linux-ker...@vger.rutgers.edu
references: <Pine.LNX.3.91.960919191449.25491A-100000@jade.optera.com>
content-type: TEXT/PLAIN; charset=US-ASCII
x-hdr-sender: jle...@inorganic5.fdt.net
mime-version: 1.0
x-env-sender: owner-linux-kernel-outgo...@vger.rutgers.edu
newsgroups: linux.dev.kernel


On Thu, 19 Sep 1996, Joshua M. Thompson wrote:

> I just noticed one of our Pentium servers running 2.0.20 is experiencing 
> the same "hung" process problem I first noticed on my Alphas running 
> 2.0.20: I have an anonymous FTP process that was started about 20 minutes 
> ago that is stuck in the "R"unning state and simply will not die -- I've 
> done a kill -9 on it at least five times and yet it's still just sitting 

Odd.  I posted about Apache 1.1.1 doing something similar last night.  It
went into state R, started chewing large % of CPU, and refused to die.  I
ended up going into /proc/3261/fd and finding and rm'ing one of the
files that was open (presumably being served by the process before it went
crazy).  Immediately after the rm, the process died.  This was under
2.0.18.  
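
For anyone wanting to confirm what state such a process is really in, the state letter can be read from /proc/<pid>/stat. What follows is a minimal sketch and not part of the original report: the names proc_stat_state() and proc_state_of() are made up for illustration, and the code assumes the usual "pid (comm) state ..." layout of that file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return the one-letter state field from a /proc/<pid>/stat line, or '?'
 * if the line cannot be parsed. The command name sits in parentheses and
 * may itself contain spaces or ')', so look for the LAST ')' instead of
 * splitting on whitespace. */
char proc_stat_state(const char *line)
{
    const char *close;

    assert(line != NULL);
    close = strrchr(line, ')');
    if (close == NULL || close[1] != ' ' || close[2] == '\0')
        return '?';
    return close[2];
}

/* Read /proc/<pid>/stat and return the state letter, '?' on any failure. */
char proc_state_of(long pid)
{
    char path[64], line[512];
    FILE *fp;

    snprintf(path, sizeof path, "/proc/%ld/stat", pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return '?';
    if (fgets(line, sizeof line, fp) == NULL) {
        fclose(fp);
        return '?';
    }
    fclose(fp);
    return proc_stat_state(line);
}
```

Calling proc_state_of(3261) on the process above would be expected to return 'R' while it is spinning; a '?' only means the file could not be read or parsed.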

If this is a kernel problem, I only hope it and the apparent PPP related
bug posted about today can be resolved before the developers plunge into
2.1.x.  I realize new features are much more fun than debugging, but
somebody has to do it, or we end up with Linux supporting toasters, but
the system crashes before the toast is done.


------------------------------------------------------------------
 Jon Lewis <jle...@fdt.net>  |  Unsolicited commercial e-mail will
 Network Administrator       |  be proof-read for $199/hr.
________Finger jle...@inorganic5.fdt.net for PGP public key_______

From: Alan Cox <a...@cymru.net>
Subject: Re: Processes hanging on 2.0.20; not just on Alphas anymore :(
Date: 1996/09/20
Message-ID: <199609201040.LAA02144@snowcrash.cymru.net>#1/1
X-Deja-AN: 184458545
sender: owner-linux-ker...@vger.rutgers.edu
references: <Pine.LNX.3.95.960919222708.14881F-100000@inorganic5.fdt.net>
content-type: text/plain; charset=US-ASCII
x-hdr-sender: a...@cymru.net
mime-version: 1.0
x-env-sender: owner-linux-kernel-outgo...@vger.rutgers.edu
newsgroups: linux.dev.kernel


> Odd.  I posted about Apache 1.1.1 doing something similar last night.  It
> went into state R, started chewing large % of CPU, and refused to die.  I
> ended up going into /proc/3261/fd and finding and rm'ing one of the
> files that was open (presumably being served by the process before it went
> crazy).  Immediately after the rm, the process died.  This was under
> 2.0.18.  

Yep. I think this is another manifestation of the file locking stuff that
locks SMP in 2.0.20, but in a differently annoying guise. Linus posted some
changes to avoid this problem, so hopefully .21 will fix it.

Alan

From: Linus Torvalds <torva...@cs.helsinki.fi>
Subject: Re: Processes hanging on 2.0.20; not just on Alphas anymore :(
Date: 1996/09/20
Message-ID: <Pine.LNX.3.91.960920165249.516B-100000@linux.cs.Helsinki.FI>#1/1
X-Deja-AN: 184199324
sender: owner-linux-ker...@vger.rutgers.edu
references: <199609201040.LAA02144@snowcrash.cymru.net>
content-type: TEXT/PLAIN; charset=US-ASCII
x-hdr-sender: torva...@cs.helsinki.fi
mime-version: 1.0
x-env-sender: owner-linux-kernel-outgo...@vger.rutgers.edu
newsgroups: linux.dev.kernel




On Fri, 20 Sep 1996, Alan Cox wrote:
> 
> Yep. I think this is another manifestation of the file locking stuff that
> locks SMP in 2.0.20, but in a differently annoying guise. Linus posted some
> changes to avoid this problem so hopefully .21 will fix it

Yep, 2.0.21 is out, and now SMP should be stable again thanks to the file
locking fixes (thanks to everybody who was testing things out). This also
does the "make depend" stage in C (everybody's favourite gripe), and it's a
lot faster now. Whee. 

2.0.21 also fixes a deadlock with the LOOP devices thanks to Ray Van Tassle
who has been working on this for some time. 

The 3c509 driver also has a longer timeout that seems to be needed on some
setups, so if you had sporadic problems with detection of the 3c509 they
should hopefully be gone now. The wd7000 scsi driver is also updated, please
check it out.

		Linus "on to 2.1, I think" Torvalds

From: Linus Torvalds <torva...@cs.helsinki.fi>
Subject: test version of 2.1.0 available
Date: 1996/09/23
Message-ID: <Pine.LNX.3.91.960923164329.1132A-100000@linux.cs.Helsinki.FI>#1/1
X-Deja-AN: 185091781
sender: owner-linux-ker...@vger.rutgers.edu
references: <m2u3ssmxks.fsf@deanna.miranova.com>
content-type: TEXT/PLAIN; charset=US-ASCII
x-hdr-sender: torva...@cs.helsinki.fi
mime-version: 1.0
x-env-sender: owner-linux-kernel-outgo...@vger.rutgers.edu
newsgroups: linux.dev.kernel



Could people who are doing device drivers and other people that just want 
to live dangerously check out the new pre-patch-2.1.0.gz file that I just put 
out for ftp on

  ftp://ftp.cs.helsinki.fi/pub/Software/Linux/Kernel/testing/pre-patch-2.1.0.gz

(It _currently_ has an md5sum of 92eb12e2275dd75fb14f6538d1a4e03d and is
28265 bytes long, but I might be updating it). 

The new thing with the 2.1.x tree is that the kernel no longer uses the x86
segmentation to access user mode, but instead the kernel segment covers the
whole 4GB virtual address space, and thus accesses to user memory can be made
faster. 

NOTE! When you access user memory you still have to use the "get_user()"  etc
helper functions, exactly as before, because those can still do strange
things on other architectures. The new thing is just an x86 _implementation_
issue, not a change in the basic philosophy (you'll notice that most of the
changes are to x86-specific files). 
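
To illustrate why going through helpers keeps driver code independent of the address-space layout, here is a small user-space model. It is only a sketch under stated assumptions: model_get_user_byte(), model_put_user_byte() and model_copy_from_user() are hypothetical names, the "user space" is just an array, and none of this is the kernel's actual get_user() implementation.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_USER_SIZE 4096UL  /* size of the pretend user address space */

static unsigned char model_user_mem[MODEL_USER_SIZE];

/* Hypothetical stand-in for an architecture-specific accessor.
 * Stores the byte at uaddr in *out and returns 0, or returns -1 if
 * uaddr lies outside the pretend user space. */
int model_get_user_byte(unsigned long uaddr, unsigned char *out)
{
    assert(out != NULL);
    if (uaddr >= MODEL_USER_SIZE)
        return -1;
    *out = model_user_mem[uaddr];
    return 0;
}

/* Counterpart for writes; same return convention. */
int model_put_user_byte(unsigned long uaddr, unsigned char val)
{
    if (uaddr >= MODEL_USER_SIZE)
        return -1;
    model_user_mem[uaddr] = val;
    return 0;
}

/* "Driver" code: it only ever touches user memory through the helper,
 * so it does not care how the helper reaches that memory. */
int model_copy_from_user(unsigned char *dst, unsigned long uaddr, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (model_get_user_byte(uaddr + i, &dst[i]) != 0)
            return -1;
    return 0;
}
```

The point of the model is that model_copy_from_user() would not need to change if the two accessors were reimplemented for a different memory layout, which is the property the real helpers are meant to preserve across architectures.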

The thing about the new memory management setup is that it gets rid of a lot
of extra fluff that isn't needed, and makes things potentially faster. 
However, there is one downside to all this, and that is the fact that the
kernel address space is no longer identical to the hardware address space. 

For example, to access hardware address 0xA0000, you can no longer just do
something like this: 

	value = *(unsigned char *) 0xA0000;

but instead you have to use something like this:

	value = readb(0xA0000);

Note that I called this a "downside", but in actual fact it's a good thing.
The reason that it's a good thing is that the drivers should have been using
readb() already - that's the portable interface, and that's how you have to
access PCI shared memory on alpha and PPC machines. In fact, most drivers
have already been converted to use this interface, but there are certainly
drivers out there that will break horribly with my pre-patch. That's why I'd
like driver authors (and technical people who want to test) to try out the
pre-patch. 
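
As a rough picture of what the accessor style buys, the sketch below models bus memory as an array and routes every access through small functions. The names model_readb(), model_writeb() and model_memcpy_fromio() are hypothetical user-space stand-ins, not the kernel's readb()/writeb()/memcpy_fromio(), and the 1 MB window is an arbitrary choice for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_BUS_SIZE 0x100000UL  /* model the first 1 MB of bus addresses */

static uint8_t model_bus_mem[MODEL_BUS_SIZE];

/* Read one byte at a bus address. The caller never forms a pointer to
 * device memory itself; the accessor does the address translation. */
uint8_t model_readb(unsigned long bus_addr)
{
    assert(bus_addr < MODEL_BUS_SIZE);
    return model_bus_mem[bus_addr];
}

/* Write one byte at a bus address (value first, address second). */
void model_writeb(uint8_t val, unsigned long bus_addr)
{
    assert(bus_addr < MODEL_BUS_SIZE);
    model_bus_mem[bus_addr] = val;
}

/* Copy n bytes out of modelled device memory into an ordinary buffer. */
void model_memcpy_fromio(void *dst, unsigned long bus_addr, size_t n)
{
    uint8_t *out = dst;
    size_t i;

    for (i = 0; i < n; i++)
        out[i] = model_readb(bus_addr + i);
}
```

A driver written this way would call model_readb(0xA0000) where it previously dereferenced a pointer, and only the accessor bodies depend on how bus addresses map into the kernel's address space.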

Drivers that should work already (because they work on alphas - the ones
marked with (*) have actually been tested by me on my PC with the pre-patch): 

 - VGA console (*)
 - IDE harddisk and CD-ROM (*)
 - normal serial lines (*)
 - keyboard (*)
 - 3c509 network card (*)
 - floppy driver (*)
 - PS/2 mouse
 - de4x5 and ne2000 network cards
 - original ncr scsi drivers 
 - BSD ncr SCSI driver (at least with IO-mapped accesses)
 - Aha1740, BusLogic and QLogic ISP SCSI drivers

Other drivers are in the "may well work" category, but I cannot give any
guarantees. If some driver doesn't work, you're likely to find out sooner
rather than later (they generally break totally - it's not going to be
subtle). 

I'd love to get reports on these patches, and would be even happier if you
can also send a patch that actually fixes any broken behaviour. Things 
that can break are:
 - accessing device memory through a pointer
   FIX: use readb/writeb/memcpy_fromio/memcpy_toio instead of trying to 
        dereference the pointer directly.
 - using virtual addresses for controller-initiated DMA
   FIX: use the "virt_to_bus()" function to change a virtual address into 
        a "bus" address to give to controllers that do DMA into memory.
 - memory re-mapping may be broken right now, so anything that needs
   "vremap()" probably doesn't work. If you have a driver that breaks due 
   to this, please test changing the line (line 187 in mm/vmalloc.c)

		set_pte(pte, mk_pte(offset, PAGE_KERNEL));

   to do a

		set_pte(pte, mk_pte(offset+PAGE_OFFSET, PAGE_KERNEL));

   and see if that helps?
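
The DMA item in the list above can be pictured with a small model. This is a sketch under explicit assumptions: the kernel is taken to map physical memory linearly at a fixed base (0xC0000000 is used here purely as an illustrative value), bus addresses are taken to equal physical addresses, and model_virt_to_bus(), model_bus_to_virt() and model_dma_setup() are hypothetical names rather than the real virt_to_bus().

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_OFFSET 0xC0000000u  /* assumed kernel base, illustrative only */

/* Kernel virtual address -> address a DMA controller would be given. */
uint32_t model_virt_to_bus(uint32_t kvirt)
{
    assert(kvirt >= MODEL_PAGE_OFFSET);
    return kvirt - MODEL_PAGE_OFFSET;
}

/* Inverse mapping, valid only while the sum still fits in 32 bits. */
uint32_t model_bus_to_virt(uint32_t bus)
{
    assert(bus < (uint32_t)(0u - MODEL_PAGE_OFFSET));
    return bus + MODEL_PAGE_OFFSET;
}

/* Hypothetical descriptor handed to a DMA-capable controller. */
struct model_dma_desc {
    uint32_t bus_addr;  /* what the controller sees */
    uint32_t len;       /* transfer length in bytes */
};

/* Fill in a descriptor from a kernel-virtual buffer address. Passing
 * kvirt_buf through unconverted is the bug the list item describes. */
struct model_dma_desc model_dma_setup(uint32_t kvirt_buf, uint32_t len)
{
    struct model_dma_desc d;

    d.bus_addr = model_virt_to_bus(kvirt_buf);
    d.len = len;
    return d;
}
```

Under these assumptions a buffer at kernel virtual address 0xC00A0000 would be programmed into the controller as 0x000A0000; on machines where bus and physical addresses differ, the conversion would not be a plain subtraction.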

Comments? Success stories? Hate mail? Money?

		Linus

From: Linus Torvalds <torva...@cs.helsinki.fi>
Subject: Re: test version of 2.1.0 available
Date: 1996/09/24
Message-ID: <Pine.LNX.3.91.960924151341.966B-100000@linux.cs.Helsinki.FI>#1/1
X-Deja-AN: 185172373
sender: owner-linux-ker...@vger.rutgers.edu
references: <Pine.LNX.3.91.960924133256.2438A-100000@elserv.ffm.fgan.de>
content-type: TEXT/PLAIN; charset=US-ASCII
x-hdr-sender: torva...@cs.helsinki.fi
mime-version: 1.0
x-env-sender: owner-linux-kernel-outgo...@vger.rutgers.edu
newsgroups: linux.dev.kernel



Could people that had problems with the first pre-patch-2.1.0 try out the
second one (called "pre-patch-2.1.0-#2", and available in the same place as
the original one)? 

This one fixes a problem with the memory management setup on non-Pentium
machines that didn't have the 4MB page tables (I only tested it on a
Pentium). I think it should work correctly now. 

This also fixes the vremap() call, so hopefully the BSD ncr driver works.

Oh, and in case you didn't notice before: the 2.1.x kernels will only compile
with an ELF compiler: the a.out setup is completely disabled now. So you do
have to upgrade your compiler if you haven't already. 

(And people: I was _joking_ about the money. Just send me reports..)

		Linus

From: Alan Cox <a...@cymru.net>
Subject: Re: 2.0.22 will be the last version
Date: 1996/09/27
Message-ID: <199609270921.KAA27286@snowcrash.cymru.net>#1/1
X-Deja-AN: 185730660
sender: owner-linux-ker...@vger.rutgers.edu
references: <324A25BE.8F9@student.anu.edu.au>
content-type: text/plain; charset=US-ASCII
x-hdr-sender: a...@cymru.net
mime-version: 1.0
x-env-sender: owner-linux-kernel-outgo...@vger.rutgers.edu
newsgroups: linux.dev.kernel


> Everything on 2.0.21 seems stable so far.

Hohum.. I'm still accumulating 2.0.x patches, and I expect some stuff like
the SYN flood protection things will change over time as people get a better
and better grip on the maths behind the problem.

Alan

From: Lindsay Haisley <fmo...@fmp.com>
Subject: Where are we on 2.0.x???
Date: 1996/09/26
Message-ID: <2.2.32.19960926162942.009502f4@linux.fmp.com>#1/1
X-Deja-AN: 185625188
sender: owner-linux-ker...@vger.rutgers.edu
x-sender: fmo...@linux.fmp.com
content-type: text/plain; charset="us-ascii"
x-hdr-sender: fmo...@fmp.com
mime-version: 1.0
x-env-sender: owner-linux-kernel-outgo...@vger.rutgers.edu
newsgroups: linux.dev.kernel


I run a small commercial online system and would like to stabilize my kernel
upgrades with 2.0.21 (or whatever 2.0.x kernel the kernel deities say is the
finished product) but I've seen reports on this list of unresolved problems
with regard to:

*  MTU Discovery

*  PPP compression

*  Virtual apache servers

*  Recompilation of the kernel source

What's the status of these and other potential rough spots on the 2.0.21
kernel, and has 2.0.x improvement "officially" stopped with this kernel
version?  Might I be better off using a kernel from early in the 2.0.x
series, or possibly one of the later 1.3.x kernels which I've heard are
highly reliable and very stable?


Lindsay Haisley                   (______)
FMP Computer Services               (oo)        "The bull 
fmo...@fmp.com                /------\/            stops here!"
Austin, Texas, USA           / |    ||  
512-259-1190                *  ||---||             * * * * * *
                               ~~   ~~         http://www.fmp.com

From: Alan Cox <a...@cymru.net>
Subject: Re: Where are we on 2.0.x???
Date: 1996/09/27
Message-ID: <199609270932.KAA27679@snowcrash.cymru.net>#1/1
X-Deja-AN: 185713432
sender: owner-linux-ker...@vger.rutgers.edu
references: <2.2.32.19960926162942.009502f4@linux.fmp.com>
content-type: text/plain; charset=US-ASCII
x-hdr-sender: a...@cymru.net
mime-version: 1.0
x-env-sender: owner-linux-kernel-outgo...@vger.rutgers.edu
newsgroups: linux.dev.kernel


> upgrades with 2.0.21 (or whatever 2.0.x kernel the kernel deities say is the
> finished product) but I've seen reports on this list of unresolved problems
> with regard to:
> 
> *  MTU Discovery
I have no outstanding bug reports of MTU discovery problems that were not
faulty kit or configuration errors.

> *  PPP compression

I've seen a few of these reports too. It's in my unresolved collection.

> *  Virtual apache servers
News to me.

> *  Recompilation of the kernel source
Only cases I know of are stupid ones like saying yes to netware file
support and no to IPX, or yes to sound and no drivers.

BTW: networking layer changes are tracked on
http://www.uk.linux.org/NetNews.html