Path: sparky!uunet!ferkel.ucsb.edu!taco!rock!concert!rutgers!
cs.utexas.edu!zaphod.mps.ohio-state.edu!wupost!tulane!rouge!rouge!dmb
From: d...@srl02.cacs.usl.edu (David M. Brumley)
Newsgroups: alt.hackers
Subject: "Unforking" an SLC.
Message-ID: <DMB.92Oct12215634@srl02.cacs.usl.edu>
Date: 13 Oct 92 03:56:34 GMT
Sender: a...@usl.edu (Anonymous NNTP Posting)
Organization: University of Louisiana
Lines: 38
Approved: First post.

On Sun SLCs, under SunOS 4.1.2, if the kernel is configured to limit
the number of simultaneous processes per user, the program:

	main() {for(;;fork());}

will quickly reach the per-user process limit, after which all
attempts to execute commands from csh, except built-ins, will fail
with a "Vfork error."  Apparently, using the csh "kill" built-in to
kill the forking processes does not work.

Here's one way we've found to kill off the forking processes.  It
assumes that you are able to rsh to the forked machine, and that
your login shell is csh (blech).  

First, we want to take the original fork program (called "rabbit") and
make a copy (called "dog").  But csh on the forked machine can no
longer vfork() commands, so we use the csh "exec" built-in:

$ rsh <host> exec /bin/cp rabbit dog

Then, we run the dog program:

$ rsh <host> exec dog

Now, with the single dog process running in the remote shell (on a
leash, so to speak), we use another rsh to find the process group ID
of the rabbit processes:

$ rsh <host> exec /bin/ps -jx

Knowing the process group ID, we start killing rabbits:

$ rsh <host> exec /bin/kill -9 -<PGID>

After running the kill a few times, all of the original rabbit
processes die off and are replaced by dog processes.  Finally, we send
a quit signal <CTRL/\> to the original dog rsh, all of the dogs die,
and everything is back to normal.
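
For anyone who would rather do the last step from C than from /bin/kill,
here is a minimal sketch of the same process-group kill, assuming a POSIX
system.  The helper names spawn_group() and kill_group() are made up for
illustration and are not part of the method above, and the stand-in
"rabbits" only sleep in pause(), so the example is safe to run.  A
negative first argument to kill(2) addresses a whole process group, which
is what "kill -9 -<PGID>" relies on.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: SIGKILL every process in group `pgid`, then reap
 * the ones that are our own children.  Returns how many children were
 * reaped after dying from SIGKILL, or -1 on error.
 */
int kill_group(pid_t pgid)
{
    int status, killed = 0;

    /* Refuse 0 and 1: kill(0, ...) and kill(-1, ...) mean something else. */
    if (pgid <= 1 || kill(-pgid, SIGKILL) < 0)
        return -1;
    /* waitpid() with a negative pid waits for children in that group. */
    while (waitpid(-pgid, &status, 0) > 0)
        if (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL)
            killed++;
    return killed;
}

/*
 * Hypothetical helper: start `n` idle children that all share one new
 * process group.  Returns the group ID, or -1 on error.
 */
pid_t spawn_group(int n)
{
    pid_t pgid = 0;

    if (n < 1)
        return -1;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();

        if (pid < 0) {
            kill_group(pgid);       /* clean up whatever was started */
            return -1;
        }
        if (pid == 0) {
            setpgid(0, pgid);       /* pgid 0: become leader of a new group */
            for (;;)
                pause();            /* idle until signalled */
        }
        if (pgid == 0)
            pgid = pid;             /* the first child leads the group */
        setpgid(pid, pgid);         /* set it from the parent too, avoiding a race */
    }
    return pgid;
}
```

Calling kill_group(spawn_group(3)) starts three idle processes and kills
them with one signal.  The waitpid() loop only reaps the caller's own
children; against a group started by some other process, only the kill()
call applies.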

Path: sparky!uunet!olivea!spool.mu.edu!sol.ctr.columbia.edu!destroyer!
cs.ubc.ca!newsserver.sfu.ca!howesb
From: how...@monashee.sfu.ca (Charles Howes)
Newsgroups: alt.hackers
Subject: Re: "Unforking" an SLC.
Message-ID: <1992Nov1.075320.1418@sfu.ca>
Date: 1 Nov 92 07:53:20 GMT
References: <DMB.92Oct12215634@srl02.cacs.usl.edu>
Sender: n...@sfu.ca
Organization: Simon Fraser University, Burnaby, B.C., Canada
Lines: 24
Approved: God

In article <DMB.92Oct12215...@srl02.cacs.usl.edu> d...@srl02.cacs.usl.edu
(David M. Brumley) writes:

>main() {for(;;fork());}
>Here's one way we've found to kill off the forking processes.  It
>assumes that you are able to rsh to the forked machine, and that
>your login shell is csh (blech).

[method deleted]

The method I've found to be effective is 'kill -15 -1' or, failing that,
'kill -9 -1'.  It kills off all processes owned by you.

If the 'rabbit' program is a cron job, however, you have to act quickly.

NB: The process number -1 behaves differently on different machines.
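
The "-15 first, -9 if that fails" order can be sketched in C as well.
This is only an illustration, assuming a POSIX system: the names
spawn_idle() and terminate_child() are hypothetical, and the sketch
deliberately signals a single child rather than pid -1, since -1 would
hit every process you own (and, as noted, behaves differently on
different machines).

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork an idle child.  If `ignore_term` is nonzero
 * the child ignores SIGTERM.  The disposition is set before fork() so the
 * child inherits it with no race, then restored in the parent.
 */
pid_t spawn_idle(int ignore_term)
{
    void (*old)(int) = signal(SIGTERM, ignore_term ? SIG_IGN : SIG_DFL);
    pid_t pid = fork();

    if (pid == 0)
        for (;;)
            pause();                /* child: idle until killed */
    signal(SIGTERM, old);
    return pid;
}

/*
 * Hypothetical helper: send SIGTERM to child `pid`, give it about
 * `grace_ms` milliseconds to die, then fall back to SIGKILL.  Returns the
 * signal that ended the child, 0 if it exited normally, or -1 on error.
 */
int terminate_child(pid_t pid, int grace_ms)
{
    struct timespec tick = { 0, 10 * 1000 * 1000 };   /* 10 ms */
    int status;

    if (kill(pid, SIGTERM) < 0)
        return -1;
    for (int waited = 0; waited < grace_ms; waited += 10) {
        pid_t r = waitpid(pid, &status, WNOHANG);

        if (r == pid)
            return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
        if (r < 0)
            return -1;
        nanosleep(&tick, NULL);     /* still running: wait a bit longer */
    }
    /* The polite request was ignored; SIGKILL cannot be caught or ignored. */
    if (kill(pid, SIGKILL) < 0 || waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

A child with the default SIGTERM action is reported as killed by SIGTERM;
one that ignores SIGTERM survives the grace period and is reported as
killed by SIGKILL.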

Obhack: Using 'ping' to get a list of all hosts on the same subnet.

	ping -c2 224.0.0.1

Unfortunately, only SGI machines have implemented this.  The number appears
in the latest RFC specifying well-known numbers.

Path: sparky!uunet!ferkel.ucsb.edu!taco!hsdndev!yale!yale.edu!spool.mu.edu!
agate!doc.ic.ac.uk!uknet!pavo.csi.cam.ac.uk!pc123
From: pc...@cl.cam.ac.uk (Pete Chown)
Newsgroups: alt.hackers
Subject: Re: "Unforking" an SLC.
Summary: Listing hosts on a subnet
Message-ID: <1992Nov1.191132.4138@infodev.cam.ac.uk>
Date: 1 Nov 92 19:11:32 GMT
References: <DMB.92Oct12215634@srl02.cacs.usl.edu> <1992Nov1.075320.1418@sfu.ca>
Sender: n...@infodev.cam.ac.uk (USENET news)
Reply-To: PC...@phx.cam.ac.uk (Pete Chown)
Organization: U of Cambridge Comp Lab, UK
Lines: 31
Approved: foobar
Nntp-Posting-Host: barton.cl.cam.ac.uk

In article <1992Nov1.075320.1...@sfu.ca> how...@monashee.sfu.ca (Charles Howes) 
writes:

>Obhack: Using 'ping' to get a list of all hosts on the same subnet.
>
>	ping -c2 224.0.0.1
>
>Unfortunately, only SGI machines have implemented this.  The number appears
>in the latest RFC specifying well-known numbers.

Err... try:

$ host -l cl.cam.ac.uk

(you get rather a lot of output)

Obhack: Linux always used to give you individual characters even
though the terminal was in ICANON mode (supposed to be line buffering)
provided you were doing non-blocking I/O.  Correcting this was quite a
big hack - I knew nothing about the Linux tty driver, so just chopped
out the code that ICANON mode normally used to see if there was a
complete line.  I jammed it into the I/O routine, returning EAGAIN if
it wasn't complete.  It didn't work, so I removed a bit of it and then
it did.  I still don't know what any of it does!
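
The rule the fix enforces can be shown with a toy model in C.  This is
not the Linux tty code, just a sketch of the behaviour described above:
a non-blocking read in canonical mode returns EAGAIN until a whole line
is queued.  struct line_queue, queue_put() and canon_read_nonblock() are
invented names, and only newline is treated as a line terminator here.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical input queue; not the kernel's tty structure. */
struct line_queue {
    char   buf[256];
    size_t len;
};

/* Append raw input to the queue.  Returns 0, or -1 if it does not fit. */
int queue_put(struct line_queue *q, const char *s)
{
    size_t n = strlen(s);

    if (n > sizeof q->buf - q->len)
        return -1;
    memcpy(q->buf + q->len, s, n);
    q->len += n;
    return 0;
}

/*
 * Non-blocking canonical read: hand over at most one complete line
 * (newline included).  With no complete line queued, fail with EAGAIN
 * instead of returning the individual characters typed so far.
 */
ssize_t canon_read_nonblock(struct line_queue *q, char *out, size_t outlen)
{
    char *nl = memchr(q->buf, '\n', q->len);
    size_t line, n;

    if (nl == NULL) {
        errno = EAGAIN;             /* partial line: nothing to read yet */
        return -1;
    }
    line = (size_t)(nl - q->buf) + 1;
    n = line < outlen ? line : outlen;
    memcpy(out, q->buf, n);
    memmove(q->buf, q->buf + n, q->len - n);   /* keep whatever follows */
    q->len -= n;
    return (ssize_t)n;
}
```

Queueing "ab" and reading gives EAGAIN; queueing "c\nd" afterwards makes
the next read return the four bytes "abc\n" and leaves "d" waiting for
its own newline.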

The amazing thing was that I then submitted this as a patch to the
kernel, and it was accepted as though I knew exactly what I was
doing...
--
---------------------------------------------+ "A tight hat can be stretched.
Pete Chown, pc...@phx.cam.ac.uk (Internet)   |  First damp the head with steam
            pc...@uk.ac.cam.phx (Janet :-)  -+  from a boiling kettle."

Path: sparky!uunet!mcsun!news.funet.fi!hydra!klaava!torvalds
From: torva...@klaava.Helsinki.FI (Linus Torvalds)
Newsgroups: alt.hackers
Subject: Re: "Unforking" an SLC.
Message-ID: <1992Nov2.133732.24009@klaava.Helsinki.FI>
Date: 2 Nov 92 13:37:32 GMT
References: <DMB.92Oct12215634@srl02.cacs.usl.edu> <1992Nov1.075320.1418@sfu.ca> 
<1992Nov1.191132.4138@infodev.cam.ac.uk>
Organization: University of Helsinki
Lines: 49
Approved: cat-lovers everywhere

In article <1992Nov1.191132.4...@infodev.cam.ac.uk> PC...@phx.cam.ac.uk 
(Pete Chown) writes:
>
>Obhack: Linux always used to give you individual characters even
>though the terminal was in ICANON mode (supposed to be line buffering)
>provided you were doing non-blocking I/O.  Correcting this was quite a
>big hack - I knew nothing about the Linux tty driver, so just chopped
>out the code that ICANON mode normally used to see if there was a
>complete line.  I jammed it into the I/O routine, returning EAGAIN if
>it wasn't complete.  It didn't work, so I removed a bit of it and then
>it did.  I still don't know what any of it does!
>
>The amazing thing was that I then submitted this as a patch to the
>kernel, and it was accepted as though I knew exactly what I was
>doing...

Hmm..  Now he tells me. 

Actually, I cleaned it up a bit afterwards, not so much because there
was anything wrong with it, but simply because I also re-arranged the
sleeping code in "wait_for_canon()", and made it all a bit more modular
(no more duplicated code snippets).  Kernel programming isn't that
difficult: it just takes some extra care.  But it's a better obhack than
the one I have..

ObHack: the very same tty driver had this unfortunate bug that resulted
in a bad wait-queue when either the input or output area in user level
was swapped out (I don't think this bug exists in any official kernel,
but I had been fooling around with the sleeping code in general (see
above), and some of my changes were "less than perfect").  So I got this
kernel debugging info ("unable to handle kernel paging request..") while
running X and generally swapping a bit too much (recompiling the
kernel).  Argghh. 

Anyway, the kernel didn't crash - it just killed the uemacs session I
was working in, but it wasn't exactly encouraging.  Never fear: the new
kernels have a "/dev/cmem" device (by Eric Youngdale? - anyway it was a
fun and easy addition) which I hadn't really used very much before, so I
decided I might as well test it now. It fools gdb into reading kernel
memory as if it was a core dump, so I had to recompile the kernel with
debugging info and see what went wrong.

As it turned out, /dev/cmem was fun, but I could have gotten the same
info just by looking at the compiled binary.  Also, gdb-4.5 or gcc-2.2.2
seems to have some problems with the debugging info, as the disassembly
showed the wrong function names.  But I did find the bug, and linux (and
X11) lived through it all, so I was able to recompile and reboot within
minutes. 

		Linus