Signal handler compatibility?

Subject: Signal handler compatibility?
Date: Tue, 10 Dec 91 02:47:13 -0500
From: tytso@ATHENA.MIT.EDU (Theodore Ts'o)
To: Linux-activists@joker.cs.hut.fi
Reply-To: tytso@athena.mit.edu

Well, it looks like I have job control pretty much working with a
version of bash that understands job control.
One or two more days while I do more testing and packaging of diffs, and
I'll be sending the kernel changes to Linus, and making them available
on TSX-11 so that more adventurous souls who want job control sooner
rather than later can grab it and try it out.

One of the problems which I've run into during my testing is that when a
read gets interrupted with a ^Z and the job is later restarted, the system
call that was in progress returns -EINTR, and the program is supposed to
understand the EINTR and restart the system call.  Well, the problem is
that most programs *don't* bother to understand EINTR, and so it is
highly desirable to have restartable system calls a la BSD 4.3.
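
To illustrate what "understanding EINTR" asks of every program, here is a
minimal sketch of the retry loop a caller has to write when system calls
are not restarted automatically.  The helper name read_retry() is
hypothetical and does not come from the patches; it assumes the usual libc
convention in which the kernel's -EINTR reaches user space as a return
value of -1 with errno set to EINTR.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper (not part of the job control patches): call read()
 * again whenever it fails only because a signal interrupted it. */
ssize_t read_retry(int fd, void *buf, size_t count)
{
    ssize_t n;

    do {
        n = read(fd, buf, count);
    } while (n == -1 && errno == EINTR);   /* interrupted: just try again */

    return n;   /* bytes read, 0 at end of file, or -1 on a real error */
}
```

A program would call read_retry(fd, buf, sizeof buf) wherever it now calls
read().  With BSD-style restartable system calls the kernel does the
equivalent of this loop itself, so unmodified programs keep working.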

Unfortunately, in order to do that, it will be necessary to change the
signal trampoline code so that when the signal handler exits, a sigreturn
system call is executed.  This in fact will result in cleaner and clearer code
in the kernel, but since the signal trampoline code is part of libc.a,
it would mean that code which used signal handlers would need to be
recompiled.  In other words, a kernel which used the new way of doing
signal handling would not be binary compatible with binaries that used
signal handlers.

Now, I can think of ways that would probably work in making the
resulting kernel be backwards compatible, but they all involve really
gross hacks that would be really painful, and I'm not sure that it would
be a good thing to be introducing ugly hacks to maintain backwards
compatibility when Linux is still in beta test.

How do people feel about this one?  How much code which has been ported
actually uses signal handlers?  Would people be willing to recompile it
with a new libc.a?  Linus, would you be willing to take a change in the
signal handling code that might not be backwards compatible?

							- Ted

Subject: Looking for alpha testers for job control changes
Date: Thu, 12 Dec 91 13:30:42 -0500
From: tytso@ATHENA.MIT.EDU (Theodore Ts'o)
To: linux-activists@joker.cs.hut.fi
Reply-To: tytso@athena.mit.edu

I'm looking for a few people who are willing to take a look at my
changes to add job control to Linux and test them out.  The ideal alpha
tester is someone who is willing to try out these changes and see whether
1) they can find some way of using job control which breaks my
implementation, 2) the changes conflict with changes they are planning to
make to the kernel (mostly changes in exit.c and signal.c, with additional
changes in tty_io.c, tty_ioctl.c, sys.c, and open.c), and/or 3) the
changes reasonably correspond with other implementations of job control,
particularly BSD 4.3 and Sys V.

I make no guarantees about the correctness of these patches, and any
future patches which I send out will be relative to Linux 0.11, NOT
to these alpha patches.  The main reason why I want people to try them
out is that I've implemented job control almost entirely from the POSIX
spec, and I would like other people to check to see if my interpretation of
the POSIX spec is compatible with the rest of the world.

If you feel up to trying them out, please let me know.  If you want to
look at the changes before deciding, they can be found in
TSX-11:~ftp/ALPHA/jobcontrol.  One generic bug which you will hit when
you try this out is that apparently there is a bug with how gcc handles
signals.  If you send gcc a SIGCONT (you don't even need to stop it
first), it will die with an IOT trap.  I suspect gcc needs to be
recompiled with a recent libc.a to fix this problem.  

						- Ted

Subject: Final version of the job control patches....
Date: Thu, 2 Jan 92 19:00:19 -0500
From: tytso@ATHENA.MIT.EDU (Theodore Ts'o)
To: Linux-activists@joker.cs.hut.fi
Reply-To: tytso@athena.mit.edu

The final copy of my job control patches to Linux 0.11 has been sent to
Linus; if you want an early peek at them, look in
tsx-11.mit.edu:/ALPHA/jobcontrol.  The following is the NOTES file from
that directory:

New Features added by these patches

	* Job control (setpgrp(), tcsetpgrp(), signals, wait(), etc.)
	* gethostname(), sethostname()
		- try using the included hostname program!
	* getrusage()
		- not completely implemented, but the skeleton
			is in place.
	* getrlimit(), setrlimit()  
		- try using the ulimit command in bash!
		- nothing looks at the limits yet, though
	* gettimeofday(), settimeofday()
		- for that microsecond accuracy some people demand :-)
			(well, 0.01 second accuracy, actually....)
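
As a quick way to exercise the calls listed above, the following is a
minimal sketch that invokes each of them once.  It assumes the present-day
POSIX prototypes and headers, which may differ from the Linux 0.11 libc;
the function name show_new_calls() and the choice of RLIMIT_NOFILE are
illustrative assumptions, not part of the patches.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Hypothetical demo function: calls gethostname(), getrlimit(),
 * getrusage() and gettimeofday() once each and prints what they report.
 * Returns 0 if every call succeeded, -1 otherwise. */
int show_new_calls(void)
{
    char host[256];
    struct rlimit rl;
    struct rusage ru;
    struct timeval tv;

    if (gethostname(host, sizeof host) != 0)
        return -1;
    host[sizeof host - 1] = '\0';   /* the name may be truncated unterminated */

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    if (gettimeofday(&tv, NULL) != 0)
        return -1;

    printf("hostname:        %s\n", host);
    printf("open file limit: %llu\n", (unsigned long long)rl.rlim_cur);
    printf("user CPU time:   %ld.%06ld s\n",
           (long)ru.ru_utime.tv_sec, (long)ru.ru_utime.tv_usec);
    printf("time of day:     %ld.%06ld\n",
           (long)tv.tv_sec, (long)tv.tv_usec);
    return 0;
}
```

The microsecond field of gettimeofday() is only as fine as the clock tick
behind it, which is the point of the "0.01 second accuracy" remark above.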

Notes about patches:

1) I integrated in John Kohl's patches to tty_io.c, since I had also
	made changes, and at least one of his patches depended on
	my patch being made first.  I have included the other patches
	which he sent to me as separate patches in separate files,
	in case you did not get a copy from him.

2)  I changed the name of system_call.s to sys_call.s, since system_call.s
	is too long for RCS and !@#$! 14 character filenames.  You
	should either also make this change (and modify the Makefile
	appropriately) or edit the patch file to change the filename
	back to system_call.s

3)  Currently, MAXHOSTNAMELEN is set at 8 characters.  I think it
	should be changed to 64 characters (change in sys/param.h),
	to make it like BSD.  Unfortunately, doing this will require 
	recompiling libc.a and the shell, since it uses the uname() 
	system call.  Not a big deal, but it means that when you boot
	with the new kernel, you need to make sure you have the right
	version of /bin/sh installed.

4)  When we add time zone support into the library, we will need to
	do something about how Linux sets the time from the CMOS
	registers.   The problem is that the CMOS clock ticks localtime,
	and POSIX specifies that we should be ticking GMT time. 
	Currently, we're not, so it's not a problem; but once we
	start including TZ routines in the libc, we will need to deal
	with it.  My attempt to solve the problem can be found in
	sys.c: adjust_time().  I don't like the solution all that much,
	but it's the best I could think of at the time.
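
The arithmetic behind the CMOS problem can be sketched as follows.  This
is NOT the adjust_time() routine from sys.c, whose contents are not shown
here; it is a hypothetical helper that assumes the offset is supplied as
minutes west of Greenwich, the convention used by the BSD struct timezone.

```c
#include <time.h>

/* Hypothetical helper (not the adjust_time() from the patches): take a
 * seconds count that was computed from a CMOS clock running on local
 * time and shift it to GMT.  minutes_west is minutes west of Greenwich,
 * so US Eastern Standard Time is 300 and Central European Time is -60. */
time_t cmos_local_to_gmt(time_t local_secs, int minutes_west)
{
    return local_secs + (time_t)minutes_west * 60;
}
```

For a machine on US Eastern Standard Time, cmos_local_to_gmt(t, 300) adds
five hours to t.  The awkward part, which this sketch leaves out, is how
the kernel learns the offset at boot and what happens when it changes for
daylight saving time.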

Things to think about:

A)  Perhaps the task_struct should go in malloc()'ed memory.  This
	will free close to 1k for the kernel stack.  We may not need
	to do it now, but as more things get added to the task_struct
	(like BSD-style group lists, etc.), and as the kernel gets more
	complicated (for example, when networking code is added),
	I suspect we will need to do it sooner or later.


B)  Speaking of which, I haven't looked at the new VM stuff that does
	paging.  Does the malloc() routine still work, or will it 
	need to be modified to coexist with paging?

C)  Another good idea would be to unify the buffer management and
	the free pool memory management, a la SunOS and the latest
	Sys VR4 design.  This allows free memory pages to be used as 
	buffers and vice versa.

D)  Currently, gid_t is an unsigned char.  I would strongly suggest that
	we change it to be an unsigned short, and represent it internally
	in the kernel as an unsigned int.  This will make life much
	easier if (ha, ha) we ever decide to add the Andrew Filesystem
	(AFS) to Linux.  Since this affects the stat structure, this
	will probably require recompiling large numbers of programs.

E)  If a Linux machine stays up for greater than 248 days, the number of
 	clock ticks will become > 2**31, and jiffies will overflow.  I 
	suspect something really messy will happen then.  We can make
	things better by making jiffies (and all places that refer to
	offsets off jiffies) unsigned longs instead of signed longs,
	but even so, it is still conceivable for a machine to be
	up for greater than 500 days.

	This isn't a problem now, since Linux is in beta test, but
	saying that Unix systems are too unstable to worry about what
	happens when they've been up for more than 500 days seems
	a bit wrong to me.
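
The figures above can be checked with a little arithmetic, assuming the
100 ticks per second implied by the 0.01 second accuracy mentioned
earlier.  The sketch below also shows one common way to compare tick
counts so that a wrap does not matter; both function names are
hypothetical, and nothing here is taken from the patches.

```c
#include <stdint.h>

/* Whole days until a tick counter with `bits` usable bits wraps when it
 * advances `hz` times per second.  Valid for 1 <= bits <= 63, hz > 0. */
uint64_t days_until_wrap(unsigned bits, unsigned hz)
{
    uint64_t ticks = (uint64_t)1 << bits;

    return ticks / hz / 86400u;   /* 86400 seconds per day */
}

/* Wraparound-safe test for "tick a is later than tick b".  The unsigned
 * subtraction is taken modulo 2^32 and the difference is then read as a
 * signed value, so the answer stays right across a wrap as long as the
 * two ticks are less than 2^31 apart.  The conversion to int32_t is
 * implementation-defined in C, but behaves as two's complement on the
 * usual compilers. */
int tick_after(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) > 0;
}
```

days_until_wrap(31, 100) gives 248 and days_until_wrap(32, 100) gives 497,
matching the estimates above for signed and unsigned 32-bit counters.
Comparing ticks by their difference rather than directly does not stop the
counter from wrapping, but it keeps timeout checks working when it does.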

						- Ted