Path: gmdzi!unido!mcsun!sunic!uupsi!rpi!dali.cs.montana.edu!
uakari.primate.wisc.edu!sdd.hp.com!usc!cs.utexas.edu!
news-server.csri.toronto.edu!utgpu!utzoo!censor!geac!alias!dino!chk
From: chk%al...@csri.toronto.edu (C. Harald Koch)
Newsgroups: comp.sys.sgi,comp.mail.elm
Subject: bug in vfork semantics under IRIX 3.3.1
Message-ID: <1990Nov29.035827.1302@alias.uucp>
Date: 29 Nov 90 03:52:29 GMT
Sender: n...@alias.uucp (USENET News)
Reply-To: chk%al...@csri.toronto.edu (C. Harald Koch)
Organization: Alias Research, Inc. Toronto ON Canada
Lines: 41
Posted: Thu Nov 29 04:52:29 1990

I was just applying the latest patches to ELM, after upgrading to 3.3.1.
Suddenly elm was no longer able to read my mailbox! After long and detailed
debugging, I eventually found the problem:

ELM runs set group-id mail so that it can create lock files. This is a
potential security hole, so ELM uses subprocesses to verify certain file
access permissions using your real gid rather than your effective gid. This
is to prevent users from getting access to files that are readable by the
mail group (i.e. other users' mailboxes).

Under 3.3.1, ELM configuration detects the existence of vfork() and uses it
instead of fork(). Then, in the child, ELM calls setgid() to set the
group-id to your real group-id, performs the test, and exits with a status.
The parent reads this status back.
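
A minimal sketch of that kind of check, written with fork() so that the
child's setgid() cannot touch the parent. This is not ELM's actual source;
the function name readable_with_real_gid() and the exit-status convention
are assumptions made for illustration.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if `path` can be opened for reading using the real gid,
 * 0 if it cannot, and -1 if the check itself could not be carried out.
 */
int readable_with_real_gid(const char *path)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        int fd;

        /* Child: give up the set group-id privilege before testing. */
        if (setgid(getgid()) != 0)
            _exit(2);
        fd = open(path, O_RDONLY);
        _exit(fd >= 0 ? 0 : 1);
    }
    /* Parent: keeps its effective gid and reads the child's verdict. */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    switch (WEXITSTATUS(status)) {
    case 0:
        return 1;
    case 1:
        return 0;
    default:
        return -1;
    }
}
```

The parent never changes its own ids; only the short-lived child does.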

On most systems with vfork(), the two processes inherit the same address
space, BUT DIFFERENT KERNEL U-AREAS. This means that the setgid() call
doesn't affect the parent.

Under IRIX, the vfork() call is actually implemented using sproc(), which is
a more primitive way to get multiple processes. It DOES NOT give you a
separate u-area. So the setgid() call affects the parent!

As a result, the parent process is no longer set group-id mail, and so it
cannot generate lock files in the mail directory!

I discovered this quite accidentally; I was using DBX to attempt some
debugging and found that vfork() confused DBX, so I recompiled elm to use
fork() instead. Suddenly, everything worked fine! So I wrote a simple test
program which runs set group-id, vforks, and does a setgid(getgid()) in the
child. Sure enough, the group-id in the parent changes!
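
A test along those lines might look roughly like the sketch below. It is a
reconstruction, not the original program: the SPAWN macro and the function
name child_setgid_leaks() are assumptions. The result only means something
when the binary is installed set group-id, so that the real and effective
gids differ.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Build with -DSPAWN=vfork to exercise vfork(); the default is fork(). */
#ifndef SPAWN
#define SPAWN fork
#endif

/*
 * Returns 1 if a setgid() in the child changed the parent's effective
 * gid, 0 if the parent was unaffected, -1 if the test could not run.
 */
int child_setgid_leaks(void)
{
    gid_t before = getegid();
    int status;
    pid_t pid = SPAWN();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: drop to the real gid, then exit without returning. */
        if (setgid(getgid()) != 0)
            _exit(1);
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    /* Parent: on a correct system the effective gid is unchanged. */
    return getegid() != before;
}
```

A return value of 1 from the vfork() build would correspond to the
behaviour described above; the fork() build should always return 0.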

vfork() also causes problems with Perl. I strongly suggest not using it at
all, unless you *really* need the performance improvement that it gives.

	Whee!

--
C. Harald Koch  VE3TLA                Alias Research, Inc., Toronto ON Canada
chk%al...@csri.utoronto.ca      c...@gpu.utcs.toronto.edu      c...@chk.mef.org
"Open the Zamboni! We're coming out!" - Kathrin Garland and Anson James, 2299

Path: gmdzi!unido!mcsun!sunic!uupsi!rpi!zaphod.mps.ohio-state.edu!usc!
ucla-cs!twinsun!eggert
From: egg...@twinsun.com (Paul Eggert)
Newsgroups: comp.sys.sgi
Subject: Re: bug in vfork semantics under IRIX 3.3.1
Message-ID: <1990Nov30.041307.15489@twinsun.com>
Date: 30 Nov 90 04:13:07 GMT
References: <1990Nov29.035827.1302@alias.uucp>
Sender: n...@twinsun.com
Organization: Twin Sun, Inc
Lines: 16
Posted: Fri Nov 30 05:13:07 1990
Nntp-Posting-Host: ata

chk%al...@csri.toronto.edu (C. Harald Koch) writes:

	Under IRIX, the vfork() call is actually implemented using sproc(),
	which is a more primitive way to get multiple processes.  It DOES NOT
	give you a separate u-area.  So the setgid() call affects the parent!
	...  I strongly suggest not using it at all....

I second this suggestion.  Under IRIX 3.3, vfork() also botches file
descriptors: e.g. if a vfork() child process closes a file, IRIX mistakenly
closes the corresponding file descriptor in the parent.  I ran into this
problem porting RCS 5.4 to SGI.
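
The descriptor problem can be checked in much the same way as the gid
problem. The sketch below is illustrative only: the SPAWN macro and the
function name child_close_leaks() are assumptions, and /dev/null is used
merely as a file that is always available to open.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Build with -DSPAWN=vfork to exercise vfork(); the default is fork(). */
#ifndef SPAWN
#define SPAWN fork
#endif

/*
 * Returns 1 if a close() in the child also closed the parent's
 * descriptor, 0 if the parent's descriptor survived, -1 on setup failure.
 */
int child_close_leaks(void)
{
    int fd = open("/dev/null", O_RDONLY);
    int status, leaked;
    pid_t pid;

    if (fd < 0)
        return -1;
    pid = SPAWN();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: close the descriptor and exit. */
        close(fd);
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0) {
        close(fd);
        return -1;
    }
    /* F_GETFD fails with EBADF if fd is no longer open in the parent. */
    leaked = (fcntl(fd, F_GETFD) == -1);
    if (!leaked)
        close(fd);
    return leaked;
}
```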

I couldn't find any SGI documentation for vfork(),
so I suspect it's both undocumented and unsupported.
Even so, surely it is unwise for SGI to supply such a nonstandard vfork(),
because too many people will run into similar problems.

Path: utzoo!censor!geac!torsqnt!news-server.csri.toronto.edu!
cs.utexas.edu!wuarchive!rex!ames!sgi!shinobu!odin!patton.wpd.sgi.com!jmb
From: j...@patton.wpd.sgi.com (Doctor Software)
Newsgroups: comp.sys.sgi
Subject: Re: bug in vfork semantics under IRIX 3.3.1
Message-ID: <1990Dec2.195600.25725@odin.corp.sgi.com>
Date: 2 Dec 90 19:56:00 GMT
References: <1990Nov29.035827.1302@alias.uucp> <1990Nov30.041307.15489@twinsun.com>
Sender: n...@odin.corp.sgi.com (Net News)
Reply-To: j...@patton.wpd.sgi.com (Doctor Software)
Organization: Silicon Graphics Inc.
Lines: 58

In article <1990Nov30.041307.15...@twinsun.com>, egg...@twinsun.com
(Paul Eggert) writes:
> 
> chk%al...@csri.toronto.edu (C. Harald Koch) writes:
> 
> 	Under IRIX, the vfork() call is actually implemented using sproc(),
> 	which is a more primitive way to get multiple processes.  It DOES NOT
> 	give you a separate u-area.  So the setgid() call affects the parent!
> 	...  I strongly suggest not using it at all....
> 

I guess I missed the original posting on this one, but the assumption
here is wrong. sproc(2) creates a new process which shares certain
resources with the parent process. The caller of sproc() is in control
of which are shared. The list includes VM, file descriptors, user and
group IDs, and others. Thus, there >is< a separate u-area for both.
sproc() and vfork() aren't even in the same league - sproc() creates
multiple threads of execution, while vfork() was implemented solely to
speed up the fork/exec sequence.

> I second this suggestion.  Under IRIX 3.3, vfork() also botches file
> descriptors: e.g. if a vfork() child process closes a file, IRIX mistakenly
> closes the corresponding file descriptor in the parent.  I ran into this
> problem porting RCS 5.4 to SGI.
> 
> I couldn't find any SGI documentation for vfork(),
> so I suspect it's both undocumented and unsupported.

Wherever you found a vfork() routine, the underlying sproc() undoubtedly
shares file descriptors with the parent, and thus your problems. As to
support, indeed, vfork() is unsupported by SGI. Please note that vfork()
is supposed to be semantically equivalent to fork(), unless the child
messes up the parent's address space (which the 4.3 manual page warns about).

Vfork() is actually itself a kludge, because it is a response to the
poor performance of fork() on Bezerkley based machines. Because BSD VM
copies the entire address space on fork, fork is expensive. Modern VM
systems (including IRIX) implement copy-on-write as well as text
sharing, so the cost of a fork() call is very small. The proper thing to
do is to fix fork()'s poor performance, not kludge in a system call to
get around it.

Thus, the easiest way to implement vfork() is to add this to the program:

# define vfork	fork

and leave it at that.

> Even so, surely it is unwise for SGI to supply such a nonstandard vfork(),
> because too many people will run into similar problems.

Surely you jest. The entire computer industry would grind to a halt if
every company tried to make sure it only shipped "documented" entry
points to its libraries.

-- Jim Barton
   Silicon Graphics Computer Systems
   j...@sgi.com

Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!
mcsun!hp4nl!charon!guido
From: gu...@cwi.nl (Guido van Rossum)
Newsgroups: comp.sys.sgi
Subject: Re: bug in vfork semantics under IRIX 3.3.1
Message-ID: <2638@charon.cwi.nl>
Date: 3 Dec 90 10:29:34 GMT
References: <1990Nov29.035827.1302@alias.uucp> 
<1990Nov30.041307.15489@twinsun.com> <1990Dec2.195600.25725@odin.corp.sgi.com>
Sender: n...@cwi.nl
Lines: 21

j...@patton.wpd.sgi.com (Doctor Software) writes:

>[Reasonable article omitted]
>
>Vfork() is actually itself a kludge, because it is a response to the
>poor performance of fork() on Bezerkley based machines.
                               ^^^^^^^^^
(As an SGI employee you should know better than to use such pejoratives.)

From the rest of your story it is clear that SGI is aware that some
programs need vfork().  You also claim that SGI's fork() has adequate
performance to be used instead of vfork().  Then why did someone at SGI
bother to whip up an inadequate vfork() substitute using sproc() when
it could be implemented trivially using fork(), with better
preservation of the semantics?  (Note that the shared memory semantics
of vfork() are explicitly undefined, whereas the non-shared u-area is
essential.)  I'm sure this can be fixed in the next release.

--
Guido van Rossum, CWI, Amsterdam <gu...@cwi.nl>
"A thing of beauty is a joy till sunrise"

Path: gmdzi!unido!mcsun!uunet!samsung!usc!ucla-cs!twinsun!eggert
From: egg...@twinsun.com (Paul Eggert)
Newsgroups: comp.sys.sgi
Subject: Re: bug in vfork semantics under IRIX 3.3.1
Message-ID: <1990Dec3.024237.23749@twinsun.com>
Date: 3 Dec 90 02:42:37 GMT
References: <1990Nov29.035827.1302@alias.uucp> 
<1990Nov30.041307.15489@twinsun.com> <1990Dec2.195600.25725@odin.corp.sgi.com>
Sender: n...@twinsun.com
Organization: Twin Sun, Inc
Lines: 20
Posted: Mon Dec  3 03:42:37 1990
Nntp-Posting-Host: ata

j...@patton.wpd.sgi.com (Doctor Software) writes:

	Thus, the easiest way to implement vfork() is ...
		# define vfork	fork

Several programs (e.g. perl, rn) have autoconfiguration scripts that determine
whether the host has a vfork(), and use fork() otherwise.
This strategy fails under IRIX 3.3, which has a vfork() that doesn't work.
A programmer who knows about the IRIX vfork bug can work around this by hand,
but it's unrealistic to expect this of every perl and rn maintainer.
If SGI wants to make it easy to port software to their machines,
IRIX should have either a working vfork(), or no vfork() at all.
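
For the by-hand workaround, one possible shape is sketched below. It
assumes the IRIX compilers predefine `sgi` or `__sgi`; run_command() is a
hypothetical helper, included only to show the one pattern vfork() was
meant for, namely a child that does nothing but exec or _exit.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumption: the IRIX compilers predefine `sgi` or `__sgi`. */
#if defined(sgi) || defined(__sgi)
#undef vfork
#define vfork fork
#endif

/* Hypothetical helper: run argv[0] and return its wait status, or -1. */
int run_command(char *const argv[])
{
    int status;
    pid_t pid = vfork();    /* becomes fork() where the guard applies */

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: exec immediately; touch nothing else. */
        execvp(argv[0], argv);
        _exit(127);         /* reached only if the exec failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

The guard has to appear before any use of vfork() in each source file, or
in a header that every file includes.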


	The entire computer industry would grind to a halt if every company
	tried to make sure it only shipped "documented" entry points to its
	libraries.

If IRIX's vfork() is indeed just an undocumented library entry point that has
nothing to do with BSD's vfork(), then it should be given a different name.

Path: gmdzi!unido!mcsun!sunic!uupsi!rpi!zaphod.mps.ohio-state.edu!ncar!
ames!sgi!shinobu!odin!patton.wpd.sgi.com!jmb
From: j...@patton.wpd.sgi.com (Doctor Software)
Newsgroups: comp.sys.sgi
Subject: Re: bug in vfork semantics under IRIX 3.3.1
Message-ID: <1990Dec3.160645.3347@odin.corp.sgi.com>
Date: 3 Dec 90 16:06:45 GMT
References: <1990Nov29.035827.1302@alias.uucp> 
<1990Nov30.041307.15489@twinsun.com> <1990Dec2.195600.25725@odin.corp.sgi.com> 
<1990Dec3.024237.23749@twinsun.com>
Sender: n...@odin.corp.sgi.com (Net News)
Reply-To: j...@patton.wpd.sgi.com (Doctor Software)
Organization: Silicon Graphics Inc.
Lines: 20
Posted: Mon Dec  3 17:06:45 1990

In article <1990Dec3.024237.23...@twinsun.com>, egg...@twinsun.com (Paul
Eggert) writes:
> ...
> If SGI wants to make it easy to port software to their machines,
> IRIX should have either a working vfork(), or no vfork() at all.
> 
> 
> 	The entire computer industry would grind to a halt if every company
> 	tried to make sure it only shipped "documented" entry points to its
> 	libraries.
> 
> If IRIX's vfork() is indeed just an undocumented library entry point that has
> nothing to do with BSD's vfork(), then it should be given a different name.

You're right of course on both counts. The solution has already started
to work its way through the mill out here ...

-- Jim Barton
   Silicon Graphics Computer Systems
   j...@sgi.com