UNIX Methods and Concepts: Putting the Genie Back in the Bottle, by Salus & Toomey

June 16 2006

~ by Dr. Peter H. Salus and Warren K. Toomey, The UNIX Heritage Society

Recently, The SCO Group has asserted that IBM negligently leaked the methods and concepts in UNIX. What The SCO Group fails to realize is that, from day one, the methods and concepts in UNIX were out in the open. And, as AT&T found out when UNIX was commercialized, staunching the leakage of UNIX methods and concepts was like putting the proverbial genie back into the bottle.

Throughout the 1970s, AT&T made no secret of the UNIX source code. After the system became "public," following the SOSP paper in October 1973 (and its publication in the July 1974 Communications of the ACM), there were many requests from outside AT&T for the new OS. The SOSP paper was itself very revealing of the UNIX methods and concepts. Moreover, AT&T at the time was obliged, following a consent decree, to confine its business to "telegraphy and telephony": it could not sell software products, and instead had to license UNIX on a very open basis.

What the requesters received, from Ken Thompson and (later) from Irma Biren, was a 10" tape or a disk pack with the bits of 3rd Edition UNIX, or 4th, or 5th, or 6th. All the bits: not what we'd call a binary version, but the source. And many recipients just printed it out. The disk pack frequently came with a handwritten note: "Here's your RK05, Love Ken"; on the tape that Lou Katz received, the note read: "Here's the tape, if it craps out, I'll cut another."

Further, from the beginning, Thompson would talk about the system and its code: for example, at the "UNIX Users' Meeting" at Columbia University on May 15, 1974. There were no barriers, no bars, no hurdles. And every site had full source.

It is important to note that UNIX was never a static system, and the userbase found it immensely useful to have the source code, so that the system could be fixed, and enhanced to suit the users at each individual site. Examples include the AUSAM system, developed at UNSW in Australia, and the early BSDs. The changes made by the users often found their way back into the main UNIX development tree. Significant portions of AUSAM code were still visible in System V in the late 1980s.

With a malleable OS and its source code in their hands, the users felt the urge to exchange home-grown bug fixes and improvements to the system. Beginning in mid-1976, the UNIX Users' Group mailed out tapes of the "software exchange." The first was announced in the May-June 1976 "UNIX NEWS," the second followed in November, and the third in May-June 1977. AT&T's attitude forced the users to exchange knowledge with one another. Mel Ferentz (then publisher of "UNIX NEWS") was driven to initiate the exchange, and Mike O'Brien (then a graduate student) to implement it.

Another avenue for the exchange of home-grown methods and concepts was the annual UNIX User Group (later USENIX) conferences. Over 150 people attended the May 1977 meeting in Urbana, IL. Everybody brought two tapes: one full of new programs, device drivers and system patches, and the other empty.

After the creation of USENET in late 1979, the net.v6bugs and net.v7bugs newsgroups were formed so that users could exchange bug fixes on-line in the form of patches. These newsgroups were quite active, thus "leaking" many lines of original UNIX code.

Even the UNIX developers aided and abetted the free exchange of methods and concepts: Ken Thompson took a sabbatical at the University of California, Berkeley, where he introduced UNIX and its methods and concepts to the staff and students.

Then there is the case of the "50 bugs" tape. By the late 1970s, AT&T had started to impose more restrictive conditions in its UNIX licenses, stifling the exchange of UNIX code between licensees. The licenses also did not include the ability to obtain bug fixes from AT&T. The researchers at Bell Labs had found and fixed a significant number of bugs in UNIX, and Ken Thompson had tried to get the patches out, but the lawyers kept stalling him. Eventually, a tape with the patches was "found" by Lou Katz and Reidar Bornholdt on Mountain Avenue (the road leading to the Labs). Ken also "inadvertently" left an image of the tape at the University of Illinois, when visiting on his way to Berkeley, and another at Berkeley.

While the ability to exchange code was being limited, the same was not true for the distribution of methods and concepts. The distribution of John Lions' commentary on 6th Edition UNIX was stopped because it contained source code. It was followed by Maurice J. Bach's book on System V, which used pseudo-code to explain the system's internals. Many other books followed, including those by McKusick et al., Goodheart & Cox, and Vahalia. All of these outline the methods and concepts in UNIX in exquisite detail.

All of this raises the question: was there ever anything in UNIX worth protecting, and how should it have been protected? UNIX, of course, is one of the most influential and useful operating systems in computing history. But was it the source code that was critical, or the algorithms used in the system, or its methods and concepts, or something else?

As AT&T began to productize UNIX in the late 1970s, it became critical to protect the source code. However, AT&T dithered on how best to do this. At this time, the ability to copyright software was still uncertain, and the company initially chose a license-plus-trade-secrets approach, and did not revisit the copyright approach until the 1980s. The result of this was the lack of copyright notices in 32V.

But was the actual source code really important? Certainly, it gave the users the ability to tailor their systems, to fix bugs, and to extend UNIX in directions that the original designers had not chosen to go. But the early UNIX source code didn't contain any significantly important algorithms. The preface to Lions' commentary indicates that the early systems used simple algorithms (linear search, etc.).

What about the essential "methods and concepts" in UNIX: a hierarchical filesystem, i-nodes, multitasking, protected process address spaces, a command-line shell, etc.? None of these were new in the realm of operating systems.

In fact, the useful methods and concepts in UNIX were at a much higher level, that of the UNIX toolbox mindset: lots of well-designed tools which perform individual actions, combined with a framework which allows them to be connected together. And the toolbox notions were there (according to McIlroy, Thompson and Kernighan) by 1972. But even at this level, AT&T not only failed to protect the mindset, but encouraged its adoption, e.g. with the Software Tools book by Kernighan and Plauger (1976) and the first UNIX issue of the Bell System Technical Journal in 1978.
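The toolbox idea is easy to restate in any language, which is part of why it could never be fenced in. What follows is a minimal sketch in Python, not a rendering of any UNIX code: the names grep, sort_lines, uniq and pipeline are our own hypothetical stand-ins for small tools and for the framework that connects them, and generators are assumed here as a loose analogue of pipes.

```python
from typing import Callable, Iterable, Iterator

# A "tool" here is any function that reads a stream of lines and yields lines.
Tool = Callable[[Iterable[str]], Iterator[str]]


def grep(pattern: str) -> Tool:
    """Build a tool that keeps lines containing `pattern` (plain substring match)."""
    def run(lines: Iterable[str]) -> Iterator[str]:
        return (line for line in lines if pattern in line)
    return run


def sort_lines(lines: Iterable[str]) -> Iterator[str]:
    """Emit all input lines in sorted order."""
    return iter(sorted(lines))


def uniq(lines: Iterable[str]) -> Iterator[str]:
    """Drop a line when it equals the line immediately before it."""
    previous = None
    for line in lines:
        if line != previous:
            yield line
        previous = line


def pipeline(source: Iterable[str], *tools: Tool) -> Iterator[str]:
    """The connecting framework: feed each tool's output to the next tool."""
    stream = iter(source)
    for tool in tools:
        stream = tool(stream)
    return stream


# Hypothetical input, invented for illustration only.
log = ["ken login", "dmr login", "ken logout", "ken login", "bwk login"]
result = list(pipeline(log, grep("login"), sort_lines, uniq))
print(result)
```

Each tool knows nothing about its neighbours; only the stream of lines is shared. Adding a new tool means writing one more small function, not touching the others.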

In 1983, we had Bourne's book "The UNIX System" (with the copyright held by "Bell Laboratories"!) -- not a good way to protect things. Even after the advent of System V, neither AT&T nor USL attempted to veil methods and concepts. Goodheart and Cox in The Magic Garden Explained (1994) give full details where SVR4 is concerned, calling it "an open systems design."

Finally, in 1995, Mike Gancarz of DEC gave the world "The UNIX Philosophy", showing us how "the UNIX philosophy is an approach to developing operating systems and software that constantly looks to the future." There is very little there that isn't in the first three articles in the BSTJ in 1978.

In summary, what made UNIX so good, and was it protectable? The jewel in the UNIX crown is not the source code, not the algorithms, not the low-level methods and concepts. It is the basic design of UNIX, its inherent philosophy and mindset, and the ability of users to modify the system and swap changes with other users. While the latter could be limited to some extent via copyrights and licenses, the overall design and the inherent mindset were public from the very beginning, and could never be protected.

To end, an addendum on the ELF magic number issue. There is an amazingly large number of executable formats that use the magic numbers from PDP-11 a.out files: 0407, 0410, 0413. This goes to show a) that a magic number with no inherent meaning is unprotectable, and b) how thoroughly the PDP-11 a.out magic numbers infected the binaries of other platforms. Remember, all three decode as PDP-11 branch instructions. The concept has been with us for over 30 years.
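The branch-instruction point can be checked by hand. The sketch below assumes the standard PDP-11 encoding of the unconditional branch BR (high byte 001, low byte a signed offset counted in words, relative to the address of the following instruction) and assumes the classic a.out header of eight 16-bit words; the function name decode_br is our own, chosen for illustration.

```python
# Assumption: the classic PDP-11 a.out header is 8 words of 2 bytes each.
HEADER_BYTES = 16


def decode_br(word: int, address: int = 0):
    """Return the byte address a PDP-11 BR at `address` jumps to, or None if not a BR."""
    # BR is encoded as octal 000400 plus an 8-bit signed word offset.
    if (word & 0o177400) != 0o000400:
        return None
    offset = word & 0o377
    if offset & 0o200:          # sign-extend the 8-bit offset
        offset -= 0o400
    # The PC has already advanced past the instruction (address + 2)
    # when the offset, counted in 2-byte words, is applied.
    return (address + 2 + 2 * offset) & 0o177777


for magic in (0o407, 0o410, 0o413):
    target = decode_br(magic)
    note = "lands just past the header" if target == HEADER_BYTES else "not the header boundary"
    print(f"{magic:04o}: BR to byte {target} ({note})")
```

Under these assumptions 0407 is a branch to byte 16 (octal 20), exactly the first byte after the header, so a file loaded as-is and started at its first word would skip straight into the program text. The other two values still decode as branches, but to bytes 18 and 24; they serve purely as labels.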

05:21 PM EDT

Copyright 2006 http://www.groklaw.net/ - http://creativecommons.org/licenses/by-nc-nd/3.0/