Path: gmdzi!unido!mcsun!uunet!samsung!think.com!hsdndev!cmcl2! kramden.acf.nyu.edu!brnstnd From: brns...@kramden.acf.nyu.edu (Dan Bernstein) Newsgroups: gnu.misc.discuss Subject: Anyone want to gamma-test a compressor? Message-ID: <812:Feb1720:37:5291@kramden.acf.nyu.edu> Date: 17 Feb 91 20:37:52 GMT Organization: IR Lines: 106 I now have version 0.95 of a compression package implementing AP and Y coding, two non-Huffman LZ78-style dictionary coders. Both compressors produce somewhat smaller output than compress. AP runs at basically the same speed as compress. Y is quite a bit slower but appears to be new (and hence unpatented). Enclosed are BLURB and PATENTS from the package. I plan to send the entire package to comp.sources.misc by the end of the month. In the meantime, I'm looking for gamma testers. Gamma test means the package has survived beta test and I think it's reasonably complete, but it still needs more portability testing, and it's not too late to add features if I've missed any. If you're interested in the package and you're willing to put in some time testing it and telling me what you think, send a note to brns...@nyu.edu. ---Dan This package includes yabba and unyabba, portable implementations of Y coding. Y coding is unpatented---unlike the LZW coding used by compress and the LZ78 variants used by most other dictionary compressors. Y also gives somewhat better compression than LZW. As an extra bonus, this package also includes whap and unwhap, portable implementations of AP coding. AP coding is patented, but like Y it is more effective than LZW. whap runs about as fast as compress and, in its default configuration, uses less memory; unwhap runs as fast as uncompress and uses less than half as much memory. The entire package is public-domain. You can copy my Y code into any application without fear of copyrights or patents. Be aware, however, that the current implementation is probably not too close to optimal, and I am actively searching for a better method. If AP were not patented I would recommend that you use it instead. For more information see PATENTS in the package. PATENTS: After someone patents a product or method that he invented, he can sue anyone who makes, uses, or sells the invention. You've probably heard that Welch at Unisys patented LZW compression some years back, and so Unisys can sue anyone who uses UNIX compress or any similar program. You may not have heard that Miller and Wegman at IBM independently discovered several compression methods, including LZW---and sent their patent application to the Patent & Trademark Office before Welch did. So it appears that IBM controls LZW. How can Unisys patent something that IBM already patented? And how can either of them patent what are obviously mathematical algorithms, when mathematical algorithms aren't supposed to be patentable? The answer is that the Patent & Trademark Office doesn't let mathematicians and computer scientists become patent examiners. So none of their examiners can recognize mathematics without a court order, and even then they have trouble. They also can't tell that two patents cover exactly the same algorithm when the wording is slightly different. Practically every good compression method in the last ten years has been patented. Modem and disk manufacturers can't ignore the possibility of government-approved monopolies on compressors: there's too much money at stake. So they apply for one patent after another, twisting the wording as much as possible so that the patent examiners don't realize they're patenting mathematics. By and large, they succeed. On December 26, 1990, I discovered an apparently new compression method, Y coding. I haven't found a way to make Y run as fast as compress (yet), but it's not too slow for general use, and it produces better results overall than any other non-Huffman dictionary compressor I've seen. I am sick and tired of having to tiptoe through this minefield of patents; I want a method that produces good results and can be used freely by anyone. Well, now I have it. Everything in Y coding is public domain. No copyrights, no patents, no protection at all, on the code or on the algorithms or on the file format or on anything else. I can't be totally sure, of course, that I'm the first to discover Y, but everyone I've talked to thinks it's new. Use it as you will. Several months before I found Y coding, I found what I now know to be Storer's AP coding. At first I thought it was new, and I started putting together an implementation. It's quite a bit faster than my Y implementation---even faster than compress on some machines---and uses less memory. But it has turned out to be patented. I've included the code in my Y package because I don't want my work to go to waste, but it can't be used freely. You can use AP for what's sometimes called ``experimental use''; the precise wording in most court cases is ``for the sole purpose of satisfying philosophical taste or curiosity, or for instruction and amusement.'' Patents never apply to that sort of use. If you like AP's efficiency and want to start using it for more than satisfying your philosophical curiosity, you might be tempted to ask Storer for a license. Don't give in! First, his patent---any compression patent---is on very shaky ground, and as it hasn't been acknowledged in the industry, it's even weaker. Second, you can challenge the patent by sending the Patent & Trademark Office about $2000 and an explanation of why AP is a mathematical algorithm. Now you may not have the money or lawyers or time or energy to do this successfully, but there are a lot of people who feel just as restricted. You should check up on the League for Programming Freedom; if you don't subscribe to their political goals but still want to get rid of compression patents, send me a note. In any case this will be cheaper on everyone than paying Storer for something he shouldn't even control. Third, Y isn't that much slower than AP, and it compresses just as effectively. Chances are good that I or someone else will find a faster implementation soon. Finally, we may be able to convince Storer to donate his patent to the public good. If you want to appeal to his morals, send a note to sto...@cs.brandeis.edu, copy to brns...@nyu.edu. Disclaimer: I'm not a lawyer. I'm just an outraged citizen. ---Dan Bernstein, brns...@nyu.edu
Path: gmdzi!unido!mcsun!uunet!bonnie.concordia.ca!news-server.csri.toronto.edu! helios.physics.utoronto.ca!ists!yunexus!oz From: o...@yunexus.yorku.ca (Ozan Yigit) Newsgroups: gnu.misc.discuss Subject: data compression notes [Re: Anyone want ...] Message-ID: <21698@yunexus.YorkU.CA> Date: 19 Feb 91 16:26:08 GMT References: <812:Feb1720:37:5291@kramden.acf.nyu.edu> Sender: n...@yunexus.YorkU.CA Organization: York U. Communications Research & Development Lines: 36 In article <See Ref> Dan Bernstein (brns...@nyu.edu) writes: >Practically every good compression method in the last ten years has been >patented. Ross Williams' SAKDC [1] (Swiss Army Knife Data Compression) algorithm is not patented. I verified this with the author, he does not believe in patenting algorithms. >On December 26, 1990, I discovered an apparently new compression method, >Y coding. That is nice. I think you should publish the algorithm (and a through analysis of it) in some respectable journal. That would provide you with the appropriate feedback, allow wider dissamination of your algorithm in general, and highlight one potential solution to the problem of patented algorithms: invent a better one. >... I am >sick and tired of having to tiptoe through this minefield of patents; I >want a method that produces good results and can be used freely by >anyone. Coincidentally, this month's DDJ contains some stuff on this topic, including a compressor by Mark R. Nelson (model-2) that is 25-30% better than compress. There is also a Data Compression Contest hosted by DDJ: who knows what will turn up. oz --- [1] Williams, Ross N., Adaptive Data Compression, Kluwer Publishers, Boston, 1991. --- Yellow pages is evil. All traces of | Internet: o...@nexus.yorku.ca it must be vanguished from the face | Uucp: utzoo/utai!yunexus!oz of the earth. -- Eriks Rugelis | Phone: 1+ (416) 736 5257 ..
Path: gmdzi!unido!fauern!ira.uka.de!sol.ctr.columbia.edu! zaphod.mps.ohio-state.edu!rpi!uupsi!cmcl2!kramden.acf.nyu.edu!brnstnd From: brns...@kramden.acf.nyu.edu (Dan Bernstein) Newsgroups: gnu.misc.discuss Subject: Re: data compression notes [Re: Anyone want ...] Message-ID: <2022:Feb2012:34:0191@kramden.acf.nyu.edu> Date: 20 Feb 91 12:34:01 GMT References: <812:Feb1720:37:5291@kramden.acf.nyu.edu> <21698@yunexus.YorkU.CA> Organization: IR Lines: 18 In article <21...@yunexus.YorkU.CA> o...@yunexus.yorku.ca (Ozan Yigit) writes: > That is nice. I think you should publish the algorithm (and a through > analysis of it) in some respectable journal. I'm working on a paper. > Coincidentally, this month's DDJ contains some stuff on this topic, > including a compressor by Mark R. Nelson (model-2) that is 25-30% better > than compress. But how fast is it? How much memory does it use? Take, for example, the important case of people transferring news between IBM PCs with size-64K memory segments. compress works with 12 bits, or about a 6K input block size on average. yabba works with a 12K input block size. whap works with a 20K input block size. This alone gives a big improvement. Can Nelson's modeller be used in limited memory? ---Dan
Path: gmdzi!unido!mcsun!ukc!dcl-cs!aber-cs!athene!pcg From: p...@cs.aber.ac.uk (Piercarlo Grandi) Newsgroups: gnu.misc.discuss Subject: Re: data compression notes [Re: Anyone want ...] Message-ID: <PCG.91Feb25215626@odin.cs.aber.ac.uk> Date: 25 Feb 91 21:56:26 GMT References: <812:Feb1720:37:5291@kramden.acf.nyu.edu> <21698@yunexus.YorkU.CA> <2022:Feb2012:34:0191@kramden.acf.nyu.edu> Sender: p...@aber-cs.UUCP Organization: Coleg Prifysgol Cymru Lines: 48 Nntp-Posting-Host: odin In-reply-to: brnstnd@kramden.acf.nyu.edu's message of 20 Feb 91 12:34:01 GMT On 20 Feb 91 12:34:01 GMT, brns...@kramden.acf.nyu.edu (Dan Bernstein) said: brnstnd> In article <21...@yunexus.YorkU.CA> o...@yunexus.yorku.ca (Ozan brnstnd> Yigit) writes: oz> That is nice. I think you should publish the algorithm (and a oz> through analysis of it) in some respectable journal. brnstnd> I'm working on a paper. Why not patent it? :-) Seriously, as the FSF might also use patents in the same way as it uses copyrights, to ensure that certain designs can circulate freely instead of being hoarded. Suppose that you do not patent your algorithm; somebody else later can rediscover and patent it, and then *you* have to sue the somebody and prove that you had prior art. If you preempt this by getting a patent and then licensing it for free (to everybody that promises to use it only in copyleft sources, or under similar conditions), this becomes much more difficult, because at the very least your work will turn up in a patent search and subsequent hoarders will have their claim rejected by the Patent Office (hopefully). Naturally there is a snag: while copylefting a source costs nothing, freepatenting an algorithm or a device costs thousands of dollars. Maybe the FSF could decide to do something about this for some possibly important ideas, like establishing a fund and a panel to decide which of the technologies submitted to the FSF should be patented drawing fropm that fund's money. The idea appeals to me a lot, because then the FSF could accumulate a portfolio of freepatents, and if it became suitably attractive, patent portfolio swaps could be arranged with large corporations, thus freeing even more patents. I for one would surely contribute some patentable technology to the FSF; I have posted in the past an algorithm for compiling multiple assignments, and the design of a reverse map MMU, in an attempt to prevent others to patent them eventually. It would have been much better I think to let the FSF patent them and then let everybody use them. I think that Dan Bernstein and others could be contributors too. -- Piercarlo Grandi | ARPA: pcg%uk.ac.aber...@nsfnet-relay.ac.uk Dept of CS, UCW Aberystwyth | UUCP: ...!mcsun!ukc!aber-cs!pcg Penglais, Aberystwyth SY23 3BZ, UK | INET: p...@cs.aber.ac.uk
Path: gmdzi!unido!mcsun!uunet!wuarchive!zaphod.mps.ohio-state.edu!rpi!uupsi! cmcl2!kramden.acf.nyu.edu!brnstnd From: brns...@kramden.acf.nyu.edu (Dan Bernstein) Newsgroups: gnu.misc.discuss Subject: Re: data compression notes [Re: Anyone want ...] Message-ID: <23019:Feb2609:37:3191@kramden.acf.nyu.edu> Date: 26 Feb 91 09:37:31 GMT References: <21698@yunexus.YorkU.CA> <2022:Feb2012:34:0191@kramden.acf.nyu.edu> <PCG.91Feb25215626@odin.cs.aber.ac.uk> Organization: IR Lines: 112 In article <PCG.91Feb25215...@odin.cs.aber.ac.uk> p...@cs.aber.ac.uk (Piercarlo Grandi) writes: > Suppose that you do not patent your algorithm; somebody else later can > rediscover and patent it, and then *you* have to sue the somebody and > prove that you had prior art. No, there's no point. The courts have said [riffle riffle riffle] ``A patent could not be infringed by the practice of any prior art.'' (Peerless Equipment Co. [now there's a name!] v. W.H. Miner, Inc., C.C.A. Ill.1938, 93 F.2d 98, certiorari denied 58 S.Ct. 611, 303 U.S. 641, 82 L.Ed. 1113.) Even better: ``It is neither wrong nor immoral for a manufacturer deliberately to create an article on basis of prior art in order to avoid infringement.'' (Reiner v. I. Leon Co., C.A.N.Y. 1963, 324 F.2d 648.) All you have to do is find a previous publication of the invention and you're home free. An interesting twist is that the courts usually won't even consider whether the patent in question is valid. They'll just watch a dozen experts testify that you're working within the prior art, and boom, the case is over. Normally they would first consider the validity of the patent, and only then consider whether you've infringed it. (This is still backwards in some circuits.) I am told, by the way, that electronic publication does constitute publication for the purposes of patent law (though not necessarily copyright law). However, evidence from a computer generally isn't admissible in court, so anyone who saved a copy of my first announcement of Y coding should print out a copy, sign and date it, and preferably notarize it. Here's another copy just in case. (Can never be too careful... [grin]) ---Dan Path: kramden.acf.nyu.edu!brnstnd From: brns...@kramden.acf.nyu.edu (Dan Bernstein) Newsgroups: gnu.misc.discuss Subject: Re: compress algorithm patent Message-ID: <6617:Jan2723:16:0...@kramden.acf.nyu.edu> Date: Sun Jan 27 23:16:09 GMT 1991 References: <CARLTON.91Jan26141...@husc8.harvard.edu> Organization: IR Lines: 68 In article <CARLTON.91Jan26141...@husc8.harvard.edu> carl...@husc8.harvard.edu (david carlton) writes: > does anybody know what > happened with that? Well, yes. The question is who wants to spend the hours telling you. :-) The best non-Huffman variant of LZ/LZW/MW is Storer's AP coding, which I independently discovered and implemented last year. After I found out that Storer had discovered the method first (and patented it), I tried to negotiate with him, but he doesn't seem interested in helping people if he doesn't make some money off the deal. I now have a compression program that produces typically 10-20% better results than compress, can be compiled to use under half as much memory as compress for decompression and the same amount for compression, runs at basically the same speed, is easier to configure, compiles and runs on 16- and 32-bit machines under practically any sane C implementation, and can prep its output for encryption. If James Storer agrees to let everyone use the program without restriction, I will place it into the public domain and distribute it through USENET; since RMS was interested in my squeeze program and since the new program is better in almost all respects, I imagine he'll pick it up for GNU. Send a polite note to sto...@cs.brandeis.edu if you would like to see this happen. In the meantime, I'm not sure that even a few hundred messages would convince Storer, so here's a method that appears to be unpatented. For the record, I found Y coding December 26. It gives slightly better results than LZW and codes somewhat faster than AP. On the other hand, decompression is about as slow as compression and uses more memory. Y coding: Start with a dictionary of all single characters. At each input position, add a new string to the dictionary. The new string is the longest match between the current position and the dictionary, plus one further character in the current position. Constructing the dictionary and printing output are entirely independent. To do the latter, simply output a code for the longest match against the dictionary constructed so far, advance past that match, and repeat. (``So far'' means taking into account only strings constructed without having to look at the start of the current match.) For example, take the input string yabbadabbadabbadoo!!!!. These strings are added to the dictionary, one starting at each input position: ya, ab, bb, ba, ad, da, abb, bba, bad, ada, dab, abba, bbad, bado, ado, do, oo, o!, !!, !!!. One way to look at this generation is that bbad is added to the dictionary as soon as bb.*bba.*bbad appears in the input. Now you read off the coding as follows: Read y. y matches the dictionary constructed up to the previous character, so keep going. Read a. ya doesn't match the dictionary constructed up to the previous character, so output the code for y. Read b. Output a. etc. The final outputs are y, a, b, b, a, d, ab, ba, da, bba, d, o, o, !, !, !!. Another example: pickypickypickypickypicky! codes as p, i, c, k, y, pi, ck, yp, ick, ypi, ckyp, icky, !. You can see that strings of any length can be added to the dictionary rather quickly, so although this has a similar flavor to LZJ it is certainly not equivalent. Anyway, I place Y coding into the public domain; unless somebody else has snuck a patent in somewhere, you're free to use the method in any program. I'm working on rewriting my AP program for Y coding, though I think AP is somewhat better, and I don't particularly want to optimize yet another compressor if Storer's willing to allow use of the current one. ---Dan