Path: utzoo!utgpu!water!watmath!clyde!bellcore!faline!thumper!ulysses!andante! alice!andrew From: and...@alice.UUCP Newsgroups: comp.unix.wizards,comp.unix.questions Subject: grep replacement Summary: proposal for a replacement for grep/egrep/fgrep Message-ID: <7882@alice.UUCP> Date: 23 May 88 15:22:02 GMT Organization: AT&T Bell Laboratories, Murray Hill NJ Lines: 23 Posted: Mon May 23 11:22:02 1988

Al Aho and I are designing a replacement for grep, egrep and fgrep. The question is what flags should it support and what kind of patterns should it handle? (Assume the existence of flags to make it compatible with grep, egrep and fgrep.)

The proposed flags are the V9 flags:

    -f file    pattern is (`cat file`)
    -v         print nonmatching
    -i         ignore alphabetic case
    -n         print line number
    -x         the pattern used is ^pattern$
    -c         print count only
    -l         print filenames only
    -b         print block numbers
    -h         do not print filenames in front of matching lines
    -H         always print filenames in front of matching lines
    -s         no output; just status
    -e expr    use expr as the pattern

The patterns are as for egrep, supplemented by back-referencing as in \{pattern\}\1.

please send your comments about flags or patterns to research!andrew
Path: utzoo!attcan!uunet!husc6!mailrus!ames!elroy!cogswell!alan From: a...@cogswell.Jpl.Nasa.Gov (Alan S. Mazer) Newsgroups: comp.unix.wizards,comp.unix.questions Subject: Re: grep replacement Summary: Context please Message-ID: <6866@elroy.Jpl.Nasa.Gov> Date: 26 May 88 00:04:47 GMT References: <7882@alice.UUCP> <5630@umn-cs.cs.umn.edu> Sender: n...@elroy.Jpl.Nasa.Gov Lines: 7 One thing I would _love_ is to be able to find the context of what I've found, for example, to find the two (n?) surrounding lines. I have wanted to do this many times and there is no good way. -- Alan ..!cit-vax!elroy!alan * "But seriously, what elroy!a...@csvax.caltech.edu could go wrong?"
Path: utzoo!attcan!uunet!mcvax!unido!rubmez!frei From: f...@rubmez.UUCP (Matthias Frei ) Newsgroups: comp.unix.wizards Subject: Re: grep replacement Message-ID: <136@rubmez.UUCP> Date: 30 May 88 11:04:42 GMT Organization: MEZ, RUB, Bochum, FRG Lines: 28 Posted: Mon May 30 12:04:42 1988 In-Reply-To: your article <7882@alice.UUCP> > Al Aho and I are designing a replacement for grep, egrep and fgrep. > The question is what flags should it support and what kind of patterns > should it handle? (Assume the existence of flags to make it compatible > with grep, egrep and fgrep.) Hi, some applications need to divert a file in two parts. One should contain all lines matching any patterns, the other one all lines not matching any of the patterns. So I want following flags: - d divert the file "matches" to stdout "nomatches" to stderr -r exchange stdout and stderr, if -d is given Will you post Your new grep to the net ? (I hope so) Thanks in Advance for a nice new tool Matthias Frei -------------------------------------------------------------------- Snail-mail: | E-Mail address: Microelectronics Center | UUCP f...@rubmez.uucp University of Bochum | (...uunet!unido!rubmez!frei) 4630 Bochum 1, P.O.-Box 102143 | West Germany |
Path: utzoo!utgpu!water!watmath!clyde!bellcore!rutgers!mit-eddie!uw-beaver! uw-june!uw-entropy!dataio!pilchuck!ssc!happym!kent From: k...@happym.UUCP (Kent Forschmiedt) Newsgroups: comp.unix.wizards Subject: Re: grep replacement Message-ID: <449@happym.UUCP> Date: 2 Jun 88 02:35:46 GMT References: <136@rubmez.UUCP> Reply-To: k...@happym.UUCP (Kent Forschmiedt) Organization: Happy Man Corp. Lines: 24 In article <1...@rubmez.UUCP> f...@rubmez.UUCP (Matthias Frei ) writes: >I want following flags: > > - d divert the file > "matches" to stdout > "nomatches" to stderr > -r exchange stdout and stderr, if -d is given I second the vote - just today I did one of these: grep $PATTERN file > afile grep -v $PATTERN file > anotherfile Note, however, that -v will serve for the suggested -r. >Will you post Your new grep to the net ? (I hope so) From alice.UUCP?? Ha ha! That's Bell Labs! It will be in V10 Unix, and none of us humans will see it until sysVr6, and only then if we are lucky!! -- -- Kent Forschmiedt -- k...@happym.UUCP, tikal!camco!happym!kent Happy Man Corporation 206-282-9598
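The one-pass split requested above (matches to one stream, non-matches to another) can be sketched in a few lines of C. This is only an illustration, not the proposed grep: the name divert_lines is hypothetical, strstr() stands in for real pattern matching, and lines longer than the buffer would be split.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy each line of `in` to `match` if it contains
 * `needle`, otherwise to `nomatch`. Returns the number of matching lines.
 * strstr() is a stand-in for a real regular-expression matcher, and lines
 * longer than the buffer are split into chunks (a limit of this sketch). */
long divert_lines(FILE *in, FILE *match, FILE *nomatch, const char *needle)
{
    char line[4096];
    long hits = 0;

    assert(in != NULL && match != NULL && nomatch != NULL && needle != NULL);
    while (fgets(line, sizeof line, in) != NULL) {
        if (strstr(line, needle) != NULL) {
            fputs(line, match);
            hits++;
        } else {
            fputs(line, nomatch);
        }
    }
    return hits;
}
```

Calling divert_lines(stdin, stdout, stderr, "pattern") would give the behaviour asked for with -d; swapping the two output arguments gives the suggested -r. The input is read once, which is the point of the request compared with running grep and grep -v separately.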
Path: utzoo!dciem!nrcaer!scs!spl1!laidbak!att!osu-cis!killer!tness7!bellcore! faline!thumper!ulysses!andante!alice!andrew From: and...@alice.UUCP Newsgroups: comp.unix.wizards,comp.unix.questions Subject: Re: grep replacement Message-ID: <7944@alice.UUCP> Date: 3 Jun 88 16:58:39 GMT Article-I.D.: alice.7944 References: <136@rubmez.UUCP> <449@happym.UUCP> Organization: AT&T Bell Laboratories, Murray Hill NJ Lines: 20 Summary: the right way to do context and where's the source? In article <4...@happym.UUCP>, k...@happym.UUCP writes: > From alice.UUCP?? Ha ha! That's Bell Labs! It will be in V10 > Unix, and none of us humans will see it until sysVr6, and only then > if we are lucky!! Context: the right thing to do is to write a context program that takes input looking like "filename:linenumber:goo" and prints whatever context you like. we can then take this crap out of grep and diff and make it generally available for use with programs like the C compiler and eqn and so on. It can also do the right thing with folding together nearby lines. At least one good first cut has been put on the net but a C program sounds easy enough to do. Source: the software i write is publicly available because it matters to me. it was a hassle but mk and fio are available to everybody for reasonable cost (< $125 commercial, nearly free educational). i am trying hard to do the same for the new grep. it will be in V10, it will be in plan9, and should be in SVR4 (the joint sun-at&t release).
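A rough sketch of the separate context program described above, assuming records of the form "filename:linenumber:text". The function names parse_hit and print_context are made up for illustration; file names containing ':' and lines longer than the buffer are not handled, and the folding together of nearby lines mentioned in the post is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Parse one "filename:linenumber:text" record. Copies the file name into
 * `file` (capacity fsz) and stores the line number in *lineno.
 * Returns 0 on success, -1 if the record is malformed. */
int parse_hit(const char *rec, char *file, size_t fsz, long *lineno)
{
    const char *colon = strchr(rec, ':');
    char *end;
    size_t len;

    assert(file != NULL && lineno != NULL && fsz > 0);
    if (colon == NULL || colon == rec)
        return -1;                      /* no file name field */
    len = (size_t)(colon - rec);
    if (len >= fsz)
        return -1;                      /* file name too long for buffer */
    *lineno = strtol(colon + 1, &end, 10);
    if (end == colon + 1 || *end != ':' || *lineno < 1)
        return -1;                      /* missing or bad line number */
    memcpy(file, rec, len);
    file[len] = '\0';
    return 0;
}

/* Print lines lineno-n .. lineno+n of `in` to `out`, each prefixed with
 * its line number. Returns the number of lines printed. */
int print_context(FILE *in, long lineno, long n, FILE *out)
{
    char line[4096];
    long cur = 0;
    int printed = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        cur++;
        if (cur > lineno + n)
            break;                      /* past the window, stop reading */
        if (cur >= lineno - n) {
            fprintf(out, "%ld:%s", cur, line);
            printed++;
        }
    }
    return printed;
}
```

A driver would read records from standard input, call parse_hit on each, fopen the named file and call print_context. Because it reopens the file by name, this design only works when the original input is a real file, which is the limitation argued over later in the thread.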
Path: utzoo!utgpu!water!watmath!clyde!bellcore!faline!thumper!ulysses!andante! mit-eddie!bbn!uwmcsd1!ig!agate!pasteur!ames!mailrus!nrl-cmf!cmcl2!brl-adm!brl-smoke!gwyn From: g...@brl-smoke.UUCP Newsgroups: comp.unix.wizards,comp.unix.questions Subject: Re: grep replacement Message-ID: <8012@brl-smoke.ARPA> Date: 4 Jun 88 21:28:19 GMT References: <136@rubmez.UUCP> <449@happym.UUCP> <7944@alice.UUCP> Reply-To: g...@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) Organization: Ballistic Research Lab (BRL), APG, MD. Lines: 6 In article <7...@alice.UUCP> and...@alice.UUCP writes: > the right thing to do is to write a context program that takes >input looking like "filename:linenumber:goo" and prints whatever context ... Heavens -- a tool user. I thought that only Neanderthals were still alive. I guess Bell Labs escaped the plague.
Path: utzoo!utgpu!water!watmath!clyde!bellcore!rutgers!ukma!husc6!bu-cs!bzs From: b...@bu-cs.BU.EDU (Barry Shein) Newsgroups: comp.unix.wizards,comp.unix.questions Subject: Re: grep replacement Message-ID: <23133@bu-cs.BU.EDU> Date: 5 Jun 88 01:37:09 GMT References: <136@rubmez.UUCP> <449@happym.UUCP> <7944@alice.UUCP> <8012@brl-smoke.ARPA> Organization: Boston U. Comp. Sci. Lines: 17 In-reply-to: gwyn@brl-smoke.ARPA's message of 4 Jun 88 21:28:19 GMT From: g...@brl-smoke.ARPA (Doug Gwyn ) >In article <7...@alice.UUCP> and...@alice.UUCP writes: >> the right thing to do is to write a context program that takes >>input looking like "filename:linenumber:goo" and prints whatever context ... > >Heavens -- a tool user. I thought that only Neanderthals were still alive. >I guess Bell Labs escaped the plague. Almost, unless the original input was produced by a pipeline, in which case this (putative) post-processor can't help unless you tee the mess to a temp file, yup, mess is the right word. Or maybe only us Neanderthals are interested in tools which work on pipes? Have they gone out of style? -Barry "Ulak of Org" Shein, Boston University
Path: utzoo!utgpu!water!watmath!clyde!bellcore!rutgers!mit-eddie!ll-xn! ames!nrl-cmf!cmcl2!brl-adm!brl-smoke!gwyn From: g...@brl-smoke.ARPA (Doug Gwyn ) Newsgroups: comp.unix.wizards,comp.unix.questions Subject: Re: grep replacement Message-ID: <8022@brl-smoke.ARPA> Date: 5 Jun 88 03:30:46 GMT References: <136@rubmez.UUCP> <449@happym.UUCP> <7944@alice.UUCP> <8012@brl-smoke.ARPA> <23133@bu-cs.BU.EDU> Reply-To: g...@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) Organization: Ballistic Research Lab (BRL), APG, MD. Lines: 22 In article <23...@bu-cs.BU.EDU> b...@bu-cs.BU.EDU (Barry Shein) writes: >Almost, unless the original input was produced by a pipeline, in which >case this (putative) post-processor can't help unless you tee the mess >to a temp file, yup, mess is the right word. The proposed tool would be very handy on ordinary text files, but it is hard to see a use for it on pipes. Or, getting back to context-grep, what good would it do to show context from a pipe? To do anything with the information (other than stare at it), you'd need to produce it again. There might be some use for context-{grep,diff,...} on a stream, but if a separate context tool will satisfy 99% of the need, as I think it would, as well as provide this capability for other commands "for free", it would be a better approach than hacking context into other commands. By the way, I hope the new grep when asked to always produce the filename will use "-" for stdin's name, and the context tool would also follow the same convention. Even though the Research systems have /dev/stdin, other sites may not, and anyway (as we've just seen) stdin isn't really a definite object.
Path: utzoo!utgpu!water!watmath!clyde!bellcore!rutgers!njin!princeton!udel! gatech!ncar!boulder!sunybcs!bingvaxu!leah!itsgw!sun.soe.clarkson.edu!nelson From: nel...@sun.soe.clarkson.edu (Russ Nelson) Newsgroups: comp.unix.wizards,comp.unix.questions Subject: Re: grep replacement Message-ID: <1030@sun.soe.clarkson.edu> Date: 5 Jun 88 03:38:55 GMT References: <136@rubmez.UUCP> <449@happym.UUCP> <7944@alice.UUCP> <8012@brl-smoke.ARPA> <23133@bu-cs.BU.EDU> Reply-To: nel...@sun.soe.clarkson.edu (Russ Nelson) Followup-To: comp.unix.wizards Organization: Clarkson University, Potsdam, NY Lines: 19 In article <23...@bu-cs.BU.EDU> b...@bu-cs.BU.EDU (Barry Shein) writes: >In article <7...@alice.UUCP> and...@alice.UUCP writes: >> the right thing to do is to write a context program that takes >>input looking like "filename:linenumber:goo" and prints whatever context ... > >Almost, unless the original input was produced by a pipeline, in which >case this (putative) post-processor can't help unless you tee the mess >to a temp file, yup, mess is the right word. How about: alias with_context tee >/tmp/$$ | $* | context -f/tmp/$$ or something like that? Does that offend tool-users sensibilities? *Do* Neanderthals have any sensibilities? -- signed char *reply-to-russ(int network) { /* Why can't BITNET go */ if(network == BITNET) return "NELSON@CLUTX"; /* domainish? */ else return "nel...@clutx.clarkson.edu"; }
Path: utzoo!dciem!nrcaer!scs!spl1!laidbak!att!mtunx!pacbell!lll-tis! helios.ee.lbl.gov!pasteur!ucbvax!decwrl!purdue!bu-cs!bzs From: b...@bu-cs.BU.EDU (Barry Shein) Newsgroups: comp.unix.wizards,comp.unix.questions Subject: Re: grep replacement Message-ID: <23142@bu-cs.BU.EDU> Date: 5 Jun 88 15:24:23 GMT Article-I.D.: bu-cs.23142 References: <136@rubmez.UUCP> <449@happym.UUCP> <7944@alice.UUCP> <8012@brl-smoke.ARPA> <23133@bu-cs.BU.EDU> <8022@brl-smoke.ARPA> Organization: Boston U. Comp. Sci. Lines: 82 In-reply-to: gwyn@brl-smoke.ARPA's message of 5 Jun 88 03:30:46 GMT From: g...@brl-smoke.ARPA (Doug Gwyn ) >In article <23...@bu-cs.BU.EDU> b...@bu-cs.BU.EDU (Barry Shein) writes: >>Almost, unless the original input was produced by a pipeline, in which >>case this (putative) post-processor can't help unless you tee the mess >>to a temp file, yup, mess is the right word. > >The proposed tool would be very handy on ordinary text files, >but it is hard to see a use for it on pipes. Or, getting back >to context-grep, what good would it do to show context from a >pipe? To do anything with the information (other than stare >at it), you'd need to produce it again. What else are context displays for except to stare at (or save in a file for later staring)? Are the resultant contexts often the input to other programs? (I know that 'patch' can take a context input but that's irrelevant, it hardly needs nor prefers a context diff to my knowledge, it's just being accommodating so humans can look at the context diff if something botches.) Actually, I can answer that in the context of the original suggestion. 
The motivation for a context comes in two major flavors: A) To stare at (the surrounding context gives a human some hint of the context in which the text appeared) B) Because the context really represents a multi-line (e.g.) record, such as pulling out every termcap or terminfo entry which contains some property but desiring the result to contain the entire multiline entry so it could be re-used to create a new file. In either case it's independent of whether the data is coming from a pipe (as it should be.) Its pipeness may be caused by something as simple as the data being grabbed across the network (rsh HOST cat foo | ...). Anyhow, I think it's bad in general to demand the reasoning of why a selection operator should work in a pipe, it just should (although I have presented a reasonable argument.) That's what tools are all about. >There might be some >use for context-{grep,diff,...} on a stream, but if a separate >context tool will satisfy 99% of the need, as I think it would, >as well as provide this capability for other commands "for free", >it would be a better approach than hacking context into other >commands. I think claiming that 99% of the use won't need pipes is unsound, it should just work with a pipe and any tool which requires passing the file name and then re-positioning the file just won't, it's violating a fundamental design concept by doing this (not that in rare cases this might not be necessary, but I don't see where this is one of them unless you use the circular argument of it "must be a separate program".) The reasoning for adding it to grep would be: a) Grep already has its finger on the context, it's right there (or could be), why re-process the entire stream/file just to get it printed? Grep found the context, why find it again? b) The context suggestions are merely logical generalizations of what grep already does, print the context of a match (it just happens to now limit that to exactly one line.) 
Nothing new conceptually is being added, only generalized. In fact, if I were to write this context-display tool my first thought would be to just use grep and try to emit unique patterns (a la TAGS files) which grep can then re-scan. But grep doesn't quite cut it w/o this little generalization. I think we're going in circles and this post-processor is nothing more than a special case of grep or perhaps cat or sed the way it was proposed (why not just generate sed commands to list the lines if that's all you want?) Anyhow, at least we're back to the technical issues and away from calling anyone who disagrees Neanderthals... -Barry Shein, Boston University
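The argument that grep "already has its finger on the context" can be illustrated with a single-pass filter that keeps the last few lines in a ring buffer, so it works on a pipe without a temp file. This is a sketch under assumptions, not anyone's actual implementation: grep_before, MAX_BEFORE and LINE_LEN are hypothetical names, strstr() stands in for pattern matching, and only leading context is shown (trailing context would need one extra counter).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_BEFORE 8     /* most leading-context lines this sketch keeps */
#define LINE_LEN   1024  /* longer lines are split; a limit of the sketch */

/* Copy to `out` every line of `in` that contains `needle`, preceded by up
 * to `before` of the lines just ahead of it. Reads the input exactly once.
 * Returns the number of matching lines. */
long grep_before(FILE *in, FILE *out, const char *needle, int before)
{
    char ring[MAX_BEFORE][LINE_LEN];
    char line[LINE_LEN];
    long seen = 0;      /* index of the line being processed */
    long pending = 0;   /* buffered lines not yet printed */
    long hits = 0;
    long i;

    assert(before >= 0 && before <= MAX_BEFORE);
    while (fgets(line, sizeof line, in) != NULL) {
        if (strstr(line, needle) != NULL) {
            /* flush the buffered context, oldest line first */
            for (i = pending; i > 0; i--)
                fputs(ring[(seen - i) % MAX_BEFORE], out);
            fputs(line, out);
            pending = 0;
            hits++;
        } else if (before > 0) {
            strcpy(ring[seen % MAX_BEFORE], line);
            if (pending < before)
                pending++;
        }
        seen++;
    }
    return hits;
}
```

Resetting pending after each match keeps overlapping contexts from printing the same line twice, which is the "folding together nearby lines" behaviour mentioned earlier in the thread.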
Path: utzoo!dciem!nrcaer!scs!spl1!laidbak!att!mtunx!pacbell!lll-tis! helios.ee.lbl.gov!pasteur!ucbvax!decwrl!purdue!bu-cs!bzs From: b...@bu-cs.BU.EDU (Barry Shein) Newsgroups: comp.unix.wizards,comp.unix.questions Subject: Re: grep replacement Message-ID: <23143@bu-cs.BU.EDU> Date: 5 Jun 88 15:28:40 GMT Article-I.D.: bu-cs.23143 References: <136@rubmez.UUCP> <449@happym.UUCP> <7944@alice.UUCP> <8012@brl-smoke.ARPA> <23133@bu-cs.BU.EDU> <1030@sun.soe.clarkson.edu> Organization: Boston U. Comp. Sci. Lines: 24 In-reply-to: nelson@sun.soe.clarkson.edu's message of 5 Jun 88 03:38:55 GMT From: nel...@sun.soe.clarkson.edu (Russ Nelson) [responding to me] >>Almost, unless the original input was produced by a pipeline, in which >>case this (putative) post-processor can't help unless you tee the mess >>to a temp file, yup, mess is the right word. > >How about: > >alias with_context tee >/tmp/$$ | $* | context -f/tmp/$$ > >or something like that? Does that offend tool-users sensibilities? >*Do* Neanderthals have any sensibilities? I don't understand, the way to avoid having to tee it into temp files is to tee it into temp files? Given that sort of solution we can eliminate pipes entirely from unix, was that your point? That pipes are fundamentally useless and can always be eliminated via use of intermediate temp files? It begs the question, burying it in a little syntactic sugar with an alias command doesn't solve the problem. -Barry Shein, Boston University
Path: utzoo!utgpu!water!watmath!clyde!bellcore!rutgers!gatech!ncar! oddjob!mimsy!chris From: ch...@mimsy.UUCP (Chris Torek) Newsgroups: comp.unix.wizards,comp.unix.questions Subject: Re: grep replacement and /dev/stdin Message-ID: <11821@mimsy.UUCP> Date: 5 Jun 88 21:41:14 GMT References: <136@rubmez.UUCP> <449@happym.UUCP> <7944@alice.UUCP> <8022@brl-smoke.ARPA> Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742 Lines: 22 In article <8...@brl-smoke.ARPA> g...@brl-smoke.ARPA (Doug Gwyn ) writes: >By the way, I hope the new grep when asked to always produce >the filename will use "-" for stdin's name, and the context >tool would also follow the same convention. Even though the >Research systems have /dev/stdin, other sites may not, Why not? We (ch...@mimsy.umd.edu and f...@mimsy.umd.edu) have posted an implementation at least twice. (Still could not get Berkeley to include it in 4.3-tahoe, alas; maybe 4.4....) The implementation was easy in 4.1BSD, and not hard in 4.2 and 4.3BSD, so it should be easy in any pre-networking Unix, and not hard in the networking Unices. (It only got harder because Fred wanted to open, not dup, the appropriate descriptor, and that is not possible for sockets or [presumably] streams. I believe the V8 /dev/stdin dups fd 0.) >and anyway (as we've just seen) stdin isn't really a definite >object. Neither is `-'. -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163) Domain: ch...@mimsy.umd.edu Path: uunet!mimsy!chris
Path: utzoo!utgpu!water!watmath!clyde!bellcore!rutgers!gatech!ncar! oddjob!mimsy!chris From: ch...@mimsy.UUCP (Chris Torek) Newsgroups: comp.unix.wizards,comp.unix.questions Subject: aside on patch and context diffs Message-ID: <11822@mimsy.UUCP> Date: 5 Jun 88 21:47:19 GMT References: <136@rubmez.UUCP> <449@happym.UUCP> <7944@alice.UUCP> <23142@bu-cs.BU.EDU> Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742 Lines: 13 In article <23...@bu-cs.BU.EDU> b...@bu-cs.BU.EDU (Barry Shein) writes: >... 'patch' can take a context input but that's irrelevant, it hardly >needs nor prefers a context diff to my knowledge, it's just being >accommodating so humans can look at the context diff if something >botches. There is another very good reason to use context diffs with patch, and that is that a one-line change (e.g., fixing a comment) can break a non-context diff too easily. (Also, I like to scan the diffs myself before applying them; it catches a number of bugs handily.) -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163) Domain: ch...@mimsy.umd.edu Path: uunet!mimsy!chris
Path: utzoo!attcan!uunet!lll-winken!lll-tis!ames!ll-xn!mit-eddie!uw-beaver! cornell!batcomputer!sun.soe.clarkson.edu!nelson From: nel...@sun.soe.clarkson.edu (Russ Nelson) Newsgroups: comp.unix.wizards,comp.unix.questions Subject: Re: grep replacement Message-ID: <1037@sun.soe.clarkson.edu> Date: 6 Jun 88 15:18:29 GMT References: <136@rubmez.UUCP> <449@happym.UUCP> <7944@alice.UUCP> <8012@brl-smoke.ARPA> <23133@bu-cs.BU.EDU> <1030@sun.soe.clarkson.edu> <23143@bu-cs.BU.EDU> Reply-To: nel...@sun.soe.clarkson.edu (Russ Nelson) Organization: Clarkson University, Potsdam, NY Lines: 20 In article <23...@bu-cs.BU.EDU> b...@bu-cs.BU.EDU (Barry Shein) writes: >From: nel...@sun.soe.clarkson.edu (Russ Nelson) [responding to me] >>alias with_context tee >/tmp/$$ | $* | context -f/tmp/$$ >I don't understand, the way to avoid having to tee it into temp >files is to tee it into temp files? No. There is no way to avoid teeing it into a temp file. Such is life with pipes. If you want context then you need to save it. My alias is perfectly consistent with the tool-using philosophy. Yes, it's a kludge, but that's the only way to save context in a single-stream pipe philosophy. I remember reading a paper in which multiple streams going hither and yon were proposed, but the syntax was gothic at best. I like being able to say this: bsd: sort | with_context grep rfoo | more sysv: sort | with_context grep foo | more Because sysv doesn't have the r* utilities, of course :-) -- signed char *reply-to-russ(int network) { /* Why can't BITNET go */ if(network == BITNET) return "NELSON@CLUTX"; /* domainish? */ else return "nel...@clutx.clarkson.edu"; }
Path: utzoo!attcan!uunet!seismo!rick From: r...@seismo.CSS.GOV (Rick Adams) Newsgroups: comp.unix.questions Subject: Re: grep replacement Summary: -h Message-ID: <44366@beno.seismo.CSS.GOV> Date: 6 Jun 88 20:40:05 GMT References: <7882@alice.UUCP> <2450011@hpsal2.HP.COM> <54818@sun.uucp> <10264@ncc.Nexus.CA> Organization: Center for Seismic Studies, Arlington, VA Lines: 10 7th Edition grep had a -h flag to not print the filenames on a grep. 4BSD still has a -h flag. System 5 doesn't have a -h flag. (Another example of how System 5 is superior to BSD... and V7...) ---rick
Path: utzoo!attcan!uunet!husc6!bu-cs!tower From: to...@bu-cs.BU.EDU (Leonard H. Tower Jr.) Newsgroups: comp.unix.wizards,comp.unix.questions Subject: Re: grep replacement Summary: try GNU Emacs' M-x grep RET Message-ID: <23158@bu-cs.BU.EDU> Date: 6 Jun 88 21:44:34 GMT References: <7882@alice.UUCP> <5630@umn-cs.cs.umn.edu> <6866@elroy.Jpl.Nasa.Gov> Reply-To: to...@bu-it.bu.edu (Leonard H. Tower Jr.) Followup-To: comp.unix.wizards Organization: Distributed Systems Group, Boston University, 111 Cummington Street, Boston, MA 02215, USA +1 (617) 353-2780 Lines: 24 X-Home: 36 Porter Street, Somerville, MA 02143, USA +1 (617) 623-7739 X-UUCP-Path: ..!harvard!bu-cs!tower In article <6...@elroy.Jpl.Nasa.Gov> a...@cogswell.Jpl.Nasa.Gov (Alan S. Mazer) writes: | |One thing I would _love_ is to be able to find the context of what I've |found, for example, to find the two (n?) surrounding lines. I have wanted |to do this many times and there is no good way. GNU Emacs has a command that will walk you through each match of a grep run and show you the context around it: grep: Run grep, with user-specified args, and collect output in a buffer. While grep runs asynchronously, you can use the C-x ` command to find the text that grep hits refer to. M-x grep RET to invoke it. I suspect other Unix Emacs have a similar feature. 
Information on how to obtain GNU Emacs, other GNU software, or the GNU project itself is available from: g...@prep.ai.mit.edu enjoy -len
Path: utzoo!utgpu!water!watmath!clyde!att!mtunx!rutgers!gatech!ncar! oddjob!mimsy!eneevax!umd5!brl-adm!brl-smoke!gwyn From: g...@brl-smoke.ARPA (Doug Gwyn ) Newsgroups: comp.unix.questions Subject: Re: grep replacement Message-ID: <8032@brl-smoke.ARPA> Date: 7 Jun 88 08:59:56 GMT References: <7882@alice.UUCP> <2450011@hpsal2.HP.COM> <54818@sun.uucp> <10264@ncc.Nexus.CA> <44366@beno.seismo.CSS.GOV> Reply-To: g...@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) Organization: Ballistic Research Lab (BRL), APG, MD. Lines: 10 In article <44...@beno.seismo.CSS.GOV> r...@seismo.CSS.GOV (Rick Adams) writes: >7th Edition grep had a -h flag to not print the filenames on a grep. >4BSD still has a -h flag. >System 5 doesn't have a -h flag. >(Another example of how System 5 is superior to BSD... and V7...) Maybe the AT&T folks figured that their customers were smart enough to type "cat files ... | grep". I've never had the need for a -h flag, but I sure would like for the -H (ALWAYS print filename) option to be the default instead of the current variable algorithm.
Path: utzoo!attcan!uunet!lll-winken!lll-tis!ames!pasteur!ucbvax!decwrl! labrea!rutgers!bellcore!faline!thumper!ulysses!andante!alice!andrew From: and...@alice.UUCP Newsgroups: comp.unix.wizards,comp.unix.questions Subject: Re: grep replacement Summary: responses to the request for comments Message-ID: <7962@alice.UUCP> Date: 10 Jun 88 18:34:00 GMT Organization: AT&T Bell Laboratories, Murray Hill NJ Lines: 143

The following is a summary of the somewhat plausible ideas suggested for the new grep. I thank leo de witt particularly and others for clearing up misconceptions and pointing out (correctly) that existing tools like sed already do (or at least nearly do) what some people asked for. The following points are in no particular order and no slight is intended by my presentation. After that, I summarise the current flags.

1) named character classes, e.g. \alpha, \digit.
    i think this is a hokey idea and dismissed it as unnecessary crud but then found out it is part of the proposed regular expression stuff for posix. it may creep in but i hope not.

2) matching multi-line patterns (\n as part of pattern)
    this actually requires a lot of infrastructure support and thought. i prefer to leave that to other more powerful programs such as sam.

3) print lines with context.
    the second most requested feature but i'm not doing it. this is just the job for sed. to be consistent, we just took the context crap out of diff too. this is actually reasonable; showing context is the job for a separate tool (pipeline difficulties apart).

4) print one (first matching) line and go onto the next file.
    most of the justification for this seemed to be scanning mail and/or netnews articles for the subject line; neither of which gets any sympathy from me. but it is easy to do and doesn't add an option; we add a new option (say -1) and remove -s. -1 is just like -s except it prints the matching line. then the old grep -s pattern is now grep -1 pattern > /dev/null and within epsilon of being as efficient.

5) divert matching lines onto one fd, nonmatching onto another.
    sorry, run grep twice.

6) print the Nth occurrence of the pattern (N is number or list).
    it may be possible to think of a real reason for this (i couldn't) but the answer is no.

7) -w (pattern matches only words)
    the most requested feature. well, it turns out that -x (exact) is there because doug mcilroy wanted to match words against a dictionary. it seems to have no other use. Therefore, -x is being dropped (after all, it only costs a quick edit to do it yourself) and is replaced by -w == (^|[^_a-zA-Z0-9])pattern($|[^_a-zA-Z0-9]).

8) grep should work on binary files and kanji.
    that it should work on kanji or any character set is a given (at least, any character set supported by the system V international character set stuff). binary files will work too modulo the following constraint: lines (between \n's) have to fit in a buffer (current size 64K). violations are an error (exit 2).

9) -b has bogus units.
    agreed. -b now is in bytes.

10) -B (add an ^ to the front of the given pattern, analogous to -x and -w)
    -x (and -w) is enough. sorry.

11) recursively descend through argument lists
    no. find | xargs is going to have to do.

12) read filenames on standard input
    no. xargs will have to do.

13) should be as fast as bm.
    no worries. in fact, our egrep is 3x faster than bm. i intend to be competitive with woods' egrep. it should also be as fast as fgrep for multiple keywords. the new grep incorporates boyer-moore as a degenerate case of Commentz-Walter, a faster replacement for the fgrep algorithm.

14) -lv (files that don't have any matching lines)
    -lv means print names of files that have any nonmatching lines (useful, say, for checking input syntax). -L will mean print names of files without selected lines.

15) print the part of the line that matched.
    no. that is available at the subroutine level.

16) compatibility with old grep/fgrep/egrep.
    the current name for the new command is gre (aho chose it). after a while, it will become our grep. there will be a -G flag to take patterns a la old grep and a -F to take patterns a la fgrep (that is, no metacharacters except \n == |). gre is close enough to egrep to not matter.

17) fewer limits.
    so far, gre will have only one limit, a line length of 64K. (NO, i am not supporting arbitrary length lines (yet)!) we foresee no need for any other limit. for example, the current gre acts like fgrep. it is 4 times faster than fgrep and has no limits; we can gre -f /usr/dict/words (72K words, 600KB).

18) recognise file types (ignore binaries, unpack packed files etc).
    get real. go back to your macintosh or pyramid. gre will just grep files, not understand them.

19) handle patterns occurring multiple times per line
    this is ill-defined (how many times does aaaa occur in a line of 20 'a's? in order of decreasing correctness, the answers are >=1, 17, 5). For the cases people mentioned (words), pipe it thru tr to put the words one per line.

20) why use \{\} instead of \(\)?
    this is not yet resolved (mcilroy&ritchie vs aho&pike&me). grouping is an orthogonal issue to subexpressions so why use the same parentheses? the latest suggestion (by ritchie) is to allow both \(\) and \{\} as grouping operators but the \3 would only count one type (say \(\)). this would be much better for complicated patterns with much grouping.

21) subroutine versions of the pattern matching stuff.
    in a deep sense, the new grep will have no pattern matching code in it. all the pattern matching code will be in libc with a uniform interface. the boyer-moore and commentz-walter routines have been done. the other two are egrep and back-referencing egrep. lastly, regexp will be reimplemented.

22) support a filename of - to mean standard input.
    a unix without /dev/stdin is largely bogus but as a sop to the poor barstards having to work on BSD, gre will support - as stdin (at least for a while).

Thus, the current proposal is the following flags. it would take a GOOD argument to change my mind on this list (unless it is to get rid of a flag).

    -f file    pattern is (`cat file`)
    -v         nonmatching lines are 'selected'
    -i         ignore alphabetic case
    -n         print line number
    -c         print count of selected lines only
    -l         print filenames which have a selected line
    -L         print filenames which do not have a selected line
    -b         print byte offset of line begin
    -h         do not print filenames in front of matching lines
    -H         always print filenames in front of matching lines
    -w         pattern is (^|[^_a-zA-Z0-9])pattern($|[^_a-zA-Z0-9])
    -1         print only first selected line per file
    -e expr    use expr as the pattern

Andrew Hume
research!andrew
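The -w expansion in the summary above can be tried out with the POSIX regcomp/regexec routines. This is a sketch, not gre's code: match_word is a hypothetical name, and the extra parentheses around the user's pattern are an assumption added here so that alternation such as foo|bar stays inside the word boundaries (the post's expansion does not show them).

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if `line` contains `pattern` as a whole word, 0 if it does
 * not, and -1 on error (bad pattern or out of memory). The pattern is
 * wrapped as (^|[^_a-zA-Z0-9])(pattern)($|[^_a-zA-Z0-9]). */
int match_word(const char *pattern, const char *line)
{
    static const char pre[]  = "(^|[^_a-zA-Z0-9])(";
    static const char post[] = ")($|[^_a-zA-Z0-9])";
    size_t len;
    char *wrapped;
    regex_t re;
    int rc;

    assert(pattern != NULL && line != NULL);
    len = strlen(pre) + strlen(pattern) + strlen(post) + 1;
    wrapped = malloc(len);
    if (wrapped == NULL)
        return -1;
    snprintf(wrapped, len, "%s%s%s", pre, pattern, post);

    rc = regcomp(&re, wrapped, REG_EXTENDED | REG_NOSUB);
    free(wrapped);
    if (rc != 0)
        return -1;                  /* pattern did not compile */
    rc = regexec(&re, line, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

With this wrapper, "foo" is found in "a foo b" and in "foo" alone, but not in "food" or "x_foo", since letters, digits and underscore all count as word characters in the expansion.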
Path: utzoo!attcan!uunet!seismo!keith From: ke...@seismo.CSS.GOV (Keith Bostic) Newsgroups: comp.unix.wizards,comp.unix.questions Subject: Re: grep replacement Message-ID: <44370@beno.seismo.CSS.GOV> Date: 13 Jun 88 19:12:20 GMT References: <7962@alice.UUCP> Organization: Center for Seismic Studies, Arlington, VA Lines: 31 In article <7...@alice.UUCP>, and...@alice.UUCP writes: > 22) support a filename of - to mean standard input. > a unix without /dev/stdin is largely bogus but as a sop to the poor > barstards having to work on BSD, gre will support - > as stdin (at least for a while). > > Andrew Hume > research!andrew A few comments: -- As far as I'm aware, V9 is the only system that has "/dev/stdin" at the moment. For those who haven't heard of it, V9 is a research version of UN*X developed and in use at the Computing Science Research Center, a part of AT&T Bell Laboratories, and available to a small number of universities. It was preceded by V8, which, interestingly enough, was built on top of 4.1BSD. -- System V does not support "/dev/stdin". -- The next full release of BSD will contain "/dev/stdin" and friends. It is not part of the 4.3-tahoe release because it requires changes to stdio. I do not expect, however, commands that currently support the "-" syntax to change, for compatibility reasons. V9 itself continues to support such commands. To sum up, let's try and keep this, if not actually constructive, at least bearing some distant relationship to the facts. Keith Bostic
Path: utzoo!attcan!uunet!husc6!uwvax!oddjob!mimsy!chris From: ch...@mimsy.UUCP (Chris Torek) Newsgroups: comp.unix.wizards,comp.unix.questions Subject: Re: grep replacement Message-ID: <11957@mimsy.UUCP> Date: 14 Jun 88 03:54:41 GMT References: <7962@alice.UUCP> <44370@beno.seismo.CSS.GOV> Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742 Lines: 38

In article <44...@beno.seismo.CSS.GOV> ke...@seismo.CSS.GOV [at seismo?!?] (Keith Bostic) writes:
> -- The next full release of BSD will contain "/dev/stdin" and friends.
>    It is not part of the 4.3-tahoe release because it requires changes
>    to stdio.

Well, only because freopen("/dev/stdin", "r", stdin) unexpectedly fails: it closes fd 0 before attempting to open /dev/stdin, which means that stdin is gone before it can grab it again. When I `fixed' this here it broke /usr/ucb/head and I had to fix the fix! The sequence needed is messy:

    old = fileno(fp);
    new = open(...);
    if (new < 0) {
        close(old);         /* maybe it was EMFILE */
        new = open(...);    /* (could test errno too) */
        if (new < 0)
            return error;
    }
    if (new != old) {
        if (dup2(new, old) >= 0)    /* move it back */
            close(new);
        else {
            close(old);
            fileno(fp) = new;
        }
    }

Not using dup2 means that freopen(stderr) might make fileno(stderr) something other than 2, which breaks at least perror().
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: ch...@mimsy.umd.edu Path: uunet!mimsy!chris
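The open-before-close sequence in the post above can be written as a standalone function over plain file descriptors. This is a sketch that follows the posted steps, not the BSD stdio change itself: reopen_fd is a hypothetical name, the stdio bookkeeping (updating the FILE's descriptor) is left to the caller, and flags are assumed not to include O_CREAT.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Open `path` and make it available on descriptor `old`, opening the new
 * file BEFORE closing the old one so that a path such as /dev/stdin can
 * still refer to it. Returns the descriptor that now refers to `path`
 * (`old` when dup2 succeeded, otherwise the new one), or -1 on failure. */
int reopen_fd(int old, const char *path, int flags)
{
    int nfd;

    assert(old >= 0 && path != NULL);
    nfd = open(path, flags);
    if (nfd < 0) {
        close(old);                 /* maybe we were out of descriptors */
        nfd = open(path, flags);
        if (nfd < 0)
            return -1;
    }
    if (nfd != old) {
        if (dup2(nfd, old) >= 0) {  /* move it back onto the old number */
            close(nfd);
            return old;
        }
        close(old);                 /* could not move it; caller uses nfd */
    }
    return nfd;
}
```

Returning the descriptor lets the caller see the case the post warns about, where the stream ends up on a number other than the original (for stderr, anything other than 2).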