Newsgroups: comp.os.linux.development
Path: bga.com!news.sprintlink.net!hookup!swrinde!gatech!newsxfer.itd.umich.edu!isclient.merit.edu!msuinfo!harbinger.cc.monash.edu.au!yarrina.connect.com.au!warrane.connect.com.au!kralizec.zeta.org.au!socs.uts.edu.au!metro!mama.research.canon.oz.au!luke
From: lu...@research.canon.oz.au (Luke Kendall)
Subject: Linux seems to perform terribly for large directories
Message-ID: <Cs76BL.2F2@research.canon.oz.au>
Lines: 50
Sender: ne...@research.canon.oz.au
Nntp-Posting-Host: tosh
Organization: Canon Information Systems Research Australia
Date: Thu, 30 Jun 1994 06:35:45 GMT

I have a strong suspicion that Linux has a problem with large
directories.

An early pointer to this was that doing an `ls' on a directory with
(say) 5000 files took several minutes to begin producing output.
This is _far_ slower than on other versions of Unix.

The 2nd indicator was what happened when I used cpio to read a large
number of files from a floppy containing files from the Linux
newsgroups (in particular, the voluminous comp.os.linux.help).

(My pattern of use was to dump a whole lot of files to my home
machine, and every now and then read them and delete uninteresting
articles.  I started loading from around news item 20000; I'm now up
in the early 40,000's.  So about 20,000 files have been added and
removed.)

There was 40Mb free at the time; the hard disc had recently been
filled to within 500kb of full, then I deleted lots of junk.

So, this time, reading 761 files from the floppy (2295 blocks,
i.e. 1.15Mb), the elapsed time was something like 7 minutes!
Normally reading a floppy like this takes between 1 and 2 minutes.

I timed a 2nd, similar floppy of files.  Elapsed time was just over
4 minutes: 20 secs user, 117 secs system, 49% of CPU.

I believe that the process was swift until it had read a fixed amount
from the floppy into internal memory, and then slowed down
dramatically when writing the files out to the hard disc (judging by
the screen output and the drive access light).

Just listing the files on the floppy was as fast as normal.  Reading
a floppy into /tmp was normal speed.  Moving the files into the right
directory took only seconds.  Reading a floppy into a directory that
had contained far fewer files also took only a reasonable amount of
time.

Processes running were an xview X session with a performance monitor
and clock in the background (as normal).  A ps of the cpio process
about 2/3 or 3/4 of the way through showed it had used 1m30s of CPU
time.

The processor is a 486DX33 with a VLB controller and a 340Mb Western
Digital IDE drive, 8Mb of memory, running Linux 0.99.13.  The
floppies are 1.44Mb.

So: what gives?  Have others noticed this problem?

	luke
--
Luke Kendall, Senior Software Engineer.       | Net:   lu...@research.canon.oz.au
Canon Information Systems Research Australia  | Phone: +61 2 805 2982
P.O. Box 313 North Ryde, NSW, Australia 2113  | Fax:   +61 2 805 2929
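
A rough way to reproduce the symptom described above is to create
files in a single directory in batches and time each batch: if every
create has to scan all the existing entries, later batches take
noticeably longer than early ones.  This is only a minimal C sketch,
not something from the post itself; the `bigdir' directory name,
batch size, and file count are arbitrary choices.

    /* Rough sketch: create files in one directory in batches and
     * time each batch.  If each create scans all existing entries,
     * later batches will take noticeably longer than early ones. */
    #define _POSIX_C_SOURCE 200809L
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <time.h>
    #include <unistd.h>

    static double now(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
    }

    int main(void)
    {
        const int batches = 20, per_batch = 1000;   /* arbitrary sizes */
        char name[64];

        mkdir("bigdir", 0755);                      /* ignore EEXIST */

        for (int b = 0; b < batches; b++) {
            double t0 = now();
            for (int i = 0; i < per_batch; i++) {
                snprintf(name, sizeof name, "bigdir/f%06d",
                         b * per_batch + i);
                int fd = open(name, O_CREAT | O_WRONLY, 0644);
                if (fd < 0) { perror("open"); return 1; }
                close(fd);
            }
            printf("batch %2d (%5d files so far): %.2f s\n",
                   b, (b + 1) * per_batch, now() - t0);
        }
        return 0;
    }

If per-file cost is roughly constant, every batch prints a similar
time; a steady climb from batch to batch is the linear-scan behaviour
the post is complaining about.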
Newsgroups: comp.os.linux.development
Path: bga.com!news.sprintlink.net!hookup!yeshua.marcam.com!MathWorks.Com!europa.eng.gtefsd.com!emory!swrinde!pipex!uknet!festival!dcs.ed.ac.uk!sct
From: s...@dcs.ed.ac.uk (Stephen Tweedie)
Subject: [ANSWER] Linux seems to perform terribly for large directories
In-Reply-To: kjb@cs.vu.nl's message of Tue, 5 Jul 1994 09:31:29 GMT
Message-ID: <SCT.94Jul8143930@ascrib.dcs.ed.ac.uk>
Sender: cn...@dcs.ed.ac.uk (UseNet News Admin)
Organization: Department of Computer Science, University of Edinburgh
References: <Cs76BL.2F2@research.canon.oz.au> <Cs94z0.s9@pe1chl.ampr.org> <1994Jul4.140054.10696@uk.ac.swan.pyr> <2vb2kk$hvs@wombat.cssc-syd.tansu.com.au> <CsGnsH.KC8@cs.vu.nl>
Date: Fri, 8 Jul 1994 13:39:29 GMT
Lines: 32

Hi,

I just thought I'd mention that this is very much a work-in-progress
topic.  At the Heidelberg conference, Ted Ts'o and I discussed this
at some length, and Ted even found a glaring bug in the existing
readdir() system call (one which affects all long-filename
filesystems, by the way).

So, I've currently got tentative bug-fix patches and performance
improvements for the directory handling code.  The bug-fix should
also mean that the ext2fs directory cache may now be re-enabled.

There are two major performance enhancements.  First, readdir() now
returns more than one directory entry if requested (up to a whole
block-full, in fact).  This requires library support too, by the way,
but will not require any applications to be recompiled: *all*
applications use the readdir(3) from the library, not readdir(2).

Secondly, there is the directory cache.  This is a major win for
things like "ls -l", where an application does repeated
readdir()/stat() calls.  Up to 128 directory names resolved by
readdir() and lookup() will be cached, and these names may then be
referenced without scanning the entire directory again.

Once they are tested, these patches should be in a kernel soon.
Watch this space... :-)

Cheers,
 Stephen.
---
Stephen Tweedie <s...@dcs.ed.ac.uk>   (JANET: sct@uk.ac.ed.dcs)
Department of Computer Science, Edinburgh University, Scotland.
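
To make the library-side change concrete, here is a minimal sketch of
the buffering idea only, not Stephen's actual patch: it uses the
modern getdents64(2) system call as a stand-in for the multi-entry
readdir(2) described above, and a hypothetical my_readdir() in place
of the real readdir(3).  One system call pulls in roughly a block of
entries, and the application then consumes them one at a time without
going back to the kernel.

    /* Sketch of a buffered readdir(3): one system call fetches a
     * block-full of directory entries, and the library hands them
     * back to the application one at a time.  getdents64(2) stands
     * in for the multi-entry readdir(2) described above. */
    #define _GNU_SOURCE
    #include <dirent.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    struct my_dir {
        int  fd;                /* open directory file descriptor */
        char buf[4096];         /* roughly one block of entries    */
        int  nbytes;            /* bytes valid in buf              */
        int  offset;            /* read position within buf        */
    };

    /* Return the next entry, refilling the buffer with a single
     * system call only when it runs dry. */
    static struct dirent64 *my_readdir(struct my_dir *d)
    {
        if (d->offset >= d->nbytes) {
            d->nbytes = syscall(SYS_getdents64, d->fd,
                                d->buf, sizeof d->buf);
            d->offset = 0;
            if (d->nbytes <= 0)
                return NULL;    /* end of directory or error */
        }
        struct dirent64 *ent = (struct dirent64 *)(d->buf + d->offset);
        d->offset += ent->d_reclen;
        return ent;
    }

    int main(int argc, char **argv)
    {
        struct my_dir d = { .fd = open(argc > 1 ? argv[1] : ".",
                                       O_RDONLY | O_DIRECTORY) };
        if (d.fd < 0) { perror("open"); return 1; }

        struct dirent64 *ent;
        while ((ent = my_readdir(&d)) != NULL)
            puts(ent->d_name);

        close(d.fd);
        return 0;
    }

With this arrangement, reading a directory of N entries costs on the
order of N divided by the number of entries per block in system
calls, rather than N.  The 128-entry name cache mentioned above
separately attacks the repeated lookups done by the stat() calls in
"ls -l".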