Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!news.tele.dk!
small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
X-Authentication-Warning: penguin.transmeta.com: torvalds owned process doing -bs
Original-Date: 	Sat, 15 Dec 2001 16:13:12 -0800 (PST)
From: Linus Torvalds <torva...@transmeta.com>
To: Davide Libenzi <davi...@xmailserver.org>
cc: Kernel Mailing List <linux-ker...@vger.kernel.org>
Subject: Re: Just a second ...
In-Reply-To: <Pine.LNX.4.40.0112151552070.1560-100000@blue1.dev.mcafeelabs.com>
Original-Message-ID: <Pine.LNX.4.33.0112151603180.4493-100000@penguin.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sun, 16 Dec 2001 00:15:56 GMT
Message-ID: <fa.oba7eev.jko7j9@ifi.uio.no>
References: <fa.luasppv.unoa8r@ifi.uio.no>
Lines: 46


On Sat, 15 Dec 2001, Davide Libenzi wrote:
>
> when you find 10 secs free in your spare time i really would like to know
> the reason ( if any ) of your abstention from any scheduler discussion.
> No hurry, just a few lines out of lkml.

I just don't find it very interesting. The scheduler is about 100 lines
out of however-many-million (3.8 at last count), and doesn't even impact
most normal performance very much.

We'll clearly do per-CPU runqueues or something some day. And that worries
me not one whit, compared to things like VM and block device layer ;)

I know a lot of people think schedulers are important, and the operating
system theory about them is overflowing - it's one of those things that
people can argue about forever, yet is conceptually simple enough that
people aren't afraid of it. I just personally never found it to be a major
issue.

Let's face it - the current scheduler has the same old basic structure
that it did almost 10 years ago, and yes, it's not optimal, but there
really aren't that many real-world loads where people really care. I'm
sorry, but it's true.

And you have to realize that there are not very many things that have
aged as well as the scheduler. Which is just another proof that scheduling
is easy.

We've rewritten the VM several times in the last ten years, and I expect
it will be changed several more times in the next few years. Within five
years we'll almost certainly have to make the current three-level page
tables be four levels etc.

In comparison to those kinds of issues, I suspect that making the
scheduler use per-CPU queues together with some inter-CPU load balancing
logic is probably _trivial_. Patches already exist, and I don't feel that
people can screw up the few hundred lines too badly.
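
For readers unfamiliar with the idea, per-CPU queues plus inter-CPU load
balancing can be sketched roughly as below. This is only an illustration
under stated assumptions, not any of the patches referred to: NR_CPUS,
struct runqueue and balance() are hypothetical names, and each queue is
reduced to a count of runnable tasks with no locking shown.

```c
#include <assert.h>

#define NR_CPUS 4   /* assumed CPU count for the sketch */

/* One queue per CPU. In a real kernel each would carry its own lock
 * and a task list; here it is reduced to a runnable-task count. */
struct runqueue {
    int nr_running;
};

/* Move tasks from the busiest queue to the idlest one until the two
 * differ by at most one. Returns the number of tasks migrated. */
int balance(struct runqueue rq[NR_CPUS])
{
    int busiest = 0, idlest = 0, moved;

    for (int cpu = 1; cpu < NR_CPUS; cpu++) {
        if (rq[cpu].nr_running > rq[busiest].nr_running)
            busiest = cpu;
        if (rq[cpu].nr_running < rq[idlest].nr_running)
            idlest = cpu;
    }
    moved = (rq[busiest].nr_running - rq[idlest].nr_running) / 2;
    assert(moved >= 0);
    rq[busiest].nr_running -= moved;
    rq[idlest].nr_running += moved;
    return moved;
}
```

Starting from queue lengths 8, 0, 2, 2, one call migrates four tasks from
CPU 0 to CPU 1. Choosing which tasks to move, and when, is the part the
sketch leaves out.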

		Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Mon, 17 Dec 2001 14:48:43 -0800 (PST)
From: Davide Libenzi <davi...@xmailserver.org>
X-X-Sender: dav...@blue1.dev.mcafeelabs.com
To: Linus Torvalds <torva...@transmeta.com>
cc: Kernel Mailing List <linux-ker...@vger.kernel.org>
Subject: Scheduler ( was: Just a second ) ...
In-Reply-To: <Pine.LNX.4.33.0112151603180.4493-100000@penguin.transmeta.com>
Original-Message-ID: <Pine.LNX.4.40.0112151934410.1014-100000@blue1.dev.mcafeelabs.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Mon, 17 Dec 2001 22:49:20 GMT
Message-ID: <fa.m0aeo2v.sn6a0p@ifi.uio.no>
References: <fa.oba7eev.jko7j9@ifi.uio.no>
Lines: 117

On Sat, 15 Dec 2001, Linus Torvalds wrote:

> I just don't find it very interesting. The scheduler is about 100 lines
> out of however-many-million (3.8 at last count), and doesn't even impact
> most normal performance very much.

Linus, sharing a queue and a lock between CPUs for a "thing" accessed as
frequently ( schedule()s + wakeup()s ) as the scheduler is quite ugly,
and it's not much fun. And it's not only performance wise, it's
more design wise.


> We'll clearly do per-CPU runqueues or something some day. And that worries
> me not one whit, compared to things like VM and block device layer ;)

Why not 2.5.x ?


> I know a lot of people think schedulers are important, and the operating
> system theory about them is overflowing - ...

It's no more important than anything else, it's just one of the remaining
scalability/design issues. No, it's not more important than VM but
there're enough people working on VM. And the hope is to get the scheduler
right with an ETA of less than 10 years.


> it's one of those things that people can argue about forever, ...

Yes, i suppose that if something is not addressed, it'll come up again and
again.


> yet is conceptually simple enough that people aren't afraid of it.
         ^^^^^^^^^^^^^^^^^^^

1, ...


> Let's face it - the current scheduler has the same old basic structure
> that it did almost 10 years ago, and yes, it's not optimal, but there
> really aren't that many real-world loads where people really care. I'm
> sorry, but it's true.

Moving to 4, 8, 16 CPUs, the run queue load, which would be thought insane
for UP systems, starts to matter. Not to mention cache line effects.
Not to mention the way the current scheduler moves tasks around CPUs.
Linus, it's not only about performance benchmarks with 2451 processes
jumping on the run queue, which i could not care less about, it's just a
sum of sucky "things" that add up to an issue. You can look at it as a
cosmetic/design patch more than a strict performance patch if you like.


> And you have to realize that there are not very many things that have
> aged as well as the scheduler. Which is just another proof that
> scheduling is easy.
  ^^^^^^^^^^^^^^^^^^

..., 2, ...


> We've rewritten the VM several times in the last ten years, and I expect
> it will be changed several more times in the next few years. Within five
> years we'll almost certainly have to make the current three-level page
> tables be four levels etc.
>
> In comparison to those kinds of issues, I suspect that making the
> scheduler use per-CPU queues together with some inter-CPU load balancing
> logic is probably _trivial_.
                    ^^^^^^^^^

... 3, there should be a subliminal message inside but i'm not able to
get it ;)
I would not call selecting the right task to run in an SMP system trivial.
The difference between selecting the right task to run and selecting the
right page to swap is that if you screw up with the task the system
impact is lower. But, if you screw up, your design will suck in both cases.
Anyway, given that 1) real men do VM ( i thought they didn't eat quiche )
and easy-coders do scheduling 2) the scheduler is easy/trivial and you do
not seem interested in working on it 3) whoever is doing the scheduler
cannot screw things up, why don't you give the responsibility to, for example,
Alan or Ingo so that a discussion ( obviously easy ) about the future
of the scheduler can be started w/out hurting real men doing VM ?
I'm talking about, you know, the kind of discussion where people bring
solutions, code and numbers, talk about the good and bad of certain
approaches, and finally come up ( after some sane fight ) with a more
or less widely approved solution. The scheduler, besides the real men
crap, is one of the basic components of an OS, and having a public
debate, i'm not saying every month nor every year, but at least
once every four years ( this is the last i remember ) could be a nice thing.
And no, if you do not give someone you trust the "power" to
redesign the scheduler, no scheduler discussions will start, simply
because people don't like the result of a debate to be dumped to /dev/null.


> Patches already exist, and I don't feel that people can screw up the few
> hundred lines too badly.

Can you point me to a Linux patch that implements _real_independent_
( queue and locking ) CPU schedulers with global balancing policy ?
I searched very hard but i did not find anything.




Yours faithfully,
Jimmy Scheduler






Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
X-Authentication-Warning: penguin.transmeta.com: torvalds owned process doing -bs
Original-Date: 	Mon, 17 Dec 2001 14:53:40 -0800 (PST)
From: Linus Torvalds <torva...@transmeta.com>
To: Davide Libenzi <davi...@xmailserver.org>
cc: Kernel Mailing List <linux-ker...@vger.kernel.org>
Subject: Re: Scheduler ( was: Just a second ) ...
In-Reply-To: <Pine.LNX.4.40.0112151934410.1014-100000@blue1.dev.mcafeelabs.com>
Original-Message-ID: <Pine.LNX.4.33.0112171449520.1854-100000@penguin.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Mon, 17 Dec 2001 22:57:09 GMT
Message-ID: <fa.obpvefv.j4g5j6@ifi.uio.no>
References: <fa.m0aeo2v.sn6a0p@ifi.uio.no>
Lines: 44


On Mon, 17 Dec 2001, Davide Libenzi wrote:

> On Sat, 15 Dec 2001, Linus Torvalds wrote:
>
> > I just don't find it very interesting. The scheduler is about 100 lines
> > out of however-many-million (3.8 at last count), and doesn't even impact
> > most normal performance very much.
>
> Linus, sharing a queue and a lock between CPUs for a "thing" accessed as
> frequently ( schedule()s + wakeup()s ) as the scheduler is quite ugly,
> and it's not much fun. And it's not only performance wise, it's
> more design wise.

"Design wise" is highly overrated.

Simplicity is _much_ more important, if something commonly is only done a
few hundred times a second. Locking overhead is basically zero for that
case.
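
The arithmetic behind that claim can be sketched as follows. The cost per
lock acquisition is an assumed input, not a measured figure from this
thread, the function name is made up for illustration, and the model
ignores contention between CPUs, which is the case the other side of the
argument cares about.

```c
#include <assert.h>

/* Fraction of one CPU's time spent taking a lock, given how often it
 * is taken per second and an ASSUMED cost per acquisition in
 * microseconds. Both inputs are illustrative, not measurements. */
double lock_time_fraction(double acquisitions_per_sec, double cost_us)
{
    assert(acquisitions_per_sec >= 0.0 && cost_us >= 0.0);
    return acquisitions_per_sec * cost_us / 1e6;
}
```

With 500 acquisitions per second at a hypothetical 1 microsecond each,
the result is 0.0005, i.e. 0.05% of one CPU.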

> > We'll clearly do per-CPU runqueues or something some day. And that worries
> > me not one whit, compared to things like VM and block device layer ;)
>
> Why not 2.5.x ?

Maybe. But read the rest of the sentence. There are issues that are about
a million times more important.

> Moving to 4, 8, 16 CPUs, the run queue load, which would be thought insane
> for UP systems, starts to matter.

4 cpu's are "high end" today. We can probably point to tens of thousands
of UP machines for each 4-way out there. The ratio gets even worse for 8,
and 16 CPU's is basically a rounding error.

You have to prioritize. Scheduling overhead is way down the list.

		Linus


Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
newsfeeds.belnet.be!news.belnet.be!skynet.be!skynet.be!newsfeed1.uni2.dk!
news.net.uni-c.dk!uninett.no!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Mon, 17 Dec 2001 15:15:15 -0800 (PST)
From: Davide Libenzi <davi...@xmailserver.org>
X-X-Sender: dav...@blue1.dev.mcafeelabs.com
To: Linus Torvalds <torva...@transmeta.com>
cc: Kernel Mailing List <linux-ker...@vger.kernel.org>
Subject: Re: Scheduler ( was: Just a second ) ...
In-Reply-To: <Pine.LNX.4.33.0112171449520.1854-100000@penguin.transmeta.com>
Original-Message-ID: <Pine.LNX.4.40.0112171508330.1577-100000@blue1.dev.mcafeelabs.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Mon, 17 Dec 2001 23:17:10 GMT
Message-ID: <fa.lvqkoqv.v7gb8k@ifi.uio.no>
References: <fa.obpvefv.j4g5j6@ifi.uio.no>
Lines: 47

On Mon, 17 Dec 2001, Linus Torvalds wrote:

>
> On Mon, 17 Dec 2001, Davide Libenzi wrote:
>
> > On Sat, 15 Dec 2001, Linus Torvalds wrote:
> >
> > > I just don't find it very interesting. The scheduler is about 100 lines
> > > out of however-many-million (3.8 at last count), and doesn't even impact
> > > most normal performance very much.
> >
> > Linus, sharing a queue and a lock between CPUs for a "thing" accessed as
> > frequently ( schedule()s + wakeup()s ) as the scheduler is quite ugly,
> > and it's not much fun. And it's not only performance wise, it's
> > more design wise.
>
> "Design wise" is highly overrated.
>
> Simplicity is _much_ more important, if something commonly is only done a
> few hundred times a second. Locking overhead is basically zero for that
> case.

Few hundred is a nice definition because you can basically range from 0 to
infinity. Anyway i agree that we can spend days debating what this
"few hundred" translates to, and i do not really want to.


> 4 cpu's are "high end" today. We can probably point to tens of thousands
> of UP machines for each 4-way out there. The ratio gets even worse for 8,
> and 16 CPU's is basically a rounding error.
>
> You have to prioritize. Scheduling overhead is way down the list.

You don't really have to serialize/prioritize, old Latins used to say
"Divide Et Impera" ;)




- Davide



Path: archiver1.google.com!news2.google.com!news1.google.com!
newsfeed.stanford.edu!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!
newsfeed.media.kyoto-u.ac.jp!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
X-Authentication-Warning: penguin.transmeta.com: torvalds owned process doing -bs
Original-Date: 	Mon, 17 Dec 2001 15:18:14 -0800 (PST)
From: Linus Torvalds <torva...@transmeta.com>
To: Davide Libenzi <davi...@xmailserver.org>
cc: Kernel Mailing List <linux-ker...@vger.kernel.org>
Subject: Re: Scheduler ( was: Just a second ) ...
In-Reply-To: <Pine.LNX.4.40.0112171508330.1577-100000@blue1.dev.mcafeelabs.com>
Original-Message-ID: <Pine.LNX.4.33.0112171516090.1891-100000@penguin.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Mon, 17 Dec 2001 23:21:03 GMT
Message-ID: <fa.o9afenv.hk47rc@ifi.uio.no>
References: <fa.lvqkoqv.v7gb8k@ifi.uio.no>
Lines: 29


On Mon, 17 Dec 2001, Davide Libenzi wrote:
> >
> > You have to prioritize. Scheduling overhead is way down the list.
>
> You don't really have to serialize/prioritize, old Latins used to say
> "Divide Et Impera" ;)

Well, you explicitly _asked_ me why I had been silent on the issue. I told
you.

I also told you that I thought it wasn't that big of a deal, and that
patches already exist.

So I'm letting the patches fight it out among the people who _do_ care.

Then, eventually, I'll do something about it, when we have a winner.

If that isn't "Divide et Impera", I don't know _what_ is. Remember: the
romans didn't much care for their subjects. They just wanted the glory,
and the taxes.

		Linus


Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!newsfeed.media.kyoto-u.ac.jp!
uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Mon, 17 Dec 2001 15:39:36 -0800 (PST)
From: Davide Libenzi <davi...@xmailserver.org>
X-X-Sender: dav...@blue1.dev.mcafeelabs.com
To: Linus Torvalds <torva...@transmeta.com>
cc: Kernel Mailing List <linux-ker...@vger.kernel.org>
Subject: Re: Scheduler ( was: Just a second ) ...
In-Reply-To: <Pine.LNX.4.33.0112171516090.1891-100000@penguin.transmeta.com>
Original-Message-ID: <Pine.LNX.4.40.0112171532581.1577-100000@blue1.dev.mcafeelabs.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Mon, 17 Dec 2001 23:38:53 GMT
Message-ID: <fa.m0qupqv.s76bou@ifi.uio.no>
References: <fa.o9afenv.hk47rc@ifi.uio.no>
Lines: 23

On Mon, 17 Dec 2001, Linus Torvalds wrote:

> So I'm letting the patches fight it out among the people who _do_ care.
>
> Then, eventually, I'll do something about it, when we have a winner.
>
> If that isn't "Divide et Impera", I don't know _what_ is. Remember: the
> romans didn't much care for their subjects. They just wanted the glory,
> and the taxes.

Just like today, everyone I talk to wants glory, and everyone I talk to
wants to _not_ pay taxes.



- Davide



Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com!
newsfeed.direct.ca!look.ca!logbridge.uoregon.edu!news.net.uni-c.dk!
uninett.no!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Mon, 17 Dec 2001 18:52:44 -0500
From: Benjamin LaHaise <b...@redhat.com>
To: Linus Torvalds <torva...@transmeta.com>
Cc: Davide Libenzi <davi...@xmailserver.org>,
        Kernel Mailing List <linux-ker...@vger.kernel.org>
Subject: Re: Scheduler ( was: Just a second ) ...
Original-Message-ID: <20011217185244.D9581@redhat.com>
Original-References: 
<Pine.LNX.4.40.0112171508330.1577-100...@blue1.dev.mcafeelabs.com> 
<Pine.LNX.4.33.0112171516090.1891-100...@penguin.transmeta.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <Pine.LNX.4.33.0112171516090.1891-100000@penguin.transmeta.com>; 
from torvalds@transmeta.com on Mon, Dec 17, 2001 at 03:18:14PM -0800
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Mon, 17 Dec 2001 23:54:42 GMT
Message-ID: <fa.dtefe8v.tkc32c@ifi.uio.no>
References: <fa.o9afenv.hk47rc@ifi.uio.no>
Lines: 13

On Mon, Dec 17, 2001 at 03:18:14PM -0800, Linus Torvalds wrote:
> Well, you explicitly _asked_ me why I had been silent on the issue. I told
> you.

Well, what about those of us who need syscall numbers assigned for which 
you are the only official assigned number registry?

		-ben

Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!news.tele.dk!
small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
X-Authentication-Warning: penguin.transmeta.com: torvalds owned process doing -bs
Original-Date: 	Mon, 17 Dec 2001 17:11:09 -0800 (PST)
From: Linus Torvalds <torva...@transmeta.com>
To: Benjamin LaHaise <b...@redhat.com>
cc: Davide Libenzi <davi...@xmailserver.org>,
        Kernel Mailing List <linux-ker...@vger.kernel.org>
Subject: Re: Scheduler ( was: Just a second ) ...
In-Reply-To: <20011217185244.D9581@redhat.com>
Original-Message-ID: <Pine.LNX.4.33.0112171710160.2035-100000@penguin.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 18 Dec 2001 01:13:36 GMT
Message-ID: <fa.oa9td7v.gke5be@ifi.uio.no>
References: <fa.dtefe8v.tkc32c@ifi.uio.no>
Lines: 20


On Mon, 17 Dec 2001, Benjamin LaHaise wrote:
> On Mon, Dec 17, 2001 at 03:18:14PM -0800, Linus Torvalds wrote:
> > Well, you explicitly _asked_ me why I had been silent on the issue. I told
> > you.
>
> Well, what about those of us who need syscall numbers assigned for which
> you are the only official assigned number registry?

I've told you a number of times that I'd like to see the preliminary
implementation publicly discussed and some uses outside of private
companies that I have no insight into..

		Linus


Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com!
news.tele.dk!small.news.tele.dk!193.213.112.26!newsfeed1.ulv.nextra.no!
nextra.com!uninett.no!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Mon, 17 Dec 2001 23:54:26 -0200 (BRST)
From: Rik van Riel <r...@conectiva.com.br>
X-X-Sender:  <r...@imladris.surriel.com>
To: Linus Torvalds <torva...@transmeta.com>
Cc: Davide Libenzi <davi...@xmailserver.org>,
        Kernel Mailing List <linux-ker...@vger.kernel.org>
Subject: Re: Scheduler ( was: Just a second ) ...
In-Reply-To: <Pine.LNX.4.33.0112171449520.1854-100000@penguin.transmeta.com>
Original-Message-ID: <Pine.LNX.4.33L.0112172353420.15741-100000@imladris.surriel.com>
X-spambait: aardv...@kernelnewbies.org
X-spammeplease: 	aardv...@nl.linux.org
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 18 Dec 2001 01:57:57 GMT
Message-ID: <fa.nvjcm6v.rii0ol@ifi.uio.no>
References: <fa.obpvefv.j4g5j6@ifi.uio.no>
Lines: 23

On Mon, 17 Dec 2001, Linus Torvalds wrote:

> You have to prioritize. Scheduling overhead is way down the list.

That's not what the profiling on my UP machine indicates,
let alone on SMP machines.

Try readprofile some day, chances are schedule() is pretty
near the top of the list.

regards,

Rik
-- 
Shortwave goes a long way:  irc.starchat.net  #swl

http://www.surriel.com/		http://distro.conectiva.com/


Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
X-Authentication-Warning: penguin.transmeta.com: torvalds owned process doing -bs
Original-Date: 	Mon, 17 Dec 2001 18:35:54 -0800 (PST)
From: Linus Torvalds <torva...@transmeta.com>
To: Rik van Riel <r...@conectiva.com.br>
cc: Davide Libenzi <davi...@xmailserver.org>,
        Kernel Mailing List <linux-ker...@vger.kernel.org>
Subject: Re: Scheduler ( was: Just a second ) ...
In-Reply-To: <Pine.LNX.4.33L.0112172353420.15741-100000@imladris.surriel.com>
Original-Message-ID: <Pine.LNX.4.33.0112171825460.2108-100000@penguin.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 18 Dec 2001 02:38:21 GMT
Message-ID: <fa.obq1cnv.i4i5b6@ifi.uio.no>
References: <fa.nvjcm6v.rii0ol@ifi.uio.no>
Lines: 57


On Mon, 17 Dec 2001, Rik van Riel wrote:
>
> Try readprofile some day, chances are schedule() is pretty
> near the top of the list.

Ehh.. Of course I do readprofile.

But did you ever compare readprofile output to _total_ cycles spent?

The fact is, it's not even noticeable under any normal loads, and
_definitely_ not on UP except with totally made up benchmarks that just
pass tokens around or yield all the time.

Because we spend 95-99% in user space or idle. Which is as it should be.
There are _very_ few loads that are kernel-intensive, and in fact the best
way to get high system times is to do either lots of fork/exec/wait with
everything cached, or do lots of open/read/write/close with everything
cached.

Of the remaining 1-5% of time, schedule() shows up as one fairly high
thing, but on most profiles I've seen of real work it shows up long after
things like "clear_page()" and "copy_page()".

And look closely at the profile, and you'll notice that it tends to be a
_loong_ tail of stuff.
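
The figures above imply a hard ceiling, which a short calculation makes
concrete: if all kernel time is at most 5% of the total, then making
schedule() cost nothing cannot speed the machine up by more than about
5.3%, however high it ranks in readprofile. A minimal sketch, with a
function name made up for illustration:

```c
#include <assert.h>

/* Upper bound on whole-system speedup if a component that takes
 * `fraction` of total time (0 <= fraction < 1) became free. */
double max_speedup(double fraction)
{
    assert(fraction >= 0.0 && fraction < 1.0);
    return 1.0 / (1.0 - fraction);
}
```

max_speedup(0.05) is roughly 1.053, and that is the bound for the whole
kernel share, of which schedule() is only one entry.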

Quite frankly, I'd be a _lot_ more interested in making the scheduling
slices _shorter_ during 2.5.x, and go to a 1kHz clock on x86 instead of a
100Hz one, _despite_ the fact that it will increase scheduling load even
more. Because it improves interactive feel, and sometimes even performance
(ie being able to sleep for shorter sequences of time allows some things
that want "almost realtime" behaviour to avoid busy-looping for those
short waits - improving performance exactly _because_ they put more load on
the scheduler).
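
The effect of the clock rate on short sleeps can be sketched with a
simplified model in which a sleep is rounded up to whole timer ticks.
The helper names are hypothetical and the real kernel timer path differs
in detail; the point is only the granularity.

```c
#include <assert.h>

/* Length of one timer tick in microseconds for a given HZ. */
long tick_usec(long hz)
{
    assert(hz > 0);
    return 1000000L / hz;
}

/* Shortest sleep a tick-driven timer can deliver for a request of
 * `usec` microseconds: round up to whole ticks (simplified model). */
long sleep_granted_usec(long usec, long hz)
{
    long tick = tick_usec(hz);
    long ticks = (usec + tick - 1) / tick;

    return ticks * tick;
}
```

In this model a 2 ms sleep request costs 10 ms at HZ=100 but 2 ms at
HZ=1000, which is why code wanting "almost realtime" waits would no
longer need to busy-loop.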

The benchmark that is just about _the_ worst on the scheduler is actually
something like "lmbench", and if you look at profiles for that you'll
notice that system call entry and exit together with the read/write path
ends up being more of a performance issue.

And you know what? From a user standpoint, improving disk latency is again
a _lot_ more noticeable than scheduler overhead.

And even more important than performance is being able to read and write
to CD-RW disks without having to know about things like "ide-scsi" etc,
and do it sanely over different bus architectures etc.

The scheduler simply isn't that important.

			Linus


Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Mon, 17 Dec 2001 19:08:25 -0800 (PST)
From: Davide Libenzi <davi...@xmailserver.org>
X-X-Sender: dav...@blue1.dev.mcafeelabs.com
To: Linus Torvalds <torva...@transmeta.com>
cc: Rik van Riel <r...@conectiva.com.br>,
        Kernel Mailing List <linux-ker...@vger.kernel.org>
Subject: Re: Scheduler ( was: Just a second ) ...
In-Reply-To: <Pine.LNX.4.33.0112171825460.2108-100000@penguin.transmeta.com>
Original-Message-ID: <Pine.LNX.4.40.0112171849490.1577-100000@blue1.dev.mcafeelabs.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 18 Dec 2001 03:07:31 GMT
Message-ID: <fa.m0b6pqv.snua8l@ifi.uio.no>
References: <fa.obq1cnv.i4i5b6@ifi.uio.no>
Lines: 50

On Mon, 17 Dec 2001, Linus Torvalds wrote:

> Quite frankly, I'd be a _lot_ more interested in making the scheduling
> slices _shorter_ during 2.5.x, and go to a 1kHz clock on x86 instead of a
> 100Hz one, _despite_ the fact that it will increase scheduling load even
> more. Because it improves interactive feel, and sometimes even performance
> (ie being able to sleep for shorter sequences of time allows some things
> that want "almost realtime" behaviour to avoid busy-looping for those
> short waits - improving performance exactly _because_ they put more load on
> the scheduler).

I'm ok with increasing HZ but not so ok with decreasing time slices.
When you switch a task you have a fixed cost ( tlb, cache image,... ) that,
if you decrease the time slice, is spread over a shorter run time,
raising its percentage impact.
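
As a rough illustration of that trade-off, assuming every slice is
followed by one switch of fixed cost (the function name and any numbers
fed to it are hypothetical, not measurements):

```c
#include <assert.h>

/* Fraction of CPU time lost to switching when every time slice of
 * `slice_us` microseconds is followed by a fixed switch cost of
 * `switch_us` microseconds. */
double switch_overhead(double switch_us, double slice_us)
{
    assert(switch_us >= 0.0 && slice_us > 0.0);
    return switch_us / (switch_us + slice_us);
}
```

For an assumed 100 us switch cost, a 10 ms slice loses about 1% while a
1 ms slice loses about 9%.
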
The more interactive feel can be achieved by using a real BVT
implementation :

-            p->counter = (p->counter >> 1) + NICE_TO_TICKS(p->nice);
+            p->counter += NICE_TO_TICKS(p->nice);

The only problem with this is that, with certain task run patterns,
processes can run a long time ( having a high dynamic priority ) before
they get scheduled.
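
The difference between the two recalculation rules can be seen by
applying each repeatedly to a task that sleeps through every pass and so
never spends its ticks. This is a sketch under that assumption; `base`
stands in for NICE_TO_TICKS(p->nice) and the helper names are made up.

```c
#include <assert.h>

/* One recalculation pass for a task that has not used its ticks.
 * `base` stands in for NICE_TO_TICKS(p->nice). */
long recalc_decay(long counter, long base)
{
    return (counter >> 1) + base;   /* existing rule */
}

long recalc_additive(long counter, long base)
{
    return counter + base;          /* proposed rule */
}

/* Apply `step` for `passes` recalculations, starting from zero. */
long after_passes(long (*step)(long, long), long base, int passes)
{
    long counter = 0;

    assert(passes >= 0);
    while (passes-- > 0)
        counter = step(counter, base);
    return counter;
}
```

With the halving rule the counter stays below twice the base however
long the task sleeps. With the additive rule it grows by the base on
every pass, so a long sleeper can wake up holding a very large counter.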
What i was thinking was something like, in timer.c :

        if (p->counter > decay_ticks)
            --p->counter;
        else if (++p->timer_ticks >= MAX_RUN_TIME) {
            p->counter -= p->timer_ticks;
            p->timer_ticks = 0;
            p->need_resched = 1;
        }

Having MAX_RUN_TIME ~= NICE_TO_TICKS(0)
In this way I/O bound tasks can run with high priority, giving a better
interactive feel, w/out running so long that they freeze the system when
coming out of a fairly long I/O wait.
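
One way the "maximum consecutive run time" idea could look, as a
self-contained sketch rather than the timer.c change itself: struct task,
run_ticks, timer_tick() and the value chosen for MAX_RUN_TIME are all
assumptions made for illustration.

```c
#include <assert.h>

#define MAX_RUN_TIME 6   /* assumed: about NICE_TO_TICKS(0), in ticks */

struct task {
    long counter;        /* remaining ticks, may exceed MAX_RUN_TIME */
    long run_ticks;      /* ticks run since the task was switched in */
    int  need_resched;
};

/* Called once per timer tick for the running task. The task keeps
 * its accumulated counter (and so its priority), but is asked to
 * reschedule after MAX_RUN_TIME consecutive ticks. */
void timer_tick(struct task *p)
{
    assert(p != 0);
    if (p->counter > 0)
        --p->counter;
    if (p->counter == 0 || ++p->run_ticks >= MAX_RUN_TIME) {
        p->run_ticks = 0;
        p->need_resched = 1;
    }
}
```

A task that wakes with counter 20 is flagged for rescheduling on its
sixth consecutive tick with 14 ticks still in hand, so it can be picked
again soon without monopolising the CPU.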




- Davide



Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Mon, 17 Dec 2001 19:19:54 -0800 (PST)
From: Davide Libenzi <davi...@xmailserver.org>
X-X-Sender: dav...@blue1.dev.mcafeelabs.com
To: Davide Libenzi <davi...@xmailserver.org>
cc: Linus Torvalds <torva...@transmeta.com>,
        Rik van Riel <r...@conectiva.com.br>,
        Kernel Mailing List <linux-ker...@vger.kernel.org>
Subject: Re: Scheduler ( was: Just a second ) ...
In-Reply-To: <Pine.LNX.4.40.0112171849490.1577-100000@blue1.dev.mcafeelabs.com>
Original-Message-ID: <Pine.LNX.4.40.0112171913450.1577-100000@blue1.dev.mcafeelabs.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 18 Dec 2001 03:19:29 GMT
Message-ID: <fa.m0b0p2v.sn4b0v@ifi.uio.no>
References: <fa.m0b6pqv.snua8l@ifi.uio.no>
Lines: 25

On Mon, 17 Dec 2001, Davide Libenzi wrote:

> What i was thinking was something like, in timer.c :
>
>         if (p->counter > decay_ticks)
>             --p->counter;
>         else if (++p->timer_ticks >= MAX_RUN_TIME) {
>             p->counter -= p->timer_ticks;
>             p->timer_ticks = 0;
>             p->need_resched = 1;
>         }

Obviously that code doesn't work :) but the idea is to not permit the task
to run more than a maximum time consecutively.



- Davide



Path: archiver1.google.com!news2.google.com!news1.google.com!
newsfeed.stanford.edu!logbridge.uoregon.edu!news.net.uni-c.dk!
uninett.no!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
X-Authentication-Warning: penguin.transmeta.com: torvalds owned process doing -bs
Original-Date: 	Mon, 17 Dec 2001 20:27:18 -0800 (PST)
From: Linus Torvalds <torva...@transmeta.com>
To: William Lee Irwin III <w...@holomorphy.com>
cc: Kernel Mailing List <linux-ker...@vger.kernel.org>
Subject: Re: Scheduler ( was: Just a second ) ...
In-Reply-To: <20011217200946.D753@holomorphy.com>
Original-Message-ID: <Pine.LNX.4.33.0112172014530.2281-100000@penguin.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 18 Dec 2001 04:31:24 GMT
Message-ID: <fa.ocpdefv.j4e7je@ifi.uio.no>
References: <fa.h7jgu6v.onemo8@ifi.uio.no>
Lines: 62


[ cc'd back to Linux kernel, in case somebody wants to take a look whether
  there is something wrong in the sound drivers, for example ]

On Mon, 17 Dec 2001, William Lee Irwin III wrote:
>
> This is no benchmark. This is my home machine it's taking a bite out of.
> I'm trying to websurf and play mp3's and read email here. No forkbombs.
> No databases. No made-up benchmarks. I don't know what it's doing (or
> trying to do) in there but I'd like the CPU cycles back.
>
> From a recent /proc/profile dump on 2.4.17-pre1 (no patches), my top 5
> (excluding default_idle) are:
> --------------------------------------------------------
>  22420 total                                      0.0168
>   4624 default_idle                              96.3333
>   1280 schedule                                   0.6202
>   1130 handle_IRQ_event                          11.7708
>    929 file_read_actor                            9.6771
>    843 fast_clear_page                            7.5268

The most likely cause is simply waking up after each sound interrupt: you
also have a _lot_ of time handling interrupts. Quite frankly, web surfing
and mp3 playing simply shouldn't use any noticeable amounts of CPU.

The point being that I really doubt it's the scheduler proper, it's
probably how it is _used_. And I'd suspect your sound driver (or user)
conspires to keep scheduling stuff.

For example (and this is _purely_ an example, I don't know if this is
your particular case), this sounds like a classic case of "bad buffering".
What bad buffering would do is:
 - you have a sound buffer that the mp3 player tries to keep full
 - your sound buffer is, let's pick a random number, 64 entries of 1024
   bytes each.
 - the sound card gives an interrupt every time it has emptied a buffer.
 - the mp3 player is waiting on "free space"
 - we wake up the mp3 player for _every_ sound fragment filled.

Do you see what this leads to? We schedule the mp3 task (which gets a high
priority because it tends to run for a really short time, filling just 1
small buffer each time) _every_ time a single buffer empties. Even though
we have 63 other full buffers.

The classic fix for these kinds of things is _not_ to make the scheduler
faster. Sure, that would help, but that's not really the problem. The
_real_ fix is to use water-marks, and make the sound driver wake up the
writing process only when (say) half the buffers have emptied.

Now the mp3 player can fill 32 of the buffers at a time, and gets
scheduled an order of magnitude less. It doesn't end up waking up every
time.
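
A rough way to see the difference is to count wakeups in a toy model. The
sketch below assumes a ring that starts full and a writer that refills
every free fragment before the next interrupt arrives; the function name
and the model itself are assumptions for illustration, not driver code.

```c
#include <assert.h>

#define NFRAGS 64   /* ring size, taken from the example numbers above */

/*
 * Count how often the writer is woken while the card drains `irqs`
 * fragments. The writer is woken once at least `wake_threshold`
 * fragments are free, and then refills all of them.
 */
unsigned long count_wakeups(unsigned long irqs, unsigned wake_threshold)
{
    unsigned long wakeups = 0;
    unsigned free_frags = 0;          /* ring starts full */

    assert(wake_threshold >= 1 && wake_threshold <= NFRAGS);

    for (unsigned long i = 0; i < irqs; i++) {
        free_frags++;                 /* one DMA-complete interrupt */
        if (free_frags >= wake_threshold) {
            wakeups++;                /* writer gets scheduled ... */
            free_frags = 0;           /* ... and refills every free fragment */
        }
    }
    return wakeups;
}
```

In this model count_wakeups(6400, 1) gives 6400 wakeups (one per
interrupt), while count_wakeups(6400, 32) gives 200, i.e. a half-full
water-mark cuts the wakeups by a factor of 32.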

Which sound driver are you using, just in case this _is_ the reason?

		Linus


Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Mon, 17 Dec 2001 20:55:47 -0800
From: William Lee Irwin III <w...@holomorphy.com>
To: Kernel Mailing List <linux-ker...@vger.kernel.org>
Cc: torva...@transmeta.com
Subject: Re: Scheduler ( was: Just a second ) ...
Original-Message-ID: <20011217205547.C821@holomorphy.com>
Mail-Followup-To: William Lee Irwin III <w...@holomorphy.com>,
	Kernel Mailing List <linux-ker...@vger.kernel.org>,
	torva...@transmeta.com
Original-References: <20011217200946.D...@holomorphy.com> 
<Pine.LNX.4.33.0112172014530.2281-100...@penguin.transmeta.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Description: brief message
Content-Disposition: inline
User-Agent: Mutt/1.3.17i
In-Reply-To: <Pine.LNX.4.33.0112172014530.2281-100000@penguin.transmeta.com>; 
from torvalds@transmeta.com on Mon, Dec 17, 2001 at 08:27:18PM -0800
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: The Domain of Holomorphy
Date: Tue, 18 Dec 2001 04:57:34 GMT
Message-ID: <fa.h83cuuv.v72n03@ifi.uio.no>
References: <fa.ocpdefv.j4e7je@ifi.uio.no>
Lines: 34

On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote:
> The most likely cause is simply waking up after each sound interrupt: you
> also have a _lot_ of time handling interrupts. Quite frankly, web surfing
> and mp3 playing simply shouldn't use any noticeable amounts of CPU.

I think we have a winner:
/proc/interrupts
------------------------------------------------
           CPU0       
  0:   17321824          XT-PIC  timer
  1:          4          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:   46490271          XT-PIC  soundblaster
  9:     400232          XT-PIC  usb-ohci, eth0, eth1
 11:     939150          XT-PIC  aic7xxx, aic7xxx
 14:         13          XT-PIC  ide0

Approximately 4 times more often than the timer interrupt.
That's not nice...

On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote:
> Which sound driver are you using, just in case this _is_ the reason?

SoundBlaster 16
A change of hardware should help verify this.


Cheers,
Bill

Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
From: Thierry Forveille <forve...@cfht.hawaii.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Original-Message-ID: <15390.53230.827019.336771@hoku.cfht.hawaii.edu>
Original-Date: 	Mon, 17 Dec 2001 19:11:10 -1000 (HST)
To: linux-ker...@vger.kernel.org
Subject: Re: Scheduler ( was: Just a second ) ...
X-Mailer: VM 6.75 under Emacs 19.34.1
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 18 Dec 2001 05:13:24 GMT
Message-ID: <fa.k50mnjv.18med3p@ifi.uio.no>
Lines: 18

Linus Torvalds (torva...@transmeta.com) writes
> On Mon, 17 Dec 2001, Rik van Riel wrote:
> >
> > Try readprofile some day, chances are schedule() is pretty
> > near the top of the list.
>
> Ehh.. Of course I do readprofile.
>  
> But did you ever compare readprofile output to _total_ cycles spent?
>
I have a feeling that this discussion got sidetracked: cpu cycles burnt
in the scheduler are indeed a non-issue, but big tasks being needlessly moved
around on SMPs is worth tackling.

Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!newsfeed.media.kyoto-u.ac.jp!
uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
X-Authentication-Warning: penguin.transmeta.com: torvalds owned process doing -bs
Original-Date: 	Mon, 17 Dec 2001 22:09:22 -0800 (PST)
From: Linus Torvalds <torva...@transmeta.com>
To: William Lee Irwin III <w...@holomorphy.com>
cc: Kernel Mailing List <linux-ker...@vger.kernel.org>,
        Jeff Garzik <jgar...@mandrakesoft.com>
Subject: Re: Scheduler ( was: Just a second ) ...
In-Reply-To: <20011217205547.C821@holomorphy.com>
Original-Message-ID: <Pine.LNX.4.33.0112172153410.2416-100000@penguin.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 18 Dec 2001 06:13:30 GMT
Message-ID: <fa.oc9fdnv.jk44re@ifi.uio.no>
References: <fa.h83cuuv.v72n03@ifi.uio.no>
Lines: 70


On Mon, 17 Dec 2001, William Lee Irwin III wrote:
>
>   5:   46490271          XT-PIC  soundblaster
>
> Approximately 4 times more often than the timer interrupt.
> That's not nice...

Yeah.

Well, looking at the issue, the problem is probably not just in the sb
driver: the soundblaster driver shares the output buffer code with a
number of other drivers (there's some horrible "dmabuf.c" code in common).

And yes, the dmabuf code will wake up the writer on every single DMA
complete interrupt. Considering that you seem to have them at least 400
times a second (and probably more, unless you've literally had sound going
since the machine was booted), I think we know why your setup spends time
in the scheduler.

> On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote:
> > Which sound driver are you using, just in case this _is_ the reason?
>
> SoundBlaster 16
> A change of hardware should help verify this.

A number of sound drivers will use the same logic.

You may be able to change this more easily some other way, by using a
larger fragment size for example. That's up to the sw that actually feeds
the sound stream, so it might be your decoder that selects a small
fragment size.

Quite frankly I don't know the sound infrastructure well enough to make
any more intelligent suggestions about other decoders or similar to try,
at this point I just start blathering.

But yes, I bet you'll also see much less impact of this if you were to
switch to more modern hardware.

grep grep grep.. Oh, before you do that, how about changing "min_fragment"
in sb_audio.c from 5 to something bigger like 9 or 10?

That

	audio_devs[devc->dev]->min_fragment = 5;

literally means that your minimum fragment size seems to be a rather
pathetic 32 bytes (which doesn't mean that your sound will be set to that,
but it _might_ be). That sounds totally ridiculous, but maybe I've
misunderstood the code.

Jeff, you've worked on the sb code at some point - does it really do
32-byte sound fragments? Why? That sounds truly insane if I really parsed
that code correctly. That's thousands of separate DMA transfers
and interrupts per second..

Raising that min_fragment thing from 5 to 10 would make the minimum DMA
buffer go from 32 bytes to 1kB, which is a _lot_ more reasonable (what,
at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties
in less than 1/100th of a second, but at least it should be < 200 irqs/sec
rather than >400).
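
The arithmetic can be spelled out in a small sketch. It assumes, as the
reading of the code above does, that min_fragment is a log2 value, and it
uses the CD rate of 44100 Hz for "44kHz"; the function names are made up
for this example.

```c
#include <assert.h>

/* Fragment size in bytes for a log2-encoded value such as min_fragment. */
unsigned long fragment_bytes(unsigned shift)
{
    assert(shift < 32);
    return 1UL << shift;
}

/*
 * Whole interrupts per second (rounded down) when the card raises one
 * interrupt per emptied fragment.
 */
unsigned long irqs_per_second(unsigned long sample_rate_hz,
                              unsigned channels,
                              unsigned bytes_per_sample,
                              unsigned shift)
{
    unsigned long bytes_per_second =
        sample_rate_hz * channels * bytes_per_sample;

    return bytes_per_second / fragment_bytes(shift);
}
```

For 16-bit stereo at 44100 Hz (176400 bytes/sec), a shift of 5 gives
32-byte fragments and about 5512 irqs/sec, while a shift of 10 gives 1kB
fragments and about 172 irqs/sec.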

		Linus


Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Tue, 18 Dec 2001 10:23:58 -0200 (BRST)
From: Rik van Riel <r...@conectiva.com.br>
X-X-Sender:  <r...@duckman.distro.conectiva>
To: Linus Torvalds <torva...@transmeta.com>
Cc: William Lee Irwin III <w...@holomorphy.com>,
        Kernel Mailing List <linux-ker...@vger.kernel.org>,
        Jeff Garzik <jgar...@mandrakesoft.com>
Subject: Re: Scheduler ( was: Just a second ) ...
In-Reply-To: <Pine.LNX.4.33.0112172153410.2416-100000@penguin.transmeta.com>
Original-Message-ID: <Pine.LNX.4.33L.0112181021520.10000-100000@duckman.distro.conectiva>
X-spambait: aardv...@kernelnewbies.org
X-spammeplease: 	aardv...@nl.linux.org
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 18 Dec 2001 12:25:35 GMT
Message-ID: <fa.okkmg1v.3hmvj6@ifi.uio.no>
References: <fa.oc9fdnv.jk44re@ifi.uio.no>
Lines: 36

On Mon, 17 Dec 2001, Linus Torvalds wrote:
> On Mon, 17 Dec 2001, William Lee Irwin III wrote:
> >
> >   5:   46490271          XT-PIC  soundblaster
> >
> > Approximately 4 times more often than the timer interrupt.
> > That's not nice...

That's not nearly as much as your typical server system runs
in network packets and wakeups of the samba/database/http
daemons, though ...

> Well, looking at the issue, the problem is probably not just in the sb
> driver: the soundblaster driver shares the output buffer code with a
> number of other drivers (there's some horrible "dmabuf.c" code in common).

So you fixed it for the sound driver, nice.  We still have
the issue that the scheduler can take up lots of time on busy
server systems, though.

(though I suspect on those systems it probably spends more
time recalculating than selecting processes)

regards,

Rik
-- 
DMCA, SSSCA, W3C?  Who cares?  http://thefreeworld.net/

http://www.surriel.com/		http://distro.conectiva.com/


Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Subject: Re: Scheduler ( was: Just a second ) ...
To: torva...@transmeta.com (Linus Torvalds)
Original-Date: 	Tue, 18 Dec 2001 14:09:16 +0000 (GMT)
Cc: r...@conectiva.com.br (Rik van Riel),
        davi...@xmailserver.org (Davide Libenzi),
        linux-ker...@vger.kernel.org (Kernel Mailing List)
In-Reply-To: <Pine.LNX.4.33.0112171825460.2108-100000@penguin.transmeta.com> 
from "Linus Torvalds" at Dec 17, 2001 06:35:54 PM
X-Mailer: ELM [version 2.5 PL6]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Original-Message-Id: <E16GKvk-0007Sc-00@the-village.bc.nu>
From: Alan Cox <a...@lxorguk.ukuu.org.uk>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 18 Dec 2001 14:02:08 GMT
Message-ID: <fa.gb555vv.1858phf@ifi.uio.no>
References: <fa.obq1cnv.i4i5b6@ifi.uio.no>
Lines: 12

> to CD-RW disks without having to know about things like "ide-scsi" etc,
> and do it sanely over different bus architectures etc.
> 
> The scheduler simply isn't that important.

The scheduler is eating 40-60% of the machine on real world 8 cpu workloads.
That isn't going to go away by sticking heads in sand.

Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
From: Alan Cox <a...@lxorguk.ukuu.org.uk>
Original-Message-Id: <200112181429.fBIETsf15577@pinkpanther.swansea.linux.org.uk>
Subject: Re: Scheduler ( was: Just a second ) ...
To: torva...@transmeta.com (Linus Torvalds)
Original-Date: 	Tue, 18 Dec 2001 14:29:54 +0000 (GMT)
Cc: w...@holomorphy.com (William Lee Irwin III),
        linux-ker...@vger.kernel.org (Kernel Mailing List),
        jgar...@mandrakesoft.com (Jeff Garzik)
In-Reply-To: <Pine.LNX.4.33.0112172153410.2416-100000@penguin.transmeta.com> 
from "Linus Torvalds" at Dec 17, 2001 10:09:22 
X-Mailer: ELM [version 2.5 PL6]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 18 Dec 2001 15:38:01 GMT
Message-ID: <fa.pr3ksmv.1c3sqrh@ifi.uio.no>
References: <fa.oc9fdnv.jk44re@ifi.uio.no>
Lines: 40

> Well, looking at the issue, the problem is probably not just in the sb
> driver: the soundblaster driver shares the output buffer code with a
> number of other drivers (there's some horrible "dmabuf.c" code in common).

The sb driver is fine

> A number of sound drivers will use the same logic.

Most hardware does

> Quite frankly I don't know the sound infrastructure well enough to make
> any more intelligent suggestions about other decoders or similar to try,
> at this point I just start blathering.

Some of the sound stuff uses very short fragments to get accurate
audio/video synchronization. Some apps also do it gratuitously when they
should be using other APIs. It's also used sensibly for things like
gnome-meeting, where it's worth trading CPU for latency because 1K of
buffering starts giving you earth<->moon type conversations.

> But yes, I bet you'll also see much less impact of this if you were to
> switch to more modern hardware.

Not really - the app asked for an event every 32 bytes. This is an app
problem, not a kernel problem.

> at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties
> in less than 1/100th of a second, but at least it should be < 200 irqs/sec
> rather than >400).

With a few exceptions the applications tend to use 4K or larger DMA chunks
anyway. Very few need tiny chunks.

Alan


Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
From: Alan Cox <a...@lxorguk.ukuu.org.uk>
Original-Message-Id: <200112181431.fBIEVW115600@pinkpanther.swansea.linux.org.uk>
Subject: Re: Scheduler ( was: Just a second ) ...
To: forve...@cfht.hawaii.edu (Thierry Forveille)
Original-Date: 	Tue, 18 Dec 2001 14:31:32 +0000 (GMT)
Cc: linux-ker...@vger.kernel.org
In-Reply-To: <15390.53230.827019.336771@hoku.cfht.hawaii.edu> 
from "Thierry Forveille" at Dec 17, 2001 07:11:10 
X-Mailer: ELM [version 2.5 PL6]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 18 Dec 2001 15:40:26 GMT
Message-ID: <fa.pnjom3v.1fjoj99@ifi.uio.no>
References: <fa.k50mnjv.18med3p@ifi.uio.no>
Lines: 14

> I have a feeling that this discussion got sidetracked: cpu cycles burnt
> in the scheduler are indeed a non-issue, but big tasks being needlessly moved
> around on SMPs is worth tackling.

It's not a non-issue - 40% of an 8 way box is a lot of lost CPU. Fixing the
CPU bounce around problem also matters a lot - Ingo's speedups seen just by
improving that on the current scheduler show it's worth the work.



Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!news.tele.dk!
small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!ifi.uio.no!
internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Tue, 18 Dec 2001 08:50:59 -0800
From: Mike Kravetz <krav...@us.ibm.com>
To: Alan Cox <a...@lxorguk.ukuu.org.uk>
Cc: Linus Torvalds <torva...@transmeta.com>,
        Rik van Riel <r...@conectiva.com.br>,
        Davide Libenzi <davi...@xmailserver.org>,
        Kernel Mailing List <linux-ker...@vger.kernel.org>
Subject: Re: Scheduler ( was: Just a second ) ...
Original-Message-ID: <20011218085059.A1176@w-mikek2.des.beaverton.ibm.com>
Original-References: 
<Pine.LNX.4.33.0112171825460.2108-100...@penguin.transmeta.com> 
<E16GKvk-0007Sc...@the-village.bc.nu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <E16GKvk-0007Sc-00@the-village.bc.nu>; 
from alan@lxorguk.ukuu.org.uk on Tue, Dec 18, 2001 at 02:09:16PM +0000
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 18 Dec 2001 16:55:13 GMT
Message-ID: <fa.lc9ou7v.7jmejt@ifi.uio.no>
References: <fa.gb555vv.1858phf@ifi.uio.no>
Lines: 15

On Tue, Dec 18, 2001 at 02:09:16PM +0000, Alan Cox wrote:
> The scheduler is eating 40-60% of the machine on real world 8 cpu workloads.
> That isn't going to go away by sticking heads in sand.

Can you be more specific as to the workload you are referring to?
As someone who has been playing with the scheduler for a while,
I am interested in all such workloads.

-- 
Mike

Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
X-Authentication-Warning: penguin.transmeta.com: torvalds owned process doing -bs
Original-Date: 	Tue, 18 Dec 2001 09:00:47 -0800 (PST)
From: Linus Torvalds <torva...@transmeta.com>
To: Alan Cox <a...@lxorguk.ukuu.org.uk>
cc: Rik van Riel <r...@conectiva.com.br>,
        Davide Libenzi <davi...@xmailserver.org>,
        Kernel Mailing List <linux-ker...@vger.kernel.org>
Subject: Re: Scheduler ( was: Just a second ) ...
In-Reply-To: <E16GKvk-0007Sc-00@the-village.bc.nu>
Original-Message-ID: <Pine.LNX.4.33.0112180854070.2867-100000@penguin.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 18 Dec 2001 17:04:30 GMT
Message-ID: <fa.o9ahevv.gk2537@ifi.uio.no>
References: <fa.gb555vv.1858phf@ifi.uio.no>
Lines: 37


On Tue, 18 Dec 2001, Alan Cox wrote:
>
> The scheduler is eating 40-60% of the machine on real world 8 cpu workloads.
> That isn't going to go away by sticking heads in sand.

Did you _read_ what I said?

We _have_ patches. You apparently have your own set.

Fight it out. Don't involve me, because I don't think it's even a
challenging thing. I wrote what is _still_ largely the algorithm in 1991,
and it's damn near the only piece of code from back then that even _has_
some similarity to the original code still. All the "recompute count when
everybody has gone down to zero" was there pretty much from day 1 (*).

Which makes me say: "oh, a quick hack from 1991 works on most machines in
2001, so how hard a problem can it be?"

Fight it out. People asked whether I was interested, and I said "no". Take
a clue: do benchmarks on all the competing patches, and try to create the
best one, and present it to me as a done deal.

		Linus

(*) The single biggest change from day 1 is that it used to iterate over a
global array of process slots, and for scalability reasons (not CPU
scalability, but "max nr of processes in the system" scalability) the
array was gotten rid of, giving the current doubly linked list. Everything
else that any scheduler person complains about was pretty much there
otherwise ;)
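
For readers who have not seen it, the "recompute when everybody has gone
down to zero" scheme can be sketched roughly as below. This is a
simplified illustration over a small fixed array; the struct, NR_TASKS
value and function name are assumptions, not the kernel's code.

```c
#include <assert.h>

#define NR_TASKS 4            /* hypothetical, tiny task table */

struct task {
    int runnable;             /* 1 if the task can run right now */
    int counter;              /* ticks left in the current epoch, >= 0 */
    int priority;             /* ticks granted at each recompute, > 0 */
};

/*
 * Return the index of the runnable task with the largest counter,
 * or -1 if nothing is runnable.
 */
int pick_next(struct task tasks[NR_TASKS])
{
    for (;;) {
        int best = -1;
        int best_counter = -1;

        for (int i = 0; i < NR_TASKS; i++) {
            if (tasks[i].runnable && tasks[i].counter > best_counter) {
                best_counter = tasks[i].counter;
                best = i;
            }
        }
        if (best_counter != 0)
            return best;      /* a task with time left, or none runnable */

        /* Every runnable task is at zero: recompute all, sleepers included. */
        for (int i = 0; i < NR_TASKS; i++) {
            assert(tasks[i].priority > 0);
            tasks[i].counter = (tasks[i].counter >> 1) + tasks[i].priority;
        }
    }
}
```

Because the recompute also touches tasks that are not runnable, a task
that slept through an epoch keeps half of its unused counter and so
comes back with a higher effective priority.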


Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
X-Authentication-Warning: penguin.transmeta.com: torvalds owned process doing -bs
Original-Date: 	Tue, 18 Dec 2001 09:22:37 -0800 (PST)
From: Linus Torvalds <torva...@transmeta.com>
To: Mike Kravetz <krav...@us.ibm.com>
cc: Alan Cox <a...@lxorguk.ukuu.org.uk>, Rik van Riel <r...@conectiva.com.br>,
        Davide Libenzi <davi...@xmailserver.org>,
        Kernel Mailing List <linux-ker...@vger.kernel.org>
Subject: Re: Scheduler ( was: Just a second ) ...
In-Reply-To: <20011218085059.A1176@w-mikek2.des.beaverton.ibm.com>
Original-Message-ID: <Pine.LNX.4.33.0112180918340.2867-100000@penguin.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 18 Dec 2001 17:25:43 GMT
Message-ID: <fa.oaqddvv.h4643b@ifi.uio.no>
References: <fa.lc9ou7v.7jmejt@ifi.uio.no>
Lines: 32


On Tue, 18 Dec 2001, Mike Kravetz wrote:
> On Tue, Dec 18, 2001 at 02:09:16PM +0000, Alan Cox wrote:
> > The scheduler is eating 40-60% of the machine on real world 8 cpu workloads.
> > That isn't going to go away by sticking heads in sand.
>
> Can you be more specific as to the workload you are referring to?
> As someone who has been playing with the scheduler for a while,
> I am interested in all such workloads.

Well, careful: depending on what "%" means, an 8-cpu machine has either
"100% max" or "800% max".

So are we talking about "we spend 40-60% of all CPU cycles in the
scheduler" or are we talking about "we spend 40-60% of the CPU power of
_one_ CPU out of 8 in the scheduler".

Yes, 40-60% sounds like a lot ("Wow! About half the time is spent in the
scheduler"), but I bet it's 40-60% of _one_ CPU, which really translates
to "The worst scheduler case I've ever seen under a real load spent 5-8%
of the machine CPU resources on scheduling".

And let's face it, 5-8% is bad, but we're not talking "half the CPU power"
here.
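
The conversion is just a division by the number of CPUs; a tiny sketch
(the function name is made up for this example):

```c
#include <assert.h>

/* Convert "percent of one CPU" into "percent of the whole machine". */
double machine_share(double percent_of_one_cpu, unsigned ncpus)
{
    assert(ncpus > 0);
    return percent_of_one_cpu / ncpus;
}
```

On an 8-cpu box, machine_share(40.0, 8) is 5.0 and machine_share(60.0, 8)
is 7.5, which is where the 5-8% range comes from.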

		Linus


Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!news.tele.dk!
small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!ifi.uio.no!
internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Tue, 18 Dec 2001 09:50:05 -0800 (PST)
From: Davide Libenzi <davi...@xmailserver.org>
X-X-Sender: dav...@blue1.dev.mcafeelabs.com
To: Linus Torvalds <torva...@transmeta.com>
cc: Mike Kravetz <krav...@us.ibm.com>, Alan Cox <a...@lxorguk.ukuu.org.uk>,
        Rik van Riel <r...@conectiva.com.br>,
        Kernel Mailing List <linux-ker...@vger.kernel.org>
Subject: Re: Scheduler ( was: Just a second ) ...
In-Reply-To: <Pine.LNX.4.33.0112180918340.2867-100000@penguin.transmeta.com>
Original-Message-ID: <Pine.LNX.4.40.0112180940400.1591-100000@blue1.dev.mcafeelabs.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 18 Dec 2001 17:49:35 GMT
Message-ID: <fa.lvqmq9v.s7e9ol@ifi.uio.no>
References: <fa.oaqddvv.h4643b@ifi.uio.no>
Lines: 49

On Tue, 18 Dec 2001, Linus Torvalds wrote:

>
> On Tue, 18 Dec 2001, Mike Kravetz wrote:
> > On Tue, Dec 18, 2001 at 02:09:16PM +0000, Alan Cox wrote:
> > > The scheduler is eating 40-60% of the machine on real world 8 cpu workloads.
> > > That isn't going to go away by sticking heads in sand.
> >
> > Can you be more specific as to the workload you are referring to?
> > As someone who has been playing with the scheduler for a while,
> > I am interested in all such workloads.
>
> Well, careful: depending on what "%" means, an 8-cpu machine has either
> "100% max" or "800% max".
>
> So are we talking about "we spend 40-60% of all CPU cycles in the
> scheduler" or are we talking about "we spend 40-60% of the CPU power of
> _one_ CPU out of 8 in the scheduler".
>
> Yes, 40-60% sounds like a lot ("Wow! About half the time is spent in the
> scheduler"), but I bet it's 40-60% of _one_ CPU, which really translates
> to "The worst scheduler case I've ever seen under a real load spent 5-8%
> of the machine CPU resources on scheduling".
>
> And let's face it, 5-8% is bad, but we're not talking "half the CPU power"
> here.

Linus, you're plain right that we could spend days debating the
scheduler load.
You have to agree that sharing a single lock/queue across multiple CPUs is,
let's say, quite crappy.
You agreed that the scheduler is easy and that the fix should not take that
much time.
You said that you're going to accept the solution that comes out of
the mailing list.
Why don't we start talking about some solutions and code?
Starting from a basic architecture down to the implementation.
Alan and Rik are quite "unloaded" now, what do you think?



- Davide



Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Tue, 18 Dec 2001 16:34:57 +0100 (CET)
From: deg...@fhm.edu
Reply-To: deg...@fhm.edu
Subject: Re: Scheduler ( was: Just a second ) ...
To: a...@lxorguk.ukuu.org.uk
Cc: linux-ker...@vger.kernel.org
In-Reply-To: <E16GKvk-0007Sc-00@the-village.bc.nu>
MIME-Version: 1.0
Content-Type: TEXT/plain; charset=us-ascii
Original-Message-Id: <20011218164152.1E4835A3E@Nicole.fhm.edu>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 18 Dec 2001 18:05:59 GMT
Message-ID: <fa.gm6glbv.a5olir@ifi.uio.no>
References: <fa.gb555vv.1858phf@ifi.uio.no>
Lines: 24

On 18 Dec, Alan Cox wrote:

> The scheduler is eating 40-60% of the machine on real world 8 cpu
> workloads. That isn't going to go away by sticking heads in sand.

What about a CONFIG_8WAY which, if set, activates a scheduler that
performs better on such nontypical machines? I see and understand
both sides' arguments, yet I fail to see where the real problem is
with having a scheduler that just kicks in _iff_ we're running the
kernel on a nontypical kind of machine.
This would keep the straightforward scheduler Linus is defending
for the single processor machines while providing more performance
to heavy SMP machines by having a more complex scheduler better suited
for this task.

--
Servus,
       Daniel


Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Tue, 18 Dec 2001 10:35:47 -0800
From: Mike Kravetz <krav...@us.ibm.com>
To: deg...@fhm.edu
Cc: a...@lxorguk.ukuu.org.uk, linux-ker...@vger.kernel.org
Subject: Re: Scheduler ( was: Just a second ) ...
Original-Message-ID: <20011218103547.B1176@w-mikek2.des.beaverton.ibm.com>
Original-References: <E16GKvk-0007Sc...@the-village.bc.nu> <20011218164152.1E4835...@Nicole.fhm.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <20011218164152.1E4835A3E@Nicole.fhm.edu>; from degger@fhm.edu on Tue, Dec 18, 2001 at 04:34:57PM +0100
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 18 Dec 2001 18:41:47 GMT
Message-ID: <fa.lc94tnv.7jqf3r@ifi.uio.no>
References: <fa.gm6glbv.a5olir@ifi.uio.no>
Lines: 24

On Tue, Dec 18, 2001 at 04:34:57PM +0100, deg...@fhm.edu wrote:
> What about a CONFIG_8WAY which, if set, activates a scheduler that
> performs better on such nontypical machines?

I'm pretty sure that we can create a scheduler that works well on
an 8-way, and works just as well as the current scheduler on a UP
machine.  There is already a CONFIG_SMP which is all that should
be necessary to distinguish between the two.

What may be of more concern is support for different architectures
such as HMT and NUMA.  What about better scheduler support for
people working in the RT embedded space?  Each of these seems to
have different scheduling requirements.  Do people working on these
'non-typical' machines need to create their own scheduler patches?
OR is there some 'clean' way to incorporate them into the source
tree?

-- 
Mike

Path: archiver1.google.com!news1.google.com!sn-xit-02!supernews.com!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Tue, 18 Dec 2001 10:48:16 -0800 (PST)
From: Davide Libenzi <davi...@xmailserver.org>
X-X-Sender: dav...@blue1.dev.mcafeelabs.com
To: deg...@fhm.edu
cc: Alan Cox <a...@lxorguk.ukuu.org.uk>, lkml <linux-ker...@vger.kernel.org>
Subject: Re: Scheduler ( was: Just a second ) ...
In-Reply-To: <20011218164152.1E4835A3E@Nicole.fhm.edu>
Original-Message-ID: <Pine.LNX.4.40.0112181045040.1591-100000@blue1.dev.mcafeelabs.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 18 Dec 2001 18:55:47 GMT
Message-ID: <fa.luacqav.unk9og@ifi.uio.no>
References: <fa.gm6glbv.a5olir@ifi.uio.no>
Lines: 35

On Tue, 18 Dec 2001 deg...@fhm.edu wrote:

> On 18 Dec, Alan Cox wrote:
>
> > The scheduler is eating 40-60% of the machine on real world 8 cpu
> > workloads. That isn't going to go away by sticking heads in sand.
>
> What about a CONFIG_8WAY which, if set, activates a scheduler that
> performs better on such nontypical machines? I see and understand
> both sides' arguments, yet I fail to see where the real problem is
> with having a scheduler that just kicks in _iff_ we're running the
> kernel on a nontypical kind of machine.
> This would keep the straightforward scheduler Linus is defending
> for the single processor machines while providing more performance
> to heavy SMP machines by having a more complex scheduler better suited
> for this task.

By using a multi-queue scheduler with a global balancing policy, you can keep
the core scheduler as is and have the balancing code take care of
distributing the load.
Obviously that code is under CONFIG_SMP, so it's not even compiled on UP.
This way you have the same scheduler code running independently on each CPU,
with a lower load on each run queue and a high locality of locking.
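
The idea described above can be sketched roughly as follows. This is a minimal user-space illustration, not code from any of the patches discussed in this thread: the names runqueue, enqueue_task and balance_once, the NR_CPUS value of 8, and the local CONFIG_SMP define are all assumptions made for the example, and the per-queue locks a real kernel would need are left out.

```c
#include <stddef.h>

#define NR_CPUS 8      /* hypothetical: the 8-way machine from the thread */
#define CONFIG_SMP 1   /* hypothetical stand-in for the kernel config option */

/* One run queue per CPU. In a real kernel each queue would carry its
 * own lock, so CPUs would not contend on a single global lock. */
struct runqueue {
    int nr_running;    /* runnable tasks queued on this CPU */
};

static struct runqueue runqueues[NR_CPUS];

/* Core scheduler path: touches only the local queue. */
void enqueue_task(int cpu)
{
    runqueues[cpu].nr_running++;
}

#ifdef CONFIG_SMP
/* Global balancing policy, compiled only for SMP: move one task from
 * the busiest queue to the idlest one when they differ by more than
 * one task. Returns 1 if a task was moved, 0 otherwise. */
int balance_once(void)
{
    int busiest = 0, idlest = 0;

    for (int cpu = 1; cpu < NR_CPUS; cpu++) {
        if (runqueues[cpu].nr_running > runqueues[busiest].nr_running)
            busiest = cpu;
        if (runqueues[cpu].nr_running < runqueues[idlest].nr_running)
            idlest = cpu;
    }
    if (runqueues[busiest].nr_running - runqueues[idlest].nr_running <= 1)
        return 0;

    runqueues[busiest].nr_running--;
    runqueues[idlest].nr_running++;
    return 1;
}
#endif
```

Calling balance_once() repeatedly until it returns 0 spreads a burst of tasks queued on one CPU across all queues. On a UP build the balancing function would simply not be compiled, leaving only the local enqueue path.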




- Davide



Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
news.tele.dk!small.news.tele.dk!129.240.148.23!uio.no!nntp.uio.no!
ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Subject: Re: Scheduler ( was: Just a second ) ...
To: torva...@transmeta.com (Linus Torvalds)
Original-Date: 	Tue, 18 Dec 2001 19:17:58 +0000 (GMT)
Cc: a...@lxorguk.ukuu.org.uk (Alan Cox), r...@conectiva.com.br (Rik van Riel),
        davi...@xmailserver.org (Davide Libenzi),
        linux-ker...@vger.kernel.org (Kernel Mailing List)
In-Reply-To: <Pine.LNX.4.33.0112180854070.2867-100000@penguin.transmeta.com> 
from "Linus Torvalds" at Dec 18, 2001 09:00:47 AM
X-Mailer: ELM [version 2.5 PL6]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Original-Message-Id: <E16GPkU-0008No-00@the-village.bc.nu>
From: Alan Cox <a...@lxorguk.ukuu.org.uk>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 18 Dec 2001 19:19:12 GMT
Message-ID: <fa.gb570fv.1b4am10@ifi.uio.no>
References: <fa.o9ahevv.gk2537@ifi.uio.no>
Lines: 18

> > The scheduler is eating 40-60% of the machine on real world 8 cpu workloads.
> > That isn't going to go away by sticking heads in sand.
> 
> Did you _read_ what I said?
> 
> We _have_ patches. You apparently have your own set.

I did read that mail - but somewhat later. Right now I'm scanning l/k
every few days, no more.

As to my stuff - everything I propose that differs from ibm/davide is about
cost/speed of ordering or minor optimisations. I don't plan to compete and
duplicate work.

Path: archiver1.google.com!news2.google.com!news1.google.com!
sn-xit-02!supernews.com!news.tele.dk!small.news.tele.dk!129.240.148.23!
uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
Original-Date: 	Thu, 20 Dec 2001 01:50:36 -0200 (BRST)
From: Rik van Riel <r...@conectiva.com.br>
X-X-Sender:  <r...@imladris.surriel.com>
To: Linus Torvalds <torva...@transmeta.com>
Cc: Benjamin LaHaise <b...@redhat.com>, Alan Cox <a...@lxorguk.ukuu.org.uk>,
        Davide Libenzi <davi...@xmailserver.org>,
        Kernel Mailing List <linux-ker...@vger.kernel.org>
Subject: Re: Scheduler ( was: Just a second ) ...
In-Reply-To: <Pine.LNX.4.33.0112181508001.3410-100000@penguin.transmeta.com>
Original-Message-ID: <Pine.LNX.4.33L.0112200149330.15741-100000@imladris.surriel.com>
X-spambait: aardv...@kernelnewbies.org
X-spammeplease: 	aardv...@nl.linux.org
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Thu, 20 Dec 2001 03:52:10 GMT
Message-ID: <fa.nv36luv.t2o00n@ifi.uio.no>
References: <fa.oa9lcnv.gke5rc@ifi.uio.no>
Lines: 22

On Tue, 18 Dec 2001, Linus Torvalds wrote:

> The thing is, I'm personally very suspicious of the "features for that
> exclusive 0.1%" mentality.

Then why do we have sendfile(), or that idiotic sys_readahead() ?

(is there _any_ use for sys_readahead() ?  at all ?)

cheers,

Rik
-- 
Shortwave goes a long way:  irc.starchat.net  #swl

http://www.surriel.com/		http://distro.conectiva.com/


Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!
skynet.be!skynet.be!news.algonet.se!algonet!newsfeed1.uni2.dk!
news.net.uni-c.dk!uninett.no!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-ow...@vger.kernel.org>
X-Authentication-Warning: penguin.transmeta.com: torvalds owned process doing -bs
Original-Date: 	Wed, 19 Dec 2001 21:52:41 -0800 (PST)
From: Linus Torvalds <torva...@transmeta.com>
To: Rik van Riel <r...@conectiva.com.br>
cc: Benjamin LaHaise <b...@redhat.com>, Alan Cox <a...@lxorguk.ukuu.org.uk>,
        Davide Libenzi <davi...@xmailserver.org>,
        Kernel Mailing List <linux-ker...@vger.kernel.org>
Subject: Re: Scheduler ( was: Just a second ) ...
In-Reply-To: <Pine.LNX.4.33L.0112200149330.15741-100000@imladris.surriel.com>
Original-Message-ID: <Pine.LNX.4.33.0112192149440.19321-100000@penguin.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: 	linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Thu, 20 Dec 2001 05:55:43 GMT
Message-ID: <fa.n9lf6mv.cn0gob@ifi.uio.no>
References: <fa.nv36luv.t2o00n@ifi.uio.no>
Lines: 27


On Thu, 20 Dec 2001, Rik van Riel wrote:
> On Tue, 18 Dec 2001, Linus Torvalds wrote:
>
> > The thing is, I'm personally very suspicious of the "features for that
> > exclusive 0.1%" mentality.
>
> Then why do we have sendfile(), or that idiotic sys_readahead() ?

Hey, I expect others to do things in their tree, and I live by the same
rules: I do my stuff openly in my tree.

The Apache people actually seemed quite interested in sendfile. Of course,
that was before apache seemed to stop worrying about trying to beat
others at performance (rightly or wrongly - I think they are right
from a pragmatic viewpoint, and wrong from a PR one).

And hey, the same way I encourage others to experiment openly with their
trees, I experiment with mine.

			Linus
