Latency is the amount of time it takes a process
to be given a CPU once it is made runnable. The
graph at right shows the average latency of all
processes during the last sample period.
A process may be made runnable yet still have to
wait for a free processor if
- all processors are currently busy, and
- tasks of higher or equal priority are already waiting
in front of it.
Some code paths are not optimized to search for idle
processors upon wakeup and so processes awakened by
those paths will need to wait for a rebalance to
happen (1ms or less if there is a processor idle.)
Ideally, there are always fewer processes than
processors, and every process always has a processor
waiting for it when it wants to run. In practice, this
would characterize an underutilized machine. The key
here is to keep latencies as close to zero as
possible. If you are always at zero, then perhaps
your load is mismatched to the machine.
The lower this number is, the better.
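To make the averaging concrete, here is a minimal sketch of how an average latency for one sample period could be derived from two snapshots of cumulative counters. The counter names (wait_ms, runs) are hypothetical and chosen only for illustration; they are not taken from any particular kernel interface, and the tool that produced these graphs may compute its average differently.

```python
def average_latency_ms(prev, curr):
    """Average wait per run between two snapshots of cumulative counters.

    prev and curr are dicts with hypothetical keys:
      "wait_ms": total time processes spent runnable but waiting
      "runs":    number of times a process was given a processor
    """
    runs = curr["runs"] - prev["runs"]
    if runs <= 0:
        return 0.0  # nothing ran during the sample; avoid dividing by zero
    return (curr["wait_ms"] - prev["wait_ms"]) / runs


prev = {"wait_ms": 1200.0, "runs": 400}
curr = {"wait_ms": 1700.0, "runs": 500}
print(average_latency_ms(prev, curr))  # 500 ms of waiting over 100 runs -> 5.0
```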
A runslice is the amount of time a process spends
on the processor (on average) each time it runs.
This differs from a
timeslice, which is the amount of time it is
permitted to run before being forced off the
processor. The graph at right indicates the
average runslice time of all processes during
the sample period.
The numbers here are neither inherently good
nor inherently bad, but can be used to help
characterize the load on the system. Loads which
have very short runslices are not using their
full timeslice, and so modifying the length of
those timeslices (either through kernel twiddling
or nice(1)) is unlikely to have any effect
on the system. They are voluntarily giving up the
processor, probably because they are waiting for
some other event such as a signal, semaphore, or
I/O completion. Loads which have large runslices,
on the other hand, are more compute-intensive and
may be leaving the processor involuntarily. These
types of processes may well see different behavior
if their timeslices are shortened or lengthened.
Runslice information and latency information may
be used to roughly predict the average queue
length. If processes are waiting 10ms, on average,
and running only 2ms, on average, then when a
process is queued it probably has 10/2 or 5 processes
in front of it. On average, remember!
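The estimate above is a single division. The sketch below restates it as a function; the function name is made up for illustration, and the result is only as good as the two averages fed into it.

```python
def estimated_queue_length(avg_latency_ms, avg_runslice_ms):
    """Rough number of processes ahead of a newly queued process."""
    if avg_runslice_ms <= 0:
        raise ValueError("average runslice must be positive")
    return avg_latency_ms / avg_runslice_ms


# The example from the text: waiting 10ms, running 2ms.
print(estimated_queue_length(10, 2))  # 5.0
```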
Should you wish to, the easiest way to change a
timeslice is to use nice(1).
For comparison, the normal timeslice for a process
with a nice of 0 is 100ms. It can range
from as little as 10ms to as much as 200ms. Changing
the minimum, maximum, or algorithm for calculating
it in between requires, however, an actual kernel
patch.
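As a rough illustration of how a nice value might map onto the 10ms to 200ms range, the sketch below assumes a simple linear interpolation across the nice range of -20 to 19. The linear mapping is an assumption made for this example only; the text does not say how the kernel calculates the values in between, and the real calculation may differ.

```python
MIN_TIMESLICE_MS = 10.0   # smallest timeslice quoted in the text
MAX_TIMESLICE_MS = 200.0  # largest timeslice quoted in the text


def timeslice_ms(nice):
    """Timeslice for a nice value, ASSUMING linear interpolation."""
    if not -20 <= nice <= 19:
        raise ValueError("nice must be between -20 and 19")
    span = MAX_TIMESLICE_MS - MIN_TIMESLICE_MS
    return MIN_TIMESLICE_MS + span * (19 - nice) / 39


print(timeslice_ms(19))   # the minimum
print(timeslice_ms(-20))  # the maximum
print(timeslice_ms(0))    # close to the 100ms quoted above
```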
The top graph at right shows the number of times
load_balance() was called per second.
This will range from about 5 times per second per
processor when the processor is busy, to as high as
a thousand times per second per processor when it is
idle. The number of times it is called is not, in
itself, of great interest, but if it is being called
frequently to do balancing and failing to find
processes to balance (see find_busiest_queue()),
some heuristics may need to be tuned.
load_balance() may be called from a limited
number of places. It may be called when a processor
is idle or busy. In addition, it may be called when
the scheduler realizes a processor is about to go
idle, in an attempt to bring over one or more
processes to this soon-to-be-idle processor.
This graph indicates the number of times
load_balance() was called while
the processor was already idle.
When the processor is idle, load_balance()
can be called every clock tick, or up to a thousand
times per second, per processor. (If the processor
is idle, why not go looking for jobs?) On
an 8-processor system that was completely idle, then,
you'd expect to see nearly 8000 calls per second.
If this number is lower, it indicates either
the system was not completely idle or the queue is
not being checked once per clock tick. Being
low or
high in any of these categories is not in
itself necessarily good or bad,
but it can give evidence of or confirmation
of other behavior in the system.
This graph indicates the number of times
load_balance() was called while
the processor was about to become idle.
The last graph at right indicates the number of times
load_balance() was called while
the processor was busy.
When the processor is busy, load_balance()
is called far less often than when idle; typically
about five times per second, per processor. On
an 8-processor system that was fully loaded, then,
you'd expect to see about 40 calls per second.
If this number is lower, it indicates the
processor was not that busy.
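The expected call rates quoted for idle and busy processors can be combined into a quick sanity check for a partly loaded machine. This is a sketch only: it assumes a clock tick of 1000 per second and uses the typical busy rate of five calls per second given in the text, and the function name is invented for the example.

```python
IDLE_CALLS_PER_SEC = 1000  # up to once per clock tick (assumes 1000 ticks/s)
BUSY_CALLS_PER_SEC = 5     # typical busy rate quoted in the text


def expected_load_balance_calls(idle_cpus, busy_cpus):
    """Rough expected load_balance() calls per second for the system."""
    return idle_cpus * IDLE_CALLS_PER_SEC + busy_cpus * BUSY_CALLS_PER_SEC


print(expected_load_balance_calls(8, 0))  # 8000: completely idle 8-way system
print(expected_load_balance_calls(0, 8))  # 40: fully loaded 8-way system
print(expected_load_balance_calls(2, 6))  # 2030: two idle, six busy
```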
pull_task() is called to move exactly
one task from one runqueue to another. It is
presumed we are pulling from another
queue to our own, as this simplifies the
process considerably. The graph at right shows
the number of times pull_task() was called.
pull_task() can be called when
the processor is idle, newly idle, or busy.
This number is neither good nor bad
by itself, but since it represents the end result
of a complex decision-making process, knowing how
many tasks were pulled might be very helpful in
determining how successful other functions are.
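One way to use this count, offered here as a hypothetical metric rather than anything the graphs themselves compute, is to relate it to the number of load_balance() calls in the same period. A very small ratio suggests that most balancing attempts found nothing to move.

```python
def pulls_per_balance_call(pull_task_calls, load_balance_calls):
    """Average number of tasks pulled per load_balance() call."""
    if load_balance_calls == 0:
        return 0.0  # no balancing attempts in this sample
    return pull_task_calls / load_balance_calls


# Mostly idle system: many balancing attempts, very few pulls.
print(pulls_per_balance_call(12, 8000))
# Busy system: few attempts, a good share of them move a task.
print(pulls_per_balance_call(20, 40))
```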
The
graph at right shows the number of times
pull_task() was called while the
processor was idle.
pull_task() can be called when
the processor is idle, newly idle, or busy. The
graph at right shows the number of times
pull_task() was called while the
processor was busy.
pull_task() can be called when
the processor is idle, newly idle, or busy. The
graph at right shows the number of times
pull_task() was called while the
processor was newly idle.
The graph at right indicates the number of times
sched_migrate_task()
(or migrate_to_cpu(), in earlier versions)
was called. This
function can only be called when a process execs.
The theory is that when a process execs, it is
giving up its previous image and we can be
untroubled about such things as whether it is
likely that the memory it wants to use is already
in cache. Upon exec, it will require new text and
new data pages anyway.
Unless the system is madly creating processes,
this is unlikely to be called more than a few
times per second per processor, and in many
benchmarks it's not unusual to see it tail off
to zero after initialization completes and a few
core processes are started.
This number is neither good nor bad,
but it helps characterize the load and possibly
add more meaning to other data.
load_balance() may be called from a limited
number of places. It may be called when a processor
is idle, or busy. In some kernels, it may also be
called from schedule() when the scheduler
realizes a processor is about to go idle, in an
attempt to bring over one or more processes to this
soon-to-be-idle processor.
This graph indicates the number of times
load_balance() was called while
the processor was idle.
When the processor is idle, load_balance()
is called every clock tick, or up to a thousand
times per second, per processor. (If the processor
is idle, why not go looking for jobs?) On
an 8-processor system that was completely idle, then,
you'd expect to see about 8000 calls per second.
If this number is lower, it indicates the
processor was busier. Being low or
high in itself isn't necessarily good or bad,
but it can give evidence of or confirmation
of other behavior in the system.
The graph at right shows the number
of tasks moved by active_load_balance(). The
purpose of active_load_balance() is
described alongside the graph of
calls to active_load_balance().
The lower this number is, the better.
The graph at right shows the
number of calls to active_load_balance().
Normally tasks are pulled from other
processors to the processor doing the
balancing. This function is called by the
migration threads, which are used when an
overburdened processor realizes that not
enough of the other processors are stealing its
tasks -- in essence doing a "push" rather than a
"pull". This is usually a complex procedure, and
is a stopgap measure to prevent imbalance from
existing too long on the system. Ideally, the
other balancing algorithms are doing a good job
and this gets called very infrequently.
The lower this number is, the better.
If it is high (more than a few times per
second), other balancing algorithms
may need to be tuned.
When load_balance() is called, it will
call find_busiest_queue() to determine if
there is any queue busier than itself from which
it should pull tasks. If find_busiest_queue()
is successful, it also reports the imbalance
that it found.
In releases prior to 2.6.6, if this queue has 5
processes waiting to run and another queue has 7,
find_busiest_queue() will indicate an
imbalance of 2. (Note that the
actual number of tasks that need to move to create
balance is half that, or 1.) In releases
subsequent to 2.6.6, and in -mm trees after 2.6.2,
the "imbalance" is actually the number of tasks
to move -- that is, it's already divided by two.
In the above example, it would return 1, not 2.
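The two conventions can be compared side by side. The sketch below models only the arithmetic described above, not the kernel's actual code; the function name and the halved flag are invented for the example, and rounding an odd difference down is an assumption.

```python
def reported_imbalance(own_queue_len, other_queue_len, halved):
    """Imbalance reported for a busier remote queue.

    halved=False models the older convention (raw difference);
    halved=True models the newer one (number of tasks to move).
    """
    difference = other_queue_len - own_queue_len
    if difference <= 0:
        return 0  # the other queue is not busier; nothing to pull
    # Rounding an odd difference down is an assumption of this sketch.
    return difference // 2 if halved else difference


print(reported_imbalance(5, 7, halved=False))  # 2
print(reported_imbalance(5, 7, halved=True))   # 1
```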
Since load_balance() is called so frequently
when the machine is idle, counting a failure of
find_busiest_queue() as "zero imbalance"
would quickly run the numbers uselessly close to
zero. So what is graphed here to the right is the
average imbalance when there was an
imbalance found.
As an exceptional case, if
load_balance() was not called during
the sample or was but never detected an imbalance,
a value of zero was entered on the graph rather than
creating discontinuities.
The lower this number, the better.
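The averaging rule described above, including the zero entered when no imbalance was found, can be sketched as follows. The representation of a sample period as a list with one entry per load_balance() call is an assumption made for the example.

```python
def average_imbalance(samples):
    """Mean of the imbalances actually found during a sample period.

    samples holds one entry per load_balance() call: the imbalance
    found, or 0 when find_busiest_queue() found nothing.
    """
    found = [s for s in samples if s > 0]
    if not found:
        return 0.0  # plotted as zero to avoid a gap in the graph
    return sum(found) / len(found)


print(average_imbalance([0, 0, 2, 0, 4, 0, 0, 0]))  # 3.0, not 0.75
print(average_imbalance([0, 0, 0]))                 # 0.0
print(average_imbalance([]))                        # 0.0
```

Counting the failures as zeros would have given 0.75 for the first sample, which is the dilution the text warns about.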
The graph at right shows the number of times
sys_sched_yield() was called. This
function instructs the scheduler to take the
caller off the processor. How long it should
be off the processor is implementation-dependent,
and programs using this function to create very short
delays usually need to be retuned after system
modifications, much to the maintainers' chagrin.
Because of its unpredictability as a substitute
for a quick delay and the subsequent need to
constantly retune applications utilizing it,
using this function is to be discouraged.
Accordingly, lower numbers are better,
with zero being the best score possible.
Nevertheless, some applications and libraries
still use it (notably many Java implementations);
these applications and libraries may benefit from
a retuning from time to time as the operating
system changes.
The graph at right shows the number of times
schedule() was called. This
function is the heart of the scheduler and is
called every time a scheduling decision needs to
be made or possibly reevaluated --
that is, every time a sleeping process wakes up or
a running process goes to sleep. It's also called
at many other times when priorities may have changed
and the "currently running process" may need to
be changed. Systems with low
runslices may see a correspondingly higher
frequency of schedule() calls, as more
jobs are being switched in and out per second.
This number is neither good
nor bad, but does help characterize the
load when interpreting other results. Although
it's been written carefully, schedule()
is not a cheap function to call and any
modifications at either the kernel or user
level that result in fewer calls to schedule()
will probably improve performance.
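The link between short runslices and frequent schedule() calls noted above can be put into rough numbers. This sketch assumes each busy processor runs back-to-back runslices with no idle gaps and counts only the call that ends each runslice, so it is an illustrative estimate rather than a prediction; the function name is made up for the example.

```python
def approx_switches_per_sec(avg_runslice_ms, busy_cpus=1):
    """Rough estimate of runslice-ending schedule() calls per second."""
    if avg_runslice_ms <= 0:
        raise ValueError("average runslice must be positive")
    return busy_cpus * 1000.0 / avg_runslice_ms


print(approx_switches_per_sec(2))     # 500.0 per processor
print(approx_switches_per_sec(100))   # 10.0 per processor
print(approx_switches_per_sec(2, 8))  # 4000.0 across 8 busy processors
```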
The graph at right shows the number of times
sched_balance_exec() was called. This
function is called each time a process does an
exec(). When possible, it will call
sched_migrate_task() to move the task
to a less busy CPU, since at exec time the
task has no resident text or data pages on this
(or any) processor. See the description of
sched_migrate_task()
for a more thorough explanation.
This number, like the count of
sched_migrate_task(), is neither good
nor bad, but does help characterize the
load when interpreting other results.