Infinite work streams are the new reality of
most systems. Web servers and application servers
serve very large user populations where it is
realistic to expect infinite streams of new work.
The work never ends. Requests come in 24 hours a day,
7 days a week. That work could easily saturate
servers at 100% CPU usage.
Traditionally we have considered 100% CPU usage a bad sign.
To compensate, we build complicated infrastructure
to load balance work, replicate state, and cluster
machines.
CPUs don't get tired so you might think we would
try to use the CPU as much as possible.
In other fields we try to increase productivity by
using a resource to the greatest extent possible.
In the server world we try to guarantee a certain
level of responsiveness by forcing artificially
low CPU usage. The idea is that without spare CPU
capacity we can't respond to new work with
reasonable latency or complete existing work.
Is there really a problem with the CPU being used
100% of the time? Isn't the real problem that we use CPU
availability and task priority as a simple cognitive
shorthand for architecting a system, rather than
understanding our system's low-level work streams and using
that information to make specific scheduling decisions?
We simply don't have the tools to do anything other
than make clumsy architecture decisions based on
load balancing servers and making guesses at the
number of threads to use and the priorities for
those threads.
We could use 100% of CPU time if we could:
0. Schedule work so that explicit locking is unnecessary (though still possible). This
will help prevent deadlock and priority inversion.
1. Control how much CPU time each work item can have.
2. Decide on the relative priority of work and schedule work by
that priority.
3. Have a fairness algorithm for giving a particular level of service
to each work priority.
4. Schedule the CPU allowance for work across tasks.
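
To make the list above a little more concrete, here is a minimal sketch
of what such a scheduler might look like. It is an illustration under
stated assumptions, not a real implementation: the Scheduler class, the
"shares" table, and the idea of modeling a work item as a count of
equal-sized slices are all hypothetical names and simplifications
invented for this example. Everything runs on one thread, so no explicit
locking is needed; each item gets one slice at a time; and each priority
level gets a fixed share of slices per round, so low-priority work keeps
moving even when high-priority work never runs out.

```python
from collections import deque


class Scheduler:
    """Hypothetical single-threaded scheduler; all names are illustrative."""

    def __init__(self, shares):
        # shares: priority name -> number of work slices granted per round.
        self.shares = dict(shares)
        self.queues = {name: deque() for name in self.shares}
        self.log = []  # order in which slices actually ran

    def submit(self, priority, name, slices):
        # A work item is modeled as a count of equal-sized slices left to run.
        self.queues[priority].append([name, slices])

    def run_round(self):
        # Visit priorities from the largest share to the smallest. Each one
        # gets at most its share of slices, so no priority is starved.
        for priority in sorted(self.shares, key=self.shares.get, reverse=True):
            queue = self.queues[priority]
            budget = self.shares[priority]
            while budget > 0 and queue:
                item = queue.popleft()
                item[1] -= 1  # "run" one slice of this item
                budget -= 1
                self.log.append(item[0])
                if item[1] > 0:
                    queue.append(item)  # unfinished: back of its own queue

    def run(self):
        # Keep the CPU busy for as long as any queue still holds work.
        while any(self.queues.values()):
            self.run_round()
        return self.log


# Example: high-priority work gets 3 slices per round, low-priority gets 1.
sched = Scheduler({"high": 3, "low": 1})
sched.submit("high", "h1", 4)
sched.submit("low", "l1", 2)
sched.submit("high", "h2", 2)
order = sched.run()
```

In this sketch the CPU is never idle while work remains, yet the
low-priority item still gets a slice in every round. A real system would
need preemption or cooperative yielding to enforce slice lengths, which
this example simply assumes.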
