CPU scheduling (by alaric)
I came across the original paper on Lottery Scheduling.
What was all the fuss about? All it boils down to is using a weighting assigned to each process to divide time between CPU-bound processes at the same priority level. Back when I was a kid and wrote my first preemptive multitasking kernel, that's what I did, because being a lame MS-DOS programmer I hadn't thought of blocked processes being removed from the run queue, and I thought that 'priority' must be such a weighting.
Anyway, it got me thinking. Lottery scheduling is fine for long-running non-real-time processes at the same priority. A user running a few background computational tools such as distributed.net and SETI@Home would benefit from being able to choose the balance between the two at will, sure. But for event-driven tasks like a GUI frontend, a classic priority scheme would still win, if people bothered to set the priorities correctly. As it stands, unless they explicitly make a process 'nice', they usually just leave it to the OS to dynamically assign priority based upon the processes' I/O behaviour, using adaptive heuristics that take a while to respond to changes in the behaviour anyway.
And the inefficiencies of this approach are often seen when some heavyweight background computation causes your GUI to become unresponsive... So, I wondered to myself, how can one combine the best of these two approaches for ARGON?
There are various types of task a computer is called upon to perform, from the viewpoint of a scheduler, each having several variables.
- Hard real time event handlers. These are pieces of code that should be invoked when some external event occurs; they must complete within a deadline. As such, they need to be written so that their maximum actual run time is known. The variables are the maximum event rate the system is expected to support, the deadline (the maximum amount of real time after an event occurs within which the system must complete the event handler), and the maximum amount of CPU time the handler will require to complete. The difference between the last two tells the scheduler how long it can afford to delay executing the handler, while the maximum event rate can be used by the scheduler to reject the task creation if it cannot meet the requirement (see the admission-test sketch after this list) - a task that takes half a second of CPU time to complete, running ten times a second, on a single processor machine, would not be possible.
- Hard real time pollers. These are almost identical to event handlers - and can be considered handlers of timer interrupts - but the scheduler may be able to benefit from knowing that they are regular events, and reserve time slots for them in advance. The variables are the time the task should first run, then the time period between future runs, the deadline, and the CPU time the handler will require to complete.
- Soft real time versions of the above two. They are "soft" in that the requirement for the deadline to be met is loosened; the system may occasionally fail to meet deadlines. In practice, this means that the scheduler need not refuse tasks that it cannot guarantee to run within their deadlines in future. This class is necessary because deciding whether a set of tasks with hard deadlines can all be scheduled is NP-complete; the scheduler has to use a heuristic that errs on the safe side when deciding whether to reject a task, so if all tasks were considered to have hard real time constraints, the system would reject a lot of tasks it could have made a reasonable job of - and then spend a lot of time idle. Since the system may find itself in situations where several soft real time tasks all need the CPU, and it will not be possible for them all to meet their deadlines due to limited CPU time being available, they need an extra scheduler variable: priority. This enforces an ordering between tasks. In a situation where a number of tasks cannot all be serviced before their deadlines, the lowest priority tasks will be the ones that are delayed.
- Non real time event handlers. These are event handlers where the system merely has to make its best effort to complete the handler as soon as possible; there is no hard deadline. This removes the constraint on the handler being able to complete in a pre-determined maximum time interval. If no real time tasks need handling, then the system will perform non real time event handlers with its spare CPU time. These event handlers fall into two classes: ones that complete rapidly, generally before another event occurs, and long-running handlers. Ones that complete rapidly ought to have priority over long-running handlers, to get them out of the way, thus freeing up system queue space and getting as many jobs completed as soon as possible. Since the scheduler cannot know in advance which handlers will be short, it needs to set a timer interrupt running whenever it starts executing one of these tasks; if the timer interrupt fires before the task completes, then the task becomes a member of the next class of task (see the demotion sketch after this list). The variables for this task type are priority, used to decide which of a number of event handlers that need executing should get to run, and weighting, which is not used while the task is running as an event handler, but will be used if it runs for its entire timeslice and gets demoted to the next category.
- Non real time bulk jobs. These are jobs that will run for some time, such as large database operations. They are characterised by the fact that they need to frequently yield the CPU to short-term important tasks higher up this list. Also, several of them may be running at once; and since slow progress towards completion of the task may well be better (and cannot be worse) than waiting until other tasks complete and then running at full speed to finish the task in one go, it would seem that all such bulk jobs need to cooperate with each other. A normal priority scheme would result in the highest priority task running to completion, then lower priority tasks taking turns to run, down the priority list. Very low priority tasks, when faced with a deluge of high priority tasks, may be "starved" of CPU time and never get to run at all. Therefore, this is a type of task that, I think, does benefit from "lottery scheduling". As such, there are no priorities for bulk jobs; just a weighting variable. A bulk task with a weighting of 100, when the sum of the weightings of all the bulk tasks in existence is 1000, should get a tenth of the CPU time available for bulk jobs (see the lottery sketch after this list).
- Idle tasks. When the system really has nothing better to do, there are sometimes a few tasks that can be produced to make use of otherwise wasted CPU cycles. Of course, many architectures allow the CPU to be shut down until an event occurs, saving power. But in many systems, it is considered worth donating spare CPU time to good causes like SETI@Home. Idle tasks tend to be almost eternally runnable, rather than having a definite beginning and end; as such, rather than being spawned by event handler code, they ought to be started every time an ARGON node starts up, from a list of idle tasks associated with the node combined with a cluster-wide list. Since there may be several possible idle tasks, rather than choosing one based upon priority, again a lottery scheduler with a weighting variable is applicable.
- And if there aren't even any idle tasks available, or they're all blocked on uploading their results to the SETI@Home servers, then the system can still shut down the CPU until the next interrupt or, failing that, sit in a busy-wait loop.
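To make the rejection rule for hard real time tasks concrete, here is a minimal admission-test sketch, assuming a single CPU and tasks described only by their worst-case CPU time, maximum event rate, and deadline. The names and structure are mine for illustration; this only catches obvious over-commitment, not a full schedulability analysis.

```python
# Minimal admission-test sketch for hard real time tasks on one CPU.
# Illustrative only; not from any ARGON specification.

from dataclasses import dataclass

@dataclass
class HardTask:
    name: str
    worst_case_cpu: float   # seconds of CPU needed per event
    max_event_rate: float   # maximum events per second
    deadline: float         # seconds allowed between event and completion

    @property
    def utilisation(self) -> float:
        # Fraction of the CPU this task can demand in the worst case.
        return self.worst_case_cpu * self.max_event_rate

def admit(accepted: list[HardTask], candidate: HardTask) -> bool:
    """Accept the candidate only if, in the worst case, everything still
    fits on the single CPU: total utilisation must not exceed 1, and no
    task may need more CPU time per event than its deadline allows."""
    if candidate.worst_case_cpu > candidate.deadline:
        return False
    total = sum(t.utilisation for t in accepted) + candidate.utilisation
    return total <= 1.0

# The example from the text: half a second of CPU per event, ten events
# per second, gives a utilisation of 5.0, so it is rejected out of hand.
impossible = HardTask("too greedy", worst_case_cpu=0.5,
                      max_event_rate=10.0, deadline=1.0)
print(admit([], impossible))  # False
```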
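The demotion rule for non real time event handlers can be sketched as simple bookkeeping. In this toy model (my own, purely illustrative) a handler is a generator that yields after each chunk of work, and the scheduler charges it against an assumed timeslice; a real kernel would use a timer interrupt instead.

```python
# Sketch of "run as an event handler, demote to bulk if it overruns".
# The timeslice value and all names here are assumptions for illustration.

import time

TIMESLICE = 0.01  # seconds a handler may run before being demoted

def run_as_event_handler(handler, bulk_queue):
    """Run the handler until it finishes or exhausts its timeslice.
    If the timeslice runs out first, push the unfinished handler onto
    the bulk-job queue instead of letting it hog the event path."""
    start = time.monotonic()
    for _ in handler:                      # each iteration is one chunk of work
        if time.monotonic() - start > TIMESLICE:
            bulk_queue.append(handler)     # demote: now scheduled by weighting
            return "demoted"
    return "completed"

def quick_handler():
    yield  # finishes almost immediately, stays in the event-handler class

def slow_handler():
    while True:
        time.sleep(0.02)
        yield  # keeps asking for more CPU, so it overruns the timeslice

bulk = []
print(run_as_event_handler(quick_handler(), bulk))  # completed
print(run_as_event_handler(slow_handler(), bulk))   # demoted
print(len(bulk))                                    # 1
```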
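Finally, a minimal lottery sketch for the bulk-job (and idle) classes: each job holds a weighting, and whenever a slot of bulk CPU time is free, a winner is drawn with probability proportional to its weighting. Again the names and structure are illustrative assumptions, not an ARGON interface.

```python
# Minimal lottery draw over a set of weighted bulk jobs.

import random

def pick_bulk_job(jobs: dict[str, int]) -> str:
    """jobs maps job name -> weighting; returns the winner of one draw."""
    total = sum(jobs.values())
    ticket = random.randrange(total)       # draw a ticket in [0, total)
    for name, weight in jobs.items():
        if ticket < weight:
            return name
        ticket -= weight
    raise AssertionError("unreachable if total > 0")

# The example from the text: a job holding 100 of 1000 tickets should win
# roughly a tenth of the draws over the long run.
jobs = {"database-reindex": 100, "other-bulk-work": 900}
wins = sum(pick_bulk_job(jobs) == "database-reindex" for _ in range(10_000))
print(wins / 10_000)  # close to 0.1
```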
So can one scheduler really handle all the above types of tasks?