COROS II: Blood and Bone

In the last installment of this COROS series, I built up various uses for coroutines, ending with a primitive scheduling tool. I then hinted at wrapping this up in an API that conceals the gorier elements of micromanaging coroutines.

This is that API.

EVQ

Where the CO (coroutine) component of COROS is the heart of the RTOS, EVQ is its circulatory system and skeleton. It is, at its simplest, an event queue mechanism, but one with some facilities that will permit (as is to be seen in later articles) some surprisingly sophisticated constructs.

The main job of EVQ, as with any event queue, is to decouple components in a system, having them communicate solely in terms of events. Events can be “wake up” calls (for example to flash a LED) or they can be sophisticated, state-carrying and -modifying messages. Similarly event handlers can be simple callback functions (familiar to anybody who has ever programmed Win16 or “modern” web frameworks), or they can be full-blown, independent coroutines using the CO component.

The mile-high view

Events are placed into an event queue. An application can have many event queues. Event queues are typically (but not necessarily) placed into a tree-structured event environment. An event posted to a queue is processed whenever any queue higher in the chain is serviced; conversely, servicing an event queue processes all triggered events in all of its child queues.

This tree-structured approach allows standalone, plug-in components for ease of porting: components in a system only need to know of their own event queue and to have a parent queue passed in upon initialization. A form of priority mechanism can be implemented through the judicious use of child queues as well.

Event queues in EVQ are first class entities and can be passed around a system, even in messages, plausibly enabling Pi Calculus-like operations.
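To make the servicing rule concrete, here is a toy model of the tree. It is not EVQ's implementation: the real evq_queue_t is opaque, so the toy_queue layout, the fixed child limit, and the order of traversal (own events first, then children) are all assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: names and layout are hypothetical, not EVQ internals. */
#define TOY_MAX_CHILDREN 4

typedef struct toy_queue {
    struct toy_queue *parent;
    struct toy_queue *children[TOY_MAX_CHILDREN];
    size_t child_count;
    int pending;               /* triggered events waiting in this queue */
} toy_queue;

static void toy_attach(toy_queue *parent, toy_queue *child)
{
    assert(parent->child_count < TOY_MAX_CHILDREN);
    child->parent = parent;
    parent->children[parent->child_count++] = child;
}

/* Servicing a queue drains its own events and then every child queue,
 * recursively. Returns the number of events handled. */
static int toy_service(toy_queue *q)
{
    int handled = q->pending;
    q->pending = 0;
    for (size_t i = 0; i < q->child_count; i++)
        handled += toy_service(q->children[i]);
    return handled;
}
```

Servicing the root reaches every attached queue, while servicing a child touches only that child's subtree, which is what lets a component run its own events ahead of everyone else's.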

The EVQ API

The EVQ API is of necessity more involved than the CO API. That being said, it is not a particularly large one, and it is divided into five related groups:

Data types

Aside from the data types from CO (co_t and co_function), the following data types are used in EVQ:

typedef uint32_t delay_ms;
typedef uint32_t tick_ms; 
typedef struct _evq_queue_t *evq_queue_t;
typedef struct _evq_event_t *evq_event_t;
typedef void (*evq_callback)(void *);

delay_ms and tick_ms are simple millisecond counters; the separate names exist primarily to make the intent of a variable clear. The first signals a duration for delays or repetitions. The second is a snapshot of the current time.
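One practical consequence of using 32-bit millisecond counters is that the tick value rolls over after roughly 49.7 days. A minimal sketch of how the two types can be combined safely follows; the helper names elapsed_since() and is_due() are hypothetical and are not part of the EVQ API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t delay_ms;
typedef uint32_t tick_ms;

/* Duration between two snapshots. The result is reduced modulo 2^32 by the
 * cast, so it stays correct even when the counter has rolled over between
 * `start` and `now`. */
static delay_ms elapsed_since(tick_ms start, tick_ms now)
{
    return (delay_ms)(now - start);
}

/* True once at least `delay` milliseconds have passed since `start`. */
static bool is_due(tick_ms start, delay_ms delay, tick_ms now)
{
    return elapsed_since(start, now) >= delay;
}
```

Comparing elapsed durations rather than absolute tick values is what keeps the check valid across a rollover.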

evq_queue_t is the type of an event queue's opaque handle. It contains the details of an event queue including its parent, its children, the head of the events list, and other such information needed for running the queue.

evq_event_t contains the information about an event including its trigger time, its period for repeating events, the nature of its callback (coroutine or function), data that is associated with the event, its containing queue, its chaining information, and other data associated with moving the event smoothly through the system.

evq_callback is the type of the function used in simple function callbacks. (Coroutine callbacks are, naturally, of the co_t type.)

Event queue management

Event queue management is done with the following functions:

evq_queue_t evq_create_queue(void);
void evq_attach_queue(evq_queue_t parent, evq_queue_t child);
void evq_detach_queue(evq_queue_t child);

Event queues are created with evq_create_queue() and once created cannot be destroyed. Children are attached to a parent via evq_attach_queue(). A child can remove itself from its parent queue using evq_detach_queue() if needed, either to pause its own processing (say in preparation for a low-power phase of operation) or to attach itself somewhere else in the tree.

Not all queues are necessarily attached at all times, or indeed ever. Values of evq_queue_t can be used as regular data, passed around to functions or through events, and can be directly processed at need, permitting a style of operation that is remarkably flexible even while it is carefully managed.
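As a sketch of how a plug-in component might use these three calls, consider a hypothetical sensor module that owns its queue and is handed a parent at start-up. The three evq_* bodies below are stand-ins so that the example compiles on its own; the real internals are not shown in this article, and the sensor_* names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct _evq_queue_t *evq_queue_t;

/* Stand-in internals (assumption): just enough state to track the parent. */
struct _evq_queue_t { evq_queue_t parent; };

evq_queue_t evq_create_queue(void) { return calloc(1, sizeof(struct _evq_queue_t)); }
void evq_attach_queue(evq_queue_t parent, evq_queue_t child) { child->parent = parent; }
void evq_detach_queue(evq_queue_t child) { child->parent = NULL; }

/* Hypothetical component: it knows only its own queue. */
static evq_queue_t sensor_queue;

void sensor_init(evq_queue_t parent)
{
    sensor_queue = evq_create_queue();
    assert(sensor_queue != NULL);
    evq_attach_queue(parent, sensor_queue);
}

/* Detaching pauses the component's event processing, e.g. before sleeping. */
void sensor_sleep(void) { evq_detach_queue(sensor_queue); }

/* Re-attaching resumes it, possibly under a different parent. */
void sensor_wake(evq_queue_t parent) { evq_attach_queue(parent, sensor_queue); }
```

Nothing in the component refers to the application's queue layout, which is the porting benefit described earlier.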

Persistent event management

Persistent events are those events which tend to be created statically, or at least long-term, and are then posted at need. Persistent events are most useful in the context of repeated events that cannot be periodically scheduled. An example of such a use case is an event to signal that a DMA buffer has been filled. Rather than adding the overhead of creating an event to the DMA's interrupt handler, a well-designed system would create the event once, up front, and then post it manually each time it is needed.

Persistent events are managed with the following functions:

evq_event_t evq_create_fn_event(delay_ms delay, delay_ms period, evq_callback, void *);
evq_event_t evq_create_co_event(delay_ms delay, delay_ms period, co_t, void *);
void evq_post_event(evq_queue_t, evq_event_t, void *);
void evq_cancel_event(evq_event_t);
delay_ms evq_query_time_remaining(evq_event_t);
void evq_destroy_event(evq_event_t);

There are two functions to create events: one that creates an event intended to call a function on firing (evq_create_fn_event()), and one that creates an event intended to resume a coroutine on firing (evq_create_co_event()). The delivery mechanism aside, the two are otherwise functionally identical.

Events can be of three kinds:

- immediate
- deferred
- periodic

An immediate event is ready to be fired the moment it is put in the event queue. This is signalled by providing a delay and period of 0.

A deferred event is delayed by a provided number of milliseconds. This is signalled by providing a delay of some non-zero value and a period of 0.

Both immediate and deferred events are automatically cancelled upon being fired and will not be in the event queue afterward.

A periodic event is a long-lived event that has a non-zero period value. It will first fire after its delay in milliseconds has passed (0 meaning, naturally, immediately), and will then re-fire every period milliseconds thereafter until cancelled or destroyed.
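The three kinds are fully determined by the delay and period arguments, which can be summarized in a few lines. The event_kind enum and both helper functions below are hypothetical names used only to restate the rules above; they are not part of EVQ.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t delay_ms;

typedef enum { EVENT_IMMEDIATE, EVENT_DEFERRED, EVENT_PERIODIC } event_kind;

/* Any non-zero period makes the event periodic, whatever its delay.
 * Otherwise the delay alone separates deferred from immediate. */
static event_kind classify_event(delay_ms delay, delay_ms period)
{
    if (period != 0)
        return EVENT_PERIODIC;
    return (delay != 0) ? EVENT_DEFERRED : EVENT_IMMEDIATE;
}

/* Offset in milliseconds, measured from the moment of posting, of the
 * n-th firing (n = 0 is the first). Only n = 0 is meaningful for
 * immediate and deferred events, since they fire once. */
static delay_ms fire_offset(delay_ms delay, delay_ms period, uint32_t n)
{
    return delay + n * period;
}
```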

Events are placed on the event queue for processing at their allotted time with the evq_post_event() function. Note that events can be created with a data pointer and that when posted a data pointer is also provided. If evq_post_event() supplies a non-NULL data, this overrides the data provided on creation. If it passes NULL, then the data provided on creation is used instead.

The amount of time an event has left before firing can be queried with evq_query_time_remaining(). An event can be removed from the queue with evq_cancel_event(); evq_destroy_event() both removes it from the queue and reclaims the memory it holds. (Note that immediate and deferred events are automatically cancelled, but not destroyed.)
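The DMA use case mentioned earlier might look something like the following. This is a sketch under heavy assumptions: the two evq_* functions are toy stand-ins that handle immediate events only, toy_drain() takes the place of a real event pump, and the dma_* names are invented. Whether the real evq_post_event() may be called from an interrupt handler is not stated in this article, so treat that as an assumption too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t delay_ms;
typedef struct _evq_queue_t *evq_queue_t;
typedef struct _evq_event_t *evq_event_t;
typedef void (*evq_callback)(void *);

/* Toy stand-ins (assumption): an unordered list of immediate events. */
struct _evq_event_t { evq_callback fn; void *data; void *active; evq_event_t next; };
struct _evq_queue_t { evq_event_t head; };

evq_event_t evq_create_fn_event(delay_ms delay, delay_ms period, evq_callback fn, void *data)
{
    (void)delay; (void)period;              /* timing is ignored in this toy */
    evq_event_t e = calloc(1, sizeof *e);
    if (e != NULL) { e->fn = fn; e->data = data; }
    return e;
}

void evq_post_event(evq_queue_t q, evq_event_t e, void *data)
{
    /* Non-NULL data at post time overrides the data given at creation. */
    e->active = (data != NULL) ? data : e->data;
    e->next = q->head;                      /* toy: must not be posted twice before draining */
    q->head = e;
}

/* Toy pump: unlink each event (the automatic cancel), then fire it. */
static void toy_drain(evq_queue_t q)
{
    while (q->head != NULL) {
        evq_event_t e = q->head;
        q->head = e->next;
        e->fn(e->active);
    }
}

/* Hypothetical application code. */
static struct _evq_queue_t dma_queue;
static evq_event_t dma_done_event;
static uint8_t dma_buffer[64];
static unsigned buffers_handled;
static void *last_buffer;

static void on_dma_done(void *data) { last_buffer = data; buffers_handled++; }

/* Created once at start-up, so the interrupt path never allocates. */
void dma_setup(void) { dma_done_event = evq_create_fn_event(0, 0, on_dma_done, dma_buffer); }

/* The handler only posts; NULL means "use the buffer given at creation". */
void dma_irq_handler(void) { evq_post_event(&dma_queue, dma_done_event, NULL); }
```

Because the event survives firing, the same handle can be posted again on the next interrupt, optionally with a different buffer pointer.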

Ad-hoc event management

Ad-hoc events are dynamically-created events that are queued immediately upon creation. There are six functions for dealing with them, in two groups—one for coroutine events and one for function callback events:

/* callback events */
evq_event_t evq_queue_now(evq_queue_t, evq_callback, void *);
evq_event_t evq_queue_in(evq_queue_t, evq_callback, void *, delay_ms delay);
evq_event_t evq_queue_every(evq_queue_t, evq_callback, void *, delay_ms period);
/* coroutine events */
evq_event_t evq_resume_now(evq_queue_t, co_t, void *);
evq_event_t evq_resume_in(evq_queue_t, co_t, void *, delay_ms delay);
evq_event_t evq_resume_every(evq_queue_t, co_t, void *, delay_ms period);

The three functions in each group correspond to immediate, deferred, and periodic events, with the proviso that the periodic event is first deferred by its period. This is the equivalent of making an event with evq_create_*_event() using the same value for both delay and period.

In all cases, as soon as the function is called, behind the scenes the event is created from the provided data and inserted into the queue in a single operation. Unlike persistent events, however, ad-hoc events that are immediate or deferred will be destroyed upon firing, not merely cancelled. (Periodic ad-hoc events will, naturally, remain in being until manually destroyed.)
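The correspondence between ad-hoc and persistent events can be sketched as thin wrappers. The sketch_* functions below are hypothetical and are not how EVQ necessarily implements evq_queue_in() or evq_queue_every(); the two evq_* bodies are recording stubs so the example compiles, and the destroy-on-fire behavior of ad-hoc events is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t delay_ms;
typedef struct _evq_queue_t *evq_queue_t;
typedef struct _evq_event_t *evq_event_t;
typedef void (*evq_callback)(void *);

/* Recording stubs (assumption): they only remember what they were given. */
struct _evq_event_t { delay_ms delay, period; evq_callback fn; void *data; int posted; };

evq_event_t evq_create_fn_event(delay_ms delay, delay_ms period, evq_callback fn, void *data)
{
    evq_event_t e = calloc(1, sizeof *e);
    if (e != NULL) { e->delay = delay; e->period = period; e->fn = fn; e->data = data; }
    return e;
}

void evq_post_event(evq_queue_t q, evq_event_t e, void *data)
{
    (void)q;
    if (data != NULL) e->data = data;
    e->posted = 1;
}

/* Deferred: fires once after `delay`, so the period is 0. */
static evq_event_t sketch_queue_in(evq_queue_t q, evq_callback fn, void *data, delay_ms delay)
{
    evq_event_t e = evq_create_fn_event(delay, 0, fn, data);
    if (e != NULL) evq_post_event(q, e, NULL);
    return e;
}

/* Periodic: the first firing is deferred by one period. */
static evq_event_t sketch_queue_every(evq_queue_t q, evq_callback fn, void *data, delay_ms period)
{
    evq_event_t e = evq_create_fn_event(period, period, fn, data);
    if (e != NULL) evq_post_event(q, e, NULL);
    return e;
}

/* Hypothetical callback used to exercise the wrappers. */
static void blink(void *unused) { (void)unused; }
```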

The main advantage to ad-hoc events is ease of use. The main disadvantage to them is the prospect of heap fragmentation as events and event data of various sizes are constantly allocated and freed, leaving open the long-term possibility of a system failing because no available fragment is large enough to hold a new event. (There are mitigation strategies available and, indeed, COROS uses the one outlined.)
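One widely used mitigation for this kind of fragmentation is a pool of fixed-size blocks, sketched below. I am not claiming this is the strategy COROS adopts; the block size, block count, and pool_* names are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BLOCK_SIZE  32   /* hypothetical: large enough for one event */
#define POOL_BLOCK_COUNT 8    /* hypothetical: maximum live events */

/* A free block stores the link to the next free block inside itself,
 * so the pool needs no bookkeeping memory beyond the blocks. */
typedef union pool_block {
    union pool_block *next_free;
    unsigned char payload[POOL_BLOCK_SIZE];
} pool_block;

static pool_block pool_storage[POOL_BLOCK_COUNT];
static pool_block *pool_free_list;

/* Thread every block onto the free list. */
static void pool_init(void)
{
    pool_free_list = NULL;
    for (size_t i = 0; i < POOL_BLOCK_COUNT; i++) {
        pool_storage[i].next_free = pool_free_list;
        pool_free_list = &pool_storage[i];
    }
}

/* Pop a block, or return NULL when the pool is exhausted. */
static void *pool_alloc(void)
{
    pool_block *b = pool_free_list;
    if (b != NULL)
        pool_free_list = b->next_free;
    return b;
}

/* Push a block back; `p` must have come from pool_alloc(). */
static void pool_free(void *p)
{
    pool_block *b = p;
    b->next_free = pool_free_list;
    pool_free_list = b;
}
```

Since every block is the same size, any freed block can satisfy any later request, so the pool cannot fragment. The cost is a hard upper bound on live events and some wasted space per block.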

Event processing

The core of an application in an event-driven system like COROS is always the event pump. A typical application will initialize hardware, initialize any software components and then simply process events, letting control flow and activity be governed by events flowing through the system. To this end there are three functions reflecting three different strategies for managing events in applications:

void evq_process(evq_queue_t);
void evq_process_for(evq_queue_t, delay_ms);
void evq_process_one_tick(evq_queue_t);

The first of these should never exit. (If it does this is a gross failure and should lead to an application shutdown and/or restart!) evq_process() processes the provided queue and does nothing else in an endless loop.

evq_process_for() does the same as evq_process() but will return after a given duration. This permits the mainline code to periodically perform global maintenance duties like feeding a watchdog timer or the like, giving, in effect, an “idle task” for the application.

evq_process_one_tick() is even more extreme. It will process all currently-due events (if any) and then immediately return. This is provided for more involved or time-sensitive global duties and for hybrid systems which may not be able to push everything they need into the event queue (because of, say, a third-party library). This is also the likely function to be used when processing independent event queues or when children elevate themselves in priority by only processing their own events.
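The "idle task" pattern built on evq_process_for() might be sketched as follows. The evq_process_for() body here is a counting stub rather than the real pump, and watchdog_feed(), IDLE_SLICE_MS, and the app_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t delay_ms;
typedef struct _evq_queue_t *evq_queue_t;

/* Counting stub (assumption): the real function pumps events for `duration`. */
static unsigned slices_run;
void evq_process_for(evq_queue_t q, delay_ms duration)
{
    (void)q; (void)duration;
    slices_run++;
}

/* Hypothetical board-support call. */
static unsigned watchdog_feeds;
static void watchdog_feed(void) { watchdog_feeds++; }

#define IDLE_SLICE_MS 100u   /* must be shorter than the watchdog timeout */

/* One pass of the main loop: pump events for a while, then do the
 * global housekeeping that lives outside the event system. */
static void app_run_slice(evq_queue_t root)
{
    evq_process_for(root, IDLE_SLICE_MS);
    watchdog_feed();
}

/* On a real target this would be `for (;;) app_run_slice(root);`.
 * The bounded version exists only so the sketch can terminate. */
static void app_run_slices(evq_queue_t root, unsigned count)
{
    while (count-- > 0)
        app_run_slice(root);
}
```

A hybrid system would use the same shape with evq_process_one_tick() in place of evq_process_for(), interleaving the event pump with whatever a third-party library needs polled.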

User-supplied functions

tick_ms evq_get_tick(void);

The way that various systems get timer ticks is too varied for EVQ to supply one of its own. Instead EVQ provides a weakly-linked implementation that asserts if called. To use EVQ a user must provide their own implementation of this function. This function must return a tick_ms number that increments as accurately as possible once every millisecond. In extreme cases where this granularity is not available (though this is not common on the targets EVQ is aimed at), the returned value may be incremented by however many milliseconds actually transpire between ticks. For example if the old MS-DOS timer tick of roughly 55ms is used, each tick should increment the return value by 55.
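A user-supplied implementation is typically just a counter advanced by a timer interrupt. The sketch below assumes a hypothetical 1 kHz timer interrupt, plus a coarser 10 ms variant to show the "increment by however many milliseconds" rule; the handler names and the way they are hooked to hardware are invented and will differ per target.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t tick_ms;

/* volatile because it is written in interrupt context and read elsewhere. */
static volatile tick_ms tick_count;

/* Hypothetical handler for a timer that fires once per millisecond. */
void timer_1khz_isr(void) { tick_count += 1u; }

/* Hypothetical handler for a coarser timer: the count still advances in
 * milliseconds, just in larger steps. */
#define COARSE_TICK_MS 10u
void timer_coarse_isr(void) { tick_count += COARSE_TICK_MS; }

/* The strong definition that replaces EVQ's weakly-linked default. */
tick_ms evq_get_tick(void)
{
    return tick_count;
}
```

On cores where a 32-bit read is not atomic (many 8-bit and 16-bit parts), the read in evq_get_tick() would additionally need interrupts masked or a read-twice-and-compare loop.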

Closing thoughts

EVQ is of necessity more involved than CO in COROS. It is not complicated, but it does have design decisions that could legitimately cause some scratching of heads.

An obvious design choice in the API is the split between callbacks and coroutines. Very simple event reactions (like toggling a LED or flipping a GPIO) are not well served by coroutines: these are rather heavyweight for such simple, nigh-instantaneous actions and callbacks are just better suited there. On the other hand, doing any kind of complicated logic in only callbacks is the kind of nightmare that keeps people awake at night.

A less obvious design choice, however, is the decision to provide a tree structure of event queues in addition to the ability to have independent event queues. The next article in this series will explore some of the reasoning behind this decision (including a Pi Calculus-type approach to task management).