5-Part Series:

GitHub Repository: bahree/rust-microkernel — full source code and build scripts

Docker Image: amitbahree/rust-microkernel — prebuilt dev environment with Rust, QEMU, and source code


Recap from Part 1: we have a bare-metal kernel that boots on AArch64, sets up a basic logger, and halts. It’s a single task, doing one thing. Now it’s time to add more tasks and let them talk to each other.

Right now, our kernel boots, prints a message, and halts - that’s it. One task, running alone, forever. But operating systems run hundreds of tasks at once, and those tasks need to communicate with one another. A scheduler needs to tell a display driver what to render. A network stack needs to hand received bytes to an application. How do those tasks communicate?

That’s what we’re building today. We’ll add message-passing IPC (inter-process communication) and a cooperative scheduler, turning our boot-only kernel into something that actually looks like a microkernel. Two tasks will run side by side, exchanging messages through a router we build from scratch.

TL;DR

We add two big things in this post. First, a message-passing IPC system: tasks communicate by sending small, fixed-size messages through a central router, rather than sharing memory directly. Second, a cooperative scheduler that polls tasks in round-robin order, giving each one a turn to do work before moving to the next.

A couple of terms we’ll use throughout.

  • Cooperative multitasking means that each task voluntarily relinquishes the CPU when it finishes its current unit of work. The OS trusts tasks to yield promptly, which is a nice assumption when it holds and a disaster when it doesn’t (more on that later).
  • Round-robin is the simplest scheduling strategy: we call each task in order, cycling through the list forever. Every task gets an equal turn. No priorities, no special treatment.
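To make the round-robin idea concrete before we touch kernel code, here is a minimal host-runnable sketch. The names (`Counter`, `run_rounds`) are ours, purely for illustration; they are not part of the kernel.

```rust
// Host-side sketch (not kernel code): a cooperative round-robin loop.
// Every task gets one poll() call per tick, in order, forever.
pub trait Task {
    fn poll(&mut self, tick: u64);
}

pub struct Counter {
    pub polls: u64,
}

impl Task for Counter {
    fn poll(&mut self, _tick: u64) {
        // Do one small unit of work, then return: that's the "yield".
        self.polls += 1;
    }
}

/// Run `ticks` rounds over a fixed task list, round-robin.
pub fn run_rounds(tasks: &mut [&mut dyn Task], ticks: u64) {
    for tick in 0..ticks {
        for t in tasks.iter_mut() {
            t.poll(tick);
        }
    }
}
```

After five rounds, each task has been polled exactly five times: equal turns, no priorities.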

1. Why microkernel?

Most operating systems you use daily (Linux, Windows, macOS) use monolithic kernels. Everything runs in kernel space with full privileges: drivers, filesystems, networking, etc. This makes things fast because components can call each other directly, without crossing a privilege boundary. On the flip side, it’s fragile. A single bug in a GPU driver can corrupt kernel memory and take down the entire system, because that driver runs with the same unrestricted access as the scheduler and the memory manager.

Microkernels flip this around. Only the absolute minimum runs in kernel space: IPC, scheduling, and basic memory management. Everything else (drivers, filesystems, networking) runs in user space as separate, isolated processes. If a driver crashes, the kernel can restart it without rebooting. The tradeoff is speed. Every time a user-space driver needs to talk to the kernel or another driver, it must cross a privilege boundary, which costs cycles.

1.1 Famous microkernels

OS         | Used in                    | Key feature
-----------|----------------------------|----------------------------------------------------
seL4       | Aerospace, medical devices | Formally verified (mathematically provably correct)
Minix 3    | Intel ME firmware          | Extreme isolation, automatic driver restart
QNX        | Cars, industrial systems   | Hard real-time guarantees
L4 family  | Research, embedded         | Ultra-fast IPC (~100 cycles)

A microkernel forces us to think about interfaces, because we can’t just reach into another component’s data structures. We need to define a message format and a protocol. That constraint yields cleaner abstractions and makes each piece easier to reason about in isolation. When something goes wrong, we can trace the message flow rather than hunting through a tangle of shared state.

2. IPC: how tasks “talk”

In a microkernel, tasks don’t share memory. They communicate by sending messages through the kernel. This is called inter-process communication (IPC). It’s the lifeblood of a microkernel OS. The design of the IPC system has huge implications for performance, security, and ease of use. There are many ways to do IPC, but we’ll build a simple message-passing system with fixed-size messages and single-slot mailboxes. This is the model used by real microkernels like L4 and seL4, and it’s a great way to understand the core concepts without getting bogged down in complexity.

2.1 The problem with shared memory

The key design decision in microkernel IPC is to avoid shared memory. If tasks shared memory, they could read and write the same variables directly. This is simple and fast, but it leads to all sorts of problems, such as race conditions, data corruption, and security vulnerabilities. For example:

// Shared memory approach (NOT what we're building)
static mut SHARED_DATA: u32 = 0;

// Task A writes
unsafe { SHARED_DATA = 42; }

// Task B reads
let x = unsafe { SHARED_DATA };
Listing 1: Shared memory approach (not what we're doing)

This looks simple, but there’s a problem. What happens if Task A is partway through writing a value when Task B reads it? Imagine A is changing a u32 counter from 255 (0x000000FF) to 256 (0x00000100). If B reads at just the wrong moment, it might see a half-updated value like 0x000001FF (511), which is neither the old value nor the new one. This is called a torn read.

Now, on our single-core cooperative system, this particular scenario can’t happen because tasks don’t run simultaneously. But the moment you add preemption (Part 3) or multiple cores, shared memory becomes a minefield. And even without race conditions, shared memory has other problems. There’s no access control (anyone can read or write), and bugs become “spooky action at a distance” where one task silently corrupts data that another task depends on.
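To make the torn-read arithmetic concrete, here is a small host-side sketch that reconstructs the half-updated value byte by byte. The function name is ours, purely for illustration.

```rust
/// Reconstruct the torn value from the example above: byte 1 has
/// already been updated to the new value, but byte 0 still holds
/// the old one (little-endian byte order).
pub fn torn_value(old: u32, new: u32) -> u32 {
    let o = old.to_le_bytes(); // 255 -> [0xFF, 0x00, 0x00, 0x00]
    let n = new.to_le_bytes(); // 256 -> [0x00, 0x01, 0x00, 0x00]
    // Mix old low byte with the new upper bytes.
    u32::from_le_bytes([o[0], n[1], n[2], n[3]])
}
```

For 255 → 256 this yields 0x000001FF (511): neither the old value nor the new one.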

2.2 Message-passing instead

The alternative is message-passing. Tasks don’t share any memory. Instead, they send discrete, self-contained messages through the kernel. Each message has a header (who it’s from, who it’s to, what type of message it is) and a payload (the actual data). The kernel routes messages to their destinations based on their headers. This way, communication becomes explicit and traceable. You can see every interaction by looking at the send and receive calls. The kernel controls delivery, so you get natural access control (a task can only receive messages addressed to it). And if the receiver isn’t keeping up, the router can push back on the sender rather than letting messages pile up in some unbounded buffer.

// Task A sends a message
router.send(Message {
    header: MsgHeader { dst: EndpointId::TaskB, .. },
    payload: [42, 0, 0, 0, 0, 0, 0, 0],
});

// Task B receives it
if let Some(msg) = router.recv(EndpointId::TaskB) {
    // process msg.payload
}
Listing 2: Message-passing approach (our design)

This is the model real microkernels use, and it’s what we’ll build.

3. Message structure

Let’s look at the actual code. Everything lives in crates/kernel/src/ipc.rs. This module defines the message format, the router, and some helper functions for working with messages. The design is simple but captures the essential features of a real IPC system.

3.1 Endpoints and message types

First, we need a way to identify who’s who. Each task gets an EndpointId, and each message has a MsgType to indicate its type. This is like addressing an envelope: the header says who it’s from, who it’s to, and what kind of message it is.

#[derive(Copy, Clone, Debug, Eq, PartialEq)]
#[repr(u8)]
pub enum EndpointId {
    Ping = 1,
    Pong = 2,
}

#[derive(Copy, Clone, Debug)]
#[repr(u8)]
pub enum MsgType {
    Ping = 1,
    Pong = 2,
}
Listing 3: Endpoint and message type definitions

An endpoint is like a mailing address. Each task is assigned a unique EndpointId, and messages are routed to their destination by looking up the recipient’s endpoint. Think of each task as having its own numbered post office box.

You’ll notice #[repr(u8)] on both enums. Here’s what that does. By default, Rust makes no guarantee about an enum’s in-memory representation; the compiler is free to pick whatever layout it likes. #[repr(u8)] pins the discriminant to exactly one byte. On bare metal, we need precise control over the size of every data structure because these values end up in message buffers and will eventually cross privilege boundaries via syscalls. Without #[repr(u8)], our struct sizes would be unspecified.
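A quick host-side check (same enum shape as Listing 3) shows what #[repr(u8)] buys us:

```rust
use core::mem::size_of;

#[repr(u8)]
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub enum EndpointId {
    Ping = 1,
    Pong = 2,
}

/// With #[repr(u8)] the enum is guaranteed to occupy exactly one
/// byte, and each variant's discriminant is the value we wrote.
pub fn endpoint_size() -> usize {
    size_of::<EndpointId>()
}
```

The explicit discriminants (= 1, = 2) also mean a cast like EndpointId::Pong as u8 produces a stable wire value.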

3.2 The message itself

pub const MAX_PAYLOAD: usize = 8;

#[derive(Copy, Clone, Debug)]
#[repr(C)]
pub struct MsgHeader {
    pub src: EndpointId,      // who sent this
    pub dst: EndpointId,      // who should receive it
    pub ty: MsgType,          // what kind of message
    pub len: u8,              // how many payload bytes are used
    pub seq: u32,             // sequence number for ordering
}

#[derive(Copy, Clone, Debug)]
#[repr(C)]
pub struct Message {
    pub header: MsgHeader,
    pub payload: [u8; MAX_PAYLOAD],
}
Listing 4: Message header and payload

A few design decisions worth calling out.

Why #[repr(C)]?

This tells Rust to lay out the struct in memory exactly like a C compiler would. The fields are stored in declaration order, with predictable padding and alignment. This is crucial for IPC because both the sender and receiver need to agree on where each field sits in memory. If Rust reordered fields for optimization, the sender might put the src field at byte 0 while the receiver expects it at byte 4, leading to chaos.
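We can verify the layout on the host with core::mem::offset_of (stable since Rust 1.77, so this assumes a recent toolchain). With #[repr(C)], the four one-byte fields pack into bytes 0–3 and seq lands at offset 4:

```rust
use core::mem::{offset_of, size_of};

#[repr(u8)]
#[derive(Copy, Clone)]
pub enum EndpointId { Ping = 1, Pong = 2 }

#[repr(u8)]
#[derive(Copy, Clone)]
pub enum MsgType { Ping = 1, Pong = 2 }

// Same shape as Listing 4: four 1-byte fields, then a u32.
// #[repr(C)] fixes the field order and padding, so sender and
// receiver agree on where every field sits.
#[repr(C)]
pub struct MsgHeader {
    pub src: EndpointId,
    pub dst: EndpointId,
    pub ty: MsgType,
    pub len: u8,
    pub seq: u32,
}
```

The header comes out at exactly 8 bytes with no hidden padding, which is what lets it cross an IPC boundary byte-for-byte.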

Why only 8 bytes of payload?

One might wonder why we limit the payload to just 8 bytes. This is a deliberate design choice to keep our IPC system simple and efficient. In a real microkernel, the IPC mechanism is optimized for small messages that fit in CPU registers, enabling very fast communication between tasks. By keeping the payload small, we can avoid the overhead of copying large amounts of data and instead pass capabilities (such as memory access rights) for larger transfers. This design also encourages a more message-oriented architecture, in which tasks exchange small commands or data rather than sharing large memory buffers.

Why sequence numbers?

Each message gets a seq field to help detect lost messages (if you receive seq 5 then seq 7, you know seq 6 went missing), duplicates (seq 5 received twice), and ordering issues. It’s a simple reliability mechanism that real IPC systems use too.
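As a tiny illustration of the gap-detection logic (a helper we are inventing here for the sketch, not part of the kernel):

```rust
/// How many sequence numbers were skipped between two received
/// messages; 0 means delivery was in order.
pub fn missing_between(prev: u32, next: u32) -> u32 {
    // wrapping_sub handles the seq counter rolling over at u32::MAX.
    next.wrapping_sub(prev).saturating_sub(1)
}
```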

No heap allocations.

We’re working in a bare-metal environment with no heap allocator, so we can’t use Vec<u8> or Box<Message>. Instead, we use fixed-size arrays for the payload.

3.3 Payload helpers

We want to send structured data in the payload, but since it’s just a byte array, we need to serialize and deserialize it manually. For our Ping/Pong demo, we’ll just send a u32 sequence number in the payload. We use little-endian byte order for simplicity and consistency with ARM’s native endianness.

pub fn write_u32_le(dst: &mut [u8], v: u32) {
    dst[0] = (v & 0xFF) as u8;
    dst[1] = ((v >> 8) & 0xFF) as u8;
    dst[2] = ((v >> 16) & 0xFF) as u8;
    dst[3] = ((v >> 24) & 0xFF) as u8;
}

pub fn read_u32_le(src: &[u8]) -> u32 {
    (src[0] as u32)
        | ((src[1] as u32) << 8)
        | ((src[2] as u32) << 16)
        | ((src[3] as u32) << 24)
}
Listing 5: Payload serialization helpers

We manually shift bytes to serialize a u32 into the first four bytes of the payload. The standard u32::to_le_bytes()/u32::from_le_bytes() pair would work too; the manual shifts just make the byte order explicit on the page.

In case you’re wondering, endianness is the order in which bytes are stored for multi-byte values. Little-endian means the least significant byte comes first. For example, the number 0x12345678 is stored as 78 56 34 12 in little-endian. This is important to get right when serializing data into a byte array.
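The helpers from Listing 5 are easy to sanity-check on the host against the standard library’s own conversions:

```rust
// The helpers from Listing 5, reproduced so the round-trip can be
// checked against the standard library's to_le_bytes.
pub fn write_u32_le(dst: &mut [u8], v: u32) {
    dst[0] = (v & 0xFF) as u8;
    dst[1] = ((v >> 8) & 0xFF) as u8;
    dst[2] = ((v >> 16) & 0xFF) as u8;
    dst[3] = ((v >> 24) & 0xFF) as u8;
}

pub fn read_u32_le(src: &[u8]) -> u32 {
    (src[0] as u32)
        | ((src[1] as u32) << 8)
        | ((src[2] as u32) << 16)
        | ((src[3] as u32) << 24)
}
```

Writing 0x12345678 should produce the bytes 78 56 34 12, and reading them back should recover the original value.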

4. The router: mailbox-based IPC

Now that we have a message format, we need a way to deliver messages between tasks. In a microkernel, the kernel itself is responsible for routing messages. Each task has an endpoint, and the kernel maintains a mailbox for each endpoint. When a task sends a message, the kernel looks at the destination endpoint and drops the message into that endpoint’s mailbox. When a task wants to receive, the kernel checks if there’s anything in that task’s mailbox. This design decouples senders and receivers, allowing them to operate at their own pace. If the receiver is slow, the sender can still send messages (up to a point), and if the sender is fast, the receiver can process messages as they come in.

Here’s the flow:

sequenceDiagram
    participant PT as PingTask
    participant R as Router
    participant M as Pong Mailbox
    participant PO as PongTask

    PT->>R: send(msg to Pong)
    R->>M: Check if full
    alt Mailbox empty
        R->>M: Store message
        M-->>R: OK
        R-->>PT: Success
    else Mailbox full
        R-->>PT: MailboxFull error
    end

    Note over PO: Later, in poll()
    PO->>R: recv(Pong endpoint)
    R->>M: Check if message present
    alt Message available
        M->>R: Return message
        M->>M: Mark empty
        R->>PO: Some(message)
        PO->>PO: Process message
    else No message
        R->>PO: None
    end
Figure 1: IPC message flow between tasks

If the mailbox already has an unread message, send() fails. The sender has to try again later. This is called backpressure, and it’s the simplest possible version: zero buffering beyond one message. Without backpressure, a fast sender could flood a slow receiver, consuming unbounded memory. Our single-slot design prevents that by construction.

4.1 Implementation

Let’s look at the actual code for the Mailbox and Router. The Router owns one mailbox per endpoint and handles all the logic for sending and receiving messages. The code is straightforward and intentionally minimal, illustrating the core concepts without extra complexity.

#[derive(Copy, Clone, Debug)]
pub enum SendError {
    MailboxFull,
}

#[derive(Copy, Clone)]
struct Mailbox {
    full: bool,
    msg: Message,
}

impl Mailbox {
    const EMPTY: Message = Message {
        header: MsgHeader {
            src: EndpointId::Ping,
            dst: EndpointId::Ping,
            ty: MsgType::Ping,
            len: 0,
            seq: 0,
        },
        payload: [0; MAX_PAYLOAD],
    };

    const fn new() -> Self {
        Self { full: false, msg: Self::EMPTY }
    }

    fn put(&mut self, msg: Message) -> Result<(), SendError> {
        if self.full { return Err(SendError::MailboxFull); }
        self.msg = msg;
        self.full = true;
        Ok(())
    }

    fn take(&mut self) -> Option<Message> {
        if !self.full { return None; }
        self.full = false;
        Some(self.msg)
    }
}

pub struct Router {
    ping: Mailbox,
    pong: Mailbox,
}

impl Router {
    pub const fn new() -> Self {
        Self { ping: Mailbox::new(), pong: Mailbox::new() }
    }

    pub fn send(&mut self, msg: Message) -> Result<(), SendError> {
        match msg.header.dst {
            EndpointId::Ping => self.ping.put(msg),
            EndpointId::Pong => self.pong.put(msg),
        }
    }

    pub fn recv(&mut self, dst: EndpointId) -> Option<Message> {
        match dst {
            EndpointId::Ping => self.ping.take(),
            EndpointId::Pong => self.pong.take(),
        }
    }
}
Listing 6: Mailbox and Router implementation

The Mailbox struct has a full flag indicating whether it currently holds a message and a msg field to store the message itself. The put() method checks if the mailbox is already full; if it is, it returns an error. If not, it stores the message and marks the mailbox as full. The take() method checks if there’s a message to read; if there is, it returns the message and marks the mailbox as empty. If not, it returns None.

In the Router, we have one mailbox for each endpoint. The send() method examines the destination endpoint in the message header and attempts to place the message in the corresponding mailbox. The recv() method checks the specified endpoint’s mailbox for a message and returns it if available.

This design is simple and efficient for our demo. In a real microkernel, you might have more complex routing logic, support for multiple mailboxes per endpoint, or a more sophisticated synchronization mechanism.
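The single-slot backpressure behavior is easy to demonstrate on the host. Here is a generic sketch of the same shape, using Option&lt;T&gt; in place of the explicit full flag (Slot is our name for it, not the kernel’s):

```rust
// Host-side sketch of the kernel's single-slot mailbox: a second
// put() before a take() fails with MailboxFull.
#[derive(Debug, PartialEq)]
pub enum SendError {
    MailboxFull,
}

pub struct Slot<T> {
    msg: Option<T>,
}

impl<T> Slot<T> {
    pub const fn new() -> Self {
        Self { msg: None }
    }

    pub fn put(&mut self, msg: T) -> Result<(), SendError> {
        if self.msg.is_some() {
            return Err(SendError::MailboxFull);
        }
        self.msg = Some(msg);
        Ok(())
    }

    pub fn take(&mut self) -> Option<T> {
        // Option::take() returns the message and leaves None behind,
        // exactly the "mark empty" step from Listing 6.
        self.msg.take()
    }
}
```

Option&lt;T&gt; is the idiomatic host-side equivalent of the kernel’s full flag plus stored Message; the semantics are identical.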

5. Static router placement

From the code we’ve seen so far, it’s clear that the Router needs to be accessible globally, since all tasks need to send and receive messages through it. How do we achieve that in Rust, especially in a no_std environment where we don’t have the luxury of a heap or dynamic initialization? That’s where things get interesting.

The problem is the Sync trait. Any static value must be Sync, meaning it’s safe to access from multiple threads simultaneously. Our Router has mutable state (those mailboxes change when messages are sent and received), so it’s not Sync by default. Rust is trying to protect us from data races, which is normally a good thing. But in our single-threaded bare-metal kernel, there are no other threads. We need to tell the compiler, “dude, trust us, this is fine.”

The solution involves three layers: UnsafeCell, a manual Sync implementation, and a linker section attribute to ensure that the router is placed in writable memory. Let’s break it down.

use core::cell::UnsafeCell;

#[repr(transparent)]
struct RouterCell(UnsafeCell<ipc::Router>);
unsafe impl Sync for RouterCell {}

#[link_section = ".data"]
static ROUTER: RouterCell = RouterCell(UnsafeCell::new(ipc::Router::new()));

pub fn kmain(logger: &dyn Logger) -> ! {
    let router: &mut ipc::Router = unsafe { &mut *ROUTER.0.get() };
    // ... use router for the rest of the kernel's life
}
Listing 7: Static router with writable section placement

Here’s how the three layers wrap the Router to make it safely accessible as a global static:

graph TD
    subgraph ".data section (Layer 3)"
        subgraph "RouterCell + unsafe impl Sync (Layer 2)"
            subgraph "UnsafeCell (Layer 1)"
                R["ipc::Router<br/>(mutable mailbox state)"]
            end
        end
    end

    L3["#[link_section = '.data']<br/>Forces writable memory placement"]
    L2["unsafe impl Sync<br/>Satisfies static requirement"]
    L1["UnsafeCell&lt;T&gt;<br/>Enables interior mutability"]

    L3 -.-> R
    L2 -.-> R
    L1 -.-> R

    style R fill:#f99,stroke:#333
    style L1 fill:#ff9,stroke:#333
    style L2 fill:#9f9,stroke:#333
    style L3 fill:#9ff,stroke:#333
Figure 2: Three layers wrapping the Router

And here’s the runtime dereference chain — how we actually get a usable &mut Router from the static:

flowchart LR
    A["ROUTER<br/>(static RouterCell)"] -->|".0"| B["UnsafeCell&lt;Router&gt;"]
    B -->|".get()"| C["*mut Router<br/>(raw pointer)"]
    C -->|"* (deref)"| D["Router<br/>(value)"]
    D -->|"&mut"| E["&mut Router<br/>(usable reference)"]

    style A fill:#9ff,stroke:#333
    style C fill:#ff9,stroke:#333
    style E fill:#9f9,stroke:#333
Figure 3: Runtime dereference chain

Layer 1: UnsafeCell

This is the key to interior mutability in Rust. UnsafeCell<T> is a special wrapper that tells the compiler, “I know this data will be mutated through shared references, but I promise to handle it safely.” Normally, Rust enforces that if you have a &T, you can’t mutate T. But UnsafeCell provides a .get() method that returns a raw mutable pointer (*mut T), allowing us to bypass the borrow checker. This is essential for our router because we need to mutate its state (the mailboxes) while it’s accessible globally.
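Here is a minimal, host-runnable illustration of the pattern (Cell32 is a toy type for this sketch, not kernel code): a shared reference mutating its contents through UnsafeCell.

```rust
use core::cell::UnsafeCell;

// Interior mutability in miniature: we mutate through &self. This
// is sound here only because there is exactly one access at a time
// (single thread, no reentrancy) -- the same promise our kernel makes.
pub struct Cell32(UnsafeCell<u32>);

impl Cell32 {
    pub const fn new(v: u32) -> Self {
        Self(UnsafeCell::new(v))
    }

    pub fn bump(&self) -> u32 {
        // .get() returns *mut u32; we may write through it even
        // though `self` is only a shared reference.
        unsafe {
            let p = self.0.get();
            *p += 1;
            *p
        }
    }
}
```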

Layer 2: Manual Sync implementation

One might wonder why we need to implement Sync manually. It is because UnsafeCell<T> deliberately doesn’t implement Sync to prevent data races. Since static variables require Sync, we wrap our Router in a RouterCell that contains an UnsafeCell. By writing unsafe impl Sync for RouterCell {}, we’re telling the compiler that we guarantee this will only be accessed from one context at a time. This is safe in our case because our kernel is single-threaded and our interrupt handlers don’t touch the router. In a more complex kernel with preemption or multiple cores, this would be a dangerous promise, but for our simple cooperative scheduler, it’s perfectly fine.

Layer 3: #[link_section = ".data"]

OK, this one is interesting. When the compiler produces a binary, it organizes data into sections. .rodata (read-only data) holds constants. .data holds initialized mutable data. On ARM, the MMU enforces these permissions: .rodata pages are mapped read-only, so writing to them triggers a data abort (the CPU equivalent of a segfault). The Rust compiler sees that ROUTER is a static initialized with a const fn and sometimes decides it belongs in .rodata. But we need to mutate it at runtime. #[link_section = ".data"] overrides the compiler’s choice and forces placement in the writable section. If we don’t do this, the kernel crashes on the first send() call. Now that sounds a lot like my usual code. 😁

The unsafe dereference

In case you missed it, the expression unsafe { &mut *ROUTER.0.get() } is a bit of Rust wizardry. Let’s break it down:

  1. ROUTER is our static RouterCell.
  2. ROUTER.0 accesses the UnsafeCell<ipc::Router> inside RouterCell.
  3. .get() returns a *mut ipc::Router (raw mutable pointer).
  4. The * dereferences that raw pointer, giving us an ipc::Router.
  5. &mut borrows it as a mutable reference.
  6. The whole thing is wrapped in unsafe { ... } because the compiler can’t verify that no other code holds a reference to the same Router. We know it’s safe because we only call this once, at the start of kmain, and our kernel is single-threaded. This is a common pattern for global mutable state in no_std Rust, but it requires careful reasoning to ensure safety.
  7. The end result is that we get a &mut ipc::Router that we can use throughout the kernel to send and receive messages.

6. Task abstraction

Tasks are the fundamental units of work in our kernel. Each task has its own logic and state, and the scheduler manages which task runs at any given time. To make this work, we define a Task trait that all tasks must implement. This trait defines the contract for how tasks interact with the scheduler and the IPC system.

Each task must provide an id() method to identify its endpoint and a poll() method that the scheduler calls every tick. The poll() method is where the task does its work: checking for messages, sending messages, updating state, etc. The cooperative multitasking model means that tasks must return from poll() quickly to allow other tasks to run. If a task needs to wait for something (like a message), it should return immediately and check again on the next tick.

6.1 What’s a task?

A task is an independent unit of work that the scheduler manages. Think of it as a lightweight thread - it has its own state and logic, but shares the CPU with other tasks. The scheduler decides which task runs at any given moment, switching between them to create the illusion that they all run simultaneously.

If you’ve used async/await in Rust (with tokio or async-std), the mental model is similar. Each async future does some work, then yields control back to the executor. Our tasks do the same thing. The scheduler calls each task’s poll() method; the task performs a small chunk of work (checking the mailbox, sending a message, logging something, etc.), then returns so the next task gets a turn. The one difference in our simple example is that our scheduler is the OS itself, not a userspace library.

6.2 The Task trait

pub trait Task {
    fn id(&self) -> EndpointId;
    fn poll(&mut self, logger: &dyn Logger, ipc: &mut ipc::Router, tick: u64);
}
Listing 8: Task trait definition

Two methods define the contract for a task:

  • id() - this returns the task’s endpoint identifier, so the scheduler (or the task itself) knows which mailbox to check.
  • poll() - this is where the work happens. It is called every scheduler iteration and must return quickly.

The logger parameter is &dyn Logger, a trait object. This means it’s a reference that can point to any type implementing the Logger trait, with the specific type resolved at runtime (dynamic dispatch). Different platforms have different logger implementations (COM1 serial, PL011 UART, mini-UART), but our kernel code works with all of them without knowing which concrete type it’s talking to. Same pattern as Box<dyn Error> in standard Rust, but we use a plain reference because we don’t have a heap.
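A host-side sketch of the same dynamic-dispatch pattern (the Logger trait matches the shape used in this post; BufLogger is a made-up host implementation for illustration):

```rust
use std::cell::RefCell;

// The trait our kernel code programs against.
pub trait Logger {
    fn log(&self, msg: &str);
}

// A hypothetical host-side implementation that captures output in
// a String. A real platform would write to a UART instead.
pub struct BufLogger(pub RefCell<String>);

impl Logger for BufLogger {
    fn log(&self, msg: &str) {
        self.0.borrow_mut().push_str(msg);
    }
}

// Kernel-style code: it only knows about &dyn Logger, so any
// concrete logger works, resolved at runtime via dynamic dispatch.
pub fn announce(logger: &dyn Logger) {
    logger.log("sched: starting\n");
}
```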

The cooperative contract is simple - tasks must return from poll() quickly. No infinite loops, no blocking waits. If a task needs to wait for something (like a reply message), it returns immediately and checks again on the next tick.

7. PingTask and PongTask

Let’s see how this all comes together in practice. We have our IPC system, our scheduler, and the Task trait; now let’s see how real tasks use this system. We’ll build two tasks that play a simple game: PingTask sends a ping message every few ticks, PongTask receives it and replies with a pong. It’s the “hello world” of IPC. 😊

7.1 PingTask

The PingTask is a tiny state machine (with two states). It has a sequence number that increments with each ping, and a boolean flag to track whether it’s currently waiting for a pong reply. The logic is straightforward: if it’s not waiting, it sends a ping every 10 ticks. If it is waiting, it checks for a pong reply each tick. When it gets the pong, it flips back to idle and increments the sequence number.

pub struct PingTask {
    seq: u32,
    waiting: bool,
}

impl PingTask {
    pub const fn new() -> Self {
        Self { seq: 1, waiting: false }
    }
}

impl Task for PingTask {
    fn id(&self) -> EndpointId { EndpointId::Ping }

    fn poll(&mut self, logger: &dyn Logger, ipc: &mut ipc::Router, tick: u64) {
        if tick == 0 { logger.log("task/ping: poll\n"); }

        // Check for replies first
        if let Some(msg) = ipc.recv(self.id()) {
            if matches!(msg.header.ty, MsgType::Pong) {
                self.waiting = false;
                logger.log("task/ping: got pong\n");
            }
        }

        // Send a ping every 10 ticks, but only if we're not
        // already waiting for a reply
        if !self.waiting && (tick % 10) == 0 {
            let mut payload = [0u8; ipc::MAX_PAYLOAD];
            ipc::write_u32_le(&mut payload[0..4], self.seq);
            let msg = ipc::Message {
                header: ipc::MsgHeader {
                    src: EndpointId::Ping,
                    dst: EndpointId::Pong,
                    ty: MsgType::Ping,
                    len: 4,
                    seq: self.seq,
                },
                payload,
            };
            match ipc.send(msg) {
                Ok(()) => {
                    logger.log("task/ping: sent ping\n");
                    self.waiting = true;
                    self.seq = self.seq.wrapping_add(1);
                }
                Err(_) => {
                    logger.log("task/ping: send failed (queue full)\n");
                }
            }
        }
    }
}
Listing 9: PingTask implementation

When PingTask is idle (waiting == false), it checks every 10 ticks whether it’s time to send a ping. When it sends one, it changes to the waiting state. On the other hand, while waiting, it checks its mailbox each tick for a pong reply. When the pong arrives, it flips back to idle. The sequence number increments with each send (using wrapping_add so it rolls over instead of panicking at u32::MAX).

Notice how the task constructs the message manually: it fills a payload buffer, builds a header with source, destination, type, length, and sequence number, then hands the whole thing to the router. If the send fails because the destination mailbox is full, it logs the error and will try again on the next qualifying tick.

7.2 PongTask

The PongTask is even simpler. It just waits for pings and replies with pongs. It doesn’t need to track any state, so it’s a unit struct. Every tick, it checks its mailbox. If there’s a ping, it reads the sequence number from the payload, constructs a pong reply with the same sequence number, and sends it back to the Ping endpoint. It ignores the tick counter entirely since it doesn’t need to do anything on a timer.

pub struct PongTask;

impl PongTask {
    pub const fn new() -> Self { Self }
}

impl Task for PongTask {
    fn id(&self) -> EndpointId { EndpointId::Pong }

    fn poll(&mut self, logger: &dyn Logger, ipc: &mut ipc::Router, _tick: u64) {
        if let Some(msg) = ipc.recv(self.id()) {
            if matches!(msg.header.ty, MsgType::Ping) {
                let seq = ipc::read_u32_le(&msg.payload[0..4]);
                logger.log("task/pong: got ping\n");

                let mut payload = [0u8; ipc::MAX_PAYLOAD];
                ipc::write_u32_le(&mut payload[0..4], seq);
                let reply = ipc::Message {
                    header: ipc::MsgHeader {
                        src: EndpointId::Pong,
                        dst: EndpointId::Ping,
                        ty: MsgType::Pong,
                        len: 4,
                        seq,
                    },
                    payload,
                };
                let _ = ipc.send(reply);
            }
        }
    }
}
Listing 10: PongTask implementation

PongTask is a unit struct (it has no fields) because it doesn’t need to carry any state between polls. Notice it ignores the tick counter entirely (the _tick prefix tells Rust we know it’s unused).

One subtle thing - PongTask uses let _ = ipc.send(reply) instead of a match on the result. It deliberately discards any send errors. For a simple echo-reply task, if the Ping mailbox is full, dropping the reply is acceptable. A more robust implementation might retry, but for our demo, this keeps things clean.

8. The scheduler

The scheduler is the heart of our kernel. It’s responsible for giving each task a turn to run and for managing the flow of time (ticks). In a real microkernel, the scheduler would be more complex, supporting priorities, preemption, and multiple cores. But for our demo, we keep it simple: it’s a cooperative round-robin scheduler that just iterates over a fixed list of tasks and calls poll() on each one every tick.

The scheduler itself is just a function that takes a list of tasks, a logger, and the IPC router. It runs an infinite loop where it calls poll() on each task, increments a tick counter, and halts the CPU until the next interrupt. The cooperative nature means that if any task takes too long in poll(), it can starve the others, but that’s a trade-off we’re making for simplicity.

pub fn run(
    tasks: &mut [&mut dyn Task],
    logger: &dyn Logger,
    ipc: &mut ipc::Router,
) -> ! {
    let mut tick: u64 = 0;
    logger.log("sched: starting\n");
    loop {
        for t in tasks.iter_mut() {
            t.poll(logger, ipc, tick);
        }
        tick = tick.wrapping_add(1);
        hal::arch::halt();
    }
}
Listing 11: Round-robin cooperative scheduler

That’s the whole thing. An infinite loop that iterates over every task, calls poll() on each one, bumps a tick counter, and halts the CPU until the next interrupt. The -> ! return type is the “never” type: this function never returns. On bare metal, there’s no OS to return to. The kernel runs forever.

A few things worth noting:

  • The tick counter uses wrapping_add(1) instead of += 1, which means when it hits u64::MAX it wraps back to zero instead of panicking.
  • At one tick per millisecond, a u64 would take about 584 million years to overflow, so this is really just defensive programming, but it’s a good habit in kernel code where panicking means the system dies.
  • The hal::arch::halt() call at the end of each iteration is important. On x86_64, it executes the HLT instruction, and on AArch64, it executes WFI (wait for interrupt). Both put the CPU into a low-power sleep state until the next hardware interrupt arrives. Without this, our loop would spin at full speed, burning power for no reason.
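The wrapping behavior is easy to see on the host. This tiny sketch (a hypothetical `next_tick` helper, not a function from the kernel) shows why `wrapping_add` is the defensive choice:

```rust
/// Advance a tick counter without ever panicking on overflow.
/// `wrapping_add` makes the wrap-around explicit: u64::MAX + 1 == 0,
/// whereas a plain `+ 1` would panic in debug builds.
pub fn next_tick(tick: u64) -> u64 {
    tick.wrapping_add(1)
}
```

In a kernel, a panic on overflow would bring the whole system down; an explicit wrap is a harmless, well-defined event.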

8.1 Wiring it all together

So we have our IPC system, our tasks, and our scheduler. Now we need to put it all together in the kernel’s entry point. This is where we initialize the router, create our tasks, and hand everything to the scheduler. After this point, the scheduler takes over, and we never return. kmain is the main function of our kernel, and it’s where we set up the system’s initial state.

pub fn kmain(logger: &dyn Logger) -> ! {
    logger.log("rustOS: kernel online\n");
    logger.log("rustOS: microkernel step 1 (IPC + cooperative scheduling)\n");

    let router: &mut ipc::Router = unsafe { &mut *ROUTER.0.get() };

    let mut ping = sched::PingTask::new();
    let mut pong = sched::PongTask::new();
    let mut tasks: [&mut dyn sched::Task; 2] = [&mut ping, &mut pong];

    sched::run(&mut tasks, logger, router)
}
Listing 12: Kernel entry point with scheduler invocation

We start by logging some messages to indicate the kernel is online and what we’re doing. Then we grab a mutable reference to the global router using the unsafe pattern we discussed earlier. We create our two tasks on the stack, put them into a fixed-size array of trait objects, and hand everything to the scheduler. The scheduler runs forever, polling both tasks on every iteration. Ping sends messages to Pong, Pong replies, and the cycle continues indefinitely.
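The trait-object wiring is worth seeing on its own. Below is a host-runnable sketch of the same shape: tasks of different concrete types behind `&mut dyn Task`, polled round-robin. The names (`CountingTask`, `run_for`) are illustrative, and the loop is bounded so it can run on a host, unlike the kernel’s infinite loop:

```rust
// Host-runnable sketch of the kmain wiring: a fixed-size array of
// `&mut dyn Task` trait objects, polled in round-robin order.
// Names here are illustrative, not the kernel's real API.

pub trait Task {
    fn poll(&mut self, tick: u64);
}

pub struct CountingTask {
    pub polls: u64,
}

impl Task for CountingTask {
    fn poll(&mut self, _tick: u64) {
        self.polls += 1;
    }
}

/// Run the scheduler loop for a bounded number of ticks.
/// (The real kernel loops forever and halts the CPU between ticks.)
pub fn run_for(tasks: &mut [&mut dyn Task], ticks: u64) {
    for tick in 0..ticks {
        for t in tasks.iter_mut() {
            t.poll(tick);
        }
    }
}
```

The key property is that the scheduler only knows about the `Task` trait; it never needs to know the concrete types, so adding a third task is just one more element in the array.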

9. Running the demo

How do we know it works? Let’s run it and see the logs. If everything is set up correctly, you should see a steady stream of ping-pong messages in the output, demonstrating that the tasks are communicating through the IPC system and that the scheduler is giving them turns to run.

Build and run the kernel as usual and watch the logs:

./scripts/build-aarch64-virt.sh && ./scripts/run-aarch64-virt.sh
Listing 13: Build and run the IPC demo
rustOS: aarch64 QEMU virt boot OK
rustOS: IPC + cooperative scheduling demo
rustOS: kernel online
rustOS: microkernel step 1 (IPC + cooperative scheduling)
sched: starting
task/ping: poll
task/ping: sent ping
task/pong: got ping
Figure 4: IPC demo output showing tasks exchanging messages through the cooperative scheduler.

You should see the ping and pong messages alternating. The first tick (tick 0) triggers both the initial poll log and the first ping send (since 0 % 10 == 0). PongTask picks up the message and replies. On subsequent qualifying ticks, the pattern repeats.
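The `tick % 10 == 0` rule makes the send schedule predictable. A one-liner on the host (a hypothetical `ping_ticks` helper, using the modulus from this demo) enumerates which ticks fire a ping:

```rust
/// Ticks below `up_to` on which PingTask sends, given the
/// `tick % 10 == 0` rule used in this demo.
pub fn ping_ticks(up_to: u64) -> Vec<u64> {
    (0..up_to).filter(|t| t % 10 == 0).collect()
}
```

So within the first 30 ticks you should expect pings at ticks 0, 10, and 20, each followed by a pong reply on the same or the next poll.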

10. Limitations of cooperative scheduling

OK, so we’ve got tasks talking to each other. That’s great. But there’s a fundamental problem with our scheduler. What happens if a task doesn’t cooperate? Imagine we have a BadTask that just spins forever in its poll() method, never returning control to the scheduler:

impl Task for BadTask {
    fn id(&self) -> EndpointId { EndpointId::Ping }
    fn poll(&mut self, _logger: &dyn Logger, _ipc: &mut ipc::Router, _tick: u64) {
        loop {
            // spin forever, the scheduler never regains control
        }
    }
}
Listing 14: A misbehaving task

In this scenario, the BadTask takes over the CPU and never yields. The scheduler is stuck waiting for poll() to return, but it never does. As a result, all other tasks are frozen. The system is effectively deadlocked because the scheduler has no way to interrupt BadTask and give someone else a turn. In other words, the entire system hangs.

This is the core weakness of cooperative scheduling - it relies on every task being well-behaved. If one task misbehaves, the whole system suffers. In a real OS running untrusted code, this is unacceptable. We need a way for the OS to forcibly take control back from a running task, regardless of what that task is doing.

The solution is preemptive scheduling. Instead of waiting for tasks to yield, the OS uses a hardware timer that fires at regular intervals; each timer tick triggers an interrupt, and the interrupt handler can switch to a different task. Even if a task misbehaves and never yields, the OS regains control and keeps the system responsive. The currently running task doesn’t get a choice. In Part 3, we’ll set up timer interrupts on AArch64 and build a preemptive context switcher that saves and restores full CPU state. The cooperative scheduler we built here won’t go away (it’s still useful for understanding the basics), but we’ll layer preemption on top of it.

11. What we built

In this part, we took our boot-only kernel and turned it into a simple microkernel with IPC and cooperative multitasking. Starting from a kernel that could only boot and halt, we added three things:

  • A message-passing IPC system with typed messages, endpoint-based routing, and single-slot mailboxes with backpressure.
  • A trait-based task abstraction that lets us write independent units of work with a clean poll() interface.
  • A cooperative round-robin scheduler that gives each task a fair turn.

The Ping/Pong demo proves it works: two tasks communicate entirely through messages, with no shared mutable state, no unsafe data sharing, and explicit control flow that you can trace through the logs. This is the essence of microkernel design: tasks are isolated, communicate through well-defined channels, and the kernel provides minimal mechanisms to support that communication and scheduling.

However, cooperative scheduling has a fatal flaw. It trusts tasks to yield. If a task misbehaves and never yields, the whole system hangs. In Part 3 , we’ll fix that with timer interrupts and preemptive multitasking, allowing the OS to regain control even if a task goes rogue.
