GitHub Repository: bahree/nanoclaw - full source code


NanoClaw is a headless AI assistant running on my personal server. It processes messages from WhatsApp, Telegram, and Slack, runs scheduled tasks, and manages conversations with Claude agents in isolated containers. It’s been incredibly useful, but it had one major pain point: no visibility into what it was doing or why.

I use NanoClaw as my personal AI assistant - it handles everything from answering questions, to tracking flights, to giving me a daily morning briefing with F1 standings and weather. It’s my agent: it lets me check on whatever I want to check on from my phone, without having to SSH into a server.

The problem with a headless service that runs 24/7 is that you can’t see what it’s doing. When a message doesn’t get a reply, or a scheduled task doesn’t fire, the debugging workflow is: SSH into the server, tail the pino logs, grep for timestamps, piece together what happened. Not great when you’re out and about and your assistant just… stops responding.

I wanted three things:

  • A way to ask the system “what are you doing right now?” from the same WhatsApp chat I use to talk to it.
  • A way to manage scheduled tasks without SSH.
  • And a way to ask “why did you do that?” after the fact, with full traceability from triggering event to outbound action.

This post covers all three and how we built them in NanoClaw - nine features (~1100 lines of new TypeScript, two new modules, and three new SQLite tables).

TL;DR

Given the existing architecture of NanoClaw, we added a suite of observability features that are all accessible from the main WhatsApp group. No SSH, no separate dashboards, no external logging services - just commands you can type in the chat to see what’s going on and manage the system.

I added three groups of features to NanoClaw:

  1. Real-time visibility - /status shows uptime, memory, active containers, channels, groups, and task summaries. /status tasks shows the full task list with schedules, next run times, and IDs.
  2. Operational control - /task pause|resume|delete <id> manages scheduled tasks directly from the chat. No SSH, no restarts.
  3. Event tracing and debugging - three SQLite tables that trace every action back to its triggering event. Pipeline is instrumented at message ingress, agent output, scheduled tasks, and IPC. Query with /debug last 10, /debug why, /debug event <id>, and /debug report. Auto-prunes after 3 days.

All operated entirely from the messaging channel.

1. The architecture (quick context)

Before diving in, here’s how NanoClaw works at a high level. Understanding this makes the instrumentation decisions clearer. NanoClaw is an OSS fork of OpenClaw, so it shares the same core architecture:

flowchart LR
    WA[WhatsApp] --> ORC[Orchestrator]
    TG[Telegram] --> ORC
    SL[Slack] --> ORC
    ORC --> DB[(SQLite)]
    ORC --> Q[Group Queue]
    Q --> C1[Container 1]
    Q --> C2[Container 2]
    C1 --> IPC[IPC Watcher]
    IPC --> ORC
    SCH[Task Scheduler] --> Q
Figure 1: NanoClaw message flow

Messages arrive from channels, get stored in SQLite, and the orchestrator polls for new messages every 2 seconds. When a registered group has unprocessed messages, the group queue spawns a container running Claude’s Agent SDK. The agent’s output streams back through the orchestrator to the originating channel. Scheduled tasks follow the same path but are triggered by a cron/interval scheduler rather than by incoming messages.
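
The polling step described above can be sketched roughly like this - a minimal illustration in which the dependency and function names (`pollOnce`, `groupsWithUnprocessed`, `enqueue`) are hypothetical, not NanoClaw’s actual identifiers:

```typescript
// Hypothetical sketch of the orchestrator's 2-second poll loop.
// The names below are illustrative, not NanoClaw's real API.
const POLL_INTERVAL_MS = 2000;

interface PollDeps {
  groupsWithUnprocessed(): string[]; // query SQLite for groups with new messages
  enqueue(jid: string): void;        // hand the group to the group queue
}

// One poll tick: enqueue every registered group that has pending messages.
function pollOnce(deps: PollDeps): void {
  for (const jid of deps.groupsWithUnprocessed()) {
    deps.enqueue(jid);
  }
}

function startPolling(deps: PollDeps): NodeJS.Timeout {
  return setInterval(() => pollOnce(deps), POLL_INTERVAL_MS);
}
```

Separating the tick (`pollOnce`) from the timer keeps the logic testable without waiting on wall-clock time.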

The key insight: everything flows through the orchestrator. That’s where we intercept commands, instrument actions, and expose state. The group queue manages the container lifecycle. The database is already there for message storage. All the pieces are in place - we just need to wire them up.

2. What was built

Here’s the full inventory. Two new modules, six modified files, nine distinct features:

| # | Feature | Type | Files |
|---|---------|------|-------|
| 1 | /status - system dashboard | Command | status.ts, index.ts |
| 2 | /status tasks - task detail view | Command | status.ts, index.ts |
| 3 | /task pause\|resume\|delete - task management | Command | status.ts, index.ts |
| 4 | GroupQueue.getStatus() - queue introspection | API | group-queue.ts |
| 5 | Three-table event log schema | Schema | db.ts |
| 6 | logEvent / logAction / logToolCall | Module | event-log.ts (new) |
| 7 | Pipeline instrumentation | Instrumentation | index.ts, task-scheduler.ts, ipc.ts |
| 8 | /debug last\|why\|event\|report | Command | status.ts, event-log.ts |
| 9 | Auto-pruning with configurable retention | Config | event-log.ts, config.ts |

All commands are restricted to the main group only. Non-main groups are silently ignored, preventing random group members from querying system status or managing tasks.

3. /status - real-time system dashboard

The /status command assembles information from several subsystems into a single message. It queries the group queue for container states, the database for registered groups and tasks, and formats it all as a WhatsApp-friendly message.

The output is designed to give a quick overview of the system’s health and activity at a glance:

/status output showing uptime, an active container processing an F1 query, WhatsApp channel, and 2 active cron tasks
Figure 2: /status while the agent is answering an F1 question - containers, channels, groups, and tasks at a glance.

What it shows:

  • Uptime and memory - how long the process has been running, RSS in MB
  • Timezone - the configured timezone (important for cron schedules)
  • Containers - active/max concurrent, with per-group detail (idle, processing, running task, queued)
  • Channels - which messaging channels are connected (WhatsApp, Telegram, etc.)
  • Groups - all registered groups with the main group indicator
  • Tasks - count of active/paused tasks, next upcoming task with time-until
  • Remote control - whether a remote Claude Code session is active

The implementation pulls from existing subsystems - no new state tracking was needed:

export function buildStatus(queue: GroupQueue, channels: Channel[]): string {
  const uptime = formatDuration(Date.now() - startTime);
  const mem = Math.round(process.memoryUsage.rss() / 1024 / 1024);

  const qs = queue.getStatus();       // new method on GroupQueue
  const channelNames = channels.map((ch) => ch.name).join(', ');
  const groups = getAllRegisteredGroups();
  const tasks = getAllTasks();
  const activeTasks = tasks.filter((t) => t.status === 'active');
  const rc = getActiveSession();       // remote control state

  const lines: string[] = [
    `*${ASSISTANT_NAME} Status*`,
    `----------------`,
    `Uptime: ${uptime}`,
    `Memory: ${mem} MB`,
    `Timezone: ${TIMEZONE}`,
    ``,
    `*Containers:* ${qs.activeCount}/${qs.maxConcurrent} active`,
    `*Channels:* ${channelNames || 'none'}`,
  ];
  // ... active container details, groups, tasks, remote control
  return lines.join('\n');
}
Listing 1: Building the status response

3.1 Exposing queue internals

The GroupQueue class already tracked everything we needed internally: active containers, pending messages, pending tasks, and idle state. It just wasn’t exposed. A new getStatus() method surfaces this without leaking internal implementation:

getStatus(): {
  activeCount: number;
  maxConcurrent: number;
  waitingCount: number;
  groups: Array<{
    jid: string;
    active: boolean;
    idleWaiting: boolean;
    isTaskContainer: boolean;
    pendingMessages: boolean;
    pendingTaskCount: number;
  }>;
} {
  const groups = [];
  for (const [jid, state] of this.groups) {
    if (state.active || state.pendingMessages || state.pendingTasks.length > 0) {
      groups.push({
        jid,
        active: state.active,
        idleWaiting: state.idleWaiting,
        isTaskContainer: state.isTaskContainer,
        pendingMessages: state.pendingMessages,
        pendingTaskCount: state.pendingTasks.length,
      });
    }
  }
  return {
    activeCount: this.activeCount,
    maxConcurrent: MAX_CONCURRENT_CONTAINERS,
    waitingCount: this.waitingGroups.length,
    groups,
  };
}
Listing 2: GroupQueue.getStatus() - queue introspection

Only groups with activity (active, pending messages, or pending tasks) are included - no noise from idle groups. The status output translates these states into human-readable labels: “idle”, “processing”, “running task”, or “queued”.
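
The translation itself is small. Here’s a plausible sketch - the labels match the status output described above, but the function name and exact precedence are assumptions:

```typescript
// Hypothetical label mapping for a group's queue state. The field names
// follow the getStatus() shape above; the function name is illustrative.
interface GroupState {
  active: boolean;
  idleWaiting: boolean;
  isTaskContainer: boolean;
}

function containerLabel(g: GroupState): string {
  if (!g.active) return 'queued';   // waiting for a container slot
  if (g.idleWaiting) return 'idle'; // container up, nothing to do right now
  return g.isTaskContainer ? 'running task' : 'processing';
}
```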

4. /status tasks - task detail view

While /status includes a task summary (count + next upcoming), /status tasks gives the full picture. Each task shows its prompt (truncated to 50 chars), schedule type, next run time, last run time, and the task ID you need for management commands.

/status tasks showing daily morning briefing with F1 standings, AI news digest, and completed flight tracking tasks
Figure 3: /status tasks - two active daily crons (morning briefing with F1 + weather, AI news digest) and completed one-offs (flight tracking, reminders).

function formatTaskLine(task: ScheduledTask, index: number): string {
  const status = task.status === 'active' ? ''
    : task.status === 'paused' ? ' [paused]' : ' [done]';
  const schedule = task.schedule_type === 'cron'
    ? `cron: ${task.schedule_value}`
    : task.schedule_type === 'interval'
      ? `every ${task.schedule_value}`
      : `once`;
  const prompt = task.prompt.length > 50
    ? task.prompt.slice(0, 50) + '...' : task.prompt;
  const next = task.next_run ? formatTimeUntil(task.next_run) : 'n/a';
  const lastRun = task.last_run ? formatTimeAgo(task.last_run) : 'never';

  return [
    `*${index + 1}.* ${prompt}${status}`,
    `   Schedule: ${schedule}`,
    `   Next: ${next} | Last: ${lastRun}`,
    `   ID: ${task.id}`,
  ].join('\n');
}
Listing 3: Formatting a task line

Tasks are grouped by status - active first, then paused, then completed. The relative time formatting (in 2h 15m, 3m 42s ago) makes it easy to see at a glance what’s coming up and what ran recently.
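
The helpers behind those strings aren’t shown in the listing. A minimal sketch of what formatTimeUntil and formatTimeAgo plausibly look like - the real implementations may differ:

```typescript
// Sketch of the relative-time helpers assumed by Listing 3.
function formatDuration(ms: number): string {
  const totalSec = Math.floor(ms / 1000);
  const h = Math.floor(totalSec / 3600);
  const m = Math.floor((totalSec % 3600) / 60);
  const s = totalSec % 60;
  if (h > 0) return `${h}h ${m}m`;
  if (m > 0) return `${m}m ${s}s`;
  return `${s}s`;
}

function formatTimeUntil(iso: string, now: number = Date.now()): string {
  const diff = new Date(iso).getTime() - now;
  return diff <= 0 ? 'now' : `in ${formatDuration(diff)}`;
}

function formatTimeAgo(iso: string, now: number = Date.now()): string {
  const diff = now - new Date(iso).getTime();
  return diff < 1000 ? 'just now' : `${formatDuration(diff)} ago`;
}
```

So a task due in 90 minutes renders as `in 1h 30m`, and one that ran 222 seconds ago as `3m 42s ago`.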

5. /task - operational control

The /task command turns the chat into a control plane. No more SSH-ing in to pause a runaway task or clean up a completed one.

| Command | What it does |
|---------|--------------|
| /task pause <id> | Pause an active task (stops scheduling, preserves config) |
| /task resume <id> | Resume a paused task |
| /task delete <id> | Delete a task and its run history |

/task pause and /task resume on the AI news digest task
Figure 4: Pausing and resuming the AI news digest task - no SSH, no restart.

export function handleTaskCommand(
  args: string,
): { ok: true; message: string } | { ok: false; error: string } {
  const parts = args.trim().split(/\s+/);
  const action = parts[0]?.toLowerCase();
  const taskId = parts.slice(1).join(' ');

  const task = getTaskById(taskId);
  if (!task) return { ok: false, error: `Task not found: ${taskId}` };

  switch (action) {
    case 'pause':
      if (task.status !== 'active')
        return { ok: false, error: `Task is already ${task.status}` };
      updateTask(taskId, { status: 'paused' });
      return { ok: true, message: `Paused: "${task.prompt.slice(0, 50)}"` };
    case 'resume':
      if (task.status !== 'paused')
        return { ok: false, error: `Task is ${task.status}, not paused` };
      updateTask(taskId, { status: 'active' });
      return { ok: true, message: `Resumed: "${task.prompt.slice(0, 50)}"` };
    case 'delete':
      deleteTask(taskId);
      return { ok: true, message: `Deleted: "${task.prompt.slice(0, 50)}"` };
    default:
      return { ok: false, error: `Unknown action: ${action}` };
  }
}
Listing 4: Task command handler

The result type ({ ok: true; message } | { ok: false; error }) is a pattern used throughout NanoClaw for commands; the caller doesn’t need to know the implementation details, just whether it succeeded and what to tell the user. State validation is done upfront (can’t pause an already-paused task, can’t resume an active one).

6. Command interception

All built-in commands (/status, /status tasks, /task, /debug) share the same interception pattern: they’re caught at the top of the onMessage callback, before the message is stored or processed.

const channelOpts = {
  onMessage: (chatJid: string, msg: NewMessage) => {
    const trimmed = msg.content.trim();

    // Built-in commands - intercept before storage
    if (trimmed === '/status' || trimmed === '/status tasks') {
      handleStatus(trimmed, chatJid, msg).catch(/* ... */);
      return;  // don't store, don't process
    }
    if (trimmed.startsWith('/task ')) {
      handleTaskCmd(chatJid, msg).catch(/* ... */);
      return;
    }
    if (trimmed.startsWith('/debug')) {
      handleDebugCmd(chatJid, msg).catch(/* ... */);
      return;
    }

    // ... sender allowlist filtering, event logging, message storage
    storeMessage(msg);
  },
};
Listing 5: Intercepting built-in commands before storage

The return before storeMessage() is the key design decision. These commands are ephemeral – they shouldn’t appear in the conversation history, shouldn’t trigger the agent, and shouldn’t affect message cursors. They’re handled entirely by the host process and respond instantly (no container spawn needed).

Each handler checks group?.isMain before proceeding. Non-main-group commands are silently dropped, with a warning in the server logs.
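
A sketch of that guard - the group shape mirrors the `group?.isMain` check above, everything else (function name, log wiring) is hypothetical:

```typescript
// Hypothetical sketch of the main-group check each handler performs.
interface RegisteredGroup {
  jid: string;
  isMain: boolean;
}

function allowCommand(
  group: RegisteredGroup | undefined,
  warn: (msg: string) => void,
): boolean {
  if (group?.isMain) return true;
  warn('Ignoring built-in command from non-main group'); // server log only
  return false; // silent from the sender's perspective
}
```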

7. Event/action/tool logging

The commands above tell you what’s happening now. But when something went wrong an hour ago, you need a trail. That’s where event logging comes in.

7.1 The three-table schema

NanoClaw’s existing SQLite schema was focused on message storage and container state. We needed a new schema to capture the full traceability from inbound triggers to outbound actions to tool calls. The design is a simple three-table structure. Every inbound trigger is an event, every outbound action is an action linked to its triggering event, and every tool invocation is a tool call linked to its parent action.

CREATE TABLE IF NOT EXISTS event_log (
  id           TEXT PRIMARY KEY,    -- UUID
  timestamp    DATETIME DEFAULT CURRENT_TIMESTAMP,
  source       TEXT NOT NULL,       -- 'whatsapp', 'telegram', 'scheduled_task', 'ipc'
  source_id    TEXT,                -- message ID, task ID, etc.
  raw_content  TEXT,                -- full payload (JSON, truncated to 10KB)
  summary      TEXT                 -- human-readable one-liner
);

CREATE TABLE IF NOT EXISTS action_log (
  id            TEXT PRIMARY KEY,
  timestamp     DATETIME DEFAULT CURRENT_TIMESTAMP,
  triggered_by  TEXT REFERENCES event_log(id),
  action_type   TEXT NOT NULL,      -- 'message_sent', 'task_scheduled', 'tool_call'
  target        TEXT,               -- group JID, email address, task ID
  content       TEXT,               -- what was sent or done
  tool_calls    TEXT                -- JSON array of tool names
);

CREATE TABLE IF NOT EXISTS tool_call_log (
  id           TEXT PRIMARY KEY,
  action_id    TEXT REFERENCES action_log(id),
  timestamp    DATETIME DEFAULT CURRENT_TIMESTAMP,
  tool_name    TEXT NOT NULL,
  input        TEXT,                -- JSON (truncated to 10KB)
  output       TEXT,                -- JSON (truncated to 10KB)
  duration_ms  INTEGER,
  success      INTEGER DEFAULT 1
);

CREATE INDEX IF NOT EXISTS idx_event_log_timestamp ON event_log(timestamp);
CREATE INDEX IF NOT EXISTS idx_action_log_triggered_by ON action_log(triggered_by);
CREATE INDEX IF NOT EXISTS idx_tool_call_log_action_id ON tool_call_log(action_id);
Listing 6: Event logging schema

The foreign key chain is event_log <- action_log <- tool_call_log. Given any action, you can trace back to why it happened. Given any event, you can see everything it caused. The indexes support the /debug query patterns: filtering by timestamp (pruning), joining by triggered_by (event lookups), and grouping by action_id (tool call chains).

flowchart TD
    E[Event: WhatsApp message received] --> A1[Action: message_sent to group]
    E --> A2[Action: task_scheduled]
    A1 --> T1[Tool Call: runContainerAgent]
    A1 --> T2[Tool Call: channel.sendMessage]
Figure 5: Event -> Action -> Tool Call tracing chain
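
To make the chain concrete, here’s the same traceability expressed over in-memory rows - an illustration of the foreign-key relationships, not NanoClaw’s actual query code (which uses SQL joins, as shown later in Listing 15):

```typescript
// Row shapes mirror the schema above, trimmed to the linking columns.
interface EventRow { id: string; summary: string }
interface ActionRow { id: string; triggered_by: string | null; action_type: string }
interface ToolCallRow { id: string; action_id: string; tool_name: string }

// Given an action, recover why it happened and what it invoked.
function traceAction(
  action: ActionRow,
  events: EventRow[],
  toolCalls: ToolCallRow[],
): { cause: EventRow | null; calls: ToolCallRow[] } {
  return {
    cause: events.find((e) => e.id === action.triggered_by) ?? null,
    calls: toolCalls.filter((t) => t.action_id === action.id),
  };
}

// Given an event, recover everything it caused.
function actionsForEvent(eventId: string, actions: ActionRow[]): ActionRow[] {
  return actions.filter((a) => a.triggered_by === eventId);
}
```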

These tables live in the same messages.db file as everything else. No additional file handles, no additional backup concerns, no additional connection management. They’re created in the existing createSchema() function using CREATE TABLE IF NOT EXISTS, so they’re added transparently on first startup after the upgrade.

7.2 The logger module

Three functions, matching the three tables. All IDs are UUIDs via crypto.randomUUID(). All writes are fire-and-forget - wrapped in try/catch, errors logged at debug level, never blocking the pipeline.

export function logEvent(
  source: string,
  sourceId: string | null,
  rawContent: unknown,
  summary: string,
): string {
  const id = crypto.randomUUID();
  try {
    insertEventStmt().run(
      id, new Date().toISOString(), source, sourceId,
      truncate(rawContent), summary,
    );
  } catch (err) {
    logger.debug({ err, source, sourceId }, 'Failed to log event');
  }
  return id;  // always returns an ID, even if the write failed
}

export function logAction(
  triggeredBy: string | null,
  actionType: string,
  target: string | null,
  content: unknown,
  toolCalls?: string[],
): string {
  const id = crypto.randomUUID();
  try {
    insertActionStmt().run(
      id, new Date().toISOString(), triggeredBy, actionType,
      target, truncate(content),
      toolCalls ? JSON.stringify(toolCalls) : null,
    );
  } catch (err) {
    logger.debug({ err, actionType, target }, 'Failed to log action');
  }
  return id;
}
Listing 7: Core logging functions

Every call returns an ID, so callers can chain events -> actions -> tool calls, even if the write fails silently. This is intentional - the pipeline code doesn’t check whether logging succeeded, it just carries the ID forward.

The truncate() helper caps content at 10KB to prevent DB bloat:

const MAX_CONTENT_SIZE = 10 * 1024;

function truncate(value: unknown): string | null {
  if (value === undefined || value === null) return null;
  const str = typeof value === 'string' ? value : JSON.stringify(value);
  if (str.length > MAX_CONTENT_SIZE) return str.slice(0, MAX_CONTENT_SIZE);
  return str;
}
Listing 8: Content truncation

7.3 The tool call wrapper

logToolCall is different from the other two - it wraps an async operation and automatically records input, output, duration, and success/failure:

export async function logToolCall<T>(
  actionId: string,
  toolName: string,
  input: unknown,
  fn: () => Promise<T>,
): Promise<T> {
  const id = crypto.randomUUID();
  const start = Date.now();
  let success = true;
  let output: unknown = null;

  try {
    const result = await fn();
    output = result;
    return result;
  } catch (err) {
    success = false;
    output = err instanceof Error ? err.message : String(err);
    throw err;  // re-throw - logging doesn't swallow errors
  } finally {
    const durationMs = Date.now() - start;
    try {
      insertToolCallStmt().run(
        id, actionId, new Date().toISOString(), toolName,
        truncate(input), truncate(output), durationMs, success ? 1 : 0,
      );
    } catch (logErr) {
      logger.debug({ err: logErr, toolName }, 'Failed to log tool call');
    }
  }
}
Listing 9: Tool call wrapper with automatic instrumentation

The finally block ensures the log entry is written regardless of whether the wrapped operation succeeded or failed. The error is always re-thrown - logToolCall is transparent to the caller. It’s a decorator pattern: wrap any async operation and get free instrumentation.

Insert statements are lazily prepared and reused across calls, avoiding the overhead of re-preparing the same SQL on every log write.
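
That lazy preparation can be captured in a tiny memoizing helper - a sketch only; the commented-out `insertEventStmt` usage matches the name in Listing 7, but the helper itself is illustrative:

```typescript
// Generic lazy initializer: build the value on first call, reuse it after.
function lazy<T>(create: () => T): () => T {
  let value: T | undefined;
  return () => (value ??= create());
}

// Hypothetical use with better-sqlite3, as in Listing 7:
// const insertEventStmt = lazy(() =>
//   getDb().prepare('INSERT INTO event_log VALUES (?, ?, ?, ?, ?, ?)'),
// );
```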

7.4 Instrumenting the pipeline

With the logger module in place, instrumentation is surgical - a few lines at each key point in the message flow:

Inbound messages (in the onMessage callback):

// Detect channel from JID format
const evtChannel = chatJid.includes('@g.us') || chatJid.includes('@s.whatsapp.net')
  ? 'whatsapp'
  : chatJid.startsWith('tg:') ? 'telegram'
  : chatJid.startsWith('dc:') ? 'discord'
  : chatJid.startsWith('sl:') ? 'slack'
  : 'channel';

logEvent(
  evtChannel, msg.id,
  { sender: msg.sender_name, content: msg.content?.slice(0, 200) },
  `Message from ${msg.sender_name}: ${(msg.content || '').slice(0, 80)}`,
);
Listing 10: Logging inbound messages with channel detection

Agent processing (when the orchestrator starts handling a group’s messages):

// Log the processing event - this ID links to all downstream actions
const eventId = logEvent(
  'message_batch', chatJid,
  { messageCount: missedMessages.length, group: group.name },
  `Processing ${missedMessages.length} message(s) for ${group.name}`,
);
Listing 11: Logging the processing event and carrying the eventId

Outbound messages (in the streaming output callback):

if (text) {
  await channel.sendMessage(chatJid, text);
  logAction(eventId, 'message_sent', chatJid, text.slice(0, 500));
}
Listing 12: Logging outbound actions linked to the triggering event

The eventId from the processing event links the outbound action back to the batch that triggered it. This is the chain that /debug why follows.

Scheduled tasks - logged at the point the scheduler picks up a due task, and again when the result is sent to the user:

const eventId = logEvent(
  'scheduled_task', task.id,
  { prompt: task.prompt.slice(0, 200), schedule: task.schedule_type },
  `Scheduled task: ${task.prompt.slice(0, 80)}`,
);

// ... later, when the container produces output:
await deps.sendMessage(task.chat_jid, streamedOutput.result);
logAction(eventId, 'message_sent', task.chat_jid, result?.slice(0, 500) ?? null);
Listing 13: Scheduled task instrumentation

IPC - logged when the IPC watcher processes messages and task operations from containers:

const ipcEventId = logEvent(
  'ipc', null,
  { chatJid: data.chatJid, sourceGroup, text: data.text?.slice(0, 200) },
  `IPC message from ${sourceGroup}`,
);
await deps.sendMessage(data.chatJid, data.text);
logAction(ipcEventId, 'message_sent', data.chatJid, data.text?.slice(0, 500) ?? null);
Listing 14: IPC message instrumentation

8. The /debug commands

Four subcommands for querying the event log, all main-group only:

| Command | What it shows |
|---------|---------------|
| /debug last <n> | Last n actions with their triggering events |
| /debug why | Most recent action with full event context and tool call chain |
| /debug event <id> | Everything triggered by a specific event |
| /debug report | Summary dashboard: table sizes, events by source, actions by type, busiest hours, recent errors |

8.1 /debug last - quick scan

The “what happened recently?” view. Joins action_log with event_log to show each action alongside what caused it:

/debug last 5 showing F1 query response, flight tracking updates, and Gmail cleanup actions
Figure 6: /debug last 5 - the F1 response, flight landing alert, and Gmail cleanup, each traced back to its trigger.

export function getLastActions(n: number): Array<{
  action: ActionLogRow;
  event: EventLogRow | null;
}> {
  const rows = getDb()
    .prepare(
      `SELECT a.*, e.id as e_id, e.timestamp as e_timestamp,
              e.source as e_source, e.source_id as e_source_id,
              e.summary as e_summary
       FROM action_log a
       LEFT JOIN event_log e ON a.triggered_by = e.id
       ORDER BY a.timestamp DESC
       LIMIT ?`,
    )
    .all(n);
  // ... map to structured result
}
Listing 15: Querying last N actions with LEFT JOIN

The LEFT JOIN is important - some actions might not have a triggering event (e.g., system-initiated actions), and we still want to see them.

8.2 /debug why - full trace

Answers “why did the last thing happen?” by pulling the most recent action, its triggering event, and all associated tool calls:

/debug why showing the most recent action with its triggering WhatsApp message and tool call chain
Figure 7: /debug why - tracing the most recent action back to the WhatsApp message that caused it.

export function getLastActionWithToolCalls(): {
  action: ActionLogRow;
  event: EventLogRow | null;
  toolCalls: ToolCallLogRow[];
} | null {
  const results = getLastActions(1);
  if (results.length === 0) return null;

  const { action, event } = results[0];
  const toolCalls = getDb()
    .prepare(
      `SELECT * FROM tool_call_log WHERE action_id = ? ORDER BY timestamp`,
    )
    .all(action.id) as ToolCallLogRow[];

  return { action, event, toolCalls };
}
Listing 16: Full trace for the most recent action

The output shows the full chain: triggering event (source, summary, timestamp, ID), the action taken (type, target, content), and each tool call with its duration and success/failure status. Copy the event ID from here and use /debug event <id> to see everything else that event triggered.

8.3 /debug report - summary dashboard

The “is everything healthy?” view. Aggregates across all three tables into a single report:

/debug report showing event counts by source, action breakdown, and busiest hours from a day of flight tracking and F1 queries
Figure 8: /debug report - a day's worth of WhatsApp messages, flight tracking, Gmail cleanup, and F1 queries summarized.

What it includes:

  • Retention period - configured days and current time window
  • Table sizes - row counts for events, actions, and tool calls
  • Events by source - breakdown by channel (whatsapp, telegram, scheduled_task, ipc)
  • Actions by type - breakdown by what was done (message_sent, task_scheduled, etc.)
  • Busiest hours - top 5 hours by event count, in local timezone
  • Recent failed tool calls - last 10 with tool name, duration, and error output
  • Recent errors - last 10 error-like actions with their triggering event

One thing worth noting: the busiest hours are computed in JavaScript using toLocaleString with the configured time zone, not in SQL. SQLite stores timestamps as UTC (as ISO strings), and performing timezone conversions in SQL would require loading an extension. Instead, we fetch the raw timestamps and bucket them in JS:

const allTimestamps = db
  .prepare(`SELECT timestamp FROM event_log`)
  .all() as Array<{ timestamp: string }>;
const hourCounts = new Map<string, number>();
for (const { timestamp } of allTimestamps) {
  const localHour = new Date(timestamp).toLocaleString('en-US', {
    timeZone: TIMEZONE,
    hour: 'numeric', hour12: true,
  });
  hourCounts.set(localHour, (hourCounts.get(localHour) || 0) + 1);
}
Listing 17: Bucketing event timestamps by local hour

Learned this the hard way when the report initially showed UTC hours, and I couldn’t figure out why 5 PM was my busiest time. 😄

9. Auto-pruning and retention

An unbounded observability system is a liability. Logs older than 3 days are automatically deleted. The retention period is configurable via the EVENT_LOG_RETENTION_DAYS environment variable (set to 0 to disable pruning).

// config.ts
const parsed = parseInt(process.env.EVENT_LOG_RETENTION_DAYS ?? '3', 10);
export const EVENT_LOG_RETENTION_DAYS = Number.isNaN(parsed)
  ? 3
  : Math.max(0, parsed); // NaN falls back to 3; an explicit 0 disables pruning
export const EVENT_LOG_PRUNE_INTERVAL = 60 * 60 * 1000; // hourly
Listing 18: Log retention configuration

Pruning runs at startup (clean up anything that expired while the service was down) and then every 60 minutes:

export function pruneOldLogs() {
  if (EVENT_LOG_RETENTION_DAYS === 0) return;

  const cutoff = new Date(
    Date.now() - EVENT_LOG_RETENTION_DAYS * 24 * 60 * 60 * 1000,
  ).toISOString();

  const db = getDb();
  // Delete in FK-safe order: children first
  db.prepare(`DELETE FROM tool_call_log WHERE action_id IN (
    SELECT id FROM action_log WHERE timestamp < ?
  )`).run(cutoff);
  db.prepare(`DELETE FROM action_log WHERE timestamp < ?`).run(cutoff);
  db.prepare(`DELETE FROM event_log WHERE timestamp < ?`).run(cutoff);
}

export function startLogPruning(): void {
  pruneOldLogs();
  const timer = setInterval(pruneOldLogs, EVENT_LOG_PRUNE_INTERVAL);
  timer.unref();   // don't keep the process alive for pruning
}
Listing 19: Pruning with FK-safe deletion order

The deletion order matters: tool_call_log rows reference action_log, which in turn references event_log. Deleting parents first would violate foreign key constraints. The timer.unref() call ensures the pruning interval doesn’t prevent graceful shutdown.

10. Design decisions

A few choices that are worth calling out:

Fire-and-forget, not await. Every logging call is synchronous (better-sqlite3) and wrapped in try/catch. If the write fails - disk full, DB locked, schema mismatch - the error is logged at debug level and the pipeline continues. The logging system is never on the critical path. An observability system that can take down the thing it’s observing is worse than useless.

Same database, no new dependencies. The logging tables live in messages.db alongside messages, tasks, sessions, and router state. No new files to back up, no new connections to manage, no new packages to install. CREATE TABLE IF NOT EXISTS means existing installations pick up the schema on restart.

Commands intercepted before storage. /status, /task, and /debug messages never reach the agent container. They don’t appear in conversation history, don’t trigger container spawns, and don’t affect message cursors. This is important - a /status check shouldn’t cost you a container slot or show up as context in the agent’s next conversation.

Prepared statements, lazily created. The insert statements are created on first use and reused across calls. For a system logging every message and action, re-preparing SQL on every call would add up.

UUIDs for everything. crypto.randomUUID() for all IDs. No auto-increment, no collision risk across restarts, and IDs are meaningful in isolation (you can paste one into /debug event <id> without context).

11. Try it yourself

If you’re running NanoClaw (or OpenClaw), these features are available out of the box. Here’s how to get started:

If you already have NanoClaw running:

  1. Pull the latest code and rebuild:

     git pull
     npm run build

  2. Restart the service:

     # Linux (systemd)
     systemctl --user restart nanoclaw

     # macOS (launchd)
     launchctl kickstart -k gui/$(id -u)/com.nanoclaw

  3. The new tables are created automatically on startup. No migration step needed.

  4. Send /status in your main group to verify it’s working.

If you’re starting fresh:

  1. Fork or clone bahree/nanoclaw (or the upstream OpenClaw)
  2. Follow the setup instructions in the README
  3. Once connected to a channel, all commands are available immediately

Command reference:

| Command | What it does |
|---------|--------------|
| /status | System overview: uptime, memory, containers, channels, groups, tasks |
| /status tasks | Full task list with schedules, next run, last run, IDs |
| /task pause <id> | Pause a scheduled task |
| /task resume <id> | Resume a paused task |
| /task delete <id> | Delete a task and its run history |
| /debug last <n> | Last n actions with their triggering events |
| /debug why | Most recent action with full trace |
| /debug event <id> | All actions triggered by a specific event |
| /debug report | Summary dashboard with stats and errors |

Configuration:

| Env variable | Default | What it does |
|--------------|---------|--------------|
| EVENT_LOG_RETENTION_DAYS | 3 | Days to keep event logs (0 = keep forever) |

All commands are main-group only. They respond instantly (no container needed) and don’t appear in the conversation history.

12. Summary

Three problems, one philosophy: make the system controllable and observable from the same interface you use to interact with it.

/status gives you real-time visibility - what’s running, what’s queued, what’s scheduled, which channels are connected. /task gives you operational control: pause a runaway task, resume one you paused, and clean up completed ones. Event logging gives you after-the-fact traceability - every action links back to its triggering event, every tool call links back to its parent action. /debug commands let you query the trail. Auto-pruning keeps it from growing unbounded.

~1100 lines of new TypeScript across 8 files. Two new modules (status.ts and event-log.ts), three new SQLite tables, a handful of indexes, and one new config variable. No new dependencies, no separate services. It just works on the next restart.


The source code for NanoClaw is available at bahree/nanoclaw.