raspberry-pi-agent is that layer.
The Problem with Ad-Hoc Scripts
The Pi started as a web host, became a monitoring node, then an AI command center, and now runs half a dozen services simultaneously. Each new tool I built on it — the Telegram bot, the backup scheduler, the Qwen CLI wrapper — wanted to do the same basic operations:
- Run a shell command and capture its output
- Check whether a systemd service is running
- Read the tail of a log file
- Report CPU temperature, memory usage, and disk space
- Restart a service if it's unhealthy
The first few tools each implemented their own version of these operations. That meant five different subprocess.run() wrappers, five slightly different approaches to error handling, five different ways to format the output. When a pattern needed changing — better timeout handling, consistent logging — I had to find and update every copy.
raspberry-pi-agent consolidates all of it into a single, importable Python module with a consistent interface.
Architecture: A Command Dispatcher
The agent is built around a central dispatcher that routes typed command objects to the appropriate handler. Commands are plain Python dataclasses — easy to serialize, easy to log, easy to test:
from dataclasses import dataclass

@dataclass
class ShellCommand:
    command: str
    timeout: int = 30
    capture_stderr: bool = True

@dataclass
class ServiceCommand:
    service_name: str
    action: str  # 'status', 'start', 'stop', 'restart'

@dataclass
class LogTailCommand:
    log_path: str
    lines: int = 50

@dataclass
class SystemStatsCommand:
    include_temperature: bool = True
    include_disk: bool = True
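The "easy to serialize, easy to log" claim is concrete: dataclasses.asdict() turns any of these commands into a plain dict, ready for json.dumps() or a structured log line. A quick sketch, restating LogTailCommand so it runs standalone:

```python
import dataclasses
import json
from dataclasses import dataclass

# Restated here so the snippet is self-contained.
@dataclass
class LogTailCommand:
    log_path: str
    lines: int = 50

# asdict() gives a plain dict; json.dumps() makes it a loggable string.
cmd = LogTailCommand(log_path='/var/log/syslog')
payload = json.dumps(dataclasses.asdict(cmd))
# payload == '{"log_path": "/var/log/syslog", "lines": 50}'
```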
The dispatcher maps each command type to a handler function:
class PiAgent:
    def __init__(self):
        self._handlers = {
            ShellCommand: self._handle_shell,
            ServiceCommand: self._handle_service,
            LogTailCommand: self._handle_log_tail,
            SystemStatsCommand: self._handle_system_stats,
        }

    def execute(self, command) -> dict:
        handler = self._handlers.get(type(command))
        if not handler:
            return {'success': False, 'error': f'Unknown command type: {type(command).__name__}'}
        try:
            return handler(command)
        except Exception as e:
            return {'success': False, 'error': str(e), 'command': repr(command)}
Every handler returns the same structure: a dict with a success boolean, plus result fields on success or an error string on failure. Callers never need to distinguish between a shell execution failure and a Python exception — they always get back the same shape of data.
Shell Execution
The shell handler wraps subprocess.run() with sane defaults — timeout enforcement, stderr capture, and output decoding — that were missing or inconsistent in the ad-hoc scripts:
def _handle_shell(self, cmd: ShellCommand) -> dict:
    stderr_dest = subprocess.PIPE if cmd.capture_stderr else subprocess.DEVNULL
    result = subprocess.run(
        cmd.command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=stderr_dest,
        text=True,
        timeout=cmd.timeout
    )
    return {
        'success': result.returncode == 0,
        'stdout': result.stdout.strip(),
        'stderr': result.stderr.strip() if cmd.capture_stderr else None,
        'returncode': result.returncode
    }
The shell=True choice is intentional: the Pi agent is for local, trusted commands — the same commands I'd type at the terminal myself. For anything that touches external input, passing an argument list with shell=False (as the service handler does) would be the right approach, but that's not the use case here.
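The difference is easy to demonstrate in isolation. A minimal sketch (the hostile filename is hypothetical):

```python
import subprocess

# Trusted, interactively-authored commands: shell=True keeps pipes and
# globs working exactly as they would at the terminal.
trusted = subprocess.run("echo hello | tr a-z A-Z", shell=True,
                         capture_output=True, text=True)
# trusted.stdout.strip() == 'HELLO'

# External input: an argument list avoids shell interpretation entirely,
# so a hostile string is just a (nonexistent) filename, not a command.
filename = "x; rm -rf ~"
safe = subprocess.run(["ls", "-l", filename], capture_output=True, text=True)
# safe fails with a "No such file" error instead of ever running rm
```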
Service Management
Service checks and restarts are the most common operations. The service handler translates action strings into systemctl calls and parses the output into a structured response:
def _handle_service(self, cmd: ServiceCommand) -> dict:
    if cmd.action == 'status':
        result = subprocess.run(
            ['systemctl', 'is-active', cmd.service_name],
            capture_output=True, text=True
        )
        active = result.stdout.strip() == 'active'
        return {
            'success': True,
            'service': cmd.service_name,
            'active': active,
            'status': result.stdout.strip()
        }
    elif cmd.action in ('start', 'stop', 'restart'):
        result = subprocess.run(
            ['sudo', 'systemctl', cmd.action, cmd.service_name],
            capture_output=True, text=True, timeout=30
        )
        return {
            'success': result.returncode == 0,
            'service': cmd.service_name,
            'action': cmd.action,
            'error': result.stderr.strip() if result.returncode != 0 else None
        }
    else:
        return {'success': False, 'error': f'Unknown action: {cmd.action}'}
System Stats
The system stats handler aggregates the Pi's health metrics into a single response — CPU temperature, memory usage, disk space, and uptime. It uses the psutil library (which reads /proc under the hood) for the portable metrics, shelling out only for the Pi-specific vcgencmd temperature read:
import psutil
import subprocess
import time

import psutil
import subprocess

def _handle_system_stats(self, cmd: SystemStatsCommand) -> dict:
    mem = psutil.virtual_memory()
    stats = {
        'success': True,
        'cpu_percent': psutil.cpu_percent(interval=1),
        'memory': {
            'total_mb': mem.total // (1024 * 1024),
            'used_mb': mem.used // (1024 * 1024),
            'percent': mem.percent
        },
        # boot_time() is an absolute timestamp, so subtract it from now
        'uptime_seconds': int(time.time() - psutil.boot_time())
    }
    if cmd.include_temperature:
        try:
            temp_result = subprocess.run(
                ['vcgencmd', 'measure_temp'],
                capture_output=True, text=True, timeout=5
            )
            # Output: temp=52.3'C
            temp_str = temp_result.stdout.strip()
            temp_val = float(temp_str.replace("temp=", "").replace("'C", ""))
            stats['cpu_temp_celsius'] = temp_val
        except Exception:
            stats['cpu_temp_celsius'] = None
    if cmd.include_disk:
        disk = psutil.disk_usage('/')
        stats['disk'] = {
            'total_gb': round(disk.total / (1024 ** 3), 1),
            'used_gb': round(disk.used / (1024 ** 3), 1),
            'percent': disk.percent
        }
    return stats
Log Tailing
Reading the tail of a log file is something every service integration needed — the Telegram bot to surface recent errors, Arpy Assist to answer "what did the backup script do last night?" The handler uses Python's file reading rather than shelling out to tail, which keeps the output consistent across log sizes:
def _handle_log_tail(self, cmd: LogTailCommand) -> dict:
    try:
        with open(cmd.log_path, 'r', errors='replace') as f:
            lines = f.readlines()
        tail = lines[-cmd.lines:] if len(lines) > cmd.lines else lines
        return {
            'success': True,
            'log_path': cmd.log_path,
            'lines_returned': len(tail),
            'content': ''.join(tail)
        }
    except FileNotFoundError:
        return {'success': False, 'error': f'Log not found: {cmd.log_path}'}
    except PermissionError:
        return {'success': False, 'error': f'Permission denied: {cmd.log_path}'}
How Other Tools Use It
The agent is imported directly by the other Pi services that need it. The Telegram monitoring bot uses it for health check commands:
# In the Telegram bot's command handler
from raspberry_pi_agent import PiAgent, SystemStatsCommand, ServiceCommand

agent = PiAgent()

async def handle_status_command(update, context):
    stats = agent.execute(SystemStatsCommand())
    if not stats['success']:
        await update.message.reply_text("Couldn't retrieve stats.")
        return
    msg = (
        f"CPU: {stats['cpu_percent']}% | "
        f"Temp: {stats.get('cpu_temp_celsius', 'N/A')}°C\n"
        f"Memory: {stats['memory']['used_mb']}MB / {stats['memory']['total_mb']}MB "
        f"({stats['memory']['percent']}%)\n"
        f"Disk: {stats['disk']['used_gb']}GB / {stats['disk']['total_gb']}GB "
        f"({stats['disk']['percent']}%)"
    )
    await update.message.reply_text(msg)
Arpy Assist — the web interface for Qwen CLI — uses the agent to execute commands that the AI model suggests, routing Qwen's shell recommendations through the dispatcher rather than running them directly. This gives the web interface a consistent layer for logging and timeout handling around everything Qwen wants to run.
Adding New Command Types
Adding a new capability to the agent is a three-step pattern:
- Define a new dataclass for the command
- Write a handler method
- Register it in self._handlers
No base classes to inherit from, no decorator magic to understand, no framework to fight. The Telegram bot integration prompted adding a NetworkCommand type for checking external connectivity and ping times; the Pi Backups integration added a BackupStatusCommand that reads from the backup log. Each addition took under 30 minutes and didn't touch any existing handler.
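As a sketch of the pattern, here is roughly what the NetworkCommand addition looks like (the dataclass fields and ping flags are illustrative guesses, not the actual implementation):

```python
from dataclasses import dataclass
import subprocess

# Step 1: define the command dataclass (fields are assumptions).
@dataclass
class NetworkCommand:
    host: str = '8.8.8.8'
    count: int = 1

# Step 2: write the handler. Shown as a plain function for brevity; in
# the agent it would be a _handle_network method on PiAgent.
def handle_network(cmd: NetworkCommand) -> dict:
    result = subprocess.run(
        ['ping', '-c', str(cmd.count), cmd.host],
        capture_output=True, text=True, timeout=15
    )
    return {
        'success': result.returncode == 0,
        'host': cmd.host,
        'output': result.stdout.strip()
    }

# Step 3: register it in PiAgent.__init__:
#     self._handlers[NetworkCommand] = self._handle_network
```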
Testing Without a Pi
One practical problem with Pi automation scripts: they're hard to test on a development machine. vcgencmd doesn't exist on macOS. systemctl behaves differently on non-systemd systems. The agent's architecture makes this testable:
class MockPiAgent(PiAgent):
    """Drop-in replacement for unit testing without Pi hardware."""

    def _handle_system_stats(self, cmd):
        return {
            'success': True,
            'cpu_percent': 23.5,
            'cpu_temp_celsius': 48.2,
            'memory': {'total_mb': 4096, 'used_mb': 1820, 'percent': 44.4},
            'disk': {'total_gb': 64.0, 'used_gb': 18.3, 'percent': 28.6}
        }

    def _handle_service(self, cmd):
        return {'success': True, 'service': cmd.service_name, 'active': True, 'status': 'active'}
The calling code never knows it's talking to a mock — it just calls agent.execute(command) and gets back the same dict shape. This made it possible to develop and test the Telegram bot's response formatting and Arpy Assist's frontend entirely on a laptop.
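One subtlety makes this work: the handler table is built in the parent's __init__, but its values are bound methods looked up through self, so a subclass override is what actually lands in the dict. A minimal, self-contained illustration:

```python
# Why subclass overrides reach the dispatcher: self._handle_stats in
# __init__ binds to the method of the *actual* (sub)class.
class Agent:
    def __init__(self):
        self._handlers = {'stats': self._handle_stats}

    def execute(self, name):
        return self._handlers[name]()

    def _handle_stats(self):
        raise RuntimeError('needs Pi hardware')

class MockAgent(Agent):
    def _handle_stats(self):
        return {'success': True, 'cpu_percent': 23.5}

result = MockAgent().execute('stats')
# result == {'success': True, 'cpu_percent': 23.5}; the parent's
# hardware-dependent handler is never called
```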
What's Next
The agent currently operates synchronously — each command blocks until complete. For the Telegram bot this is fine; it's handling one command at a time per chat. For Arpy Assist's web interface, which can serve multiple concurrent users, an async version built around asyncio.create_subprocess_shell would remove the blocking constraint.
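A sketch of what that async handler could look like (an assumption about the future design, not code that exists yet), keeping the same return shape:

```python
import asyncio

# Non-blocking equivalent of the synchronous shell handler, returning
# the same dict shape so callers would not need to change.
async def handle_shell_async(command: str, timeout: int = 30) -> dict:
    proc = await asyncio.create_subprocess_shell(
        command,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout)
    except asyncio.TimeoutError:
        proc.kill()
        await proc.wait()
        return {'success': False, 'error': f'Timed out after {timeout}s'}
    return {
        'success': proc.returncode == 0,
        'stdout': stdout.decode().strip(),
        'stderr': stderr.decode().strip(),
        'returncode': proc.returncode
    }

result = asyncio.run(handle_shell_async('echo hello'))
# result['stdout'] == 'hello'
```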
A command queue with persistent storage (SQLite) would let long-running operations — a full backup, a large file transfer — be submitted and polled for results rather than holding an HTTP connection open. That's a likely next step as the Arpy Assist web interface adds more ambitious command types.
Conclusion
The pattern here — a typed command + a dispatcher + consistent return shapes — isn't novel. It's the same principle behind every command bus or message queue. The value isn't in the novelty; it's in applying a pattern that prevents the Pi's automation stack from turning into a collection of incompatible, poorly-tested one-offs.
If you're building anything more than one or two automation scripts for a Pi, a central agent layer like this pays for itself in the second tool you add. The code is on GitHub as part of the broader Pi automation stack.