Introduction
Deploying bug fixes, new sensor adapters, or policy updates to a fleet of edge devices without physical access requires a robust over-the-air (OTA) update mechanism. This article designs a complete OTA pipeline for Pyvorin Edge: bundle signing with Ed25519, version manifests, atomic symlink swaps, automatic rollback on health check failure, and stable/beta/canary release channels. All signing primitives come from the real SDK source in /var/www/pyvorin/edge_sdk/pyvorin_edge/packaging/signer.py and /var/www/pyvorin/edge_sdk/pyvorin_edge/packaging/verifier.py.
Bundle Signing with Ed25519
The SDK provides BundleSigner, which generates Ed25519 key pairs, hashes every file in a bundle, and writes a signed manifest. The private key never leaves the build server. The public key is baked into each device image as a trust anchor.
Signing a Release Bundle
# On the build server
pyv-edge-sign \
  --bundle-dir ./dist/edge-agent-v1.4.2 \
  --private-key /secure/signing.key \
  --output-manifest ./dist/edge-agent-v1.4.2/manifest.json
The same operation is available programmatically through BundleSigner.sign_bundle(), implemented in signer.py:
from pathlib import Path

from pyvorin_edge.packaging.signer import BundleSigner

# Load your securely stored private key
private_key = Path("/secure/signing.key").read_bytes()

# Sign the bundle
BundleSigner.sign_bundle(
    bundle_dir="./dist/edge-agent-v1.4.2",
    private_key=private_key,
    output_manifest="./dist/edge-agent-v1.4.2/manifest.json",
)
The manifest contains:
- bundle_name: human-readable identifier
- version: semantic version string
- timestamp: Unix epoch seconds
- files: mapping of relative paths to SHA-256 hex digests
- signature: base64-encoded Ed25519 signature of the canonical manifest JSON
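To make the files and signature fields more concrete, here is a minimal sketch of how a file-digest map and a reproducible byte representation of the manifest could be produced. The helper names hash_bundle_files and canonical_manifest_bytes are hypothetical, and the canonical form used here (sorted keys, compact separators) is an assumption: the article does not show how signer.py canonicalizes the JSON before signing.

```python
import hashlib
import json
from pathlib import Path


def hash_bundle_files(bundle_dir: Path) -> dict[str, str]:
    """Map each file's POSIX-style relative path to its SHA-256 hex digest."""
    manifest_path = bundle_dir / "manifest.json"
    digests = {}
    for path in sorted(bundle_dir.rglob("*")):
        # Skip directories and the top-level manifest, which cannot list its own hash.
        if path.is_file() and path != manifest_path:
            rel = path.relative_to(bundle_dir).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def canonical_manifest_bytes(manifest: dict) -> bytes:
    """Serialize with sorted keys and fixed separators (assumed canonical form)."""
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")
```

Whatever canonical form the signer uses, the verifier has to reproduce it byte for byte, because the signature covers bytes rather than parsed JSON.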
Version Manifest Format
A well-formed manifest is required by BundleVerifier.verify_at_runtime() on the device. Below is an example generated by sign_bundle():
{
  "manifest": {
    "bundle_name": "edge-agent-v1.4.2",
    "version": "1.4.2",
    "timestamp": 1716979200,
    "files": {
      "main.py": "a3f5c8...",
      "config.toml": "e7b2d1...",
      "adapters/mqtt.py": "9c4a11..."
    }
  },
  "signature": "base64Ed25519Signature..."
}
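Before any hashing takes place, a device-side tool could run a cheap structural check on the manifest. The sketch below is one way to do that; manifest_shape_errors is a hypothetical helper, not part of the SDK, and the rules it applies (64 lowercase hex characters per digest, no absolute paths or ".." segments) are assumptions layered on top of the format shown above.

```python
import re

_SHA256_HEX = re.compile(r"^[0-9a-f]{64}$")
REQUIRED_FIELDS = ("bundle_name", "version", "timestamp", "files")


def manifest_shape_errors(signed: dict) -> list[str]:
    """Return a list of structural problems; an empty list means the shape looks fine."""
    manifest = signed.get("manifest")
    if not isinstance(manifest, dict):
        return ["missing 'manifest' block"]
    errors = []
    if not isinstance(signed.get("signature"), str) or not signed["signature"]:
        errors.append("missing 'signature'")
    for field in REQUIRED_FIELDS:
        if field not in manifest:
            errors.append(f"missing '{field}'")
    files = manifest.get("files", {})
    if not isinstance(files, dict):
        errors.append("'files' must be a mapping")
        return errors
    for rel_path, digest in files.items():
        # Reject paths that could point outside the bundle directory.
        if rel_path.startswith("/") or ".." in rel_path.split("/"):
            errors.append(f"unsafe path: {rel_path}")
        if not isinstance(digest, str) or not _SHA256_HEX.match(digest):
            errors.append(f"bad digest for {rel_path}")
    return errors
```

The truncated digests in the example manifest ("a3f5c8...") would be reported by this check; real manifests carry full 64-character digests.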
Atomic Swap Strategy
Updates must not corrupt a running agent. The safest approach on a Unix filesystem is a staging directory plus a symlink swap.
Directory Layout on Device
/opt/pyvorin-edge/
├── current -> versions/v1.4.1/ # Symlink to active version
├── previous -> versions/v1.4.0/ # Symlink for rollback
├── versions/
│ ├── v1.4.0/
│ ├── v1.4.1/
│ └── v1.4.2/ # Staging directory
└── trust_anchor.json
The Swap Procedure
1. Download the new bundle into versions/v1.4.2/.
2. Verify the bundle signature and file hashes using BundleVerifier.verify_bundle().
3. Atomically update previous to point to the current version.
4. Atomically update current to point to versions/v1.4.2/.
5. Restart the EdgeAgent systemd service.
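Steps 3 and 4 each create a temporary symlink and then move it into place with os.replace(). On Linux, os.replace() maps to rename(2), so each link update is atomic: a reader sees either the old target or the new one. The two updates together are not a single transaction, so a crash between them can leave previous and current pointing at the same version.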
import os
from pathlib import Path


def atomic_swap(base_dir: Path, new_version: str) -> None:
    versions_dir = base_dir / "versions"
    current_link = base_dir / "current"
    previous_link = base_dir / "previous"
    new_target = versions_dir / new_version

    if not new_target.is_dir():
        raise RuntimeError(f"Staging directory missing: {new_target}")

    # 1. Point 'previous' to whatever 'current' points to now
    temp_previous = base_dir / ".previous.tmp"
    temp_previous.unlink(missing_ok=True)  # clear a leftover from an interrupted run
    if current_link.is_symlink():
        temp_previous.symlink_to(os.readlink(current_link))
        os.replace(temp_previous, previous_link)

    # 2. Point 'current' to the new version
    temp_current = base_dir / ".current.tmp"
    temp_current.unlink(missing_ok=True)  # clear a leftover from an interrupted run
    temp_current.symlink_to(str(new_target))
    os.replace(temp_current, current_link)
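The temporary-link-plus-rename pattern can be tried safely in a scratch directory. The following sketch factors the pattern into a small helper and exercises it against a throwaway layout; replace_symlink and demo are hypothetical names used only for illustration, and the example assumes a POSIX filesystem where symlinks can be created without special privileges.

```python
import os
import tempfile
from pathlib import Path


def replace_symlink(link: Path, target: str) -> None:
    """Point `link` at `target` via a temporary link and an atomic rename."""
    tmp = link.with_name(f".{link.name}.tmp")
    tmp.unlink(missing_ok=True)  # clear a leftover from an interrupted run
    tmp.symlink_to(target)
    os.replace(tmp, link)  # replaces the link itself, not the directory it points to


def demo() -> tuple[str, str]:
    """Swap 'current' between two versions in a temporary directory."""
    with tempfile.TemporaryDirectory() as d:
        base = Path(d)
        for version in ("v1.4.1", "v1.4.2"):
            (base / "versions" / version).mkdir(parents=True)
        replace_symlink(base / "current", "versions/v1.4.1")
        before = os.readlink(base / "current")
        replace_symlink(base / "current", "versions/v1.4.2")
        after = os.readlink(base / "current")
        return before, after
```

Calling demo() returns the link target before and after the swap, which makes it easy to confirm that the rename replaced the link rather than moving something into the old version directory.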
Rollback on Failure
After swapping, the device must confirm the new version is healthy before committing to it. If the health check fails, the device reverts the symlink and restarts the agent.
Health Check After Update
The EdgeAgent exposes GET /health, which returns a JSON payload built in main.py. A successful update must satisfy:
- status == "healthy"
- metrics.cpu_percent < 95
- metrics.disk_percent < 95
- cloud.queue_depth < 10000 (no immediate sync backlog explosion)
- agent.running == true
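The criteria above can be expressed as a pure function over the decoded /health payload, which keeps the thresholds testable without a running agent. This is a sketch under the assumption that the payload nests its fields as the dotted names suggest; health_failures is a hypothetical helper, and treating a missing field as a failure is a design choice made here, not something the article specifies.

```python
def health_failures(payload: dict) -> list[str]:
    """Return the names of the criteria that the payload fails (empty means healthy)."""
    metrics = payload.get("metrics") or {}
    cloud = payload.get("cloud") or {}
    agent = payload.get("agent") or {}
    checks = {
        "status": payload.get("status") == "healthy",
        # Missing values default to the threshold itself, so they count as failures.
        "cpu_percent": metrics.get("cpu_percent", 100) < 95,
        "disk_percent": metrics.get("disk_percent", 100) < 95,
        "queue_depth": cloud.get("queue_depth", 10000) < 10000,
        "agent.running": agent.get("running") is True,
    }
    return [name for name, ok in checks.items() if not ok]
```

Returning the failed criteria instead of a bare boolean gives the updater something useful to log when it decides to roll back.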
Rollback Procedure
import os
import time
from pathlib import Path

import requests


def rollback(base_dir: Path) -> None:
    current_link = base_dir / "current"
    previous_link = base_dir / "previous"

    if not previous_link.is_symlink():
        raise RuntimeError("No previous version to roll back to")

    temp_current = base_dir / ".current.tmp"
    temp_current.unlink(missing_ok=True)  # clear a leftover from an interrupted run
    temp_current.symlink_to(os.readlink(previous_link))
    os.replace(temp_current, current_link)
    # systemd will restart the agent after this function exits


def verify_health(endpoint: str = "http://127.0.0.1:8080/health", timeout: float = 30.0) -> bool:
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            resp = requests.get(endpoint, timeout=5)
            data = resp.json()
            if data.get("status") == "healthy" and data["agent"]["running"]:
                return True
        except Exception:
            # Connection errors and malformed payloads both count as "not healthy yet"
            pass
        time.sleep(2)
    return False
Update Channels
Not every device should receive bleeding-edge builds. Three channels let you stage risk:
| Channel | Purpose | Fleet % |
|---|---|---|
| stable | Battle-tested releases. Receive only patch and minor updates after a 48-hour canary bake period. | 90% |
| beta | Pre-release validation on representative hardware in real environments. | 9% |
| canary | Immediate deployment of every merged main branch build. Used to detect regressions before they reach beta. | 1% |
Each device stores its channel in /opt/pyvorin-edge/channel. The OTA poller reads this file and queries the update server with a ?channel= parameter.
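A poller might read the channel file defensively and build the query string with the standard library, as in the sketch below. The helper names read_channel and check_url are hypothetical; falling back to stable for a missing or unrecognized value is an assumption, chosen so that a corrupted file never moves a device onto a riskier channel.

```python
from pathlib import Path
from urllib.parse import urlencode

CHANNELS = ("stable", "beta", "canary")


def read_channel(base_dir: Path) -> str:
    """Return the configured channel, or 'stable' if the file is missing or invalid."""
    try:
        value = (base_dir / "channel").read_text(encoding="utf-8").strip()
    except OSError:
        return "stable"
    return value if value in CHANNELS else "stable"


def check_url(server: str, version: str, channel: str) -> str:
    """Build the update-check URL with properly escaped query parameters."""
    query = urlencode({"version": version, "channel": channel})
    return f"{server.rstrip('/')}/check?{query}"
```

For example, check_url("https://updates.example.com/", "1.4.1", "beta") yields https://updates.example.com/check?version=1.4.1&channel=beta, where the host name is a placeholder.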
Complete OTA Update Flow
The following Python script is a self-contained OTA updater that runs on the device. It downloads, verifies, stages, swaps, health-checks, and rolls back, relying on the SDK's BundleVerifier for integrity checks.
#!/usr/bin/env python3
"""OTA updater for Pyvorin Edge devices.

Uses BundleVerifier from the SDK and implements atomic
symlink swap with automatic rollback on health check failure.
"""
from __future__ import annotations

import argparse
import hashlib
import json
import os
import shutil
import sys
import tempfile
import time
from pathlib import Path
from typing import Any, Dict

import requests

from pyvorin_edge.packaging.signer import BundleVerificationError
from pyvorin_edge.packaging.verifier import BundleVerifier


class OTAUpdater:
    """Device-side OTA update orchestrator."""

    def __init__(self, base_dir: str, update_server: str) -> None:
        self.base_dir = Path(base_dir).resolve()
        self.update_server = update_server.rstrip("/")
        self.versions_dir = self.base_dir / "versions"
        self.current_link = self.base_dir / "current"
        self.previous_link = self.base_dir / "previous"
        self.verifier = BundleVerifier()

    def _local_version(self) -> str:
        manifest_path = self.current_link / "manifest.json"
        if not manifest_path.is_file():
            return "0.0.0"
        with open(manifest_path, "r", encoding="utf-8") as f:
            data = json.load(f)
        return data.get("manifest", {}).get("version", "0.0.0")

    def _channel(self) -> str:
        channel_file = self.base_dir / "channel"
        if channel_file.is_file():
            return channel_file.read_text().strip()
        return "stable"

    def _download_bundle(self, version: str, dest: Path) -> None:
        url = f"{self.update_server}/bundles/{version}.tar.gz"
        resp = requests.get(url, stream=True, timeout=120)
        resp.raise_for_status()
        dest.parent.mkdir(parents=True, exist_ok=True)

        # Hash the archive while streaming it to disk
        digest = hashlib.sha256()
        with tempfile.NamedTemporaryFile(delete=False, dir=dest.parent) as tmp:
            for chunk in resp.iter_content(chunk_size=8192):
                tmp.write(chunk)
                digest.update(chunk)
            tmp_path = tmp.name

        try:
            # Verify checksum if server provides one
            expected_hash = resp.headers.get("X-Bundle-Hash")
            if expected_hash and digest.hexdigest() != expected_hash:
                raise BundleVerificationError("Download hash mismatch")
            # The temp file has no .tar.gz suffix, so the format must be explicit
            shutil.unpack_archive(tmp_path, dest, format="gztar")
        finally:
            os.unlink(tmp_path)

    def _verify_staging(self, staging_dir: Path) -> None:
        """Verify bundle integrity at runtime using the trust anchor."""
        self.verifier.verify_at_runtime(staging_dir)

    def _point_link(self, link: Path, target: str) -> None:
        """Repoint a symlink via a temporary link and an atomic rename."""
        tmp = link.with_name(f".{link.name}.tmp")
        tmp.unlink(missing_ok=True)  # clear a leftover from an interrupted run
        tmp.symlink_to(target)
        os.replace(tmp, link)

    def _atomic_swap(self, new_version: str) -> None:
        staging = self.versions_dir / new_version
        if not staging.is_dir():
            raise RuntimeError(f"Staging directory missing: {staging}")

        if self.current_link.is_symlink():
            self._point_link(self.previous_link, os.readlink(self.current_link))
        self._point_link(self.current_link, str(staging))

    def _rollback(self) -> None:
        if not self.previous_link.is_symlink():
            raise RuntimeError("No previous version available for rollback")
        self._point_link(self.current_link, os.readlink(self.previous_link))

    def _health_check(self, timeout: float = 60.0) -> bool:
        endpoint = "http://127.0.0.1:8080/health"
        deadline = time.time() + timeout
        while time.time() < deadline:
            try:
                resp = requests.get(endpoint, timeout=5)
                data = resp.json()
                if (
                    data.get("status") == "healthy"
                    and data["agent"]["running"]
                    and data["metrics"]["cpu_percent"] < 95
                    and data["metrics"]["disk_percent"] < 95
                    and data["cloud"]["queue_depth"] < 10000
                ):
                    return True
            except Exception:
                # Connection errors and malformed payloads both count as "not healthy yet"
                pass
            time.sleep(3)
        return False

    def run(self) -> int:
        current_version = self._local_version()
        channel = self._channel()
        print(f"Current version: {current_version} Channel: {channel}")

        # 1. Check for update
        resp = requests.get(
            f"{self.update_server}/check",
            params={"version": current_version, "channel": channel},
            timeout=30,
        )
        resp.raise_for_status()
        update_info: Dict[str, Any] = resp.json()
        if not update_info.get("available"):
            print("No update available.")
            return 0

        new_version = update_info["version"]
        print(f"Update available: {new_version}")

        # 2. Download to staging
        staging_dir = self.versions_dir / new_version
        if staging_dir.exists():
            shutil.rmtree(staging_dir)
        self._download_bundle(new_version, staging_dir)

        # 3. Verify
        try:
            self._verify_staging(staging_dir)
            print("Bundle verification passed.")
        except BundleVerificationError as exc:
            print(f"Bundle verification failed: {exc}")
            shutil.rmtree(staging_dir)
            return 1

        # 4. Atomic swap
        self._atomic_swap(new_version)
        print(f"Swapped to {new_version}. Restarting agent...")

        # 5. Restart agent (caller must handle systemd restart)
        # In production, this script is invoked by systemd with:
        #   ExecStart=/usr/local/bin/ota-updater.py
        #   ExecStopPost=/usr/bin/systemctl restart pyvorin-edge
        # For this example we simulate a restart notification:
        print("Agent restart triggered. Waiting for health check...")
        time.sleep(5)  # Allow systemd to restart the service

        # 6. Health check
        if self._health_check():
            print("Health check passed. Update committed.")
            return 0
        else:
            print("Health check FAILED. Rolling back...")
            self._rollback()
            print("Rollback complete. Agent will restart to previous version.")
            return 1


def main() -> int:
    parser = argparse.ArgumentParser(description="Pyvorin Edge OTA Updater")
    parser.add_argument("--base-dir", default="/opt/pyvorin-edge", help="Base install directory")
    parser.add_argument("--server", required=True, help="Update server base URL")
    args = parser.parse_args()
    updater = OTAUpdater(base_dir=args.base_dir, update_server=args.server)
    return updater.run()


if __name__ == "__main__":
    sys.exit(main())
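One detail of _download_bundle() deserves care: the archive is unpacked before the bundle is verified, so a malicious tarball could try to write outside the staging directory. The sketch below shows one possible guard using the standard tarfile module. The function name safe_extract is hypothetical, and rejecting every link and device entry is an assumption that suits bundles made only of regular files and directories.

```python
import tarfile
from pathlib import Path


def safe_extract(archive: Path, dest: Path) -> None:
    """Extract a .tar.gz archive, refusing entries that could escape `dest`."""
    dest = dest.resolve()
    with tarfile.open(archive, "r:gz") as tar:
        for member in tar.getmembers():
            # Links and device nodes have no place in a plain code bundle.
            if member.issym() or member.islnk() or member.isdev():
                raise ValueError(f"Refusing special member: {member.name}")
            target = (dest / member.name).resolve()
            if target != dest and dest not in target.parents:
                raise ValueError(f"Path escapes destination: {member.name}")
        dest.mkdir(parents=True, exist_ok=True)
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions ship extraction filters; use the strict one.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
```

All members are checked before anything is written, so a rejected archive leaves the staging directory untouched.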
Integration with BundleVerifier.verify_at_runtime()
The verifier class in /var/www/pyvorin/edge_sdk/pyvorin_edge/packaging/verifier.py is called both during the OTA staging step and on every agent startup. The runtime verification flow is:
1. Load manifest.json from the bundle directory.
2. Verify the manifest block exists and is well-formed.
3. For every file listed in manifest.files, compute SHA-256 and compare against the expected hash.
4. Log each verification result. If any file is missing or mismatched, raise BundleVerificationError.
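This is exactly what OTAUpdater._verify_staging() delegates to. Note that the excerpt compares file hashes only; checking the manifest's Ed25519 signature is the job of BundleVerifier.verify_bundle(), used in step 2 of the swap procedure: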
# From /var/www/pyvorin/edge_sdk/pyvorin_edge/packaging/verifier.py
class BundleVerifier:
    def verify_at_runtime(self, bundle_dir: str | Path) -> bool:
        bundle_path = Path(bundle_dir).resolve()
        manifest_path = bundle_path / "manifest.json"
        if not manifest_path.is_file():
            raise BundleVerificationError(f"Manifest not found: {manifest_path}")

        with open(manifest_path, "r", encoding="utf-8") as f:
            signed_manifest: dict[str, Any] = json.load(f)

        manifest = signed_manifest.get("manifest")
        if manifest is None:
            raise BundleVerificationError("Malformed manifest: missing manifest block")

        files_info: dict[str, str] = manifest.get("files", {})
        all_valid = True
        for relative_path, expected_hash in files_info.items():
            file_path = bundle_path / relative_path
            if not file_path.is_file():
                logger.error("Missing file during runtime verification: %s", relative_path)
                all_valid = False
                continue
            actual_hash = self._hash_file(file_path)
            if actual_hash != expected_hash:
                logger.error("Hash mismatch for %s", relative_path)
                all_valid = False

        if not all_valid:
            raise BundleVerificationError("Runtime bundle verification failed")
        return True
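The excerpt calls self._hash_file(), which is not shown. A typical way to compute a SHA-256 digest without loading a large file into memory looks like the sketch below; this is an assumed implementation written for illustration, not the SDK's actual helper, and the 64 KiB chunk size is an arbitrary choice.

```python
import hashlib
from pathlib import Path


def hash_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"" at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Streaming matters on edge hardware, where a bundle can include model files or firmware blobs that are large relative to available RAM.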
Server-Side Manifest Endpoint
The update server must return a JSON payload that the OTA poller can parse. A minimal example:
from flask import Flask, request, jsonify

app = Flask(__name__)

VERSIONS = {
    "stable": "1.4.1",
    "beta": "1.4.2-rc2",
    "canary": "1.5.0-dev.3",
}


@app.route("/check")
def check():
    current = request.args.get("version", "0.0.0")
    channel = request.args.get("channel", "stable")
    latest = VERSIONS.get(channel, VERSIONS["stable"])
    return jsonify({
        "available": latest != current,
        "version": latest,
        "channel": channel,
        "download_url": f"/bundles/{latest}.tar.gz",
        "hash_url": f"/bundles/{latest}.sha256",
    })
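The endpoint above reports an update whenever latest differs from the device's version, which also offers a "downgrade" to a device that is already ahead of its channel. If that is not intended, the server could compare versions by precedence instead. The sketch below follows the ordering rules of Semantic Versioning for the version shapes used in this article; version_key and is_newer are hypothetical helpers, and the code assumes a three-part numeric core with an optional prerelease tag.

```python
def version_key(version: str) -> tuple:
    """Build a sortable key for versions like '1.4.2', '1.4.2-rc2', '1.5.0-dev.3'."""
    version = version.split("+", 1)[0]  # build metadata does not affect ordering
    core, _, pre = version.partition("-")
    numbers = tuple(int(part) for part in core.split("."))
    # Numeric prerelease identifiers sort before alphanumeric ones.
    pre_ids = tuple(
        (0, int(p), "") if p.isdigit() else (1, 0, p)
        for p in pre.split(".")
    ) if pre else ()
    # A release ranks above any prerelease that shares its core version.
    return (numbers, 0 if pre else 1, pre_ids)


def is_newer(candidate: str, current: str) -> bool:
    """Return True if `candidate` has higher precedence than `current`."""
    return version_key(candidate) > version_key(current)
```

With such a helper, the "available" field could be computed as is_newer(latest, current), so a device running 1.4.2 on the stable channel would not be told to install 1.4.1.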
Summary
A secure OTA pipeline for Pyvorin Edge requires four pillars: cryptographic signing (Ed25519 via BundleSigner), runtime verification (via BundleVerifier.verify_at_runtime()), atomic filesystem swaps, and automatic rollback driven by the agent's own /health endpoint. Separate releases into stable, beta, and canary channels to control blast radius. Never deploy an update you cannot roll back in under 30 seconds.