SimulatorAdapter — Synthetic Data for Real Testing

June 2, 2026 | 14 min read

Why Simulate?

Hardware is expensive, fragile, and sometimes unavailable during development. The SimulatorAdapter generates realistic sensor streams with configurable noise, drift, anomaly injection, and scenario modes. You can validate window logic, tune rule thresholds, and demo pipelines to stakeholders—all without a single physical sensor.

SensorConfig

Before the simulator can produce data, you define each sensor with a SensorConfig dataclass. This tells the simulator the physical range, expected noise, and drift behaviour.

| Field | Type | Description |
|---|---|---|
| name | str | Sensor identifier. |
| sensor_type | SensorType | Enum member (TEMPERATURE, HUMIDITY, CO2, etc.). |
| unit | str | Display unit. |
| min_value | float | Hard lower clamp. |
| max_value | float | Hard upper clamp. |
| noise_std | float | Standard deviation of Gaussian noise. |
| noise_model | NoiseModel | Currently only GAUSSIAN. |
| drift_model | DriftModel | LINEAR, SINUSOIDAL, or NONE. |
| drift_rate | float | Units per second for linear drift. |
| drift_amplitude | float | Peak amplitude for sinusoidal drift. |
| drift_period_seconds | float | Period of sinusoidal drift (default 86400). |
| baseline | float | Central value around which noise is applied. |
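
Because readings are clamped to [min_value, max_value], a definition whose baseline sits outside that range, or whose bounds are inverted, would pin every reading to a clamp. The sketch below checks for this before a SensorConfig is built. check_sensor_fields is a hypothetical helper written for illustration, not part of the SDK, and the article does not say whether SensorConfig performs similar checks itself.

```python
def check_sensor_fields(min_value, max_value, baseline, noise_std):
    """Return a list of problems found in a prospective sensor definition.

    Hypothetical helper for illustration; not part of the SDK.
    """
    problems = []
    if min_value >= max_value:
        problems.append("min_value must be less than max_value")
    if not (min_value <= baseline <= max_value):
        problems.append("baseline lies outside [min_value, max_value]")
    if noise_std < 0:
        problems.append("noise_std must be non-negative")
    return problems


print(check_sensor_fields(10.0, 35.0, 21.0, 0.3))  # []
print(check_sensor_fields(10.0, 35.0, 40.0, 0.3))  # one problem: baseline out of range
```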

Basic Configuration


from pyv_edge_agent.ingest.simulator_adapter import SimulatorAdapter, SensorConfig, SensorType, NoiseModel, DriftModel

sensors = [
    SensorConfig(
        name="office_temp",
        sensor_type=SensorType.TEMPERATURE,
        unit="°C",
        min_value=10.0,
        max_value=35.0,
        noise_std=0.3,
        noise_model=NoiseModel.GAUSSIAN,
        baseline=21.0,
    ),
    SensorConfig(
        name="office_humidity",
        sensor_type=SensorType.HUMIDITY,
        unit="%RH",
        min_value=20.0,
        max_value=90.0,
        noise_std=1.5,
        baseline=45.0,
    ),
]

sim = SimulatorAdapter(sensors=sensors, seed=42)
  

Generating Single Readings

Use generate_reading() when you want precise control over timestamps, such as simulating a specific historical period or aligning with another data source.


from datetime import datetime, timezone

reading = sim.generate_reading("office_temp")
print(reading)
# {'sensor_name': 'office_temp', 'sensor_type': 'temperature', ...}

# Backdated reading
ts = datetime(2024, 6, 1, 12, 0, 0, tzinfo=timezone.utc)
reading = sim.generate_reading("office_temp", timestamp=ts)
print(reading["timestamp"])  # '2024-06-01T12:00:00+00:00'
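
To replay a historical period, it can help to build the timestamps first and then pass each one to generate_reading(). The sketch below uses only the standard library. evenly_spaced_timestamps is a hypothetical helper, and the 15-second step is an arbitrary choice.

```python
from datetime import datetime, timedelta, timezone


def evenly_spaced_timestamps(start, count, step_seconds):
    """Return `count` timestamps starting at `start`, `step_seconds` apart."""
    return [start + timedelta(seconds=i * step_seconds) for i in range(count)]


start = datetime(2024, 6, 1, 12, 0, 0, tzinfo=timezone.utc)
stamps = evenly_spaced_timestamps(start, count=4, step_seconds=15.0)
for ts in stamps:
    print(ts.isoformat())

# Each ts could then be passed on, for example:
#   sim.generate_reading("office_temp", timestamp=ts)
```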
  

Generating Batches

generate_batch() produces a time series as a list of lists. Each inner list is one sample frame containing one reading per requested sensor. This shape mirrors multi-sensor polling; flatten the frames into a single list of readings before passing them to Pipeline.run().


# 10 minutes of data at 1 Hz
frames = sim.generate_batch(
    duration_seconds=600.0,
    sample_rate_hz=1.0,
    sensor_names=["office_temp", "office_humidity"],
)

print(f"Total frames: {len(frames)}")          # 600
print(f"Sensors per frame: {len(frames[0])}")  # 2

# Feed into a pipeline (assumes `pipeline` is an already-configured Pipeline instance)
all_readings = [r for frame in frames for r in frame]  # flatten frames into one list
result = pipeline.run(all_readings)
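
When you want to inspect one sensor at a time rather than feed a pipeline, the same list-of-lists can be regrouped by sensor name. The sketch below runs on hand-built frames so that it works without the SDK. The "sensor_name" key appears in the reading shown earlier, while "value" is an assumed key name.

```python
from collections import defaultdict


def series_by_sensor(frames):
    """Group a list-of-lists of reading dicts into {sensor_name: [readings]}."""
    grouped = defaultdict(list)
    for frame in frames:
        for reading in frame:
            grouped[reading["sensor_name"]].append(reading)
    return dict(grouped)


# Hand-built frames in the shape described above; "value" is an assumed key.
frames = [
    [{"sensor_name": "office_temp", "value": 21.1},
     {"sensor_name": "office_humidity", "value": 44.8}],
    [{"sensor_name": "office_temp", "value": 20.9},
     {"sensor_name": "office_humidity", "value": 45.3}],
]
series = series_by_sensor(frames)
print({name: len(readings) for name, readings in series.items()})
```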
  

Noise Models

Currently the SDK ships with Gaussian noise. The value is computed as baseline + drift + N(0, noise_std²), then clamped to [min_value, max_value]. If you need coloured noise (pink, brown), supply a custom SimulatorAdapter subclass or post-process the readings.


noisy_sensor = SensorConfig(
    name="noisy_vibration",
    sensor_type=SensorType.VIBRATION,
    unit="mm/s",
    min_value=0.0,
    max_value=20.0,
    noise_std=2.0,
    baseline=5.0,
)
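
The formula above can be reproduced in a few lines of standard-library Python, which is handy for checking how often a given noise_std pushes values into the clamps. This is a standalone sketch of the stated formula, not the SDK's own code. simulate_value is a hypothetical name, and drift is passed in as an already-computed number.

```python
import random


def simulate_value(baseline, drift, noise_std, min_value, max_value, rng):
    """baseline + drift + Gaussian noise, clamped to [min_value, max_value]."""
    raw = baseline + drift + rng.gauss(0.0, noise_std)
    return max(min_value, min(max_value, raw))


rng = random.Random(42)  # fixed seed so the run is repeatable
# Same parameters as the noisy_vibration sensor, with no drift.
samples = [simulate_value(5.0, 0.0, 2.0, 0.0, 20.0, rng) for _ in range(1000)]
clamped = sum(1 for s in samples if s in (0.0, 20.0))
print(f"min={min(samples):.2f} max={max(samples):.2f} clamped={clamped}")
```

With a baseline of 5.0 and noise_std of 2.0, the lower clamp at 0.0 sits 2.5 standard deviations below the baseline, so roughly 0.6% of raw samples would be clamped there.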
  

Drift Models

Real sensors drift over time: thermocouples age and pressure transducers fatigue. The simulator supports two drift models, selected through drift_model (DriftModel.NONE disables drift):

  • Linear: the value shifts by drift_rate × elapsed_seconds.
  • Sinusoidal: the value oscillates by drift_amplitude × sin(2π × elapsed_seconds / drift_period_seconds), useful for modelling daily temperature cycles.

sensors = [
    SensorConfig(
        name="solar_panel_temp",
        sensor_type=SensorType.TEMPERATURE,
        unit="°C",
        min_value=-10.0,
        max_value=80.0,
        noise_std=0.5,
        drift_model=DriftModel.SINUSOIDAL,
        drift_amplitude=15.0,
        drift_period_seconds=86400.0,
        baseline=25.0,
    ),
    SensorConfig(
        name="aging_battery",
        sensor_type=SensorType.VOLTAGE,
        unit="V",
        min_value=3.0,
        max_value=4.2,
        noise_std=0.02,
        drift_model=DriftModel.LINEAR,
        drift_rate=-1e-6,  # Lose ~0.086 V per day
        baseline=4.0,
    ),
]

sim = SimulatorAdapter(sensors=sensors, seed=123)
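
The two drift formulas are simple enough to evaluate by hand, which is a quick way to sanity-check a drift_rate or drift_amplitude before running a long simulation. The functions below are standalone restatements of the formulas in the list above, not the SDK's implementation.

```python
import math


def linear_drift(drift_rate, elapsed_seconds):
    """Offset accumulated by linear drift after elapsed_seconds."""
    return drift_rate * elapsed_seconds


def sinusoidal_drift(drift_amplitude, drift_period_seconds, elapsed_seconds):
    """Offset of sinusoidal drift at elapsed_seconds into the cycle."""
    phase = 2.0 * math.pi * elapsed_seconds / drift_period_seconds
    return drift_amplitude * math.sin(phase)


day = 86400.0
battery_after_one_day = linear_drift(-1e-6, day)        # about -0.0864 V
panel_at_quarter_day = sinusoidal_drift(15.0, day, day / 4.0)  # about +15.0 °C (peak)
print(battery_after_one_day, panel_at_quarter_day)
```

At -1e-6 V per second, the aging_battery sensor would take 1,000,000 seconds (about 11.6 days) to fall from its 4.0 V baseline to the 3.0 V lower clamp, ignoring noise.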
  

Anomaly Injection

Unless a scenario is active, the simulator randomly injects anomalies with probability anomaly_probability per reading. Each anomaly lasts between 60 and 600 seconds and applies a sensor-type-specific offset (e.g., +5 to +15 °C for temperature). Readings produced during an anomaly carry an anomaly: True flag.


# 5 % chance of anomaly per reading
sim = SimulatorAdapter(sensors=sensors, seed=42, anomaly_probability=0.05)

anomaly_count = 0
for frame in sim.generate_batch(3600.0, 1.0):
    for r in frame:
        if r.get("anomaly"):  # .get() tolerates readings without the flag
            anomaly_count += 1

print(f"Anomalies injected: {anomaly_count}")
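
Because an anomaly persists for 60 to 600 seconds, a count of flagged readings at 1 Hz will usually be much larger than the number of distinct anomaly episodes. The sketch below separates the two by counting rising edges per sensor. It runs on hand-built frames in the documented list-of-lists shape, and anomaly_summary is a hypothetical helper.

```python
from collections import Counter


def anomaly_summary(frames):
    """Count flagged readings and distinct anomaly episodes per sensor."""
    flagged = Counter()
    episodes = Counter()
    previously_flagged = {}
    for frame in frames:
        for reading in frame:
            name = reading["sensor_name"]
            is_anomaly = bool(reading.get("anomaly", False))
            if is_anomaly:
                flagged[name] += 1
                if not previously_flagged.get(name, False):
                    episodes[name] += 1  # rising edge: a new episode starts
            previously_flagged[name] = is_anomaly
    return flagged, episodes


# Flags for one sensor over four frames: True, True, False, True.
frames = [
    [{"sensor_name": "office_temp", "anomaly": True}],
    [{"sensor_name": "office_temp", "anomaly": True}],
    [{"sensor_name": "office_temp", "anomaly": False}],
    [{"sensor_name": "office_temp", "anomaly": True}],
]
flagged, episodes = anomaly_summary(frames)
print(flagged["office_temp"], episodes["office_temp"])  # 3 2
```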
  

Scenario Modes

Scenarios override anomaly injection and produce coordinated, deterministic events. They are perfect for integration tests and demos.

| Scenario | Effect |
|---|---|
| NORMAL_DAY | Baseline behaviour, no overrides. |
| LEAK_EVENT | Leak sensor reads True between 10:00 and 11:00. |
| MOULD_RISK_DAY | Humidity +25 %RH, temperature −3 °C. |
| POWER_OUTAGE | Current and voltage drop to 0; temperature and humidity drift. |

from pyv_edge_agent.ingest.simulator_adapter import Scenario

sim.set_scenario(Scenario.POWER_OUTAGE)
reading = sim.generate_reading("office_temp")
print(reading["scenario"])  # 'power_outage'
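
For integration tests it is useful to know where a value should centre under a scenario. The sketch below applies the MOULD_RISK_DAY offsets from the table to the office sensors' baselines and checks a value against a tolerance expressed in noise standard deviations. Everything here is illustrative: the offsets dict, both helper names, the "humidity" type string, and the 4-sigma tolerance are assumptions, and the SDK may combine offsets, drift, and clamping differently.

```python
# Offsets taken from the MOULD_RISK_DAY row above; the dict itself is illustrative.
MOULD_RISK_DAY_OFFSETS = {"humidity": 25.0, "temperature": -3.0}


def expected_centre(baseline, sensor_type, offsets):
    """Baseline shifted by the scenario offset for this sensor type, if any."""
    return baseline + offsets.get(sensor_type, 0.0)


def within_tolerance(value, centre, noise_std, sigmas=4.0):
    """True when value lies within `sigmas` standard deviations of centre."""
    return abs(value - centre) <= sigmas * noise_std


humidity_centre = expected_centre(45.0, "humidity", MOULD_RISK_DAY_OFFSETS)
temp_centre = expected_centre(21.0, "temperature", MOULD_RISK_DAY_OFFSETS)
print(humidity_centre, temp_centre)  # 70.0 18.0
```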
  

Loading from Config File

Rather than hard-coding SensorConfig objects, you can load a JSON definition. This is the recommended approach for CI/CD environments where the same simulator config is shared across developer laptops and build agents.


# sensors.json
{
  "sensors": [
    {
      "id": "server_inlet",
      "type": "temperature",
      "unit": "°C",
      "normal_range": {"min": 18, "max": 24},
      "alert_threshold": {"min": 15, "max": 30},
      "noise_std": 0.3,
      "baseline": 21
    }
  ]
}
  

from pathlib import Path

sim = SimulatorAdapter.from_config_file(
    Path("sensors.json"),
    seed=42,
    anomaly_probability=0.02,
    scenario=Scenario.NORMAL_DAY,
)
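
Since the same file is shared between laptops and build agents, a quick structural check before loading can catch a broken edit early. The sketch below parses the sample definition with the standard json module and reports missing keys or an inverted normal_range. The set of required keys is an assumption based on the sample file, and find_config_problems is a hypothetical helper, not an SDK function.

```python
import json

# Assumed to be required, based on the sample file above.
REQUIRED_KEYS = {"id", "type", "unit", "noise_std", "baseline"}


def find_config_problems(text):
    """Return a list of human-readable problems found in a sensors.json string."""
    problems = []
    config = json.loads(text)
    for index, sensor in enumerate(config.get("sensors", [])):
        missing = REQUIRED_KEYS - set(sensor)
        if missing:
            problems.append(f"sensor {index}: missing {sorted(missing)}")
        normal = sensor.get("normal_range")
        if normal and normal["min"] >= normal["max"]:
            problems.append(f"sensor {index}: normal_range min >= max")
    return problems


sample = """
{
  "sensors": [
    {
      "id": "server_inlet",
      "type": "temperature",
      "unit": "°C",
      "normal_range": {"min": 18, "max": 24},
      "alert_threshold": {"min": 15, "max": 30},
      "noise_std": 0.3,
      "baseline": 21
    }
  ]
}
"""
print(find_config_problems(sample))  # []
```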
  

Full Working Example


from pyvorin_edge.pipeline import Pipeline, WindowConfig, RuleConfig
from pyvorin_edge.sensors import Sensor, SensorType
# SensorType is aliased to SimType to avoid clashing with pyvorin_edge.sensors.SensorType
from pyv_edge_agent.ingest.simulator_adapter import (
    SimulatorAdapter, SensorConfig, SensorType as SimType,
)

# 1. Configure simulator
sim = SimulatorAdapter(
    sensors=[
        SensorConfig(
            name="reactor_temp",
            sensor_type=SimType.TEMPERATURE,
            unit="°C",
            min_value=20.0,
            max_value=120.0,
            noise_std=0.5,
            baseline=85.0,
        ),
    ],
    seed=99,
    anomaly_probability=0.05,
)

# 2. Build pipeline
pipeline = Pipeline("reactor_monitor")
pipeline.add_sensor(Sensor(
    name="reactor_temp",
    sensor_type=SensorType.TEMPERATURE,
    unit="°C",
    normal_range=(80.0, 95.0),
    alert_threshold=100.0,
))
pipeline.add_window(WindowConfig(
    duration_seconds=300.0,
    window_type="rolling",
    sensor_name="reactor_temp",
))
pipeline.add_rule(RuleConfig(
    name="overheating",
    condition=lambda r: r.value > 100.0,
    severity="critical",
    cooldown_seconds=60.0,
))

# 3. Generate data and run
frames = sim.generate_batch(duration_seconds=3600.0, sample_rate_hz=0.1)
readings = [r for frame in frames for r in frame]
result = pipeline.run(readings)

print(f"Readings: {result.readings_processed}")
print(f"Events: {len(result.events)}")
print(f"Latency: {result.latency_ms:.3f} ms/reading")