How Pyvorin compiles Python to native code
Pyvorin is not a JIT guessing at runtime. It is an ahead-of-time compiler that analyses your source, figures out the types, and lowers everything to LLVM IR. Here is the full pipeline, from source to execution.
The compilation pipeline
Every function we target passes through five stages. If any stage rejects the code, we either fall back to CPython or raise a strict-mode error.
AST Parse
Source is parsed with the standard ast module. We walk the tree and
build an internal representation annotated with source locations.
Type Inference
A constraint solver propagates types through the graph. We track int, float, string, bool, list, dict, set, tuple, and file-handle types. Where types are ambiguous, we insert runtime guards or fall back.
Lower to LLVM IR
Typed AST nodes are translated into LLVM intermediate representation. Loop unrolling, vectorisation and inlining are applied by the LLVM opt pipeline.
Native Codegen
LLVM llc and the platform assembler emit a shared object. The binary is
cached on disk keyed by source hash and target architecture.
Load & Execute
The .so is loaded with ctypes. Arguments are marshalled from
Python objects to native types and results are returned back.
AST Parse & Normalisation
Pyvorin starts with Python’s built-in ast module. Unlike transpilers that
operate on source text, Pyvorin works on the structured AST, so whitespace
and comments are irrelevant and reformatting your code never changes what gets compiled.
During parsing, we also perform normalisation: list/dict literals are flattened,
chained comparisons (a < b < c) are expanded into pairwise comparisons, and tuple unpacking is
desugared into indexed assignments. This simplifies the later stages.
- Preserves line numbers for precise error messages
- Detects unsupported constructs early (before type inference)
- Annotates every node with scope and variable references
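As a rough illustration of early detection with line numbers, the sketch below walks a parsed tree with the standard ast module and reports constructs on a deny-list. The deny-list contents and the find_unsupported helper are assumptions made for this example, not Pyvorin's actual rules or API.

```python
import ast

# Hypothetical deny-list for illustration; the real set of unsupported
# constructs is not specified in this document.
UNSUPPORTED_CALLS = {"eval", "exec"}
UNSUPPORTED_NODES = (ast.Yield, ast.YieldFrom, ast.Await)


def find_unsupported(source):
    """Return sorted (line, description) pairs for deny-listed constructs."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in UNSUPPORTED_CALLS:
                problems.append((node.lineno, f"call to '{node.func.id}'"))
        elif isinstance(node, UNSUPPORTED_NODES):
            problems.append((node.lineno, type(node).__name__))
    return sorted(problems)


SOURCE = """\
def process_data(rows):
    total = 0
    for row in rows:
        total += eval(row)
    return total
"""

for line, what in find_unsupported(SOURCE):
    print(f"line {line}: unsupported construct: {what}")
```

Because the check runs on the AST rather than on source text, an eval that only appears inside a comment or a string literal is not reported.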
# Python source
def benchmark():
    total = 0
    for i in range(1000):
        total += i * 2
    return total
# AST (simplified)
FunctionDef(name='benchmark')
├─ Assign(target='total', value=Constant(0))
├─ For(target='i', iter=Call(range, [1000]))
│  └─ AugAssign(target='total', op=Add(),
│               value=BinOp(left='i', op=Mult(), right=2))
└─ Return(value='total')
# Type inference trace
total: int ← Constant(0) is int
i: int ← range() yields int
i * 2: int ← int * int → int
total += ...: int ← int + int → int
return total: int ← return type is int
# If types are ambiguous:
value = some_list[0]
# → Pyvorin checks list element type at compile time
# or inserts a runtime type guard
Type Inference
We do not require type annotations. Instead, we run a fixed-point constraint solver over the AST that propagates types from assignments and return statements outward.
Supported types: int, float, bool, string,
list, dict, set, tuple, and
file_handle. For each variable, we track its position in a type lattice. If a variable
is used as both int and string in different branches, Pyvorin
either unifies the two to object (which triggers fallback) or emits a runtime guard.
- No annotations required - works with untyped legacy code
- Function return types are inferred from all return statements
- List/dict element types are tracked per-container
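The fixed-point idea described above can be sketched in a few lines. This is a deliberately small model, not Pyvorin's solver: it assumes a flat lattice (unknown, then a concrete type, then object), handles only assignments, range loops, returns, and +, -, * expressions, and treats any disagreement as object. The names join, expr_type and infer are made up for the example.

```python
import ast

TOP = "object"  # lattice top: the types disagree


def join(a, b):
    """Least upper bound in a flat lattice: None (unknown) < concrete < object."""
    if a is None:
        return b
    if b is None:
        return a
    return a if a == b else TOP


def expr_type(node, env):
    if isinstance(node, ast.Constant):
        return type(node.value).__name__       # 'int', 'float', 'str', 'bool'
    if isinstance(node, ast.Name):
        return env.get(node.id)                # None while still unknown
    if isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Sub, ast.Mult)):
        left, right = expr_type(node.left, env), expr_type(node.right, env)
        if left is None or right is None:
            return None                        # resolved on a later pass
        return join(left, right)
    return TOP                                 # outside the scope of this sketch


def infer(source):
    """Infer variable types for the first function in `source`."""
    func = ast.parse(source).body[0]
    env, changed = {}, True
    while changed:                             # repeat until nothing changes
        changed = False
        for node in ast.walk(func):
            if isinstance(node, ast.Assign) and isinstance(node.targets[0], ast.Name):
                name, new = node.targets[0].id, expr_type(node.value, env)
            elif isinstance(node, ast.AugAssign) and isinstance(node.target, ast.Name):
                name, new = node.target.id, expr_type(node.value, env)
            elif isinstance(node, ast.For) and isinstance(node.target, ast.Name):
                is_range = (isinstance(node.iter, ast.Call)
                            and isinstance(node.iter.func, ast.Name)
                            and node.iter.func.id == "range")
                name, new = node.target.id, ("int" if is_range else TOP)
            elif isinstance(node, ast.Return) and node.value is not None:
                name, new = "<return>", expr_type(node.value, env)
            else:
                continue
            merged = join(env.get(name), new)
            if merged != env.get(name):
                env[name] = merged
                changed = True
    return env


BENCHMARK = """\
def benchmark():
    total = 0
    for i in range(1000):
        total += i * 2
    return total
"""

MIXED = """\
def pick(flag):
    if flag:
        x = 1
    else:
        x = "one"
    return x
"""

print(infer(BENCHMARK))
print(infer(MIXED))
```

The loop terminates because each variable can only move upward in the lattice, from unknown to a concrete type to object. In the second function x is assigned both an int and a str, so it ends at object, which is the case where the text says Pyvorin falls back or emits a guard.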
LLVM IR Generation
Once types are known, we lower the AST into LLVM intermediate representation. Every Python construct has a corresponding IR pattern:
- int + int becomes add i64 %a, %b, a single machine instruction
- for i in range(N) becomes a counted loop with phi nodes for the induction variable
- list.append(x) calls into the C runtime function nexus_list_append via a function pointer
- if/else becomes LLVM br instructions with basic block targets
We use llvmlite to construct IR programmatically. This is more robust
than generating IR as text strings - it prevents type mismatches and ensures every
basic block is properly terminated.
; LLVM IR for: total += i * 2
%prod = mul i64 %i, 2
%new_total = add i64 %total, %prod
store i64 %new_total, i64* %total_ptr
; Loop header with phi
loop_header:
  %i = phi i64 [ 0, %entry ], [ %i_next, %loop_body ]
  %cond = icmp slt i64 %i, 1000
  br i1 %cond, label %loop_body, label %loop_exit
loop_body:
  ; ... body instructions ...
  %i_next = add i64 %i, 1
  br label %loop_header
# On-disk cache layout
~/.pyvorin_cache/disk_compile/
├── ab3f2c1d_linux_x86_64_avx2.so
├── ab3f2c1d_linux_x86_64_avx2.json # metadata
├── e901a4b2_darwin_arm64_neon.so
└── ...
# Cache key = SHA256(source + python_version +
# pyvorin_version + cpu_features)
Native Code Generation
LLVM’s MC layer converts IR to target-specific machine code. Pyvorin invokes LLVM through
llvmlite’s JIT compilation API, which produces an in-memory executable. For
caching, the same IR is also serialised and compiled to a shared object on disk.
The optimisation pipeline includes:
- Loop unrolling - reduces branch overhead for small loops
- Auto-vectorisation - SIMD (AVX2/AVX-512/NEON) for data-parallel operations
- Inlining - eliminates call overhead for small functions
- Dead code elimination - removes unreachable branches
The resulting .so is cached, keyed by a SHA-256 hash of the source, the Python version,
the Pyvorin version, and the detected CPU features. Running on a machine with different
CPU features, such as one with AVX-512, changes the key and triggers an automatic recompile.
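A cache key of this kind could be computed roughly as follows. This is a sketch using only hashlib and platform from the standard library. The field order, the separator byte, and the cache_key name are assumptions, since the text only lists which inputs go into the hash.

```python
import hashlib
import platform


def cache_key(source, pyvorin_version, cpu_features):
    """SHA-256 over source, Python version, compiler version and CPU features.

    The ordering and the NUL separator are illustrative choices.
    """
    digest = hashlib.sha256()
    parts = (
        source,
        platform.python_version(),
        pyvorin_version,
        ",".join(sorted(cpu_features)),   # sorted so feature order is irrelevant
    )
    for part in parts:
        digest.update(part.encode("utf-8"))
        digest.update(b"\0")              # keeps ("ab", "c") distinct from ("a", "bc")
    return digest.hexdigest()


SOURCE = "def benchmark():\n    return 42\n"
key_avx2 = cache_key(SOURCE, "1.0.0", ["avx2", "sse4.2"])
key_avx512 = cache_key(SOURCE, "1.0.0", ["avx2", "avx512f", "sse4.2"])
print(key_avx2 != key_avx512)
```

Any change to one of the four inputs yields a different digest, which is what makes a stale binary impossible to load by accident.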
Load & Execute
The compiled shared object is loaded into the Python process with ctypes.CDLL.
We build a fast wrapper function that marshals Python arguments to C types and
converts the return value back to Python objects.
The marshalling layer is optimised for the common case:
- Integers and floats are passed as raw C values (no heap allocation)
- Strings are UTF-8 encoded/decoded with a fast path for ASCII
- Lists are converted to C arrays with reference-count tracking
- Dicts use a flat hash table with open addressing
If any argument cannot be marshalled (e.g., a custom object), Pyvorin falls back to CPython for that call. This means you can mix compiled and interpreted code freely.
# Python call site
result = benchmark() # compiled function
# What happens under the hood:
# 1. Wrapper checks arg types (none here)
# 2. Calls ctypes function pointer
# 3. Native code runs with the GIL released
# 4. GIL is re-acquired when the native call returns
# 5. Return value (i64) is converted to a Python int
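The wrapper-with-fallback behaviour can be sketched with ctypes alone. To keep the example runnable without a compiled .so, a ctypes callback stands in for the native function pointer; in a real build that pointer would come from ctypes.CDLL. The names make_wrapper, sum_doubled and Count are hypothetical.

```python
import ctypes

INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def sum_doubled(n):
    """Pure-Python version; also serves as the interpreter fallback."""
    return sum(i * 2 for i in range(n))


# Stand-in for a symbol loaded from a shared object: a callback exposed with
# the C signature int64_t f(int64_t). This is an assumption for the sketch.
PROTOTYPE = ctypes.CFUNCTYPE(ctypes.c_int64, ctypes.c_int64)
native_sum_doubled = PROTOTYPE(sum_doubled)


def make_wrapper(native, fallback):
    """Call `native` when the argument fits in an i64, else use `fallback`."""
    def wrapper(n):
        if type(n) is int and INT64_MIN <= n <= INT64_MAX:
            wrapper.last_path = "native"
            return native(n)              # ctypes converts int <-> int64_t
        wrapper.last_path = "fallback"
        return fallback(n)
    wrapper.last_path = None
    return wrapper


class Count:
    """A custom object the marshalling layer does not know how to convert."""
    def __init__(self, n):
        self.n = n

    def __index__(self):
        return self.n


fast_sum_doubled = make_wrapper(native_sum_doubled, sum_doubled)
print(fast_sum_doubled(1000), fast_sum_doubled.last_path)
print(fast_sum_doubled(Count(1000)), fast_sum_doubled.last_path)
```

Both calls return the same value. Only the path differs, which is why compiled and interpreted call sites can be mixed.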
C Runtime Libraries
Pyvorin’s compiled code does not call back into CPython for every list append or dict lookup. Instead, it uses optimised C runtimes that operate on raw memory.
nexus_builtins.so
Core arithmetic, comparisons, and math intrinsics. Exports sqrt, sin,
cos, isqrt, factorial, and random number generation.
Uses platform libm with AVX2 vectorisation where available.
list_runtime.so
Contiguous array-backed list operations. Supports append, get,
set, len, pop, and iteration. Grows with amortised
2× reallocation. Tuple literals are stored as fixed-size arrays.
dict_runtime.so
Flat hash table with open addressing and robin-hood hashing. Supports get,
set, contains, keys, values, items,
and dict comprehensions. The table is rehashed when the load factor reaches 0.7.
set_runtime.so
Hash set with the same backing structure as dicts. Supports union,
intersection, difference, symmetric_difference,
and set literals. Set comprehensions compile to native loops.
file_runtime.so
Buffered file I/O with handle table. Supports open, read,
write, readline, readlines, and close.
Uses fwrite followed by fflush, so written data leaves the user-space buffer immediately.
string_runtime.so
UTF-8 string operations including find, split, join,
strip, startswith, endswith, and slicing.
Immutable strings use reference counting for memory management.
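To make the dict_runtime.so description more concrete, here is a small Python model of an open-addressing table with robin-hood insertion and a 0.7 load-factor threshold. It is a teaching sketch only: the real runtime is written in C, and the class name, initial capacity and doubling policy here are assumptions. Deletion is omitted.

```python
class RobinHoodDict:
    """Open-addressing hash table with robin-hood insertion (no deletion)."""

    MAX_LOAD = 0.7

    def __init__(self, capacity=8):
        self._slots = [None] * capacity    # each slot is None or (key, value)
        self._size = 0

    def __len__(self):
        return self._size

    def _home(self, key):
        return hash(key) % len(self._slots)

    def _distance(self, index, key):
        """How far `key` sits from its home slot, with wrap-around."""
        return (index - self._home(key)) % len(self._slots)

    def set(self, key, value):
        if (self._size + 1) / len(self._slots) > self.MAX_LOAD:
            self._rehash()
        self._insert(key, value)

    def _insert(self, key, value):
        index, dist = self._home(key), 0
        while True:
            slot = self._slots[index]
            if slot is None:
                self._slots[index] = (key, value)
                self._size += 1
                return
            if slot[0] == key:
                self._slots[index] = (key, value)   # overwrite existing key
                return
            resident_dist = self._distance(index, slot[0])
            if resident_dist < dist:
                # The entry further from home takes the slot; continue
                # inserting the displaced resident instead.
                self._slots[index] = (key, value)
                key, value = slot
                dist = resident_dist
            index = (index + 1) % len(self._slots)
            dist += 1

    def _rehash(self):
        entries = [slot for slot in self._slots if slot is not None]
        self._slots = [None] * (len(self._slots) * 2)
        self._size = 0
        for key, value in entries:
            self._insert(key, value)

    def get(self, key, default=None):
        index, dist = self._home(key), 0
        while True:
            slot = self._slots[index]
            if slot is None:
                return default
            if slot[0] == key:
                return slot[1]
            if self._distance(index, slot[0]) < dist:
                return default             # key would have displaced this entry
            index = (index + 1) % len(self._slots)
            dist += 1


table = RobinHoodDict()
for i in range(100):
    table.set(f"key{i}", i)
print(len(table), table.get("key42"), table.get("missing"))
```

Robin-hood insertion keeps probe distances even across entries, which also lets a lookup stop early once it meets an entry that is closer to its home slot than the probe is.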
Fallback & Strict Mode
Pyvorin’s philosophy is safe by default. If any part of a function cannot be compiled, the entire function falls back to CPython. No partial compilation, no silent performance cliffs.
Every fallback is reported with a reason:
[FALLBACK] function: process_data
reason: unsupported construct 'eval'
line: 42
suggestion: replace with a lookup table
For CI/CD pipelines, enable strict mode to treat any fallback as an error:
compiler = PyvorinCompiler(strict=True)
compiler.compile(my_function)
# CompilationError if any fallback occurs
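The difference between the default behaviour and strict mode can be modelled as below. Everything here is a stand-in: try_compile pretends to be the compiler by rejecting functions that reference eval or exec, and compile_function is a hypothetical entry point, not the PyvorinCompiler API.

```python
class CompilationError(Exception):
    """Stand-in for the error that strict mode is described as raising."""


UNSUPPORTED_NAMES = {"eval", "exec"}    # illustrative only


def try_compile(func):
    """Pretend compile step: reject functions that reference unsupported names."""
    used = UNSUPPORTED_NAMES.intersection(func.__code__.co_names)
    if used:
        raise CompilationError(
            f"unsupported construct {sorted(used)[0]!r} in {func.__name__}"
        )
    return func    # a real compiler would return a native wrapper here


def compile_function(func, strict=False):
    """Return a compiled callable, or fall back to `func` unless strict."""
    try:
        return try_compile(func)
    except CompilationError as err:
        if strict:
            raise                        # CI: any fallback is a hard error
        print(f"[FALLBACK] function: {func.__name__}")
        print(f"  reason: {err}")
        return func                      # the whole function stays interpreted


def add_one(x):
    return x + 1


def evaluate(text):
    return eval(text)


print(compile_function(add_one)(1))
print(compile_function(evaluate)("1 + 1"))
```

With strict=True the second call raises CompilationError instead of printing a fallback report, which is the behaviour a CI pipeline would rely on.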
Deterministic
Same source + same environment = same binary. No non-determinism from compilation.
Sandboxed
Compiled code runs in a separate memory arena. Bounds checks prevent buffer overflows (disable with fast=True).
Observable
Every compilation produces a report: what compiled, what fell back, compile time, binary size, and optimisations applied.