FAQ - Pyvorin Native Python Acceleration

Basics

What is Pyvorin?

Pyvorin compiles plain Python to native machine code. You write normal Python, we analyse it, figure out the types, and hand it off to LLVM. What you get back is a .so file that runs directly on your CPU - no interpreter, no bytecode, no GIL getting in the way for the compiled bits.

You don't need type hints. You don't need to rewrite anything. If we can compile a function, it runs natively. If we can't, it falls back to CPython and we tell you exactly why.

Do I need to change my Python code?

In most cases, no. We work with standard Python 3. If your code uses constructs inside our supported subset - loops, arithmetic, list/dict/set operations, string manipulation, file I/O, and so on - it compiles automatically.

If your code uses something we cannot compile - dynamic exec(), weird attribute access, certain introspection tricks - we fall back to CPython for that function and tell you exactly why in the compile log.
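
As a rough illustration of that distinction, the two plain-Python functions below differ only in whether they stay inside the supported constructs. The function names are hypothetical and nothing here calls Pyvorin; the compile log remains the authority on what actually compiled.

```python
# Illustrative only: plain Python, no Pyvorin API is used here.

def word_counts(lines):
    """Loops, dict operations and string methods: the kind of code
    described above as compiling automatically."""
    counts = {}
    for line in lines:
        for word in line.strip().split():
            counts[word] = counts.get(word, 0) + 1
    return counts


def run_snippet(snippet):
    """Uses dynamic exec(), which is listed above as a fallback trigger."""
    namespace = {}
    exec(snippet, namespace)
    return namespace.get("result")


print(word_counts(["a b a", "b c"]))
print(run_snippet("result = 6 * 7"))
```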

How do I install Pyvorin?

pip install pyvorin

Pyvorin requires Python 3.10+ and a 64-bit Linux, macOS, or Windows system. LLVM and llvmlite are installed automatically as dependencies. For enterprise deployments with offline environments, see the installation guide.

Is Pyvorin open source?

Pyvorin is distributed under a commercial license with a generous free tier. The core compiler, runtime libraries, and benchmark suite are proprietary. We provide full source access to enterprise customers under NDA for security audits and compliance reviews.

What platforms are supported?

Linux: x86_64, ARM64 (glibc 2.28+)
macOS: x86_64, Apple Silicon (macOS 12+)
Windows: x86_64 (Windows 10/Server 2019+)

We target the CPU features available at compile time (SSE2, AVX2, AVX-512 on x86; NEON on ARM). Binaries are cached per architecture so you get optimal code for every machine.

Performance

How much faster is Pyvorin than CPython?

On number-crunching code - loops, arithmetic, list operations - we are typically 5–50× faster than CPython. String processing tends to be 2–10× faster. I/O-bound code won't see massive gains because the OS is the bottleneck, but you still get 1.2–2× from compiling the parsing logic.

We've published all 48 benchmarks with full methodology. No cherry-picking, no hiding the bad ones.

Why is native code faster than CPython bytecode?

CPython executes your code through an interpreter loop: fetch bytecode, decode, dispatch, repeat. Every operation carries overhead - type checks on every attribute access, reference-count increments/decrements, GIL acquisition.

We get rid of this overhead by:

Type specialization: Once we know the types, we emit direct machine instructions instead of generic PyNumber_Add calls.
No GIL in compiled regions: Native code runs outside the Python interpreter loop, eliminating GIL contention for CPU-bound work.
LLVM optimization: Loop unrolling, vectorization (SIMD), inlining, and dead-code elimination are applied automatically.
Direct memory layout: Lists and dicts use contiguous C arrays with direct pointer arithmetic instead of Python object indirection.
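
To see the interpreter overhead described above, the standard dis module can print the bytecode CPython runs for a one-line function. This sketch uses only the standard library; the function add is a made-up example.

```python
import dis


def add(a, b):
    return a + b


# Each instruction printed here is one trip through the interpreter loop.
for instr in dis.get_instructions(add):
    print(instr.opname, instr.argrepr)

# The addition is a single generic opcode (BINARY_ADD on Python 3.10,
# BINARY_OP on 3.11+). It carries no operand types, so the interpreter
# has to inspect the operands at run time.
binary_ops = [i.opname for i in dis.get_instructions(add)
              if i.opname.startswith("BINARY_")]

# The same bytecode serves ints, floats and strings alike.
print(add(1, 2), add(1.5, 2.5), add("py", "vorin"))
```

A type-specialized compiler can replace that generic opcode with a direct machine instruction once it knows both operands are, say, floats.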

Does Pyvorin help with I/O-bound or network-bound code?

Not much. If you are bottlenecked on disk, network, or a database, no compiler helps. That said, most I/O code has processing in between the waits - parsing, filtering, aggregating - and that part compiles natively.

For truly async I/O, keep using asyncio, aiohttp, or asyncpg. We don't replace your async runtime; we just speed up the synchronous bits between the awaits.
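
A minimal sketch of that split, using only asyncio from the standard library. The names fetch_line and parse_record are hypothetical, and asyncio.sleep(0) stands in for real network or database I/O.

```python
import asyncio


def parse_record(raw):
    """Synchronous CPU-side work between awaits: split and convert a
    'name,value' line. This is the part a compiler can speed up."""
    name, value = raw.strip().split(",")
    return name, int(value)


async def fetch_line(i):
    """Stand-in for real async I/O (hypothetical)."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return f"item{i},{i * 10}\n"


async def main():
    raws = await asyncio.gather(*(fetch_line(i) for i in range(3)))
    return [parse_record(raw) for raw in raws]


records = asyncio.run(main())
print(records)
```

The event loop and the awaits stay exactly as they are; only parse_record is a candidate for native compilation.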

What about startup time and compile latency?

First compilation of a function takes 50–500 ms depending on complexity. After that, the compiled .so is cached on disk, keyed by a hash of the source, the Python version, the Pyvorin version, and the detected CPU features. Subsequent runs load the cached binary in under 5 ms.

For serverless or short-lived processes, you can pre-compile your module at build time. The binaries are portable across identical architecture targets.

Does Pyvorin use multiple CPU cores?

Compiled functions release the GIL, so you can run them in parallel from multiple threads. We also support OpenMP-style prange loops and SIMD vectorization for single-function parallelism.
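
The threading pattern itself is ordinary Python. The sketch below uses only the standard library, with a made-up sum_of_squares function. Under plain CPython these threads take turns holding the GIL; the point above is that a compiled version of the same function would not hold it, so the chunks could run in parallel. How the function gets compiled is not shown here.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(bounds):
    """CPU-bound work on one chunk of the input range."""
    start, stop = bounds
    total = 0
    for i in range(start, stop):
        total += i * i
    return total


# Split 0..1_000_000 into four chunks, one per worker thread.
chunks = [(i * 250_000, (i + 1) * 250_000) for i in range(4)]

with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(sum_of_squares, chunks))

total = sum(partials)
print(total)
```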

Full automatic multithreading inside a single loop is on our roadmap for Q3 2025.

Compilation Pipeline

How does the compilation pipeline work?

Our pipeline has five stages:

AST Parse: Source is parsed with Python’s standard ast module into a typed internal representation.
Type Inference: A constraint solver propagates types through the graph. We track int, float, string, bool, list, dict, set, tuple, and file-handle types.
Lower to LLVM IR: Typed AST nodes are translated into LLVM intermediate representation. Loops, conditionals, and expressions become SSA-form IR.
Native Codegen: LLVM optimizes and emits a platform-specific shared object.
Load & Execute: The binary is loaded with ctypes. Arguments are marshalled from Python objects to native handles, and results are decoded back.
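
The first two stages can be illustrated with the standard ast module alone. The sketch below is a toy, not Pyvorin's constraint solver: the helper names infer and infer_assignments are hypothetical, and it handles only constants, names, and basic arithmetic on ints and floats.

```python
import ast

NUMERIC = {"int", "float"}
ARITH = (ast.Add, ast.Sub, ast.Mult, ast.Div, ast.FloorDiv, ast.Mod)


def infer(node, env):
    """Return a type name for a small subset of expressions, or None
    when this toy cannot decide."""
    if isinstance(node, ast.Constant):
        return type(node.value).__name__
    if isinstance(node, ast.Name):
        return env.get(node.id)
    if isinstance(node, ast.BinOp) and isinstance(node.op, ARITH):
        left, right = infer(node.left, env), infer(node.right, env)
        if left in NUMERIC and right in NUMERIC:
            # True division and any float operand produce a float.
            if isinstance(node.op, ast.Div) or "float" in (left, right):
                return "float"
            return "int"
    return None


def infer_assignments(source):
    """Walk top-level assignments in order and record inferred types."""
    env = {}
    for stmt in ast.parse(source).body:
        if isinstance(stmt, ast.Assign) and isinstance(stmt.targets[0], ast.Name):
            env[stmt.targets[0].id] = infer(stmt.value, env)
    return env


SOURCE = """
n = 10
scale = 2.5
total = n * scale
half = n / 2
name = 'x'
"""

types = infer_assignments(SOURCE)
print(types)
```

No annotations appear in SOURCE; every type comes from how the values are built and combined.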

What is fallback and when does it happen?

Fallback happens when we hit a construct we cannot compile. Instead of crashing or returning garbage, we route execution back to CPython. Every fallback is logged - you see exactly which function fell back and why.

Common fallback reasons:

Unsupported built-in function (e.g., eval, compile)
Dynamic typing that cannot be resolved
Complex exception handling patterns
C extension calls that mutate Python internals
Nested classes with metaclasses
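
The first reason in that list is easy to detect statically. The sketch below is a toy check built on the standard ast module, not Pyvorin's actual analysis; fallback_reasons and the message format are hypothetical.

```python
import ast

# Built-ins named in this FAQ as fallback triggers.
DYNAMIC_BUILTINS = {"eval", "exec", "compile"}


def fallback_reasons(source):
    """Map each top-level function name to the reasons this toy check
    would flag it (an empty list means nothing was found)."""
    report = {}
    for node in ast.parse(source).body:
        if not isinstance(node, ast.FunctionDef):
            continue
        reasons = []
        for child in ast.walk(node):
            if (isinstance(child, ast.Call)
                    and isinstance(child.func, ast.Name)
                    and child.func.id in DYNAMIC_BUILTINS):
                reasons.append(
                    f"line {child.lineno}: unsupported built-in {child.func.id}()")
        report[node.name] = reasons
    return report


SOURCE = '''
def total(xs):
    return sum(xs)

def run(code):
    return eval(code)
'''

report = fallback_reasons(SOURCE)
for name, reasons in report.items():
    print(name, "->", reasons or "no issues found")
```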

Can I force strict mode (no fallback)?

Yes. Pass strict=True to the compiler:

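from pyvorin import PyvorinCompiler

# my_function is any function you have already defined or imported
compiler = PyvorinCompiler(strict=True)
compiler.compile(my_function)  # raises CompilationError if it cannot compile natively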

In strict mode, if any part of the function cannot compile, we raise a CompilationError with a detailed explanation. No silent fallbacks. This is the right choice for CI/CD pipelines where native execution is a hard requirement.

How is the compiled binary cached?

Compiled binaries are stored in ~/.pyvorin_cache/disk_compile/. The cache key is a hash of:

Function source code (normalized)
Python version (major.minor)
Pyvorin version
CPU features detected at runtime (AVX2, AVX-512, NEON, etc.)

This means upgrading Python or moving to a machine with AVX-512 will trigger a recompile, but day-to-day runs reuse the cache instantly.
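
One way to picture such a key is the sketch below. It is a hypothetical derivation, not Pyvorin's actual scheme: the cache_key function, the use of ast.dump for normalization, the version string "1.0", and the feature names are all assumptions made for illustration.

```python
import ast
import hashlib
import platform


def cache_key(source, pyvorin_version, cpu_features):
    """Hypothetical key derivation; not Pyvorin's actual scheme."""
    # Normalize via the AST, which drops comments and formatting.
    normalized = ast.dump(ast.parse(source))
    major, minor = platform.python_version_tuple()[:2]
    parts = [
        normalized,
        f"{major}.{minor}",
        pyvorin_version,
        ",".join(sorted(cpu_features)),
    ]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


a = cache_key("def f(x):\n    return x + 1\n", "1.0", {"avx2"})
b = cache_key("def f(x):  # comment\n    return x+1\n", "1.0", {"avx2"})
c = cache_key("def f(x):\n    return x + 1\n", "1.0", {"avx2", "avx512f"})

print(a == b)  # formatting-only change: same key
print(a == c)  # different CPU features: different key
```

Hashing a normalized form means reformatting a file or editing a comment does not force a recompile, while any change to the four inputs does.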

Supported Python

Which Python constructs does Pyvorin compile natively?

Here is what we compile natively as of the latest release:

Data Types

int, float, bool, string
list, dict, set, tuple
file handles (open, read, write, readlines)

Control Flow

if / elif / else
for (range, list, enumerate, zip, reversed)
while, break, continue, return
Nested loops (with unrolling)

Functions & Classes

User-defined functions
Classes with __init__
Inheritance (single)
staticmethod, classmethod
Recursive functions (depth-limited)

Operations

Arithmetic (+, -, *, /, //, %, **)
Bitwise (&, |, ^, ~, <<, >>)
Comparisons, boolean logic
String methods (find, split, join, strip, etc.)
List/dict/set comprehensions

Stdlib

math (sqrt, sin, cos, isqrt, factorial, etc.)
random (randint, choice, random, shuffle)
time (time, sleep)
json (loads, dumps)

Exceptions

try / except
raise, assert

What is NOT yet supported?

We are transparent about limitations. The following currently trigger fallback:

Dynamic code generation (eval, exec, compile)
Custom metaclasses (other than ABCMeta)
Complex multiple inheritance
Direct mutation of C extension internals (numpy arrays, pandas DataFrames, etc. fall back gracefully to Python)

Do I need type hints?

No. We infer types automatically from how variables are used. Type hints are respected if present, but never required. This keeps us compatible with untyped legacy codebases.

Can Pyvorin compile third-party libraries like NumPy or Pandas?

Not directly - we compile Python source, not C extensions. But you can call NumPy or Pandas from your code, and we will compile the surrounding logic (loops, filters, aggregations) while leaving the library calls alone.

For arrays and dataframes, we recommend Polars or DuckDB (already native) for the heavy lifting, and Pyvorin for the orchestration and custom business logic around them.

Correctness & Guarantees

How do you guarantee correctness?

Every release is validated against a test suite of 166+ benchmarks covering all supported constructs. Each benchmark is compared against CPython ground truth. If we produce a different result, the release is blocked.

Our correctness pipeline:

Run CPython to get the expected result
Compile with us (native or fallback)
Compare outputs with approximate equality for floats and exact equality for ints/strings
Hash large outputs to detect subtle differences
Block release if any benchmark is wrong
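
The comparison steps can be sketched with the standard library. This is an assumed shape for such a harness, not our actual test code: outputs_match, digest, and validate are hypothetical names, and the tolerance defaults mirror the values quoted elsewhere in this FAQ.

```python
import hashlib
import math


def outputs_match(expected, actual, rel_tol=1e-9, abs_tol=1e-12):
    """Approximate equality for floats, exact equality for everything
    else, applied element-wise to lists and tuples."""
    if isinstance(expected, float) and isinstance(actual, float):
        return math.isclose(expected, actual, rel_tol=rel_tol, abs_tol=abs_tol)
    if isinstance(expected, (list, tuple)):
        return (type(expected) is type(actual)
                and len(expected) == len(actual)
                and all(outputs_match(e, a, rel_tol, abs_tol)
                        for e, a in zip(expected, actual)))
    return type(expected) is type(actual) and expected == actual


def digest(value):
    """Hash a large exact-valued output so runs can be compared cheaply."""
    return hashlib.sha256(repr(value).encode("utf-8")).hexdigest()


def validate(reference_fn, candidate_fn, *args):
    """Run both implementations and compare their results."""
    return outputs_match(reference_fn(*args), candidate_fn(*args))


# Two ways of computing a mean; the results may differ in the last bits.
ok = validate(lambda xs: sum(xs) / len(xs),
              lambda xs: math.fsum(xs) / len(xs),
              [0.1, 0.2, 0.3])
print(ok)
print(outputs_match(1, 1.0))  # an int is not accepted in place of a float
print(digest(list(range(1000))) == digest(list(range(1000))))
```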

What about floating-point differences?

We use the same IEEE-754 double-precision format as CPython. Minor differences can crop up from LLVM re-ordering floating-point operations for vectorization, different math library implementations, or fast-math optimizations (which are disabled by default).

By default we validate float results with a relative tolerance of 1e-9 and absolute tolerance of 1e-12. You can tighten or loosen this per-function.
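
The effect of re-ordering is easy to reproduce in plain Python. The snippet below uses only the standard library and shows why exact equality is the wrong test for floats, while the default tolerances above still accept the result.

```python
import math

left_to_right = (0.1 + 0.2) + 0.3   # the order CPython evaluates a + b + c
regrouped = 0.1 + (0.2 + 0.3)       # a different grouping of the same sum

print(left_to_right == regrouped)   # False
print(math.isclose(left_to_right, regrouped, rel_tol=1e-9, abs_tol=1e-12))  # True
```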

Can compiled code crash or segfault?

Compiled code operates on C-level memory and can theoretically segfault if a bug exists in the compiler. To mitigate this:

All compiled code is sandboxed in a separate memory arena
Bounds checks are emitted for list/dict accesses (can be disabled with fast=True for trusted code)
Reference counting is preserved for Python objects that cross the boundary
Our CI runs every benchmark under AddressSanitizer

vs Other Compilers

How does Pyvorin compare to Numba?

Numba is excellent for numerical computing but has limitations:

Numba requires a decorator such as @njit on each function you want compiled. We work module-wide with a single call.
Numba struggles with strings, dicts, sets, and file I/O. We compile all of those natively.
Numba's object mode fallback is opaque. We report every fallback with a reason.
Numba focuses on NumPy arrays. We focus on general Python.

For pure numeric array code, Numba may still be faster. For general Python (ETL, string processing, business logic), Pyvorin typically wins.

How does Pyvorin compare to Cython?

Cython requires you to write .pyx files with C-like type declarations. Pyvorin works directly on .py files with zero annotations.

Cython gives you more control - direct C API calls, manual memory management - but at the cost of language complexity. We trade some of that control for much simpler adoption.

How does Pyvorin compare to PyPy?

PyPy is a complete replacement for the CPython interpreter, built around a tracing JIT. It speeds up long-running programs but has drawbacks:

Warmup time: PyPy needs hundreds or thousands of iterations before the JIT kicks in. We are fast immediately after the first compile.
Compatibility: PyPy does not support all C extensions. We run on CPython and keep full C extension compatibility.
Memory: PyPy uses more memory. We use the same heap as CPython.

How does Pyvorin compare to Mojo?

Mojo is a new language that requires rewriting your code. We are a compiler for existing Python - no rewrite, no new syntax, no new toolchain.

Mojo may be faster for AI/ML kernels thanks to its MLIR backend and explicit memory management. We are faster to adopt because we work with your existing codebase today.

Enterprise & Licensing

What license models are available?

Free Tier: Up to 5 compiled functions, community support, no offline license. Perfect for evaluation and small scripts.
Pro: Unlimited compilation, email support, 2 offline seats. £99/month or £990/year.
Team: Up to 10 developers, shared license pool, CI/CD integration, Slack support. £499/month.
Enterprise: Unlimited seats, on-premise license server, source code escrow, dedicated support engineer, custom SLA. Contact sales.

Can we run Pyvorin in air-gapped environments?

Yes. Enterprise customers receive an on-premise license server that validates seats against your internal identity provider (LDAP, Active Directory, SAML). No internet connection is required after initial setup.

Do you offer source code escrow?

Yes. Enterprise agreements include source code escrow through a third-party trustee. If Pyvorin ceases operations, the escrow releases the full source code, build system, and documentation to licensed customers.

What SLAs are available?

Enterprise customers can choose:

Standard: 24-hour response, business hours
Premium: 4-hour response, 24/7 coverage
Critical: 1-hour response, 24/7 with dedicated engineer

Security & Privacy

Does Pyvorin send my source code to the cloud?

No. All compilation happens locally on your machine. We do not upload source code, ASTs, or compiled binaries anywhere. The only network traffic is license validation - a simple HTTPS ping with a hashed machine fingerprint.

Enterprise on-premise deployments have zero outbound network requirements.

Can compiled binaries contain backdoors or telemetry?

No. Pyvorin compiled binaries are pure machine code + a small C runtime for list/dict/string operations. There is no network code, no telemetry, and no hidden functionality in the runtime. Enterprise customers can audit the runtime source code (written in C) to verify this.
