Requirements

What you need depends on the node type.

Figures marked (inferred)

Items marked (inferred) come from the node's code and defaults, not published specs — treat them as guidance, not a guarantee.

Local inference node

Requirements scale with the model you pick; the CLI wizard filters the catalog to models that fit in your free memory.

Runtime: Node.js ≥ 20.
OS / arch: macOS (Apple Silicon or Intel), Linux x64, Windows x64.
CPU: 2 cores minimum; throughput scales with core count.
Memory (inferred): ~8 GB for 7B on CPU, ~16 GB VRAM for 14B on a discrete GPU, ~32 GB unified for 14B on Apple Silicon, ~24 GB CUDA or ~64 GB Apple Silicon for 70B-class.
GPU: optional — Metal, CUDA, and ROCm auto-detect; CPU-only works but is slow.
Disk: 3–50 GB free per GGUF model.
Network: always-on outbound WSS to the gateway; outbound HTTPS for the model catalog.
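The catalog filtering described above can be pictured roughly as follows. This is a minimal sketch, not the wizard's actual code: the CatalogEntry type, the models_that_fit function, the model names, and the 1 GB headroom are all hypothetical, and the memory figures are the (inferred) ones from the list above, collapsed into a single memory pool for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    name: str            # hypothetical model name
    required_gb: float   # approximate memory needed to load and run the model


def models_that_fit(catalog, free_gb, headroom_gb=1.0):
    """Return the entries whose requirement fits in free memory minus headroom."""
    budget = free_gb - headroom_gb
    return [m for m in catalog if m.required_gb <= budget]


# Illustrative figures only, borrowed from the (inferred) memory guidance.
# Real entries would distinguish system RAM, VRAM, and unified memory.
CATALOG = [
    CatalogEntry("example-7b", 8.0),
    CatalogEntry("example-14b", 16.0),
    CatalogEntry("example-70b", 24.0),
]

if __name__ == "__main__":
    # With 18 GB free and 1 GB headroom, the budget is 17 GB.
    for entry in models_that_fit(CATALOG, free_gb=18.0):
        print(entry.name)
```

Keeping some headroom below the reported free memory is a design choice in this sketch; it avoids offering a model that would only just fit.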

A node that relays work to an external provider loads no model weights; its work is outbound API calls, so it stays light.

Runtime: a single Rust binary (no Python, CUDA, or llama.cpp), shipped in a debian:bookworm-slim image (tens of MB).
OS: anywhere Docker runs — Fly.io, Railway, Render, a VPS, Kubernetes.
CPU / GPU: minimal CPU, no GPU.
RAM (inferred): ~100–200 MB resident.
Disk: ~1 GB volume for identity and state.
Network: outbound WSS to the gateway and HTTPS to the provider; no inbound ports.
Provider credentials: the provider API key is read from an environment variable at serve time and is never logged.
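The credential handling above (read from the environment, never logged) follows a common pattern, sketched here in Python for illustration only. The node itself is a Rust binary, and the variable name PROVIDER_API_KEY and both helper functions are assumptions, not the node's real interface.

```python
import os

API_KEY_ENV = "PROVIDER_API_KEY"  # hypothetical variable name


def load_provider_key(env=None):
    """Read the provider key from the environment; fail fast if it is missing."""
    env = os.environ if env is None else env
    key = env.get(API_KEY_ENV, "").strip()
    if not key:
        # The error names the variable but never echoes a value.
        raise RuntimeError(f"{API_KEY_ENV} is not set")
    return key


def redact(secret):
    """Return a log-safe placeholder that reveals only the secret's length."""
    return f"<redacted:{len(secret)} chars>"
```

Anything that needs to mention the key in a log line would pass it through redact first, so the raw value never reaches the output.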

Persistent identity: the device identity under ~/.venna/ (a mounted volume under Docker) must survive restarts; losing it forfeits earnings attribution.
Connectivity: a persistent outbound connection with reconnect backoff. A mid-stream disconnect ends the in-flight job as incomplete.
Account: an operator account to pair the node with, via the OAuth 2.0 device authorization flow (RFC 8628).
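The reconnect backoff listed under Connectivity is not specified further here. One common approach is exponential backoff with full jitter, sketched below under assumed parameters: the function name, the 1 s base, and the 60 s cap are illustrative, not the node's actual values.

```python
import random


def backoff_delay(attempt, base=1.0, cap=60.0, rng=random.random):
    """Full-jitter exponential backoff.

    Returns a delay in seconds drawn uniformly from
    [0, min(cap, base * 2**attempt)), where attempt starts at 0.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


if __name__ == "__main__":
    # Print the upper bound and one sampled delay for the first few attempts.
    for attempt in range(8):
        ceiling = min(60.0, 1.0 * 2 ** attempt)
        print(attempt, ceiling, round(backoff_delay(attempt), 2))
```

The jitter spreads reconnect attempts out so that many nodes dropped at the same moment do not all retry in lockstep, and the cap keeps the worst-case wait bounded.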