Pricing

Seed

$29 / month
  • 3 runtime instances
  • Edge or server deployment
  • Behavior Trees + hybrid LLM execution
  • Agent Memory (shared blackboard)
  • Safety Containment + EscapeVector
  • Cryptographic execution receipts
  • Tamper-evident audit logs
  • Offline survival mode
  • Local + cloud routing
  • Fleet dashboard
  • OTA verified updates
  • MCP Integration
  • Council routing
  • 30-day log retention
  • Email support (48h)
Start free trial

Horizon

$149 / month
  • 50 runtime instances
  • Everything in Seed
  • Shadow mode
  • Speculative execution
  • Human-in-the-Loop approvals
  • Multi-Agent Swarms
  • Reflection Mode
  • SLO enforcement
  • Prometheus metrics
  • Advanced policy engine
  • 90-day log retention
  • Email support (24h)
Get Horizon

Infinite

$699 / month
  • 500 runtime instances
  • Everything in Horizon
  • On-premise deployment
  • Federated model aggregation
  • Multimodal inference
  • Custom retention policy
  • Dedicated onboarding
  • Priority support (8h)
Get Infinite

Questions and answers

Can I try Igris before paying?

Yes. Every plan includes a 7-day free trial — no credit card required to start. You get full access to your chosen tier during the trial. If you don't subscribe, your account downgrades to Seed at the end of the trial.

What hardware do I need?

Any x86_64 or ARM64 device running Linux or macOS with at least 512MB RAM. Tested on Raspberry Pi 4, NVIDIA Jetson, Apple Silicon, and standard servers. The runtime binary is under 16MB.

Do I need internet connectivity?

No. The runtime operates fully offline with local LLM inference via llama.cpp. When connectivity is available, it routes to cloud providers for better results. The switch between local and cloud is automatic — same API either way.
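
The fallback behavior can be pictured as follows. This is a minimal sketch, not the Igris API: the class and method names (Router, CloudProvider, LocalLlama, complete) are illustrative assumptions.

```python
# Sketch of automatic local/cloud routing behind one call surface.
# All names here are illustrative, not the actual Igris API.

class ProviderUnavailable(Exception):
    pass

class LocalLlama:
    """Stand-in for on-device inference (llama.cpp in the real system)."""
    def complete(self, prompt: str) -> str:
        return f"[local] {prompt}"

class CloudProvider:
    """Stand-in for a hosted provider that may be unreachable."""
    def __init__(self, online: bool):
        self.online = online
    def complete(self, prompt: str) -> str:
        if not self.online:
            raise ProviderUnavailable("no connectivity")
        return f"[cloud] {prompt}"

class Router:
    """Prefer cloud when reachable; fall back to local inference.
    Callers use the same complete() call either way."""
    def __init__(self, cloud: CloudProvider, local: LocalLlama):
        self.cloud, self.local = cloud, local
    def complete(self, prompt: str) -> str:
        try:
            return self.cloud.complete(prompt)
        except ProviderUnavailable:
            return self.local.complete(prompt)

router = Router(CloudProvider(online=False), LocalLlama())
print(router.complete("summarize this"))  # falls back to local inference
```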

What models can I use?

Any GGUF format model — Llama, Mistral, Phi-3, Qwen, and thousands more from HuggingFace. You can also fine-tune on-device with QLoRA and use your own custom models. Bring your own model, no vendor lock-in.

How is this different from using OpenAI/Anthropic directly?

We sit between your application and providers. Thompson Sampling learns which provider performs best for your workload. You get automatic failover, cost optimization, local fallback when cloud is down, and features like Council Mode and Speculative Execution that no single provider offers.

What is a runtime instance?

A runtime instance is one deployed copy of the Igris runtime executing autonomous workloads on a server, edge device, robot, or agent host.

Do I pay for inference?

No. Igris does not host AI models. You provide your own model providers such as OpenAI, Anthropic, DeepSeek, Gemini, or local models.

Can I scale beyond 500 instances?

Yes. Enterprise deployments can scale beyond 500 instances with volume pricing.

What happens if I exceed my instance limit?

Overage pricing applies at $2/device/month for Horizon and $1.50/device/month for Infinite. You'll only pay for the additional instances you use.

Can I upgrade or downgrade?

Yes. Upgrades take effect immediately. Downgrades apply at the start of your next billing cycle. No configuration or data is lost when changing plans.

What is Thompson Sampling?

A Bayesian learning algorithm that routes each request to the best provider based on observed latency, cost, error rate, and quality. It learns your specific workload patterns — starting with cautious exploration and converging to optimal routing after ~500 requests.
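
The core idea can be sketched with a Beta-Bernoulli bandit. This is a simplified illustration under an assumed binary reward (success/failure); the production scorer described above also weighs latency, cost, and quality.

```python
import random

class ThompsonRouter:
    """Beta-Bernoulli Thompson Sampling over providers.
    Reward is a single success flag here; the real system scores
    latency, cost, error rate, and quality together."""
    def __init__(self, providers):
        self.stats = {p: [1, 1] for p in providers}  # [alpha, beta] prior

    def choose(self):
        # Sample a success rate from each posterior; route to the max.
        return max(self.stats,
                   key=lambda p: random.betavariate(*self.stats[p]))

    def update(self, provider, success):
        self.stats[provider][0 if success else 1] += 1

random.seed(0)
router = ThompsonRouter(["openai", "anthropic", "local"])
true_rate = {"openai": 0.95, "anthropic": 0.80, "local": 0.60}
for _ in range(500):
    p = router.choose()
    router.update(p, random.random() < true_rate[p])

# After ~500 requests the posterior concentrates on the strongest provider.
best = max(router.stats,
           key=lambda p: router.stats[p][0] / sum(router.stats[p]))
print(best)
```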

What are Speculative Execution and Council Mode?

Speculative Execution races 2-3 providers in parallel and returns the fastest response that meets quality checks. Council Mode sends a request to multiple providers, has them evaluate each other's answers, then synthesizes the best response. Speed versus quality: you choose per request.
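
The racing pattern behind Speculative Execution looks roughly like this. Provider latencies are simulated and the function names are illustrative, not the Igris API.

```python
import asyncio

# Sketch of speculative execution: race providers in parallel, return
# the first result, cancel the rest. Latencies are simulated.

async def call_provider(name: str, latency: float) -> str:
    await asyncio.sleep(latency)  # stand-in for a real inference call
    return f"{name}: answer"

async def speculative(providers: dict) -> str:
    tasks = [asyncio.create_task(call_provider(name, lat))
             for name, lat in providers.items()]
    done, pending = await asyncio.wait(
        tasks, return_when=asyncio.FIRST_COMPLETED)
    for t in pending:
        t.cancel()  # discard slower in-flight requests
    return done.pop().result()

winner = asyncio.run(
    speculative({"openai": 0.05, "anthropic": 0.02, "local": 0.10}))
print(winner)  # anthropic responds first
```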

How do the AI agents work?

Planning agents break complex tasks into steps using chain-of-thought reasoning. Reflection agents self-critique and regenerate until quality thresholds are met. Swarm agents run multiple perspectives in parallel with consensus voting. All agents support tool use (HTTP, shell, filesystem) with sandboxed execution.
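
A reflection loop reduces to a generate-critique-retry cycle. In this sketch both generate() and score() are deterministic stand-ins for LLM calls, introduced purely for illustration.

```python
# Sketch of a reflection agent: generate, self-critique, regenerate
# until a quality threshold is met. score() stands in for an LLM critic.

def generate(task: str, attempt: int) -> str:
    return f"draft {attempt} for {task}"

def score(draft: str) -> float:
    # Stand-in critic: quality improves with each revision.
    return 0.4 + 0.2 * int(draft.split()[1])

def reflect(task: str, threshold: float = 0.9, max_rounds: int = 5) -> str:
    for attempt in range(1, max_rounds + 1):
        draft = generate(task, attempt)
        if score(draft) >= threshold:
            return draft  # quality threshold met
    return draft  # best effort after max_rounds

print(reflect("summarize report"))  # accepted on the third attempt
```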

What are Behavior Trees?

A hybrid execution engine combining deterministic control flow (sequence, selector, parallel nodes) with LLM-powered adaptive reasoning. The LLM can generate and modify subtrees at runtime, with watchdog safety and bounded execution guarantees.
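
The deterministic half of that engine can be sketched with the two classic node types. This minimal version omits parallel nodes, LLM-generated subtrees, and watchdogs; all names are illustrative.

```python
# Minimal behavior-tree sketch: a Sequence succeeds only if every child
# succeeds; a Selector succeeds on the first child that does.

SUCCESS, FAILURE = "success", "failure"

class Action:
    def __init__(self, name, fn):
        self.name, self.fn = name, fn
    def tick(self):
        return SUCCESS if self.fn() else FAILURE

class Sequence:
    def __init__(self, *children):
        self.children = children
    def tick(self):
        for child in self.children:
            if child.tick() == FAILURE:
                return FAILURE
        return SUCCESS

class Selector:
    def __init__(self, *children):
        self.children = children
    def tick(self):
        for child in self.children:
            if child.tick() == SUCCESS:
                return SUCCESS
        return FAILURE

tree = Selector(
    Sequence(Action("cloud_reachable", lambda: False),
             Action("route_to_cloud", lambda: True)),
    Action("route_to_local", lambda: True),  # fallback branch
)
print(tree.tick())  # succeeds via the local fallback
```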

Can I use it without the dashboard?

Yes. The runtime operates completely standalone. The dashboard is optional for fleet management and provides visibility into routing decisions, device health, and audit trails when you need to manage multiple devices.

How secure is the platform?

Every routing decision is cryptographically signed with Ed25519. API keys are encrypted at rest with AES-256-GCM. The runtime uses post-quantum TLS (Rustls + AWS-LC-RS). JWT authentication with token blacklisting protects all API endpoints. Tool execution runs in sandboxed environments with enforced resource limits on memory, CPU, and execution time.
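
Tamper evidence can be illustrated with a hash chain: each log entry commits to the previous entry's digest, so editing any record breaks every digest after it. Igris signs entries with Ed25519; plain SHA-256 chaining below is a stdlib stand-in for illustration, not the actual log format.

```python
import hashlib
import json

# Hash-chained audit log sketch. Each entry commits to the previous
# digest, so altering any entry is detectable on verification.

def append(log, event):
    prev = log[-1]["digest"] if log else "0" * 64
    body = json.dumps({"event": event, "prev": prev}, sort_keys=True)
    log.append({"event": event, "prev": prev,
                "digest": hashlib.sha256(body.encode()).hexdigest()})

def verify(log):
    prev = "0" * 64
    for entry in log:
        body = json.dumps({"event": entry["event"], "prev": prev},
                          sort_keys=True)
        if (entry["prev"] != prev
                or entry["digest"] != hashlib.sha256(body.encode()).hexdigest()):
            return False
        prev = entry["digest"]
    return True

log = []
append(log, "route: openai")
append(log, "route: local fallback")
assert verify(log)                 # untouched chain verifies
log[0]["event"] = "route: tampered"
print(verify(log))                 # False: the chain detects the edit
```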

Is my data sent to the cloud?

AI execution on-device stays on-device. Only metadata (device health, performance metrics, audit logs) syncs with the dashboard when online. If you route requests through cloud providers, that data goes to the provider you selected — we don't intercept or store it. Provider API keys are stored encrypted in your own vault.

What is EscapeVector?

A 72-hour encrypted response cache (AES-256-GCM) that activates when all providers fail. Pre-cached responses keep your system operational during extended outages. Combined with local LLM fallback, the platform degrades gracefully rather than failing.
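
The caching behavior can be sketched as a TTL store consulted only when providers fail. The 72-hour TTL mirrors EscapeVector; the AES-256-GCM encryption layer is omitted for brevity, and the class and method names are assumptions, not the real interface.

```python
import time

TTL_SECONDS = 72 * 3600  # 72-hour retention window

class FallbackCache:
    """TTL response cache used as a last resort when providers fail.
    `now` is injectable so expiry can be simulated without waiting."""
    def __init__(self, now=time.time):
        self.now, self.entries = now, {}

    def put(self, prompt, response):
        self.entries[prompt] = (response, self.now())

    def get(self, prompt):
        hit = self.entries.get(prompt)
        if hit is None:
            return None
        response, stored = hit
        if self.now() - stored > TTL_SECONDS:
            del self.entries[prompt]  # expired, drop it
            return None
        return response

clock = [0.0]
cache = FallbackCache(now=lambda: clock[0])
cache.put("status?", "all systems nominal")
clock[0] = 71 * 3600
print(cache.get("status?"))  # still served inside the 72h window
clock[0] = 73 * 3600
print(cache.get("status?"))  # None after expiry
```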

What is Gold Code?

An Ed25519-signed emergency override protocol. Gold Code patches are cryptographically verified before execution — only patches signed by your authorized keys are accepted. This gives you a secure way to push emergency fixes to fleet devices.

Do you train on my data?

No. We never train models on your data. QLoRA fine-tuning happens entirely on your device. Federated learning shares only encrypted model weight updates across your fleet — raw data never leaves the device.

What happens if a device goes offline?

The runtime continues operating with local LLM inference, cached responses via EscapeVector, and local agent execution. All decisions are still cryptographically signed. When connectivity returns, the device syncs telemetry and audit logs with the dashboard automatically.

How does fleet management work?

Devices register with Ed25519 signatures via the fleet API. The dashboard shows device health, telemetry, and configuration. You can push model updates, configuration changes, and emergency patches (Gold Code) to individual devices or your entire fleet with cryptographic verification.

Can I self-host?

Yes. The Infinite plan includes on-premise deployment. Enterprise plans support air-gapped operation with no external dependencies. The entire stack — runtime, routing, fleet management — runs within your infrastructure.

What observability do I get?

Prometheus-compatible metrics (150+), distributed request tracing, per-request cost tracking, routing decision audit logs, and provider performance leaderboards. The Cognitive Advisor (Infinite tier) automatically proposes optimizations based on observed patterns.

Complete control from edge to cloud.