Kubernetes Orchestration & Auto-Scaling
Production-grade Kubernetes deployment with horizontal, vertical, and event-driven auto-scaling for both the control plane and agent runtime.
Cluster Architecture
Lobstack runs on a dedicated Kubernetes cluster with strict namespace isolation. The control plane (API server) and agent runtime (individual AI agent pods) are separated into distinct namespaces with independent scaling policies, resource quotas, and security contexts.
```
lobstack-control-plane   # Lobstack API (Next.js) — 3-20 replicas
lobstack-agents          # Agent pods (gVisor sandbox) — 0-100+ pods
lobstack-vault           # HashiCorp Vault HA cluster — 3-5 replicas
lobstack-monitoring      # Prometheus, Falco, audit collection
lobstack-ingress         # Istio ingress gateway
istio-system             # Istio control plane (istiod)
```
Control Plane Deployment
The Lobstack API runs as a multi-replica Deployment with anti-affinity rules to spread pods across availability zones. This ensures no single zone failure can take down the platform.
| Property | Value | Purpose |
|---|---|---|
| Min Replicas | 3 | Always-on availability across zones |
| Max Replicas | 20 (50 in production) | Handle traffic spikes |
| Strategy | RollingUpdate (maxSurge: 1, maxUnavailable: 0) | Zero-downtime deploys |
| PDB | minAvailable: 2 | Survive node drains and upgrades |
| Topology Spread | maxSkew: 1 per zone | Even distribution across AZs |
| Startup Probe | 5s interval, 12 failures | 60s grace period for cold starts |
| Security Context | runAsNonRoot, readOnlyRootFilesystem, drop ALL | Minimal attack surface |
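The settings in the table above can be sketched as the following manifests. This is illustrative, not the actual deployment: the names, labels (`app: lobstack-api`), image, probe path, and port are assumptions.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: lobstack-api
  namespace: lobstack-control-plane
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: lobstack-api
  template:
    metadata:
      labels:
        app: lobstack-api
    spec:
      # Spread replicas evenly across availability zones
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: lobstack-api
      securityContext:
        runAsNonRoot: true
      containers:
        - name: api
          image: lobstack/api:latest        # Illustrative image name
          securityContext:
            readOnlyRootFilesystem: true
            capabilities:
              drop: ["ALL"]
          startupProbe:
            httpGet: { path: /healthz, port: 3000 }  # Assumed path/port
            periodSeconds: 5
            failureThreshold: 12            # 5s x 12 = 60s grace period
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: lobstack-api
  namespace: lobstack-control-plane
spec:
  minAvailable: 2        # Survive node drains and upgrades
  selector:
    matchLabels:
      app: lobstack-api
```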
```yaml
resources:
  requests:
    cpu: 250m        # Guaranteed baseline
    memory: 512Mi
  limits:
    cpu: "1"         # Burst up to 1 vCPU
    memory: 1Gi

# Production override:
resources:
  requests:
    cpu: 500m
    memory: 1Gi
  limits:
    cpu: "2"
    memory: 2Gi
```

Agent Runtime
Each AI agent runs in its own isolated Kubernetes pod with a gVisor sandbox runtime. Agents are created dynamically when a user provisions a new agent and destroyed on teardown. The orchestrator manages the full lifecycle:
Dynamic Pod Creation
The K8s orchestrator creates a dedicated pod + ClusterIP service per agent with Vault-injected secrets.
gVisor Sandbox
Every agent pod runs with RuntimeClass: gvisor (runsc handler) — an application-level kernel that intercepts syscalls.
Resource Isolation
CPU/memory limits enforced per tier (starter: 1 vCPU/2GB → enterprise: 8 vCPU/16GB). ResourceQuota caps the namespace.
Network Isolation
NetworkPolicies prevent inter-agent communication. Each pod can only reach the Lobstack API and external AI APIs.
Ephemeral Workspace
Agent workspace is an emptyDir volume with size limits per tier (5GB → 50GB). Data is ephemeral to the pod lifecycle.
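A default-deny egress policy of the kind described above might look like the sketch below. The policy name and CIDR exclusions are assumptions; a real deployment would also need a rule admitting traffic to the Lobstack API's namespace.

```yaml
# Illustrative NetworkPolicy: agent pods may reach DNS and the public
# internet (for external AI APIs) but not other pods in the cluster.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: agent-egress
  namespace: lobstack-agents
spec:
  podSelector: {}          # Applies to every agent pod
  policyTypes: ["Egress"]
  egress:
    - to:                  # Allow DNS resolution
        - namespaceSelector: {}
      ports:
        - protocol: UDP
          port: 53
    - to:                  # Allow external traffic, block private ranges
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.0.0.0/8
              - 172.16.0.0/12
              - 192.168.0.0/16
```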
| Tier | CPU Request | CPU Limit | Memory (request → limit) | Workspace |
|---|---|---|---|---|
| Starter | 250m | 1 vCPU | 512Mi → 2Gi | 5 Gi |
| Standard | 500m | 2 vCPU | 1Gi → 4Gi | 10 Gi |
| Performance | 1 vCPU | 4 vCPU | 2Gi → 8Gi | 20 Gi |
| Enterprise | 2 vCPU | 8 vCPU | 4Gi → 16Gi | 50 Gi |
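Putting the pieces together, a single agent pod at the Standard tier might look like the following sketch. The pod name and image are illustrative; the resource and workspace values come from the tier table above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: agent-7f3a              # Illustrative; real names are generated
  namespace: lobstack-agents
spec:
  runtimeClassName: gvisor      # runsc handler intercepts syscalls
  containers:
    - name: agent
      image: lobstack/agent-runtime:latest   # Assumed image name
      resources:
        requests:
          cpu: 500m             # Standard tier
          memory: 1Gi
        limits:
          cpu: "2"
          memory: 4Gi
      volumeMounts:
        - name: workspace
          mountPath: /workspace
  volumes:
    - name: workspace
      emptyDir:
        sizeLimit: 10Gi         # Ephemeral; destroyed with the pod
```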
Auto-Scaling
Lobstack uses three layers of auto-scaling to handle variable load efficiently — from steady-state traffic to sudden spikes in agent provisioning.
Horizontal Pod Autoscaler (HPA)
The Lobstack API scales horizontally based on CPU utilization, memory utilization, and HTTP request rate.
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: lobstack-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: lobstack-api
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70     # Scale up at 70% CPU
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80     # Scale up at 80% memory
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "100"        # Scale up at 100 RPS/pod
```

KEDA Event-Driven Scaling
Agent pods are scaled by KEDA based on the number of pending provisioning requests in the database. When users request new agents, KEDA detects the queue depth and spins up pods proactively.
```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: agent-pool
spec:
  scaleTargetRef:
    name: agent-pool
  minReplicaCount: 0      # Scale to zero when idle
  maxReplicaCount: 100    # 200 in production
  pollingInterval: 15     # Check every 15 seconds
  cooldownPeriod: 120     # Wait 2 min before scaling down
  triggers:
    - type: postgresql
      metadata:
        # Connection settings (host, dbName, credentials) omitted here
        query: "SELECT COUNT(*) FROM agent_instances WHERE status = 'provisioning'"
        targetQueryValue: "1"     # 1 pod per pending request
    - type: cpu
      metricType: Utilization
      metadata:
        value: "75"               # Also scale on CPU pressure
```

Vertical Pod Autoscaler (VPA)
VPA right-sizes resource requests based on actual usage patterns. It monitors CPU and memory consumption over time and adjusts requests to eliminate waste while preventing OOM kills.
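A VPA object implementing this behavior could look like the sketch below; the `updateMode` and the min/max bounds are assumptions, not the actual production values.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: lobstack-api
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: lobstack-api
  updatePolicy:
    updateMode: "Auto"      # Evict and recreate pods with new requests
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed: { cpu: 250m, memory: 512Mi }   # Assumed floor
        maxAllowed: { cpu: "2", memory: 2Gi }      # Assumed ceiling
```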
Cluster Autoscaler
When pods can't be scheduled due to insufficient node capacity, the cluster autoscaler provisions new nodes from the cloud provider. It uses a least-waste expander strategy and scales down idle nodes after 5 minutes.
| Parameter | Value | Description |
|---|---|---|
| scale-down-unneeded-time | 5 minutes | How long a node must be idle before removal |
| scale-down-utilization-threshold | 0.5 | Nodes below 50% utilization are candidates for removal |
| max-node-provision-time | 10 minutes | Timeout for new node to become ready |
| balance-similar-node-groups | true | Even distribution across node pools |
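The parameters in the table map directly onto cluster-autoscaler command-line flags; a sketch of how they would typically appear in the autoscaler container's args:

```yaml
command:
  - ./cluster-autoscaler
  - --expander=least-waste
  - --scale-down-unneeded-time=5m
  - --scale-down-utilization-threshold=0.5
  - --max-node-provision-time=10m
  - --balance-similar-node-groups=true
```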
Node Pool Architecture
The cluster uses dedicated node pools for different workload types, ensuring agents don't compete for resources with the control plane.
| Node Pool | Server Type | Count | Purpose |
|---|---|---|---|
| Control Plane | cpx31 (4 vCPU, 8 GB) | 3 | K8s API server, etcd, scheduler |
| API Workers | cpx21 (3 vCPU, 4 GB) | 3+ | Lobstack API, dashboard serving |
| Agent Workers | cpx41 (8 vCPU, 16 GB) | 5+ | gVisor-enabled agent pods |
Terraform managed: node pools and cluster configuration are defined in `infra/terraform/modules/k8s-cluster/`. Changes to cluster size are made through `terraform plan` and `terraform apply`.