Kiro replaces guesswork autoscaling with pod-level intelligence — and mathematically guarantees that no busy pod is ever deleted during scale-down.
Kubernetes HPA makes scale-down decisions without knowing which pods are actively handling requests. Busy pods can be terminated mid-request, and the resulting failures are silent and costly.
When we designed Kiro's scale-down protection, we wanted something stronger than "it usually works." We proved it holds — no edge cases, no configuration dependencies.
Scrapes every pod individually every 15 seconds for loadFactor and isBusy. Knows exactly which pods are handling active work — not just the cluster average.
The pressure metric is floored at busyCount/total. HPA's desired replica count can never drop below the number of actively busy pods. Proven invariant.
Tracks UP↔DOWN direction alternations in a sliding window. When it detects 3+ oscillations in 5 minutes, it emits HOLD — preventing thrashing, cold-start latency, and surprise billing spikes.
Supports HTTP JSON endpoints, Prometheus text format, and JMX/RMI — whatever your application already exposes. No sidecar, no agent, no SDK required.
Embedded DuckDB — no Redis, no Kafka, no Postgres. Runs as a single pod. Ships anywhere EKS, OpenShift, or vanilla Kubernetes runs. Install in one Helm command.
Built-in web UI on port 8090. See pod states, scaling pressure, anti-churn events, and reconcile loop activity, with no separate monitoring stack needed.
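The anti-churn rule above (3 or more UP↔DOWN alternations within 5 minutes produces HOLD) can be illustrated with a minimal sketch. This is not Kiro's implementation: the `ChurnDetector` class, its `record` method, and the choice to count each adjacent change of direction as one oscillation are assumptions made for illustration.

```python
from collections import deque

WINDOW_SECONDS = 5 * 60  # sliding window length, from the text
MAX_ALTERNATIONS = 3     # 3+ alternations trigger HOLD, from the text


class ChurnDetector:
    """Hypothetical detector: counts UP<->DOWN direction flips in a sliding window."""

    def __init__(self, window=WINDOW_SECONDS, limit=MAX_ALTERNATIONS):
        self.window = window
        self.limit = limit
        self.events = deque()  # (timestamp_seconds, "UP" | "DOWN")

    def record(self, timestamp, direction):
        """Record a proposed direction; return it, or "HOLD" when churning."""
        if direction not in ("UP", "DOWN"):
            raise ValueError("direction must be 'UP' or 'DOWN'")
        self.events.append((timestamp, direction))
        # Drop events that have fallen out of the sliding window.
        while self.events and timestamp - self.events[0][0] > self.window:
            self.events.popleft()
        directions = [d for _, d in self.events]
        # Assumption: one oscillation = one adjacent pair with differing direction.
        flips = sum(1 for a, b in zip(directions, directions[1:]) if a != b)
        return "HOLD" if flips >= self.limit else direction
```

Feeding this detector UP, DOWN, UP, DOWN at one-minute intervals returns HOLD on the fourth call, because that call completes the third alternation inside the window. Once the older events age out, the next proposal passes through unchanged.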
Every 15 seconds Kiro hits every pod's metrics endpoint concurrently using Java 21 virtual threads.
every 15s · timeout 5s · 3 retries
Each pod's isBusy and loadFactor are stored. Stale pods (scrape failed) are assumed busy, erring on the safe side.
Before HPA acts, Kiro applies deletion protection to every non-idle pod at the Kubernetes scheduler level.
applied before every HPA cycle
Kiro serves kiro_scaling_pressure via the Custom Metrics API, floored at busyCount/total.
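The cycle described above can be sketched end to end: scrape every pod concurrently, treat failed scrapes as busy, floor the pressure at busyCount/total, and check what HPA would compute from it. This is a rough illustration under stated assumptions, not Kiro's code. The `fetch` callable, all function names, the use of the mean loadFactor as the unfloored pressure, the reading of "3 retries" as three attempts in total, and an HPA target value of 1.0 are all assumptions. Threads stand in for Java virtual threads, and the 5-second timeout is left to `fetch`. The only non-hypothetical part is the core HPA formula, `ceil(currentReplicas * metric / target)`, shown here without HPA's tolerance band or min/max clamping.

```python
import math
from concurrent.futures import ThreadPoolExecutor

SCRAPE_ATTEMPTS = 3  # "3 retries" in the text, read here as three attempts in total


def scrape_pod(pod, fetch, attempts=SCRAPE_ATTEMPTS):
    """Return (load_factor, is_busy); a pod that cannot be scraped counts as busy."""
    for _ in range(attempts):
        try:
            data = fetch(pod)  # hypothetical callable returning the pod's JSON as a dict
            return float(data["loadFactor"]), bool(data["isBusy"])
        except Exception:
            continue  # retry on any scrape failure
    return 1.0, True  # stale pod: assume busy (load of 1.0 is an assumption)


def scrape_all(pods, fetch):
    """Scrape every pod concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(pods))) as pool:
        return list(pool.map(lambda p: scrape_pod(p, fetch), pods))


def scaling_pressure(samples):
    """Mean load factor (an assumption), floored at busyCount/total."""
    total = len(samples)
    if total == 0:
        return 0.0
    busy = sum(1 for _, is_busy in samples if is_busy)
    raw = sum(load for load, _ in samples) / total
    return max(raw, busy / total)


def hpa_desired_replicas(current, metric, target=1.0):
    """Core HPA formula, without tolerance or min/max clamping."""
    return math.ceil(current * metric / target)
```

Under these assumptions, with the current replica count equal to the number of scraped pods, the floor gives `desired >= ceil(total * busy / total) = busy`, so the computed replica count cannot fall below the number of busy pods. A different HPA target value would change that arithmetic.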
KEDA, VPA, and standard HPA address different problems. Kiro is a safety layer, not a replacement — and complements all of them.
| Capability | Kiro | HPA (CPU/mem) | KEDA | VPA |
|---|---|---|---|---|
| Pod-level busy awareness | ✅ Per-pod scrape | ❌ Aggregate only | ❌ Aggregate only | N/A |
| Busy-pod delete guard | ✅ Math-proven | ❌ | ❌ | N/A |
| Anti-churn protection | ✅ HOLD signal | ❌ | ❌ | ❌ |
| Custom metric protocols | HTTP · JMX · Prometheus | — CPU/mem only | 50+ trigger types | — CPU/mem only |
| Zero infra dependencies | ✅ Embedded DuckDB | ✅ | ❌ Redis etc. | ✅ |
| Built-in live dashboard | ✅ Port 8090 | ❌ | ❌ | ❌ |
| Math-proven invariants | ✅ Unit-tested every build | ❌ | ❌ | ❌ |
| Drop-in HPA integration | ✅ Standard Custom Metrics API | ✅ Native | ✅ | Separate |
Add one endpoint to your service. Kiro handles everything else.
Expose GET /metrics returning {"loadFactor": 0.82, "isBusy": true}. Works with any language — Java, Python, Go, Node. Also supports Prometheus text format and JMX/RMI.
Single Helm chart. One pod. No external services. Ships to EKS, OpenShift 4.10+, and vanilla Kubernetes 1.22+.
Apply a KiroScaler resource pointing at your deployment with your upscaleLoadThreshold and downscaleLoadThreshold.
Point your existing HPA at kiro_scaling_pressure via the standard custom.metrics.k8s.io/v1beta2 API. No changes to HPA's min/max settings.
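For the first step, a service-side endpoint might look like the following minimal sketch, which uses only the Python standard library. The JSON field names come from the text above. Everything else is an assumption for illustration: the `STATE` dictionary, the definition of loadFactor as active requests divided by capacity, isBusy meaning at least one active request, and port 8080.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process state; a real service would update it as work starts and ends.
STATE = {"active_requests": 0, "capacity": 10}


def metrics_payload(state):
    """Build the JSON body with the two fields named in the text."""
    active = state["active_requests"]
    load = min(1.0, active / state["capacity"])  # assumed definition of loadFactor
    return {"loadFactor": round(load, 2), "isBusy": active > 0}


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = json.dumps(metrics_payload(STATE)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the metrics endpoint until interrupted (port is an assumption)."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the endpoint. With 3 active requests out of a capacity of 10, `GET /metrics` would return `{"loadFactor": 0.3, "isBusy": true}`.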
Per-KiroScaler licensing — priced per protected deployment, not per node. A 500-node cluster with 3 critical services pays for 3 scalers.
We'll walk through a live demo tailored to your cluster setup, show you the dashboard, and answer every question your team has before you commit a single line of code.