Scaling

Infragate serverless instances scale automatically with demand: higher traffic launches more instances, and the platform scales back down as load subsides. Because credits are consumed per invocation and by execution time, costs rise and fall with traffic as well.

How it behaves

New instances are created as concurrent requests increase, and unnecessary instances are retired when load drops.
Idle services can scale to zero. The first request after an idle period may experience extra latency while a new instance boots (often called a cold start).
Bursts are handled by rapidly starting additional instances until capacity meets demand or account limits are reached.
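
As a rough illustration of the behavior above, the sketch below estimates how many instances a given load would call for. It assumes each instance handles a fixed number of concurrent requests and that the account has a hard instance cap; `per_instance_concurrency` and `account_limit` are hypothetical parameters, not documented Infragate settings, and the real scheduler may behave differently.

```python
import math


def instances_needed(concurrent_requests: int,
                     per_instance_concurrency: int,
                     account_limit: int) -> int:
    """Estimate the instance count for a given load (illustrative model only)."""
    if concurrent_requests <= 0:
        return 0  # an idle service can scale to zero
    # Round up: a partially filled instance is still a running instance.
    wanted = math.ceil(concurrent_requests / per_instance_concurrency)
    # Scaling stops at the (assumed) account-level instance cap.
    return min(wanted, account_limit)
```

For example, under these assumptions 25 concurrent requests at 10 per instance would need 3 instances, while a burst far beyond the cap would stop at `account_limit`.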

Concurrency and queuing

Each instance processes a limited amount of work at a time. When concurrency exceeds that capacity, the platform starts more instances to keep up.
If traffic grows faster than new instances can start, or you reach account limits, requests may queue temporarily. For resilience, have clients retry with exponential backoff and jitter.
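
A minimal client-side sketch of retries with exponential backoff and jitter follows. It uses the "full jitter" variant, where each delay is drawn at random between zero and an exponentially growing ceiling. `TransientError`, the default delays, and the attempt count are illustrative assumptions; map them to whatever errors and limits your client actually sees.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical error standing in for a throttled or timed-out request."""


def backoff_delay(attempt: int, base: float = 0.2, cap: float = 10.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Full jitter: a random delay between 0 and min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retries(fn: Callable[[], T], max_attempts: int = 5,
                      base: float = 0.2, cap: float = 10.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on TransientError with jittered exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(backoff_delay(attempt, base, cap))
    raise ValueError("max_attempts must be at least 1")
```

The jitter spreads retries from many clients over time, so a burst of queued requests is less likely to return as a second synchronized burst.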

Latency tips (mitigate cold starts)

Keep startup fast: minimize dependency size, avoid heavy global initialization, and lazy-load noncritical components.
Reuse connections (HTTP, database) across requests within the same instance to reduce per-invocation overhead.
Tune memory and CPU settings deliberately: more resources can shorten startup and execution time but may raise the cost per unit of time.
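
The first two tips can be sketched as follows. The `handler(event)` entry point and the `api.example.com` host are assumptions for illustration only; this page does not describe Infragate's actual handler signature. The connection is created on first use and kept in a module-level variable, so later invocations on the same instance reuse it, and a rarely needed import is deferred until the code path that requires it.

```python
import http.client
from typing import Optional

# Module-level state survives between invocations on the same instance.
_connection: Optional[http.client.HTTPSConnection] = None


def get_connection(host: str = "api.example.com") -> http.client.HTTPSConnection:
    """Create the connection on first use, then reuse it (host is a placeholder)."""
    global _connection
    if _connection is None:
        # The constructor does not open a socket; that happens on first request.
        _connection = http.client.HTTPSConnection(host, timeout=5)
    return _connection


def handler(event: dict) -> dict:
    """Hypothetical entry point; the real signature depends on the platform."""
    if event.get("format") == "csv":
        # Lazy import: only pay for these modules on the path that needs them.
        import csv
        import io
        buf = io.StringIO()
        csv.writer(buf).writerow(event["row"])
        return {"body": buf.getvalue()}
    conn = get_connection()
    conn.request("GET", event.get("path", "/"))
    response = conn.getresponse()
    return {"status": response.status, "body": response.read().decode()}
```

A reused connection can be closed by the remote side while an instance sits idle, so production code would typically reconnect and retry once on a connection error.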

Cost considerations

Shorter executions and efficient startup reduce total credits consumed.
Highly spiky traffic patterns benefit from scale-to-zero, but expect occasional cold starts unless steady traffic keeps instances warm.
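
To reason about these trade-offs, a simple linear cost model can help. The sketch below assumes a flat credit charge per invocation plus a charge per second of execution; both rates are placeholders, not published Infragate pricing, so substitute the figures from your own plan.

```python
def estimate_credits(invocations: int,
                     avg_duration_s: float,
                     credits_per_invocation: float,
                     credits_per_second: float) -> float:
    """Estimate total credits under an assumed linear pricing model."""
    # Each invocation pays a fixed charge plus a charge for its run time.
    per_call = credits_per_invocation + avg_duration_s * credits_per_second
    return invocations * per_call
```

Under this model, halving average execution time halves the time-based part of the bill, while the per-invocation part depends only on request volume.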