
Scaling

Infragate serverless instances scale automatically with demand: higher traffic launches more instances, and they scale back down as load subsides. Because credits are consumed per invocation and by execution time, this elasticity also means costs rise with traffic.

  • New instances are created as concurrent requests increase, and unnecessary instances are retired when load drops.
  • Idle services can scale to zero. The first request after an idle period may experience extra latency while a new instance boots (often called a cold start).
  • Bursts are handled by rapidly starting additional instances until capacity meets demand or account limits are reached.
  • Each instance processes a limited amount of work at a time. When concurrency exceeds that capacity, the platform starts more instances to keep up.
  • If demand grows faster than new instances can start, or you reach account limits, requests may queue temporarily. Use client retries with exponential backoff and jitter for resilience.
  • Keep startup fast: minimize dependency size, avoid heavy global initialization, and lazy-load noncritical components.
  • Reuse connections (HTTP, database) across requests within the same instance to reduce per-invocation overhead.
  • Tune memory/CPU settings deliberately: more resources can shorten startup and execution time but may raise the cost per unit of time.
  • Shorter executions and efficient startup reduce total credits consumed.
  • Highly spiky traffic patterns benefit from scale-to-zero, but expect occasional cold starts unless steady traffic keeps instances warm.
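
The client-side retry advice above (exponential backoff with jitter) can be sketched roughly as follows. This is a minimal illustration, not an Infragate API: `TransientError`, `backoff_delay`, `call_with_retries`, and the default `base`, `cap`, and `max_attempts` values are all assumptions chosen for the example. It uses the "full jitter" variant, where each wait is a random value between zero and an exponentially growing ceiling.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for retryable failures (e.g. throttling or a timeout)."""


def backoff_delay(attempt, base=0.2, cap=10.0, rng=random.random):
    """Full-jitter delay: a random value between 0 and min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retries(fn, max_attempts=5, base=0.2, cap=10.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn(); on TransientError, wait with backoff and jitter, then retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(backoff_delay(attempt, base, cap, rng))
```

The jitter spreads retries from many clients over time, so a burst of queued requests does not return in lockstep while new instances are still starting. The `sleep` and `rng` parameters exist only so the behavior can be tested without real waiting.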
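
The startup and connection-reuse advice above can be illustrated with a lazily created, cached client. This is only a sketch under stated assumptions: the `handler(event)` signature and the `FakeConnection` class are hypothetical stand-ins, since the actual Infragate handler interface and your real HTTP or database client are not described here. The point is the pattern: nothing expensive runs at import time, and the connection is created once per instance and reused by later requests.

```python
import functools


class FakeConnection:
    """Stand-in for a real HTTP or database client; hypothetical, for illustration."""
    instances_created = 0

    def __init__(self):
        # Count constructions so reuse across requests is observable.
        FakeConnection.instances_created += 1

    def query(self, key):
        return f"value-for-{key}"


@functools.lru_cache(maxsize=None)
def get_connection():
    """Create the connection on first use, then reuse it for the life of the instance."""
    return FakeConnection()


def handler(event):
    """Hypothetical request handler: no heavy work happens at import time."""
    conn = get_connection()  # cached after the first request on this instance
    return conn.query(event["key"])
```

With this shape, a cold start pays only for importing the module, the first request pays for opening the connection, and every later request on the same instance skips both costs.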