Early-stage AI infrastructure startup building an agent deployment platform
Software Engineer, Infrastructure
Join a 12-person infrastructure team building the backbone for AI agent deployment and orchestration. You'll design and maintain cloud infrastructure that powers agent workloads across multiple cloud providers, working with Kubernetes, containerization, and infrastructure-as-code tools. This is a hybrid role in San Francisco where you'll own reliability, scalability, and observability for a platform serving AI development teams.
What we're looking for
- 4-7 years of experience in infrastructure, DevOps, or SRE roles
- Proficiency with Kubernetes and Docker, including running container orchestration at scale
- Strong experience with AWS and/or GCP, plus infrastructure-as-code tools like Terraform
- Hands-on expertise setting up and maintaining observability stacks (Prometheus, Grafana, or similar)
- Experience designing for high availability, security, and multi-cloud deployments