DevOps Engineer
U.S. GenAI startup, Cambridge Office
Full-Time Employment. We are committed to building a transformative AI platform that revolutionizes software development. Our goal is to enable you to have a long, impactful career with us, with opportunities for advancement. If you want a role where you can shape the future of AI-powered infrastructure, read on!
About Us
We are a Boston, MA-based Generative AI start-up on a mission to automate custom software creation to unlock the next industrial revolution. We're building an AI-powered platform capable of autonomously generating enterprise-grade software, powered by thousands of cooperative AI agents working in concert.
We're backed by multiple tier 1 investors, succeeded as founders at our previous start-up, and hold dozens of Generative AI patents.
Location: 1 Kendall Square, Cambridge, MA (In-person role)
About the Role
We're looking for an exceptional DevOps Engineer to architect and maintain the infrastructure that powers our revolutionary AI agent ecosystem. You'll be instrumental in building scalable, resilient systems that support both our cutting-edge AI platform and modern applications. This role offers the unique opportunity to work at the intersection of traditional DevOps and emerging AI infrastructure, creating systems that enable thousands of AI agents to collaborate seamlessly.
As our DevOps Engineer, you'll take ownership of our entire infrastructure stack, from Kubernetes orchestration to AI agent deployment pipelines. You'll work directly with our engineering teams to ensure our platform can scale to support enterprise customers while maintaining the performance and reliability they demand.
What Success Looks Like
- Architect and implement robust Kubernetes infrastructure that scales effortlessly to support our growing AI agent ecosystem
- Create sophisticated CI/CD pipelines that enable rapid, reliable deployment of both traditional services and AI agents
- Develop Python-based automation to eliminate manual tasks and accelerate development velocity
- Design monitoring and observability systems for deep insights into both infrastructure and AI agent performance
- Optimize cloud infrastructure for cost-efficiency while maintaining enterprise-grade reliability
- Collaborate effectively with development teams to improve developer experience and productivity
- Proactively identify and resolve infrastructure bottlenecks before they impact customers
- Establish infrastructure best practices to support rapid growth
- Build systems that handle the unique challenges of AI workloads at scale
- Maintain 99.9%+ uptime for critical production services
Core Infrastructure:
- Kubernetes cluster design, deployment, and management for AI and application workloads
- Infrastructure as Code using Terraform for multi-cloud environments
- Container orchestration and optimization for AI agent deployment
- Network architecture and security for distributed systems
Automation & Tooling:
- Python-based automation scripts for infrastructure management
- Helm chart development and maintenance for application deployment
- CI/CD pipeline design using modern DevOps tools
- Developer productivity tooling and automation
Monitoring & Reliability:
- Comprehensive monitoring, alerting, and tracing systems
- Performance optimization for AI workloads
- Incident response and disaster recovery planning
- Cost optimization and resource management
AI Infrastructure (Unique to Us):
- Infrastructure for AI agent orchestration and management
- MLOps pipeline integration
- Scalable systems for handling AI model deployment
- Resource optimization for GPU/compute-intensive workloads
Qualifications
- 5-8 years of DevOps/Infrastructure experience
- Expert-level Python proficiency for automation and scripting
- Deep Kubernetes expertise: deployment, scaling, troubleshooting, and optimization
- Strong experience with Helm for application package management
- Proven track record designing and implementing CI/CD pipelines
- Hands-on experience with major cloud platforms (AWS, Azure, or GCP)
- Terraform expertise for Infrastructure as Code
- Strong Linux administration and containerization (Docker) skills
- Experience with monitoring tools (Prometheus, Grafana, ELK stack)
- Understanding of microservices architecture and distributed systems
- CKA (Certified Kubernetes Administrator) or CKAD certification
- Experience with MLOps tools (MLflow, Kubeflow, Ray, etc.)
- Knowledge of AI/ML infrastructure requirements and optimization
- Experience with GPU orchestration and management
- API gateway and service mesh implementation (Istio, Linkerd)
- GitOps experience (ArgoCD, Flux)
- Experience scaling infrastructure for high-growth startups
- Contributions to open-source infrastructure projects
- Experience with multi-region, highly available deployments
- Background in security and compliance (SOC2, HIPAA)
What We Offer
- Competitive salary
- Comprehensive health, dental, and vision insurance
- 401(k) with company match
- Flexible PTO policy
- $5,000 annual professional development budget
- Latest hardware and software tools
- The opportunity to shape infrastructure for the future of software development
- Work with cutting-edge AI technology and world-class engineers
- Modern office in Cambridge's innovation hub
- Regular team events and activities
- The chance to solve novel infrastructure challenges at the intersection of DevOps and AI
Who we are: Our founding team consists of a Serial Gen AI Inventor and a successful Serial Entrepreneur. We work hard, maintain a curious mindset, and believe in a low-ego, high-output approach.
We move fast. Time is our most precious asset. We make decisions quickly and iterate rapidly, believing that a good decision today beats a perfect decision next week.
We have a Championship Mindset. We operate like a professional team, winning together by maintaining high standards, supporting each other, and staying laser-focused on our mission.
We have a Passion for Invention. As technologists pushing the boundaries of what's possible with AI, we thrive on solving problems that haven't been solved before.
What We Ask of You
This role requires someone who thrives in ambiguity and loves tackling unprecedented challenges. You'll be building infrastructure for a type of platform that's never existed before: one where thousands of AI agents collaborate to write software. This means being comfortable with rapid change, continuous learning, and creative problem-solving.
You should be excited about working independently while collaborating in-person with our team at our Cambridge headquarters. The ability to communicate complex technical concepts clearly and work effectively with both technical and non-technical stakeholders is essential.
To Apply
Apply with your resume and a brief note about:
- Your most challenging infrastructure project and how you solved it
- Why you're excited about building infrastructure for AI-powered software development
Here's what you can expect:
- Initial screening call (30 minutes)
- Technical discussion with our team (45 minutes)
- Deep dive system design (60 minutes)
- Final conversation with leadership (45 minutes)
- Offer discussion
We are an equal opportunity employer committed to building a diverse and inclusive team.