Director, Site Reliability Engineering
Come Join Our Passionate Team!
At Barracuda, we make the world a safer place. We believe every business deserves access to cloud-enabled, enterprise-grade security solutions that are easy to buy, deploy, and use. We protect email, networks, data, and applications with innovative solutions that grow and adapt with our customers’ journey. More than 220,000 organizations worldwide trust Barracuda to protect them — in ways they may not even know they are at risk — so they can focus on taking their business to the next level.
We know a diverse workforce adds to our collective value and strength as an organization. Barracuda Networks is proud to be an Equal Opportunity Employer, committed to equal employment opportunity and equitable compensation regardless of race, gender, religion, sex, sexual orientation, national origin, or disability.
Envision Yourself at Barracuda:
We are seeking a strategic and visionary Director of Site Reliability Engineering (SRE), in the Cloud Operations group, to lead global reliability initiatives across Barracuda’s SaaS portfolio. You will oversee a distributed team of Site Reliability Engineers and partner closely with Product Engineering, Security & Compliance, and other Cloud Operations teams to ensure our platforms are highly available, scalable, secure, and cost-efficient. This role will also drive AI-powered automation and agentic systems adoption to transform reliability operations.
What you will be working on:
- Strategic Leadership: Define and execute Barracuda’s global SRE strategy, aligning reliability goals with business objectives and customer SLAs.
- Operational Excellence: Drive continuous improvement in availability, latency, performance, and cost optimization across all cloud services.
- AI & Agentic Systems Integration: Implement AI-driven observability and anomaly detection for proactive incident prevention; deploy agentic automation systems to manage routine operational tasks, optimize cloud resources, and accelerate remediation workflows; explore LLM-based runbooks and autonomous agents for incident triage and root cause analysis.
- Cross-Functional Collaboration: Partner with Engineering, Security, and FinOps teams to embed reliability into product design and delivery pipelines.
- Architecture & Governance: Influence architectural decisions for reliability, disaster recovery, and observability systems; ensure compliance with security and regulatory standards.
- Automation & Tooling: Champion Infrastructure-as-Code and CI/CD automation at scale using Terraform, CloudFormation, GitHub Actions, and Jenkins.
- Incident & Risk Management: Facilitate incident response protocols, conduct executive-level postmortems, and implement proactive risk mitigation strategies.
- Service Level Management: Define and enforce SLIs and SLOs across global services; report reliability metrics to executive leadership.
- Team Development: Build and mentor a high-performing SRE organization; foster a culture of ownership, innovation, and collaboration across regions.
- Cloud Optimization: Lead initiatives for cost governance and performance tuning in AWS and Azure environments.
- Executive Communication: Present reliability roadmaps, KPIs, and risk assessments to senior leadership and stakeholders.
What you bring to the role:
- Experience: 12+ years in infrastructure, cloud operations, or SRE roles, including 5+ years in leadership positions managing distributed teams.
- Cloud Expertise: Deep knowledge of AWS and Azure architectures, security, and operations in large-scale SaaS environments.
- AI & Automation: Experience implementing AI-driven observability, predictive analytics, and autonomous remediation systems.
- Infrastructure as Code: Proven success implementing tools such as Terraform or CloudFormation at enterprise scale.
- CI/CD & Automation: Advanced experience with GitHub Actions, Jenkins, and deployment strategies (blue/green, canary, rolling).
- Container Orchestration: Expertise in Kubernetes (EKS, AKS) and containerized workloads.
- Observability & Resilience: Strong background in Prometheus, Grafana, ELK, and APM tools; experience designing self-healing systems.
- Programming: Proficiency in Python, Go, or similar languages for automation and tooling.
- Leadership Skills: Exceptional ability to lead globally distributed teams, influence cross-functional stakeholders, and drive cultural change.
- Certifications: AWS Solutions Architect/DevOps Professional and Kubernetes certifications (CKA, CKAD) preferred.
What You Will Get from Us:
- A leadership role where your vision shapes the reliability of mission-critical systems.
- Opportunities for career growth and executive visibility.
- High-quality health benefits, retirement plan with employer match, and flexible time off.
- The chance to work on cutting-edge cloud reliability challenges at scale.
The anticipated base salary range for this role is $180,000 to $240,000. Actual compensation offered will be dependent upon the individual's skills, experience, and qualifications as they directly relate to the requirements of the position, the budget for the position, and applicable employment laws. At Barracuda, we believe in fair and equitable compensation practices that reflect both market realities and the unique circumstances of each geographical location. We recognize that cost-of-living disparities, market conditions, and other factors can significantly impact compensation expectations in different regions. The compensation range provided in this job description is for illustrative purposes only and may not reflect the actual compensation offers for the position in your location. Final compensation will be determined based on a variety of factors, including the candidate's qualifications and experience.
#LI-hybrid