Summary:
We are seeking a Sr. Manager of DevOps to lead our DevOps team, focusing on database management, automation, and daily operational excellence. This hands-on leadership role is critical in ensuring system reliability, scalability, and performance while driving continuous improvement in CI/CD pipelines, infrastructure, and database operations. The ideal candidate will have deep expertise in managing databases, cloud environments, and DevOps best practices while also leading a team that supports enterprise applications and mission-critical workloads.
You will work closely with software engineers, database administrators, and IT operations teams to streamline deployments, enhance system performance, and ensure security compliance.
Responsibilities:
- Leadership and Team Management
  - Lead, mentor, and grow a high-performing DevOps team focused on automation, reliability, and performance.
  - Foster a collaborative, results-driven culture with a focus on operational excellence.
  - Define clear goals and KPIs for DevOps engineers and database administrators.
- Database Operations & Management
  - Oversee database infrastructure, ensuring high availability, security, and scalability.
  - Implement backup, recovery, and disaster recovery strategies for critical databases.
  - Work with application teams to optimize database performance and query efficiency.
  - Maintain database compliance with security and regulatory standards.
- Infrastructure & DevOps Automation
  - Design and implement scalable and automated infrastructure solutions.
  - Manage CI/CD pipelines to ensure fast, reliable, and secure deployments.
  - Optimize cloud-based and on-prem infrastructure for performance and cost efficiency.
  - Ensure infrastructure as code (IaC) best practices are followed for repeatability and consistency.
- Operational Excellence & Incident Management
  - Oversee day-to-day DevOps operations, ensuring system uptime and reliability.
  - Define and implement monitoring, alerting, and logging strategies for proactive issue resolution.
  - Establish incident response plans and lead root cause analysis (RCA) for system failures.
  - Collaborate with engineering teams to ensure system reliability and zero-downtime deployments.
- Security & Compliance
  - Enforce security best practices across infrastructure, applications, and databases.
  - Ensure compliance with industry regulations and internal security policies.
  - Partner with the security team to conduct regular audits and vulnerability assessments.
- Cross-Team Collaboration
  - Partner with software development, IT, and data engineering teams to align DevOps and database strategies with business goals.
  - Act as a bridge between development and operations to drive efficiency and innovation.
  - Work with stakeholders to implement new technologies that enhance DevOps capabilities.
Qualifications:
- 8+ years of experience in DevOps or Site Reliability Engineering (SRE).
- 3+ years of experience managing a DevOps team.
- Strong hands-on experience with SQL and NoSQL databases (PostgreSQL, MongoDB, etc.).
- Expertise in CI/CD pipelines, automation, and infrastructure as code (Terraform, Ansible, Kubernetes, etc.).
- Experience with AWS, Azure, or GCP for cloud-based infrastructure management.
- Knowledge of containerization (Docker, Kubernetes) and microservices architecture.
- Strong background in monitoring/logging tools (Prometheus, Grafana, Splunk, ELK, etc.).
- Understanding of networking, security best practices, and compliance frameworks.
- Excellent problem-solving, communication, and leadership skills.
Preferred Qualifications:
- Experience managing multi-cloud environments.
- Expertise in performance tuning and database optimization.