Sr. Site Reliability Engineer - Scale & Performance (Hybrid)

HashiCorp · Onsite

The Role


As a Senior Site Reliability Engineer on the Operational Readiness team, you will play a critical role in enhancing the scalability, performance, and reliability of HashiCorp's cloud products. With at least 5 years of experience in site reliability engineering or a related field, you will lead efforts to identify performance bottlenecks and to address and mitigate operational challenges before they impact our customers. Your expertise in load testing, performance analysis, and system hardening will ensure that our services meet the highest standards of operational excellence. With a holistic view of enterprise and cloud systems, you will strengthen our operational resilience and maintain the reliability of our enterprise and cloud-based products.

With a focus on overall quality, you will be at the forefront of ensuring high availability and performance across HashiCorp’s offerings. You will define system-wide strategies for product load and performance testing and provide expert execution of the resulting test plans. You will work with a wide variety of tools and explore new avenues to ensure that all products meet the essential operational readiness criteria. You will use advanced troubleshooting techniques, such as chaos experiments that simulate failures in the system, to identify, organize, and advocate for novel solutions that remediate customer impact in complex interconnected systems.

Key Responsibilities


  • Implement best practices for system reliability, including proactive identification of potential failure points and the development of automated mitigations.
  • Design and execute comprehensive load testing strategies to identify performance bottlenecks and scalability limits across our cloud products.
  • Implement best practices and technologies to improve system resilience, ensuring high availability and fault tolerance.
  • Work closely with engineering and product teams to integrate operational readiness into the development lifecycle, enhancing product stability and user satisfaction.
  • Build and refine tools and frameworks for automated testing, environment simulation, and incident reproduction, reducing manual effort and increasing test coverage.
  • Conduct in-depth analysis of testing results, documenting findings and making actionable recommendations for system enhancements.
  • Drive systemic improvements to the products by introducing chaos testing and partnering with product development teams.
  • Share your knowledge and expertise with team members, fostering a culture of learning and continuous improvement.
  • Develop and implement disaster recovery and backup strategies to ensure data integrity and system resilience.

Ideal Candidate


  • 5+ years of experience in SRE, systems engineering, or non-functional testing roles with a focus on operational readiness, performance testing, or system scalability.
  • Experience in driving systemic improvements through chaos engineering practices.
  • Programming skills in a high-level or scripting language.
  • Proven track record of leading successful load testing and performance optimization initiatives in cloud and on-prem environments.
  • Experience in creating and managing test environments for automated testing.
  • Strong grasp of CI/CD fundamentals and of maintaining quality pipelines.
  • Experience with version control systems (e.g., Git) and agile project management methodologies.
  • Understanding of monitoring and alerting systems, with the ability to develop metrics and alarms that accurately reflect system health and operational risks.
  • Strong technical foundation in cloud technologies (AWS, Azure, or GCP) and container orchestration technologies such as Nomad or Kubernetes.
  • Strong experience with performance testing tools such as k6, Artillery, Vegeta, or Locust.
  • Effective communication and collaboration skills, capable of working with cross-functional teams and articulating technical concepts to diverse audiences.
  • Familiarity with HashiCorp products and tools is a plus.
  • Exposure to the disaster recovery domain is a plus.

Life at HashiCorp

HashiCorp was founded by Mitchell Hashimoto and Armon Dadgar in 2012 with the goal of revolutionizing datacenter management: application development, delivery, and maintenance. The datacenter of today is very different from the datacenter of yesterday, and we think the datacenter of tomorrow is just around the corner. We're writing software to take you all the way from yesterday to today, and then safely to tomorrow and beyond.

Physical, virtual, containers. Private cloud, public cloud, hybrid cloud. IaaS, PaaS, SaaS. Windows, Linux, Mac. These are just some of the choices faced when architecting a datacenter of today. And the choice is not one or the other; instead, it is often a combination of many of these. HashiCorp builds tools to ease these decisions by presenting solutions that span the gaps. Our tools manage both physical machines and virtual machines, Windows and Linux, SaaS and IaaS, etc. And we're committed to supporting next-generation technologies as well.

HashiCorp was founded and continues to be run by the primary authors of all our core technologies, which power thousands of companies worldwide. We speak at conferences and write books related to application and infrastructure management. All our foundational technologies are open source and developed openly, and have been since 2010.

The Tao of HashiCorp is the foundation that guides our vision, roadmap, and product design. As you evaluate using or contributing to HashiCorp's products, it may be valuable to understand the motivations and intentions for our work. Learn more about the Tao of HashiCorp here: https://www.hashicorp.com/tao-of-hashicorp

Thrive Here & What We Value

  • Collaborative and Supportive Work Environment
  • Agile Methodologies
  • Customer-Centric Approach
  • Continuous Learning and Improvement
  • Innovation and Creativity
  • Outstanding Customer Experiences
  • Flexible Working Arrangements
  • Comprehensiveness over Point Solutions
  • Investment in Deployment Options