Sr Site Reliability Engineer

People Connect | Cm Washington | Bellevue, Washington, United States | Onsite
This job is no longer open

Do you aspire to take on a strategic, leadership-oriented role where you design and guide infrastructure at an architectural level? Are you passionate about identifying and solving complex operational challenges, improving system reliability, and driving modernization? Do you thrive on designing scalable, fault-tolerant systems and implementing automation that transforms on-prem applications into cloud-native solutions? If so, this role is the perfect next step in your journey!
As a Senior Site Reliability Engineer at Classmates.com, you will be responsible for designing, implementing, and maintaining the infrastructure and systems that power our applications and services. Collaborating closely with cross-functional teams, you will drive operational excellence, automate processes, and continuously improve system reliability. You’ll be a trusted specialist on complex technical and business challenges, leveraging your expertise in cloud technologies, automation, and performance optimization to shape the future of our platform. In this role, you will work collaboratively with the team, often multitasking, and consistently driving projects to completion.

Success in this position requires steadfast persistence, innovative thinking, the ability to interpret performance data effectively, and strong interpersonal skills. And if you can achieve all this while having fun, even better!

Location and Logistics


  • This is a hybrid role requiring 2-3 days per week in our Bellevue, WA office.
  • Local candidates are encouraged to apply.
  • Please note, we are unable to offer visa sponsorship, visa transfer, or corp-to-corp arrangements for this position.

Key Responsibilities:


  • Cloud Strategy and Architecture
    • Provide strategic leadership, mentorship, and a technical vision to advance site reliability engineering, DevOps, and a ‘cloud-first’ culture across the organization.
    • Define and implement scalable, secure, and cost-optimized cloud strategies that align with business goals and future growth.
    • Lead architectural decisions, establishing and enforcing best practices for cloud infrastructure design and operational excellence.
    • Drive modernization initiatives, transitioning legacy on-premise applications to cloud-native architectures using containerization, microservices, and serverless technologies.
    • Stay ahead of emerging cloud technologies, evaluating new tools and services to enhance performance, reliability, and developer self-service capabilities.

  • Infrastructure Automation and Design
    • Collaborate on designing, building, and maintaining scalable infrastructure across cloud and on-prem environments.
    • Architect and implement automated solutions to provision, monitor, and scale complex infrastructures, leveraging IaC tools like Terraform and Puppet, with a focus on modular, reusable designs.
    • Develop automation scripts, maintain CI/CD pipelines, and plan for scalability and capacity, conducting load testing as needed.

  • Reliability and Performance Engineering
    • Ensure system reliability, availability, and performance through monitoring, alerting, and incident response.
    • Implement and manage SLOs/SLIs to meet and exceed reliability standards.
    • Identify and address performance bottlenecks across the infrastructure and application stack.
    • Build and maintain observability solutions (e.g., monitoring, logging, and tracing) and improve system health dashboards.
    • Define and enforce best practices for reliability engineering, including failure injection and chaos engineering.

  • Security and Compliance
    • Implement security measures for cloud-native applications and ensure compliance with industry standards (SOC2, PCI, etc.).
    • Collaborate with security teams to respond to active threats, audit systems, and continuously update configurations.
    • Monitor security configurations and dashboards, ensuring proactive responses to potential vulnerabilities.

  • Incident Management and Root Cause Analysis
    • Participate in on-call rotations to provide 24/7 support for production environments.
    • Lead post-incident reviews, collaborating with cross-functional teams to identify systemic improvements.
    • Establish metrics for tracking incident frequency, response times, and resolution effectiveness.
    • Proactively test system resilience through Chaos Engineering experiments and failure injection.
    • Create and maintain runbooks and operational documentation to drive continuous improvement.

  • Disaster Recovery and Business Continuity
    • Design and test disaster recovery (DR) and business continuity strategies, ensuring backup and failover mechanisms are effective.
    • Develop and implement testing schedules for DR strategies to validate readiness and compliance.

  • Cost Management and Financial Optimization
    • Monitor cloud usage and lead FinOps initiatives to control and optimize infrastructure costs.
    • Collaborate with stakeholders to drive financial accountability and efficiency across engineering teams.

  • Collaboration, Knowledge Sharing, and Communication
    • Collaborate across teams to ensure alignment and effective project implementation.
    • Communicate during incidents and changes, providing transparency to stakeholders.
    • Mentor and share knowledge with team members to foster a culture of continuous learning and innovation.
    • Facilitate the evaluation and adoption of tools and technologies that enhance team productivity.

    Qualifications:


    • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field, or equivalent experience.
    • 5+ years of experience as a Site Reliability Engineer or in a similar role, working with highly available production environments.
    • Proficiency in AWS and containerization technologies like Kubernetes and Docker.
    • Strong experience with Infrastructure as Code (IaC) using Terraform, with automation scripting skills in Python, Bash/Shell, or Go.
    • Deep knowledge of Linux/Unix systems and networking fundamentals (e.g., TCP/IP, DNS, HTTP, VPN).
    • Experience with monitoring and observability tools (e.g., Datadog, Prometheus, Grafana) and incident management.
    • Familiarity with CI/CD pipelines, preferably using tools like GitLab, and strong knowledge of DevOps practices.
    • Excellent troubleshooting skills, with experience in performance optimization and root cause analysis.
    • Strong communication and collaboration skills.
    • Bonus skills: experience with Rundeck, Java, Spring Framework, Terragrunt, Puppet, Vector, Loki, VictoriaMetrics, and additional cloud platforms (e.g., GCP, Azure), as well as relevant certifications such as AWS Solutions Architect or Certified Kubernetes Administrator (CKA).

    Classmates


     Classmates is the premier online, social, and mobile destination for reconnecting with the people from your high school years. Classmates offers the largest digitized collection of high school yearbooks online, with over 450,000 available to view, tag, sign, and share, and has the most comprehensive directory of high schools and class lists from the 1940s to today. 

    Salary Range:


    Min: $152,700
    Mid: $170,800
    Max: $190,600

    The pay range reflects the salary amount the Company reasonably expects to pay for the position. It is not a guarantee of actual compensation or a specific payment amount to any candidate. The actual compensation will depend on numerous factors including, without limitation, a particular candidate’s experience and qualifications.

    The Company's Applicant and Worker Privacy Notice can be found here.

    PeopleConnect is an equal opportunity employer.


    Local area candidates are encouraged to apply. Please note that we are not able to offer visa sponsorship, visa transfer, or corp-to-corp arrangements.

    Note for Principal Agencies


     - Principal agents should not forward resumes to PeopleConnect, as we will not be responsible for any fees arising from the use of resumes submitted from agencies without a prior written and signed agreement and authorized job order for this position in place.


    Life at People Connect

    We started as two companies with deep Seattle roots, both in the business of finding, and managing, information about people. These companies were brought together to revolutionize an industry that’s filled with dark corners and reactive strategy. Using the combined resources of Classmates’ one-of-a-kind social network, Intelius’ proprietary people profiles, and the merged talents and experience of our employees, we aim to shine a light in those dark corners and be the digital identity company that empowers consumers. Located in the heart of Downtown Seattle, we work in small empowered teams that create a big impact. We are agile and down-to-earth, collaborate toward common goals, and enjoy a balance between strategic and tactical as well as work and life. Our core values are deeply focused on balance, collaboration, continual improvement
    Thrive Here & What We Value

    1. Innovative: Embracing new ideas and technologies.
    2. Creative: Fostering a culture of imagination and problem-solving.
    3. Collaborative: Encouraging teamwork and knowledge sharing.
    4. Talented: Attracting and developing skilled individuals.
    5. Diverse: Valuing different perspectives and experiences.
    6. Supportive: Providing a positive and nurturing environment.
    7. Adaptable: Embracing change and adapting to new challenges.
    8. Customer-focused: Prioritizing customer satisfaction and needs.
    9. Accountable: Upholding responsibility and accountability.
    10. Growth-oriented: Striving for continuous improvement and development.