About our Client
Their goal is to save ten million hard-working employees ten billion dollars. They are a values-driven, well-funded, and fast-growing Financial Technology and HR company. They aim to empower small and midsize businesses with financial tools that make them the place where people want to work.
Our client has created a financial empowerment platform that helps small but mighty HR teams make a big impact on employee financial wellness. This platform is quickly becoming the employee financial wellness super-app that employees can’t live without and employers are eager to offer to attract and retain talent.
They were recognized for rapid growth in the 2023 Deloitte Technology Fast 500 and Canadian Technology Fast 50 programs.
About the Role
We are looking for a Senior Site Reliability Engineer to enhance our client's cloud infrastructure with complex AWS builds, infrastructure-as-code, and observability/logging/APM solutions. You'll be part of an embedded reliability team, working alongside application and data engineers to monitor, benchmark, and scale the client's products. This role offers the opportunity to work with cutting-edge technologies and leverage AWS services while bridging the gap between bare metal infrastructure and a Ruby on Rails production environment.
Your priorities?
Predictability, reliability, and scalability.
RESPONSIBILITIES
- Develop and maintain infrastructure-as-code CloudFormation templates, focusing on serverless and container resources (ECS, Fargate, Lambda).
- Implement instrumentation and performance monitoring for infrastructure and Ruby on Rails applications using AWS tools (Athena, CloudTrail) and third-party observability platforms (Datadog, OpenTelemetry).
- Manage deployment pipelines, including blue/green deployments and intelligent auto-scaling strategies.
- Oversee database and caching solutions (RDS, ElastiCache/Redis), handling updates, playbooks, and downtime planning.
- Optimize AWS cost management, including forecasting and implementing savings programs (reserved instances, auto-scaling strategies).
- Collaborate with risk and security teams to ensure SOC 2 compliance and strengthen cybersecurity practices.
- Work closely with application developers on shared metrics, database performance, and load testing.
- Partner with data engineers to support data warehouse development, ELT, and ETL processes.
- Participate in agile development practices, including sprint planning, backlog grooming, and stand-ups.
- Adhere to secure coding practices and software development lifecycle (SDLC) standards.
WHAT YOU BRING
- 5+ years of infrastructure experience.
- 2+ years of AWS experience, including production deployments (certifications preferred).
- Proficiency with Infrastructure-as-Code (IaC) tools, specifically CloudFormation.
- Experience with containerization (Docker, ECS, ECR).
- Strong background in performance monitoring and observability, using tools like Datadog, New Relic, and OpenTelemetry.
- Ability to iterate quickly for experimentation and build scalable, maintainable solutions for core functionality.
- Strong SQL and data analysis skills with a problem-solving mindset.
Compensation: CAD 120,000-150,000