The role:
- As a member of our Technology team, the Site Reliability Engineer will be on an on-call rotation to respond to incidents that impact Newsela.com availability and provide support for developers during internal and external incidents.
- Maintain and assist in extending our infrastructure with Terraform, GitHub Actions CI/CD, Prefect, and AWS services.
- Build monitoring that alerts on early symptoms rather than only on outages, using Datadog, Sentry, and CloudWatch.
- Look for ways to turn repeatable manual actions into automations to reduce on-call toil.
- Improve operational processes (such as deployments, releases, and migrations) to make them run seamlessly with fault tolerance in mind.
- Design, build, and maintain core cloud infrastructure on AWS and GCP that enables scaling to support thousands of concurrent users.
- Debug production issues across services and levels of the stack.
- Provide infrastructure and architectural planning support as an embedded team member within a domain of Newsela’s application developers.
Why you’ll love this role:
- As a member of our growing Technology team, you will have the opportunity to make a real and immediate impact by:
  - being involved in the growth of Newsela’s infrastructure.
  - influencing improved resiliency and reliability of the Newsela product.
- You'll impact Newsela.com's availability, which will ultimately scale Newsela’s ability to bring engaging, culturally responsive learning content to K-12 classrooms nationwide.
Why you’re a great fit:
- 2+ years of experience as a Site Reliability Engineer.
- Background in infrastructure as code: using Terraform and GitHub CI/CD for automation, containerizing our environments (Docker, ECS), and leveraging cloud technologies to meet our goals.
- Systems experience managing, configuring, and troubleshooting operating systems, storage (block and object), and networking (VPCs, proxies, and CDNs), and administering high-availability datastores (MySQL, Postgres, Neo4j) and Redis clusters.
- Monitoring and instrumentation: implementing metrics in Datadog, Sentry, log management and related systems, and Slack/Jira integrations.
- Understanding of engineering practices: availability, reliability and scalability, as well as disaster recovery.
- Ability to work in a variety of languages: Shell, IaC, Python, and SQL.
- Ability to plan using your familiarity with agile methodologies, using epics and issues to drive projects.
- Personal and team workload organization and ability to self-organize and accomplish tasks asynchronously.
- Contributing to Newsela architecture diagrams, process diagrams and runbook documentation.
- Completing Root Cause Analysis (RCA) investigations and performing readiness reviews.
- Improving team practices through code reviews, handoffs of work, and incidents.
- Self-awareness, handling conflict in the team, providing and receiving feedback, and maintaining good relationships with other engineering teams.
- Willingness to proactively step in and do the right thing while providing candid and constructive feedback.
Why you’ll love working at Newsela:
- Health & Wellness: Access to the world’s leading medical experts for healthcare (pets included!). Discounts and resources to stay healthy: mind, body, and soul.
- Work From Home: Almost all of our roles are fully remote - tech stipend included!
- Supporting ALL Families: Supplemental programs and time off to take care of your family and yourself.
- Time Off: Flexible PTO to recharge, including sabbatical leave.
- Professional Development: Annual stipends for continued learning and education.
- Make A Difference: No matter your role or department, the work you do each day helps shape the future of education and improves the lives of students and teachers.
Base Compensation: $95,000 - $105,000. Total compensation for this role also includes incentive stock options and benefits. This compensation range may be adjusted based on actual experience.