Site Reliability Engineer - Observability

Emerald Cloud Lab, Austin, Texas, United States (Remote)
This job is no longer open

The Emerald Cloud Laboratory (ECL) enables life scientists to move out of the lab, and to conduct research entirely from a computer. Stepping away from manual completion of experiments at the bench, scientists on the ECL leverage the remote, automated execution of all standard biology and chemistry experiments in Emerald’s industrial lab facilities, working within a software platform for all stages of research workflows, from experimental design to data analysis. Our system empowers scientists at Big Pharma companies, startups, and academic laboratories by allowing them to run wet lab experiments from anywhere in the world without ever setting foot in the lab.

The Team:

Site Reliability Engineering at ECL is responsible for the security, reliability, and capacity of the software and virtual machines used to develop and run both our application and our laboratory, as well as the development and improvement of internal specialty applications and integrations.

You will be joining a tight-knit and interdisciplinary team. Our methodology relies heavily on automation, infrastructure-as-code, and continuous integration and deployment.

Our Responsibilities:

  • Design and develop processes and tools to automate and audit all aspects of development and production environments and databases for the ECL cloud application backend
  • Continuously improve our set of in-house Go and Python facilities for automating container builds and deployments, and our bespoke Wolfram Language-based automated unit testing environment
  • Develop applications related to laboratory systems:
      • Automated provisioning and deployment of Wolfram Enterprise Private Cloud instances for integration with our customer-facing Command Center application
      • Development of domain-specific language infrastructure in support of ECL's Symbolic Lab Language
  • Coordinate with and advise other teams to plan and execute releases of application upgrades, new services, and migrations to new architectures or infrastructures, without degradation or interruption of service
  • Efficiently and dynamically prioritize ad hoc requests alongside roadmap initiatives
  • Coordinate with IT where on-premises and cloud infrastructure intersect
  • Evaluate and integrate open-source and commercial tools to serve the above purposes

Our Technology Stack:

  • Execution environment: Kubernetes on AWS EKS; AWS Lambda and Fargate
  • Languages: Python; Wolfram Language; Go; shell scripting
  • Database: AWS Aurora PostgreSQL
  • Other infrastructure: GitHub; Docker Hub; Ubuntu, Debian, Alpine; Envoy+Contour; Terraform; Alertmanager; PagerDuty; SendGrid; Auth0; Serverless
  • Observability infrastructure: Prometheus; Grafana; OpenTelemetry (OTel); Honeycomb; AWS CloudWatch
  • AWS services: EC2; EKS; RDS; ELB/ALB/NLB; IAM; S3; Certificate Manager; CloudWatch; Route 53; ElastiCache; SQS; VPC; premises-to-cloud VPN; security groups; CloudFront

Required Skills and Experience:

  • Coding in Python/Go: Proficient in developing and automating solutions to enhance infrastructure reliability and performance.
  • Observability Setup: Adept at implementing comprehensive observability solutions, including distributed tracing with OpenTelemetry (OTel), creating actionable dashboards using Grafana, and setting up effective monitoring and alerting systems. Experience setting up front-end observability is preferred.
  • SLI/SLO Metrics: Proven track record of setting up Service Level Indicators (SLIs), Service Level Objectives (SLOs), and other key performance metrics to ensure service reliability and performance.
  • DevOps Practices: Proficient in CI/CD tools, with experience in automating deployment pipelines and seamlessly deploying applications to Kubernetes from source control management (SCM).
  • Cloud Administration (AWS preferred): Skilled in cloud infrastructure management, with hands-on experience in AWS (EKS, Fargate, IAM, S3, VPC).
  • Cloud Networking & Security: Deep understanding of cloud networking and security concepts, including VPCs, VPNs, subnets, and security best practices.
  • Infrastructure Automation: Ability to automate infrastructure provisioning using Terraform.
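
For readers less familiar with the SLI/SLO terminology above, the relationship between the two can be sketched in a few lines of Python. This is an illustrative sketch only, not a description of ECL's tooling: the function name `evaluate_slo`, the `SloReport` fields, and the 99.9% availability target are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloReport:
    sli: float              # observed fraction of good requests
    error_budget: float     # allowed fraction of bad requests (1 - target)
    budget_consumed: float  # share of the error budget already spent


def evaluate_slo(good_requests: int, total_requests: int, slo_target: float) -> SloReport:
    """Compare an availability SLI against a hypothetical SLO target."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= good_requests <= total_requests:
        raise ValueError("good_requests must be between 0 and total_requests")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")

    bad_requests = total_requests - good_requests
    sli = good_requests / total_requests
    error_budget = 1.0 - slo_target
    # A value above 1.0 means the objective was missed for this window.
    budget_consumed = (bad_requests / total_requests) / error_budget
    return SloReport(sli=sli, error_budget=error_budget, budget_consumed=budget_consumed)


if __name__ == "__main__":
    # Assumed numbers: 50 failed requests out of 100,000 against a 99.9% target.
    report = evaluate_slo(good_requests=99_950, total_requests=100_000, slo_target=0.999)
    print(f"SLI: {report.sli:.4%}, error budget consumed: {report.budget_consumed:.1%}")
```

In practice the request counts would come from a metrics backend such as Prometheus rather than from hard-coded values, and alerting would typically fire on how quickly the budget is being consumed rather than on a single snapshot.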

Ideal Candidate: 

  • Extensive experience building comprehensive observability solutions for an end-to-end distributed system (microservices deployed on Kubernetes)
  • Proven track record of setting up front-end observability solutions for web-based and desktop applications

About ECL: https://www.emeraldcloudlab.com

The Emerald Cloud Laboratory (ECL) enables life scientists to move out of the lab, and to conduct research entirely from a computer. Stepping away from manual completion of experiments at the bench, scientists on the ECL leverage the remote, automated execution of all standard biology and chemistry experiments in Emerald’s industrial lab facilities, working within a software platform for all stages of research workflows, from experimental design to data analysis.

Optional but welcome: A link to your GitHub account or any projects you are proud of can be especially helpful. With project links, please include a short remark to help us get our bearings.

At Emerald Cloud Lab, we are committed to pioneering the future of scientific research by providing an innovative, cloud-based laboratory environment. We believe in the power of collaboration, diversity, and the continuous pursuit of knowledge to drive groundbreaking discoveries. If you are passionate about reshaping the landscape of scientific experimentation and eager to contribute to a culture of excellence and innovation, we invite you to join us.

Life at Emerald Cloud Lab

At Emerald Cloud Lab our mission is to empower scientists to transcend the laboratory. ECL was founded by scientists, for scientists. Our vision is to build a system that sweeps aside the daily grind scientists face in the laboratory and allows the day-to-day work to center on orchestrating science. There is transformative potential in a world where scientific ideas have a more direct route to realization and where progress in science and medicine is driven by the strength of our ideas more so than by our labor in the lab. Succeeding in this mission has the chance to provide unprecedented leverage and autonomy to scientists worldwide and in doing so to accelerate the rate of progress in pharmaceutical and biotechnology research, medical diagnostics, agricultural research, life science, and materials science.

Thrive Here & What We Value:

  • Collaborative environment
  • Continuous pursuit of knowledge for groundbreaking discoveries
  • Innovative cloud-based laboratory environment
  • Emphasis on collaboration, diversity, innovation
  • World-class design and engineering
  • Commitment to reshaping scientific experimentation