Site Reliability Engineer (Node Operator / Restaking Systems / AVSs)

Nethermind | Worldwide | Remote
This job is no longer open

What are we all about?


We are a team of builders and researchers on a mission to empower enterprises and developers worldwide to access and build on decentralized systems.

Our expertise covers several domains: Ethereum and Starknet protocol engineering, layer-2, cryptography research, protocol research, decentralized finance (DeFi), security auditing, formal verification, real-time monitoring, smart contract development, and dapps and enterprise engineering.

Working to solve some of the most challenging problems in the blockchain space, we frequently collaborate with organizations such as the Ethereum Foundation, Starknet Foundation, Gnosis Chain, Flashbots, Forta Protocol, Lido, EigenLayer, OpenZeppelin, RISC Zero, Aleph Zero, and many more.

Today, there are nearly 200 of us working remotely from more than 45 countries.

View all our open positions here: https://www.nethermind.io/open-roles

The Role


We are looking to onboard an accountable SRE to join our DevOps and SRE team, with a focus on operating Ethereum validator systems and EigenLayer AVSs (Actively Validated Services) for restaking solutions. You will be part of the Nethermind team accountable for managing Nethermind Node Operator services and duties. You will be responsible for deploying, monitoring, maintaining, and troubleshooting Ethereum validators on the blockchain network, as well as other production systems. You will work remotely to cover a different time zone and collaborate closely with the team to ensure smooth operations, automate tasks, document processes, and continuously improve the system.

Responsibilities:


  • Responsible for monitoring and maintaining production systems, including Ethereum validators, blockchain nodes, AVSs, and other applications. This involves setting up monitoring tools, troubleshooting issues, performing regular maintenance tasks to ensure optimal performance, and implementing custom tooling if required.
  • In the event of an incident or outage, the SRE will be responsible for quickly identifying the root cause of the issue and implementing a fix to restore service. This may require working outside of normal business hours to respond to incidents in a timely manner.
  • Work intensively with container orchestration technologies and continuously optimize infrastructure costs.
  • Responsible for documenting processes, procedures, post-incident reports, and best practices related to running our services in production. This documentation will help ensure consistency and quality across the team, and will also serve as a reference for future team members.
  • Collaborate closely with other members of the team to ensure that all production services, especially Ethereum validators, are running smoothly and that any issues are addressed quickly. This may include participating in on-call rotations, attending team meetings, and working on cross-functional projects with other teams.
  • Responsible for automating as many tasks as possible to reduce the manual work required to manage infrastructure. This includes scripting, developing tools, and setting up automation using Terraform and CI/CD to streamline processes.
  • Responsible for continuously improving the processes, procedures, and tools used to manage blockchain nodes and validators. This includes identifying areas for improvement, implementing changes, and measuring the impact of those changes to ensure they are effective.
  • Responsible for evaluating business needs and producing designs that deliver the assigned projects.
  • Provide systems expertise and drive operational best practices. Responsible for setting up and maintaining system performance monitoring.

In this role, you should have:


  • Infrastructure as Code (IaC) experience on any cloud platform, preferably AWS and GCP.
  • Proficiency in Linux operating system and command-line tools.
  • Skills in programming languages such as Python, Golang, or Bash.
  • Experience with CI/CD pipelines and automation frameworks, preferably Argo CD.
  • Proficiency with containerization technologies such as Docker with Docker Compose and Kubernetes.
  • Familiarity and experience working with Helm Charts.
  • Experience designing and implementing systems with high availability, reliability, security, and cost optimization in mind.
  • Experience performing proactive analysis of infrastructure capacity and performance, as well as system backup and recovery.
  • Experience ensuring that security systems and appliances are functional and continuously improved for proactive cyber defense.
  • Ability to act as a role model for technical competence, helpfulness, facilitation of learning, and teamwork.
  • Experience with monitoring and alerting tools such as Prometheus and Grafana.
  • Strong troubleshooting and problem-solving skills and excellent communication and collaboration skills.
  • Ability to work independently and remotely, while also being a team player.

Nice to have skills


  • Expertise in maintaining blockchain nodes and validators, especially Ethereum's.
  • Experience with Kubernetes cluster deployment strategies using Argo CD.
  • Scripting proficiency in multiple languages such as Bash, Python, Golang, or others.

Disclaimer: I hereby consent to my personal information being stored and processed by Demerzel Solutions Limited (t/a Nethermind) (the “Company”) for recruitment purposes in relation to both the selected job role and any other role the Company considers me a qualified candidate for. All data storing and processing by the Company takes place in accordance with the UK GDPR. Kindly refer to our privacy policy for more details. 


Your consent to share personal information is entirely voluntary, and you may withdraw your consent at any time. Should you have any questions about this process, or wish to withdraw your consent, please contact: legalnotices@nethermind.io

Keep up to date on what we are working on by following us on our social channels.



Life at Nethermind

Nethermind has a world-class team of builders and researchers with expertise in Ethereum, protocol engineering, layer 2 scaling, decentralized finance, smart contract development, and enterprise blockchain. We provide technology, R&D, and consulting services for blockchain and DeFi businesses. The Nethermind team actively contributes to Ethereum core development and supports many Ethereum projects to help further develop the ecosystem. Working with amazing partners such as StarkWare, POA, EWF, Baseline, Provide, and many more, Nethermind is building the future of blockchain and DeFi. Build with Nethermind.

GitHub: https://github.com/NethermindEth
Contact us: hello@nethermind.io
For technical assistance, join us on Discord: https://discord.gg/PaCMRFdvWT

Thrive Here & What We Value

  • Collaborative Environment
  • Professional Growth Opportunities
  • Flexible Working Arrangements
  • Equity Opportunities
  • Continuous Learning & Development
  • Problem Solving & Analytical Skills
  • Mission-driven Company Culture
  • Open Positions Available for Viewers to Apply