About Us
Founded in 2018, Bakkt builds technology that connects commerce. Our vision is to connect the digital economy by offering one ecosystem for cryptocurrency and digital assets, loyalty, and commerce. We enable our partners and clients to deliver new opportunities to their customers through SaaS and API solutions that unlock crypto and drive loyalty, powering engagement and performance. Come build with us.
As a Site Reliability Engineer, you will be responsible for closely monitoring our production environments, swiftly addressing issues, and applying creative solutions to ensure the seamless operation of our platform. You will utilize your natural curiosity and strong problem-solving skills to investigate and resolve technical issues across our applications, services, databases & infrastructure.
Responsibilities
- Observability:
- Implement and manage robust monitoring systems to continuously track the functional and non-functional health and performance of our production systems.
- Proactively identify anomalies and potential issues before they impact our clients.
- Client Support:
- Partner with software engineering, project management, and customer success teams to respond to client requests and support inquiries.
- Work closely with our clients to provide support during integration, and ensure a positive experience.
- Incident Management:
- Lead escalation remediations by working across multiple teams, such as software engineering, DevOps, and project management, for web applications and services running in a 24/7, always-on cloud platform environment.
- Participate in an on-call rotation to address and resolve critical incidents outside of regular business hours.
- Operations:
- Execute and develop operational procedures necessary for service requests and incident response.
- Maintain critical platform support knowledge, such as customer contact lists, vendor escalation procedures, scheduled job inventories, and operational playbooks.
- Support planning and execution of production changes and software releases.
- Automation:
- Develop scripts and tools to automate repetitive tasks, streamline workflows, and improve the efficiency of the production support process.
- Assist in the automation of customer operational tasks and ensure alignment with business requirements for customer-facing processes such as customer order reconciliation.
- Ensure timely execution of scheduled and repeatable processes such as periodic system validations, daily triage, system monitoring and event log management.
- Continuous Improvement:
- Actively participate in process improvement initiatives, suggesting enhancements to observability, logging strategies, incident response procedures, and support workflows.
Requirements
- A bachelor’s degree in Computer Science, Information Technology or equivalent
- 5+ years of application and production support experience on cloud-based platforms using an SRE support model.
- Proven track record in a production support/SRE role, demonstrating your ability to monitor and troubleshoot complex systems in highly available production environments.
- Experience with common development tools and practices, including Java-based Spring Boot environments and source control tools such as Git, in a team environment.
- Demonstrated ability to understand application logs and to support various monitoring and visualization tools (e.g. AlertSite, Logstash, Datadog).
- Excellent communication skills, both written and verbal, for effective interaction with technical and non-technical stakeholders.
- Self-starter who can work independently and effectively across functional team environments.
- Proven ability to learn new IT technologies and disciplines.
Preferred
- Ability to read and interpret Java, Angular, SQL and other software coding languages
- Experience with GCP, Google Kubernetes Engine, Google Compute Engine
- Experience with n-tier web and services application architectures in Java-based Spring Boot and Tomcat environments.
- Working knowledge of SQL Server
- Experience with JIRA or other Service Desk tools
- Experience with multiple OS platforms (Linux, Windows)
- Experience with MongoDB and a scripting language such as Python
Bakkt is devoted to having diversity in its workforce and is proud to be an equal opportunity employer. Bakkt does not make any employment decisions based on race, color, religion, sex, national origin, veteran status, disability, age, sexual orientation, gender identity, or any other characteristic protected by law.