We engage the most inspired minds to do their best work wherever they work best—powering the freedom to create worldwide.
WP Engine is the most trusted WordPress technology company, powering over 1.5M digital experiences in 150+ countries for businesses of all sizes. WP Engine’s all-in-one platform enables customers to design, build, power, and manage extraordinary WordPress, eCommerce, and headless sites—all thanks to a nonstop commitment to innovation, award-winning WordPress expertise, and a set of core values that guides us every day. Since launching in 2010, WP Engine has become the world’s leading WordPress Digital Experience Platform (DXP), now with over 185,000 customers in 150 countries.
We’re proud of the technology and service we offer our customers that move their businesses forward faster. At WP Engine, we strive to do the best work of our careers and feel empowered to do what is right for our customers. We love investing in employee success and uncovering opportunities for you, the best, to get better. 99 percent of employees believe you’re made to feel welcome at WP Engine. Be you. Be here.
What's cool about this job
The evolution of our platform is required for our scale, and we are searching for an experienced site reliability engineer to join our rapidly growing engineering team. We are actively introducing Machine Learning into our platform to improve operations and visibility. If you are an engineer experienced with technology transformations, service-oriented architectures, and mentoring engineers, and you are motivated by scale, you may be the engineer we are looking for.
The day to day
Work with a team of passionate engineers to build the core AIOps framework that keeps WP Engine running smoothly.
Constantly look for opportunities to automate and optimize.
Write, test, integrate, and debug software applications that are resilient, secure, maintainable, and perform at scale.
Continuously improve WP Engine’s secure, performant platform that supports tens of millions of end users.
Improve problem detection and monitoring.
Use and improve WP Engine's continuous integration and delivery pipelines to safely and rapidly push new code to production.
Expand and fine-tune observability services to aid engineering and support efforts by creating better feedback loops using ETL pipelines, monitoring tools and machine learning.
Design, develop, and maintain robust platform infrastructure that supports ML workflows and data pipelines.
Implement scalable and automated processes for deploying machine learning models into production environments.
Develop and maintain automation tools to streamline ML operations, model training, and deployment processes.
Establish monitoring systems to ensure the health, performance, and reliability of WP Engine platforms.
Partner with leadership to define, document, and communicate technical details and policy in support of Production Engineering.
Give and receive feedback effectively both within and outside of your team.
Work closely with stakeholders and Product Management to ensure the Production Engineering team is appropriately engaged in, and aware of, feature and product releases.
Participate in the on-call rotation, and determine and implement solutions to reduce production interrupts.
Your expertise and passion
5+ years of experience in a site reliability engineer role
Proficient in programming
Experience with Machine Learning and AIOps
Experience with cloud technologies and services - GCP, AWS
Proven history of continuous learning and ability to stay ahead of technology trends
Proactive with natural problem-solving abilities, an inquisitive personality, a continuous learning approach, and an eagerness to tackle big problems even with uncertain requirements
Experience with a Kubernetes environment at large scale
On-call experience for critical services with good troubleshooting skills
Bachelor’s degree in Computer Science (or a related field) OR equivalent experience
Helpful experience
Familiarity with the Go programming language (Golang)
Familiarity with the Python programming language
Familiarity with CI/CD, IaC, and common tools (Terraform, Helm, Cloud Build, Jenkins)
Experience with ML tools such as Google’s Vertex AI platform, LangChain, PyTorch, TensorFlow, or similar tools
Experience fine-tuning LLMs
Elasticsearch administration or development experience
This role involves on-call work
On-call is a weekly rotation among the team members
Level-two escalation on a follow-the-sun model
The Perks and Benefits
Company Stock Options (Every employee is an owner in the company)
Health Benefits (100% Paid Employee Medical, Dental, and Vision)
Pension Scheme with a match
Life Insurance and Income Protection (100% Paid)
Supplementary Maternity & Paternity Pay and Caregiver’s Leave
Employee Assistance Program
Generous Vacation Time (Who doesn’t like time off?)
Home Office Stipend
Tax-free annual wellness benefit through Clevercards
4 Company Wellness Days
Ongoing education through LinkedIn Learning, Workday Learning, and our Career Growth Portal
Free annual subscription to Calm
At WP Engine, we strive to have the broadest possible view of diversity, going beyond visible differences to include the background, experiences, skills, and perspectives that make each person unique. WP Engine is proud to be an equal opportunity workplace and is committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, Veteran status, or any other basis protected by federal, state, or local law.