The Platform team provides the foundation that empowers Fair’s engineers to build incredible things. We are expanding our team with a remote Senior Site Reliability Engineer position.
You will report to the Platform Engineering Manager. You are customer-centric and have demonstrated expertise in multiple areas of software engineering, and you are especially passionate about building operational excellence across our platform and delivering “five nines” (99.999%) availability. This is a great opportunity for you if you have experience dealing with issues of scale, debugging low-level production problems, and improving the availability of systems.
Responsibilities:
Oversee the site reliability and operation of our infrastructure and platform
Design and champion SRE best practices from idea conception to delivery
Maintain and evolve our Golang-based applications to provide a great experience for our customers (other engineers in the organization)
Help grow the SRE wing of our platform team
Maximally automate processes to promote human-free operations
Improve metrics on quality of service, incidents and availability
Participate in our follow-the-sun on-call duties while focusing on reducing incidents and the need for help through creative technological solutions or processes
Help troubleshoot and debug production issues across all of our services
Required Qualifications:
Bachelor's degree in computer science, applied mathematics, or a related field, or 5 years of equivalent work experience
8+ years of relevant work experience
Relevant certifications in AWS, Kubernetes, Linux, database administration, networking, security, or Six Sigma are welcome
5+ years of experience in reliability engineering, software engineering, systems engineering, platform engineering, SRE, ops, or similar fields
Expert knowledge of Go
Expert knowledge of Linux
Deep systems, cloud and infrastructure knowledge
Experience with the Python, Ruby, and Bash scripting languages
Familiarity with AWS, Docker, Kubernetes, and Terraform
Experience with integration and end-to-end testing in microservices environments
Experience with microservices, distributed logging and tracing
Experience managing monitoring systems
Experience building CI/CD pipelines
Familiarity with security concepts and best practices
Experience with computer networks
Our Benefits:
100% coverage of medical, dental and vision benefits for employees AND their families