Published over 3 years ago


Headquarters: Santa Monica, California, USA
URL: https://www.fair.com/

The Platform team provides the foundation that empowers Fair’s engineers to build incredible things. We are expanding our team with a remote Senior Site Reliability Engineer position.

Reporting to the Platform Engineering Manager, you are customer-centric and have demonstrated expertise in multiple areas of software engineering, with a passion for building operational excellence across our platform and delivering “5 Nines Availability”. This is a great opportunity for you if you have experience dealing with issues of scale, debugging low-level production problems, and improving the availability of systems.

Responsibilities:
  • Oversee the site reliability and operation of our infrastructure and platform
  • Design and champion SRE best practices from idea conception to delivery
  • Maintain and evolve our Golang-based applications to provide a great experience for our customers (other engineers in the organization)
  • Help grow the SRE wing of our platform team
  • Maximally automate processes to promote human-free operations
  • Improve metrics on quality of service, incidents and availability
  • Participate in our follow-the-sun on-call duties while focusing on reducing incidents and the need for help through creative technological solutions or processes
  • Help troubleshoot and debug production issues across all of our services

Required Qualifications:
  • Bachelor's degree in computer science, applied mathematics or related field or 5 years of equivalent work experience
  • 8+ years of relevant work experience
  • Relevant certifications in AWS, Kubernetes, Linux, database administration, networking, security, or Six Sigma are welcome
  • 5+ years of experience in reliability engineering, software engineering, systems engineering, platform engineering, SRE, ops, or similar fields
  • Expert knowledge of Go
  • Expert knowledge of Linux
  • Deep systems, cloud and infrastructure knowledge
  • Experienced with Python, Ruby and Bash scripting languages
  • Familiarity with AWS, Docker, Kubernetes, and Terraform
  • Experience with integration and end-to-end testing in microservices environments
  • Experience with microservices, distributed logging and tracing
  • Experience managing monitoring systems
  • Experience building CI/CD pipelines
  • Familiarity with security concepts and best practices
  • Experience with computer networks

Our Benefits:
  • 100% coverage of medical, dental and vision benefits for employees AND their families
  • Equity incentives
  • Unlimited vacation package
  • Up to four months 100% paid parental leave
  • Cell phone reimbursement
  • 401(k)
  • Employee referral rewards
  • Diverse and inclusive culture
  • Leadership, mentorship, and learning programs

To apply: https://weworkremotely.com/remote-jobs/fair-com-senior-staff-site-reliability-engineer-golang-kubernetes