Our Site Reliability Engineers are the primary interface between our developers and our production operations. No matter how many times we get searched, scraped, scanned, spammed, pinged, paged or queried, they gotta keep their cool - and keep the site running smoothly. You'll work in both the dev and systems worlds, instrumenting key parts of core architecture and supporting devs as they try to do the same. We're looking for a true hacker - you'll work as much in bash as Python, and you'll drop into some C now and then. You'll implement monitoring and alerting systems to support site stability and performance. You'll proactively scale our infrastructure to meet ever-increasing demand. You'll make sure that when something goes bump in the night, someone hears it. And you'll play a key role in keeping Yelp fast, available and growing.
Responsibilities
Work closely with developers to support new features and services
Monitor site stability and performance
Scale infrastructure to meet demand
Troubleshoot site issues
Develop custom tools as necessary
Document system design and procedures
Participate in a light on-call rotation
Site Reliability Engineer at Yelp Inc. (San Francisco, CA)