Location: HQ - San Mateo, California, United States

A bit about Scalyr

Scalyr’s mission is to provide a different approach to unified observability and log management, one built for modern application development and deployment practices. Founded by Steve Newman, who is also the founder and lead engineer of Writely (aka Google Docs), and led by tech industry veteran and CEO Christine Heckart, Scalyr offers an integrated and extensible suite of monitoring, management, visualization and analysis tools that aggregate and search all the signals needed for real-time observability, including logs, metrics and traces. We are the only observability and log management provider that does not index data, scales horizontally, is blazing fast and is ultra-affordable. The opportunity in front of us is huge and we are still in the very early days. This is going to be one of those companies where people will look back and say “I wish I’d been there when…” Well, this is your chance to be part of “when”.

We are growing fast, thanks to the unique purpose-built database technology developed by Steve. Our solution operates at petabyte scale, delivers blazing-fast search and makes data available for search within ~2 seconds of ingestion. “Existing log management tools were often slow and clunky, so we were facing a challenge, but the good kind: an opportunity to deliver a new user experience through solid engineering”. With Scalyr, we keep users like you “in the zone” while handling incidents or debugging cloud applications.

Your opportunity

We are looking for a Site Reliability Engineer who can help keep our uptime promise to our customers by making sure we meet our SLOs, and who can help our engineering teams ship software to our customers fast and with quality. In this role, you will have an amazing opportunity to drive outcomes that improve the reliability, stability and cost efficiency of Scalyr. We are looking to add an SRE with extensive prior operations experience for a SaaS product who can drive deployment re-architecture with a focus on self-service and automation. Someone who has driven continuous deployment, run incident post-mortems, provided feedback on engineering architecture decisions and automated repetitive operational tasks would be a great fit. You will join a like-minded team of awesome SREs who help run our operations smoothly at scale. We value good written communication skills, data-driven decisions and a keen eye for continuous improvement. You help simplify, have a passion for new ideas and know how to execute iteratively towards the final goal. We value candor and collaboration.

SRE and DevOps are two titles used loosely in the industry. An SRE at Scalyr defines and provisions the common set of tools the engineering teams use, and facilitates dev <-> ops collaboration by consulting and driving best practices. SREs at Scalyr are also responsible for uptime and for providing feedback to engineering on architecture. We dogfood Scalyr for our operations, so an SRE also acts as a product owner who provides product feedback.

Your skills:

  • Strong experience in running operations at large scale for a SaaS product
  • Python/Golang/Java/Ruby as main scripting languages – we use Python
  • Familiarity with running Java and JavaScript applications, including build and deploy
  • Production experience with orchestration systems like Kubernetes, Nomad or Mesos
  • AWS experience and familiarity with other platforms like GCE and Azure
  • Configuration management: Ansible is the main tool used internally; Chef and Puppet experience will help
  • Familiarity with CI and practical delivery using any of Travis, CircleCI, Codefresh, Spinnaker, Jenkins or Buildkite. Familiarity with deployment strategies like blue-green, rolling and canary deploys, and with best practices around deployment automation (with tools like shipit or Spinnaker), is desired
  • Polyglot experience with other SRE tools – we integrate with more tools every day
  • Curiosity, fast learning, a drive for improvement and great communication
  • Keeping a pulse on the latest SRE trends and open source

Apart from the above technical skills, the following soft skills are desired:

  • Ability to work in a diverse and distributed team
  • A self-starter who is passionate about and motivated by new technologies and has empathy for legacy systems
  • A quick learner who can navigate unfamiliar programming languages, systems and processes
  • Prior product-building experience is optional but a plus

Your benefits:

  • Competitive salary & equity at a fast-growing startup
  • Fully funded comprehensive medical, dental, and vision coverage
  • Short-term disability, long-term disability, and life insurance
  • 401(k)
  • Free lunch
  • Unlimited vacation
  • Paid parental leave
  • Open and collaborative office environment

Our commitment to diversity

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
