Site Reliability Engineering Manager
Jobot - Columbus, OH
Job Description
Manage a small team of SREs supporting commercial SaaS platforms in the health and wellness space - 100% Remote

This Jobot Job is hosted by: Charles Simmons

Are you a fit? Easy Apply now by clicking the "Apply" button and sending us your resume.

Salary: $150,000 - $170,000 per year

A bit about us:

This mid-sized SaaS organization powers health & wellness throughout the world. Every day their members focus their passion and expertise in helping health & wellness facilities operate efficiently and engage their members.

Whether a neighborhood yoga studio, a national franchise with locations in every city, a YMCA or JCC--and every type of organization in between--we build solutions that make every aspect of running and being a member of a health and wellness organization easier and delightful.

Why join us?

We truly care for our team members, and this is reflected through our offices, benefits, and great perks.

- Flexible paid time off
- Affordable health, dental, and vision insurance options
- Monthly fitness reimbursement
- 401(k) matching
- New-Parent Paid Leave
- 1-month paid sabbatical every 5 years
- Up to 100% telecommute or hybrid work in one of the offices

Job Details

We are seeking a dynamic and experienced Site Reliability Engineering Manager to join our team in the Technology industry. As the SRE Manager, you will be responsible for ensuring the reliability, availability, and scalability of our systems and infrastructure. You will work closely with cross-functional teams to design, implement, and maintain our infrastructure and applications.
The successful candidate will have a strong background working in environments built on technologies like Linux, VMware, AWS, Azure, Docker, Kubernetes, Redis, RabbitMQ, monitoring, GitLab CI, Jenkins, Terraform, ElasticSearch, Rancher, Python, Bash, and Lambdas.

Responsibilities:

- Lead a team of SREs to ensure the reliability, availability, and scalability of our systems and infrastructure
- Design, implement, and maintain our infrastructure and applications
- Develop and implement monitoring and alerting systems to ensure the health of our systems and infrastructure
- Collaborate with cross-functional teams to optimize our systems and infrastructure
- Manage incident response and resolution processes
- Develop and maintain disaster recovery plans
- Ensure compliance with security and regulatory requirements
- Continuously improve our processes and infrastructure to increase efficiency and reduce downtime

Qualifications:

- Bachelor's degree in Computer Science, Engineering, or related field
- 3+ years of experience in Site Reliability Engineering or related field
- Strong background in Linux, VMware, AWS, Azure, Docker, Kubernetes, Redis, RabbitMQ, monitoring, GitLab CI, Jenkins, Terraform, ElasticSearch, Rancher, Python, Bash, and Lambdas
- Experience leading a team of SREs
- Strong problem-solving skills and ability to work in a fast-paced environment
- Excellent communication and collaboration skills
- Experience with agile methodologies and DevOps practices
- Knowledge of security and regulatory requirements and best practices
- Ability to manage incident response and resolution processes
- Experience developing and maintaining disaster recovery plans
- Strong commitment to continuous improvement and learning

Interested in hearing more? Easy Apply now by clicking the "Apply" button.
Created: 2025-02-20