Sr Site Reliability Engineer
McGraw-Hill Education - Huntsville, AL
Job Description
Overview

Impact the Moment
At McGraw Hill we create best-in-class, next-generation learning platforms that are used by millions of students and educators worldwide every day. We design intuitive and effective tools and experiences that maximize teachers' time and students' learning. And we do all of this in a supportive and collaborative environment where we work alongside brilliant colleagues, touch lives around the world, see the difference our hard work makes, and continue our paths of lifelong learning.

Your impact on team
As a Sr Site Reliability Engineer at McGraw Hill, you will play a crucial role in designing and maintaining high-capacity systems that ensure the reliability, performance, and security of our customer platforms. You will collaborate with product teams within a DevOps framework to implement automation tools and processes that enhance predictability, accelerate time-to-market, and optimize costs. Your efforts will directly contribute to operational excellence and help advance our mission to deliver exceptional, reliable services.

This is a remote position open to applicants authorized to work for any employer within the United States.

What You'll Do

Cloud Engineering
Design, deploy, and manage automation tools in a DevOps model to enhance predictability, accelerate time-to-market, and ensure repeatability, traceability, and transparency of infrastructure automation (infrastructure-as-code, monitoring-as-code).
Collaborate with product development teams to optimize systems for reliability and performance, while managing AWS costs and using optimization tools to maximize ROI and meet Service Level Objectives.
Continuously learn and stay updated on the AWS ecosystem through participation in game day scenarios, professional conferences, and other development opportunities.

Observability Engineering
Own the reliability, uptime, system security, cost, capacity, resiliency, and performance of applications and platforms, while leading data-driven initiatives to enhance stability and improve service levels.
Ensure that the architecture and deployment models are adequately designed to meet SLA commitments.
Act as the primary contact during major incidents, resolving issues and managing on-call alarms.
Maintain and enhance telemetry systems to improve visibility into application performance and business metrics, ensuring operational workloads are effectively managed.

DevSecOps
Support healthy software development practices, including complying with agile software development methodology and building standards for code reviews, work packaging, and continuous delivery.
Partner with CyberSecurity to develop plans and automation to respond to new risks and vulnerabilities.

Resiliency Engineering
Collaborate with development teams to identify system failure points and blast radius, validate monitoring and observability configurations, coordinate failure injection testing, and document steady-state production levels and growth patterns.
Plan and forecast for seasonal growth, communicate trends with leadership, and enhance infrastructure scaling plans to handle 2x the anticipated load, while coordinating improvements to software and infrastructure to meet resiliency goals.
Mentor and nurture engineers across varying levels of experience; foster growth by setting high-reaching goals and providing support to achieve them.

About You
Minimum of 5 years of applicable Site Reliability Engineering (SRE) experience.
Hands-on experience with the following technologies is required:
Cloud and Infrastructure as Code: AWS (CloudFront, S3, EC2, ECS, SES, SQS, SNS, Load Balancing, VPC, Config, Systems Manager, Lambda, API Gateway, DB services) and Terraform
Programming and Containerization: Python, Golang, Bash, Ansible, and AWS ECS
Security and web platforms: Rapid7, WAF, Apache, Apache Tomcat, Angular
Config Management and provisioning: Ansible, Packer
Telemetry: New Relic, CloudWatch, Datadog
DevSecOps: Artifactory, Jenkins, CircleCI, SonarQube, JFrog Xray, Control Tower, GitHub
Experience with automation tools and software development is a bonus.

Why McGraw Hill?
There has never been a better time to join McGraw Hill. In our culture of curiosity and innovation, you will be able to own your growth and develop as we do.

The pay range for this position is $124,350 to $155,000 annually; however, base pay offered may vary depending on job-related knowledge, skills, and experience. An annual bonus plan may be provided as part of the compensation package, in addition to a full range of medical and/or other benefits, depending on the position offered. Click here to learn more about our benefit offerings.

McGraw Hill recruiters always use a "@" email address and/or contact candidates through our Applicant Tracking System, iCIMS. Any variation of this email domain should be considered suspicious. Additionally, McGraw Hill recruiters and authorized representatives will never request sensitive information in email.

47819

McGraw Hill uses an automated employment decision tool (AEDT) to assist in the screening process by recommending candidates with "like skills" based on resume and job data. To request an alternative screening process, please select "Opt-Out" when asked to "Consent to use of Automated Employment Decision Tools" during the application.
Created: 2024-09-18