Data Center Cluster Operations Manager, InfraOps - ...
Amazon - Douglasville, GA
Job Description
Data Center Cluster Operations Manager, InfraOps - Data Center Ops
Job ID: 2920932 | Amazon Data Services, Inc.

The Cluster Operations Manager is responsible for one or more Amazon Data Center Clusters and Colocation Operations within a particular region. It is the senior Infrastructure Operations role within the region and has managerial responsibility for safety, security, availability, scaling, efficiency, and cost.

The Infrastructure Operations organizations are composed of two primary functions: Data Center Operations (DCO) and Data Center Engineering Operations (DCEO). A physical security organization, while not reporting directly to the Cluster Operations Manager, is an integral part of the operation. Data Center Operations focuses on the server-level platforms that support both Amazon Retail and Amazon Web Services. Engineering Operations focuses on the mechanical, electrical, and controls systems that support our data center critical environments. Security Operations is charged with the physical security of our people, assets, and customer data.

The Cluster Operations Manager must be able to build and lead high-performing teams across each of these functions and to understand and manage their daily operations, while also having the technical capability and curiosity to dive deep into any given challenge as needed. The Cluster Operations Manager is a key role in the management team that is operating and scaling the world's largest cloud computing infrastructure.

We encounter interesting, challenging, and complex problems every day. As a technical manager at Amazon, you can innovate to solve these issues and refine and develop processes to drive operational excellence in every aspect of your role. You must also have a passion for technology along with a desire to achieve best-in-the-world operational performance.
AWS Infrastructure Services (AIS) owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we're the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain, and we're looking for talented people who want to help. You'll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You'll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. And you'll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.

Key Job Responsibilities
Hiring, managing, and developing the operations team, including compute operations managers, engineering operations managers, logistics operations managers, and their teams.
Attainment of organizational performance goals and objectives relating to safety, security, availability, scaling, efficiency, and cost.
Planning and executing the Infrastructure Operations component of new Data Center and Colocation (Colo) expansions.
Operation and maintenance of mechanical, electrical, and controls systems for Amazon Data Centers, including preventive maintenance, corrective maintenance, and change management.
Vendor management of Colo Data Center service providers to meet or exceed contracted performance SLAs.
Safety, security, and availability incident response, incident management, and incident resolution.
Continuous improvement of operational processes, procedures, methods, and tools.
A day in the life
The Cluster Operations Manager (CM) drives all aspects of 24x7 Operations, including emergent event response and hiring. Safety, Security, and Availability are core tenet areas of this role. This region is a business priority for Machine Learning Zone growth and expansion.

BASIC QUALIFICATIONS
Bachelor's or Master's degree in Engineering, Computer Science, or a related field, or relevant industry experience.
10+ years of relevant management experience in data center operations, facility engineering operations, information technology critical environment facilities, advanced high-volume manufacturing, or similar.
Demonstrated track record in delivering complex projects.

PREFERRED QUALIFICATIONS
Expertise in one or more continuous improvement methodologies such as Lean or Six Sigma.
Substantial knowledge of mechanical, electrical, and controls systems in a critical or controlled environment.
Broad knowledge of information technology infrastructure domains such as compute server platforms, storage server platforms, server components, network devices, technologies and architectures, and IT service delivery principles and best practices.
The ability to lead in dynamic environments and navigate ambiguity is paramount (new region launch role).

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit this link for more information.
Created: 2025-03-10