OpenShift Lead Reliability Engineer
United Software Group Inc - Dallas, TX
Job Description
One of our prime clients is hiring an OpenShift Lead Reliability Engineer in Dallas, TX.

Role: OpenShift Lead Reliability Engineer
Location: Dallas, TX
Mode: Contract

Description:
Design automated, containerized cloud application platform solutions with a focus on application concerns, including cloud-ready distributed application architectures, migrating workloads to containers, containerized development workflows, and integrating container platforms with automated CI and CD pipelines.
Mentor the developer community in best practices for CI/CD deployments using Jenkins, Maven, and Git.
Assist in the design, build, management, and operation of the continuous delivery framework and tools, and act as a subject matter expert on CI/CD for developer teams.
Assist in the design, build, management, and operation of the infrastructure-as-a-service layer (hosted and cloud-based platforms) that supports the different platform services.
Write and build continuous delivery pipelines to manage and automate the lifecycle of the different platform components.

Essential Skills / Knowledge:
Good understanding of OpenShift architecture
Hands-on experience with PaaS (OpenShift / Kubernetes)
5+ years of experience with Red Hat OpenShift Container Platform
Expertise with Kubernetes
Expertise in upgrading and patching major and minor versions of OpenShift / Kubernetes
Good knowledge of RBAC, storage, and security
Maintaining the container registry and certificates for operators
Automating the build of containerized systems with CI/CD tooling, Helm charts, and more
Managing deployments and rollbacks of applications
Experience in security and hardening (under O&M), with the ability to recommend security policy best practices
Experience working with Docker / CRI-O / Podman
Working experience with Prometheus
Working experience with ELK / Splunk
Knowledge of NAS / OCS / Ceph
Working experience with Linux-based infrastructure
Knowledge of Shell / Python scripting and playbook development
Knowledge of Ansible
Good knowledge of automation and configuration management frameworks such as Ansible, Puppet, and Chef
Knowledge of Jenkins is a plus
Knowledge of deployment tools is a plus
Understanding of internet technologies (DNS, SNMP, HTTP, TCP/IP, CDNs)
Experience with virtualization and cloud technologies (VMware, RHEVM, AWS/GCP)
Function as an SRE to enhance platform stability, perform proactive fixes, and document root cause analysis and emergency response
Monitoring tools: Prometheus and/or LogicMonitor
Suggest and enable monitoring for 360-degree health status of the system
Good understanding of infrastructure tools and domain
Ability to prioritize and work well under pressure
Able to provide weekend support based on project needs
Good troubleshooting skills

Basic Qualifications:
5+ years of experience in OpenShift
Created: 2025-02-24