Engineer|6127
ACL Digital - Bridgewater, NJ
Job Description: Can be located in either Bridgewater or Dallas. Must be local to one of those cities and able to come onsite for a hybrid schedule.

Top 3-5 Required Skills (these are not preferred skills; candidates who do not have these required skills will be rejected completely):
1. Delivering reliable operations for web-scale infrastructure for a global market at high release velocity.
2. Must have solid experience with at least 1 of the languages: Go, Python.
3. Experience with Kafka, Mesos, NiFi, Elasticsearch, MySQL, Vertica, ZooKeeper, Nginx.
4. 10+ years of industry experience in managing infrastructure.
5. 5 years of Linux administration in a large-scale SaaS environment.
6. 5 years maintaining production systems on AWS and/or OpenStack.
7. 3 years of experience managing Kubernetes in a large-scale production environment.
8. Strong familiarity with running and optimizing relational (RDB) and NoSQL databases.
9. 3 years using infrastructure as code software (e.g. Terraform, AWS and Google Cloud Deployment, CloudFormation).
10. 5 years of experience with continuous integration practices & tools (Jenkins).
11. Experience with monitoring solutions such as Prometheus, Grafana, ELK.

Technologies: What must this temp know to perform the required job duties? (These are not preferred technologies; candidates who do not have these technologies will be rejected completely.)
1. Must have solid experience with at least 1 of the languages: Go, Python.
2. Experience with Kafka, Mesos, NiFi, Elasticsearch, MySQL, Vertica, ZooKeeper, Nginx.
3. 5 years of Linux administration in a large-scale SaaS environment.
4. 5 years maintaining production systems on AWS and/or OpenStack.
5. 3 years using infrastructure as code software (e.g. Terraform, AWS and Google Cloud Deployment, CloudFormation).
6. 5 years of experience with continuous integration practices & tools (Jenkins).
7. Experience with monitoring solutions such as Prometheus, Grafana, ELK.
Required Education (candidates without this level will be rejected completely): B.S. in Computer Science or Information Technology.

Key Words: Any key words, job titles, or competitors that our suppliers can be on the lookout for? SRE, DevOps.

Responsibilities:
- Build infrastructure as code using Terraform.
- Build, create, and enable Kubernetes clusters.
- Manage and performance-tune either databases (NiFi, Elasticsearch) or streaming data pipelines (Kafka).
- Manage CI/CD pipelines, configuration, and automation tools for infrastructure provisioning.
- Write and maintain runbooks for knowledge-driven automated processes and bots.
- Do capacity planning based on performance, usage, and utilization stats.
- Partner with developers and quality engineering teams to automate the monitoring, alerting, availability, and scalability of our applications and systems.
- Ensure system availability and business continuity by implementing redundant servers/services.
- Manage after-hours infrastructure updates and maintenance.
- Proactively research and propose the use of new concepts, processes, technologies, and tools.
- Proactively monitor, diagnose, and resolve issues in a 24x7 multi-cloud environment (OpenStack), including participation in the on-call rotation; analyze failures and support software engineers in debugging production issues across microservices and distributed platforms.
- Follow SRE best practices and procedures.

Experience Required For You To Be Successful:
- An extensive background in developing and operating large-scale cloud-based distributed applications.
- Direct experience developing/running applications on OpenStack, GCP, and AWS.
- Laser focus and the ability to design infrastructure solutions for scalability, reliability, high availability, performance, software maintainability, and operational excellence.
- The ability to "fix the plane while in flight" (not just support greenfield solutions).
- The ability to prioritize existing technical and infrastructure debt, and the experience to build and execute a plan to pay it off.

Required skills:
- Delivering reliable operations for web-scale infrastructure for a global market at high release velocity.
- Must have solid experience with at least 1 of the languages: Go, Python.
- Experience with Kafka, Mesos, NiFi, Elasticsearch, MySQL, Vertica, ZooKeeper, Nginx.
- 10+ years of industry experience in managing infrastructure.
- 5 years of Linux administration in a large-scale SaaS environment.
- 5 years maintaining production systems on AWS and/or OpenStack.
- 3 years of experience managing Kubernetes in a large-scale production environment.
- Strong familiarity with running and optimizing relational (RDB) and NoSQL databases.
- 3 years using infrastructure as code software (e.g. Terraform, AWS and Google Cloud Deployment, CloudFormation).
- 5 years of experience with continuous integration practices & tools (Jenkins).
- Experience with monitoring solutions such as Prometheus, Grafana, ELK.

Position is 6 months to a year.

Comments for Suppliers: Can be located in either Bridgewater or Dallas. Must be local to one of those cities and able to come onsite for a hybrid schedule.
Created: 2024-11-02