HPC Architect
Selby Jennings - New York City, NY
Job Description
One of the world's most elite investment management firms, known for its innovative and groundbreaking work in quantitative finance, is growing its HPC team and looking to add an HPC Architect! The ideal candidate is a thought leader in the high-performance computing space who has superb technical competency and will ensure the highest levels of availability, performance, and security. This is a brand-new team, allowing for significant growth within the organization.

Position
- Design, document, and enhance platform-related services including servers, storage, and cloud.
- Leverage modern computer architectures including GPU, new CPU architectures, and modern HPC storage platforms.
- Identify and eliminate inefficient use of compute and storage resources.
- Provide concise and professional documentation.
- Measure HPC system performance using quantitative metrics.
- Manage projects and collaborate with internal and external partners.
- Provide L3 escalation support for performance and availability issues.
- Identify areas of improvement proactively.
- Customize solutions based on evolving requirements.

Required
- 10+ years of experience with Linux (RHEL/Rocky/CentOS/OEL) in an enterprise environment.
- Expertise in system tuning for high-bandwidth compute infrastructure.
- Experience identifying performance bottlenecks in the OS, software architecture, HPC storage, or network layer.
- Understanding of network protocols (TCP, UDP, RDMA) and server tuning.
- Knowledge of CPU chipsets (Intel/AMD/ARM).
- Experience with HPC job schedulers (Slurm, RunAI, Bright Cluster Manager).
- Proficiency in Python and/or C++.
- Well-organized, proactive, resourceful, and able to handle a fast-paced environment.
- Strong critical thinking and problem-solving skills.
- Excellent verbal and written communication skills.
- Degree in Engineering or Computer Science, or related IT experience.

Desired Skills and Experience
Linux Engineering, Python, System tuning (CPU/memory/network), Slurm, C++, Storage solutions, CPU chipsets, GPU, Red Hat, Ansible, Kubernetes
Created: 2025-02-23