Research Scientist (Foundation Models)
Pace Engineering Recruiters - Alameda, CA
Job Description
About the job:
We are developing an advanced learning-model based robot designed for real-world deployment in industries such as manufacturing, warehousing, and logistics to address critical labor shortages and help businesses optimize operations.

WHO WE'RE LOOKING FOR:
We're looking for a highly talented Research Scientist with a deep understanding of developing and fine-tuning foundation models, particularly in the areas of VLMs, VLAs, Vision Transformers, and multi-modal data. We're looking for someone passionate about tackling difficult AI challenges, such as enabling a learning-model robot to perform dynamic and dexterous tasks in unstructured environments.

Key Responsibilities
- Build foundation models for vision, language, and action that exhibit strong reasoning and maneuvering capabilities; requires a deep understanding of transformer-based ML architectures.
- Design, train, and deploy learning-based perception models for on-robot perception systems. Perception models should support multi-modal learning across semantics such as segmentation, object detection, scene understanding, and tracking.
- Work with ML infrastructure engineers to assess and monitor model performance, and to analyze and resolve performance bottlenecks.
- Collaborate with various teams to understand real-world problems and define tasks, incorporating insights into ML products.
- Produce high-quality code, participate in code reviews to ensure code quality, and share knowledge with the team.
- Comfortable working with SQL queries and ETL logic for data ingress.

Qualifications
- MS/PhD in Computer Science with a minimum of 5 years of industry experience focused on ML/DL, Robotics, or a similar technical field of study, or equivalent practical experience.
- Minimum 2 years of industry experience training and shipping ML models into production and tracking their lifecycle maintenance.
- Deep understanding of computer vision, machine learning, and deep learning fundamentals.
- Strong C++ and Python programming skills for efficient and robust code.
- Experience with multiple sensors such as LiDAR, mono/stereo cameras, IMU, etc.
- Strong communication skills.

What makes you stand out
- Publications at top conferences or journals such as CVPR, NeurIPS, ICCV, TPAMI, TRO, etc.
- Demonstrated proficiency in tackling robotics and computer vision challenges within at least two of the following domains: multi-sensor feature extraction and fusion, object detection and tracking, 3D estimation, and embodied AI with transformer-based models.
- Familiarity with edge-device perception stack deployment; experience with NVIDIA software libraries such as CUDA or TensorRT.
- Open source project contributor.
- Experience with GCP or AWS, Kubernetes, and Docker.
Created: 2025-01-14