MTS, Data Tooling
Acceler8 Talent - New York City, NY
Job Description
Who we are:
Since our re-founding in March 2024, we have built a reputation for being kind, innovative, and relentlessly dedicated to developing transformative enterprise AI solutions. Our team thrives on collaboration and is driven by the mission to tackle real-world challenges through groundbreaking technology.

Our flagship product is an empathetic conversational chatbot built on our 350B+ frontier model and continuously enhanced through advanced fine-tuning, inference, and orchestration techniques. At the heart of our success lies the quality of our data, which is crucial for training, labeling, and ensuring our models perform at their best.

About the Role
As a Member of Technical Staff on the Data Platform team, you will be instrumental in creating innovative data tools that transform raw inputs into high-quality datasets for ML training and labeling. Your work will focus on designing and building robust pipelines for data transformation, filtering, analysis, and cleaning. We are looking for data engineers who deeply understand ML and are passionate about developing next-generation tools that empower our data curation processes at scale.

This is a good role for you if you:
- Have extensive experience working in ML environments, with a keen understanding of how high-quality data drives model performance.
- Are skilled at designing and implementing data tools that streamline the process of dataset creation, data annotation, and labeling.
- Possess a strong background in building systems for efficient data transformation, filtering, analysis, and cleaning.
- Thrive in innovative, fast-paced settings where you can directly impact the quality and reliability of training data for cutting-edge AI applications.

Responsibilities include:
- Designing, building, and maintaining state-of-the-art data tools that convert raw data into high-quality datasets for ML training and labeling.
- Developing robust workflows and pipelines for data transformation, filtering, analysis, and cleaning that enhance dataset quality.
- Collaborating with ML researchers, data scientists, and engineers to ensure our data tools meet the rigorous standards required for enterprise-grade AI.
- Continuously evaluating and integrating emerging technologies to keep our data curation processes at the forefront of innovation.
- Driving the evolution of our data platform to support scalable, efficient, and effective data annotation and labeling pipelines.
Created: 2025-02-22