Sr. Data Scientist
ZipRecruiter - New York City, NY
Job Description:
RetailStat is seeking a highly skilled and experienced Senior Data Scientist to join our team. In this role, you will be part of a team responsible for designing, testing, and implementing machine learning solutions across a wide range of projects in the retail intelligence space. One key area will be helping us formulate and solve optimization problems in the retail space using a range of inputs (sales, geospatial, census, etc.), as well as helping build models for credit risk scoring and other financial applications. We work at the intersection of location and financial data, linking the financial and geospatial worlds.

You will work closely with cross-functional teams to understand business requirements, develop machine learning (ML) models, and create scalable and efficient solutions to complex problems in retail intelligence. Your expertise in classical statistics, ML, and stakeholder management will be crucial in driving the creative design of data science solutions across the organization. We have a big push this year to elevate data science to a more central position in the company, and there is ample opportunity to design creative new products for this market.

Responsibilities:
Lead the methodological implementation of our new flagship product, which is a company priority.
Collaborate with stakeholders to understand business requirements and translate them into the appropriate ML framework.
Design and develop ML-based solutions for complex problems in the retail intelligence space.
Build proof-of-concept models and assist the engineering team when moving them to production.
Stay up to date with the latest developments in ML / AI to apply emerging methodologies creatively to complex problems at RetailStat.
Provide leadership and mentorship to more junior members of the data science team to advance team and personal capabilities in the field.

Requirements:
Proven ability to think creatively about translating new and complex data sources into new data products.
Strong foundation in classical statistics: inference, distributions, sampling design, hypothesis testing, probability theory, and regression models (linear, generalized, etc.) and their distributional assumptions.
Strong familiarity with a large variety of ML methods and feature engineering, with the ability to apply them appropriately depending on the specifics of the problem.
Proven experience with data cleaning and pre-processing.
Strong proficiency with the Python ecosystem of tools to implement ML models efficiently.
Strong proficiency in SQL.
Working knowledge of geospatial data, tools, and methodologies.
Familiarity with large-scale (500GB+ data ingested/day) data processing pipelines using Spark / PySpark.
Working knowledge of distributed computing, cluster optimization and management, and performance tuning.
Working knowledge of neural networks / deep learning.
Familiarity with a large variety of database technologies, including relational, NoSQL, and unstructured search, with the ability to select the optimum mix of technologies for a given use case.
Excellent problem-solving skills and the ability to troubleshoot complex issues in a distributed data environment.
Strong communication and collaboration skills, with the ability to effectively convey technical concepts to both technical and non-technical stakeholders.

Qualifications:
PhD or master's degree in any heavily quantitative field.
5+ years as a data scientist.
2+ years of experience in stakeholder management.
Experience with location and/or financial data strongly preferred.
Excellent verbal and written communication skills.
Created: 2025-03-05