Data Scientist
Massachusetts General Hospital - Somerville, MA
Job Description
Description

The Clinical Augmented Intelligence Group (CLAI) is seeking a Geospatial Data Scientist with a strong Geoinformatics and Machine Learning background to develop new spatial and temporal methods to address limitations of the multi-scale computational models used in environmental health and climate research. Our group conducts research at the intersection of computational sciences and health. Led by Dr. Hossein Estiri, CLAI resides within the Massachusetts General Hospital (MGH) Department of Medicine. As the largest hospital-based research program in the world, MGH enables CLAI research to leverage state-of-the-art large-scale geospatial, environmental, and clinical data for developing specialized AI/ML methods to enrich human phenotype models and exposure assessment.

This position is initially available for one year with a possible extension for up to three years based on performance evaluation. Salary will be commensurate with the candidate's experience and qualifications.

PRINCIPAL DUTIES AND RESPONSIBILITIES:
* Participate in cutting-edge research in environmental/exposure/clinical data science and machine learning applications.
* Work with CLAI faculty and staff as well as an interdisciplinary team of collaborators within the Mass General Brigham system and the Harvard community.
* Develop knowledge discovery algorithms for large-scale real-world data
* Automate analytical data processes/workflows for advanced geospatial modeling and simulation
* Translate computational algorithms into parallelized and GPU-optimized code
* Design and implement environmental/geospatial/clinical data quality assessment procedures
* Adapt new procedures, methods, or instrumentation for collecting, preparing, and analyzing continental/global scale environmental exposure data
* Maintain relational and geospatial databases of research data
* Contribute to experiments with Generative AI and Large Language Models (LLMs)
* Contribute to data filtering and curation for LLM pre-training and post-training
* Tabulate and visualize data for presentation at research conferences and for manuscript preparation
* Supervise other personnel in the laboratory to coordinate research efforts as needed
* Perform pertinent scientific literature reviews as needed
* Assist with the ordering and procurement of computational infrastructure and equipment and with general team coordination as needed
* Provide expertise in standardization, storage, and management of large-scale geospatial/environmental data sets
* Collaborate to maintain a workplace that embraces teamwork and inclusivity

Qualifications

SKILLS & COMPETENCIES REQUIRED:
* Master's Degree in Geoinformatics, Urban Planning, or a related discipline with a focus on Geospatial Science
* Experience in spatial and temporal methods, Geoinformatics, or Data Mining
* Experience in multi-scale computational models and one or more statistical/programming languages (e.g., R, C++, Python)
* Familiarity with collaborative scientific computing and version control systems
* Strong technical/scientific writing, interpersonal, verbal communication, presentation, time-management, planning, problem-solving, and organizational skills
* Ability to work as part of a diverse team and promote collaboration and cooperation among teams
* Demonstrated ability to work and make decisions independently in a fast-paced academic environment

Preferred Qualifications
* PhD Degree in Geoinformatics, Urban Planning, or a related discipline with a focus on Geospatial Science
* Relevant work experience, including full-time postdoctoral experience in an exposome research lab
* Fluency in domain-specific libraries (e.g., sf, terra, geopandas)
* Experience developing and implementing large-scale data analytics pipelines on real-world data
* Experience in geospatial database design and implementation
* Experience with geostatistics and spatial interaction modeling techniques
* Experience with high-performance cluster computing and/or cloud computing at scale
* Ability to design and execute a research agenda
* Experience with applying Generative AI model specialization techniques (e.g., SFT, RLHF)
* Strong publication record
* Experience with software engineering and developing user-friendly interfaces
* Experience with open science practices and data management tools that facilitate reproducible science (e.g., Posit Cloud, Google Colab)
* Experience with informatics/medical ontologies

EEO Statement

Massachusetts General Hospital is an Affirmative Action Employer. By embracing diverse skills, perspectives and ideas, we choose to lead. All qualified applicants will receive consideration for employment without regard to race, color, religious creed, national origin, sex, age, gender identity, disability, sexual orientation, military service, genetic information, and/or other status protected under law. We will ensure that all individuals with a disability are provided a reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment.

Primary Location: MA-Somerville-MGH Assembly Row
Other Locations: MA-Boston-MGH Main Campus
Work Locations: MGH Assembly Row, 399 Revolution Drive, Somerville 02145
Job: IT/Health IT/Informatics - Other
Organization: Massachusetts General Hospital (MGH)
Schedule: Full-time
Standard Hours: 40
Shift: Day Job
Employee Status: Regular
Recruiting Department: MGH Lab of Computer Science
Job Posting: Oct 29, 2024
Created: 2024-11-05