Data Director
Gointellects INC - New York City, NY
Job Description
Skills Required:

- At least 12 years delivering Data and Analytics professional services, including commercials, as a lead resource
- Experience building customer orientation and growing accounts (Must have)
- Experience with one or more cloud platforms (AWS preferred):
  - Data storage: S3, DynamoDB, RDS, Redshift, Aurora
  - Data processing: AWS Glue, Athena, EMR, Lambda, Kinesis, Step Functions
  - Compute services: EC2, ECS, EKS, EMR, and EMR Serverless
  - IAM policies: fine-grained access control for data and services
  - Infrastructure as Code (IaC) tools: Terraform, AWS CloudFormation, or CDK
- Experience in data pipeline orchestration (Airflow):
  - Designing, building, and maintaining complex workflows using Apache Airflow
  - DAG optimization, error handling, retries, and monitoring techniques
  - Automating data ingestion, transformation, and AI/ML model deployment pipelines using Airflow, AWS Lambda, and Step Functions
  - Custom operator development and plugin creation for specific use cases
  - Integrating Airflow with data storage and processing tools (e.g., S3, Snowflake, Databricks)
  - Building CI/CD pipelines for deploying data pipelines using GitHub Actions, Jenkins, or GitLab
  - Containerization with Docker and orchestration with Kubernetes
- Experience in big data and processing frameworks:
  - Strong understanding of distributed computing frameworks such as Apache Spark
  - Hands-on Databricks experience: cluster management, performance tuning, and cost optimization
  - Writing optimized PySpark/SQL code for large-scale data processing
  - Integration with other tools such as Snowflake and S3
- Experience in data lake management (AWS S3):
  - Designing scalable and secure data lakes on AWS S3
  - Expertise in file formats (Parquet, Iceberg, ORC, Avro) and partitioning strategies
  - Implementing data lifecycle policies, versioning, and tiered storage (e.g., Glacier)
  - Security and compliance: encryption (SSE/KMS), access control using IAM, bucket policies, and S3 Access Points
- Programming experience:
  - Proficiency in Python (Airflow, Databricks, AWS SDKs such as Boto3)
  - Experience with SQL for data manipulation and transformation
- Experience in AI/ML development and deployment:
  - Collaborating with data scientists to deploy advanced machine learning and deep learning models into production using Databricks MLflow, SageMaker, or similar platforms
  - Fine-tuning LLMs on domain-specific datasets and optimizing them for inference
  - Developing solutions for predictive analytics, anomaly detection, and personalization using AI/ML models
  - Strong understanding of machine learning pipelines, including feature engineering and model training
  - Familiarity with MLOps pipelines for continuous model training, monitoring, and retraining, ensuring robust deployment processes
  - Leveraging generative AI to automate report generation, trend analysis, and narrative creation for stakeholders
  - Developing monitoring systems for data pipelines and AI/ML workflows, ensuring performance optimization and early detection of issues
  - Implementing observability frameworks for deployed generative AI models and intelligent agents
  - Hands-on experience implementing generative AI models such as GPT, DALL·E, or Stable Diffusion
  - Familiarity with frameworks such as Hugging Face Transformers, OpenAI APIs, LangChain, and Diffusers
  - Knowledge of prompt engineering techniques and responsible AI principles for generative AI
  - Familiarity with tools for deploying conversational agents, such as Rasa, the ChatGPT API, or Dialogflow
- Familiarity with deep learning frameworks such as TensorFlow, PyTorch, and Keras
- Familiarity with reinforcement learning, model explainability (SHAP, LIME), and personalization techniques
- Familiarity with other languages such as Scala or Java for Spark
- Implementation of data security and compliance (GDPR, HIPAA):
  - Cataloging and metadata management (AWS Glue Data Catalog, Hive Metastore)
  - Implementing RBAC and auditing on data platforms (Snowflake, Databricks)
- Familiarity with one or more analytics platforms such as MicroStrategy, Microsoft Power BI, or Tableau
- Experience building business-oriented solutions in an industry such as retail, media, or banking and finance (Must have)
- Pre-sales experience in the US commercial sector (Must have)
- Demonstrated ability to communicate persuasively through speaking, writing, and client presentations
- Experience collaborating with cross-functional teams to gather and define requirements
- Experience leading teams, including managing, mentoring, and coaching senior staff
- Familiarity with other cloud platforms (Azure Data Factory, GCP BigQuery)
- Bachelor's degree in computer science, engineering, or a related field
- Demonstrated continued learning through one or more technical certifications or related methods

Job Type: Full-time

Pay: $170,000.00 - $220,000.00 per year

Benefits:
- 401(k) matching
- Paid time off

Compensation Package:
- Bonus opportunities

Schedule: 8-hour shift

Work Location: Remote