S&P Enterprise Data Organization
Lead Data Scientist
The Team : As a member of the EDO, Collection Platforms & AI – Cognitive Engineering team, you will build GenAI-driven and ML-powered products and capabilities that power natural language understanding, data extraction, information retrieval, and data sourcing solutions for S&P Global. You will define AI strategy, mentor others, and drive production-ready AI products and pipelines while leading by example in a highly engaging work environment. You will work in a truly global team that encourages thoughtful risk-taking and self-initiative.
What’s in it for you :
- Be a part of a global company and build solutions at enterprise scale
- Lead and grow a highly skilled, hands-on technical team (including mentoring junior data scientists)
- Contribute to solving high-complexity, high-impact problems end-to-end
- Architect and oversee production-ready pipelines from ideation to deployment
Responsibilities :
- Define AI roadmap, tooling choices, and best practices for model building, prompt engineering, fine-tuning, and vector retrieval systems
- Architect, develop and deploy large-scale ML and GenAI-powered products and pipelines
- Own all stages of the data science project lifecycle, including :
  - Identification and scoping of high-value data science and AI opportunities
  - Partnering with business leaders, domain experts, and end-users to gather requirements and align on success metrics
  - Evaluation, interpretation, and communication of results to executive stakeholders
- Lead exploratory data analysis, proof-of-concepts, model benchmarking, and validation experiments for both ML and GenAI approaches
- Establish and enforce coding standards, perform code reviews, and optimize data science workflows
- Drive deployment, monitoring, and scaling strategies for models in production (including both ML and GenAI services)
- Mentor and guide junior data scientists; foster a culture of continuous learning and innovation
- Manage stakeholders across functions to ensure alignment and timely delivery
Technical Requirements :
- Hands-on experience with large language models (e.g., OpenAI, Anthropic, Llama), prompt engineering, fine-tuning / customization, and embedding-based retrieval
- Expert proficiency in Python (NumPy, Pandas, SpaCy, scikit-learn, PyTorch / TF 2, Hugging Face Transformers)
- Deep understanding of ML & Deep Learning models, including architectures for NLP (e.g., transformers), GNNs, and multimodal systems
- Strong grasp of statistics, probability, and the mathematics underpinning modern AI
- Ability to survey and synthesize current AI / ML research, with a track record of applying new methods in production
- Proven experience on at least one end-to-end GenAI or advanced NLP project : custom NER, table extraction via LLMs, Q&A systems, summarization pipelines, OCR integrations, or GNN solutions
- Familiarity with orchestration and deployment tools : Airflow, Redis, Flask / Django / FastAPI, SQL, R-Shiny / Dash / Streamlit
- Openness to evaluate and adopt emerging technologies and programming languages as needed
- Master’s or Ph.D. in Computer Science, Statistics, Mathematics, or related field (minimum Bachelor’s)
- 6+ years of relevant experience in Data Science / AI, with at least 2 years in a leadership or technical lead role
- Prior experience in the Economics / Financial industry, especially with market-intelligence or risk analytics products
- Public contributions or demos on GitHub, Kaggle, StackOverflow, technical blogs, or publications
Location : Mexico City, Santa Fe (2 days a week onsite)