S&P Enterprise Data Organization
Data Scientist
The Team :
As a member of the Data Transformation team, you will build ML-powered products and capabilities that drive natural language understanding, data extraction, information retrieval and data sourcing solutions for S&P Global Market Intelligence and our clients. You will spearhead development of production-ready AI products and pipelines while leading by example in a highly engaging work environment. You will work in a (truly) global team and will be encouraged to take thoughtful risks and show self-initiative.
The Impact :
- The Data Transformation team has already delivered breakthrough products and significant business value over the last 3 years.
- In this role, you will develop our next generation of products while enhancing existing ones, with the aim of solving high-impact business problems.
What’s in it for you :
- Be a part of a global company and build solutions at enterprise scale
- Collaborate with a highly skilled and technically strong team
- Contribute to solving high complexity, high impact problems
Key Responsibilities
- Design, develop and deploy ML-powered products and pipelines
- Play a central role in all stages of the data science project life cycle, including:
  - Identification of suitable data science project opportunities
  - Partnering with business leaders, domain experts, and end-users to gain business understanding, data understanding, and collect requirements
  - Evaluation / interpretation of results and presentation to business leaders
  - Performing exploratory data analysis, proof-of-concept modelling, model benchmarking and setting up model validation experiments
  - Training large models both for experimentation and production
- Develop production-ready pipelines for enterprise-scale projects
- Perform code reviews & optimization for your projects and team
- Spearhead deployment and model scaling strategies
- Manage stakeholders and represent the team in front of our leadership
- Lead and mentor by example, including in project scrums
What We’re Looking For :
- 2+ years of professional experience in the Data Science domain
- Expertise in Python (NumPy, Pandas, spaCy, scikit-learn, PyTorch / TF2, Hugging Face etc.)
- Experience with SOTA models related to NLP and expertise in text matching techniques, including sentence transformers, word embeddings, and similarity measures
- Expertise in probabilistic machine learning models for classification, regression & clustering
- Strong experience in feature engineering, data preprocessing, and building machine learning models for large datasets
- Exposure to Information Retrieval, Web scraping and Data Extraction at scale
- OOP Design patterns, Test-Driven Development and Enterprise System design
- SQL (any variant, bonus if this is a big data variant)
- Linux OS (e.g. bash toolset and other utilities)
- Version control system experience with Git, GitHub, or Azure DevOps
- Problem-solving and debugging skills
- Software craftsmanship, adherence to Agile principles and taking pride in writing good code
- Techniques to communicate change to non-technical people
Nice to have
- Prior work to show on GitHub, Kaggle, StackOverflow etc.
- Cloud expertise (AWS and GCP preferably)
- Expertise in deploying machine learning models in cloud environments
- Familiarity with LLMs
Location : Mexico City (Santa Fe, 2 days onsite a week)