
About the Company: Global Leaders in Technology Transformation

Capgemini is a global leader in partnering with enterprises to transform, modernise, and manage their businesses by harnessing the power of technology. Guided by its purpose of unleashing human energy through technology for an inclusive and sustainable future, Capgemini is a responsible, diverse organisation of over 300,000 team members in nearly 50 countries.

With a 50-year heritage and deep industry expertise, Capgemini is trusted by thousands of global clients to address the entire breadth of their business needs, from corporate strategy and design to day-to-day cloud-native operations. This work is fuelled by the fast-evolving worlds of cloud computing, data, connectivity, digital engineering, and artificial intelligence. In India alone, Capgemini has over 150,000 team members working across 13 hubs, with a continuous-learning culture that enables software and analytics talent to solve complex problems at enterprise scale.

About the Role: Data Scientist (Advanced Analytics & GenAI)

Are you a seasoned mathematical thinker who can translate complex business problems into production-ready analytical pipelines? Capgemini is seeking an analytical, execution-focused Data Scientist to join our Data Science & Analytics division in a hybrid working model, based at one of our hubs in Bangalore, Hyderabad, or Chennai.

This mid-to-senior level role requires a practitioner with 6 to 9 years of experience shipping real-world machine learning systems. The work goes well beyond mathematical experimentation or local sandbox scripting: you will sit at the intersection of classic predictive statistical modelling and modern generative AI engineering. You will be responsible for exploring large, multi-structured enterprise data lakes, carrying out rigorous feature engineering, and designing robust ETL (Extract, Transform, Load) pipelines.

You will also play an active role in building, testing, and scaling production-grade Retrieval-Augmented Generation (RAG) systems, tuning prompt workflows, and deploying automated model monitoring to ensure long-term model stability.

Key Responsibilities & Data Science Workflows

  • Predictive & Prescriptive Modelling: Design, evaluate, and optimise high-performance machine learning models to solve complex, ambiguous enterprise-level problems.

  • Generative AI Systems: Architect and deploy GenAI solutions, using prompt engineering patterns and Retrieval-Augmented Generation (RAG) pipelines that supply models with semantically retrieved context.

  • Production Data Engineering: Design and implement highly resilient, secure ETL data pipelines and streaming workflows using Python, Pandas, and optimised SQL schemas.

  • EDA & Mathematical Profiling: Perform rigorous exploratory data analysis (EDA), data cleaning, statistical modelling, and systematic outlier handling on multi-tiered data structures.

  • MLOps & Model Lifecycle Deployment: Package, containerise, and deploy analytical models into live production systems, establishing robust telemetry to monitor real-world inference behaviour.

  • Cross-Functional Storytelling: Partner closely with global engineering, product management, and business teams to present statistical insights through clear, high-quality data visualisations.

Candidate Prerequisites & Technical Matrix

We are looking for a business-minded technical practitioner who can confidently unpack complex commercial roadblocks and address them with solid, mathematically verifiable code.

Minimum Required Qualifications:

  • Educational Track: Bachelor’s degree (Any Graduate) from an accredited college or university with a foundational focus on computer science, engineering, data science, statistics, or an equivalent quantitative track.

  • Experience Horizon: 6 to 9 years of verified professional experience working directly in a Data Scientist, ML Engineer, advanced analytics, or business intelligence role.

  • Core Mathematical Scripting: Expert-level fluency in writing analytical scripts in Python, relying on core libraries such as Pandas, NumPy, and Matplotlib.

  • Data Transport & Storage: Deep operational mastery of relational database architectures (SQL) combined with the ability to orchestrate custom data pipelines from scratch.

  • Modern AI Acumen: Hands-on or production-level experience with prompt engineering, large language models (LLMs), and Retrieval-Augmented Generation (RAG) systems.

  • Production Delivery Foundations: Solid understanding of the complete software development lifecycle, including container packaging, code versioning, and deploying models into cloud environments.

Key Technical Competencies

  • Systemic Analytical Thinking: The ability to break down highly ambiguous, loosely defined business problems and convert them into clean mathematical frameworks.

  • Strategic Communication: Exceptional storytelling and data visualisation skills to present complex algorithmic behaviour clearly to non-technical corporate leadership.

  • Production Discipline: A strong dedication to data cleaning hygiene, rigorous cross-validation patterns, and maintainable software engineering practices.
