Talks | Tuhin Sharma

[GIDS 2026] Prototype to Production: Building Enterprise MCP and AI Agents with Templates

Sun, 26 Apr 2026 00:00:00 +0000

Description

More than 95 percent of GenAI pilots fail to reach production not because of capability, but because of missing engineering discipline. This session presents a practical blueprint for bridging that gap. Using two open-source templates refined through real-world enterprise deployments, you will learn how to build Model Context Protocol (MCP) servers and AI agents that are production-ready from day one.

Through detailed code walkthroughs and live demonstrations, you will explore FastAPI-based MCP server architecture, streaming agent implementations with PostgreSQL persistence, and observability with Langfuse tracing. The session also covers Kubernetes deployment patterns, rootless container configurations, SSO integration, session management, and automated recovery strategies. Attendees will leave with production-grade templates, deployment manifests, and concrete engineering patterns that transform prototypes into reliable enterprise systems.

What You Will Learn

Proven architectural patterns for deploying enterprise-grade MCP servers and AI agents
How to implement observability, authentication, and failure recovery from the start
Practical deployment techniques using Kubernetes, OpenShift, and containerized environments
Access to open-source templates with full documentation, ready for immediate use

Who Should Attend

AI engineers, software architects, DevOps specialists, and enterprise developers responsible for taking AI systems from proof of concept to production at scale.

[AI-ML SYSTEMS 2025] Zero to Production: Building Secure, Scalable MCP Servers and AI Agents with Open-Source Templates

Wed, 08 Oct 2025 00:00:00 +0000

Description

Our tutorial presents two battle-tested, extensible templates (MCP and AI Agents) that have been developed and refined through real-world production deployments. These templates, openly available, provide a proven architectural foundation that accelerates the journey from concept to production while enforcing security best practices and operational excellence.

Participants will gain practical experience building a complete agentic ecosystem in two parts.

Part 1: MCP Server Development. Using our open-source template-mcp-server repository, attendees will create robust MCP servers that enable AI agents to interact securely with external systems. The template includes FastAPI-based HTTP servers, modular tool systems, comprehensive testing frameworks, and enterprise deployment configurations supporting OpenShift/Kubernetes environments.

Part 2: Agent Implementation. Leveraging our template-agent framework, participants will build production-ready conversational agents with real-time streaming capabilities, multi-turn conversation management, and enterprise integration features including SSO authentication, PostgreSQL persistence, and Langfuse observability.
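To make the idea of a modular tool system concrete, here is a minimal sketch of how an MCP-style server might register tools and dispatch JSON-RPC 2.0 requests. It is not code from template-mcp-server: the `tool` decorator, the `TOOLS` registry, the `handle` function, and the `add` tool are hypothetical names chosen for illustration. The method names `tools/list` and `tools/call` and the result shapes follow the MCP specification as we understand it. HTTP transport, authentication, and sessions are deliberately left out.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical in-memory tool registry (illustrative, not the template's API).
TOOLS: Dict[str, Dict[str, Any]] = {}


def tool(name: str, description: str, input_schema: Dict[str, Any]):
    """Register a plain function as a callable tool."""
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOLS[name] = {
            "fn": fn,
            "description": description,
            "inputSchema": input_schema,
        }
        return fn
    return decorator


@tool(
    name="add",
    description="Add two integers.",
    input_schema={
        "type": "object",
        "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
        "required": ["a", "b"],
    },
)
def add(a: int, b: int) -> int:
    return a + b


def handle(request: Dict[str, Any]) -> Dict[str, Any]:
    """Dispatch one JSON-RPC 2.0 request against the tool registry."""
    req_id = request.get("id")
    method = request.get("method")
    params = request.get("params", {})

    if method == "tools/list":
        result = {"tools": [
            {"name": n, "description": t["description"], "inputSchema": t["inputSchema"]}
            for n, t in TOOLS.items()
        ]}
    elif method == "tools/call":
        entry = TOOLS.get(params.get("name"))
        if entry is None:
            return {"jsonrpc": "2.0", "id": req_id,
                    "error": {"code": -32602, "message": "Unknown tool"}}
        try:
            value = entry["fn"](**params.get("arguments", {}))
            result = {"content": [{"type": "text", "text": json.dumps(value)}],
                      "isError": False}
        except Exception as exc:
            # Tool failures are reported inside the result rather than raised.
            result = {"content": [{"type": "text", "text": str(exc)}],
                      "isError": True}
    else:
        return {"jsonrpc": "2.0", "id": req_id,
                "error": {"code": -32601, "message": "Method not found"}}

    return {"jsonrpc": "2.0", "id": req_id, "result": result}
```

Keeping tools as plain functions behind a registry is one way to keep them testable without a running server; a web framework such as FastAPI would then only need to pass the parsed request body to `handle`.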

Key Technical Contributions

Rapid Deployment Framework: Automation scripts that transform base templates into domain-specific implementations, reducing development time from weeks to hours
Security-First Architecture: Rootless containers using Red Hat UBI, comprehensive authentication patterns, and secure tool execution environments
Production Observability: Built-in tracing, logging, and monitoring capabilities essential for maintaining agents in production
Universal Compatibility: Tool-first design ensuring seamless integration with LangGraph, CrewAI, FastMCP, and other major agent frameworks
Enterprise-Ready Features: Session management, checkpointing, error recovery, and scalable deployment patterns tested in production environments

Practical Outcomes

Each participant will complete the tutorial with:

A fully functional MCP server with custom tools deployed to a container platform
A streaming AI agent with enterprise authentication and conversation persistence
Access to reusable template repositories with comprehensive documentation
Automation scripts for rapid customization and deployment
Best practices documentation for maintaining agentic systems in production

By providing open-source, extensible templates rather than rigid frameworks, we enable teams to rapidly prototype while maintaining production standards, significantly accelerating the adoption of agentic solutions in software engineering workflows.

Open Source Commitment

Both templates are actively maintained and openly available:

MCP Server Template: https://github.com/redhat-data-and-ai/template-mcp-server
Agent Template: https://github.com/redhat-data-and-ai/template-agent

These repositories include comprehensive documentation, example implementations, and deployment manifests, enabling participants to immediately apply tutorial learnings in their organizations.

[PYCON DE & PYDATA 2025] Enhancing RAG with Fast GraphRAG and InstructLab - A Scalable, Interpretable, and Efficient Framework

Wed, 23 Apr 2025 00:00:00 +0000

Description

Retrieval Augmented Generation (RAG) has changed the way AI systems incorporate external knowledge, but it often falls short when faced with real-world challenges like adapting to new data, managing complexity, or delivering reliable answers. Fast GraphRAG steps in to address these gaps with a refreshing approach that blends the structure of knowledge graphs with the proven efficiency of algorithms like PageRank. By focusing on interpretability, scalability, and adaptability, Fast GraphRAG creates a pathway for building AI systems that don’t just retrieve data but leverage it in a meaningful way.

The agenda for the talk is as follows:

Challenges in Traditional RAG

Lack of interpretability leads to untrustworthy outputs.
High computational costs limit scalability.
Inflexibility makes adapting to evolving data cumbersome.

Fast GraphRAG’s Core Innovations

Interpretability: Knowledge graphs provide clear, traceable reasoning.
Scalability: Efficient query resolution with minimal overhead.
Adaptability: Dynamic updates ensure relevance in changing domains.
Precision: PageRank sharpens focus on high-value information.
Robust Workflows: Typed and asynchronous handling for complex scenarios.

How Fast GraphRAG Works

Architecture and algorithmic innovations.
Knowledge graphs for intelligent reasoning.
PageRank for multi-hop exploration and precise retrieval.
Entity extraction, incremental updates, and graph exploration.
Role of InstructLab and Fine-tuning.
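As a rough illustration of how PageRank can drive multi-hop retrieval over a knowledge graph, the sketch below runs personalized PageRank from a set of seed entities (for example, the entities found in a query). It is a plain-Python sketch under our own assumptions, not Fast GraphRAG's implementation: the function name, the adjacency-list graph format, and the toy graph are all hypothetical, and every neighbour is assumed to appear as a key in the graph.

```python
from typing import Dict, List, Set


def personalized_pagerank(
    graph: Dict[str, List[str]],
    seeds: Set[str],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Rank nodes by how reachable they are from the seed entities."""
    nodes = list(graph)
    # Restart distribution: all restart probability sits on the seeds.
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)

    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for node, neighbours in graph.items():
            if neighbours:
                share = damping * rank[node] / len(neighbours)
                for nb in neighbours:
                    nxt[nb] += share
            else:
                # Dangling node: hand its mass back to the seeds.
                for n in nodes:
                    nxt[n] += damping * rank[node] * restart[n]
        rank = nxt
    return rank


# Toy graph with two disconnected topics (illustrative data only).
toy_graph = {
    "PageRank": ["knowledge graph"],
    "knowledge graph": ["PageRank", "entity extraction"],
    "entity extraction": ["knowledge graph"],
    "InstructLab": ["fine-tuning"],
    "fine-tuning": ["InstructLab"],
}
scores = personalized_pagerank(toy_graph, seeds={"PageRank"})
top_entities = sorted(scores, key=scores.get, reverse=True)[:3]
```

Because the random walk restarts only at the seeds, entities in the unrelated part of the graph receive no score, which is what keeps retrieval focused on the neighbourhood of the query.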

Demo and Practical Takeaways

Building a knowledge graph and resolving queries.
Open-source tools for scaling Fast GraphRAG.
Real-world applications.

Fast GraphRAG isn’t just another tool. It’s a game-changer for anyone frustrated by the limitations of traditional RAG systems. By combining the structured clarity of knowledge graphs with the power of algorithms like PageRank and fine-tuning with InstructLab, it makes retrieval smarter and faster and the LLM more adaptable. This session will leave you with a clear understanding of how to build and train AI systems that deliver meaningful results while being transparent and trustworthy. Whether you’re a developer, researcher, or just someone passionate about AI, Fast GraphRAG is a framework that sparks possibilities and redefines what intelligent retrieval can achieve.

Presentation Video

[GIDS 2024] Navigating Innovation with Open Hybrid Cloud and OpenShift AI

Tue, 23 Apr 2024 00:00:00 +0000

Description

Cloud computing is witnessing a significant shift, with Hybrid Cloud emerging as the dominant force. In this landscape, organizations are embracing a multi-cloud approach to manage operations and drive innovation. At the heart of this transformation is the open-source ecosystem, facilitating the growth of technologies like microservices, containers, Kubernetes, and AI.

Tuhin Sharma brings his extensive expertise in AI and NLP to this session. He will explore the crucial role of open-source in the Hybrid Cloud environment, emphasizing how it enables versatility and innovation across various cloud platforms.

A special focus will be on the role of Generative AI and its impact on enhancing efficiency and creativity in cloud computing. Drawing from real-world experiences and strategies from Red Hat, the talk will provide actionable insights into leveraging these advanced technologies.

Attendees will gain a deeper understanding of the current and future trends in cloud computing, learning how to effectively navigate and utilize these technologies for organizational growth and innovation.

Presentation Video

[GIDS 2023] Pybandit - A Website Optimization Framework for E-commerce SMBs

Fri, 28 Apr 2023 15:10:00 +0000

Description

In this talk, Tuhin and Abir discuss the pros and cons of existing website experimentation processes and how the multi-armed bandit algorithm can be used to address their shortcomings. The speakers will also showcase pybandit, an open-source Python library they developed, which provides an out-of-the-box solution for implementing such experimentation processes.
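For readers new to multi-armed bandits, the following is a minimal sketch of Thompson sampling applied to choosing between website variants. It does not use or reproduce pybandit's API, which is not described here: the `ThompsonBandit` class, the `simulate` helper, the variant names, and the conversion rates are all hypothetical.

```python
import random
from typing import Dict, Iterable, Optional


class ThompsonBandit:
    """Bernoulli bandit with a Beta(1, 1) prior on each arm's conversion rate."""

    def __init__(self, arms: Iterable[str], seed: Optional[int] = None) -> None:
        self.rng = random.Random(seed)
        self.successes: Dict[str, int] = {arm: 0 for arm in arms}
        self.failures: Dict[str, int] = {arm: 0 for arm in self.successes}

    def choose(self) -> str:
        # Sample a plausible conversion rate per arm and play the best sample.
        samples = {
            arm: self.rng.betavariate(self.successes[arm] + 1, self.failures[arm] + 1)
            for arm in self.successes
        }
        return max(samples, key=samples.get)

    def update(self, arm: str, converted: bool) -> None:
        if converted:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1


def simulate(true_rates: Dict[str, float], rounds: int = 5000, seed: int = 0) -> Dict[str, int]:
    """Simulate visitors and count how often each variant was shown."""
    bandit = ThompsonBandit(true_rates, seed=seed)
    visitors = random.Random(seed + 1)
    pulls = {arm: 0 for arm in true_rates}
    for _ in range(rounds):
        arm = bandit.choose()
        pulls[arm] += 1
        bandit.update(arm, visitors.random() < true_rates[arm])
    return pulls


# Hypothetical conversion rates for two page layouts.
pulls = simulate({"layout_a": 0.02, "layout_b": 0.20})
```

Unlike a fixed-split A/B test, the bandit shifts traffic toward the better-performing variant while the experiment is still running, so fewer visitors are sent to the weaker layout.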

Presentation Video

[ODSC EUROPE 2022] Eagleeye - Data Pipeline for Anomaly Detection in Cyber Security

Thu, 16 Jun 2022 16:05:00 +0000

Description

Cloud-native applications. Multiple Cloud providers. Hybrid Cloud. 1000s of VMs and containers. Complex network policies. Millions of connections and requests in any given time window. This is the typical situation faced by a Security Operations Center (SOC) analyst every single day. In this talk, the speaker talks about the highly available and highly scalable data pipelines that he built for the following use cases:

Denial of Service: A device in the network stops working.
Data Loss: An example is a rogue agent in the network transmitting IP data outside the network.
Data Corruption: A device starts sending erroneous data.

The use cases above can be addressed with anomaly detection models. The main challenge is the data engineering pipeline. With almost 7 billion events occurring every day, processing and storing them for further analysis is a significant challenge. The machine learning models (for anomaly detection) have to be updated every few hours, which requires the pipeline to build the feature store within a very short time window. The core components of the data engineering pipeline are:

Apache Flink
Apache Kafka
Apache Pinot
Apache Spark
MLflow
Apache Superset

The event logs are stored in Pinot through a Kafka topic. Pinot supports an Apache Kafka-based indexing service for real-time data ingestion. Pinot has only basic capabilities for computing sliding time window statistics, so more complex real-time statistics are computed using Flink. Apache Flink is a stream-processing engine that provides high throughput and low latency. Spark jobs are used for batch processing. MLflow is used for machine learning model management. Superset is used for visualization.
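To illustrate what a sliding time window statistic looks like in an anomaly detection setting, here is a small plain-Python sketch that flags values far from the recent window mean. It only conveys the idea and does not use the Pinot or Flink APIs. The `SlidingWindowDetector` class, the window size, the z-score threshold, and the sample stream are assumptions made for this example, not details from the talk.

```python
from collections import deque
from math import sqrt
from typing import Deque


class SlidingWindowDetector:
    """Flag a value whose z-score against the recent window exceeds a threshold."""

    def __init__(self, window: int = 60, threshold: float = 3.0) -> None:
        self.values: Deque[float] = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, value: float) -> bool:
        is_anomaly = False
        if len(self.values) >= 2:
            mean = sum(self.values) / len(self.values)
            variance = sum((v - mean) ** 2 for v in self.values) / (len(self.values) - 1)
            std = sqrt(variance)
            if std > 0 and abs(value - mean) / std > self.threshold:
                is_anomaly = True
        # The value joins the window either way; the deque drops the oldest entry.
        self.values.append(value)
        return is_anomaly


# Hypothetical requests-per-second readings ending in a spike.
stream = [100, 102, 98, 101, 99, 100, 103, 97, 500]
detector = SlidingWindowDetector(window=60, threshold=3.0)
flags = [detector.observe(v) for v in stream]
```

In a streaming engine the same computation would typically be expressed as a keyed window aggregation per device or connection, with the window state managed by the engine rather than by an in-process deque.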

The speaker talks through the architectural decisions and shows how to build a modern real-time stream processing data engineering pipeline using the above tools.

Outline

The problem: overview
Different Architecture Choices
The final architecture - a brief explanation
Real-Time Processing
Apache Kafka
Message broker vs Message Queue
RabbitMQ vs Kafka
Why Kafka?
Apache Flink
Micro-batching vs Streaming
Flink vs Spark Streaming
Why Flink?
Apache Pinot
OLAP vs OLTP
Why Pinot?
Batch Processing
Apache Spark
Anomaly detection
Models
Data Engineering + Machine Learning
ML and MLlib
MLflow - Model management
Visualization - Superset
A short demo

Presentation Video

[DEVCONF.CZ 2022] Building data pipelines for Anomaly Detection

Fri, 28 Jan 2022 18:00:00 +0000

Description

Denial of Service: A device in the network stops working.
Data Loss: An example is a rogue agent in the network transmitting IP data outside the network.
Data Corruption: A device starts sending erroneous data.

Apache Flink
Apache Kafka
Apache Pinot
Apache Spark
MLflow
Apache Superset

The speaker talks through the architectural decisions and shows how to build a modern real-time stream processing data engineering pipeline using the above tools.

Outline

The problem: overview
Different Architecture Choices
The final architecture - a brief explanation
Real-Time Processing
Apache Kafka
Message broker vs Message Queue
RabbitMQ vs Kafka
Why Kafka?
Apache Flink
Micro-batching vs Streaming
Flink vs Spark Streaming
Why Flink?
Apache Pinot
OLAP vs OLTP
Why Pinot?
Batch Processing
Apache Spark
Anomaly detection
Models
Data Engineering + Machine Learning
ML and MLlib
MLflow - Model management
Visualization - Superset
A short demo

Presentation Video

[GIDS 2021] Signature Verification in Banks using Few Shot Learning

Tue, 27 Apr 2021 12:30:00 +0000

Description

In a standard image classification task, the input image is fed into a series of layers, and the output is a probability distribution over all the classes. This approach requires a large number of images per class. In the offline signature verification scenario, we do not have enough signatures for each signer, and the total number of signers is both huge and dynamically changing. Thus, the cost of data collection and periodic retraining is too high. Few-shot image classification, on the other hand, requires only a few signatures for each signer, hence the name Few Shot.

The speaker will discuss the following:

Introduction to Few Shot Learning
The architecture of Siamese Network
How to train an offline signature verification system using Siamese Networks on real-life data
Tools to build such a model
How to deploy such a model on the cloud
How to serve such a model in real time
Pros and Cons of our approach
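The Siamese network items above rest on two ideas that can be sketched without a deep learning framework: a contrastive loss that pulls embeddings of the same signer together and pushes different signers apart, and verification by distance to a few reference signatures. The sketch below is plain Python under our own assumptions and is not the speaker's implementation. The embeddings are assumed to come from a trained network that is not shown, the contrastive loss is written in one common form, and the margin and threshold values are hypothetical.

```python
from math import sqrt
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Distance between two embedding vectors of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(distance: float, same_signer: bool, margin: float = 1.0) -> float:
    """One common form of the contrastive loss for a pair of embeddings."""
    if same_signer:
        # Genuine pair: any distance is penalised.
        return distance ** 2
    # Different signers: penalised only when closer than the margin.
    return max(0.0, margin - distance) ** 2


def verify(
    query: Sequence[float],
    references: Sequence[Sequence[float]],
    threshold: float,
) -> bool:
    """Accept the query if it is close enough to any reference signature."""
    return min(euclidean(query, ref) for ref in references) <= threshold


# Hypothetical 2-D embeddings: a few enrolled references and one query.
references = [[0.30, 0.40], [0.35, 0.45]]
accepted = verify([0.0, 0.0], references, threshold=0.6)
```

Because verification only compares a query against stored reference embeddings, enrolling a new signer means adding a few references rather than retraining a classifier, which is the property that makes the few-shot setting attractive here.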

Learn what few-shot learning is and how to build and deploy such models on the cloud to solve various image classification tasks with a very limited amount of data.

Presentation Video

[DATAHACK SUMMIT INDIA 2019] Federated Learning using Deep Learning

Wed, 13 Nov 2019 00:00:00 +0000

Description

Federated learning is a family of machine learning algorithms built on one core idea: a connected network of nodes with a central server node. Each node generates data that is used for both training and prediction. Each node trains a local model, and only that model is shared with the server, not the data. In this talk, we discuss how to build deep learning models using federated learning that are truly privacy-preserving. We will show how to build custom algorithms and loss functions.
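The train-locally, share-only-the-model loop described above can be sketched with federated averaging on a tiny linear model. This is a plain-Python illustration under our own assumptions, not the notebook from the talk and not any federated learning library's API. The function names, the learning rate, and the two clients' data are hypothetical, and the encryption and differential privacy steps are omitted.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (x, y) pair held privately by one client


def local_update(weights: Sequence[float], data: Sequence[Sample],
                 lr: float = 0.1, epochs: int = 5) -> List[float]:
    """Train y = w * x + b with SGD on one client's private data."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            error = (w * x + b) - y
            w -= lr * error * x
            b -= lr * error
    return [w, b]


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Server step: average client models, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def train(clients: Sequence[Sequence[Sample]], rounds: int = 50) -> List[float]:
    """Run federated rounds; only model weights ever leave a client."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        global_weights = federated_average(updates, [len(d) for d in clients])
    return global_weights


# Two hypothetical clients observing y = 2x + 1 over different x ranges.
client_a = [(0.0, 1.0), (0.5, 2.0), (1.0, 3.0)]
client_b = [(-1.0, -1.0), (1.5, 4.0), (2.0, 5.0)]
weights = train([client_a, client_b])
```

The server never sees `client_a` or `client_b`, only the weight vectors returned by `local_update`. A custom loss would change the gradient inside `local_update`, and a custom aggregation rule would replace `federated_average`.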

Key Takeaways:

Introduction to Federated Learning
- Decentralized Training
- Encryption
- Differential Privacy
Federated Learning – Notebook
- Introduction
- Custom algorithm and loss function

Presentation Video

Coming soon!!

[OREILLY AI LONDON 2019] Anomaly Detection in Smart Buildings using Federated Learning

Thu, 17 Oct 2019 16:00:00 +0000

Description

A modern smart building has a number of internet-enabled devices: IoT sensors to measure temperature, internet-enabled lighting, IP cameras, IP phones, etc. Data is generated at scale across all the devices. Two aspects are critical for the network of devices to function well:

Data Quality - The data that is generated has to be correct (typically within an accepted error range).
Security - With a number of internet-connected devices, securing the network from cyber threats is very important.

But there are two broad challenges in achieving the above:

The data collected are very sensitive to the business operations, and hence the solution has to be privacy-preserving.
The amount of data generated is huge, and it is not feasible to upload all of it to the cloud.

The speakers used Federated learning to build anomaly detection models that monitor data quality and cyber security - while preserving data privacy.

Federated learning enables edge devices to collaboratively learn a machine learning model while keeping all of the data on the device itself. Instead of moving data to the cloud, the models are trained on the device and only the model updates are shared across the network. Using federated learning gave us the following advantages:

More accurate, low-latency models: The data are not moved; only the model updates are shared. The resulting models have low latency (since they run on the device) and are also more accurate.
Privacy Preserving: The data remains on the device.
Energy Efficient: The workload on the device is drastically reduced, leading to lower power consumption and longer device life.

The speakers built deep learning models using PyTorch and PySyft.

The speakers discuss their architecture and also show how federated learning can help improve the models. Federated learning provides a framework to port models across organizations within the same device domain, something that is not possible with traditional cloud-based anomaly detection models. This makes it easy to deploy with very limited data, and the speakers share some of their success stories.

Presentation Video

[ODSC INDIA 2018] Hybrid Recommendation Systems in News Media using Probabilistic Graphical Models

Sat, 01 Sep 2018 14:55:00 +0000

Description

A typical task of recommender systems is to improve customer experience using prior implicit feedback, by providing relevant content from time to time. These systems actively track different sorts of user behavior, such as buying patterns, watching habits, and browsing activity, in order to model user preferences. Unlike the much more extensively explored explicit feedback, implicit feedback gives us no direct input from the users regarding their preferences. Where understanding the content is important, it is non-trivial to explain the recommendations to the users.

When a new customer comes to the system, it is very difficult to provide relevant recommendations using traditional state-of-the-art collaborative filtering, whereas content-based recommendation does not suffer from this problem. On the other hand, content-based recommendation systems fail to achieve good performance when the user profile is not well defined, whereas collaborative filtering does not suffer from this problem. So, there is a need to combine the power of these two recommendation systems and create a hybrid recommendation system which can address this problem in a more effective and robust way. Large media and edtech companies in emerging markets are using a version of this approach.
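The trade-off described above can be illustrated with a deliberately simple blend: lean on content-based scores for new users and shift toward collaborative filtering as interaction history grows. This is not the probabilistic graphical model approach named in the talk title. It is a plain weighted combination, and the function names, the `saturation` parameter, and the example scores are all hypothetical.

```python
from typing import Dict, List


def hybrid_score(content_score: float, collab_score: float,
                 n_interactions: int, saturation: int = 20) -> float:
    """Blend two scores; trust in collaborative filtering grows with history."""
    trust = n_interactions / (n_interactions + saturation)
    return trust * collab_score + (1.0 - trust) * content_score


def rank_items(content: Dict[str, float], collab: Dict[str, float],
               n_interactions: int) -> List[str]:
    """Order items by blended score, highest first."""
    blended = {
        item: hybrid_score(content[item], collab.get(item, 0.0), n_interactions)
        for item in content
    }
    return sorted(blended, key=blended.get, reverse=True)


# Hypothetical scores for two news articles.
content_scores = {"article_a": 0.9, "article_b": 0.1}
collab_scores = {"article_a": 0.1, "article_b": 0.9}

new_user_ranking = rank_items(content_scores, collab_scores, n_interactions=0)
active_user_ranking = rank_items(content_scores, collab_scores, n_interactions=200)
```

With no history the ranking follows the content-based scores, and with a long history it follows the collaborative scores, which mirrors the complementary weaknesses of the two approaches.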