<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Talks | Tuhin Sharma</title><link>https://tuhinsharma.netlify.app/talks/</link><atom:link href="https://tuhinsharma.netlify.app/talks/index.xml" rel="self" type="application/rss+xml"/><description>Talks</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Sun, 19 May 2024 00:00:00 +0000</lastBuildDate><image><url>https://tuhinsharma.netlify.app/media/icon_hu55b84836e614877e119cbfa37f6d5a66_1386708_512x512_fill_lanczos_center_3.png</url><title>Talks</title><link>https://tuhinsharma.netlify.app/talks/</link></image><item><title>[GIDS 2026] Prototype to Production: Building Enterprise MCP and AI Agents with Templates</title><link>https://tuhinsharma.netlify.app/talks/gids2026/</link><pubDate>Sun, 26 Apr 2026 00:00:00 +0000</pubDate><guid>https://tuhinsharma.netlify.app/talks/gids2026/</guid><description>&lt;h3> Description &lt;/h3>
&lt;p>More than 95 percent of GenAI pilots fail to reach production not because of capability, but because of missing engineering discipline. This session presents a practical blueprint for bridging that gap. Using two open-source templates refined through real-world enterprise deployments, you will learn how to build Model Context Protocol (MCP) servers and AI agents that are production-ready from day one.&lt;/p>
&lt;p>Through detailed code walkthroughs and live demonstrations, you will explore FastAPI-based MCP server architecture, streaming agent implementations with PostgreSQL persistence, and observability with Langfuse tracing. The session also covers Kubernetes deployment patterns, rootless container configurations, SSO integration, session management, and automated recovery strategies. Attendees will leave with production-grade templates, deployment manifests, and concrete engineering patterns that transform prototypes into reliable enterprise systems.&lt;/p>
&lt;h3> What You Will Learn &lt;/h3>
&lt;ul>
&lt;li>Proven architectural patterns for deploying enterprise-grade MCP servers and AI agents&lt;/li>
&lt;li>How to implement observability, authentication, and failure recovery from the start&lt;/li>
&lt;li>Practical deployment techniques using Kubernetes, OpenShift, and containerized environments&lt;/li>
&lt;li>Access to open-source templates with full documentation, ready for immediate use&lt;/li>
&lt;/ul>
&lt;h3> Who Should Attend &lt;/h3>
&lt;p>AI engineers, software architects, DevOps specialists, and enterprise developers responsible for taking AI systems from proof of concept to production at scale.&lt;/p></description></item><item><title>[AI-ML SYSTEMS 2025] Zero to Production: Building Secure, Scalable MCP Servers and AI Agents with Open-Source Templates</title><link>https://tuhinsharma.netlify.app/talks/aimlsystems2025/</link><pubDate>Wed, 08 Oct 2025 00:00:00 +0000</pubDate><guid>https://tuhinsharma.netlify.app/talks/aimlsystems2025/</guid><description>&lt;h3> Description &lt;/h3>
&lt;p>Our tutorial presents two battle-tested, extensible templates (MCP and AI Agents) that have been developed and refined through real-world production deployments. These templates, openly available, provide a proven architectural foundation that accelerates the journey from concept to production while enforcing security best practices and operational excellence.&lt;/p>
&lt;p>Participants will gain practical experience building a complete agentic ecosystem comprising:&lt;/p>
&lt;ul>
&lt;li>&lt;em>Part 1: MCP Server Development&lt;/em>: Using our open-source template-mcp-server repository, attendees will create robust MCP servers that enable AI agents to interact securely with external systems. The template includes FastAPI-based HTTP servers, modular tool systems, comprehensive testing frameworks, and enterprise deployment configurations supporting OpenShift/Kubernetes environments.&lt;/li>
&lt;li>&lt;em>Part 2: Agent Implementation&lt;/em>: Leveraging our template-agent framework, participants will build production-ready conversational agents with real-time streaming capabilities, multi-turn conversation management, and enterprise integration features including SSO authentication, PostgreSQL persistence, and Langfuse observability.&lt;/li>
&lt;/ul>
&lt;h3> Key Technical Contributions &lt;/h3>
&lt;ul>
&lt;li>&lt;em>Rapid Deployment Framework&lt;/em>: Automation scripts that transform base templates into domain-specific implementations, reducing development time from weeks to hours&lt;/li>
&lt;li>&lt;em>Security-First Architecture&lt;/em>: Rootless containers using Red Hat UBI, comprehensive authentication patterns, and secure tool execution environments&lt;/li>
&lt;li>&lt;em>Production Observability&lt;/em>: Built-in tracing, logging, and monitoring capabilities essential for maintaining agents in production&lt;/li>
&lt;li>&lt;em>Universal Compatibility&lt;/em>: Tool-first design ensuring seamless integration with LangGraph, CrewAI, FastMCP, and other major agent frameworks&lt;/li>
&lt;li>&lt;em>Enterprise-Ready Features&lt;/em>: Session management, checkpointing, error recovery, and scalable deployment patterns tested in production environments&lt;/li>
&lt;/ul>
&lt;h3> Practical Outcomes &lt;/h3>
&lt;p>Each participant will complete the tutorial with:&lt;/p>
&lt;ul>
&lt;li>A fully functional MCP server with custom tools deployed to a container platform&lt;/li>
&lt;li>A streaming AI agent with enterprise authentication and conversation persistence&lt;/li>
&lt;li>Access to reusable template repositories with comprehensive documentation&lt;/li>
&lt;li>Automation scripts for rapid customization and deployment&lt;/li>
&lt;li>Best practices documentation for maintaining agentic systems in production&lt;/li>
&lt;/ul>
&lt;p>By providing open-source, extensible templates rather than rigid frameworks, we enable teams to rapidly prototype while maintaining production standards, significantly accelerating the adoption of agentic solutions in software engineering workflows.&lt;/p>
&lt;h3> Open Source Commitment &lt;/h3>
&lt;p>Both templates are actively maintained and openly available:&lt;/p>
&lt;ul>
&lt;li>MCP Server Template: &lt;a href="https://github.com/redhat-data-and-ai/template-mcp-server">https://github.com/redhat-data-and-ai/template-mcp-server&lt;/a>&lt;/li>
&lt;li>Agent Template: &lt;a href="https://github.com/redhat-data-and-ai/template-agent">https://github.com/redhat-data-and-ai/template-agent&lt;/a>&lt;/li>
&lt;/ul>
&lt;p>These repositories include comprehensive documentation, example implementations, and deployment manifests, enabling participants to immediately apply tutorial learnings in their organizations.&lt;/p></description></item><item><title>[PYCON DE &amp; PYDATA 2025] Enhancing RAG with Fast GraphRAG and InstructLab - A Scalable, Interpretable, and Efficient Framework</title><link>https://tuhinsharma.netlify.app/talks/pycondepydata2025/</link><pubDate>Wed, 23 Apr 2025 00:00:00 +0000</pubDate><guid>https://tuhinsharma.netlify.app/talks/pycondepydata2025/</guid><description>&lt;h3>Description&lt;/h3>
&lt;p>Retrieval Augmented Generation (RAG) has changed the way AI systems incorporate external knowledge, but it often falls
short when faced with real-world challenges like adapting to new data, managing complexity, or delivering reliable
answers. Fast GraphRAG steps in to address these gaps with a refreshing approach that blends the structure of knowledge
graphs with the proven efficiency of algorithms like PageRank. By focusing on interpretability, scalability, and
adaptability, Fast GraphRAG creates a pathway for building AI systems that don’t just retrieve data but leverage it in a
meaningful way.&lt;/p>
&lt;p>The agenda for the talk is as follows:&lt;/p>
&lt;p>Challenges in Traditional RAG&lt;/p>
&lt;ul>
&lt;li>Lack of interpretability leads to untrustworthy outputs.&lt;/li>
&lt;li>High computational costs limit scalability.&lt;/li>
&lt;li>Inflexibility makes adapting to evolving data cumbersome.&lt;/li>
&lt;/ul>
&lt;p>Fast GraphRAG’s Core Innovations&lt;/p>
&lt;ul>
&lt;li>Interpretability: Knowledge graphs provide clear, traceable reasoning.&lt;/li>
&lt;li>Scalability: Efficient query resolution with minimal overhead.&lt;/li>
&lt;li>Adaptability: Dynamic updates ensure relevance in changing domains.&lt;/li>
&lt;li>Precision: PageRank sharpens focus on high-value information.&lt;/li>
&lt;li>Robust Workflows: Typed and asynchronous handling for complex scenarios.&lt;/li>
&lt;/ul>
&lt;p>How Fast GraphRAG Works&lt;/p>
&lt;ul>
&lt;li>Architecture and algorithmic innovations.&lt;/li>
&lt;li>Knowledge graphs for intelligent reasoning.&lt;/li>
&lt;li>PageRank for multi-hop exploration and precise retrieval.&lt;/li>
&lt;li>Entity extraction, incremental updates, and graph exploration.&lt;/li>
&lt;li>Role of InstructLab and fine-tuning.&lt;/li>
&lt;/ul>
&lt;p>Demo and Practical Takeaways&lt;/p>
&lt;ul>
&lt;li>Building a knowledge graph and resolving queries.&lt;/li>
&lt;li>Open-source tools for scaling Fast GraphRAG.&lt;/li>
&lt;li>Real-world applications.&lt;/li>
&lt;/ul>
&lt;p>Fast GraphRAG isn’t just another tool. It&amp;rsquo;s a game-changer for anyone frustrated by the limitations of traditional RAG
systems. By combining the structured clarity of knowledge graphs with the power of algorithms like PageRank and
fine-tuning with InstructLab, it makes retrieval smarter and faster, and the LLM more adaptable. This session will leave
you with a clear understanding of how to build and train AI systems that deliver meaningful results while being
transparent and trustworthy. Whether you’re a developer, a researcher, or just someone passionate about AI, Fast
GraphRAG is a framework that sparks possibilities and redefines what intelligent retrieval can achieve.&lt;/p>
&lt;h2>Presentation Video&lt;/h2>
&lt;div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen="allowfullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/OxlsHjGxMQ8?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"
>&lt;/iframe>
&lt;/div></description></item><item><title>[GIDS 2024] Navigating Innovation with Open Hybrid Cloud and OpenShift AI</title><link>https://tuhinsharma.netlify.app/talks/gidsindia2024/</link><pubDate>Tue, 23 Apr 2024 00:00:00 +0000</pubDate><guid>https://tuhinsharma.netlify.app/talks/gidsindia2024/</guid><description>&lt;h3>Description&lt;/h3>
&lt;p>Cloud computing is witnessing a significant shift, with Hybrid Cloud emerging as the dominant force. In this landscape,
organizations are embracing a multi-cloud approach to manage operations and drive innovation. At the heart of this
transformation is the open-source ecosystem, facilitating the growth of technologies like microservices, containers,
Kubernetes, and AI.&lt;/p>
&lt;p>Tuhin Sharma brings his extensive expertise in AI and NLP to this session. He will explore the crucial role of
open-source in the Hybrid Cloud environment, emphasizing how it enables versatility and innovation across various cloud
platforms.&lt;/p>
&lt;p>A special focus will be on the role of Generative AI and its impact on enhancing efficiency and creativity in cloud
computing. Drawing from real-world experiences and strategies from Red Hat, the talk will provide actionable insights
into leveraging these advanced technologies.&lt;/p>
&lt;p>Attendees will gain a deeper understanding of the current and future trends in cloud computing, learning how to
effectively navigate and utilize these technologies for organizational growth and innovation.&lt;/p>
&lt;p>Presentation Video &lt;/p>
&lt;div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen="allowfullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/XH4SEyH2Aro?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"
>&lt;/iframe>
&lt;/div></description></item><item><title>[GIDS 2023] Pybandit - A Website Optimization Framework for E-commerce SMBs</title><link>https://tuhinsharma.netlify.app/talks/gidsindia2023/</link><pubDate>Fri, 28 Apr 2023 15:10:00 +0000</pubDate><guid>https://tuhinsharma.netlify.app/talks/gidsindia2023/</guid><description>&lt;h3>Description&lt;/h3>
&lt;p>In this talk, Tuhin and Abir discuss the pros and cons of existing website experimentation processes and how the multi-armed bandit algorithm can be used to improve on them. The speakers will also showcase pybandit, an open-source Python library they developed, which provides an out-of-the-box solution for implementing such experimentation processes.&lt;/p>
&lt;p>Presentation Video &lt;/p>
&lt;div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen="allowfullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/X0PBvTH94i0?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"
>&lt;/iframe>
&lt;/div></description></item><item><title>[ODSC EUROPE 2022] Eagleeye - Data Pipeline for Anomaly Detection in Cyber Security</title><link>https://tuhinsharma.netlify.app/talks/odsceurope2022/</link><pubDate>Thu, 16 Jun 2022 16:05:00 +0000</pubDate><guid>https://tuhinsharma.netlify.app/talks/odsceurope2022/</guid><description>&lt;h3>Description&lt;/h3>
&lt;p>Cloud-native applications. Multiple cloud providers. Hybrid cloud. Thousands of VMs and containers. Complex network policies. Millions of connections and requests in any given time window. This is the typical situation faced by a Security Operations Center (SOC) analyst every single day. In this talk, the speaker describes the highly available and highly scalable data pipelines that he built for the following use cases:&lt;/p>
&lt;ul>
&lt;li>Denial of Service: A device in the network stops working.&lt;/li>
&lt;li>Data Loss: A rogue agent in the network transmits IP data outside the network.&lt;/li>
&lt;li>Data Corruption: A device starts sending erroneous data.&lt;/li>
&lt;/ul>
&lt;p>The above can be solved through anomaly detection models. The main challenge here is the data engineering pipeline. With almost 7 billion events occurring every day, processing and storing them for further analysis is a significant challenge. The machine learning models (for anomaly detection) have to be updated every few hours, which requires the pipeline to build the feature store within a very short time window.&lt;/p>
&lt;p>The core components of the data engineering pipeline are:&lt;/p>
&lt;ul>
&lt;li>Apache Flink&lt;/li>
&lt;li>Apache Kafka&lt;/li>
&lt;li>Apache Pinot&lt;/li>
&lt;li>Apache Spark&lt;/li>
&lt;li>MLflow&lt;/li>
&lt;li>Apache Superset&lt;/li>
&lt;/ul>
&lt;p>The event logs are stored in Pinot through a Kafka topic. Pinot supports an Apache Kafka-based indexing service for real-time data ingestion. Pinot has only basic capabilities for computing sliding time window statistics, so more complex real-time statistics are computed using Flink. Apache Flink is a stream-processing engine that provides high throughput and low latency. Spark jobs are used for batch processing. MLflow is used for machine learning model management. Superset is used for visualization.&lt;/p>
&lt;p>The speaker talks through the architectural decisions and shows how to build a modern real-time stream processing data engineering pipeline using the above tools.&lt;/p>
&lt;p>Outline&lt;/p>
&lt;ul>
&lt;li>The problem: overview&lt;/li>
&lt;li>Different Architecture Choices&lt;/li>
&lt;li>The final architecture - a brief explanation&lt;/li>
&lt;li>Real-Time Processing&lt;/li>
&lt;li>Apache Kafka&lt;/li>
&lt;li>Message broker vs Message Queue&lt;/li>
&lt;li>RabbitMQ vs Kafka&lt;/li>
&lt;li>Why Kafka?&lt;/li>
&lt;li>Apache Flink&lt;/li>
&lt;li>Micro-batching vs Streaming&lt;/li>
&lt;li>Flink vs Spark Streaming&lt;/li>
&lt;li>Why Flink?&lt;/li>
&lt;li>Apache Pinot&lt;/li>
&lt;li>OLAP vs OLTP&lt;/li>
&lt;li>Why Pinot?&lt;/li>
&lt;li>Batch Processing&lt;/li>
&lt;li>Apache Spark&lt;/li>
&lt;li>Anomaly detection&lt;/li>
&lt;li>Models&lt;/li>
&lt;li>Data Engineering + Machine Learning&lt;/li>
&lt;li>ML and MLlib&lt;/li>
&lt;li>MLflow - Model management&lt;/li>
&lt;li>Visualization - Superset&lt;/li>
&lt;li>A short demo&lt;/li>
&lt;/ul>
&lt;p>Presentation Video &lt;/p>
&lt;div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen="allowfullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/2YnNlQSO09c?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"
>&lt;/iframe>
&lt;/div></description></item><item><title>[DEVCONF.CZ 2022] Building data pipelines for Anomaly Detection</title><link>https://tuhinsharma.netlify.app/talks/devconfcz2022/</link><pubDate>Fri, 28 Jan 2022 18:00:00 +0000</pubDate><guid>https://tuhinsharma.netlify.app/talks/devconfcz2022/</guid><description>&lt;h3>Description&lt;/h3>
&lt;p>Cloud-native applications. Multiple cloud providers. Hybrid cloud. Thousands of VMs and containers. Complex network policies. Millions of connections and requests in any given time window. This is the typical situation faced by a Security Operations Center (SOC) analyst every single day. In this talk, the speaker describes the highly available and highly scalable data pipelines that he built for the following use cases:&lt;/p>
&lt;ul>
&lt;li>Denial of Service: A device in the network stops working.&lt;/li>
&lt;li>Data Loss: A rogue agent in the network transmits IP data outside the network.&lt;/li>
&lt;li>Data Corruption: A device starts sending erroneous data.&lt;/li>
&lt;/ul>
&lt;p>The above can be solved through anomaly detection models. The main challenge here is the data engineering pipeline. With almost 7 billion events occurring every day, processing and storing them for further analysis is a significant challenge. The machine learning models (for anomaly detection) have to be updated every few hours, which requires the pipeline to build the feature store within a very short time window.&lt;/p>
&lt;p>The core components of the data engineering pipeline are:&lt;/p>
&lt;ul>
&lt;li>Apache Flink&lt;/li>
&lt;li>Apache Kafka&lt;/li>
&lt;li>Apache Pinot&lt;/li>
&lt;li>Apache Spark&lt;/li>
&lt;li>MLflow&lt;/li>
&lt;li>Apache Superset&lt;/li>
&lt;/ul>
&lt;p>The event logs are stored in Pinot through a Kafka topic. Pinot supports an Apache Kafka-based indexing service for real-time data ingestion. Pinot has only basic capabilities for computing sliding time window statistics, so more complex real-time statistics are computed using Flink. Apache Flink is a stream-processing engine that provides high throughput and low latency. Spark jobs are used for batch processing. MLflow is used for machine learning model management. Superset is used for visualization.&lt;/p>
&lt;p>The speaker talks through the architectural decisions and shows how to build a modern real-time stream processing data engineering pipeline using the above tools.&lt;/p>
&lt;p>Outline&lt;/p>
&lt;ul>
&lt;li>The problem: overview&lt;/li>
&lt;li>Different Architecture Choices&lt;/li>
&lt;li>The final architecture - a brief explanation&lt;/li>
&lt;li>Real-Time Processing&lt;/li>
&lt;li>Apache Kafka&lt;/li>
&lt;li>Message broker vs Message Queue&lt;/li>
&lt;li>RabbitMQ vs Kafka&lt;/li>
&lt;li>Why Kafka?&lt;/li>
&lt;li>Apache Flink&lt;/li>
&lt;li>Micro-batching vs Streaming&lt;/li>
&lt;li>Flink vs Spark Streaming&lt;/li>
&lt;li>Why Flink?&lt;/li>
&lt;li>Apache Pinot&lt;/li>
&lt;li>OLAP vs OLTP&lt;/li>
&lt;li>Why Pinot?&lt;/li>
&lt;li>Batch Processing&lt;/li>
&lt;li>Apache Spark&lt;/li>
&lt;li>Anomaly detection&lt;/li>
&lt;li>Models&lt;/li>
&lt;li>Data Engineering + Machine Learning&lt;/li>
&lt;li>ML and MLlib&lt;/li>
&lt;li>MLflow - Model management&lt;/li>
&lt;li>Visualization - Superset&lt;/li>
&lt;li>A short demo&lt;/li>
&lt;/ul>
&lt;p>Presentation Video &lt;/p>
&lt;div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen="allowfullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/wdav9Q6wywI?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"
>&lt;/iframe>
&lt;/div></description></item><item><title>[GIDS 2021] Signature Verification in Banks using Few Shot Learning</title><link>https://tuhinsharma.netlify.app/talks/gidsindia2021/</link><pubDate>Tue, 27 Apr 2021 12:30:00 +0000</pubDate><guid>https://tuhinsharma.netlify.app/talks/gidsindia2021/</guid><description>&lt;h3>Description&lt;/h3>
&lt;p>In a standard image classification task, the input image is fed into a series of layers, and the output is a probability distribution over all the classes. This approach requires a large number of images. In offline signature verification, we do not have enough signatures for each signer, and the total number of signers is huge and changes dynamically. Thus, the cost of data collection and periodic retraining is too high. Few-shot image classification, on the other hand, requires only a few signatures for each signer, hence the name Few Shot.&lt;/p>
&lt;p>The speaker will discuss the following:&lt;/p>
&lt;ul>
&lt;li>Introduction to Few Shot Learning&lt;/li>
&lt;li>The architecture of Siamese Network&lt;/li>
&lt;li>How to train an offline signature verification system using Siamese Networks on real-life data&lt;/li>
&lt;li>Tools to build such a model&lt;/li>
&lt;li>How to deploy such a model on the cloud&lt;/li>
&lt;li>How to serve such a model in real time&lt;/li>
&lt;li>Pros and Cons of our approach&lt;/li>
&lt;/ul>
&lt;p>Learn what few-shot learning is and how to build and deploy such models on the cloud to solve various classification tasks on image data with a very limited amount of data.&lt;/p>
&lt;p>Presentation Video &lt;/p>
&lt;div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen="allowfullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/tqL4DdD21Ac?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"
>&lt;/iframe>
&lt;/div></description></item><item><title>[DATAHACK SUMMIT INDIA 2019] Federated Learning using Deep Learning</title><link>https://tuhinsharma.netlify.app/talks/dhsindia2019/</link><pubDate>Wed, 13 Nov 2019 00:00:00 +0000</pubDate><guid>https://tuhinsharma.netlify.app/talks/dhsindia2019/</guid><description>&lt;h2>Description&lt;/h2>
&lt;p>Federated learning is a family of machine learning algorithms built on one core idea: a connected network of nodes exists, with a central server node. Each node generates data that is used for training as well as for prediction. Each node trains a local model, and only that model is shared with the server, not the data.
In this talk, we discuss how to build deep learning models using federated learning that are truly privacy-preserving. We will show how to build custom algorithms and loss functions.&lt;/p>
&lt;p>Key Takeaways:&lt;/p>
&lt;ul>
&lt;li>Introduction to Federated Learning
&lt;ul>
&lt;li>Decentralized Training&lt;/li>
&lt;li>Encryption&lt;/li>
&lt;li>Differential Privacy&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Federated Learning – Notebook
&lt;ul>
&lt;li>Introduction&lt;/li>
&lt;li>Custom algorithm and loss function&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2>Presentation Video&lt;/h2>
&lt;p>Coming soon!!&lt;/p></description></item><item><title>[OREILLY AI LONDON 2019] Anomaly Detection in Smart Buildings using Federated Learning</title><link>https://tuhinsharma.netlify.app/talks/oreillyailondon2019/</link><pubDate>Thu, 17 Oct 2019 16:00:00 +0000</pubDate><guid>https://tuhinsharma.netlify.app/talks/oreillyailondon2019/</guid><description>&lt;h2>Description&lt;/h2>
&lt;p>A modern smart building has a number of internet-enabled devices: IoT sensors to measure temperature, internet-enabled lighting, IP cameras, IP phones, and so on. Data is generated at scale across all the devices. Two aspects are critical for the network of devices to function well:&lt;/p>
&lt;ul>
&lt;li>Data Quality - The data that is generated has to be correct (typically within an accepted error range).&lt;/li>
&lt;li>Security - With a number of internet-connected devices, securing the network from cyber threats is very important.&lt;/li>
&lt;/ul>
&lt;p>But there are two broad challenges to achieving the above:&lt;/p>
&lt;ul>
&lt;li>The data collected is very sensitive to business operations, so the solution has to be privacy-preserving.&lt;/li>
&lt;li>The amount of data generated is huge, and it is not feasible to upload all of it to the cloud.&lt;/li>
&lt;/ul>
&lt;p>The speakers used federated learning to build anomaly detection models that monitor data quality and cyber security while preserving data privacy.&lt;/p>
&lt;p>Federated learning enables edge devices to collaboratively learn a machine learning model while keeping all of the data on the device itself. Instead of moving data to the cloud, the models are trained on the device and only the model updates are shared across the network. Using federated learning gave us the following advantages:&lt;/p>
&lt;ul>
&lt;li>More accurate, low-latency models: The data are not moved; only the model updates are shared. The resulting models have low latency (since they run on the device) and are also more accurate.&lt;/li>
&lt;li>Privacy Preserving: The data remains on the device.&lt;/li>
&lt;li>Energy Efficient: The workload on the device is drastically reduced, leading to lower power consumption and longer device life.&lt;/li>
&lt;/ul>
&lt;p>The speakers built deep learning models using PyTorch and PySyft.&lt;/p>
&lt;p>The speakers discuss their architecture and also show how federated learning can help improve the models. Federated learning provides a framework to port models across organizations for devices in the same domain. This is something that&amp;rsquo;s not possible in traditional cloud-based anomaly detection models. This makes it easy to deploy with very limited data, and the speakers share some of their success stories.&lt;/p>
&lt;h2>Presentation Video&lt;/h2>
&lt;div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen="allowfullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/6S3R2pOyyxo?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"
>&lt;/iframe>
&lt;/div></description></item><item><title>[ODSC INDIA 2018] Hybrid Recommendation Systems in News Media using Probabilistic Graphical Models</title><link>https://tuhinsharma.netlify.app/talks/odscindia2018/</link><pubDate>Sat, 01 Sep 2018 14:55:00 +0000</pubDate><guid>https://tuhinsharma.netlify.app/talks/odscindia2018/</guid><description>&lt;h2>Description&lt;/h2>
&lt;p>A typical task of recommender systems is to improve customer experience by using prior implicit feedback to provide relevant content from time to time. These systems actively track different kinds of user behavior, such as buying patterns, watching habits, and browsing activity, in order to model user preferences. Unlike the much more extensively explored explicit feedback, implicit feedback gives us no direct input from users about their preferences. Where understanding the content is important, it is non-trivial to explain the recommendations to users.&lt;/p>
&lt;p>When a new customer arrives, it is very difficult for traditional state-of-the-art collaborative filtering systems to provide relevant recommendations, whereas content-based recommendation does not suffer from this problem. On the other hand, content-based recommendation systems perform poorly when the user profile is not well defined, whereas collaborative filtering does not suffer from this problem. So there is a need to combine the strengths of these two approaches in a hybrid recommendation system that addresses both problems more effectively and robustly. Large media and edtech companies in emerging markets are using a version of this approach.&lt;/p>
&lt;h2>Presentation Video&lt;/h2>
&lt;div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen="allowfullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/vFQlKWociS0?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"
>&lt;/iframe>
&lt;/div></description></item></channel></rss>