Tuesday February 11, 2025 5:00pm - 5:50pm PST
Stefan Webb, Zilliz, Developer Advocate

A recent and exciting development in generative AI is the use of language to understand images, video, and sound. One example is multimodal retrieval: using one modality, such as text, to search another, such as images. It is useful not only for search engines that span media types, but also for grounding LLMs in factual data and reducing hallucinations. In this talk, I explain how to build a simple but performant multimodal retrieval pipeline using entirely open-source tools and models: the vector database Milvus, and Hugging Face libraries for modeling and data. I discuss techniques for using multimodal retrieval effectively and increasing recall, as well as some interesting and diverse industry applications.
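As a rough illustration of the retrieval step the abstract describes (this is not code from the talk), the sketch below ranks images against a text query by cosine similarity in a shared embedding space. Everything in it is an assumption made for illustration: the three-dimensional vectors are hand-written stand-ins for what a joint text-image embedding model would produce, the file names are hypothetical, and the brute-force `search` function stands in for the nearest-neighbor search that a vector database such as Milvus would perform at scale.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec, index, top_k=2):
    """Return the names of the top_k items most similar to query_vec.

    Brute-force scan over every stored vector; a vector database would
    replace this with an approximate nearest-neighbor index.
    """
    scored = [(cosine(query_vec, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Hypothetical image embeddings (toy values, not real model output).
IMAGE_INDEX = {
    "dog_on_beach.jpg": [0.9, 0.1, 0.0],
    "city_skyline.jpg": [0.0, 0.2, 0.9],
    "cat_on_sofa.jpg": [0.6, 0.7, 0.1],
}

# Hypothetical embedding of the text query "a dog playing outside",
# assumed to live in the same space as the image embeddings.
TEXT_QUERY = [0.8, 0.2, 0.1]

results = search(TEXT_QUERY, IMAGE_INDEX, top_k=2)
print(results)
```

The text query and the images are compared directly only because they are assumed to share one embedding space; with embeddings from unrelated models the similarity scores would be meaningless.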
Speakers
Stefan Webb

Developer Advocate, Zilliz
Stefan Webb is a Developer Advocate at Zilliz, where he advocates for the open-source vector database Milvus. Prior to this, he spent three years in industry as an Applied ML Researcher at Twitter and Meta, collaborating with product teams to tackle their most complex challenges.
AI DevWorld MAIN STAGE
  AI DevWorld
