Thursday February 13, 2025 10:00am - 10:25am PST
Moshe Twitto, Pliops, Founder & CTO

Organizations are increasingly concerned about limited power budgets in data centers, particularly as AI infrastructure and emerging AI applications raise energy footprints and strain cooling systems. As they scale their AI operations and add GPU compute tiers, escalating power and cooling demands, coupled with significant capital investment in GPUs, are eroding margins. Data centers are struggling to secure the power they need, which puts significant pressure on companies striving to expand their AI capabilities.

In today's LLM inference, GPU prefill operations are heavily compute-bound and largely determine the usable batch size. Because prefill can already fully utilize GPU resources, increasing the batch size beyond a certain point only increases the Time to First Token (TTFT) without improving the prefill rate. GPU decode operations, by contrast, are HBM bandwidth-bound and mainly influenced by model and KV cache sizes; they benefit significantly from larger batch sizes because HBM bandwidth is used more efficiently. Pliops' solution shortens prefill time, allowing larger batch sizes without violating the user SLA for prefill operations. This also improves decode performance, which gains greatly from the increased batch size. As a result, improving prefill time yields nearly proportional improvements in end-to-end throughput.
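The batch-size argument above can be illustrated with a rough back-of-the-envelope model. This is only a sketch under simplifying assumptions, not a description of Pliops' product: prefill time is taken to scale linearly with the number of prompt tokens in the batch, and each decode step is taken to stream the model weights once plus every sequence's KV cache from HBM. The function names and all numeric values are hypothetical and chosen purely for illustration.

```python
import math


def max_batch_under_ttft(ttft_sla_s, prompt_tokens, prefill_tokens_per_s):
    """Largest batch whose prefill fits in the TTFT budget (compute-bound model)."""
    return math.floor(ttft_sla_s * prefill_tokens_per_s / prompt_tokens)


def decode_tokens_per_s(batch, weight_bytes, kv_bytes_per_seq, hbm_bytes_per_s):
    """Bandwidth-bound decode: one step reads the weights once plus each sequence's KV cache."""
    step_time_s = (weight_bytes + batch * kv_bytes_per_seq) / hbm_bytes_per_s
    return batch / step_time_s  # one new token per sequence per step


# Hypothetical, illustrative numbers only (not measurements).
TTFT_SLA_S = 1.0          # assumed user SLA for time to first token
PROMPT_TOKENS = 2000      # assumed prompt length per request
WEIGHT_BYTES = 140e9      # assumed model weight footprint
KV_BYTES_PER_SEQ = 0.5e9  # assumed KV cache size per sequence
HBM_BYTES_PER_S = 3e12    # assumed HBM bandwidth

baseline_batch = max_batch_under_ttft(TTFT_SLA_S, PROMPT_TOKENS, 20_000)
faster_batch = max_batch_under_ttft(TTFT_SLA_S, PROMPT_TOKENS, 80_000)  # assumed 4x faster prefill

baseline_tput = decode_tokens_per_s(baseline_batch, WEIGHT_BYTES, KV_BYTES_PER_SEQ, HBM_BYTES_PER_S)
faster_tput = decode_tokens_per_s(faster_batch, WEIGHT_BYTES, KV_BYTES_PER_SEQ, HBM_BYTES_PER_S)

print(f"batch: {baseline_batch} -> {faster_batch}")
print(f"decode speedup: {faster_tput / baseline_tput:.2f}x")
```

In this toy setting, a 4x faster prefill raises the SLA-compliant batch from 10 to 40 and decode throughput by roughly 3.6x. The gain is close to proportional because the weight reads dominate each decode step and are amortized over more sequences; it falls short of 4x because the KV cache traffic grows with the batch.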
Speakers
Moshe Twitto

Founder & CTO, Pliops
Moshe is the CTO and co-founder of Pliops and an expert in advanced data management and coding algorithms. Prior to co-founding Pliops, Moshe served as CTO of Samsung’s SSD Controller Development Center in Israel. He holds MSEE and BSEE degrees from Technion University, Summa Cum Laude...
AI DevWorld OPEN Stage
