Exynos On-device AI
Dr. Hyukjune Chung
Biography
2020 – Current: Samsung Electronics
2005 – 2020: Qualcomm
Ph.D., University of Southern California, Department of Electrical Engineering, 2004

Abstract
In this talk, I will focus on on-device AI computing. I will present the evolution of Exynos AI HW/SW solutions, the challenges of supporting generative AI processing, and systematic approaches to overcoming these challenges.
TBD
Frank Schirrmeister
Biography
Frank Schirrmeister is Executive Director, Strategic Programs, System Solutions in Synopsys' System Design Group. He leads strategic activities across system software and hardware-assisted development for industries such as automotive, data center, and 5G/6G communications, as well as for horizontals such as Artificial Intelligence / Machine Learning. Prior to Synopsys, Frank held various senior leadership positions at Arteris, Cadence Design Systems, Imperas, Chipvision, and SICAN Microelectronics, focusing on product marketing and management, solutions, strategic ecosystem partner initiatives, and customer engagement. He holds an MSEE from the Technical University of Berlin and actively participates in cross-industry initiatives as Chair of the Design Automation Conference's Engineering Tracks.

Abstract
TBD
The Next-Generation AI Accelerator: Redefining Inference for a Sustainable AI Future
Dr. Youngjin Cho
Biography
Youngjin Cho is Vice President of Hardware at FuriosaAI, where he leads the development of the company's AI accelerator hardware.

Abstract
As AI inference becomes ubiquitous infrastructure, the industry faces critical challenges in achieving sustainable and cost-effective computing. Current GPU-based solutions, not purpose-built for inference, suffer from poor energy efficiency that threatens AI scalability. This keynote presents TCP (Tensor Contraction Processor), a domain-specific architecture that elevates tensor contraction to the primitive operation. By introducing low-level einsum notation with explicit memory layout and scheduling, TCP enables unprecedented flexibility through software-defined topology and automated compilation, while achieving high performance. Our silicon implementation, RNGD, delivers 512 TFLOPS (FP8) at 150W TDP for air-cooled data centers, demonstrating 4.1× better first-token latency and 2.7× improved throughput/watt on LLaMA-7B versus GPUs. This presentation will share our journey from concept to commercial deployment, examining how domain-specific design choices enable sustainable AI infrastructure.
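For readers unfamiliar with einsum notation, the tensor-contraction primitive the abstract refers to can be illustrated with NumPy's high-level `einsum`. This is only a sketch of the notation itself; TCP's low-level einsum with explicit memory layout and scheduling is not shown here and the examples below are not FuriosaAI code:

```python
import numpy as np

# A plain matrix multiply expressed as a tensor contraction over index k:
A = np.arange(6, dtype=np.float32).reshape(2, 3)   # indices (i, k)
B = np.arange(12, dtype=np.float32).reshape(3, 4)  # indices (k, j)
C = np.einsum("ik,kj->ij", A, B)                   # contract over shared index k
assert np.allclose(C, A @ B)

# An attention-style batched contraction (batch b, heads h, illustrative shapes):
Q = np.random.rand(2, 4, 8, 16).astype(np.float32)  # (b, h, query, d)
K = np.random.rand(2, 4, 8, 16).astype(np.float32)  # (b, h, key, d)
scores = np.einsum("bhqd,bhkd->bhqk", Q, K)         # contract over feature dim d
assert scores.shape == (2, 4, 8, 8)
```

The appeal of treating contraction as the primitive, as the abstract argues, is that one notation covers matrix multiply, batched attention, and convolution-like operations, giving a compiler a single operation to schedule and lay out in memory.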