Program

Industry Sessions
Industrial Session 1
PIM, LLM, and the On-device AI Chip Revolution
Oct. 13th | 13:30-14:45

IN1-1

PIM moves forward to accelerate LLM

Jaehyun Park

Staff Engineer,
Samsung Electronics, Korea

Abstract

Large language models (LLMs) have transformed modern applications but demand unprecedented computing resources, particularly large memory capacity and high bandwidth for weight processing. While logic process technology has advanced rapidly, memory process scaling has lagged behind, creating a performance bottleneck where LLM decode execution is constrained by memory. To address this, Samsung introduced breakthrough processing-in-memory (PIM) solutions that significantly enhance main memory bandwidth. Using a high-bandwidth GPU cluster with an HBM-PIM system and an LPDDR5-PIM–based system, transformer-based LLMs achieved performance improvements of up to 1.93× and 2.73×, respectively.
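The memory-bound decode behavior described in the abstract can be illustrated with back-of-the-envelope arithmetic. The sketch below is not from the talk: the model size and bandwidth figures are illustrative assumptions; only the 1.93× HBM-PIM speedup is taken from the abstract.

```python
def decode_tokens_per_sec(params_billions, bytes_per_param, bandwidth_gb_s):
    """Upper bound on single-stream decode throughput when every weight
    must be streamed from memory once per generated token."""
    weight_bytes = params_billions * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / weight_bytes

# Illustrative: a 7B-parameter model with FP16 weights on a 1 TB/s memory system.
base = decode_tokens_per_sec(7, 2, 1000)
# Applying the up-to-1.93x effective-bandwidth gain reported for HBM-PIM:
with_pim = base * 1.93
print(f"{base:.1f} tok/s -> {with_pim:.1f} tok/s")
```

Because decode throughput scales directly with effective memory bandwidth in this regime, any PIM bandwidth multiplier translates almost one-to-one into tokens per second.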

IN1-2

The On-device AI Chip Revolution

Jay(Jeongwook) Kim

EVP
DeepX, Korea

Abstract

We are entering the era of “AI Everywhere,” where Artificial Intelligence (AI) is becoming a foundational technology for society, much like electricity or Wi-Fi. AI is evolving beyond a mere tool that offers convenience into “Ambient Intelligence,” deeply integrated and interacting within our daily lives and industrial ecosystems. Amidst this paradigm shift, NPUs (Neural Processing Units), which process vast AI computations in real time, are no longer an option but have become an indispensable component, akin to the very heart of future technology.

Until now, most AI services have been delivered by relying on cloud servers with immense computing resources. While this model contributed to the initial proliferation of AI technology, it is simultaneously revealing clear technical and structural limitations. Concerns over data security and privacy breaches, which arise from transmitting sensitive data to central servers, are eroding public trust in the technology. Network latency, caused by physical distance, acts as a critical drawback in fields like autonomous driving and robotics where split-second decisions are paramount. Furthermore, the massive operational costs required to transmit and process data from billions of devices to the cloud are becoming a major barrier to the widespread adoption and democratization of AI services. To realize the true potential of automation, hyper-personalization, and real-time intelligent services, a new paradigm that moves beyond the centralized model is urgently needed.

The fundamental solution lies in On-Device AI. This is a technology that embeds the “brain” of AI directly onto the device where data is generated and consumed, enabling it to infer and make decisions independently without a network connection. By bypassing the cloud, On-Device AI innovatively solves the persistent challenges of cost, security, and speed. Data remains on the user’s device, ensuring complete privacy protection; it guarantees instantaneous responsiveness with zero physical latency; and it maximizes the economic viability of AI technology by eliminating unnecessary data transmission costs.

At this major inflection point, as the global AI market shifts from the cloud to the device, we at DeepX are confident in the infinite potential of the exponentially growing On-Device AI market. We are not merely following this trend; rather, we aim to lead this massive transformation through our core technologies and become a key player in pioneering this new era.

Industrial Session 2
Emerging Sensor Technologies for Physical AI
Oct. 14th | 15:00-16:15

IN2-1

Printed Deformable Electronic Devices for Practical Uses

Unyong Jeong

POSTECH/MiDAS H&T

Abstract

Tactile sensors have been attracting increasing attention for use in wearable healthcare devices and in electronic skins for robots. Two approaches dominate the fabrication of deformable tactile sensors: electronic sensors and iontronic sensors. Electronic tactile sensors monitor the electrical changes (resistance, capacitance, inductance, voltage, current) resulting from temperature change and mechanical deformation, with or without selectors (diodes, transistors) and signal modulators (ring oscillators, analog-to-digital converters). Iontronic tactile sensors detect the electrical changes (voltage, capacitance, current) caused by charge distribution and ion transport in an electrolyte-containing medium, with or without synaptic units used for signal modulation. In both approaches, several technological trends are emerging and competing: time-division multiple access (TDMA) versus event-driven parallel collection, passive sensing versus active sensing, and multifunctionality with clear decoupling between the functions. In this talk, I discuss some of the achievements in both electronic and iontronic tactile sensors and compare the pros and cons of the two approaches in terms of data collection, power supply, structural simplicity, and multifunctionality.

IN2-2

Tactile Sensing for Advanced Automation

Alexander Schmitz

XELA Robotics

Abstract

XELA provides the human sense of touch to robots. Our “uSkin” tactile sensors enable robots to perform tasks that previously only humans could do, such as gently grasping or inserting objects in assembly or warehouse automation. uSkin is small and easy to integrate into existing grippers and robot hands. The company started in August 2018 as a spin-off from Waseda University and already has more than 50 customers. The uSkin sensor is a 3-axis tactile sensor module with a soft surface. The uSkin “patch” sensor is available in five sizes; uSkin Curved has a curved shape similar to a human fingertip; and uSkin Multi-Bend can be bent to attach the sensor to curved objects. We also integrate our sensors into robot hands and grippers; our robot hand carries 368 tri-axis sensors. We can also modify uSkin to suit your needs.

IN2-3

Edge AI starts in the sensor

Mustahsin Adib

AMS Osram

Abstract

Industrial AI promises efficiency, automation, and scalability, but its success is fundamentally constrained by the quality and intelligence of the sensors feeding it. While the AI market is expanding rapidly, the sensor market evolves at a different pace—creating a gap between AI’s potential and the realities of data collection. Traditional sensors act as passive data sources, often producing redundant or noisy streams that overload AI systems with resource-intensive preprocessing and validation tasks. The complexity lies in bridging this gap: industrial environments demand robust, efficient, and reliable data flows, yet testing, validation, and calibration grow more challenging as sensors integrate AI-like intelligence. The resolution is a new generation of AI-ready sensors that preprocess, compress, and filter information at the edge—reducing energy, latency, and computational costs. This talk will outline the implementation challenge, ams-OSRAM’s approach and a call for action for the sensor industry.

IN2-4

Metaphotonics-Enhanced CMOS Image Sensor: The Future of Imaging and Sensing

Radwan Siddique

Samsung Semiconductor, Inc

Abstract

The CMOS image sensor (CIS) market is rapidly expanding, driven by demand for smaller, smarter, and more capable imaging systems across mobile, automotive, industrial, and AI vision applications. As CIS faces growing challenges in pixel miniaturization, performance, and functionality, metaphotonics – nanoscale photonic structures integrated directly on-chip – offers a transformative path forward.
This talk will introduce the role of metaphotonics-enhanced CIS, showing how it enables performance gains, supports higher pixel resolution, and enhances AI vision capabilities. Examples will highlight how metaphotonics improves light control, reduces crosstalk, and enables new features such as spectral tuning and polarization detection on-chip. By co-embedding optical-electrical intelligence at the pixel level, metaphotonics is redefining CIS from imaging devices into multifunctional sensing platforms.

Industrial Session 3
End-to-End Optimization for AI Systems: LLM Efficiency and Chip-to-System Co-Design
Oct. 14th | 16:30-17:45

IN3-1

End-to-End Optimization for Efficient LLM Inference: From Model Compression to Hardware Architecture

Gunho Park

Research Scientist,
AI Computing Solution (NAVER Cloud), Korea

Abstract

Large language models deliver impressive capabilities but at high cost in memory traffic, compute, and latency. We present an end-to-end approach to efficient LLM inference that begins with model-compression algorithms and extends through CUDA kernels, runtime policy, and hardware architecture. On the algorithmic side, we revisit quantization beyond fixed-bit settings and introduce formats that enable multi-precision operation with minimal accuracy loss. These formats are co-designed with custom CUDA kernels to reduce latency and to support dynamic, per-request precision selection with negligible overhead. We then map the resulting compute and bandwidth profiles onto custom hardware to fully realize the efficiency gains. Case studies on production-scale LLMs show gains in tokens-per-second and energy efficiency, alongside reductions in memory footprint and total cost of ownership. The result is a practical recipe that links compression decisions to kernel behavior and, ultimately, to architectural implications—enabling consistent, deployable improvements.
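For context on the fixed-bit baseline the talk moves beyond, here is a minimal sketch of conventional symmetric per-tensor quantization at a selectable bit-width. This is a generic illustration, not the speaker's format; all function names and parameter choices are hypothetical.

```python
import numpy as np

def quantize(w, bits):
    """Symmetric per-tensor quantization of weights to a chosen bit-width."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax          # one scale for the whole tensor
    q = np.round(w / scale)
    q = np.clip(q, -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximation of the original weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)
for bits in (8, 4):
    q, s = quantize(w, bits)
    err = np.abs(dequantize(q, s) - w).mean()
    print(f"{bits}-bit mean abs error: {err:.4f}")
```

Under a single fixed bit-width, accuracy degrades sharply at low precision; the multi-precision formats described in the abstract aim to let the runtime pick a precision per request instead of committing to one such setting offline.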

IN3-2

Digital Twin-Driven Chip-to-System Co-Design

Tai SiK Yang

Professional,
LG Electronics CTO SoC R&D Center

Abstract

This presentation introduces a digital twin-based chip-to-system co-design approach for optimizing high-speed interfaces in AI systems. By enabling performance and cost optimization at the design stage, the methodology reduces design iterations and accelerates product development, supporting efficient and timely delivery of AI solutions.