IN1-1 PIM moves forward to accelerate LLM
Jaehyun Park, Staff Engineer,

Abstract
Large language models (LLMs) have transformed modern applications but demand unprecedented computing resources, particularly large memory capacity and high bandwidth for weight processing. While logic process technology has advanced rapidly, memory process scaling has lagged behind, creating a performance bottleneck where LLM decode execution is constrained by memory. To address this, Samsung introduced breakthrough processing-in-memory (PIM) solutions that significantly enhance main memory bandwidth. Using a high-bandwidth GPU cluster with an HBM-PIM system and an LPDDR5-PIM-based system, transformer-based LLMs achieved performance improvements of up to 1.93× and 2.73×, respectively.
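To see why decode is bandwidth-limited, a back-of-the-envelope roofline estimate helps: each generated token must stream essentially all model weights from memory, so tokens-per-second is capped by bandwidth divided by bytes moved per token. The sketch below is illustrative Python; the model size, byte width, and bandwidth figures are assumptions for demonstration, not numbers from the talk.

```python
# Roofline-style upper bound for LLM decode throughput.
# During decode, every generated token must stream (nearly) all model
# weights from memory, so throughput is capped by bandwidth / bytes moved.

def decode_tokens_per_sec(params_billion: float,
                          bytes_per_param: float,
                          bandwidth_gb_s: float) -> float:
    """Upper bound on tokens/s when decode is memory-bandwidth bound."""
    bytes_per_token = params_billion * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token

# Hypothetical numbers for illustration only (not from the abstract):
baseline = decode_tokens_per_sec(7, 2.0, 400)        # FP16 7B model, 400 GB/s
with_pim = decode_tokens_per_sec(7, 2.0, 400 * 2.0)  # PIM doubling effective BW

print(f"baseline : {baseline:6.1f} tok/s")
print(f"with PIM : {with_pim:6.1f} tok/s  ({with_pim / baseline:.2f}x)")
```

Because the bound scales linearly with bandwidth, any PIM-driven increase in effective memory bandwidth translates almost directly into decode throughput, consistent with the speedups reported above.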
IN1-2 The On-device AI Chip Revolution
Jay (Jeongwook) Kim, EVP

Abstract
We are entering the era of “AI Everywhere,” where Artificial Intelligence (AI) is becoming a foundational technology for society, much like electricity or Wi-Fi. AI is evolving beyond a mere tool that offers convenience into “Ambient Intelligence,” deeply integrated and interacting within our daily lives and industrial ecosystems. Amidst this paradigm shift, NPUs (Neural Processing Units), which process vast AI computations in real-time, are no longer an option but have become an indispensable component, akin to the very heart of future technology.

Until now, most AI services have been delivered by relying on cloud servers with immense computing resources. While this model contributed to the initial proliferation of AI technology, it is simultaneously revealing clear technical and structural limitations. Concerns over data security and privacy breaches, which arise from transmitting sensitive data to central servers, are eroding public trust in the technology. Network latency, caused by physical distance, acts as a critical drawback in fields like autonomous driving and robotics where split-second decisions are paramount. Furthermore, the massive operational costs required to transmit and process data from billions of devices to the cloud are becoming a major barrier to the widespread adoption and democratization of AI services. To realize the true potential of automation, hyper-personalization, and real-time intelligent services, a new paradigm that moves beyond the centralized model is urgently needed.

The fundamental solution lies in On-Device AI. This is a technology that embeds the “brain” of AI directly onto the device where data is generated and consumed, enabling it to infer and make decisions independently without a network connection. By bypassing the cloud, On-Device AI innovatively solves the persistent challenges of cost, security, and speed. Data remains on the user’s device, ensuring complete privacy protection; it guarantees instantaneous responsiveness with zero physical latency; and it maximizes the economic viability of AI technology by eliminating unnecessary data transmission costs.

At this major inflection point, as the global AI market shifts from the cloud to the device, we at DeepX are confident in the infinite potential of the exponentially growing On-Device AI market. We are not merely following this trend; rather, we aim to lead this massive transformation through our core technologies and become a key player in pioneering this new era.
IN2-1 Printed Deformable Electronic Devices for Practical Uses
Unyong Jeong, POSTECH/MiDAS H&T

Abstract
Tactile sensors have been attracting increasing attention for use in wearable healthcare devices and electronic skins for robots. In the fabrication of deformable tactile sensors, there have been two approaches: electronic sensors and iontronic sensors. Electronic tactile sensors monitor the electrical changes (resistance, capacitance, inductance, voltage, current) resulting from temperature change and mechanical deformation, either with or without selectors (diodes, transistors) and signal modulators (ring oscillators, analog-to-digital converters). Iontronic tactile sensors detect the electrical changes (voltage, capacitance, current) caused by charge distribution and ion transport in an electrolyte-containing medium, either with or without synaptic units used for signal modulation. In both approaches, several technological trends are emerging and competing: time-division multiple access (TDMA) versus event-driven parallel collection, passive sensing versus active sensing, and multifunctionality with clear decoupling between functions. In this talk, I discuss some of the achievements in both electronic and iontronic tactile sensors and compare the pros and cons of the two approaches in terms of data collection, power supply, structural simplicity, and multifunctionality.
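To make the data-collection contrast concrete, the following toy Python sketch (an illustration, not code from the talk) compares TDMA-style readout, which samples every taxel once per frame in fixed time slots, with event-driven collection, which reports only taxels whose values change beyond a threshold.

```python
import numpy as np

def tdma_scan(frame: np.ndarray):
    """TDMA-style readout: visit every taxel once per frame, in row order."""
    readings = []
    rows, cols = frame.shape
    for r in range(rows):          # each row gets its own time slot
        for c in range(cols):
            readings.append(((r, c), frame[r, c]))
    return readings                # always rows*cols samples per frame

def event_driven(prev: np.ndarray, curr: np.ndarray, thresh: float = 0.05):
    """Event-driven readout: report only taxels that changed noticeably."""
    changed = np.argwhere(np.abs(curr - prev) > thresh)
    return [((r, c), curr[r, c]) for r, c in changed]

rng = np.random.default_rng(0)
prev = rng.random((8, 8))
curr = prev.copy()
curr[2, 3] += 0.2                        # a single touch event
print(len(tdma_scan(curr)))              # 64 samples regardless of activity
print(len(event_driven(prev, curr)))     # 1 sample: only the touched taxel
```

The trade-off the talk weighs follows directly: TDMA keeps the readout circuit simple and deterministic, while event-driven collection cuts data volume and power when the skin is mostly idle.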
IN2-2 Tactile Sensing for Advanced Automation
Alexander Schmitz, XELA Robotics

Abstract
XELA provides the human sense of touch to robots. Our “uSkin” tactile sensors enable robots to perform tasks that previously only humans could do, such as gently grasping or inserting objects, e.g., in assembly or warehouse automation. uSkin is small and easy to integrate into existing grippers and robot hands. The company started in August 2018 as a spin-off from Waseda University and already has more than 50 customers. The uSkin sensor is a 3-axis tactile sensor module with a soft surface. The uSkin “patch” sensor is available in five sizes. uSkin Curved has a curved shape similar to a human fingertip. uSkin Multi-Bend can be bent to attach the sensor to curved objects. We also integrate our sensors into robot hands and grippers; our robot hand has 368 tri-axis sensors. We can also modify uSkin to suit your needs.
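As a data-level illustration of what a tri-axis taxel reports, here is a hypothetical Python sketch; the class, fields, and friction check are assumptions for exposition and do not represent XELA's actual API or data format.

```python
from dataclasses import dataclass
import math

@dataclass
class TaxelReading:
    """One 3-axis taxel: fx/fy are shear (tangential) forces, fz is normal."""
    fx: float
    fy: float
    fz: float

    def shear_magnitude(self) -> float:
        return math.hypot(self.fx, self.fy)

    def slipping(self, friction_coeff: float = 0.6) -> bool:
        # Coulomb friction-cone check: shear exceeding mu * normal force
        # suggests the grasped object is starting to slip.
        return self.shear_magnitude() > friction_coeff * self.fz

t = TaxelReading(fx=0.5, fy=0.2, fz=0.7)
print(t.shear_magnitude(), t.slipping())  # ~0.539, True (0.539 > 0.42)
```

Sensing shear alongside normal force is what distinguishes a 3-axis skin from a simple pressure array: it lets a gripper detect incipient slip and tighten its grasp before an object falls.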
IN2-3 Edge AI starts in the sensor
Mustahsin Adib, AMS Osram

Abstract
Industrial AI promises efficiency, automation, and scalability, but its success is fundamentally constrained by the quality and intelligence of the sensors feeding it. While the AI market is expanding rapidly, the sensor market evolves at a different pace, creating a gap between AI’s potential and the realities of data collection. Traditional sensors act as passive data sources, often producing redundant or noisy streams that overload AI systems with resource-intensive preprocessing and validation tasks. The complexity lies in bridging this gap: industrial environments demand robust, efficient, and reliable data flows, yet testing, validation, and calibration grow more challenging as sensors integrate AI-like intelligence. The resolution is a new generation of AI-ready sensors that preprocess, compress, and filter information at the edge, reducing energy, latency, and computational costs. This talk will outline the implementation challenges, ams-OSRAM’s approach, and a call to action for the sensor industry.
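A minimal sketch of the edge-preprocessing idea, in illustrative Python (not ams-OSRAM's implementation): the sensor smooths its raw stream locally and transmits a sample upstream only when it carries new information.

```python
from collections import deque

class EdgeFilter:
    """Toy 'AI-ready sensor' front end: denoise locally, emit only changes."""

    def __init__(self, window: int = 5, delta: float = 0.1):
        self.buf = deque(maxlen=window)   # moving-average denoiser
        self.delta = delta                # minimum change worth transmitting
        self.last_sent = None

    def process(self, raw: float):
        self.buf.append(raw)
        smoothed = sum(self.buf) / len(self.buf)
        if self.last_sent is None or abs(smoothed - self.last_sent) > self.delta:
            self.last_sent = smoothed
            return smoothed               # transmit upstream
        return None                       # suppress redundant sample

sensor = EdgeFilter()
stream = [1.00, 1.01, 0.99, 1.00, 1.02, 1.80, 1.82, 1.81]
sent = [s for raw in stream if (s := sensor.process(raw)) is not None]
print(f"transmitted {len(sent)} of {len(stream)} samples: {sent}")
```

Even this crude filter drops the steady-state samples while preserving the step change, which is exactly the energy, latency, and bandwidth saving the talk argues AI-ready sensors should deliver at the source.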
IN2-4 Metaphotonics-Enhanced CMOS Image Sensor: The Future of Imaging and Sensing
Radwan Siddique, Samsung Semiconductor, Inc.

Abstract
The CMOS image sensor (CIS) market is rapidly expanding, driven by demand for smaller, smarter, and more capable imaging systems across mobile, automotive, industrial, and AI vision applications. As CIS faces growing challenges in pixel miniaturization, performance, and functionality, metaphotonics (nanoscale photonic structures integrated directly on-chip) offers a transformative path forward.
IN3-1 End-to-End Optimization for Efficient LLM Inference: From Model Compression to Hardware Architecture
Gunho Park, Research Scientist,

Abstract
Large language models deliver impressive capabilities but at high cost in memory traffic, compute, and latency. We present an end-to-end approach to efficient LLM inference that begins with model compression algorithms and extends through CUDA kernels, runtime policy, and hardware architecture. On the algorithmic side, we revisit quantization beyond fixed-bit settings and introduce formats that enable multi-precision operation with minimal accuracy loss. These formats are co-designed with custom CUDA kernels to reduce latency and support dynamic, per-request precision selection with negligible overhead. We then map the resulting compute and bandwidth profiles onto custom hardware to fully realize the efficiency gains. Case studies on production-scale LLMs show gains in tokens-per-second and energy efficiency, alongside reductions in memory footprint and total cost of ownership. The result is a practical recipe that links compression decisions to kernel behavior and, ultimately, to architectural implications, enabling consistent, deployable improvements.
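To illustrate the runtime-policy layer, here is a hedged sketch in Python; the quantizer, bit-width policy, and thresholds are hypothetical stand-ins for the multi-precision formats and per-request precision selection described in the abstract.

```python
import numpy as np

def quantize(weights: np.ndarray, bits: int):
    """Uniform symmetric quantization to `bits` (toy stand-in for the
    multi-precision formats described in the talk)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(weights).max() / qmax
    q = np.clip(np.round(weights / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def select_bits(latency_budget_ms: float) -> int:
    """Runtime policy: tighter latency budgets get lower precision."""
    return 4 if latency_budget_ms < 20 else 8

w = np.random.default_rng(0).standard_normal(1024).astype(np.float32)
for budget in (10.0, 50.0):
    bits = select_bits(budget)
    q, scale = quantize(w, bits)
    err = np.abs(w - q.astype(np.float32) * scale).mean()
    print(f"budget={budget:4.0f} ms -> {bits}-bit, mean abs error {err:.4f}")
```

The point of the co-design argument is that such a per-request switch is only cheap if the kernels consume all supported bit-widths from one shared weight format, so no requantization or extra memory copy sits on the critical path.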
IN3-2 Digital Twin-Driven Chip-to-System Co-Design
Tai Sik Yang, Professional,

Abstract
This presentation introduces a digital twin-based chip-to-system co-design approach for optimizing high-speed interfaces in AI systems. By enabling performance and cost optimization at the design stage, the methodology reduces design iterations and accelerates product development, supporting efficient and timely delivery of AI solutions.