PhD Candidate, HKUST ECE / ACCESS
Kunming SHAO 邵堃明
I am a PhD candidate at The Hong Kong University of Science and Technology (HKUST) and the AI Chip Center for Emerging Smart Systems (ACCESS), supervised by Prof. Chi-Ying TSUI and Prof. Tim Kwang-Ting CHENG. My research connects digital compute-in-memory circuits, AI accelerator architecture, compiler automation, efficient ML systems, edge LLM agents, and quantization.
I expect to graduate in Summer 2027 and am actively exploring postdoctoral opportunities in AI hardware, compute-in-memory, and efficient intelligent systems.
HKUST ECE / ACCESS
Excellent graduation thesis
Fellowship and academic awards
Research Focus
Hardware-efficient intelligence from circuit to system
Digital CIM and ReRAM
Macro architecture, robust in-memory computation, hybrid SRAM/ReRAM cells, and design automation for digital compute-in-memory.
AI accelerators
Energy-efficient accelerators for DNNs, SNNs, Transformers, FlashAttention, stochastic/approximate computing, and FP8 computation.
ML systems and agents
Edge RAG, wearable medical LLM agents, retrieval acceleration, quantization, and hardware-aware deployment of efficient models.
Selected Threads
Recent publication directions
ESSERC'26 SwiftCIM: ReRAM-coupled digital CIM accelerator for FlashAttention dataflow.
IEEE TVLSI FP8 DCIM: Shift-aware aligned-mantissa bitwidth prediction for efficient FP8 computation on digital CIM.
DATE'26 DS-CIM: Digital stochastic CIM with accurate OR-accumulation for edge AI models.
BioCAS'25 Wearable Medical RAG: Memory-efficient retrieval architecture for RAG-enabled wearable medical LLM agents.
ISLPED'25 DIRC-RAG: High-density digital In-ReRAM computation for edge RAG retrieval.
DATE'25 SynDCIM: Performance-aware DCIM compiler with multi-spec subcircuit synthesis.
News
Latest updates
RedBird Award: I received the HKUST RedBird Academic Excellence Award.
ESSERC'26: My first-authored paper SwiftCIM: a 55nm 23.2μJ/Token L-0.5 ReRAM Coupled Digital CIM Accelerator with Fully-Fused Multi-Head Attention Dataflow for FlashAttention was accepted.
TVLSI: The paper I led and co-first-authored, Balancing FP8 Computation Accuracy and Efficiency on Digital CIM via Shift-Aware On-the-fly Aligned-Mantissa Bitwidth Prediction, was accepted.
US Patent: Our co-authored patent on a hybrid computing-in-memory device and multi-level sensing method was approved and published.
Chinese Patent: Our co-authored patent on a hybrid CIM device and multi-level data-bit sensing method was approved and published.
DATE'26: My first-authored paper DS-CIM: Digital Stochastic Computing-In-Memory Featuring Accurate OR-Accumulation via Sample Region Remapping for Edge AI Models was accepted.
A-SSCC'25: Our co-authored paper Lemem: A 179.8TFLOPS/W, 24.21TFLOPS Learning-In-Memory Processor with Layer-Fused Forward/Backward Pipeline for Edge DNN/SNN Training/Inference was accepted.
BioCAS'25: The paper I led and co-first-authored, A Memory-Efficient Retrieval Architecture for RAG-Enabled Wearable Medical LLMs-Agents, was accepted.
TCAD: Our co-authored paper Configurable Dataflow and Adaptive Mapping Optimization for Hybrid ReRAM and SRAM Compute-in-Memory Accelerator was accepted.
CASS Travel Grant: I received the IEEE CASS Student Travel Grant.
ISLPED'25: My first-authored paper DIRC-RAG: Accelerating Edge RAG with Robust High-Density and High-Loading-Bandwidth Digital In-ReRAM Computation was accepted.
DAC'25 WIP: My work-in-progress poster on AI accelerators based on approximate computing was accepted.
CICC'25: Our co-first-authored paper E-NPU: A 34~126nJ/Class Event-Driven Adaptive Neural SoC with Signal-Dynamics-Aware Feature Clustering and Multi-Model In-Memory Inference/Training for Personalized Medical Wearables was accepted.
ISCAS'25: Our co-first-authored paper A Flexible Precision Scaling Deep Neural Network Accelerator with Efficient Weight Combination was accepted.
PQE: I passed the PhD Qualifying Examination and advanced to PhD candidacy.
DATE'25: My first-authored paper SynDCIM: A Performance-Aware Digital Computing-in-Memory Compiler with Multi-Spec-Oriented Subcircuit Synthesis was accepted.
ICCAD'24: Our co-authored paper ReSCIM: Variation-Resilient High Weight-Loading Bandwidth In-Memory Computation Based on Fine-Grained Hybrid Integration of Multi-Level ReRAM and SRAM Cells was accepted.
Thesis: My BEng thesis, Digital Compute-In-Memory Automatic Design Methodology, was selected as an excellent graduation project at SCUT.
HKPFS & RedBird: I received the Hong Kong PhD Fellowship and HKUST RedBird Award.
DAC'23: Our co-authored paper AutoDCIM: An Automated Digital CIM Compiler was accepted.
Collaborations
Research group and joint projects
I coordinate a multi-institution collaboration across HKUST, SCUT, Westlake University, SYSU, WHU, and other partners on in-memory computation, approximate computing, efficient algorithms, and emerging non-volatile memories. The projects below highlight first/co-first author and corresponding-author roles.
Kunming Shao and Fengshi Tian, HKUST.
Liang Zhao, SCUT, and Kunming Shao, HKUST.
Fengshi Tian, HKUST; Jinbo Chen, Westlake; Kunming Shao, HKUST.
Kunming Shao, HKUST, and Liang Zhao, SCUT.
Kunming Shao, HKUST; Zhipeng Liao, Westlake; Jiangnan Yu and Xiaomeng Wang, HKUST.
Zhipeng Liao, Westlake, and Kunming Shao, HKUST.
Kunming Shao, HKUST, and Liang Zhao, SCUT.
Liang Zhao, SCUT, and Kunming Shao, HKUST.
Kunming Shao and Xiaomeng Wang, HKUST.
