Chao-Tsung Huang, NTHU EE

Chao-Tsung Huang

Research

Active Projects

Personal High-performance Computing System for Image Generative AI Inference
Image-generative artificial intelligence (Image-GenAI) has achieved unprecedented success in realistic text-to-image synthesis, image editing, and image restoration. However, real-time inference with high-quality diffusion models on AI PCs demands an exceptionally high level of computing capability. In this project, we aim to resolve this issue by developing a multi-chip-module system optimized for Image-GenAI models. In particular, we target a chiplet design with a tiled architecture and a network-on-chip mesh for 2.5D heterogeneous integration. This project is supported by NSTC.

Deep Learning Accelerator for Image Deblurring
In this project, we aim to provide high-quality image deblurring by supporting a large-receptive-field UNet and global channel attention. We will also apply model-architecture co-design to reduce the huge demand for DRAM bandwidth. This project is supported by Novatek. [APSIPA ASC'23]

eCNN: Embedded Convolutional Network Processor
Convolutional neural networks (CNNs) have recently made tremendous progress not only on artificial intelligence tasks, such as object recognition and classification, but also on image processing applications, including denoising and super-resolution. Solving all these vision tasks in one single CNN processor is highly promising; however, their extremely demanding complexity has so far prevented their use on embedded devices. In this project, we aim to propose a series of novel CNN processors to achieve real-time, low-power, and high-resolution vision tasks. This project was first supported by MOST (Semiconductor Moonshot Project) and continues as an NSTC project. [eCNN, MICRO'19][ERNet, ICASSP'20][RingCNN, ISCA'21][AICAS'21][ESSCIRC'22/SSC-L][Falcon, TVLSI][EDA, ISSCC'25]

Completed Projects

Advanced Embedded Computing
In this project, we aimed to push CNN acceleration to an advanced level: 8K UHD applications with novel model features, such as video processing, instance normalization, and channel attention. To achieve high energy efficiency and high computing performance simultaneously, we applied a holistic approach to jointly optimize hardware-oriented models, energy-efficient architecture, and 16nm chip implementation. This project was supported by MOST and TSMC. [SiPS'21][VISTA, ISSCC'23/JSSC][PFNPU, ASSCC'24][STEP, ISSCC'25]

Factored Light-Field Display
Light-field display is an emerging method for true 3D display devices, but the bottleneck for commercialization is its low-resolution nature. Factored display is a technique to increase the resolution by using multiple LCD layers and complex light-field factorization. In this project, we will build a prototype factored display to study how its system parameters affect visual quality and will also develop high-quality and cost-effective factorization algorithms for real-time and low-power implementation. This project was supported by Raydium. [ISSCC'22/TVLSI, TIP]
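The idea of light-field factorization can be illustrated with a minimal sketch. It assumes a simplified two-layer, single-frame model in which the ray through front-layer pixel i and rear-layer pixel j has intensity front[i] * rear[j]. This model, the function names, and the alternating least-squares solver are illustrative assumptions, not the project's algorithm.

```python
def factor_two_layers(target, iterations=5):
    """Fit target[i][j] ~= front[i] * rear[j] by alternating least squares.

    Assumes target is non-negative and not all zero (hypothetical toy setup).
    """
    rows, cols = len(target), len(target[0])
    front = [1.0] * rows
    rear = [1.0] * cols
    for _ in range(iterations):
        # Best front layer for the current rear layer.
        rear_energy = sum(r * r for r in rear)
        front = [sum(target[i][j] * rear[j] for j in range(cols)) / rear_energy
                 for i in range(rows)]
        # Best rear layer for the updated front layer.
        front_energy = sum(f * f for f in front)
        rear = [sum(target[i][j] * front[i] for i in range(rows)) / front_energy
                for j in range(cols)]
    return front, rear


def render(front, rear):
    """Light field emitted by two stacked multiplicative layers."""
    return [[f * r for r in rear] for f in front]


# Toy light field that two layers can reproduce exactly (rank 1).
target = [[0.10, 0.20, 0.05],
          [0.40, 0.80, 0.20]]
front, rear = factor_two_layers(target)
shown = render(front, rear)
error = max(abs(target[i][j] - shown[i][j])
            for i in range(len(target)) for j in range(len(target[0])))
```

A single pair of layers can only reproduce light fields of this separable form; richer content would need more layers or frames and physical transmittance limits, which this sketch does not cover.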

Empirical Bayesian Neighborhood
Range-weighted neighborhood formulations are useful and popular for their edge-preserving property and simplicity, but they are mostly proposed as intuitive tools. This project studies how to formulate these problems in empirical Bayesian ways, instead of intuitive ways. It also aims to optimize the essential bandwidth parameters by model fitting to empirical distributions, instead of by heuristic setting. The idea can also be extended to build Markov Random Fields for stereo matching. This project was supported by MOST.
Webpages: [denoise, CVPR'15/TIP], [fast fitting, SPL'16], [stereo matching, ICCV'17/TPAMI]
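As a minimal sketch of what a range-weighted neighborhood formulation looks like, the following filters a 1D signal with weights that depend on intensity differences. The function name, the Gaussian weight, and the hand-picked bandwidth `sigma_r` are illustrative assumptions; the hand-picked bandwidth is exactly the kind of heuristic setting that the project replaces with model fitting.

```python
import math


def range_weighted_filter(signal, radius=2, sigma_r=0.1):
    """Average each sample with neighbors, weighted by intensity similarity."""
    n = len(signal)
    out = []
    for i in range(n):
        num = 0.0
        den = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            diff = signal[j] - signal[i]
            # Range weight: neighbors with similar values count more.
            w = math.exp(-(diff * diff) / (2.0 * sigma_r * sigma_r))
            num += w * signal[j]
            den += w
        out.append(num / den)  # den >= 1 because the center weight is 1
    return out


# Noisy step edge: small fluctuations are smoothed, the jump is kept.
step = [0.0, 0.02, -0.01, 0.01, 1.0, 0.98, 1.01, 1.0]
smoothed = range_weighted_filter(step, radius=2, sigma_r=0.1)
```

Samples on each side of the step are averaged almost only with samples on the same side, which is the edge-preserving property mentioned above.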

Phase-Based Signal Processor
The complex steerable pyramid (CSP) can decompose signals into pixel-wise phase information, and manipulating the phase can easily create a position-shift effect. Therefore, it can provide efficient frame interpolation, view synthesis, and even digital refocusing in one single processor. In this project, we aimed to design and implement a high-performance phase-based signal processor for supporting such diverse applications in one unified framework. This project was supported by MOST. [ICASSP'19][ICASSP'20]
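The link between phase and position can be sketched with a global Fourier transform as a simplified stand-in for the CSP, which works with local, band-wise phase instead. The helper names below are hypothetical, and the plain O(N^2) DFT is used only to keep the example self-contained.

```python
import cmath
import math


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def phase_shift(x, delta):
    """Circularly shift x by delta samples by changing only the phase."""
    n = len(x)
    spectrum = dft(x)
    shifted = []
    for k in range(n):
        # Use signed frequencies so fractional shifts stay close to real.
        freq = k if k <= n // 2 else k - n
        shifted.append(spectrum[k] * cmath.exp(-2j * math.pi * freq * delta / n))
    return [v.real for v in idft(shifted)]


pulse = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
moved = phase_shift(pulse, 3)  # the pulse moves from index 2 to index 5
```

The magnitudes of the coefficients are untouched; only their phases change, yet the pulse moves.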

Image Deblurring for FORMOSAT-5
FORMOSAT-5 is the first remote sensing satellite entirely made in Taiwan and aims to build up Taiwan's self-reliant space technology. However, it was found to have defocus issues in its optical system after launch. In this project, we collaborated with NSPO researchers to deblur its remote sensing images. In contrast with conventional algorithms, which cause over-smoothing, we aimed to recover the original texture and maintain vivid realism based on the specific priors of its blur kernel. This project was supported by NSPO. [ACRS'18]

Satellite Light-Field Processor
We define a satellite light field by one full-resolution center view and four half-resolution surrounding views. This setting provides cost-effective but high-quality light-field systems. In this project, we aimed to build powerful chips to deliver low-power computational photography applications in real time. One application is high-quality depth estimation (stereo matching) at Full HD to 4K Ultra HD 30fps, and the other is physically realistic refocusing for Full HD videos. This project was supported by Novatek and MOST. [ISSCC'15][ASSCC'18][VLSIC'19][OJ-SSCS'23]

DragonFly
Lens-array cameras provide the opportunity to have high-SNR image quality on thin mobile devices by using DSLR-level sensors. They can also bring us novel light-field applications, such as digital refocusing and natural matting. This project focused on algorithm development for disparity-based refocusing, denoising, and matting. To show how this can change the way we capture videos, prototype lens-array cameras, named DragonFly, and a real-time demo platform were to be established. This project was supported by MOST.
Webpages: [refocus & DragonFly, ICASSP'15], [block-based refocusing, TIP], [stereo matching, ICCV'17/TPAMI]
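Disparity-based refocusing can be sketched as a classic shift-and-add over the views of a lens array. The 1D views, integer lens offsets, and function name below are illustrative assumptions and not the block-based algorithm cited above.

```python
def refocus(views, offsets, disparity):
    """Align views for the chosen disparity, then average them.

    views: list of equal-length 1D images; offsets: integer lens positions.
    Samples that fall outside a view contribute zero (borders get darker).
    """
    n = len(views[0])
    out = [0.0] * n
    for view, u in zip(views, offsets):
        shift = u * disparity
        for x in range(n):
            src = x + shift
            if 0 <= src < n:
                out[x] += view[src]
    return [v / len(views) for v in out]


# A point at position 4 with disparity 1 appears at 4 + u in the view at offset u.
offsets = [-1, 0, 1]
views = [
    [0, 0, 0, 1, 0, 0, 0, 0, 0],  # u = -1
    [0, 0, 0, 0, 1, 0, 0, 0, 0],  # u = 0
    [0, 0, 0, 0, 0, 1, 0, 0, 0],  # u = +1
]
sharp = refocus(views, offsets, disparity=1)    # focused on the point
blurred = refocus(views, offsets, disparity=0)  # focused elsewhere
```

When the chosen disparity matches the point, the three copies line up and the point stays sharp; otherwise its energy is spread over several pixels, which produces the defocus blur.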

Super-Vision
Humans have only a rough perception of early-vision information, such as object distance, size, and speed. This project aimed to develop a low-level vision system, based on stereo imaging, for acquiring such early-vision information with high precision. It will not only help people understand the physical structure of objects of interest faster and more precisely, but can also assist and improve the efficiency of high-level computer vision applications. This project was supported by Novatek.