About Me:
Yijiang Liu (刘一茳), Ph.D. candidate at the School of Electronic Science and Engineering, Nanjing University. His research covers on-device large models, model compression and acceleration, and hardware-software co-design. He has published multiple papers in CCF-A top-tier international journals and conferences; his first-author paper (AAAI'25, Pruning-Aware Tuning) has been applied in Samsung's on-device phone and TV products, and his collaboration with UC Berkeley (ICCV'23, Q-Diffusion) is featured in an MIT open course and has been adopted by the NVIDIA TensorRT team.
He currently leads the lab's algorithm team, focusing on high-performance AI algorithms: model compression and acceleration, efficient multimodal representation and fusion, and efficient fine-tuning and memory/compute optimization for large models.
Previous experience: graduated from Xidian University (undergraduate) and the University of Edinburgh, UK (postgraduate); eight years of algorithm research at the Eighth Research Institute of China Shipbuilding Industry Corporation (CSIC).
Research Topics:
Quantization-aware training and post-training quantization for neural network acceleration;
Applications and quantization-based acceleration of denoising diffusion probabilistic models;
Efficient fine-tuning techniques for multimodal large language models.
Research Highlights:
(AAAI '25) PAT, a pruning-aware tuning approach for efficient LLMs.
[Paper] PAT: Pruning-Aware Tuning for Large Language Models
[Code] https://github.com/kriskrisliu/PAT
(CVPR '24) CDCCA, a self-corrected multimodal large language model framework that leverages advanced cloud capabilities to optimize the performance of models deployed on client devices.
[Paper] Cloud-Device Collaborative Learning for Multimodal Large Language Models
[Code] https://github.com/2644521362/Cdcca
(CVPR '24) PromptCoT, an enhancer that autonomously refines user prompts by aligning their distribution through an adapted chain of thought.
[Paper] PromptCoT: Align Prompt Distribution via Adapted Chain of Thought
[Code] https://github.com/SanGibb/PromptCoT
(ICCV '23) Compress diffusion models through post-training quantization to accelerate the generation process, and propose a time-step-aware calibration scheme that handles the changing output distributions of diffusion models across time steps (a minimal sketch follows the links below):
[Paper] Q-Diffusion: Quantizing Diffusion Models; https://arxiv.org/abs/2302.04304
[Website] https://xiuyuli.com/qdiffusion
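A minimal Python sketch of the time-step-aware calibration idea (illustrative only; the mid-block hook, the linear-beta forward process, and the uniform quantizer below are simplifying assumptions, not the official Q-Diffusion implementation):

import torch

def add_noise(x0, noise, t, num_steps=1000):
    """Forward diffusion q(x_t | x_0) with a simple linear-beta schedule (illustrative)."""
    betas = torch.linspace(1e-4, 0.02, num_steps)
    alpha_bar = torch.cumprod(1.0 - betas, dim=0)[t]
    return alpha_bar.sqrt() * x0 + (1.0 - alpha_bar).sqrt() * noise

def collect_calibration_acts(model, x0, num_steps=1000, num_samples=16):
    """Pool one layer's activations over uniformly sampled diffusion time steps."""
    acts = []
    # Hypothetical hook point: any intermediate module of the noise-prediction UNet.
    hook = model.mid_block.register_forward_hook(
        lambda m, inp, out: acts.append(out.detach().flatten())
    )
    # Sample time steps uniformly so every noise level contributes, because the
    # activation distribution drifts as t moves from num_steps - 1 down to 0.
    for t in torch.linspace(0, num_steps - 1, num_samples).long():
        x_t = add_noise(x0, torch.randn_like(x0), t)
        model(x_t, t)
    hook.remove()
    return torch.cat(acts)

def uniform_quant_params(acts, n_bits=8):
    """Choose a scale/zero-point covering activations pooled over all time steps."""
    lo, hi = acts.min(), acts.max()
    scale = (hi - lo) / (2 ** n_bits - 1)
    zero_point = torch.round(-lo / scale)
    return scale, zero_point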
(CVPR '23) Propose NoisyQuant, a quantizer-agnostic enhancement to the post-training activation quantization performance of vision transformers, which actively alters the heavy-tailed activation distribution with an additive noisy bias so that it fits a given quantizer (a minimal sketch follows the paper link):
[Paper] NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers; https://arxiv.org/abs/2211.16056
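A minimal Python sketch of the NoisyQuant idea: a fixed, pre-sampled noisy bias is added to the activation before uniform quantization and removed afterwards, so the quantizer sees a flatter distribution. The bias amplitude and the fake-quantizer below are illustrative assumptions, not the paper's implementation:

import torch

def uniform_quantize(x, scale, zero_point, n_bits=8):
    """Fake-quantize with a fixed uniform quantizer (illustrative)."""
    q = torch.clamp(torch.round(x / scale) + zero_point, 0, 2 ** n_bits - 1)
    return (q - zero_point) * scale

def noisyquant_activation(x, noisy_bias, scale, zero_point):
    """Add the fixed noisy bias, quantize, then remove the bias again."""
    x_q = uniform_quantize(x + noisy_bias, scale, zero_point)  # flatter distribution seen by the quantizer
    return x_q - noisy_bias                                    # bias removal after quantization

# The noisy bias is sampled once per layer from a uniform distribution and kept
# fixed at inference; its amplitude is a tunable hyper-parameter (assumed value here).
channels = 768                     # e.g. ViT hidden size
noise_amp = 0.5
noisy_bias = (torch.rand(channels) - 0.5) * 2 * noise_amp

In practice the bias removal can be folded into the following linear layer, so the extra inference cost is negligible.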
(Open-source project) Core developer of the GitHub repository LLaMA2-Accessory (1.5k+ stars), a complete toolchain for pre-training, fine-tuning, quantization, and evaluation of multimodal large language models, built jointly with Shanghai AI Laboratory:
[Website] https://github.com/Alpha-VLLM/LLaMA2-Accessory
E-mail: liuyijiang at smail.nju.edu.cn