Zirui's Homepage


About me

I’m Zirui “Ray” Liu, an assistant professor in the Department of Computer Science at UMN. Previously, I graduated from the Department of Computer Science at Rice University, where I worked with Prof. Xia Hu and Prof. Vladimir Braverman.

I am mostly interested in Large Language Models and their applications, focusing on enabling them to combine and process information from diverse sources and domains. For that reason, I deeply care about efficiency, reasoning, long-context ability, and understanding their inner workings. I also enjoy extending foundation models to other domains, exploring the interplay between different sources of data.

📧📧 Recruiting: I am always looking for PhD students and research interns with strong coding skills. Feel free to drop me a line at ziruiliu dot recruit at gmail dot com with your resume, transcripts, and a short description of why you’d like to work with me.

News

  • Four papers accepted at EMNLP 2025 (1 Oral, 3 Findings). Three papers accepted at NeurIPS 2025 (1 Oral, 2 Posters). Kudos to my students and collaborators.

  • 2025/7. We are organizing the VISION workshop on industrial inspection at ICCV.

  • Gave a talk (Paper, Recording, Slides) at the ASAP seminar about the impact of numerical precision on LLM reasoning evaluation.

  • Received an NSF CIRC planning grant, an NSF NAIRR Pilot award, UMN DSI internal funding, and Adobe gifts. Thanks to NSF, UMN DSI, and Adobe!

  • One paper accepted at ICML 2025. Previously, in KIVI we observed that the key (K) cache has outlier channels while the value (V) cache does not. In this ICML 2025 paper, we found that this phenomenon is caused by RoPE.

  • Giving a tutorial at AAAI 2025 about KV cache optimization. Slides can be found here.

  • One paper accepted at CVPR 2025 about structured pruning.

  • Two papers accepted at ICLR 2025: one about an LLM-based file system and one about zeroth-order fine-tuning of LLMs.

  • Our KV cache compression benchmark was accepted at EMNLP 2024. If you want to know the research landscape of this area, take a look at the paper and code.

  • Introduced a rare disease question-answering (ReDis-QA) dataset to assess chatbots’ ability to diagnose rare diseases.

  • 🔥🔥 Our KIVI largely inspired the KV cache quantization system in Hugging Face; code is available here. Our Self-Extend is used in llama.cpp, implemented in KerasNLP, and was highlighted during a Google I/O session; code is available here.

  • Our KIVI, Self-Extend, and Compress-then-prompt were accepted at ICML 2024. Self-Extend was selected as a Spotlight (3.5%) at ICML 2024!

Publications

Please refer to publications or Google Scholar.

Latest posts