HMI Lab | PKUCS

Books

[Springer Nature, 2020 Jun 29] “Deep Reinforcement Learning: Fundamentals, Research and Applications”, Springer Nature, 2020 Jun 29 (Electronic Edition: 250,000 downloads; selected for the Annual High-Impact Publications in Computer Science by Chinese researchers).
H. Dong, Z. Ding, S. Zhang, eds.

Journals & Conferences

[CVPR 2025] Nan Huang, Wenzhao Zheng, Chenfeng Xu, Kurt Keutzer, Shanghang Zhang, Angjoo Kanazawa, Qianqian Wang, Segment Any Motion in Videos, In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2025.
Codes: https://github.com/nnanhuang/SegAnyMo

[CVPR 2025] Jiajun Cao, Yuan Zhang, Tao Huang, Ming Lu, Qizhe Zhang, Ruichuan An, Ningning Ma, Shanghang Zhang, MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders, In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2025.
Codes: https://github.com/hey-cjj/MoVE-KD

[CVPR 2025] Yueru Jia, Jiaming Liu, Sixiang Chen, Chenyang Gu, Zhilve Wang, Xiaoqi Li, Longzan Luo, Pengwei Wang, Renrui Zhang, Zhongyuan Wang, Shanghang Zhang, Lift3D Policy: Lifting 2D Foundation Models for Robust 3D Robotic Manipulation, In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2025.
Codes: https://github.com/PKU-HMI-Lab/LIFT3D.git

[CVPR 2025] Yuheng Ji, Huajie Tan, Jiayu Shi, Xiaoshuai Hao, Yuan Zhang, Hengyuan Zhang, Pengwei Wang, Mengdi Zhao, Yao Mu, Pengju An, Xinda Xue, Qinghang Su, Huaihai Lyu, Xiaolong Zheng, Jiaming Liu, Zhongyuan Wang, Shanghang Zhang, RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete, In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2025.
Codes: https://github.com/FlagOpen/RoboBrain

[CVPR 2025] Jinchang Xu, Shaokang Wang, Jintao Chen, Zhe Li, Peidong Jia, Fei Zhao, Guoqing Xiang, Zhijian Hao, Shanghang Zhang, Xiaodong Xie, Decouple Distortion from Perception: Region Adaptive Diffusion for Extreme-low Bitrate Perception Image Compression, In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2025.
Codes: https://github.com/xjc97/mridc.git

[ICLR 2025] Xingqun Qi, Yatian Wang, Hengyuan Zhang, Jiahao Pan, Wei Xue, Shanghang Zhang, Wenhan Luo, Qifeng Liu, Yike Guo, “Co3Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion”, ICLR 2025, Spotlight (Top 5.1%).
Codes: https://mattie-e.github.io/Co3/

[ICLR 2025] Weifeng Lin, Xinyu Wei, Ruichuan An, Peng Gao, Bocheng Zou, Yulin Luo, Siyuan Huang, Shanghang Zhang, Hongsheng Li, “Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want”, ICLR 2025.
Codes: https://github.com/AFeng-x/Draw-and-Understand

[ICLR 2025] Renrui Zhang, Xinyu Wei, Dongzhi Jiang, Yichi Zhang, Ziyu Guo, Chengzhuo Tong, Jiaming Liu, Aojun Zhou, Shanghang Zhang, Peng Gao, Hongsheng Li, “MAVIS: Mathematical Visual Instruction Tuning”, ICLR 2025.
Codes: https://github.com/ZrrSkywalker/MAVIS

[ICRA 2025] Jianing Li, Hao Wang, Chenyang Gu, Ming Lu, Wenzhao Zheng, Li Du, Shanghang Zhang, SliceOcc: Indoor 3D Semantic Occupancy Prediction with Vertical Slice Representation, ICRA 2025.
Codes: https://github.com/NorthSummer/SliceOcc.git

[ICRA 2025] Nan Huang, Ting Zhang, Yuhui Yuan, Dong Chen, Shanghang Zhang, High-Quality 3D Creation from A Single Image Using Subject-Specific Knowledge Prior, ICRA 2025.
Codes: https://github.com/nnanhuang/Customize-it-3D

[ICRA 2025] Ji Ma, Ryan Dai, Yao Mu, Pengying Wu, Hao Wang, Xiaowei Chi, Yang Fei, Shanghang Zhang, Chang Liu, DOZE: A Dataset for Open-Vocabulary Zero-Shot Object Navigation in Dynamic Environments, ICRA 2025.
Codes: https://github.com/JiMa25/DOZE-Dataset

[AAAI 2025] Bowen Liu, Haoyang Li, Shuning Wang, Shuo Nie, Shanghang Zhang, Subgraph Aggregation for Out-of-Distribution Generalization on Graphs, AAAI 2025.
Codes: https://github.com/Nanolbw/SuGAr

[AAAI 2025] Senqiao Yang, Jiaming Liu, Renrui Zhang, Mingjie Pan, Ziyu Guo, Xiaoqi Li, Zehui Chen, Peng Gao, Hongsheng Li, Yandong Guo, Shanghang Zhang, LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding, AAAI 2025.
Codes: https://huggingface.co/datasets/Senqiao/LiDAR-LLM-Nu-Caption

[AAAI 2025] Yueru Jia, Yuhui Yuan, Aosong Cheng, Chuke Wang, Ji Li, Huizhu Jia, Shanghang Zhang, DesignEdit: Unify Spatial-Aware Image Editing via Training-free Inpainting with a Multi-Layered Latent Diffusion Framework, AAAI 2025.
Codes: https://github.com/design-edit/DesignEdit.git

[NeurIPS 2024] Yuan Zhang, Fei Xiao, Tao Huang, Chun-Kai Fan, Hongyuan Dong, Jiawen Li, Jiacong Wang, Kuan Cheng, Shanghang Zhang*, Haoyuan Guo*, Unveiling the Tapestry of Consistency in Large Vision-Language Models, Advances in Neural Information Processing Systems (NeurIPS), 2024.
Codes: https://github.com/foundation-multimodal-models/ConBench

[NeurIPS 2024] Jiaming Liu, Mengzhen Liu, Zhenyu Wang, Pengju An, Xiaoqi Li, Kaichen Zhou, Senqiao Yang, Renrui Zhang, Yandong Guo, Shanghang Zhang*, RoboMamba: Efficient Vision-Language-Action Model for Robotic Reasoning and Manipulation, Advances in Neural Information Processing Systems (NeurIPS), 2024.
Codes: https://github.com/lmzpai/roboMamba

[NeurIPS 2024] Peng Li, Yuan Liu, Xiaoxiao Long, Feihu Zhang, Cheng Lin, Mengfei Li, Xingqun Qi, Shanghang Zhang, Wei Xue, Wenhan Luo, Ping Tan, Wenping Wang, Qifeng Liu, Yike Guo, Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention, Advances in Neural Information Processing Systems (NeurIPS), 2024.
Codes: https://github.com/pengHTYX/Era3D

[EMNLP 2024] Shitian Zhao, Renrui Zhang, Xu Luo, Yan Wang, Shanghang Zhang, Peng Gao, Unleashing the Potentials of Likelihood Composition for Multi-modal Language Models, EMNLP 2024.
Codes: https://github.com/zhaoshitian/Likelihood-Composition-Toolkit

[EMNLP 2024] Xinyan Chen, Jiaxin Ge, Tianjun Zhang, Jiaming Liu, Shanghang Zhang, Learning from Mistakes: Iterative Prompt Relabeling for Text-to-Image Diffusion Model Training, EMNLP 2024.
Codes: https://github.com/xinyan-cxy/IPR-RLDF

[ECCV 2024] Yulin Luo, Ruichuan An, Bocheng Zou, Yiming Tang, Jiaming Liu, Shanghang Zhang, LLM as Dataset Analyst: Subpopulation Structure Discovery with Large Language Model, ECCV 2024.
Codes: https://github.com/llm-as-dataset-analyst/SSDLLM

[ECCV 2024] Xiaobao Wei, Jiajun Cao, Yizhu Jin, Ming Lu, Guangyu Wang, Shanghang Zhang, I-MedSAM: Implicit Medical Image Segmentation with Segment Anything, ECCV 2024.
Codes: https://github.com/ucwxb/I-MedSAM

[ACM International Conference on Multimedia 2024] Rongyu Zhang, Zefan Cai, Huanrui Yang, Zidong Liu, Denis Gudovskiy, Tomoyuki Okuno, Yohei Nakata, Kurt Keutzer, Baobao Chang, Yuan Du, Li Du, Shanghang Zhang*. "VeCAF: Vision-language Collaborative Active Finetuning with Training Objective Awareness." In Proceedings of the 32nd ACM International Conference on Multimedia, pp. 5451-5459. 2024.
Codes: https://github.com/RoyZry98/VeCAF-Pytorch

[ICML 2024] Anthony Chen, Huanrui Yang, Yulu Gan, Denis A Gudovskiy, Zhen Dong, Haofan Wang, Tomoyuki Okuno, Yohei Nakata, Kurt Keutzer, Shanghang Zhang*. Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting. Accepted by ICML 2024.
Codes: https://github.com/antonioo-c/Split-Ensemble

[ICML 2024] Yixiong Zou, Shanghang Zhang, Haichen Zhou, Yuhua Li and Ruixuan Li. Compositional Few-Shot Class-Incremental Learning. Accepted by ICML 2024.
Codes: https://github.com/Zoilsen/Comp-FSCIL

[ICML 2024] Pengying Wu, Yao Mu, Bingxian Wu, Yi Hou, Ji Ma, Shanghang Zhang*, Chang Liu*. VoroNav: Voronoi-based Zero-shot Object Navigation with Large Language Model, Accepted by ICML 2024.
Codes: https://voro-nav.github.io/

[JBHI 2024] Xingqun Qi, Zhuojie Wu, Wenxuan Zou, Min Ren, Yifan Gao, Muyi Sun, Shanghang Zhang, Caifeng Shan, and Zhenan Sun. "Exploring generalizable distillation for efficient medical image segmentation." IEEE Journal of Biomedical and Health Informatics (JBHI) (2024).
Codes: https://github.com/XingqunQi-lab/GKD-Framework

[CVPR 2024] Yuan Zhang, Tao Huang, Jiaming Liu, Tao Jiang, Kuan Cheng, Shanghang Zhang*, FreeKD: Knowledge Distillation via Semantic Frequency Prompt. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024.
Codes: https://github.com/Gumpest/FreeKD

[CVPR 2024] Jiaming Liu, Ran Xu, Senqiao Yang, Renrui Zhang, Qizhe Zhang, Zehui Chen, Yandong Guo, Shanghang Zhang*, Continual-MAE: Adaptive Distribution Masked Autoencoders for Continual Test-Time Adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024.
Codes: https://github.com/RanXu2000/continual-mae?tab=readme-ov-file

[CVPR 2024] Xiaobao Wei, Renrui Zhang, Jiarui Wu, Jiaming Liu, Ming Lu, Yandong Guo, Shanghang Zhang*, NTO3D: Neural Target Object 3D Reconstruction with Segment Anything. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024.
Codes: https://github.com/ucwxb/NTO3D

[CVPR 2024] Xingqun Qi, Jiahao Pan, Peng Li, Ruibin Yuan, Xiaowei Chi, Mengfei Li, Wenhan Luo, Wei Xue, Shanghang Zhang, Qifeng Liu, Yike Guo, Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-speech Gesture Generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024.
Codes: https://xingqunqi-lab.github.io/Emo-Transition-Gesture/

[CVPR 2024] Guanqun Wang, Jiaming Liu, Chenxuan Li, Yuan Zhang, Junpeng Ma, Xinyu Wei, Kevin Zhang, Maurice Chong, Renrui Zhang, Yijiang Liu, Shanghang Zhang*, Cloud-Device Collaborative Learning for Multimodal Large Language Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024.
Codes: https://github.com/KongoCat/Cloud-Device-Collaborative-Learning-for-Multimodal-Large-Language-Models

[CVPR 2024] Zhi Zhang, Qizhe Zhang, Zijun Gao, Renrui Zhang, Ekaterina Shutova, Shiji Zhou, Shanghang Zhang*, Gradient-based Parameter Selection for Efficient Fine-Tuning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024.
Codes: https://github.com/FightingFighting/GPS

[Nature Methods 2024] Zhou, Yu, Jiajun Cao, Justin Sonneck, Sweta Banerjee, Stefanie Dörr, Anika Grüneboom, Kristina Lorenz, Shanghang Zhang, and Jianxu Chen. "EfficientBioAI: making bioimaging AI models efficient in energy and latency." Nature Methods (2024), (Nature Series Journal).
Codes: https://github.com/MMV-Lab/EfficientBioAI

[ICRA 2024] Jiaming Liu, Qizhe Zhang, Xiaoqi Li, Jianing Li, Guanqun Wang, Ming Lu, Tiejun Huang, Shanghang Zhang*, Unsupervised Spike Depth Estimation via Cross-modality Cross-domain Knowledge Transfer, International Conference on Robotics and Automation (ICRA), 2024.
Codes: https://github.com/Theia-4869/BiCross

[ICRA 2024] Jiaming Liu, Rongyu Zhang, Xiaoqi Li, Xiaowei Chi, Zehui Chen, Ming Lu, Yandong Guo, Shanghang Zhang*, Multi-geometric Space Alignments for Domain Adaptive Multi-view 3D Object Detection, International Conference on Robotics and Automation (ICRA), 2024.
Codes: https://github.com/RoyZry98/BEVUDA-Pytorch

[ICRA 2024] Mingjie Pan, Jiaming Liu, Renrui Zhang, Peixiang Huang, Xiaoqi Li, Bing Wang, Hongwei Xie, Li Liu, Shanghang Zhang*, RenderOcc: Vision-Centric 3D Occupancy Prediction with 2D Rendering Supervision, International Conference on Robotics and Automation (ICRA), 2024.
Codes: https://github.com/pmj110119/RenderOcc

[ICRA 2024] Jiayi Ni, Senqiao Yang, Jiaming Liu, Xiaoqi Li, Wenyu Jiao, Ran Xu, Zehui Chen, Yi Liu, Shanghang Zhang*, Distribution-Aware Continual Test Time Adaptation for Semantic Segmentation, International Conference on Robotics and Automation (ICRA), 2024.
Codes: https://github.com/RochelleNi/DAT?tab=readme-ov-file

[ICLR 2024] Jiaming Liu, Senqiao Yang, Peidong Jia, Renrui Zhang, Ming Lu, Yandong Guo, Wei Xue, Shanghang Zhang*, ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation, The Twelfth International Conference on Learning Representations (ICLR), 2024.
Codes: https://github.com/Yangsenqiao/vida

[ICLR 2024] Chi-Min Chan, Weize Chen, Yusheng Su, Jianxuan Yu, Wei Xue, Shanghang Zhang, Jie Fu, Zhiyuan Liu, ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate, The Twelfth International Conference on Learning Representations (ICLR), 2024.
Codes: https://github.com/thunlp/ChatEval

[AAAI 2024] Yang S, Wu J, Liu J, Li X, Zhang Q, Pan M, Gan Y, Chen Z, Zhang S, Exploring Sparse Visual Prompt for Domain Adaptive Dense Prediction, Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI), 2024.
Codes: https://github.com/Anonymous-012/SVDP

[AAAI 2024] Zhang R, Luo Y, Liu J, Yang H, Dong Z, Gudovskiy D, Okuno T, Nakata Y, Keutzer K, Du Y, Zhang S, Efficient Deweather Mixture-of-Experts with Uncertainty-Aware Feature-wise Linear Modulation, Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI), 2024.
Codes: https://github.com/RoyZry98/MoFME-Pytorch

[NeurIPS 2023] Zhou Q, Li W, Jiang L, Wang G, Zhou G, Zhang S, Zhao H. PAD: A Dataset and Benchmark for Pose-agnostic Anomaly Detection. Advances in Neural Information Processing Systems (NeurIPS), 2023.
Codes: https://github.com/EricLee0224/PAD

[IEEE Transactions on Intelligent Vehicles 2023] Li J, Lu M, Liu J, Guo Y, Du Y, Du L, Zhang S*. BEV-LGKD: A Unified LiDAR-Guided Knowledge Distillation Framework for Multi-View BEV 3D Object Detection. IEEE Transactions on Intelligent Vehicles. 2023 Sep 26.
Codes: https://github.com/NorthSummer/LGKD.git

[ICCV 2023] Li X, Liu Y, Lian L, Yang H, Dong Z, Kang D, Zhang S, Keutzer K. Q-Diffusion: Quantizing Diffusion Models. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) 2023 (pp. 17535-17545).
Codes: https://github.com/Xiuyu-Li/q-diffusion

[ICCV 2023] Zhu X, Zhang R, He B, Guo Z, Zeng Z, Qin Z, Zhang S, Gao P. PointCLIP V2: Prompting CLIP and GPT for Powerful 3D Open-world Learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) 2023 (pp. 2639-2650).
Codes: https://github.com/yangyangyang127/PointCLIP_V2

[DAC 2023] Xiao L, Yang H, Dong Z, Keutzer K, Du L, Zhang S*. CSQ: Growing Mixed-Precision Quantization Scheme with Bi-level Continuous Sparsification. In 2023 60th ACM/IEEE Design Automation Conference (DAC) 2023 Jul 9 (pp. 1-6). IEEE.
Codes: https://github.com/lawsonX/CSQ

[CVPR 2023] Chi X, Liu J, Lu M, Zhang R, Wang Z, Guo Y, Zhang S*. BEV-SAN: Accurate BEV 3D Object Detection via Slice Attention Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023 (pp. 17461-17470).
Codes: https://github.com/litwellchi/BEV-SAN.git

[CVPR 2023] Lu Y, Xu C, Wei X, Xie X, Tomizuka M, Keutzer K, Zhang S*. Open-Vocabulary Point-Cloud Object Detection without 3D Annotation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023 (pp. 1190-1199).
Codes: https://github.com/lyhdet/OV-3DET

[CVPR 2023] Chen A, Zhang K, Zhang R, Wang Z, Lu Y, Guo Y, Zhang S*. PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023 (pp. 5291-5301).
Codes: https://github.com/antonioo-c/PiMAE

[CVPR 2023] Liu Y, Yang H, Dong Z, Keutzer K, Du L, Zhang S*. NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023 (pp. 20321-20330).
Codes: https://github.com/kriskrisliu/NoisyQuant

[CVPR 2023] Gu J, Wang K, Luo H, Chen C, Jiang W, Fang Y, Zhang S, You Y, Zhao J. MSINet: Twins Contrastive Search of Multi-Scale Interaction for Object ReID. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023 (pp. 19243-19253).
Codes: https://github.com/vimar-gu/MSINet

[CVPR 2023] Ma Y, Li H, Zhang Z, Guo J, Zhang S, Gong R, Liu X. Annealing-Based Label-Transfer Learning for Open World Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023 (pp. 11454-11463).
Codes: https://github.com/DIG-Beihang/ALLOW

[NOSSDAV 2023] Zhang R, Du L, Liu J, Song C, Wang F, Li X, Lu M, Guo Y, Zhang S*. RepCaM: Re-parameterization Content-aware Modulation for Neural Video Delivery. In Proceedings of the 33rd Workshop on Network and Operating System Support for Digital Audio and Video (NOSSDAV, CCF B), 2023 Jun 7 (pp. 1-7).
Codes: https://github.com/RoyZry98/RepCaM-Pytorch

[NeurIPS 2022] Wei X, Zhang Y, Zhang X, Gong R, Zhang S, Zhang Q, Yu F, Liu X. Outlier suppression: Pushing the limit of low-bit transformer language models. Advances in Neural Information Processing Systems (NeurIPS). 2022 Dec 6;35:17402-14.
Codes: https://github.com/wimh966/outlier_suppression?tab=readme-ov-file

[NeurIPS 2022] Zhou H, Xiao S, Zhang S, Peng J, Zhang S, Li J. Jump Self-attention: Capturing High-order Statistics in Transformers. Advances in Neural Information Processing Systems (NeurIPS). 2022 Dec 6;35:17899-910.
Codes: https://github.com/zhouhaoyi/JAT2022

[NeurIPS 2022] Zou Y, Zhang S, Li Y, Li R. Margin-based few-shot class-incremental learning with class-level overfitting mitigation. Advances in Neural Information Processing Systems (NeurIPS). 2022 Dec 6;35:27267-79.
Codes: https://github.com/Zoilsen/CLOM

[ECCV 2022] J. Yu, J. Liu, X. Wei, H. Zhou, Y. Nakata, D. Gudovskiy, T. Okuno, J. Li, K. Keutzer, S. Zhang*, MTTrans: Cross-Domain Object Detection with Mean Teacher Transformer, 17th European Conference on Computer Vision (ECCV) 2022.
Codes: https://github.com/Lafite-Yu/MTTrans-OpenSource

[ECCV 2022] X. Li, J. Liu, S. Wang, C. Lyu, M. Lu, Y. Chen, A. Yao, Y. Guo, S. Zhang*, Efficient Meta-Tuning for Content-aware Neural Video Delivery, 17th European Conference on Computer Vision (ECCV) 2022.
Codes: https://github.com/Neural-video-delivery/EMT-Pytorch-ECCV2022

[ICML 2022] Chu X, Jin Y, Zhu W, Wang Y, Wang X, Zhang S, Mei H. DNA: Domain generalization with diversified neural averaging. In International Conference on Machine Learning (ICML) 2022 Jun 28 (pp. 4010-4034). PMLR.
Codes: https://github.com/JinYujie99/DNA

[IJCAI 2022] Li T, Chen X, Dong Z, Yu W, Yan Y, Keutzer K, Zhang S*. Domain-Adaptive Text Classification with Structured Knowledge from Unlabeled Data. International Joint Conference on Artificial Intelligence (IJCAI), 2022.
Codes: https://github.com/hikaru-nara/DASK

[TMM 2022] S. Zhou, L. Wang, S. Zhang*, Z. Wang*, W. Zhu*. “Active Gradual Domain Adaptation: Dataset and Approach”, IEEE Transactions on Multimedia (TMM), 2022.
Codes: https://github.com/LianzheWang/Active-Gradual-Domain-Adaptation-Dataset-and-Approach

[CVPR 2022] C. Zhang#, M. Zhang#, S. Zhang#, et al. "Delving deep into the generalization of vision transformers under distribution shifts", Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
Codes: https://github.com/Phoenix1153/ViT_OOD_generalization

[WACV 2022] Reed CJ, Yue X, Nrusimha A, Ebrahimi S, Vijaykumar V, Mao R, Li B, Zhang S, Guillory D, Metzger S, Keutzer K. Self-supervised pretraining improves self-supervised pretraining. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2022 (pp. 2584-2594).
Codes: https://github.com/cjrd/self-supervised-pretraining?tab=readme-ov-file