A survey of efficient fine-tuning methods for Vision-Language Models - Prompt and Adapter
Document type: Foreign-language journal
First author: Xing, Jialu
Authors: Xing, Jialu; Liu, Jianping; Sun, Lulu; Chen, Xi; Gu, Xunxun; Wang, Yingfei; Wang, Jian
Author affiliations:
Keywords: Vision-language; Computer vision; Efficient fine-tuning; Pre-training model; Prompt; Adapter
Journal: COMPUTERS & GRAPHICS-UK (impact factor: 2.5; five-year impact factor: 2.2)
ISSN: 0097-8493
Year, volume, issue: 2024, vol. 119
Pages:
Indexed in: SCI
Abstract: Vision-Language Models (VLMs) are a popular research field at the intersection of computer vision and natural language processing (NLP). With the emergence of transformer networks and massive web data, numerous large-scale VLMs, or Vision-Language Pre-training Models (VLPMs), have achieved state-of-the-art results in many tasks, such as retrieval (CLIP) and generation (DALL-E). Although large models have shown impressive results, the cost of retraining and full fine-tuning is prohibitive for general researchers. In recent years, efficient fine-tuning (EFT), a very low-cost tuning method, has greatly alleviated this problem, and a new fine-tuning paradigm has developed as a result. Since Prompt and Adapter are the most widely used methods in the vision-language field, this review focuses on analysing progress in the application of these two methods. First, we review the VLM research paradigm based on the differences in pre-training and fine-tuning methods. Next, we categorize Prompt into 3 types (7 subtypes) of usage patterns based on the modal information involved, and categorize Adapter into 2 types of usage patterns based on whether it plays a role in modal fusion; we further discuss both in vision and vision-language tasks. Finally, we discuss the stability and social ethics of EFT and propose possible future research directions.
Classification number:
- Related literature
Other papers by the authors
-
An improved 3D-SwinT-CNN network to evaluate the fermentation degree of black tea
Authors: Zhu, Fengle; Wang, Jian; Zhang, Yuqian; Zhao, Zhangfeng; Shi, Jiang; He, Mengzhu
Keywords: Black tea fermentation; Hyperspectral imaging; 3D-SwinT-CNN; 3D convolutional neural networks; Swin transformer
-
NIa-Pro of sugarcane mosaic virus targets Corn Cysteine Protease 1 (CCP1) to undermine salicylic acid-mediated defense in maize
Authors: Yuan, Wen; Chen, Xi; Du, Kaitong; Jiang, Tong; Li, Mengfei; Fan, Zaifeng; Zhou, Tao; Cao, Yanyong; Li, Xiangdong; Doehlemann, Gunther
Keywords:
-
The Function of SD1 on Shoot Length and its Pyramiding Effect on Shoot Length and Plant Height in Rice (Oryza sativa L.)
Authors: Dong, Jingfang; Ma, Yamei; Hu, Haifei; Wang, Jian; Yang, Wu; Fu, Hua; Zhang, Longting; Chen, Jiansong; Zhou, Lian; Li, Wenhui; Nie, Shuai; Zhao, Junliang; Liu, Bin; Yang, Tifeng; Zhang, Shaohong; Liu, Ziqiang
Keywords: Shoot Length; Plant Height; Causal gene; Allele Mining; Pyramiding Effect; Rice
-
Mapping Maize Planting Densities Using Unmanned Aerial Vehicles, Multispectral Remote Sensing, and Deep Learning Technology
Authors: Shen, Jianing; Hu, Jingyu; Wang, Jian; Shu, Meiyan; Guo, Wei; Qiao, Hongbo; Yue, Jibo; Wang, Qilei; Zhao, Meng; Liu, Yang; Niu, Qinglin
Keywords: maize planting density; object detection; machine learning; vegetation index; YOLO; GLCM
-
Auxin regulates bulbil initiation by mediating sucrose metabolism in Lilium lancifolium
Authors: Xin, Yin; Chen, Xi; Liang, Jiahui; Wu, Jingxiang; Zhang, Mingfang; Zhang, Xiuhai; Du, Yunpeng; Wang, Shaokun; Pan, Wenqiang; Wu, Jian; Yu, Xiaonan; Zaccai, Michele
Keywords:
-
Greenhouse cultivation enhances pesticide bioaccumulation in cowpeas following repeated spraying
Authors: Cui, Kai; Wang, Jian; Guan, Shuai; Liang, Jingyun; Fang, Liping; Li, Teng; Dong, Zhan; Ding, Ruiyan; Ma, Guoping; Wu, Xiaohu; Zheng, Yongquan
Keywords: Pesticide residue; Cowpea; Distribution; Greenhouse and open-field scenarios; Risk assessment
-
Pretrained Deep Learning Networks and Multispectral Imagery Enhance Maize LCC, FVC, and Maturity Estimation
Authors: Hu, Jingyu; Feng, Hao; Shen, Jianing; Wang, Jian; Guo, Wei; Qiao, Hongbo; Yue, Jibo; Wang, Qilei; Liu, Yang; Feng, Haikuan; Yang, Hao; Niu, Qinglin
Keywords: unmanned aerial vehicle; crop leaf chlorophyll content; fractional vegetation cover; maturity; deep learning; ensemble learning; maize