A survey of efficient fine-tuning methods for Vision-Language Models - Prompt and Adapter

Document type: Foreign-language journal article

First author: Xing, Jialu

Authors: Xing, Jialu; Liu, Jianping; Sun, Lulu; Chen, Xi; Gu, Xunxun; Wang, Yingfei; Wang, Jian

Author affiliations:

Keywords: Vision-language; Computer vision; Efficient fine-tuning; Pre-training model; Prompt; Adapter

Journal: COMPUTERS & GRAPHICS-UK (Impact factor: 2.5; Five-year impact factor: 2.2)

ISSN: 0097-8493

Year/Volume: 2024, Vol. 119

Pages:

Indexed in: SCI

Abstract: The Vision-Language Model (VLM) is a popular research field at the intersection of computer vision and natural language processing (NLP). With the emergence of transformer networks and massive web data, numerous large-scale VLMs, or Vision-Language Pre-training Models (VLPMs), have achieved state-of-the-art results in many tasks, such as retrieval (CLIP) and generation (DALL-E). Although large models have shown impressive results, the cost of retraining and full fine-tuning is prohibitive for general researchers. In recent years, Efficient Fine-Tuning (EFT), a family of very low-cost tuning methods, has greatly alleviated this problem, and, driven by it, a new fine-tuning paradigm has developed. Since Prompt and Adapter are the most widely used methods in the vision-language field, this review focuses on analysing the progress of their application. First, we review the VLM research paradigm based on differences in pre-training and fine-tuning methods. Next, we categorize Prompt into 3 types (7 subtypes) of usage patterns based on the modal information involved, and categorize Adapter into 2 types of usage patterns based on whether it plays a role in modal fusion; we further discuss both in vision and vision-language tasks. Finally, we discuss the stability and social ethics of EFT, and propose possible future research directions.
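For readers unfamiliar with the two techniques the survey covers, the following is a minimal, hedged sketch (not taken from the paper) of prompt tuning and a bottleneck adapter in PyTorch. All names here (SoftPrompt, Adapter, bottleneck_dim, n_prompt_tokens) are illustrative assumptions, not the survey's API; the common idea is that the pre-trained backbone stays frozen and only a small number of new parameters are trained.

```python
# Illustrative sketch of the two EFT techniques surveyed (hypothetical names).
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project -> non-linearity -> up-project + residual."""
    def __init__(self, d_model: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, d_model)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection preserves the frozen backbone's features.
        return x + self.up(self.act(self.down(x)))

class SoftPrompt(nn.Module):
    """Prompt tuning: learnable vectors prepended to the token sequence."""
    def __init__(self, d_model: int, n_prompt_tokens: int = 8):
        super().__init__()
        self.prompts = nn.Parameter(torch.randn(n_prompt_tokens, d_model) * 0.02)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # Share one prompt set across the batch and prepend it.
        p = self.prompts.unsqueeze(0).expand(tokens.size(0), -1, -1)
        return torch.cat([p, tokens], dim=1)

# The pre-trained backbone is frozen; only prompt/adapter parameters train.
backbone = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
for param in backbone.parameters():
    param.requires_grad = False

prompt, adapter = SoftPrompt(512), Adapter(512)
x = torch.randn(2, 16, 512)           # (batch, tokens, dim) dummy embeddings
out = adapter(backbone(prompt(x)))    # prompts prepended, backbone frozen, adapter tuned
print(out.shape)                      # torch.Size([2, 24, 512])
```

In both cases the trainable parameter count is a small fraction of the backbone's, which is what makes these methods "efficient" in the sense the abstract describes.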

Classification:
