Junha

AI August 05, 2024

[VLM] Vision Language Models 3

💙 Analysis of VLMs / Understanding VLMs ◽️ What matters when building vision-language models? Hugging Face, arXiv, 2024. Intr…

AI August 04, 2024

💙 Vision-centric Improvement / Region-based VLMs / Hallucination ◽️ Eyes Wide Shut? Exploring the Visual Shortcomings of Multim…

AI August 03, 2024

💙 Vision Language Model ◽️ Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks ECCV20 Problem: Existi…

AI August 02, 2024

General VLM (BLIP2 > BLIP = OSCAR(detector-based) = SimVLM = LEMON(Scaling up vision-language pre-training for image captio…

AI April 25, 2024

240425_Trend A. Retrieval augmented generation (RAG) Architecture What is RAG? (LinkedIn) How does RAG work? In terms of L…