Junha

Junha Song

Artificial Intelligence Researcher

Featured Post

Welcome!

This blog currently contains a total of 246 posts: 116 (AI), 61 (Code), 29 (Util), and 40 (Note). I am excited to launch my new blog. I believe this personal website gives me an opportunity to share my academic endeavors with the world. …

[VLM] Vision Language Models 3

💙 Analysis of VLMs / Understanding VLMs ◽️ What matters when building vision-language models? Hugging Face, arXiv, 2024. Intr…

[VLM] Vision Language Models 2

💙 Vision-centric Improvement / Region-based VLMs / Hallucination ◽️ Eyes Wide Shut? Exploring the Visual Shortcomings of Multim…

[VLM] Vision Language Models 1

💙 Vision Language Model ◽️ Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks ECCV20 Problem: Existi…

[VLM] Paper list

General VLM (BLIP2 > BLIP = OSCAR(detector-based) = SimVLM = LEMON(Scaling up vision-language pre-training for image captio…

[Note] Hot papers in April 2024

240425_Trend A. Retrieval augmented generation (RAG) Architecture What is RAG? ( LinkedIn ) How RAG works? In terms of L…

Most Popular

[VLM] Vision Language Models 3

💙 Analysis of VLMs / Understanding VLMs ◽️ What matters when building vision-language models? Hugging Face, arXiv, 2024. Intr…

[VLM] Paper list

General VLM (BLIP2 > BLIP = OSCAR(detector-based) = SimVLM = LEMON(Scaling up vision-language pre-training for image captio…