👀 Visual Attention Sink & VAR Paper Study (ICLR 2025)
👀 SEE WHAT YOU ARE TOLD: Visual Attention Sink in LMMs Paper: SEE WHAT YOU ARE TOLD: VISUAL ATTENTION SINK IN LARGE MULTIMODAL MODELS Authors: Seil Kang, Jinyeong Kim, Junhyeok Kim, Seong Jae Hwang Venue...
⚡ StreamingLLM & Attention Sink: Key Paper Report Paper: Efficient Streaming Language Models with Attention Sinks Authors: Guangxuan Xiao et al. Venue: ICLR 2024 Repo: GitHub - mit-han-lab/streaming-llm ...
⚙️ Vision Transformers Need Registers: Key Paper Report Paper: Vision Transformers Need Registers Authors: Timothée Darcet, Maxime Oquab, Julien Mairal, Piotr Bojanowski Venue: ICLR 2024 Key summary: Vision Transfo...
🔍 Paper Reading: Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding! A look at the result images! - The TF approach can be applied to a variety of LVLMs! Title: Your Large Vision-Language Model Only Needs A F...
🔍 Paper Reading: Closed-Loop Transfer for Weakly-supervised Affordance Grounding! Title: Closed-Loop Transfer for Weakly-supervised Affordance Grounding Venue and authors: Tang et al., ICCV 2025 Summary: The prior work LOC...
🔍 Paper Reading: Selective Contrastive Learning for Weakly Supervised Affordance Grounding! Title: Selective Contrastive Learning for Weakly Supervised Affordance Grounding Venue and authors: Moon et al., ICCV 2...
🐍 (Korean) Reasoning Mamba: Solving Affordance Grounding with Hypergraph + Mamba! Title: Reasoning Mamba: Hypergraph-Guided Region Relation Calculating for Weakly Supervised Affordance Grounding Venue: CV...
🎭 (English) MaskPrompt: Achieving Open-Vocabulary Affordance Segmentation with Object Shape Mask Prompts! Title: MaskPrompt: Open-Vocabulary Affordance Segmentation with Object Shape Mask Pro...
🖼️ (Korean) Qwen2.5-VL: From Dynamic Resolution to Ultra-Long Video Understanding! Title: Qwen2.5-VL Technical Report Venue: arXiv (February 2025, Alibaba Qwen Team) Code/checkpoints: GitHub – Qwen2.5-VL Key keywords: Vision-Language Model, Dynamic...
🎥 LAVAD: Training-free Video Anomaly Detection with LLM! LA-VAD = LAnguage-based Video Anomaly Detection. In other words, language model-based video anomaly detection!! Title: Harnessing ...