Research 50
- 🐍 Reasoning Mamba: Strengthening Weakly Supervised Affordance Grounding with Hypergraph-Based Reasoning!
- 🎭 MaskPrompt: Object Mask Prompts for Open-Vocabulary Affordance Segmentation
- 🖼️ Qwen2.5-VL: Next-Gen Vision-Language Model with Dynamic Resolution & Long Video Understanding
- 🎥 LAVAD: Training-Free Video Anomaly Detection with LLMs
- 📍 GEM: Grounding Everything in Vision-Language Transformers
- 🔎 VL-SAM Hands-on: Let's Try Out VL-SAM!
- 🔎 VL-SAM: Training-Free Open-Ended Object Detection & Segmentation
- 🔎 CLIP Surgery: A Closer Look at the Explainability of Contrastive Language-Image Pre-training
- 🧩 PartCLIPSeg: Open-Vocabulary Part-level Segmentation with CLIP Guidance
- 🔎 CLIP Surgery Hands-on: Let's Try Out CLIP Surgery!
- 🔎 Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively
- 🧩 Segment Anything, Even Occluded (SAMEO): A SAM Extension That Segments Even Occluded Regions
- 🧠 EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything - The Practical Standard for SAM
- 🧩 RTMDet, SOTA of Real-Time, One-Stage Object Detectors: The Essence of Real-Time, One-Stage Object Detection
- 🎨 An Image is Worth One Word: Textual Inversion - Turning an Image into a `Thingamajig`!!
- 📝 TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
- 📊 Evaluation Metrics in CIRR - A Look at the Metrics Used in CIRR
- 🧠 OSrCIR: Reason-before-Retrieve for Composed Image Retrieval
- 🧠 [CIReVL] Vision-by-Language for Training-Free Compositional Image Retrieval: The First Training-Free Method on CIRR
- 🧠 CIRCO - Zero-Shot Composed Image Retrieval with Textual Inversion (ICCV 2023)
- 🧠 CIR - Composed Image Retrieval on Real-life Images: The Study That Started Composed Image Retrieval!!
- 🧠 FashionIQ - Fashion Image Retrieval with Natural Language: A New Standard for Fashion Image Search
- 👁️ MLLMs Know Where to Look: Training-free Visual Detail Perception
- 🧠 Notes-guided MLLM Reasoning
- 🧠 Lost in the Middle - Do Language Models Really Remember Information in Long Contexts?
- 🔍 WSAG-PLSP: Tackling Affordance Grounding with Weakly Supervised Learning!
- 🔍 WSMA: Revolutionizing Egocentric Affordance Grounding with Multimodal Weak Supervision!
- 📌 LOCATE: Object Part Localization & Transfer for Weakly Supervised Affordance Grounding
- 🔗 Understanding GLIP - Grounded Language-Image Pre-training!!!
- 📝 Understanding YOLO-World - A Breakthrough in Real-Time Open-Vocabulary Object Detection!!!
- 📝 Understanding CLIP4HOI!!!
- 📝 Understanding EZ-HOI!!
- 📝 Understanding CLIP-Adapter!!
- 📝 Understanding YOLO!!
- 📝 Understanding BLIP!!
- 📝 Understanding FG-CLIP!!
- 📝 Understanding LoRA - Low-Rank Adaptation of Large Language Models!!
- 📝 Understanding SEEM (Segment Everything Everywhere All at Once)!!
- 📝 Understanding LISA!!
- 🧠 Understanding SAM2!!
- 📝 Understanding Grounding DINO - Studying the Grounding DINO Paper!!
- What Does 'Ground' Mean in AI? Grounding DINO, Grounding SAM, and Even Grounded Affordance!
- 📝 DINO: The Evolved Object Detection Model from DETR!! (ICLR 2023)
- 📝 The First Transformer-Based Object Detection Model!! The Arrival of DETR!! (ECCV 2020)
- 📝 Segment Anything, You Are Amazing! - The Arrival of SAM, the Cutout Monster!! (ICCV 2023)
- 📝 ViT, You Can Do Greater Things! - The Emergence of DINO!! (ICCV 2021)
- 📝 Peeking into the Mind of AI: Understanding CAM!
- 📝 Understanding CLIP - A Look at the CLIP Model
- Exploring Major Journals in AI (feat. h-index)
- 📝 Image? You Can Do Transformer Too!! - The Emergence of ViT!! (ICLR 2021)