🖥️ Grounded SAM Hands-On with Python!
🦖 Hands-On with Grounded SAM! Detect objects with DINO, then segment with SAM!
In this post, we'll do a hands-on walkthrough of Grounding DINO + SAM = Grounded SAM!
As before, we'll clone the GitHub repo and run its code;
taken step by step, it's not too hard!
So once again, let's skip the theory for now
and dive straight into the code to understand what Grounded SAM is all about!
🧱 1. Clone the GitHub Repository
git clone https://github.com/IDEA-Research/Grounded-Segment-Anything
cd Grounded-Segment-Anything/
📦 2. Install the Models
- From here on, we're following the setup from the GitHub repo directly!
- Please start in an environment where PyTorch and the GPU are set up correctly.
- If not, you'll likely run into many issues!
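Before running the install commands, it can help to confirm that the environment really is ready. The following is a minimal sketch, not part of the repo: `check_environment` is a hypothetical helper name, and it only reports whether PyTorch imports, whether PyTorch sees a CUDA device, and whether `CUDA_HOME` (the variable exported in the install commands) and `nvcc` are visible.

```python
import os
import shutil


def check_environment():
    """Collect basic facts about the PyTorch/CUDA setup without failing hard."""
    report = {
        "torch_version": None,       # stays None if PyTorch cannot be imported
        "cuda_available": False,     # True only if PyTorch reports a usable GPU
        "cuda_home": os.environ.get("CUDA_HOME"),
        "nvcc_on_path": shutil.which("nvcc") is not None,
    }
    try:
        import torch  # imported lazily so the script also runs without PyTorch
    except ImportError:
        return report
    report["torch_version"] = torch.__version__
    report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is False or `cuda_home` is empty, the GroundingDINO build step is the one most likely to cause trouble, so it is worth fixing that first.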
export AM_I_DOCKER=False
export BUILD_WITH_CUDA=True
export CUDA_HOME=/path/to/cuda-11.3/
# Install SAM
python -m pip install -e segment_anything
# Install GroundingDINO
pip install --no-build-isolation -e GroundingDINO
# Install diffusers
pip install --upgrade diffusers[torch]
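# Optional: install OSX. Note that "OSX" here is the 3D whole-body mesh recovery model bundled as a submodule, not macOS. I skipped this step.
git submodule update --init --recursive
(cd grounded-sam-osx && bash install.sh)  # subshell, so the commands below still run from the repo root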
# Install RAM & Tag2Text
git clone https://github.com/xinyu1205/recognize-anything.git
pip install -r ./recognize-anything/requirements.txt
pip install -e ./recognize-anything/
# Final dependencies (these may vary per user)
pip install opencv-python pycocotools matplotlib onnxruntime onnx ipykernel
# Bonus tip: pin supervision to exactly the version below!
## I found this after many errors; trust me, use this version!
pip install supervision==0.21.0
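Since the supervision pin matters so much, a quick check after installation can save a debugging session. This is a minimal sketch using only the standard library; `check_versions` is a hypothetical helper, and the package names are simply the ones from the pip commands above.

```python
from importlib import metadata


def check_versions(expected):
    """Compare installed package versions against expected pins.

    `expected` maps a distribution name to a version string, or to None
    when any installed version is acceptable.
    """
    results = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            results[name] = "missing"
            continue
        if wanted is None or installed == wanted:
            results[name] = f"ok ({installed})"
        else:
            results[name] = f"mismatch (installed {installed}, expected {wanted})"
    return results


if __name__ == "__main__":
    pins = {
        "supervision": "0.21.0",  # the exact pin recommended in this post
        "opencv-python": None,
        "pycocotools": None,
        "onnxruntime": None,
    }
    for name, status in check_versions(pins).items():
        print(f"{name}: {status}")
```

Run it in the same environment (or Jupyter kernel) you plan to use for the notebook, since a different kernel may see different packages.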
3. Run Object Detection (from Jupyter Notebook)
Now, with the grounded_sam.ipynb notebook provided in the repo,
you can jump straight into segmentation!
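The notebook handles the hand-off between the two models for you. If you ever wire them together yourself, one detail is easy to get wrong: Grounding DINO's inference helper returns boxes as normalized (cx, cy, w, h), while SAM's box prompt expects pixel (x0, y0, x1, y1). Below is a minimal pure-Python sketch of that conversion; the function name is hypothetical, and the clamping to the image bounds is my own assumption rather than something taken from the repo.

```python
def cxcywh_norm_to_xyxy_pixels(box, image_width, image_height):
    """Convert a normalized (cx, cy, w, h) box to pixel (x0, y0, x1, y1).

    Coordinates are clamped to the image bounds (an assumption made here
    so that boxes slightly outside the image do not produce negative values).
    """
    cx, cy, w, h = box
    x0 = (cx - w / 2) * image_width
    y0 = (cy - h / 2) * image_height
    x1 = (cx + w / 2) * image_width
    y1 = (cy + h / 2) * image_height

    def clamp(value, upper):
        return max(0.0, min(float(value), float(upper)))

    return (
        clamp(x0, image_width),
        clamp(y0, image_height),
        clamp(x1, image_width),
        clamp(y1, image_height),
    )


if __name__ == "__main__":
    # A box centered in a 200x100 image, covering half of each dimension.
    print(cxcywh_norm_to_xyxy_pixels((0.5, 0.5, 0.5, 0.5), 200, 100))
```

The resulting tuple is in the layout a box prompt for SAM would need once converted to an array.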
I reused the same image from our previous Grounding DINO test.
Just like before, I entered prompts and tested various labels.
Here are the results!
Please note: only one segment is returned per prompt!
- person: the simplest prompt, and a standard COCO dataset label. From detection to segmentation, flawless!
- cat: Grounding DINO already failed to detect this in the previous post, so I skipped it.
- rugby: I hoped it would detect the ball, but sadly, detection itself failed again!
- helmet: fantastic result!
- jump: it accurately captured the jumping person!
- player is running: how about a full sentence this time? Once again, performance on full sentences isn't quite there yet!
Now I tried a different image.
- holding: I was curious to see what it would catch. I was hoping it might isolate just the hand, but I guess that's asking too much!
- bat: can it detect a small baseball bat? Absolutely! To make it easier to see, here's the mask version too!
- catcher and referee: it clearly distinguishes the large human figures!
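To put a number on how "small" an object like the bat is compared with the catcher or referee, you can measure the mask itself. The sketch below is illustrative only: `mask_stats` is a hypothetical helper, and it assumes the mask has been converted to a plain 2D list of 0/1 (or boolean) values, one row per image row.

```python
def mask_stats(mask):
    """Return pixel area, image fraction, and bounding box of a binary mask.

    `mask` is a 2D list of truthy/falsy values. The bounding box is
    (x_min, y_min, x_max, y_max) in pixel indices, or None for an empty mask.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    xs, ys = [], []
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                xs.append(x)
                ys.append(y)
    area = len(xs)
    if area == 0:
        return {"area": 0, "fraction": 0.0, "bbox": None}
    return {
        "area": area,
        "fraction": area / (width * height),
        "bbox": (min(xs), min(ys), max(xs), max(ys)),
    }


if __name__ == "__main__":
    toy_mask = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ]
    print(mask_stats(toy_mask))
```

Comparing the `fraction` value across prompts gives a rough sense of how much of the image each segment covers.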
Final Thoughts
Grounded SAM! Building on Grounding DINO,
we now go from detection to actual image segmentation!
SAM on its own only explored text prompts as a concept (the released model takes points, boxes, and masks),
so it's great that Grounded SAM lets you prompt with text!
That said, imagine how powerful it would be if it could handle large images and multiple segments in one shot!