Grounding DINO practice with Python!
(English) Grounding DINO Practice! Running the Evolved Model of DINO
This post is a hands-on guide to Grounding DINO, the evolved version of DINO!
Just like DINO, we clone the code from GitHub and run it, and surprisingly, it's even simpler.
So this time, we'll skip the theory and jump straight into running the code!
1. Clone the GitHub Repository
```bash
git clone https://github.com/IDEA-Research/GroundingDINO.git
cd GroundingDINO/
```
2. Install the Model
```bash
pip install -e .
```
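To confirm that the editable install worked, one quick check is to ask Python whether the package can be found. This is only a minimal sketch: the helper name `is_installed` is made up for illustration, and `groundingdino` is assumed to be the importable package name, as the config path used later in this post suggests.

```python
import importlib.util


def is_installed(package="groundingdino"):
    """Return True if `package` can be located by the current interpreter."""
    # find_spec returns None when a top-level package cannot be found.
    return importlib.util.find_spec(package) is not None


# Hypothetical usage, run with the same interpreter that ran `pip install -e .`:
# print("groundingdino importable:", is_installed())
```

If this reports `False`, the install most likely went into a different Python environment than the one you are running.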
3. Download the Pretrained Weights
```bash
mkdir weights
cd weights/
wget -q https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
cd ..
```
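If `wget` is not available, the same download can be sketched in Python with only the standard library. The helper names `checkpoint_path` and `ensure_checkpoint` are hypothetical; the URL and the `weights` directory are the ones from the commands above. The "non-empty file" check is an assumption about what counts as already downloaded and does not verify file integrity.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Same release asset that the wget command above fetches.
WEIGHTS_URL = (
    "https://github.com/IDEA-Research/GroundingDINO/releases/download/"
    "v0.1.0-alpha/groundingdino_swint_ogc.pth"
)


def checkpoint_path(weights_dir="weights", url=WEIGHTS_URL):
    """Local path where the checkpoint from `url` is expected to live."""
    return Path(weights_dir) / url.rsplit("/", 1)[-1]


def ensure_checkpoint(weights_dir="weights", url=WEIGHTS_URL):
    """Download the checkpoint unless a non-empty file is already present."""
    target = checkpoint_path(weights_dir, url)
    if target.exists() and target.stat().st_size > 0:
        return target  # assumed to be a finished download
    target.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(url, str(target))
    return target


# Hypothetical usage, run from the repository root:
# print(ensure_checkpoint())
```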
4. Run the Inference (Command Template)
The command structure for running Grounding DINO is shown below. Each part marked with `{}` is a placeholder that you replace with the appropriate value for your environment:
```bash
CUDA_VISIBLE_DEVICES={GPU_ID} python demo/inference_on_a_image.py \
  -c {CONFIG_PATH} \
  -p {CHECKPOINT_PATH} \
  -i {INPUT_IMAGE_PATH} \
  -o {OUTPUT_DIR} \
  -t "{TEXT_PROMPT}"
```
Example values:
| Variable | Description |
|---|---|
| `{GPU_ID}` | GPU ID to use (e.g., `0`); you can check it with `nvidia-smi` |
| `{CONFIG_PATH}` | Config file path (e.g., `groundingdino/config/GroundingDINO_SwinT_OGC.py`) |
| `{CHECKPOINT_PATH}` | Path to the pretrained weights (e.g., `weights/groundingdino_swint_ogc.pth`) |
| `{INPUT_IMAGE_PATH}` | Input image path (e.g., `/home/user/images/catch_rugby_ball_001480.jpg`) |
| `{OUTPUT_DIR}` | Directory to save the result (e.g., `/home/user/images/GroundingDINO/results`) |
| `{TEXT_PROMPT}` | Text prompt to detect (e.g., `"chair"`) |
- You can change the `TEXT_PROMPT` to try different detection phrases!
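The template and example values above can also be assembled from Python, which avoids shell quoting problems when the prompt contains spaces. This is a rough sketch, not part of the repository: `build_command` and `run_inference` are hypothetical helper names, and it assumes that you run it from the repository root and that `python` on your `PATH` is the environment where Grounding DINO was installed.

```python
import os
import shlex
import subprocess


def build_command(config_path, checkpoint_path, image_path, output_dir, text_prompt):
    """Return the argument list for demo/inference_on_a_image.py."""
    return [
        "python", "demo/inference_on_a_image.py",
        "-c", config_path,
        "-p", checkpoint_path,
        "-i", image_path,
        "-o", output_dir,
        "-t", text_prompt,
    ]


def run_inference(gpu_id, **kwargs):
    """Run the demo script with only the chosen GPU visible to it."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
    cmd = build_command(**kwargs)
    print("Running:", shlex.join(cmd))
    subprocess.run(cmd, env=env, check=True)  # raises if the script fails


# Example values taken from the table in this post.
EXAMPLE = dict(
    config_path="groundingdino/config/GroundingDINO_SwinT_OGC.py",
    checkpoint_path="weights/groundingdino_swint_ogc.pth",
    image_path="/home/user/images/catch_rugby_ball_001480.jpg",
    output_dir="/home/user/images/GroundingDINO/results",
    text_prompt="chair",
)

# Hypothetical usage:
# run_inference(gpu_id=0, **EXAMPLE)
```

Passing the arguments as a list means the prompt is handed to the script as a single argument, so no extra quoting is needed.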
Real-World Prompt Tests!
Let's now run the inference while changing only the `TEXT_PROMPT` and see the results!
- `person`: a common COCO category. Of course, it works great!
- `cat`: will it produce a false positive even though the object isn't in the image? Nothing detected. Well done!
- `rugby`: a word that is likely missing from most label sets. Will it still work? Nice, the result makes sense!
- `jump`: now let's try a verb. Whoa, it works for actions too!?
- `player is jumping`: what about a full sentence? Hmm, it seems to split the sentence into separate words instead of treating it as one phrase.
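Re-running the command for every prompt reloads the model each time. As a rough alternative, the prompts above could be swept in one Python process. This sketch assumes the `groundingdino.util.inference` helpers (`load_model`, `load_image`, `predict`, `annotate`) behave as in the usage example in the repository README, that OpenCV is installed, and that a CUDA GPU is available. The thresholds `0.35` and `0.25` are the README's example values, not settings from this post, and `result_name` and `sweep_prompts` are hypothetical names.

```python
from pathlib import Path

# Prompts tried in this post.
PROMPTS = ("person", "cat", "rugby", "jump", "player is jumping")


def result_name(prompt, suffix=".jpg"):
    """Turn a prompt into a file name by lowercasing and joining words with '_'."""
    return "_".join(prompt.lower().split()) + suffix


def sweep_prompts(config_path, checkpoint_path, image_path, output_dir,
                  prompts=PROMPTS, box_threshold=0.35, text_threshold=0.25):
    """Load the model once, then save one annotated image per prompt."""
    # Imported lazily so the helpers above work without the package installed.
    import cv2
    from groundingdino.util.inference import (
        annotate, load_image, load_model, predict,
    )

    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    model = load_model(config_path, checkpoint_path)
    image_source, image = load_image(image_path)

    for prompt in prompts:
        boxes, logits, phrases = predict(
            model=model, image=image, caption=prompt,
            box_threshold=box_threshold, text_threshold=text_threshold,
        )
        print(f"{prompt!r}: {len(boxes)} box(es), phrases={phrases}")
        annotated = annotate(image_source=image_source, boxes=boxes,
                             logits=logits, phrases=phrases)
        cv2.imwrite(str(out_dir / result_name(prompt)), annotated)


# Hypothetical usage, run from the repository root:
# sweep_prompts("groundingdino/config/GroundingDINO_SwinT_OGC.py",
#               "weights/groundingdino_swint_ogc.pth",
#               "/home/user/images/catch_rugby_ball_001480.jpg",
#               "/home/user/images/GroundingDINO/results")
```

Printing the returned `phrases` for each prompt is a simple way to see how a sentence such as `player is jumping` gets matched to boxes.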
Conclusion
Grounding DINO was very easy to install, and the inference workflow is intuitive!
It is especially useful if you want to experiment with diverse text prompts.
If it could also understand full sentences and support segmentation, that would be amazing, right!?
Well, maybe there's a model out there that already does that?
Let's keep exploring together!