SEEM Practice!! with Python and Gradio
(English) SEEM Model Practice!!
SEEM: Segment Everything Everywhere All At Once
We previously introduced SEEM in this post.
This time, we'll go through a hands-on tutorial using the SEEM model!
1. SEEM Git Clone
Clone the official repository from the GitHub page:
git clone git@github.com:UX-Decoder/Segment-Everything-Everywhere-All-At-Once.git
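The clone command above uses the SSH form of the URL, which only works if you have an SSH key registered with GitHub. If you don't, the HTTPS form of the same repository should work as well. Here is a minimal sketch of the conversion; `ssh_to_https` is a hypothetical helper written for this post, not part of the SEEM repo.

```python
import re


def ssh_to_https(remote: str) -> str:
    """Convert an SSH-style remote (git@host:owner/repo.git) to its HTTPS form."""
    match = re.fullmatch(r"git@([^:]+):(.+)", remote)
    if match is None:
        return remote  # not an SSH-style remote, so leave it unchanged
    host, path = match.groups()
    return f"https://{host}/{path}"


url = ssh_to_https(
    "git@github.com:UX-Decoder/Segment-Everything-Everywhere-All-At-Once.git"
)
print(f"git clone {url}")
```

Either URL clones the same repository; only the authentication method differs.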
2. Required Packages in Virtual Environment
I used a conda virtual environment to install the required packages:
conda create -n seem python=3.9 -y
conda activate seem
Then install the dependencies from the repo:
# Python Package Installation
pip install -r assets/requirements/requirements.txt
pip install -r assets/requirements/requirements_custom.txt
# Custom Operator [only needed for training deformable vision encoder]
cd modeling/vision/encoder/ops && sh make.sh && cd ../../../../
# System Package [only needed for SEEM demo]
sudo apt update
sudo apt install ffmpeg
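Before moving on, it can help to confirm that the system tools are actually visible from inside the conda environment. The following is a small sketch using only the standard library; `missing_tools` is a hypothetical helper name, and the tool list is just the two binaries mentioned in this post.

```python
import shutil


def missing_tools(tools):
    """Return the names from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# ffmpeg is needed for the demo; mpicc matters for the mpi4py build.
missing = missing_tools(["ffmpeg", "mpicc"])
print("Missing tools:", ", ".join(missing) if missing else "none")
```

If `ffmpeg` shows up as missing, rerun the `apt` commands above before starting the demo.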
Note: I encountered an error while installing mpi4py, so I adjusted some variables:
mpicc --version
which mpicc
export LD=ld
export CC=mpicc
export CXX=mpicxx
pip install --no-cache-dir --force-reinstall mpi4py
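The same workaround can be scripted so the compiler variables apply only to the pip call instead of the whole shell session. This is a sketch under that assumption; `mpi4py_build_env` and `mpi4py_install_command` are hypothetical helper names, and the actual install call is left commented out so the snippet has no side effects.

```python
import os
import shutil
import sys


def mpi4py_build_env(base_env=None):
    """Copy the environment and point the compiler variables at the MPI wrappers."""
    env = dict(os.environ if base_env is None else base_env)
    env.update({"LD": "ld", "CC": "mpicc", "CXX": "mpicxx"})
    return env


def mpi4py_install_command():
    """Build the same pip command as above, using the current interpreter."""
    return [sys.executable, "-m", "pip", "install",
            "--no-cache-dir", "--force-reinstall", "mpi4py"]


if shutil.which("mpicc") is None:
    print("mpicc was not found on PATH, so the MPI compiler wrapper is unavailable")
else:
    print("Would run:", " ".join(mpi4py_install_command()))
    # To actually reinstall, uncomment the next two lines:
    # import subprocess
    # subprocess.run(mpi4py_install_command(), env=mpi4py_build_env(), check=True)
```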
3. Run the SEEM Model
From the repository root, run the app script under the demo directory.
The model will download automatically and a Gradio-based interface will launch!
export PYTHONPATH=$(pwd) # Required to avoid ModuleNotFoundError: No module named 'modeling'
python demo/seem/app.py
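Why is `PYTHONPATH` needed? Python can only import a top-level package such as `modeling` if the directory containing it is on the import path. The sketch below reproduces the effect with a throwaway package; `fake_modeling` and the temporary directory are stand-ins invented for illustration, not the real SEEM code.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def can_import(name: str) -> bool:
    """Return True if `name` can be imported with the current sys.path."""
    importlib.invalidate_caches()
    try:
        importlib.import_module(name)
        return True
    except ModuleNotFoundError:
        return False


# Build a stand-in "repo root" that contains one top-level package.
repo_root = Path(tempfile.mkdtemp())
(repo_root / "fake_modeling").mkdir()
(repo_root / "fake_modeling" / "__init__.py").write_text("")

before = can_import("fake_modeling")  # the root is not on sys.path yet
sys.path.insert(0, str(repo_root))    # similar in effect to PYTHONPATH=$(pwd)
after = can_import("fake_modeling")
print("importable before:", before, "| after:", after)
```

Setting `PYTHONPATH=$(pwd)` from the repository root has a similar effect for the real `modeling` package.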
You will see logs like the following:
inputs = [ImageMask(label="[Stroke] Draw on Image",type="pil"), gr.inputs.CheckboxGroup(choices=["Stroke", "Example", "Text", "Audio", "Video", "Panoptic"], type="value", label="Interative Mode"), ImageMask(label="[Example] Draw on Referring Image",type="pil"), gr.Textbox(label="[Text] Referring Text"), gr.Audio(label="[Audio] Referring Audio", source="microphone", type="filepath"), gr.Video(label="[Video] Referring Video Segmentation",format="mp4",interactive=True)]
.../demo/seem/app.py:131: GradioDeprecationWarning: `optional` parameter is deprecated
.../demo/seem/app.py:136: GradioDeprecationWarning: gr.outputs is deprecated, use gradio.components instead
Running on local URL: http://0.0.0.0:8507
IMPORTANT: You are using gradio version 3.42.0, however version 4.44.1 is available.
--------
Running on public URL: https://98c5c3e82a6870cfe5.gradio.live
This share link expires in 72 hours.
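If you start the app from a script or over SSH, it can be handy to pull the URLs out of the log instead of scrolling for them. Here is a minimal sketch that assumes the log lines look exactly like the ones above; `extract_urls` is a hypothetical helper. Note that `0.0.0.0` means the server listens on all interfaces, so on the same machine you would normally open `http://localhost:8507`.

```python
import re

# Matches the two "Running on ... URL:" lines that Gradio printed above.
URL_LINE = re.compile(r"Running on (local|public) URL:\s*(\S+)")


def extract_urls(log_text: str) -> dict:
    """Map 'local' / 'public' to the URL found in the log text, if present."""
    return {kind: url for kind, url in URL_LINE.findall(log_text)}


sample_log = """Running on local URL: http://0.0.0.0:8507
--------
Running on public URL: https://98c5c3e82a6870cfe5.gradio.live
This share link expires in 72 hours."""

print(extract_urls(sample_log))
```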
Now, if you open https://98c5c3e82a6870cfe5.gradio.live, you'll be able to access the environment!
(Note: Since I won't keep the server running, please use your own URL.)
Here's what the interface looks like:
- Interactive Mode: You can try various prompts including stroke, text, audio, etc.
4. SEEM Model Testing
Let's try some tests!
Text prompt "A person using a computer on the couch": worked really well!
If you just upload an image, it automatically segments objects and labels them!
Using the stroke feature, it segments only the selected region and even labels it!
Conclusion
SEEM truly feels like the peak of current segmentation models.
Its flexibility and ease of use are simply amazing!
(Translated from Korean) SEEM Model Practice!!
SEEM: Segment Everything Everywhere All At Once
We looked at SEEM in the previous post!!
In this post, we'll do a hands-on run of the SEEM model!!
1. SEEM Git Clone
- Clone the repo from the official Git site!!
git clone git@github.com:UX-Decoder/Segment-Everything-Everywhere-All-At-Once.git
2. Installing Required Packages in a Virtual Environment!!
I installed the required packages in a conda virtual environment!!
conda create -n seem python=3.9 -y
conda activate seem
Now, install the requirements provided by the repo!
# Python Package Installation
pip install -r assets/requirements/requirements.txt
pip install -r assets/requirements/requirements_custom.txt
# Custom Operator [only needed for training deformable vision encoder]
cd modeling/vision/encoder/ops && sh make.sh && cd ../../../../
# System Package [only needed for SEEM demo]
sudo apt update
sudo apt install ffmpeg
Here, I ran into an error while installing mpi4py, so I changed a few variables as shown below!
mpicc --version
which mpicc
export LD=ld
export CC=mpicc
export CXX=mpicxx
pip install --no-cache-dir --force-reinstall mpi4py
3. Running the SEEM Model!!
From the repository root, run the command under the demo folder as shown below!!
The model downloads on its own, and a Gradio-based site starts up!!
export PYTHONPATH=$(pwd) # Without this you get ModuleNotFoundError: No module named 'modeling'!
python demo/seem/app.py
Then, as the model downloads, logs scroll by like the following!
inputs = [ImageMask(label="[Stroke] Draw on Image",type="pil"), gr.inputs.CheckboxGroup(choices=["Stroke", "Example", "Text", "Audio", "Video", "Panoptic"], type="value", label="Interative Mode"), ImageMask(label="[Example] Draw on Referring Image",type="pil"), gr.Textbox(label="[Text] Referring Text"), gr.Audio(label="[Audio] Referring Audio", source="microphone", type="filepath"), gr.Video(label="[Video] Referring Video Segmentation",format="mp4",interactive=True)]
.../Segment-Everything-Everywhere-All-At-Once/demo/seem/app.py:131: GradioDeprecationWarning: `optional` parameter is deprecated, and it has no effect
inputs = [ImageMask(label="[Stroke] Draw on Image",type="pil"), gr.inputs.CheckboxGroup(choices=["Stroke", "Example", "Text", "Audio", "Video", "Panoptic"], type="value", label="Interative Mode"), ImageMask(label="[Example] Draw on Referring Image",type="pil"), gr.Textbox(label="[Text] Referring Text"), gr.Audio(label="[Audio] Referring Audio", source="microphone", type="filepath"), gr.Video(label="[Video] Referring Video Segmentation",format="mp4",interactive=True)]
.../Segment-Everything-Everywhere-All-At-Once/demo/seem/app.py:136: GradioDeprecationWarning: Usage of gradio.outputs is deprecated, and will not be supported in the future, please import your components from gradio.components
gr.outputs.Image(
Running on local URL: http://0.0.0.0:8507
IMPORTANT: You are using gradio version 3.42.0, however version 4.44.1 is available, please upgrade.
--------
Running on public URL: https://98c5c3e82a6870cfe5.gradio.live
This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
Then, if you open https://98c5c3e82a6870cfe5.gradio.live, you'll be able to access the environment!!
(I won't keep this server running, so please connect with your own URL rather than mine!!)
You'll see a screen like the one below!
- Interactive Mode: you can enter prompts such as stroke (drawing on the image), text, audio, and so on!
4. Testing the SEEM Model!!
Now, let's run some tests!!
Text prompt test for a person using a computer on the sofa! Very good!
What if you just upload an image!? It segments on its own and even names the objects!!
This time, what if you use the stroke feature!? It segments only that region and even names it!!
Wrap-up
SEEM seems to be the peak of segmentation so far! It's truly amazing!