
๐Ÿ–ฅ๏ธ SEEM Practice!! - SEEM ์‹ค์Šต!! with python. gradio

๐Ÿ–ฅ๏ธ SEEM Practice!! - SEEM ์‹ค์Šต!! with python. gradio

🦖 SEEM Model Practice!!

SEEM: Segment Everything Everywhere All At Once

We previously introduced SEEM in this post.
This time, we'll go through a hands-on tutorial using the SEEM model!


🧱 1. SEEM Git Clone

Clone the official repository from the GitHub page:

git clone git@github.com:UX-Decoder/Segment-Everything-Everywhere-All-At-Once.git

📦 2. Required Packages in a Virtual Environment

I used a conda virtual environment to install the required packages:

conda create -n seem python=3.9 -y
conda activate seem

Then install the dependencies from the repo:

# Python Package Installation
pip install -r assets/requirements/requirements.txt
pip install -r assets/requirements/requirements_custom.txt

# Custom Operator [only needed for training deformable vision encoder]
cd modeling/vision/encoder/ops && sh make.sh && cd ../../../../

# System Package [only needed for SEEM demo]
sudo apt update
sudo apt install ffmpeg

Note: I hit an error while installing mpi4py, so I set a few environment variables before reinstalling:

mpicc --version
which mpicc
export LD=ld
export CC=mpicc
export CXX=mpicxx
pip install --no-cache-dir --force-reinstall mpi4py

🧊 3. Run the SEEM Model

Run the following from the repository root:
The model weights download automatically and a Gradio-based interface launches!

export PYTHONPATH=$(pwd)  # Required to avoid ModuleNotFoundError: No module named 'modeling'
python demo/seem/app.py
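The `PYTHONPATH` export matters because Python only resolves a top-level package like `modeling` when its parent directory is on `sys.path`. A minimal, self-contained sketch of the same mechanism (using a throwaway `toy_modeling` package, not the actual repo):

```python
import os
import sys
import tempfile

# Build a throwaway "repo" containing a top-level package named toy_modeling.
repo_root = tempfile.mkdtemp()
pkg_dir = os.path.join(repo_root, "toy_modeling")
os.makedirs(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("NAME = 'toy_modeling'\n")

# Putting the repo root on sys.path is exactly what
# `export PYTHONPATH=$(pwd)` does for the SEEM checkout;
# without it, the import raises ModuleNotFoundError.
sys.path.insert(0, repo_root)
import toy_modeling

print(toy_modeling.NAME)  # -> toy_modeling
```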

You will see logs like the following:

  inputs = [ImageMask(label="[Stroke] Draw on Image",type="pil"), gr.inputs.CheckboxGroup(choices=["Stroke", "Example", "Text", "Audio", "Video", "Panoptic"], type="value", label="Interative Mode"), ImageMask(label="[Example] Draw on Referring Image",type="pil"), gr.Textbox(label="[Text] Referring Text"), gr.Audio(label="[Audio] Referring Audio", source="microphone", type="filepath"), gr.Video(label="[Video] Referring Video Segmentation",format="mp4",interactive=True)]
.../demo/seem/app.py:131: GradioDeprecationWarning: `optional` parameter is deprecated
.../demo/seem/app.py:136: GradioDeprecationWarning: gr.outputs is deprecated, use gradio.components instead
Running on local URL:  http://0.0.0.0:8507
IMPORTANT: You are using gradio version 3.42.0, however version 4.44.1 is available.
--------
Running on public URL: https://98c5c3e82a6870cfe5.gradio.live

This share link expires in 72 hours.

Now, if you open https://98c5c3e82a6870cfe5.gradio.live, you'll be able to access the environment!
(Note: Since I won't keep the server running, please use your own URL.)

Here's what the interface looks like:

ui

  • Interactive Mode: You can try various prompts including stroke, text, audio, etc.

🧊 4. SEEM Model Testing

Letโ€™s try some tests!

Text prompt: "A person using a computer on the couch" worked really well!
test1
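Conceptually, text-referring segmentation embeds the prompt and every candidate mask into a joint space and picks the best match. A toy illustration with made-up embeddings (nothing here comes from SEEM's actual weights):

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Invented joint-space embeddings for three candidate masks.
mask_embeds = {
    "person_on_couch": np.array([0.9, 0.1, 0.3]),
    "table":           np.array([0.1, 0.9, 0.2]),
    "lamp":            np.array([0.2, 0.3, 0.9]),
}
# Invented embedding for "A person using a computer on the couch".
text_embed = np.array([0.85, 0.15, 0.25])

scores = {name: cosine(e, text_embed) for name, e in mask_embeds.items()}
best = max(scores, key=scores.get)
print(best)  # -> person_on_couch
```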

Segmentation based on keywords works great too!
test2

If you just upload an image, it automatically segments objects and labels them!
test3
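A panoptic result like this boils down to an integer label map with one class id per pixel. A toy numpy sketch (invented classes, not SEEM's real output format) of splitting it into named per-class masks:

```python
import numpy as np

# Toy 4x4 label map: each pixel holds a class id
# (0 = background, 1 = person, 2 = couch -- invented classes).
seg_map = np.array([
    [0, 0, 2, 2],
    [0, 1, 1, 2],
    [0, 1, 1, 2],
    [0, 0, 2, 2],
])
labels = {1: "person", 2: "couch"}

# One boolean mask per labeled class, plus its pixel count.
masks = {name: seg_map == cls for cls, name in labels.items()}
for name, mask in masks.items():
    print(name, int(mask.sum()))  # person 4, couch 6
```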

Using the stroke feature, it segments only the selected region and even labels it!
test4
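Stroke selection can be thought of as a majority vote over the label map under the user's scribble. A toy sketch (invented labels and stroke coordinates):

```python
import numpy as np
from collections import Counter

# Toy label map (invented classes): 0 = background, 1 = laptop, 2 = table.
seg_map = np.array([
    [0, 2, 2, 2],
    [0, 2, 1, 1],
    [0, 2, 1, 1],
    [0, 0, 0, 0],
])
names = {0: "background", 1: "laptop", 2: "table"}

# (row, col) pixels the user's stroke passed over.
stroke = [(1, 2), (1, 3), (2, 2)]

# Majority class under the stroke picks the segment to keep.
votes = Counter(int(seg_map[r, c]) for r, c in stroke)
selected = votes.most_common(1)[0][0]
mask = seg_map == selected  # mask of just the selected segment

print(names[selected], int(mask.sum()))  # laptop 4
```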


🎉 Conclusion

SEEM truly feels like the peak of current segmentation models.
Its flexibility and ease of use are simply amazing!



This post is licensed under CC BY 4.0 by the author.