🖥️ Image segmentation with Python using SAM! Cutting out subjects ("nukki") in Python!? Hands-on with SAM (Segment Anything Model)

🧼 Image segmentation with Python? SAM in Action!

Today, let's practice image segmentation using SAM (Segment Anything Model),
building on the theory we explored in the previous post!

We'll use Ultralytics to run SAM easily,
instead of downloading and configuring the model from Meta or Hugging Face.

Let's start by loading the SAM model using Ultralytics:

from ultralytics import SAM
import cv2
import matplotlib.pyplot as plt
import numpy as np

# Load model
model = SAM("sam_b.pt")
model.eval()

Once the model is loaded, the architecture consists mainly of:

  • image_encoder
  • prompt_encoder
  • mask_decoder

And more specifically, the structure looks like this:

# SAM (Segment Anything Model)
- model: SAMModel
  - image_encoder: ImageEncoderViT
    - patch_embed: PatchEmbed
      - proj: Conv2d(3, 768, kernel_size=16, stride=16)
    - blocks: ModuleList
      - Block[0-11] (12 Transformer Blocks)
        - norm1: LayerNorm(768)
        - attn: REAttention
          - qkv: Linear(768 → 2304)
          - proj: Linear(768 → 768)
        - norm2: LayerNorm(768)
        - mlp: MLPBlock
          - lin1: Linear(768 → 3072)
          - lin2: Linear(3072 → 768)
          - act: GELU
    - neck: Sequential
      - Conv2d(768 → 256, 1x1)
      - LayerNorm2d
      - Conv2d(256 → 256, 3x3, padding=1)
      - LayerNorm2d

  - prompt_encoder: PromptEncoder
    - pe_layer: PositionEmbeddingRandom
    - point_embeddings: ModuleList (4 x Embedding(1, 256))
    - not_a_point_embed: Embedding(1, 256)
    - no_mask_embed: Embedding(1, 256)
    - mask_downscaling: Sequential
      - Conv2d(1 → 4, kernel=2, stride=2)
      - LayerNorm2d
      - GELU
      - Conv2d(4 → 16, kernel=2, stride=2)
      - LayerNorm2d
      - GELU
      - Conv2d(16 → 256, kernel=1, stride=1)

  - mask_decoder: MaskDecoder
    - transformer: TwoWayTransformer
      - layers: ModuleList (2 x TwoWayAttentionBlock)
        - self_attn: Attention
        - cross_attn_token_to_image: Attention
        - cross_attn_image_to_token: Attention
        - norm1, norm2, norm3, norm4: LayerNorm(256)
        - mlp: MLPBlock
          - lin1: Linear(256 → 2048)
          - lin2: Linear(2048 → 256)
          - act: ReLU
      - final_attn_token_to_image: Attention
      - norm_final_attn: LayerNorm(256)
    - iou_token: Embedding(1, 256)
    - mask_tokens: Embedding(4, 256)
    - output_upscaling: Sequential
      - ConvTranspose2d(256 → 64, kernel=2, stride=2)
      - LayerNorm2d
      - GELU
      - ConvTranspose2d(64 → 32, kernel=2, stride=2)
      - GELU
    - output_hypernetworks_mlps: ModuleList (4 x MLP)
      - Linear(256 → 256)
      - Linear(256 → 256)
      - Linear(256 → 32)
    - iou_prediction_head: MLP
      - Linear(256 → 256)
      - Linear(256 → 256)
      - Linear(256 → 4)
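To make the numbers in this tree a bit more concrete, here is a small sketch that traces spatial sizes through the listed layers. It assumes a 1024x1024 input (SAM's usual input resolution, which the tree itself does not state), and the helper names `conv_out`, `conv_transpose_out`, and `linear_params` are made up for this post rather than taken from Ultralytics.

```python
def conv_out(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Output length of a Conv2d along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out(size: int, kernel: int, stride: int) -> int:
    """Output length of an unpadded ConvTranspose2d along one spatial axis."""
    return (size - 1) * stride + kernel


def linear_params(n_in: int, n_out: int) -> int:
    """Number of weights plus biases in a Linear layer."""
    return n_in * n_out + n_out


INPUT_SIZE = 1024  # assumption: not stated in the layer tree

# patch_embed: Conv2d(3, 768, kernel_size=16, stride=16)
grid = conv_out(INPUT_SIZE, kernel=16, stride=16)
num_tokens = grid * grid

# neck: the 3x3 conv with padding=1 keeps the grid size unchanged
neck_grid = conv_out(grid, kernel=3, stride=1, padding=1)

# output_upscaling: two ConvTranspose2d(kernel=2, stride=2) layers
mask_side = conv_transpose_out(conv_transpose_out(neck_grid, 2, 2), 2, 2)

# qkv: Linear(768 -> 2304) packs query, key and value (3 x 768) into one layer
qkv_params = linear_params(768, 2304)

print(f"token grid: {grid}x{grid} = {num_tokens} tokens of width 768")
print(f"neck output grid: {neck_grid}x{neck_grid} with 256 channels")
print(f"decoder mask side before resizing: {mask_side}")
print(f"parameters in one qkv layer: {qkv_params}")
```

Under that assumption, each 16x16 patch becomes one token, and the decoder's two stride-2 upsampling steps enlarge the encoder grid by a factor of four per side.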

Now let's try out three different segmentation methods: point, bounding box, and fully automatic!

We'll use the following image of a dog:

dog


🟢 Prompt Type 1: Point-Based Segmentation

You specify a single point, and the model segments the object around it.

from ultralytics import SAM
import cv2
import matplotlib.pyplot as plt
import numpy as np

model = SAM("sam_b.pt")

img_name = "dog.jpg"
my_points = [3000, 2000]

results = model(img_name, points=[my_points], labels=[1])

image = cv2.imread(img_name)
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

mask = None
for result in results:
    if result.masks is not None:
        mask = result.masks.data[0].cpu().numpy()
        break

if mask is None:
    raise ValueError("No mask found.")

def draw_point(img, point, color=(255, 0, 0), radius=15):
    img_with_point = img.copy()
    cv2.circle(img_with_point, tuple(point), radius, color, -1)
    return img_with_point

image_with_point = draw_point(image_rgb, my_points)

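# Binary mask -> 0/255 image, then colorize it
mask_visual = (mask * 255).astype(np.uint8)
colored_mask = cv2.applyColorMap(mask_visual, cv2.COLORMAP_JET)
# applyColorMap returns BGR; convert so it matches image_rgb's channel order
colored_mask = cv2.cvtColor(colored_mask, cv2.COLOR_BGR2RGB)
combined = cv2.addWeighted(image_with_point, 0.7, colored_mask, 0.5, 0)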

plt.figure(figsize=(20, 6))

plt.subplot(1, 3, 1)
plt.imshow(image_with_point)
plt.title("Original + Point")
plt.axis("off")

plt.subplot(1, 3, 2)
plt.imshow(mask, cmap='gray')
plt.title("Mask Only")
plt.axis("off")

plt.subplot(1, 3, 3)
plt.imshow(combined)
plt.title("Overlay: Image + Mask + Point")
plt.axis("off")

plt.tight_layout()
plt.show()

point

As shown above, the model extracts the dog's head based on the red point!
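One detail that is easy to get wrong when checking a result like this: the prompt point is given as (x, y), while a NumPy mask is indexed as [row, column], i.e. [y, x]. The sketch below is a hypothetical helper (`point_in_mask` is not an Ultralytics function) that checks whether a prompt point landed on the returned mask. It uses a small synthetic mask so that it runs without the model.

```python
import numpy as np


def point_in_mask(mask: np.ndarray, point_xy) -> bool:
    """Return True if the (x, y) prompt point lies on a foreground pixel."""
    x, y = int(point_xy[0]), int(point_xy[1])
    h, w = mask.shape[:2]
    if not (0 <= x < w and 0 <= y < h):
        return False  # point is outside the image
    return bool(mask[y, x])  # row index is y, column index is x


# Synthetic stand-in for a SAM mask: 6 rows x 8 columns with a filled rectangle
demo_mask = np.zeros((6, 8), dtype=bool)
demo_mask[2:5, 3:7] = True

inside = point_in_mask(demo_mask, (4, 3))   # x=4, y=3 lies in the rectangle
outside = point_in_mask(demo_mask, (0, 0))  # top-left corner is background
print(inside, outside)
```

With the real result, the same check would be `point_in_mask(mask, my_points)`, assuming `mask` has the same height and width as the original image.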


🔴 Prompt Type 2: Bounding Box-Based Segmentation

You provide a rectangular bounding box, and the model segments the object inside that region.

from ultralytics import SAM
import cv2
import matplotlib.pyplot as plt
import numpy as np

model = SAM("sam_b.pt")

img_name = "dog.jpg"
my_bboxes = [1430.2, 828, 4471.9, 3836.4]

results = model(img_name, bboxes=my_bboxes)

image = cv2.imread(img_name)
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

plt.figure(figsize=(10, 10))
plt.imshow(image_rgb)

for result in results:
    if result.masks is not None:
        mask = result.masks.data[0].cpu().numpy()
        plt.imshow(mask, alpha=0.5, cmap='jet')

rect = plt.Rectangle((my_bboxes[0], my_bboxes[1]),
                     my_bboxes[2] - my_bboxes[0],
                     my_bboxes[3] - my_bboxes[1],
                     linewidth=2, edgecolor='red', facecolor='none')
plt.gca().add_patch(rect)

plt.title(f"SAM Segmentation with Box Prompt on {img_name}")
plt.axis('off')
plt.show()

bbox

The red bounding box lets SAM extract the subject precisely within the box.
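Once a mask is available, cutting the subject out of its background amounts to using the mask as an alpha channel. The following is a minimal sketch of that idea using NumPy only. `cutout_rgba` and its parameters are names invented for this example, and the tiny synthetic image stands in for `image_rgb` and the SAM mask.

```python
import numpy as np


def cutout_rgba(image_rgb: np.ndarray, mask: np.ndarray, box_xyxy=None) -> np.ndarray:
    """Return an RGBA image that is opaque on the mask and transparent elsewhere.

    If box_xyxy = [x_min, y_min, x_max, y_max] is given, crop to that box.
    """
    alpha = mask.astype(bool).astype(np.uint8) * 255
    rgba = np.dstack([image_rgb, alpha])  # (H, W, 3) + (H, W) -> (H, W, 4)
    if box_xyxy is not None:
        h, w = mask.shape[:2]
        x0, y0, x1, y1 = (int(round(v)) for v in box_xyxy)
        x0, y0 = max(x0, 0), max(y0, 0)
        x1, y1 = min(x1, w), min(y1, h)
        rgba = rgba[y0:y1, x0:x1]
    return rgba


# Synthetic 4x5 gray image with a 2x3 "subject" in the middle
demo_image = np.full((4, 5, 3), 200, dtype=np.uint8)
demo_mask = np.zeros((4, 5), dtype=bool)
demo_mask[1:3, 1:4] = True

full = cutout_rgba(demo_image, demo_mask)
cropped = cutout_rgba(demo_image, demo_mask, box_xyxy=[1, 1, 4, 3])
print(full.shape, cropped.shape)
```

To save such a cutout with OpenCV, convert the channel order first, for example `cv2.imwrite("dog_cutout.png", cv2.cvtColor(rgba, cv2.COLOR_RGBA2BGRA))`. PNG keeps the alpha channel, whereas JPEG would drop it.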


⚙️ Prompt Type 3: No Prompt (Automatic Mask Generation / AMG)

This time, we don't provide any prompts at all, and SAM automatically segments the image into multiple parts.

from ultralytics import SAM
import cv2
import matplotlib.pyplot as plt
import numpy as np

print("\n--- Method 3: Automatic Mask Generation (No Prompt) ---")

model_amg = SAM("sam_b.pt")
image_path = "dog.jpg"

image = cv2.imread(image_path)
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

print(f"Running prompt-free segmentation on '{image_path}'...")
results_amg = model_amg(image_path)
print("Done.")

plt.figure(figsize=(12, 10))
plt.imshow(image_rgb)
plt.title("Method 3: SAM Automatic Mask Generation")
plt.axis('off')

if results_amg and results_amg[0].masks is not None:
    masks_amg = results_amg[0].masks.data.cpu().numpy()
    print(f"{len(masks_amg)} masks generated")

    np.random.seed(42)
    for i, mask in enumerate(masks_amg):
        color = np.concatenate([np.random.random(3) * 255, np.array([128])], axis=0)
        h, w = mask.shape[-2:]
        mask_image = mask.reshape(h, w, 1) * color.reshape(1, 1, -1)
        plt.imshow(mask_image.astype(np.uint8))
else:
    print("No masks found.")

plt.show()

The image is segmented into 33 distinct masks, each representing different parts of the dog or the background.

AMG
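The listing above calls `plt.imshow` once per mask. An alternative, sketched here, is to merge all masks into a single RGBA overlay first and draw it once. `composite_masks` is a hypothetical helper, and the input is assumed to be an (N, H, W) array like `results_amg[0].masks.data.cpu().numpy()`. Painting larger masks first is a design choice of this sketch, made so that small segments are not hidden.

```python
import numpy as np


def composite_masks(masks: np.ndarray, alpha: int = 128, seed: int = 42) -> np.ndarray:
    """Merge N masks of shape (N, H, W) into one (H, W, 4) RGBA overlay."""
    masks = masks.astype(bool)
    rng = np.random.default_rng(seed)
    overlay = np.zeros((*masks.shape[1:], 4), dtype=np.uint8)
    # Sort by area, largest first, so smaller masks end up on top
    order = np.argsort(-masks.reshape(len(masks), -1).sum(axis=1))
    for idx in order:
        color = rng.integers(0, 256, size=3, dtype=np.uint8)
        overlay[masks[idx], :3] = color  # random RGB color for this mask
        overlay[masks[idx], 3] = alpha   # semi-transparent where the mask is set
    return overlay


# Two synthetic masks: a large region and a small one nested inside it
demo_masks = np.zeros((2, 4, 4), dtype=bool)
demo_masks[0, :, :3] = True
demo_masks[1, 1:3, 1:3] = True

overlay = composite_masks(demo_masks)
print(overlay.shape, int((overlay[..., 3] > 0).sum()))
```

The overlay can then be drawn with `plt.imshow(image_rgb)` followed by `plt.imshow(overlay)`. Pixels not covered by any mask have alpha 0 and stay fully transparent.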


🎉 Conclusion

With just a few lines of Python, we've performed professional-level segmentation using SAM!
The model's performance is truly impressive. Whether using points, boxes, or no prompt at all,
SAM is a powerful tool for background removal and object segmentation.

Let's keep exploring!


(Korean version, translated) Cutting out subjects ("nukki") with Python!? SAM hands-on

"Nukki" (from Japanese: 抜き) is a term used mainly in visual design work:
tracing the outline of a subject in order to separate it from the background of the original image is called "nukki" or "nukki ttagi"!

Today we'll do a hands-on exercise based on the SAM principles covered in the previous post!
Cutting out subjects with Python code!
We'll practice with SAM (Segment Anything Model). The most fundamental ways to obtain the SAM model are downloading it from Meta's official site or getting it from Hugging Face,
but today we'll use Ultralytics, which is simpler to work with, to run SAM easily!

Load the SAM model from ultralytics as shown below:

from ultralytics import SAM
import cv2
import matplotlib.pyplot as plt
import numpy as np

# Load model
model = SAM("sam_b.pt")
model.eval()

Looking at the structure of the loaded model,
at a high level it consists of image_encoder / prompt_encoder / mask_decoder.
The detailed layer tree is identical to the one listed in the English section above
(12 Transformer blocks in the image encoder and a 2-layer TwoWayTransformer in the mask decoder).

Now,
let's try three approaches: a point, a bounding box, and no prompt at all!

We'll work with the dog image below. Let's go!

dog

Prompt 1. Point-based!

The point-based approach takes a single point
and extracts the segment surrounding that point.

from ultralytics import SAM
import cv2
import matplotlib.pyplot as plt
import numpy as np

# Load the model
model = SAM("sam_b.pt")

# Define the image and the prompt point (x, y)
img_name = "dog.jpg"
my_points = [3000, 2000]

# Run segmentation (label 1 marks a foreground point)
results = model(img_name, points=[my_points], labels=[1])

# Load the image and convert BGR -> RGB
image = cv2.imread(img_name)
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Extract the first mask
mask = None
for result in results:
    if result.masks is not None:
        mask = result.masks.data[0].cpu().numpy()
        break

if mask is None:
    raise ValueError("No mask found.")

# Helper that draws the red prompt point
def draw_point(img, point, color=(255, 0, 0), radius=15):
    img_with_point = img.copy()
    cv2.circle(img_with_point, tuple(point), radius, color, -1)
    return img_with_point

# Image + point
image_with_point = draw_point(image_rgb, my_points)

# Convert the mask to a 0/255 image for visualization
mask_visual = (mask * 255).astype(np.uint8)

# Build the colormap overlay (applyColorMap returns BGR, so convert to RGB)
colored_mask = cv2.applyColorMap(mask_visual, cv2.COLORMAP_JET)
colored_mask = cv2.cvtColor(colored_mask, cv2.COLOR_BGR2RGB)
combined = cv2.addWeighted(image_with_point, 0.7, colored_mask, 0.5, 0)

# Visualization
plt.figure(figsize=(20, 6))

# 1. Original image + point
plt.subplot(1, 3, 1)
plt.imshow(image_with_point)
plt.title("Original + Point")
plt.axis("off")

# 2. Mask only
plt.subplot(1, 3, 2)
plt.imshow(mask, cmap='gray')
plt.title("Mask Only")
plt.axis("off")

# 3. Mask + original image + point
plt.subplot(1, 3, 3)
plt.imshow(combined)
plt.title("Overlay: Image + Mask + Point")
plt.axis("off")

plt.tight_layout()
plt.show()

# Print a summary of the result
print("Segmentation Result:")
print(f"Number of masks: {len(results[0].masks.data)}")
print(f"Mask shape: {results[0].masks.data[0].shape}")

point

The result!
Segmenting the dog around the red point shown in the first image
yields the mask shown in the second image,
and overlaying the two, as in the third image, shows that only the dog's face is extracted!
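The overlay in the listing blends a colormap over the whole frame with `cv2.addWeighted`, so the background is tinted as well. If you want to color only the segmented pixels, one option is the following sketch. `tint_mask`, `color`, and `strength` are names chosen for this example, and the tiny synthetic image stands in for `image_rgb`.

```python
import numpy as np


def tint_mask(image_rgb, mask, color=(0, 120, 200), strength=0.5):
    """Blend `color` into the pixels selected by `mask`; leave the rest untouched."""
    out = image_rgb.astype(np.float32)  # astype copies, so the input is not modified
    selected = mask.astype(bool)
    tint = np.array(color, dtype=np.float32)
    out[selected] = (1.0 - strength) * out[selected] + strength * tint
    return out.round().astype(np.uint8)


# Synthetic 2x2 gray image where only the top-left pixel is "segmented"
demo_image = np.full((2, 2, 3), 100, dtype=np.uint8)
demo_mask = np.array([[True, False], [False, False]])

tinted = tint_mask(demo_image, demo_mask)
print(tinted[0, 0], tinted[1, 1])
```

With the real data this would be `plt.imshow(tint_mask(image_with_point, mask))`, assuming the mask and image share the same height and width.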

Prompt 2. Bounding box-based!

The bounding-box approach specifies a box with four coordinates ([x_min, y_min, x_max, y_max])
and extracts the segment inside that box.

from ultralytics import SAM
import cv2
import matplotlib.pyplot as plt
import numpy as np

# Load the model
model = SAM("sam_b.pt")

img_name = "dog.jpg"

my_bboxes = [1430.2, 828, 4471.9, 3836.4]
# Run inference with a box prompt ([x_min, y_min, x_max, y_max])
results = model(img_name, bboxes=my_bboxes)

# Load the original image (for visualization)
image = cv2.imread(img_name)
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # BGR -> RGB

# Visualize the result
plt.figure(figsize=(10, 10))
plt.imshow(image_rgb)

# Mask overlay
for result in results:
    if result.masks is not None:
        mask = result.masks.data[0].cpu().numpy()  # take the first mask
        plt.imshow(mask, alpha=0.5, cmap='jet')  # draw the mask semi-transparently

# Draw the box prompt
rect = plt.Rectangle((my_bboxes[0], my_bboxes[1]),
                     my_bboxes[2] - my_bboxes[0],
                     my_bboxes[3] - my_bboxes[1],
                     linewidth=2, edgecolor='red', facecolor='none',
                     label=f'my_bboxes {my_bboxes}')
plt.gca().add_patch(rect)

# Title and display settings
plt.title(f"SAM Segmentation with Box Prompt on {img_name}")
plt.legend()
plt.axis('off')
plt.show()

# Print additional information (optional)
print("Segmentation Result:")
print(f"Number of masks: {len(results[0].masks.data)}")
print(f"Mask shape: {results[0].masks.data[0].shape}")

bbox

The result!
The dog is extracted from within the red bbox (bounding box)!
Easy, right?

Prompt 3. Leave it to the model! AMG-based!

This time, with no prompt at all, the image is split into fine-grained segments!

from ultralytics import SAM
import cv2
import matplotlib.pyplot as plt
import numpy as np

print("\n--- Method 3: Automatic Mask Generation (AMG), no prompt ---")

# Load the model (the same model can be reused)
model_amg = SAM("sam_b.pt")

# Image path
image_path = "dog.jpg"

# Load the original image
image = cv2.imread(image_path)
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Run inference WITHOUT a prompt (AMG style).
# When no prompt is given, the ultralytics library may internally try to
# generate multiple segments in a way similar to AMG.
print(f"Running prompt-free (AMG-style) inference on '{image_path}'...")
results_amg = model_amg(image_path)  # no bboxes argument!
print("Inference done.")

# Visualize the result
plt.figure(figsize=(12, 10))
plt.imshow(image_rgb)
plt.title("Method 3: SAM Automatic Mask Generation (No Prompt)")
plt.axis('off')

# Mask overlay
if results_amg and results_amg[0].masks is not None:
    masks_amg = results_amg[0].masks.data.cpu().numpy()
    print(f"AMG result: {len(masks_amg)} masks generated")

    # Draw each mask in a different color
    np.random.seed(42)  # fix the seed so the colors are reproducible
    for i, mask in enumerate(masks_amg):
        # Random semi-transparent RGBA color for this mask
        color = np.concatenate([np.random.random(3) * 255, np.array([128])], axis=0)
        h, w = mask.shape[-2:]
        mask_image = mask.reshape(h, w, 1) * color.reshape(1, 1, -1)
        plt.imshow(mask_image.astype(np.uint8))  # cast to integers before overlaying
        # print(f"  - Mask {i+1} shape: {mask.shape}")  # per-mask info (optional)
else:
    print("No AMG segmentation result found.")

plt.show()

# Print additional information
if results_amg and results_amg[0].masks is not None:
    print("\nAMG segmentation details:")
    print(f" - Total number of masks: {len(results_amg[0].masks.data)}")
    # AMG output usually carries extra per-mask information (score, area, etc.),
    # but the result structure of the ultralytics wrapper needs to be checked.
    # print(f" - Result object type: {type(results_amg[0])}")  # to inspect the structure

As shown below, the dog image has been split
into multiple segments!

It was divided into 33 segments!

AMG
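With this many masks, some may cover only a small fragment of the image. One possible post-processing step, sketched below, is to keep only masks above a minimum area and order them from largest to smallest. `filter_by_area` and `min_fraction` are hypothetical names, and the input is assumed to be an (N, H, W) array like `results_amg[0].masks.data.cpu().numpy()`.

```python
import numpy as np


def filter_by_area(masks: np.ndarray, min_fraction: float = 0.01):
    """Keep masks covering at least `min_fraction` of the image, largest first."""
    masks = masks.astype(bool)
    areas = masks.reshape(len(masks), -1).sum(axis=1)  # pixel count per mask
    total = masks.shape[1] * masks.shape[2]
    keep = np.array(
        [i for i in np.argsort(-areas) if areas[i] / total >= min_fraction],
        dtype=int,
    )
    return masks[keep], areas[keep]


# Three synthetic masks on a 10x10 image: 50 px, 1 px and 20 px
demo_masks = np.zeros((3, 10, 10), dtype=bool)
demo_masks[0, :5, :] = True
demo_masks[1, 0, 0] = True
demo_masks[2, :2, :] = True

kept, kept_areas = filter_by_area(demo_masks, min_fraction=0.05)
print(kept.shape, kept_areas.tolist())
```

The threshold is a judgment call: a value that removes noise in one image may discard a small but meaningful object in another, so it is worth inspecting the discarded masks before settling on one.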


That wraps up our hands-on Python session with SAM!

The model's performance is truly impressive!

This post is licensed under CC BY 4.0 by the author.