🖥️ Image segmentation with Python using SAM! - Cutting out images (nukki) with Python!? SAM (Segment Anything Model) hands-on
🧼 Image segmentation with Python? SAM in Action!
Today, let's practice image segmentation using SAM (Segment Anything Model),
based on the theory we explored in the previous post!
We'll use Ultralytics to run SAM easily,
instead of downloading and configuring the model from Meta or Hugging Face.
Let's start by loading the SAM model using Ultralytics:
from ultralytics import SAM
import cv2
import matplotlib.pyplot as plt
import numpy as np
# Load model
model = SAM("sam_b.pt")
model.eval()
Once the model is loaded, the architecture consists mainly of:
image_encoder
prompt_encoder
mask_decoder
And more specifically, the structure looks like this:
# SAM (Segment Anything Model)
- model: SAMModel
  - image_encoder: ImageEncoderViT
    - patch_embed: PatchEmbed
      - proj: Conv2d(3, 768, kernel_size=16, stride=16)
    - blocks: ModuleList
      - Block[0-11] (12 Transformer Blocks)
        - norm1: LayerNorm(768)
        - attn: REAttention
          - qkv: Linear(768 → 2304)
          - proj: Linear(768 → 768)
        - norm2: LayerNorm(768)
        - mlp: MLPBlock
          - lin1: Linear(768 → 3072)
          - lin2: Linear(3072 → 768)
          - act: GELU
    - neck: Sequential
      - Conv2d(768 → 256, 1x1)
      - LayerNorm2d
      - Conv2d(256 → 256, 3x3, padding=1)
      - LayerNorm2d
  - prompt_encoder: PromptEncoder
    - pe_layer: PositionEmbeddingRandom
    - point_embeddings: ModuleList (4 x Embedding(1, 256))
    - not_a_point_embed: Embedding(1, 256)
    - no_mask_embed: Embedding(1, 256)
    - mask_downscaling: Sequential
      - Conv2d(1 → 4, kernel=2, stride=2)
      - LayerNorm2d
      - GELU
      - Conv2d(4 → 16, kernel=2, stride=2)
      - LayerNorm2d
      - GELU
      - Conv2d(16 → 256, kernel=1, stride=1)
  - mask_decoder: MaskDecoder
    - transformer: TwoWayTransformer
      - layers: ModuleList (2 x TwoWayAttentionBlock)
        - self_attn: Attention
        - cross_attn_token_to_image: Attention
        - cross_attn_image_to_token: Attention
        - norm1, norm2, norm3, norm4: LayerNorm(256)
        - mlp: MLPBlock
          - lin1: Linear(256 → 2048)
          - lin2: Linear(2048 → 256)
          - act: ReLU
      - final_attn_token_to_image: Attention
      - norm_final_attn: LayerNorm(256)
    - iou_token: Embedding(1, 256)
    - mask_tokens: Embedding(4, 256)
    - output_upscaling: Sequential
      - ConvTranspose2d(256 → 64, kernel=2, stride=2)
      - LayerNorm2d
      - GELU
      - ConvTranspose2d(64 → 32, kernel=2, stride=2)
      - GELU
    - output_hypernetworks_mlps: ModuleList (4 x MLP)
      - Linear(256 → 256)
      - Linear(256 → 256)
      - Linear(256 → 32)
    - iou_prediction_head: MLP
      - Linear(256 → 256)
      - Linear(256 → 256)
      - Linear(256 → 4)
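To get a feel for the sizes in this tree, here is a small sketch that counts the parameters of a few of the layers listed above. The helper names (`linear_params`, `conv2d_params`) are made up for this post and are not part of Ultralytics, and the counts assume every layer carries a bias term. The 1024x1024 input size is SAM's standard input resolution.

```python
def linear_params(in_features: int, out_features: int, bias: bool = True) -> int:
    """Parameter count of a fully connected layer (weights plus optional bias)."""
    return in_features * out_features + (out_features if bias else 0)


def conv2d_params(in_ch: int, out_ch: int, kernel: int, bias: bool = True) -> int:
    """Parameter count of a square-kernel 2D convolution."""
    return in_ch * out_ch * kernel * kernel + (out_ch if bias else 0)


# patch_embed.proj: Conv2d(3, 768, kernel_size=16, stride=16)
patch_embed = conv2d_params(3, 768, 16)

# attn.qkv: Linear(768 -> 2304); 2304 = 3 * 768 (query, key and value stacked)
qkv = linear_params(768, 2304)

# mlp: lin1 (768 -> 3072) followed by lin2 (3072 -> 768)
mlp = linear_params(768, 3072) + linear_params(3072, 768)

# A stride-16 patch embedding turns a 1024x1024 input into a 64x64 token grid
tokens_per_side = 1024 // 16

print(f"patch_embed: {patch_embed:,} params")
print(f"qkv:         {qkv:,} params")
print(f"mlp:         {mlp:,} params")
print(f"tokens:      {tokens_per_side} x {tokens_per_side} = {tokens_per_side ** 2}")
```

Most of the weight in each encoder block sits in the MLP and the qkv projection, which is why the image encoder dominates the model size.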
Now let's try out three different segmentation methods: point, bounding box, and fully automatic!
We'll use the following image of a dog:
🟢 Prompt Type 1: Point-Based Segmentation
You specify a single point, and the model segments the object around it.
from ultralytics import SAM
import cv2
import matplotlib.pyplot as plt
import numpy as np

# Load the model, then define the image and the point prompt (x, y) in pixels
model = SAM("sam_b.pt")
img_name = "dog.jpg"
my_points = [3000, 2000]

# Run segmentation; label 1 marks the point as foreground
results = model(img_name, points=[my_points], labels=[1])

# OpenCV loads images as BGR, so convert to RGB for matplotlib
image = cv2.imread(img_name)
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Take the first mask from the results
mask = None
for result in results:
    if result.masks is not None:
        mask = result.masks.data[0].cpu().numpy()
        break
if mask is None:
    raise ValueError("No mask found.")

# Draw the prompt point as a filled red circle (the image is RGB here)
def draw_point(img, point, color=(255, 0, 0), radius=15):
    img_with_point = img.copy()
    cv2.circle(img_with_point, tuple(point), radius, color, -1)
    return img_with_point

image_with_point = draw_point(image_rgb, my_points)

# Colorize the mask and blend it with the image
mask_visual = (mask * 255).astype(np.uint8)
colored_mask = cv2.applyColorMap(mask_visual, cv2.COLORMAP_JET)
colored_mask = cv2.cvtColor(colored_mask, cv2.COLOR_BGR2RGB)  # applyColorMap returns BGR
combined = cv2.addWeighted(image_with_point, 0.7, colored_mask, 0.5, 0)

# Show the three views side by side
plt.figure(figsize=(20, 6))
plt.subplot(1, 3, 1)
plt.imshow(image_with_point)
plt.title("Original + Point")
plt.axis("off")
plt.subplot(1, 3, 2)
plt.imshow(mask, cmap='gray')
plt.title("Mask Only")
plt.axis("off")
plt.subplot(1, 3, 3)
plt.imshow(combined)
plt.title("Overlay: Image + Mask + Point")
plt.axis("off")
plt.tight_layout()
plt.show()
As shown above, the model extracts the dog's head based on the red point!
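A single foreground point is the simplest prompt, but SAM also understands background points: label 1 means "this belongs to the object" and label 0 means "this does not". The sketch below builds such a prompt for one object. `build_point_prompt` is a hypothetical helper written for this post, and the background point `(500, 500)` is made up for illustration. The nested list layout (one inner list per object) is an assumption based on how recent Ultralytics documentation shows multi-point prompts, so check it against your installed version before relying on it.

```python
def build_point_prompt(foreground, background=()):
    """Combine foreground and background points into one prompt for one object.

    Returns (points, labels), each wrapped in an outer list with a single
    entry, i.e. one object. Label 1 = foreground, label 0 = background.
    """
    points = [list(p) for p in foreground] + [list(p) for p in background]
    labels = [1] * len(foreground) + [0] * len(background)
    return [points], [labels]


# (3000, 2000) is the point used above; (500, 500) is a made-up background point
points, labels = build_point_prompt([(3000, 2000)], [(500, 500)])
print(points)
print(labels)

# Assumed usage, not executed here:
# results = model("dog.jpg", points=points, labels=labels)
```

Adding a background point is mostly useful when the first mask spills over into something you did not want, such as the floor next to the dog.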
🔴 Prompt Type 2: Bounding Box-Based Segmentation
You provide a rectangular bounding box, and the model segments the object inside that region.
from ultralytics import SAM
import cv2
import matplotlib.pyplot as plt
import numpy as np

model = SAM("sam_b.pt")
img_name = "dog.jpg"

# Box prompt in pixel coordinates: [x_min, y_min, x_max, y_max]
my_bboxes = [1430.2, 828, 4471.9, 3836.4]
results = model(img_name, bboxes=my_bboxes)

# OpenCV loads images as BGR, so convert to RGB for matplotlib
image = cv2.imread(img_name)
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

plt.figure(figsize=(10, 10))
plt.imshow(image_rgb)

# Overlay the first mask, semi-transparent
for result in results:
    if result.masks is not None:
        mask = result.masks.data[0].cpu().numpy()
        plt.imshow(mask, alpha=0.5, cmap='jet')

# Draw the box prompt; Rectangle takes (x, y), width, height
rect = plt.Rectangle((my_bboxes[0], my_bboxes[1]),
                     my_bboxes[2] - my_bboxes[0],
                     my_bboxes[3] - my_bboxes[1],
                     linewidth=2, edgecolor='red', facecolor='none')
plt.gca().add_patch(rect)
plt.title(f"SAM Segmentation with Box Prompt on {img_name}")
plt.axis('off')
plt.show()
The red bounding box lets SAM extract the subject precisely within the box.
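The box prompt is given as corner coordinates `[x_min, y_min, x_max, y_max]`, while matplotlib's `Rectangle` wants a corner plus width and height, which is why the code above subtracts the coordinates. Here is a minimal sketch of that conversion, plus a clamp that keeps a box inside the image. Both helper names are hypothetical, and the 100x80 image size in the clamp example is made up.

```python
def xyxy_to_xywh(box):
    """Convert [x_min, y_min, x_max, y_max] to (x, y, width, height)."""
    x1, y1, x2, y2 = box
    return x1, y1, x2 - x1, y2 - y1


def clamp_xyxy(box, width, height):
    """Clip a corner-format box so it lies inside a width x height image."""
    x1, y1, x2, y2 = box
    x1, x2 = max(0, min(x1, width)), max(0, min(x2, width))
    y1, y2 = max(0, min(y1, height)), max(0, min(y2, height))
    return x1, y1, x2, y2


# The box used in this post
x, y, w, h = xyxy_to_xywh([1430.2, 828, 4471.9, 3836.4])
print(f"corner=({x}, {y}), width={w:.1f}, height={h:.1f}")

# A box that sticks out of a (made-up) 100x80 image gets clipped to its borders
clamped = clamp_xyxy([-10, 5, 120, 90], width=100, height=80)
print(clamped)
```

Clamping is handy when the box comes from a detector or a mouse drag and may extend slightly past the image edge.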
Prompt Type 3: No Prompt (Automatic Mask Generation / AMG)
This time, we don't provide any prompts at all, and SAM automatically segments the image into multiple parts.
from ultralytics import SAM
import cv2
import matplotlib.pyplot as plt
import numpy as np

print("\n--- Method 3: Automatic Mask Generation (No Prompt) ---")
model_amg = SAM("sam_b.pt")
image_path = "dog.jpg"
image = cv2.imread(image_path)
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Run inference without passing any prompt
print(f"Running prompt-free segmentation on '{image_path}'...")
results_amg = model_amg(image_path)
print("Done.")

plt.figure(figsize=(12, 10))
plt.imshow(image_rgb)
plt.title("Method 3: SAM Automatic Mask Generation")
plt.axis('off')

# Overlay every mask in its own random, semi-transparent color
if results_amg and results_amg[0].masks is not None:
    masks_amg = results_amg[0].masks.data.cpu().numpy()
    print(f"{len(masks_amg)} masks generated")
    np.random.seed(42)  # fixed seed so the colors are reproducible
    for i, mask in enumerate(masks_amg):
        # RGBA color with alpha 128; pixels outside the mask stay fully transparent
        color = np.concatenate([np.random.random(3) * 255, np.array([128])], axis=0)
        h, w = mask.shape[-2:]
        mask_image = mask.reshape(h, w, 1) * color.reshape(1, 1, -1)
        plt.imshow(mask_image.astype(np.uint8))
else:
    print("No masks found.")
plt.show()
The image is segmented into 33 distinct masks, each representing different parts of the dog or the background.
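With dozens of masks, many are tiny fragments, so a common next step is to rank them by size and drop the smallest. The sketch below does this on a stack of masks shaped `(N, H, W)`, the same layout as `masks_amg` above. `filter_masks_by_area` and the `min_area` threshold are assumptions for illustration, and the demo uses three toy 4x4 masks rather than real SAM output.

```python
import numpy as np


def filter_masks_by_area(masks, min_area):
    """Sort masks largest-first and drop those smaller than min_area pixels.

    masks: array-like of shape (N, H, W) with boolean or 0/1 values.
    Returns (kept_masks, kept_areas).
    """
    masks = np.asarray(masks).astype(bool)
    areas = masks.sum(axis=(1, 2))             # pixel count per mask
    order = np.argsort(-areas, kind="stable")  # indices, largest area first
    keep = order[areas[order] >= min_area]
    return masks[keep], areas[keep]


# Three toy masks with areas 1, 6 and 3 pixels
demo = np.zeros((3, 4, 4), dtype=bool)
demo[0, 0, 0] = True
demo[1, :2, :3] = True
demo[2, 3, :3] = True

kept, areas = filter_masks_by_area(demo, min_area=2)
print(kept.shape, areas.tolist())
```

On the real result you would call it as `filter_masks_by_area(masks_amg, min_area=...)` with a threshold chosen for your image resolution.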
Conclusion
With just a few lines of Python, we've performed professional-level segmentation using SAM!
The model's performance is truly impressive. Whether using points, boxes, or no prompt at all,
SAM is a powerful tool for background removal and object segmentation.
Let's keep exploring!
(Korean) Cutting out images with Python!? SAM hands-on
Nukki (from the Japanese 抜き) is a term used mainly in visual design work:
tracing the outline of a subject in order to separate it from the background of the original image is called "nukki" or "nukki ttagi" (cutting out a nukki)!
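In code, a cut-out of this kind comes down to using the segmentation mask as an alpha channel: pixels inside the mask stay opaque, and everything else becomes transparent. Here is a minimal sketch under that assumption. `cutout_rgba` is a hypothetical helper, and the demo uses a tiny 2x2 image instead of the dog photo.

```python
import numpy as np


def cutout_rgba(image_rgb, mask):
    """Return an RGBA image whose alpha is 255 inside the mask and 0 outside."""
    image_rgb = np.asarray(image_rgb, dtype=np.uint8)
    alpha = np.asarray(mask).astype(bool).astype(np.uint8) * 255
    if alpha.shape != image_rgb.shape[:2]:
        raise ValueError("mask and image must share height and width")
    return np.dstack([image_rgb, alpha])


# Toy example: a uniform gray 2x2 image with a diagonal mask
image = np.full((2, 2, 3), 200, dtype=np.uint8)
mask = np.array([[1, 0], [0, 1]])
rgba = cutout_rgba(image, mask)
print(rgba.shape)
print(rgba[..., 3])
```

Saving the result as a PNG (for example with `cv2.imwrite`, after converting RGBA to BGRA) keeps the transparency, which JPEG cannot store.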
Today, building on the principles of SAM covered in the previous post, let's get hands-on!
Cutting out images with Python code!
Let's practice with SAM (Segment Anything Model). The most basic ways to get the SAM model are downloading it from Meta's official site or from Hugging Face,
but today we'll use Ultralytics, which is simpler, to run SAM easily!
Load the SAM model from ultralytics as shown below:
from ultralytics import SAM
import cv2
import matplotlib.pyplot as plt
import numpy as np
# Load model
model = SAM("sam_b.pt")
model.eval()
Looking at the structure of the loaded model: at a high level it consists of image_encoder / prompt_encoder / mask_decoder, and the detailed layer-by-layer structure is identical to the tree listed in the English section above.
Now!
Let's run it three ways: with a point, with a bounding box, and with nothing at all!
We'll use the dog image below throughout. Go go go!
Prompt 1. Point-based!
With a point prompt, you give a single point,
and the model extracts the segment surrounding that point!
from ultralytics import SAM
import cv2
import matplotlib.pyplot as plt
import numpy as np

# Load model
model = SAM("sam_b.pt")

# Define the image and the point
img_name = "dog.jpg"
my_points = [3000, 2000]

# Run segmentation
results = model(img_name, points=[my_points], labels=[1])

# Load and convert the image
image = cv2.imread(img_name)
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Extract the mask
mask = None
for result in results:
    if result.masks is not None:
        mask = result.masks.data[0].cpu().numpy()
        break
if mask is None:
    raise ValueError("No mask found.")

# Helper that draws a red point
def draw_point(img, point, color=(255, 0, 0), radius=15):
    img_with_point = img.copy()
    cv2.circle(img_with_point, tuple(point), radius, color, -1)
    return img_with_point

# Image + point
image_with_point = draw_point(image_rgb, my_points)

# Convert the mask for visualization
mask_visual = (mask * 255).astype(np.uint8)

# Build the colormap overlay of the mask
colored_mask = cv2.applyColorMap(mask_visual, cv2.COLORMAP_JET)
colored_mask = cv2.cvtColor(colored_mask, cv2.COLOR_BGR2RGB)  # applyColorMap returns BGR
combined = cv2.addWeighted(image_with_point, 0.7, colored_mask, 0.5, 0)

# Visualization
plt.figure(figsize=(20, 6))

# 1. Original image + point
plt.subplot(1, 3, 1)
plt.imshow(image_with_point)
plt.title("Original + Point")
plt.axis("off")

# 2. Mask only
plt.subplot(1, 3, 2)
plt.imshow(mask, cmap='gray')
plt.title("Mask Only")
plt.axis("off")

# 3. Mask + original image + point
plt.subplot(1, 3, 3)
plt.imshow(combined)
plt.title("Overlay: Image + Mask + Point")
plt.axis("off")
plt.tight_layout()
plt.show()

# Print the result
print("Segmentation Result:")
print(f"Number of masks: {len(results[0].masks.data)}")
print(f"Mask shape: {results[0].masks.data[0].shape}")
The result!
Running segmentation on the dog around the red point, as in the first image,
produces the mask shown in the second image,
and overlaying the two, as in the third, shows that only the dog's face has been extracted!
Prompt 2. Bounding box-based!
With a bounding box prompt, you give a box by its four coordinates (x_min, y_min, x_max, y_max),
and the model extracts the segment inside that box!
from ultralytics import SAM
import cv2
import matplotlib.pyplot as plt
import numpy as np

# Load model
model = SAM("sam_b.pt")
img_name = "dog.jpg"
my_bboxes = [1430.2, 828, 4471.9, 3836.4]

# Run inference with a box prompt ([x_min, y_min, x_max, y_max])
results = model(img_name, bboxes=my_bboxes)

# Load the original image (for visualization)
image = cv2.imread(img_name)
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # BGR -> RGB

# Visualize the result
plt.figure(figsize=(10, 10))
plt.imshow(image_rgb)

# Overlay the mask
for result in results:
    if result.masks is not None:
        mask = result.masks.data[0].cpu().numpy()  # take the first mask
        plt.imshow(mask, alpha=0.5, cmap='jet')  # draw the mask semi-transparently

# Draw the box prompt
rect = plt.Rectangle((my_bboxes[0], my_bboxes[1]), my_bboxes[2] - my_bboxes[0], my_bboxes[3] - my_bboxes[1],
                     linewidth=2, edgecolor='red', facecolor='none', label=f'my_bboxes {my_bboxes}')
plt.gca().add_patch(rect)

# Title and settings
plt.title(f"SAM Segmentation with Box Prompt on {img_name}")
plt.legend()
plt.axis('off')
plt.show()

# Print extra information (optional)
print("Segmentation Result:")
print(f"Number of masks: {len(results[0].masks.data)}")
print(f"Mask shape: {results[0].masks.data[0].shape}")
The result!
The dog is extracted from within the red bbox (bounding box)!
Easy, right?
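One quick sanity check on a box-prompted result is to compute the tight bounding box of the returned mask and compare it with the box you passed in. The sketch below shows the first half of that check. `mask_to_xyxy` is a hypothetical helper, the demo mask is a toy 5x6 array, and the returned box uses exclusive max coordinates (an assumption chosen so that width is simply `x_max - x_min`).

```python
import numpy as np


def mask_to_xyxy(mask):
    """Tight bounding box of a binary mask as (x_min, y_min, x_max, y_max).

    x_max and y_max are exclusive. Returns None for an empty mask.
    """
    ys, xs = np.nonzero(np.asarray(mask).astype(bool))
    if len(xs) == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1


# Toy mask: a 3x3 block of ones inside a 5x6 grid
demo = np.zeros((5, 6), dtype=bool)
demo[1:4, 2:5] = True
box = mask_to_xyxy(demo)
print(box)
```

If the mask's box is much smaller than the prompt box, SAM has probably latched onto a part of the subject rather than the whole thing.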
Prompt 3. Leave it to the model! AMG-based!
This time, the image is split into fine-grained segments without any prompt!
from ultralytics import SAM
import cv2
import matplotlib.pyplot as plt
import numpy as np

print("\n--- Method 3: Automatic Mask Generation (AMG), no prompt ---")

# Load model (the same model can be reused)
model_amg = SAM("sam_b.pt")

# Image path
image_path = "dog.jpg"

# Load the original image
image = cv2.imread(image_path)
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Run inference **without a prompt** (AMG style)
# When no prompt is given, the ultralytics library may internally try to
# generate multiple segments in a way similar to AMG.
print(f"Running inference on '{image_path}' without a prompt (AMG style)...")
results_amg = model_amg(image_path)  # no bboxes argument!
print("Inference done.")

# Visualize the result
plt.figure(figsize=(12, 10))
plt.imshow(image_rgb)
plt.title("Method 3: SAM Automatic Mask Generation (No Prompt)")
plt.axis('off')

# Overlay the masks
if results_amg and results_amg[0].masks is not None:
    masks_amg = results_amg[0].masks.data.cpu().numpy()
    print(f"AMG result: {len(masks_amg)} masks generated")
    # Draw each mask in a different color
    np.random.seed(42)  # fixed seed for consistent colors
    for i, mask in enumerate(masks_amg):
        # Apply a random, semi-transparent color to each mask
        color = np.concatenate([np.random.random(3) * 255, np.array([128])], axis=0)
        h, w = mask.shape[-2:]
        mask_image = mask.reshape(h, w, 1) * color.reshape(1, 1, -1)
        plt.imshow(mask_image.astype(np.uint8))  # cast to integers for the overlay
        # print(f" - Mask {i+1} shape: {mask.shape}")  # per-mask info (optional)
else:
    print("No AMG segmentation result found.")
plt.show()

# Print extra information
if results_amg and results_amg[0].masks is not None:
    print("\nAMG segmentation details:")
    print(f" - Total masks generated: {len(results_amg[0].masks.data)}")
    # AMG results usually carry extra per-mask information (score, area, etc.),
    # but the result structure of the ultralytics wrapper needs to be checked.
    # print(f" - Result object type: {type(results_amg[0])}")  # to inspect the result structure
As a result, you can see that the dog image has been split into multiple segments, as shown below!
It was split into 33 segments!
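With 33 segments, random colors can end up looking alike. As an alternative to the `np.random` colors used above (this is a swapped-in technique, not what the original code does), the sketch below spaces hues evenly around the color wheel with the standard library's `colorsys`. The helper name `distinct_colors` and the saturation of 0.8 are arbitrary choices for illustration.

```python
import colorsys


def distinct_colors(n, alpha=128):
    """Return n RGBA tuples (0-255) with hues evenly spaced around the color wheel."""
    colors = []
    for i in range(n):
        r, g, b = colorsys.hsv_to_rgb(i / max(n, 1), 0.8, 1.0)
        colors.append((round(r * 255), round(g * 255), round(b * 255), alpha))
    return colors


palette = distinct_colors(33)
print(len(palette))
print(palette[0])
```

Each tuple can stand in for the `color` array in the overlay loop, for example via `np.array(palette[i])`.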
That wraps up our Python hands-on with SAM!
The model's performance is truly impressive!