chapter_6_2
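k-Means
Previously: computed the mean of each image's pixel values (3D -> 1D array)
- Those pixel means were used to pull out images close to an apple, a banana, and a pineapple
How can we find the mean values when the labels are not known?
- The k-means algorithm
- Mean value = cluster center = centroid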
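To make the idea of a centroid concrete, here is a minimal NumPy sketch of the usual k-means loop (assign each sample to the nearest centroid, then move each centroid to the mean of its samples). The function name `kmeans_sketch`, its defaults, and the demo data are all made up for illustration; this is not how scikit-learn implements it internally.

```python
import numpy as np

def kmeans_sketch(X, k, n_iter=100, seed=0):
    """Toy k-means loop; the name and defaults are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    # Start from k distinct samples chosen at random as the initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assign every sample to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of the samples assigned to it.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break  # centroids stopped moving, so the assignment is stable
        centroids = new_centroids
    return centroids, labels

# Made-up demo data: three blobs of 50 two-dimensional points each.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 2)) for c in (0.0, 5.0, 10.0)])
centroids, labels = kmeans_sketch(X, k=3)
print(centroids.shape)
print(np.bincount(labels, minlength=3))
```

Because the starting centroids are random, a loop like this can settle on a poor grouping; trying several random starts and keeping the best one is the usual remedy.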
Loading the data
See: http://bit.ly/hg-06-2
--2022-03-31 02:17:17-- https://bit.ly/fruits_300_data
Resolving bit.ly (bit.ly)... 67.199.248.11, 67.199.248.10
Connecting to bit.ly (bit.ly)|67.199.248.11|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://github.com/rickiepark/hg-mldl/raw/master/fruits_300.npy [following]
--2022-03-31 02:17:17-- https://github.com/rickiepark/hg-mldl/raw/master/fruits_300.npy
Resolving github.com (github.com)... 192.30.255.112
Connecting to github.com (github.com)|192.30.255.112|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/rickiepark/hg-mldl/master/fruits_300.npy [following]
--2022-03-31 02:17:17-- https://raw.githubusercontent.com/rickiepark/hg-mldl/master/fruits_300.npy
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3000128 (2.9M) [application/octet-stream]
Saving to: ‘fruits_300.npy’
fruits_300.npy 100%[===================>] 2.86M --.-KB/s in 0.05s
2022-03-31 02:17:17 (56.9 MB/s) - ‘fruits_300.npy’ saved [3000128/3000128]
- Load the NumPy file
import numpy as np
(300, 100, 100)
3
- 3D: (number of samples, width, height)
- 2D: (number of samples, width x height)
(300, 10000)
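The loading cell above survives only as its first line, so the following is a sketch of how the printed shapes could be reproduced. It assumes `fruits_300.npy` sits in the working directory; the random stand-in array used when the file is missing is purely an assumption so that the snippet runs anywhere.

```python
import os
import numpy as np

path = 'fruits_300.npy'
if os.path.exists(path):
    fruits = np.load(path)
else:
    # Stand-in with the same shape as the real data (assumption, not real images).
    rng = np.random.default_rng(0)
    fruits = rng.integers(0, 256, size=(300, 100, 100), dtype=np.uint8)

print(fruits.shape)  # 3D: (samples, width, height)
print(fruits.ndim)

# Flatten each 100 x 100 image into one row of 10000 pixel values.
fruits_2d = fruits.reshape(-1, 100 * 100)
print(fruits_2d.shape)
```

Passing `-1` lets NumPy work out the number of samples from the remaining dimension.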
- Apply the k-means algorithm
KMeans(n_clusters=3, random_state=42)
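A fitting step that matches the printed estimator might look like the sketch below. The synthetic array stands in for the fruit images (an assumption, so the snippet runs without the download), and `n_init=10` is set explicitly only to keep the behavior the same across scikit-learn versions; the output above shows just `n_clusters` and `random_state`.

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in data (assumption): three groups of 100 noisy "images" with different brightness.
rng = np.random.default_rng(0)
fruits = np.concatenate(
    [rng.normal(loc=m, scale=10.0, size=(100, 100, 100)) for m in (50.0, 120.0, 200.0)]
)
fruits_2d = fruits.reshape(-1, 100 * 100)

# No target array is passed: k-means is unsupervised.
km = KMeans(n_clusters=3, n_init=10, random_state=42)
km.fit(fruits_2d)

print(km.labels_.shape)
```

After fitting, `km.labels_` holds one cluster index per sample. The index values carry no meaning beyond identifying a group.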
- After fitting the model, inspect the labels
[2 2 2 2 2 0 2 2 2 2 2 2 2 2 2 2 2 2 0 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 0 2 0 2 2 2 2 2 2 2 0 2 2 2 2 2 2 2 2 2 0 0 2 2 2 2 2 2 2 2 0 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 0 2 2 2 2 2 2 2 2 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1]
- Count the samples in each cluster directly
(array([0, 1, 2], dtype=int32), array([111, 98, 91]))
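A tuple of this form is what `np.unique` returns when called with `return_counts=True`. The sketch below shows the call on a small synthetic data set (an assumption), so the counts it prints will differ from the fruit counts above.

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in data (assumption): three blobs of 100 samples with 20 features each.
rng = np.random.default_rng(0)
data_2d = np.concatenate(
    [rng.normal(loc=m, scale=1.0, size=(100, 20)) for m in (0.0, 10.0, 20.0)]
)
km = KMeans(n_clusters=3, n_init=10, random_state=42).fit(data_2d)

# values: the distinct cluster indices; counts: how many samples fell into each.
values, counts = np.unique(km.labels_, return_counts=True)
print(values, counts)
```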
import matplotlib.pyplot as plt
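The rest of the plotting cell is not preserved here. As a sketch of the kind of helper that shows every image in a cluster, the function below lays images out ten per row. The name `draw_images`, the `ratio` parameter, and the random demo images are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

def draw_images(arr, ratio=1):
    """Show a stack of 2D images in a grid, 10 per row (hypothetical helper)."""
    n = len(arr)
    rows = int(np.ceil(n / 10))
    cols = n if rows < 2 else 10
    # squeeze=False keeps axs two-dimensional even for a single row or column.
    fig, axs = plt.subplots(rows, cols, figsize=(cols * ratio, rows * ratio), squeeze=False)
    for i in range(rows):
        for j in range(cols):
            if i * 10 + j < n:
                axs[i, j].imshow(arr[i * 10 + j], cmap='gray_r')
            axs[i, j].axis('off')  # hide ticks, including on unused cells
    return fig

# Demo with 23 random 20 x 20 images (stand-in data).
rng = np.random.default_rng(0)
demo = rng.integers(0, 256, size=(23, 20, 20))
fig = draw_images(demo)
```

With the real data and a fitted model, a boolean mask such as `fruits[km.labels_ == 0]` would select the images of one cluster to pass in, followed by `plt.show()` outside a notebook.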
Cluster centers
[[3393.8136117 8837.37750892 5267.70439881]]
[0]
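The two outputs above have the shape of a distance row (one distance per cluster center) and a predicted cluster index. The sketch below shows how `cluster_centers_`, `transform`, and `predict` relate, using small synthetic images (an assumption) so that it runs without the fruit file.

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in data (assumption): three groups of 100 noisy 10 x 10 "images".
rng = np.random.default_rng(0)
images = np.concatenate(
    [rng.normal(loc=m, scale=10.0, size=(100, 10, 10)) for m in (50.0, 120.0, 200.0)]
)
images_2d = images.reshape(-1, 10 * 10)
km = KMeans(n_clusters=3, n_init=10, random_state=42).fit(images_2d)

# Each center is a flattened mean image; reshape it back to 2D to plot it.
centers = km.cluster_centers_.reshape(-1, 10, 10)

# transform() returns the distance from each sample to every cluster center.
# Slicing with 100:101 keeps the input two-dimensional, as scikit-learn expects.
distances = km.transform(images_2d[100:101])

# predict() returns the index of the nearest center.
nearest = km.predict(images_2d[100:101])
print(centers.shape, distances.shape, nearest)
```

The predicted index is the position of the smallest value in the distance row, which is consistent with the outputs above: the first distance is the smallest and the prediction is 0.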
Finding the optimal k
inertia = []
From the result above, the optimal number of clusters k is about 3, the point where the inertia curve bends.
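The elbow search can be sketched as follows: fit k-means for several values of k, record `inertia_` (the sum of squared distances from each sample to its nearest center), and look for the bend in the curve. The range of k from 2 to 6 and the synthetic data are assumptions; the original cell is truncated after its first line.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Stand-in data (assumption): three blobs of 100 samples with 20 features each.
rng = np.random.default_rng(0)
data_2d = np.concatenate(
    [rng.normal(loc=m, scale=1.0, size=(100, 20)) for m in (0.0, 10.0, 20.0)]
)

inertia = []
ks = list(range(2, 7))
for k in ks:
    km = KMeans(n_clusters=k, n_init=10, random_state=42)
    km.fit(data_2d)
    inertia.append(km.inertia_)  # lower means tighter clusters

fig, ax = plt.subplots()
ax.plot(ks, inertia, marker='o')
ax.set_xlabel('k')
ax.set_ylabel('inertia')
```

Inertia keeps falling as k grows, so the useful signal is where the drop slows down sharply, not the minimum itself. Call `plt.show()` to display the figure outside a notebook.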
Chapter 6: unsupervised learning is not used very often, so mainly pay attention to the visualization syntax.
Reference : 혼자 공부하는 머신러닝 + 딥러닝