1
votes

I was trying to posterize an image in Python using OpenCV. After some searching I found a lead in the OpenCV documentation. But as you can see, it is for an RGB image and what I have is a grayscale image. I tried it anyway and got weird outputs. I tweaked a few places in the code and got even weirder outputs. Can someone please explain what's going on?

EDIT:

My code

import numpy as np
import cv2

img = cv2.imread('Lenna.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

Z = np.float32(gray)

criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
K = 8
ret,label,center=cv2.kmeans(Z,K,None,criteria,10,cv2.KMEANS_RANDOM_CENTERS)

center = np.uint8(center)
res = center[label.flatten()]

cv2.imshow('res',res)
cv2.waitKey(0)
cv2.destroyAllWindows()

Input image:

input

Output Image:

output

2
As far as I can see, what you want is some kind of image quantization. I think you can achieve this by using bins, for example. It would also help to know what the weird outputs and the changes are; maybe you can post an image and some of the code you are using, so that we can explain what is going on. - api55
Yeah, quantization is the right word for it. :P - PunyCode
You are missing the reshape part, Z = img.reshape((-1,3)); in your case it is 1 instead of 3 (1 channel instead of 3). If you do not do it, then you are passing weird vectors (I think rows). - api55
You were right, buddy. It worked. :) - PunyCode
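The bin-based quantization suggested in the comments could look roughly like the sketch below. It uses only NumPy; the function name `posterize_with_bins`, the `levels` parameter, and the synthetic gradient are illustrative assumptions, and the sketch assumes a uint8 grayscale image and a `levels` value that divides 256 evenly.

```python
import numpy as np

def posterize_with_bins(gray, levels=8):
    """Quantize a uint8 grayscale image into `levels` evenly spaced grey values.

    Hypothetical helper for illustration; assumes `levels` divides 256.
    """
    if 256 % levels != 0:
        raise ValueError("levels must divide 256 evenly in this sketch")
    # Width of each bin on the 0..255 scale (32 when levels == 8).
    step = 256 // levels
    # Integer division maps every pixel to its bin index; multiplying back
    # and adding half a step places the output at the centre of that bin.
    return ((gray // step) * step + step // 2).astype(np.uint8)

# Synthetic horizontal gradient standing in for a real image.
gray = np.tile(np.arange(256, dtype=np.uint8), (4, 1))
poster = posterize_with_bins(gray, levels=8)
print(np.unique(poster))
```

Unlike k-means, fixed bins ignore how the grey values are actually distributed in the image, so the result is cheaper to compute but the levels are not adapted to the content.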

2 Answers

2
votes

Can someone please explain what's going on?

The input to kmeans is a vector of samples, where each sample is itself a vector: in many cases a list of pixels or a list of 2D/3D points. In your code you are passing the 2D image directly, so each row of the image is treated as one sample whose features are the pixel values in that row. That is why you get these weird values.
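To make the difference concrete, here is a small NumPy-only sketch of the two shapes involved. The tiny 4x6 array is an assumed stand-in for the real grayscale image.

```python
import numpy as np

# Small stand-in for a grayscale image: 4 rows, 6 columns.
gray = np.arange(24, dtype=np.uint8).reshape(4, 6)

# Passed as-is, the array has 4 rows, so kmeans would see 4 samples
# with 6 features each (one sample per image row).
as_passed = np.float32(gray)
print(as_passed.shape)

# Reshaped to a single column, every pixel becomes its own sample
# with one feature (its grey value).
per_pixel = np.float32(gray.reshape((-1, 1)))
print(per_pixel.shape)
```

With the first shape the cluster centers are whole rows, which is why the output looks like a handful of repeated image rows rather than a posterized picture.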

What can you do?

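Simple: reshape the input into a single column of grey values, one value per row.

Z = np.float32(gray.reshape((-1,1)))

This way, kmeans treats each grey value as a separate sample, clusters (groups) them, and labels each pixel with its cluster. After mapping the labels back to the centers, reshape the result to gray.shape to get an image again.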

0
votes

If you look a bit further in the examples, you can find a solution like this:

import numpy as np
import cv2

img = cv2.imread('Lenna.png')

Z = img.reshape((-1,3))

# convert to np.float32
Z = np.float32(Z)

# define criteria, number of clusters(K) and apply kmeans()
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
K = 8


ret,label,center=cv2.kmeans(Z,K,None,criteria,10,cv2.KMEANS_RANDOM_CENTERS)

# Now convert back into uint8, and make original image
center = np.uint8(center)
res = center[label.flatten()]
res2 = res.reshape((img.shape))

cv2.imshow('res2',res2)
cv2.waitKey(0)
cv2.destroyAllWindows()

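Note the reshape to (-1,3), which accounts for the three channels of the colour image, and the final reshape back to img.shape. For a grayscale image, use (-1,1) instead.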