15
votes

I want to find dominant color on an image. For this, I know that I should use image histogram. But I am not sure of image format. Which one of rgb, hsv or gray image, should be used?

After the histogram is calculated, I should find max value on histogram. For this, should I find below maximum binVal value for hsv image? Why my result image contains only black color?

float binVal = hist.at<float>(h, s);

EDIT :

I have tried the below code. I draw h-s histogram. And my result images are here. I don't find anything after binary threshold. Maybe I find max histogram value incorrectly.

enter image description hereenter image description here

cvtColor(src, hsv, CV_BGR2HSV);

// Quantize the hue to 30 levels
// and the saturation to 32 levels
int hbins = 20, sbins = 22;
int histSize[] = {hbins, sbins};
// hue varies from 0 to 179, see cvtColor
float hranges[] = { 0, 180 };
// saturation varies from 0 (black-gray-white) to
// 255 (pure spectrum color)
float sranges[] = { 0, 256 };
const float* ranges[] = { hranges, sranges };
MatND hist;
// we compute the histogram from the 0-th and 1-st channels
int channels[] = {0, 1};

calcHist( &hsv, 1, channels, Mat(), // do not use mask
         hist, 2, histSize, ranges,
         true, // the histogram is uniform
         false );
double maxVal=0;
minMaxLoc(hist, 0, &maxVal, 0, 0);

int scale = 10;
Mat histImg = Mat::zeros(sbins*scale, hbins*10, CV_8UC3);
int maxIntensity = -100;
for( int h = 0; h < hbins; h++ ) {
    for( int s = 0; s < sbins; s++ )
    {
        float binVal = hist.at<float>(h, s);
        int intensity = cvRound(binVal*255/maxVal);
        rectangle( histImg, Point(h*scale, s*scale),
                    Point( (h+1)*scale - 1, (s+1)*scale - 1),
                    Scalar::all(intensity),
                    CV_FILLED );
        if(intensity > maxIntensity)
            maxIntensity = intensity;
    }
}
std::cout << "max Intensity " << maxVal << std::endl;
Mat dst;
cv::threshold(src, dst, maxIntensity, 255, cv::THRESH_BINARY);

namedWindow( "Dest", 1 );
imshow( "Dest", dst );
namedWindow( "Source", 1 );
imshow( "Source", src );

namedWindow( "H-S Histogram", 1 );
imshow( "H-S Histogram", histImg );
4

4 Answers

12
votes

Alternatively you could try a k-means approach. Calculate k clusters with k ~ 2..5 and take the centroid of the biggest group as your dominant color.

The python docu of OpenCv has an illustrated example that gets the dominant color(s) pretty well:

11
votes

The solution

  • Find H-S histogram
  • Find peak H value(using minmaxLoc function)
  • Split image 3 channel(h,s,v)
  • Apply to threshold.
  • Create image by merge 3 channel
9
votes

Here's a Python approach using K-Means Clustering to determine the dominant colors in an image with sklearn.cluster.KMeans()


Input image

Results

With n_clusters=5, here are the most dominant colors and percentage distribution

[14.69488554 34.23074345 41.48107857] 13.67%
[141.44980073 207.52576948 236.30722987] 15.69%
[ 31.75790423  77.52713644 114.33328324] 18.77%
[ 48.41205713 118.34814452 176.43411287] 25.19%
[ 84.04820266 161.6848298  217.14045211] 26.69%

Visualization of each color cluster

enter image description here

Similarity with n_clusters=10,

[ 55.09073171 113.28271003  74.97528455] 3.25%
[ 85.36889668 145.80759374 174.59846237] 5.24%
[164.17201088 223.34258123 241.81929254] 6.60%
[ 9.97315932 22.79468111 22.01822211] 7.16%
[19.96940211 47.8375841  72.83728002] 9.27%
[ 26.73510467  70.5847759  124.79314278] 10.52%
[118.44741779 190.98204701 230.66728334] 13.55%
[ 51.61750364 130.59930047 198.76335878] 13.82%
[ 41.10232129 104.89923271 160.54431333] 14.53%
[ 81.70930412 161.823664   221.10258949] 16.04%

enter image description here

import cv2, numpy as np
from sklearn.cluster import KMeans

def visualize_colors(cluster, centroids):
    # Get the number of different clusters, create histogram, and normalize
    labels = np.arange(0, len(np.unique(cluster.labels_)) + 1)
    (hist, _) = np.histogram(cluster.labels_, bins = labels)
    hist = hist.astype("float")
    hist /= hist.sum()

    # Create frequency rect and iterate through each cluster's color and percentage
    rect = np.zeros((50, 300, 3), dtype=np.uint8)
    colors = sorted([(percent, color) for (percent, color) in zip(hist, centroids)])
    start = 0
    for (percent, color) in colors:
        print(color, "{:0.2f}%".format(percent * 100))
        end = start + (percent * 300)
        cv2.rectangle(rect, (int(start), 0), (int(end), 50), \
                      color.astype("uint8").tolist(), -1)
        start = end
    return rect

# Load image and convert to a list of pixels
image = cv2.imread('1.jpg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
reshape = image.reshape((image.shape[0] * image.shape[1], 3))

# Find and display most dominant colors
cluster = KMeans(n_clusters=5).fit(reshape)
visualize = visualize_colors(cluster, cluster.cluster_centers_)
visualize = cv2.cvtColor(visualize, cv2.COLOR_RGB2BGR)
cv2.imshow('visualize', visualize)
cv2.waitKey()
6
votes

Here are some suggestions to get you started.

  • All 3 channels in RGB contribute to the color, so you'd have to somehow figure out where three different histograms are all at maximum. (Or their sum is maximum, or whatever.)
  • HSV has all of the color (well, Hue) information in one channel, so you only have to consider one histogram.
  • Grayscale throws away all color information so is pretty much useless for finding color.

Try converting to HSV, then calculate the histogram on the H channel.

As you say, you want to find the max value in the histogram. But:

  • You might want to consider a range of values instead of just one, say from 20-40 instead of just 30. Try different range sizes.
  • Remember that Hue is circular, so H=0 and H=360 are the same.
  • Try plotting the histogram following this:
    http://docs.opencv.org/doc/tutorials/imgproc/histograms/histogram_calculation/histogram_calculation.html
    to see if your results make sense.
  • If you're using a range of Hues and you find a range that is maximum, you can either just use the middle of that range as your dominant color, or you can find the mean of the colors within that range and use that.