4
votes

I make a 2d histogram of some (x, y) data and I get an image like this one:

histogram-2d

I want a way to get the (x, y) coordinates of the point(s) that store the maximum values in H. For example, in the case of the image above it would be two points with the aprox coordinates: (1090, 1040) and (1110, 1090).

This is my code:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from os import getcwd
from os.path import join, realpath, dirname

# Path to dir where this code exists.
mypath = realpath(join(getcwd(), dirname(__file__)))
myfile = 'datafile.dat'

x, y = np.loadtxt(join(mypath,myfile), usecols=(1, 2), unpack=True)

fig = plt.figure()
ax = fig.add_subplot(111)

xmin, xmax = min(x), max(x)
ymin, ymax = min(y), max(y)

rang = [[xmin, xmax], [ymin, ymax]]

binsxy = [int((xmax - xmin) / 20), int((ymax - ymin) / 20)]

H, xedges, yedges = np.histogram2d(x, y, range=rang, bins=binsxy)

extent = [yedges[0], yedges[-1], xedges[0], xedges[-1]]
cp = ax.imshow(H.transpose()[::-1], interpolation='nearest', extent=extent, cmap=cm.jet)
fig.colorbar(cp)

plt.show()

Edit

I've tried the solutions posted by Marek and qarma attempting to obtain the coordinates of the bins rather than the index of them, like so:

# Marek's answer
x_cent, y_cent = unravel_index(H.argmax(), H.shape)
print('Marek')
print(x_cent, y_cent)
print(xedges[x_cent], yedges[y_cent])

# qarma's answer
idx = list(H.flatten()).index(H.max())
x_cent2, y_cent2 = idx / H.shape[1], idx % H.shape[1]
local_maxs = np.argwhere(H == H.max())
print('\nqarma')
print(x_cent2, y_cent2)
print(xedges[x_cent2], yedges[y_cent2])
print(xedges[local_maxs[0,0]], yedges[local_maxs[0,1]], xedges[local_maxs[1,0]], yedges[local_maxs[1,1]])

which results in:

Marek
(53, 50)
(1072.7838144329899, 1005.0837113402063)

qarma
(53, 50)
(1072.7838144329899, 1005.0837113402063)
(1072.7838144329899, 1005.0837113402063, 1092.8257731958763, 1065.3611340206187)

So the maximum coordinates are the same which is good! Now I have a small issue because if I zoom in on the 2d plot, I see that the coordinates are a little off-centered for both the global maximum and the local maximum:

enter image description here

Why is this?

3
scipy.signal.argrelextrema ? stackoverflow.com/a/13491866/624829Zeugma
Possible solution: Peak detection in a 2d array. Depending on your data, you may have to play with the size of the neighborhood, however.unutbu
That is an excellent question you point me to, thank you very much! I'll definitely check it out when I have some more time since it's quite long. Cheers.Gabriel

3 Answers

2
votes

Here's how you can find first global maximum

idx = list(H.flatten()).index(H.max())
x, y = idx / H.shape[1], idx % H.shape[1]

Finding coordinate of all maxima was left as exercise to the reader...

numpy.argwhere(H == H.max())

Edit

Your code:

H, xedges, yedges = np.histogram2d(x, y, range=rang, bins=binsxy)

Here H contains histogram values and xedges, yedges boundaries for histogram bins. Note that size of edges arrays is one larger than size of H in corresponding dimension. Thus:

for x, y in numpy.argwhere(H == H.max()):
    # center is between x and x+1
    print numpy.average(xedges[x:x + 2]), numpy.average(yedges[y:y + 2])
2
votes

This question should help you: Python: get the position of the biggest item in a numpy array

You can use H.max() to get the maximum value and then compare it with H and use numpy.nonzero to find positions of all maximum values: numpy.nonzero(H.max() == H). This is going to be more expensive than just H.argmax() but you will get all maximum values.

2
votes

The library findpeaks can be of use.

pip install findpeaks

I do not see your data but let me try another similar example:

from findpeaks import findpeaks

raw input data

# initialize with default parameters. The "denoise" parameter can be of use in your case
fp = findpeaks()
# import 2D example dataset
img = fp.import_example()
# make the fit
fp.fit(img)
# Make plot
fp.plot()

detected peaks

The persistence can be of use to determine the impact of the peaks. You see that point 1, 2 and 3 show the strongest peaks, followed by the rest.

# Compute persistence
fp.plot_persistence()

persitence plot