
I am building ML models and have plotted the recall values at different % of complete data, like so:

enter image description here

The y axis represents the recall values, and the x axis shows the percentage of data completeness (so 0.6 complete data means that records with >40% missing data have been removed, 0.7 means that records with >30% missing data have been removed, etc.).

This is the code I used to generate this plot:

import matplotlib.pyplot as plt

fig = plt.figure()
fig.suptitle("True Positive Rate")
ax = fig.add_subplot(111)

subsets = [0.5, 0.6, 0.7, 0.8, 0.9, 1]

# recall_results holds one recall value per entry in subsets
ax.plot(subsets, recall_results, marker="o", linestyle="--")
    
ax.set_ylabel("True Positive Rate")
ax.set_xlabel("% complete data in samples")
plt.show()

To see how model performance changes relative to the original data when records with different percentages of missing values are dropped, I want to add the baseline recall value at x = 0. After adding this value to my recall_results list, I changed my code to:

fig = plt.figure()
fig.suptitle("True Positive Rate")
ax = fig.add_subplot(111)

subsets=[0, 0.5, 0.6, 0.7, 0.8, 0.9, 1]

ax.plot(subsets, recall_results, marker = "o", linestyle = "--")
    
ax.set_ylabel("True Positive Rate")
ax.set_xlabel("% complete data in samples")
plt.show()

enter image description here

As you can see from the plot, the new recall value at point 0 has been added, but the x tick labels have changed from 0.5, 0.6, 0.7, 0.8, 0.9, 1 to 0.0, 0.2, 0.4, 0.6, 0.8, 1.0. I understand that the ticks in the new graph are evenly spaced, rather than jumping from 0 to 0.5.

There is nothing technically wrong with the new graph, but I would rather the x ticks be the values I actually have recall values for: [0, 0.5, 0.6, 0.7, 0.8, 0.9, 1]. Can anyone help me out?

Thanks!

What exactly is the problem? - iacob
@iacob I want a graph whose x ticks are the values I passed (0, 0.5, 0.6, 0.7, 0.8, 0.9, 1) rather than other values (0.0, 0.2, 0.4, 0.6, 0.8, 1.0). - sums22
For clarity: do you want to keep the graph the same and just move the tick marks to the positions of the data points? Or do you want all the data points to have equal spacing, even though the first two are actually 0.5 apart? - tmdavison
You could do ax.set_xticks(subsets) to only have ticks at those positions. Or ax.xaxis.set_major_locator(matplotlib.ticker.MultipleLocator(0.1)) to have ticks every 0.1. Or change subsets to strings to have equally spaced ticks with only those numbers. - JohanC
That's perfect @JohanC, thank you. - sums22

1 Answer


As stated in the comment made by @JohanC, there are 3 possible solutions:

  1. Only have ticks at the positions in subsets:
ax.set_xticks(subsets)

enter image description here

  2. Have ticks every 0.1 (this needs import matplotlib.ticker):
ax.xaxis.set_major_locator(matplotlib.ticker.MultipleLocator(0.1))

enter image description here

  3. Change subsets to strings, so the points are equally spaced and only those values are labelled:
subsets = ["0.0", "0.5", "0.6", "0.7", "0.8", "0.9", "1"]

enter image description here
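
For anyone who wants to compare the three options side by side, here is a minimal sketch that draws them in one figure. The recall_results values are made-up placeholders (the question does not list the real ones), and build_figure is just a hypothetical helper name used for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker

subsets = [0, 0.5, 0.6, 0.7, 0.8, 0.9, 1]
# Placeholder values for illustration only, not the asker's real results.
recall_results = [0.70, 0.72, 0.74, 0.75, 0.77, 0.78, 0.80]


def build_figure():
    """Draw the same data three times, once per tick option (hypothetical helper)."""
    fig, axes = plt.subplots(1, 3, figsize=(15, 4))
    fig.suptitle("True Positive Rate")

    # Option 1: ticks only at the x positions that have data.
    axes[0].plot(subsets, recall_results, marker="o", linestyle="--")
    axes[0].set_xticks(subsets)
    axes[0].set_title("set_xticks(subsets)")

    # Option 2: a tick every 0.1 along the numeric axis.
    axes[1].plot(subsets, recall_results, marker="o", linestyle="--")
    axes[1].xaxis.set_major_locator(mticker.MultipleLocator(0.1))
    axes[1].set_title("MultipleLocator(0.1)")

    # Option 3: string x values make the axis categorical, so the points
    # are equally spaced and the gap from 0 to 0.5 is no longer to scale.
    labels = [str(s) for s in subsets]
    axes[2].plot(labels, recall_results, marker="o", linestyle="--")
    axes[2].set_title("string x values")

    for ax in axes:
        ax.set_ylabel("True Positive Rate")
        ax.set_xlabel("% complete data in samples")
    return fig, axes


fig, axes = build_figure()
```

Options 1 and 2 keep the true spacing on the x axis, so the jump from 0 to 0.5 stays visible. Option 3 gives up the numeric scale in exchange for evenly spaced points. Call fig.savefig(...) with a path of your choice, or drop the Agg line and call plt.show(), to view the result.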