Changing the color of points in scatter plot for different dummy values

Question

In my dataset, I have a Price column for house prices and 5 dummy columns for different locations in the city. What I want to do is to show data points on the scatter plot with different colors.

For instance, on a scatter plot including all the prices of the houses, I want to have:

Red for all price points when dummy1, which indicates the house being in Area1, is equal to 1.
Blue for all price points when dummy2, which indicates the house being in Area2, is equal to 1.

and so on until the last column. How can I create that plot? I can create the scatter plot without the color using plt.scatter() but don't know how to add the color code.

Can we see at least the header of your data, i.e. the first few lines of the columns? - Khalil Al Hooti

finefoot · Accepted Answer · 2018-12-10T16:27:09

Have a look at the docs for matplotlib.pyplot.scatter, which describe a parameter c that can be, among other options:

A sequence of color specifications of length n.

Here is an example that plots 100 data points, with x running from 0 to 99 and random integer y values between 0 and 10. If the y value is over 5, the point is blue; otherwise it is red, as specified in the list c.

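import random

import matplotlib.pyplot as plt

# x runs from 0 to 99; y holds 100 random integers between 0 and 10
x = list(range(100))
y = [random.randint(0, 10) for _ in x]
# one color per point: blue if the y value is over 5, else red
c = ["b" if value > 5 else "r" for value in y]

plt.scatter(x, y, c=c)
plt.show()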

The output will look like this:

[Image: scatter plot of the 100 points, blue where y is over 5 and red otherwise]
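To connect this back to the dummy columns in the question, here is a minimal sketch that assumes the data sits in a pandas DataFrame. The column names (Price, dummy1 to dummy3), the sample values, and the palette are made up for illustration, and only three dummy columns are shown to keep it short. It also assumes every row has exactly one dummy set to 1; a row with all zeros would silently get the first column's color.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data: column names and values are invented for this sketch.
df = pd.DataFrame({
    "Price":  [250, 310, 180, 420, 275, 390],
    "dummy1": [1, 0, 0, 0, 1, 0],
    "dummy2": [0, 1, 0, 0, 0, 1],
    "dummy3": [0, 0, 1, 1, 0, 0],
})

# Assumed palette: one color per dummy column.
palette = {"dummy1": "red", "dummy2": "blue", "dummy3": "green"}

# For each row, idxmax returns the name of the dummy column holding the 1;
# map() then translates that column name into its color.
area = df[list(palette)].idxmax(axis=1)
colors = area.map(palette).tolist()

if __name__ == "__main__":
    # x is just the row position; swap in a real feature if you have one.
    plt.scatter(range(len(df)), df["Price"], c=colors)
    plt.xlabel("House (row position)")
    plt.ylabel("Price")
    plt.show()
```

The resulting colors list has one entry per row, which is exactly the "sequence of color specifications of length n" that c expects. Extending this to five areas only means adding the remaining columns to palette.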
