10
votes

Is it possible to add multiple data to a pandas.tools.plotting.scatter_matrix and assigning a color to each group of data?

I'd like to show the scatter plots with data points for one group of data, let's say, in green and the other group in red in the very same scatter matrix. The same should apply for the density plots on the diagonal. I know that this is possible by using matplotlib's scatter function, but that does not give me a scatter matrix.

The documentation of pandas is mum on that.

2

2 Answers

21
votes

The short answer is determine the color of each dot in the scatter plot, role it into an array and pass it as the color argument.

Example:

from pandas.tools.plotting import scatter_matrix
import pandas as pd
from sklearn import datasets

iris = datasets.load_iris()
iris_data = pd.DataFrame(data=iris['data'],columns=iris['feature_names'])
iris_data["target"] = iris['target']

color_wheel = {1: "#0392cf", 
               2: "#7bc043", 
               3: "#ee4035"}
colors = iris_data["target"].map(lambda x: color_wheel.get(x + 1))
ax = scatter_matrix(iris_data, color=colors, alpha=0.6, figsize=(15, 15), diagonal='hist')

Iris Dataset

1
votes

For me this answer didn't worked... But with this little correction it went well for me !

import pandas as pd
from pandas.plotting import scatter_matrix
from sklearn import datasets

iris = datasets.load_iris()
iris_data = pd.DataFrame(data=iris['data'],columns=iris['feature_names'])
iris_data["target"] = iris['target']

color_wheel = {1: "#0392cf", 
               2: "#7bc043", 
               3: "#ee4035"}
colors = iris_data["target"].map(lambda x: color_wheel.get(x + 1))
ax = scatter_matrix(iris_data, color=colors, alpha=0.6, figsize=(15, 15), diagonal='hist')