ValueError: Supply a 'c' kwarg or a 'color' kwarg but not both; they differ but their functionalities overlap

Question

I tried to run an example from the book Python Data Science Essentials, but it raised an error. I have only just started learning Python, so I am finding it hard to fix. Please help. Here is the code:

In: import pandas as pd
    import numpy as np
In: colors = list()
In: palette = {0: "red", 1: "green", 2: "blue"}
In: for c in np.nditer(iris.target):
        # using the palette dictionary, we convert
        # each numeric class into a color string
        colors.append(palette[int(c)])
In: dataframe = pd.DataFrame(iris.data,
                             columns=iris.feature_names)
In: scatterplot = pd.scatter_matrix(dataframe, alpha=0.3,
        figsize=(10, 10), diagonal='hist', color=colors, marker='o',
        grid=True)

Here is the error:

ValueError Traceback (most recent call last) in ()
      1 scatterplot = pd.scatter_matrix(dataframe, alpha=0.3,
----> 2 figsize=(10, 10), diagonal='hist', color=colors, marker='o',grid=True)

/Users/leeivan/anaconda/lib/python2.7/site-packages/pandas/tools/plotting.py in scatter_matrix(frame, alpha, figsize, ax, grid, diagonal, marker, density_kwds, hist_kwds, range_padding, **kwds)
    378
    379                 ax.scatter(df[b][common], df[a][common],
--> 380                            marker=marker, alpha=alpha, **kwds)
    381
    382             ax.set_xlim(boundaries_list[j])

/Users/leeivan/anaconda/lib/python2.7/site-packages/matplotlib/__init__.pyc in inner(ax, *args, **kwargs)
   1817                     warnings.warn(msg % (label_namer, func.__name__),
   1818                                   RuntimeWarning, stacklevel=2)
-> 1819             return func(ax, *args, **kwargs)
   1820         pre_doc = inner.__doc__
   1821         if pre_doc is None:

/Users/leeivan/anaconda/lib/python2.7/site-packages/matplotlib/axes/_axes.pyc in scatter(self, x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, edgecolors, **kwargs)
   3787                 facecolors = co
   3788             if c is not None:
-> 3789                 raise ValueError("Supply a 'c' kwarg or a 'color' kwarg"
   3790                                  " but not both; they differ but"
   3791                                  " their functionalities overlap.")

ValueError: Supply a 'c' kwarg or a 'color' kwarg but not both; they differ but their functionalities overlap.

If you think appropriate, having provided both a resolution and an explanation to the problem, please tick the answer as addressing the question. Thanks! — Enzo

Enzo · Accepted Answer · 2016-12-10T12:23:06

I tested the code below in Jupyter with Python 3.5 and it works.

import pandas as pd
import numpy as np
from sklearn.datasets import load_iris
%matplotlib inline

iris = load_iris()
colors = list()
palette = {0: "red", 1: "green", 2: "blue"}

# using the palette dictionary, we convert
# each numeric class into a color string
for c in np.nditer(iris.target):
    colors.append(palette[int(c)])

dataframe = pd.DataFrame(iris.data, columns=iris.feature_names)
scatterplot = pd.scatter_matrix(dataframe, alpha=0.3, figsize=(10, 10),
                                diagonal='hist', c=colors, marker='o',
                                grid=True)
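
As a side note, the loop that maps each numeric class to a color string can also be written as a list comprehension. This is only a small sketch: the short `target` list is a hypothetical stand-in for `iris.target`, so the snippet runs without scikit-learn.

```python
# Hypothetical stand-in for iris.target (class labels 0, 1, 2).
target = [0, 0, 1, 2, 1]

palette = {0: "red", 1: "green", 2: "blue"}

# Look up the color string for each class label.
colors = [palette[int(label)] for label in target]

print(colors)  # ['red', 'red', 'green', 'blue', 'green']
```

The same comprehension should work unchanged when `target` is the NumPy array `iris.target`, since `int(label)` converts each element to a plain dictionary key.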

Clearly the color parameter triggers the error, while c works. This could be a bug in matplotlib.

This is what I found, looking at the pandas function:

def scatter_matrix(frame, alpha=0.5, figsize=None, ax=None, grid=False,
                   diagonal='hist', marker='.', density_kwds=None,
                   hist_kwds=None, range_padding=0.05, **kwds):
    """
    Draw a matrix of scatter plots.
    Parameters
    ----------
    frame : DataFrame
    alpha : float, optional
        amount of transparency applied
    figsize : (float,float), optional
        a tuple (width, height) in inches
    ax : Matplotlib axis object, optional
    grid : bool, optional
        setting this to True will show the grid
    diagonal : {'hist', 'kde'}
        pick between 'kde' and 'hist' for
        either Kernel Density Estimation or Histogram
        plot in the diagonal
    marker : str, optional
        Matplotlib marker type, default '.'
    hist_kwds : other plotting keyword arguments
        To be passed to hist function
    density_kwds : other plotting keyword arguments
        To be passed to kernel density estimate plot
    range_padding : float, optional
        relative extension of axis range in x and y
        with respect to (x_max - x_min) or (y_max - y_min),
        default 0.05
    kwds : other plotting keyword arguments
        To be passed to scatter function

So it appears that color or c is passed on to the scatter function in matplotlib as one of the **kwds in the function call.

This is the scatter function:

matplotlib.pyplot.scatter(x, y, s=20, c=None, marker='o', cmap=None, norm=None, vmin=None, vmax=None, alpha=None, linewidths=None, verts=None, edgecolors=None, hold=None, data=None, **kwargs)

In this signature the parameter is c, not color, but elsewhere color is listed as an alternative to c (as you would expect).

I posted an issue on matplotlib. I will keep you informed.

NEWS as of 11/12/2016

After some discussion, the bug has been accepted by pandas and scheduled for a fix in the next major release. See here on GitHub.

Basically when c is specified, c is sent to the scatter function in matplotlib. When color is specified, both c and color are sent, confusing matplotlib.
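
To make that mechanism concrete, here is a simplified model of it. It is not the actual pandas or matplotlib source: `fake_scatter` and `fake_scatter_matrix` are hypothetical stand-ins, and the way the wrapper fills in a default `c` is an assumption made only for illustration.

```python
def fake_scatter(x, y, c=None, **kwargs):
    """Stand-in for a scatter function that rejects both 'c' and 'color'."""
    if kwargs.get("color") is not None and c is not None:
        raise ValueError("Supply a 'c' kwarg or a 'color' kwarg but not both")
    return c if c is not None else kwargs.get("color")


def fake_scatter_matrix(x, y, **kwds):
    """Stand-in wrapper that fills in a default 'c' before forwarding."""
    # If the caller passed only 'color', this adds 'c' next to it.
    kwds.setdefault("c", "default-color")
    return fake_scatter(x, y, **kwds)


# Passing c: the default is not applied, so only c is forwarded.
print(fake_scatter_matrix([1], [2], c="red"))

# Passing color: the wrapper adds c as well, so both are forwarded.
try:
    fake_scatter_matrix([1], [2], color="red")
except ValueError as err:
    print("ValueError:", err)
```

In this model the caller never passes both arguments. The conflict comes from the wrapper adding its own `c` on top of the caller's `color`.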

For the time being, as suggested, use c instead of color.
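
For readers on a more recent pandas: the top-level `pd.scatter_matrix` was later moved to `pandas.plotting.scatter_matrix`. The following is a minimal sketch of the same plot using that import and the `c` keyword, assuming a recent pandas and matplotlib are installed. The random data and the column names `f1`, `f2`, `f3` are hypothetical stand-ins for the iris dataset, and the `Agg` backend is chosen only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Hypothetical stand-in data: 30 rows, 3 features, 3 classes of 10 rows each.
rng = np.random.RandomState(0)
dataframe = pd.DataFrame(rng.rand(30, 3), columns=["f1", "f2", "f3"])
target = np.repeat([0, 1, 2], 10)

# One color string per row, as in the original example.
palette = {0: "red", 1: "green", 2: "blue"}
colors = [palette[int(label)] for label in target]

# Use c (not color) for the per-point colors.
axes = scatter_matrix(dataframe, alpha=0.3, figsize=(6, 6),
                      diagonal="hist", c=colors, marker="o", grid=True)
print(axes.shape)
```

In a Jupyter notebook, drop the `matplotlib.use("Agg")` line and keep `%matplotlib inline` as in the code above.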
