1
votes

I have a CSV I've pulled into a Pandas dataframe and I am trying to run a basic KMeans clustering through SciKit-Learn. This is the first time I'm doing this and I've hit an error that I don't understand.

import pandas as pd
import sklearn
import numpy as np 
import seaborn as sns; sns.set()

from sklearn.cluster import KMeans 


input_csv = '/Users/reallymemorable/Documents/Data.Repository/Analysis/Purchasers.Strats.Appends.csv'

# Read the CSV into a dataframe
df = pd.read_csv(input_csv)

# Select out only the relevant columns
df_shortlist = df[['Contact_ID', 'Sales_Stage_Sub', 'Sale_Type', 'Offered_Amount', 'Down_Payment']]

# Create binary dummy columns for Sales_Stage_Sub
df_shortlist_dummy_sales_stage_sub = pd.concat([df_shortlist, pd.get_dummies(df['Sales_Stage_Sub'])], axis=1)

# Create binary dummy columns for Sale_Type
df_shortlist_dummy_sales_stage_sub_and_sale_type = pd.concat([df_shortlist_dummy_sales_stage_sub, pd.get_dummies(df['Sale_Type'])], axis=1)

kmeans = KMeans(n_clusters=4)
kmeans.fit(df_shortlist_dummy_sales_stage_sub_and_sale_type)
y_means = kmeans.predict(df_shortlist_dummy_sales_stage_sub_and_sale_type)

plt.scatter(X[:, 0], X[:, 1], c=y_means, s=50, cmap='viridis')

centers = kmeans.cluster_centers_
plt.scatter(centers[:, 0], centers[:, 1], c='black', s=200, alpha=0.5);

This is the error I get:

Traceback (most recent call last):
  File "ml.spatial.clustering.py", line 4, in <module>
    import seaborn as sns; sns.set()
  File "/Users/reallymemorable/.pyenv/versions/3.6.7/lib/python3.6/site-packages/seaborn/__init__.py", line 6, in <module>
    from .rcmod import *
  File "/Users/reallymemorable/.pyenv/versions/3.6.7/lib/python3.6/site-packages/seaborn/rcmod.py", line 5, in <module>
    from . import palettes, _orig_rc_params
  File "/Users/reallymemorable/.pyenv/versions/3.6.7/lib/python3.6/site-packages/seaborn/palettes.py", line 12, in <module>
    from .utils import desaturate, set_hls_values, get_color_cycle
  File "/Users/reallymemorable/.pyenv/versions/3.6.7/lib/python3.6/site-packages/seaborn/utils.py", line 11, in <module>
    import matplotlib.pyplot as plt
  File "/Users/reallymemorable/.pyenv/versions/3.6.7/lib/python3.6/site-packages/matplotlib/pyplot.py", line 2374, in <module>
    switch_backend(rcParams["backend"])
  File "/Users/reallymemorable/.pyenv/versions/3.6.7/lib/python3.6/site-packages/matplotlib/pyplot.py", line 207, in switch_backend
    backend_mod = importlib.import_module(backend_name)
  File "/Users/reallymemorable/.pyenv/versions/3.6.7/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/Users/reallymemorable/.pyenv/versions/3.6.7/lib/python3.6/site-packages/matplotlib/backends/backend_macosx.py", line 14, in <module>
    from matplotlib.backends import _macosx
ImportError: Python is not installed as a framework. The Mac OS X backend will not be able to function correctly if Python is not installed as a framework. See the Python documentation for more information on installing Python as a framework on Mac OS X. Please either reinstall Python as a framework, or try one of the other backends. If you are using (Ana)Conda please install python.app and replace the use of 'python' with 'pythonw'. See 'Working with Matplotlib on OSX' in the Matplotlib FAQ for more information.
2

2 Answers

1
votes

This is not a SK-Learn error, nor a really python installation error either. It is because of your matplotlib does not have a back-end to draw.

you can try to setup the backend in your matplotlib ocnfiguration file,

backend : WXAgg 

and make sure wxPython is installed.

You can go through this question You need to set your backend and this matplotlib document to know more about what is a backend

0
votes

Your problem doesn't seem to be related to scikit-learn, but with your python installation. Here you can find a possible solution. Good luck!