
This is just a small problem that has been bugging me for a while.

I have a pandas DataFrame consisting entirely of continuous variables. I want to draw a scatter plot (using matplotlib) for any chosen pair of variables, using the interactive widgets in Jupyter.

Let's say the data has 3 numeric columns: 'a','b', and 'c'.

So far I have these lines of code:

def g(x,y):
    plt.scatter(x, y)
interactive_plot = interactive(g, x=['a','b','c'], y=['a','b','c'])
interactive_plot

They work in the sense that a scatter plot is produced whenever I pick a pair of variables from the drop-down boxes for x and y. The problem is that previous plots are not erased before a new one is shown. In other words, matplotlib doesn't update the plot in the existing figure but simply stacks the figures one after another. So if I change the pair of variables 10 times, I end up with 10 scatter plots, which isn't what I want.

Could anyone help me with this?

Thanks in advance.

Comment: The code you show would result in something like ValueError: could not convert string to float: a. The answer below shows what a minimal reproducible example could look like. Please always include such a minimal reproducible example when asking questions. (ImportanceOfBeingErnest)

1 Answer


You can add plt.show() at the end of your function. The graph is then redrawn in the same cell output instead of a new one being added.

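import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from ipywidgets import interactive
%matplotlib inline

# Example data: 10 rows, 3 numeric columns
columns=['a','b','c']
data = np.cumsum(np.random.rand(10,3),axis=1)
df = pd.DataFrame(data,columns=columns)

def g(x,y):
    # x and y are column names, so look the data up in df
    plt.scatter(df[x], df[y])
    # show the figure now, so it replaces the previous output
    plt.show()

# one drop-down per argument, each listing the column names
interactive_plot = interactive(g, x=columns, y=columns)
interactive_plot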
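
As a possible variation on the code above, here is a minimal sketch that labels the axes with the selected column names, so it is easier to see which pair is currently plotted. The helper names scatter_pair and make_interactive are made up for this illustration and are not part of matplotlib or ipywidgets; the DataFrame is passed in as an argument rather than read from a global. The plt.show() call is kept exactly as in the answer.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def scatter_pair(df, x, y):
    """Hypothetical helper: scatter df[x] against df[y] on a fresh figure."""
    fig, ax = plt.subplots()
    ax.scatter(df[x], df[y])
    ax.set_xlabel(x)  # label the axes so the selected pair is visible
    ax.set_ylabel(y)
    return ax


def make_interactive(df):
    """Hypothetical helper: build drop-downs over all columns of df."""
    # Imported here so scatter_pair can also be used outside a notebook.
    from ipywidgets import interactive

    columns = list(df.columns)

    def g(x, y):
        scatter_pair(df, x, y)
        plt.show()  # as in the answer: show the figure inside the widget output

    return interactive(g, x=columns, y=columns)


# Same kind of random demo data as in the answer.
df = pd.DataFrame(np.cumsum(np.random.rand(10, 3), axis=1),
                  columns=['a', 'b', 'c'])
```

In a notebook cell (with %matplotlib inline active), putting make_interactive(df) as the last expression should display the two drop-downs and the plot.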
