3
votes

I am attempting to overlay two graphs, a bar graph and a scatter plot, that share an x-axis but have separate y-axes on either side of the graph. I have tried using matplotlib, ggplot, and seaborn, but I am having the same problem with all of them. I can graph them separately and they graph correctly, but when I try to graph them together, the bar graph is correct while only a couple of data points from the scatter plot show up. I have zoomed in and can confirm that almost none of the scatter-plot points are appearing.

Here is my code. I have loaded a pandas dataframe and am trying to graph 'dKO_Log2FC' as a bar graph and 'TTCAAG' as a scatter plot. They both share the 'bin_end' position on the x-axis. If I comment out sns.barplot, the scatter plot graphs perfectly. If I comment out sns.scatterplot, the bar plot graphs as well. When I graph them together without commenting out either, the bar graph graphs, but only two data points from the 'TTCAAG' column show up. I have played with the size of the scatter dots, zoomed in, etc., but nothing is working.

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
import pandas as pd

file = pd.read_csv('path/to/csv_file.csv')
df = pd.DataFrame(file, columns=['bin_end', 'TTCAAG', 'dKO_Log2FC'])

bin_end = df['bin_end']
TTCAAG = df['TTCAAG']
dKO_Log2FC = df['dKO_Log2FC']

fig, ax = plt.subplots()
ax2 = ax.twinx()

sns.barplot(x=bin_end, y=dKO_Log2FC, ax=ax, color="blue", data=df)

sns.scatterplot(x=bin_end, y=TTCAAG, ax=ax2, color="red", data=df)

plt.title('Histone Position in TS559 vs dKO')
plt.xlabel('Genomic Position (Bin = 1000nt)', fontsize=10)
plt.xticks([])
plt.ylabel('Log2 Fold Change', fontsize=10)

plt.show()

I have no idea why the scatter plot won't graph completely. The dataset is quite large, but even when I break it up into smaller pieces, only a few scatter points show up. Here are the graphs: [images: scatter; bar; bar + scatter]

1
In general, you cannot put a seaborn scatterplot and a seaborn barplot into the same axes. You can use matplotlib's scatter and bar instead. - ImportanceOfBeingErnest
Is there a reason you are using seaborn instead of just matplotlib? - BenT
matplotlib drops some of my data. The resolution isn't good. Specifically, I need to see the two dropouts (where Log2FC drastically decreases at two locations), but plt was not showing them unless I zoomed in. - Kristin

1 Answer

2
votes

I'm not sure what the problem is; I think it is related to the amount of data or some other data-related issue. However, since you can plot the data separately, you can generate an image for each plot and then blend the two images to get the required plot.

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
from PIL import Image

npoints=200
xRange=np.arange(0,npoints,1)
randomdata0=np.abs(np.random.normal(0,1,npoints))
randomdata1=np.random.normal(10,1,npoints)
axtick=[7,10,14]
ax2tick=[0,1.5,3]

fig0=plt.figure(0)
ax=fig0.gca()
ax2=ax.twinx()
sns.scatterplot(x=xRange,y=randomdata1,ax=ax)
ax.set_yticks(axtick)
ax.set_ylim([6,15])
ax2.set_yticks(ax2tick)
ax2.set_ylim([0,3.5])
plt.xticks([])

canvas0 = FigureCanvas(fig0)
s, (width, height) = canvas0.print_to_buffer()
X0 = Image.frombytes("RGBA", (width, height), s) #Contains the data of the first plot

fig1=plt.figure(1)
ax=fig1.gca()
ax2=ax.twinx()
sns.barplot(x=xRange,y=randomdata0,ax=ax2)
ax.set_yticks(axtick)
ax.set_ylim([6,15])
ax2.set_yticks(ax2tick)
ax2.set_ylim([0,3.5])
plt.xticks([])

canvas1 = FigureCanvas(fig1)
s, (width, height) = canvas1.print_to_buffer()
X1 = Image.frombytes("RGBA", (width, height), s) #Contains the data of the second plot

plt.figure(13,figsize=(10,10))
plt.imshow(Image.blend(X0,X1,0.5),interpolation='gaussian')
Axes=plt.gca()
Axes.spines['top'].set_visible(False)
Axes.spines['right'].set_visible(False)
Axes.spines['bottom'].set_visible(False)
Axes.spines['left'].set_visible(False)
Axes.set_xticks([])
Axes.set_yticks([])

Just remember to give the twin axes the same range and ticks in both plots; otherwise the images will be shifted relative to each other and the numbers will not align. Hope it helps.
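
As an alternative to blending images, the suggestion from the comments (matplotlib's own bar and scatter on twin axes) can be sketched as follows. A likely cause of the missing points is that sns.barplot treats x as categories placed at 0, 1, 2, ..., while the scatter plot uses the actual bin_end values, so the two layers end up at different x positions. This is only a sketch under assumptions: the data here is synthetic, and the helper name plot_bar_and_scatter, the bin_width default, and the right-hand axis label are made up for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_bar_and_scatter(x, bar_values, scatter_values, bin_width=1000):
    """Draw bars on the left y-axis and points on the right y-axis.

    Hypothetical helper: both layers are placed at the numeric x values,
    so they line up on the shared x-axis.
    """
    fig, ax = plt.subplots()
    ax2 = ax.twinx()  # second y-axis that shares the x-axis with ax

    # ax.bar centers each bar on its numeric x value.
    ax.bar(x, bar_values, width=bin_width, color="blue")
    # ax2.scatter uses the same numeric x values, so points overlay the bars.
    ax2.scatter(x, scatter_values, color="red", s=10)

    ax.set_title("Histone Position in TS559 vs dKO")
    ax.set_xlabel("Genomic Position (Bin = 1000nt)", fontsize=10)
    ax.set_ylabel("Log2 Fold Change", fontsize=10)
    ax2.set_ylabel("TTCAAG", fontsize=10)  # assumed label for the right axis
    ax.set_xticks([])
    return fig, ax, ax2


# Synthetic stand-in for the CSV columns (assumption: bins of 1000 nt).
rng = np.random.default_rng(0)
bin_end = np.arange(1000, 201000, 1000)
dKO_Log2FC = rng.normal(0, 1, bin_end.size)
TTCAAG = rng.integers(0, 5, bin_end.size)

fig, ax, ax2 = plot_bar_and_scatter(bin_end, dKO_Log2FC, TTCAAG)
# plt.show()  # uncomment when running interactively
```

With the real data, the three dataframe columns would be passed in place of the synthetic arrays. If narrow bars seem to vanish at the default size, as mentioned in the comments, a wider figure or a higher dpi may help, though that has not been checked against the original dataset.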