I have two tables of data, each with the same dimensions (245x10). The original file can be found here. I need to compute a t-test between these two tables; however, when I call scipy.stats.ttest_ind I get an error raised inside NumPy.
import scipy.stats as st
import numpy as np
import pandas as pd
df = pd.read_csv('GC Cerbellum final.txt', sep='\t')
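# positional column slices (.ix has been removed from recent pandas; .iloc is the positional equivalent)
df1 = df.iloc[:, 1:12]
df2 = df.iloc[:, 12:]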
st.ttest_ind(df1, df2)
/usr/local/lib/python2.7/dist-packages/numpy/core/fromnumeric.pyc in var(a, axis, dtype, out, ddof, keepdims)
   2936
   2937     return _methods._var(a, axis=axis, dtype=dtype, out=out, ddof=ddof,
-> 2938                          keepdims=keepdims)

/usr/local/lib/python2.7/dist-packages/numpy/core/_methods.pyc in _var(a, axis, dtype, out, ddof, keepdims)
     93     if isinstance(arrmean, mu.ndarray):
     94         arrmean = um.true_divide(
---> 95                 arrmean, rcount, out=arrmean, casting='unsafe', subok=False)
     96     else:
     97         arrmean = arrmean.dtype.type(arrmean / rcount)

TypeError: unsupported operand type(s) for /: 'str' and 'int'
I checked, and it looks like all the data are integers, so I'm not sure why it fails on strings. It could also be that the missing values are somehow filled in with strings, which would cause the failure.
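One way to test that guess is to look at the column dtypes and list the cells that cannot be parsed as numbers. The snippet below is only a sketch: the small DataFrame and the "NA?" placeholder are made-up stand-ins, not the contents of my actual file.

```python
import pandas as pd

# Made-up stand-in data (NOT the real file): column "b" hides a string placeholder.
df = pd.DataFrame({
    "a": [1, 2, 3],
    "b": ["4", "NA?", "6"],
})

# A column that shows up as object/string dtype contains text, not numbers.
print(df.dtypes)

# Coerce every column to numeric; anything that cannot be parsed becomes NaN.
numeric = df.apply(pd.to_numeric, errors="coerce")

# Cells that were present originally but turned into NaN are the offending strings.
bad_mask = numeric.isna() & df.notna()
bad_counts = bad_mask.sum()

# Collect the raw offending values per column, skipping clean columns.
bad_values = {
    col: df.loc[bad_mask[col], col].tolist()
    for col in df.columns
    if bad_mask[col].any()
}
print(bad_counts)
print(bad_values)
```

If `bad_values` comes back non-empty for the real tables, that would explain why the variance computation ends up dividing a string by an integer.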
So my question is: how can I perform a t-test on two tables with missing values in Python?
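For reference, this is roughly what I am trying to end up with, assuming the strings really are placeholders for missing values. It is a sketch on made-up tables (the "n/a" marker and the column names are hypothetical): the cells are coerced to floats so that unparsable entries become NaN, and `ttest_ind` is then asked to skip NaNs via its `nan_policy` argument.

```python
import pandas as pd
import scipy.stats as st

# Made-up stand-in tables (NOT the real data); "n/a" plays the role of a
# string placeholder for a missing value.
df1 = pd.DataFrame({"x": [1, 2, 3, 4], "y": [2, "n/a", 4, 6]})
df2 = pd.DataFrame({"x": [2, 3, 4, 5], "y": [1, 3, 5, 7]})

# Convert to float arrays; values that cannot be parsed become NaN.
a = df1.apply(pd.to_numeric, errors="coerce").to_numpy(dtype=float)
b = df2.apply(pd.to_numeric, errors="coerce").to_numpy(dtype=float)

# Column-by-column independent two-sample t-test, ignoring NaN entries.
t_stat, p_value = st.ttest_ind(a, b, axis=0, nan_policy="omit")
print(t_stat)
print(p_value)
```

With `axis=0` the test is run once per column, so the result holds one statistic and one p-value for each pair of matching columns. Whether dropping the missing entries is statistically appropriate for my data is a separate question.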