The complication with using NumPy is that there are two sources of error (and two sets of documentation to read): Python itself and NumPy.
I believe your problem here is that you are working with a so-called structured (NumPy) array, whose elements are records with named fields, so a reduction such as sum() is not defined on the array as a whole.
Consider the following example:
>>> import numpy as np
>>> a = np.array([(1,2), (4,5)], dtype=[('Game 1', '<f8'), ('Game 2', '<f8')])
>>> a.sum()
TypeError: cannot perform reduce with flexible type
If I first select the field I want to use, the sum works:
>>> import numpy as np
>>> a = np.array([(1,2), (4,5)], dtype=[('Game 1', '<f8'), ('Game 2', '<f8')])
>>> a["Game 1"].sum()
5.0
This is the result I wanted.
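If totals are needed for every field rather than just one, the same field-selection idea can be repeated over `a.dtype.names`. The following is only a sketch built on the two-field array above; the names `field_totals` and `row_totals` are illustrative and do not come from the question.

```python
import numpy as np

# Same structured array as in the example above.
a = np.array([(1, 2), (4, 5)], dtype=[('Game 1', '<f8'), ('Game 2', '<f8')])

# Sum each named field separately (one total per column).
field_totals = {name: float(a[name].sum()) for name in a.dtype.names}

# Add the fields element-wise to get one total per record (row).
row_totals = sum(a[name] for name in a.dtype.names)

print(field_totals)
print(row_totals)
```

Selecting a field returns an ordinary float array, which is why the reductions work here even though `a.sum()` does not.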
Alternatively, you might consider using pandas (a Python library) or switching to R.
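For comparison, here is a rough pandas sketch of the same kind of summation. The two-player table and the variable names are made up for illustration, and this assumes pandas is installed.

```python
import pandas as pd

# Hypothetical two-player, two-game table for illustration.
df = pd.DataFrame(
    {'Game1': [2, 6], 'Game2': [6, 4]},
    index=['Player1', 'Player2'],
)

per_player = df.sum(axis=1)  # sum across columns: one total per player
per_game = df.sum(axis=0)    # sum down rows: one total per game

print(per_player)
print(per_game)
```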
Personal opinions
Even though NumPy is certainly a powerful library, I still avoid using it for data science and other tasks where the program is designed around "flexible" data types. Personally, I use NumPy when I need something to be fast and maintainable (it is easy to write "code for the future") but do not have the time to write a C program.
As for pandas, it is convenient for us "Python hackers" because it is essentially "R data structures implemented in Python", whereas R is (obviously) an entirely new language. I personally use R because I consider pandas to be under rapid development, which makes it difficult to write "code with the future in mind".
As suggested in a comment (by @jorijnsmit, I believe), there is no need to introduce large dependencies such as pandas for "simple" cases. The minimalistic example below, which is compatible with both Python 2 and 3, uses "typical" Python tricks to massage the data in the question.
import csv
## Data-file
data = \
'''
, Game1, Game2, Game3, Game4, Game5
Player1, 2, 6, 5, 2, 2
Player2, 6, 4 , 1, 8, 4
Player3, 8, 3 , 2, 1, 5
Player4, 4, 9 , 4, 7, 9
'''
# Write data to file
with open('data.csv', 'w') as FILE:
    FILE.write(data)
print("Raw data:")
print(data)
# 1) Read the data file and strip away spaces; the result is a list of rows.
#    Text mode ('r') is used because csv.reader expects strings on Python 3.
with open('data.csv', 'r') as FILE:
    raw = [[item.strip() for item in line]
           for line in csv.reader(FILE, delimiter=',') if line]
print("Data after Read:")
print(raw)
# 2) Convert numerical data to integers ("float" would also work)
for (i, line) in enumerate(raw[1:], 1):
    for (j, item) in enumerate(line[1:], 1):
        raw[i][j] = int(item)
print("Data after conversion:")
print(raw)
# 3) Use the data...
print("Use the data")
for i in range(1, len(raw)):
    print("Sum for Player %d: %d" % (i, sum(raw[i][1:])))
# Transpose the rows into columns once; raw[0] is the header row, so its
# length gives the number of columns (one label column plus one per game).
columns = list(zip(*raw))
for i in range(1, len(raw[0])):
    print("Total points in Game %d: %d" % (i, sum(columns[i][1:])))
The output would be:
Raw data:
, Game1, Game2, Game3, Game4, Game5
Player1, 2, 6, 5, 2, 2
Player2, 6, 4 , 1, 8, 4
Player3, 8, 3 , 2, 1, 5
Player4, 4, 9 , 4, 7, 9
Data after Read:
[['', 'Game1', 'Game2', 'Game3', 'Game4', 'Game5'], ['Player1', '2', '6', '5', '2', '2'], ['Player2', '6', '4', '1', '8', '4'], ['Player3', '8', '3', '2', '1', '5'], ['Player4', '4', '9', '4', '7', '9']]
Data after conversion:
[['', 'Game1', 'Game2', 'Game3', 'Game4', 'Game5'], ['Player1', 2, 6, 5, 2, 2], ['Player2', 6, 4, 1, 8, 4], ['Player3', 8, 3, 2, 1, 5], ['Player4', 4, 9, 4, 7, 9]]
Use the data
Sum for Player 1: 17
Sum for Player 2: 23
Sum for Player 3: 19
Sum for Player 4: 33
Total points in Game 1: 20
Total points in Game 2: 22
Total points in Game 3: 12
Total points in Game 4: 18
Total points in Game 5: 20
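The per-game totals rely on `zip(*raw)` to turn a list of rows into a list of columns. A small stand-alone sketch of that trick is shown here; the two-player table and the names `rows`, `columns` and `game_totals` are illustrative only.

```python
# Hypothetical table in the same shape as the converted data:
# a header row, then one row per player with a label in column 0.
rows = [
    ['', 'Game1', 'Game2'],
    ['Player1', 2, 6],
    ['Player2', 6, 4],
]

# zip(*rows) pairs up the i-th item of every row, i.e. it transposes the table.
columns = list(zip(*rows))

# Skip column 0 (the player labels); each remaining column is
# (game name, score, score, ...), so sum everything after the name.
game_totals = {col[0]: sum(col[1:]) for col in columns[1:]}

print(columns[1])
print(game_totals)
```

Doing the transpose once and reusing `columns` avoids rebuilding it on every loop iteration.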
; delimiter. Otherwise you leave us guessing as to how the dataset was written. x is probably an array of strings, since nothing in your code converts strings to numbers. – hpaulj