1
votes

Background:

I'm working on a program to show a 2D cross section of 3D data. The data is stored in a simple CSV text file in the format x, y, z1, z2, z3, etc. I take a start and end point and scan through the dataset (~110,000 lines) to create a line of points between these two locations, and dump them into an array. This works fine and fairly quickly (about 0.3 seconds). To display this line, I've been creating a matplotlib stacked bar chart. However, the total run time of the program is about 5.5 seconds, and I've narrowed the bulk of it (3 seconds' worth) down to the code below.
'values' is an array with the x, y and z values plus a leading identifier, which isn't used in this part of the code. The first plt.bar plots the bar sections, and the second creates an arbitrary floor at -2000. To generate a continuous-looking section, I leave no gap between adjacent bars.

    import matplotlib.pyplot as plt

    for values in crossSection:
        prevNum = None
        layerColour = None
        if values is not None:
            for i in range(3, len(values)):
                if values[i] != 'n':
                    num = float(values[i].strip())

                    if prevNum is not None:
                        plt.bar(spacing, prevNum-num, width=interval,
                                bottom=num, color=layerColour,
                                edgecolor=None, linewidth=0)

                    prevNum = num

                    layerColour = layerParams[i].strip()

            if prevNum is not None:
                plt.bar(spacing, prevNum+2000, width=interval, bottom=-2000,
                        color=layerColour, linewidth=0)

        spacing += interval

I'm sure there's a more efficient way to do this, but I'm new to Matplotlib and still unfamiliar with its capabilities. The other main use of time in the code is:

    plt.savefig('output.png')

which takes about a second, but I assume that's to be expected when saving the file and that I can't do anything about it.

Question:

Is there a faster way of generating the same output (a stacked bar chart or something that looks like one) by using plt.bar() better, or a different Matplotlib function?

EDIT: I forgot to mention in the original post that I'm using Python 3.2.3 and Matplotlib 1.2.0

2 Answers

2
votes

Leaving this here in case someone runs into the same problem.
While stackplot() is not exactly the same as bar(), with a sufficiently large dataset (large enough that bar() takes a few seconds) the two outputs are indistinguishable. If I sort the data into layers using the method given by tcaswell and feed it into stackplot(), the chart is created in 0.2 seconds rather than 3 seconds.

EDIT

Code provided by tcaswell to turn the data into layers:

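import numpy as np

# Collect one row of floats per cross-section point,
# then transpose so that each row holds one layer.
accum_values = []
for values in crossSection:
    accum_values.append([float(v.strip()) for v in values[3:]])
accum_values = np.vstack(accum_values).T
layer_params = [l.strip() for l in layerParams]
bottom = np.zeros(accum_values[0].shape)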
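
For anyone who wants to try the stackplot() route, here is a minimal sketch of what I mean. It is not my actual program: the `thickness` array, the colour names and `interval` are made-up stand-ins, and it assumes the data has already been turned into one row of layer thicknesses per layer. Note that stackplot() joins neighbouring columns with straight lines instead of drawing steps, which is why the result only looks the same as bar() when the columns are narrow.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

interval = 10.0  # hypothetical spacing between cross-section columns

# Hypothetical layer thicknesses, shape (n_layers, n_columns).
# The first row is drawn at the bottom of the stack.
thickness = np.array([
    [1100.0, 1050.0, 1000.0, 1130.0],
    [700.0, 800.0, 820.0, 650.0],
    [300.0, 270.0, 290.0, 310.0],
])
colours = ["slategray", "peru", "gold"]  # one colour per layer (made up)

x = interval * np.arange(thickness.shape[1])

fig, ax = plt.subplots()
# A single call draws every layer; each layer becomes one filled polygon.
polys = ax.stackplot(x, thickness, colors=colours, linewidth=0)
```

The chart can then be written out with fig.savefig() as before. The stack starts at zero, so a floor such as -2000 would have to be handled separately, for example by shifting the axis labels.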
1
votes

It looks like you are drawing each bar individually. You can pass sequences to bar instead (see this example).

I think something like:

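import numpy as np
import matplotlib.pyplot as plt

# Collect one row of floats per cross-section point
accum_values = []
for values in crossSection:
    accum_values.append([float(v.strip()) for v in values[3:]])

# Transpose so that each row holds one layer across all points
accum_values = np.vstack(accum_values).T
layer_params = [l.strip() for l in layerParams]
bottom = np.zeros(accum_values[0].shape)

ax = plt.gca()
spacing = interval * np.arange(len(accum_values[0]))
# One bar() call per layer, stacked on top of the running total
for data, color in zip(accum_values, layer_params):
    ax.bar(spacing, data, bottom=bottom, color=color, linewidth=0, width=interval)
    bottom += data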

will be faster, because each call to bar creates one BarContainer. I suspect the source of your problem is that you were creating one container for each bar instead of one for each layer.

I don't really understand what you are doing with the bars that have tops below their bottoms, so I didn't try to implement that; you will have to adapt this a bit.
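
One possible adaptation, as a rough sketch only: if each z value is the elevation of a layer top (listed from the highest layer down) and the lowest layer should extend to a floor of -2000, the bar heights can be computed from neighbouring rows and drawn with one bar call per layer. The `tops` array, `FLOOR`, `colours` and `interval` below are made-up stand-ins, and the sketch assumes there are no missing ('n') values, so it is one reading of the question's data and not a drop-in replacement.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

FLOOR = -2000.0   # arbitrary floor, as in the question
interval = 10.0   # hypothetical column width

# Hypothetical layer-top elevations, shape (n_layers, n_columns),
# ordered from the highest layer to the lowest.
tops = np.array([
    [100.0, 120.0, 110.0, 90.0],
    [-200.0, -150.0, -180.0, -220.0],
    [-900.0, -950.0, -1000.0, -870.0],
])
colours = ["gold", "peru", "slategray"]  # one colour per layer (made up)

# The bottom of each layer is the top of the layer below it;
# the lowest layer sits on the floor.
floor_row = np.full((1, tops.shape[1]), FLOOR)
bottoms = np.vstack([tops[1:], floor_row])
heights = tops - bottoms

x = interval * np.arange(tops.shape[1])

fig, ax = plt.subplots()
# One bar() call, and therefore one BarContainer, per layer.
containers = [
    ax.bar(x, h, width=interval, bottom=b, color=c, linewidth=0)
    for h, b, c in zip(heights, bottoms, colours)
]
```

With this layout the number of bar calls depends on the number of layers, not on the number of columns in the cross section.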