
I am trying to display the percentage for each column of a bar chart, but I run into this error: IndexError: index 3 is out of bounds for axis 0 with size 3

import pandas as pd
import numpy as np 
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns


df = pd.read_csv('https://github.com/Kevin-ck1/Intro-To-Data-Science/raw/master/hotel_bookings.csv')

df_booking = df.copy()

df_booking['percentage'] = round(df_booking['arrival_date_year']/df_booking['arrival_date_year'].sum() * 100, 1)

df_booking['arrival_date_year'].value_counts(dropna=True).plot(kind="bar")
xlocs, xlabs = plt.xticks()

for i, v in enumerate(df_booking['percentage']):

    plt.text(xlocs[i] - 0.08, v + 25, str(v) + '%', fontsize = 15)

plt.title('Booking')
plt.show()

The error message given is

IndexError                                Traceback (most recent call last)

<ipython-input-59-07b4659a45f8> in <module>()
     21 for i, v in enumerate(df_booking['percentage']):
     22 
---> 23     plt.text(xlocs[i] - 0.08, v + 25, str(v) + '%', fontsize = 15)
     24 
     25 plt.title('Booking')

IndexError: index 3 is out of bounds for axis 0 with size 3
Please edit your post and paste your entire code. - Roim
All the percentages in the column are 0. Additionally, you're trying to annotate 3 bars with 119390 values from the column. - Trenton McKinney
@TrentonMcKinney the output shows zero, but that should not be the actual value. It should be 100 divided among the three values. - KevinCK

1 Answer

Working from your current code, it makes sense to store the counts first, calculate the percentages from them, and then annotate the bars with those percentages:

counts = df['arrival_date_year'].value_counts(dropna=True)
perc = round(100*counts/counts.sum(),1).to_numpy()

counts.plot(kind="bar")

xlocs, xlabs = plt.xticks()

for i in range(len(counts)):

    plt.text(xlocs[i] - 0.08, counts.iloc[i] + 25, str(perc[i]) + '%', fontsize = 10)

plt.title('Booking')
plt.show()
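
As for why the original loop failed: it iterated over every row of the 'percentage' column, while the chart only has one bar (and one tick location) per distinct year, so the index ran past the end of xlocs. Below is a minimal, self-contained sketch of the same idea. The small `years` series is made-up stand-in data, not the real hotel_bookings.csv, and the labels are placed with `ax.text` at the bar positions rather than through `xlocs`.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for df['arrival_date_year'] (not the real data).
years = pd.Series([2015, 2016, 2016, 2016, 2017, 2017], name="arrival_date_year")

# One entry per distinct year, so this has exactly as many rows as the plot has bars.
counts = years.value_counts(dropna=True)
perc = (100 * counts / counts.sum()).round(1)

# The original loop iterated over the full column instead of the counts.
print(len(years), "rows vs", len(counts), "bars")

ax = counts.plot(kind="bar")

# pandas draws the bars at x = 0, 1, 2, ... in the same order as `counts`.
labels = []
for i, (count, pct) in enumerate(zip(counts, perc)):
    label = f"{pct}%"
    labels.append(label)
    # Center each label horizontally on its bar and sit it just above the top.
    ax.text(i, count, label, ha="center", va="bottom", fontsize=10)

ax.set_title("Booking")
```

In a notebook with inline plotting the figure is displayed automatically; in a plain script, call plt.show() afterwards. Using ha="center" avoids the manual "- 0.08" offset, which depends on the font size and label width.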
