I have to create a function which would split provided dataframe into chunks of needed size. For instance if dataframe contains 1111 rows, I want to be able to specify chunk size of 400 rows, and get three smaller dataframes with sizes of 400, 400 and 311. Is there a convenience function to do the job? What would be the best way to store and iterate over sliced dataframe?
Example DataFrame
import numpy as np
import pandas as pd
test = pd.concat([pd.Series(np.random.rand(1111)), pd.Series(np.random.rand(1111))], axis = 1)
test.index[::400]
and use this to slice the df:first = test.iloc[:400] second = test.iloc[400:800] third = test.iloc[800]
- EdChumsklearn train_test_split
also - EdChum