Splitting a dataframe into chunks and naming each new chunk into a dataframe

Question

Is there a good way to split a dataframe into chunks and automatically name each chunk as its own dataframe?

For example, dfmaster has 1000 records. I want to split it into chunks of 200 and create df1, df2, …, df5. Any guidance would be much appreciated.

I've looked on other boards and found no guidance on a function that can automatically create new dataframes.

If you're reading your data with pd.read_csv or anything similar, you can use the chunksize-parameter: pandas.pydata.org/pandas-docs/version/0.23/generated/…. You'll make a simple for chunk in pd.read_csv(chunksize=200), and so on. — RoyM
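A minimal sketch of what that comment describes, assuming the data comes from a CSV. The in-memory `csv_text`, its column names, and the `chunks` dict are made up for illustration; with a real file you would pass its path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Stand-in for a real file: 1000 rows of made-up data (hypothetical columns).
csv_text = "id,value\n" + "\n".join(f"{i},{i * 2}" for i in range(1000))

# With chunksize set, read_csv returns an iterator that yields DataFrames
# of up to 200 rows each instead of one large DataFrame.
chunks = {}
for n, chunk in enumerate(pd.read_csv(io.StringIO(csv_text), chunksize=200), start=1):
    chunks[f"df{n}"] = chunk  # keys "df1", "df2", ... in reading order

print(list(chunks))
print(len(chunks["df1"]))
```

Keeping the chunks in a dict keyed by name is a choice made here, not something the comment prescribes; it avoids creating variables dynamically.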

Mayank Porwal · Accepted Answer · 2018-11-15T07:15:25

Use numpy for splitting:

See example below:

In [2095]: df
Out[2095]: 
     0     1     2    3     4    5     6     7     8     9     10
0  0.25  0.00  0.00  0.0  0.00  0.0  0.94  0.00  0.00  0.63  0.00
1  0.51  0.51   NaN  NaN   NaN  NaN   NaN   NaN   NaN   NaN   NaN
2  0.54  0.54  0.00  0.0  0.63  0.0  0.51  0.54  0.51  1.00  0.51
3  0.81  0.05  0.13  0.7  0.02  NaN   NaN   NaN   NaN   NaN   NaN

In [2096]: np.split(df, 2)
Out[2096]: 
[     0     1    2    3    4    5     6    7    8     9    10
 0  0.25  0.00  0.0  0.0  0.0  0.0  0.94  0.0  0.0  0.63  0.0
 1  0.51  0.51  NaN  NaN  NaN  NaN   NaN  NaN  NaN   NaN  NaN,
      0     1     2    3     4    5     6     7     8    9     10
 2  0.54  0.54  0.00  0.0  0.63  0.0  0.51  0.54  0.51  1.0  0.51
 3  0.81  0.05  0.13  0.7  0.02  NaN   NaN   NaN   NaN  NaN   NaN]

`df` gets split into 2 dataframes having `2` rows each.

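For your case, you can do `np.split(dfmaster, 5)`, which gives 5 dataframes of 200 rows each. The second argument is the number of sections, not the chunk size. `np.split` raises an error if the rows cannot be divided equally; `np.array_split` accepts uneven splits.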
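To get names like df1, df2, … for the pieces, one option is to keep them in a dict. The sketch below is an assumption-laden illustration rather than part of the original answer: it swaps numpy splitting for plain `iloc` row slicing, takes a chunk size instead of a section count, and the helper `split_into_named_chunks` and the sample `dfmaster` are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the question's dfmaster: 1000 rows.
dfmaster = pd.DataFrame({"value": np.arange(1000)})


def split_into_named_chunks(df, chunk_size, prefix="df"):
    """Return {"df1": first chunk, "df2": second chunk, ...} via row slices."""
    starts = range(0, len(df), chunk_size)
    return {
        f"{prefix}{n}": df.iloc[start:start + chunk_size]
        for n, start in enumerate(starts, start=1)
    }


frames = split_into_named_chunks(dfmaster, 200)
print(list(frames))
print(frames["df3"].shape)
```

Each chunk is then available as `frames["df1"]`, `frames["df2"]`, and so on. If the row count is not a multiple of the chunk size, the last chunk is simply shorter.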
