0
votes

I have split my dataset into training and test sets using torch.utils.data.random_split:

import torch
n_train = int(len(supervised_data) * 0.8)
lengths = [n_train, len(supervised_data) - n_train]  # lengths must sum to len(supervised_data)
train_data, test_data = torch.utils.data.random_split(supervised_data, lengths)

Now I want to append additional data to train_data, because I am running several experiments that add more data to the training set while keeping the same test_data for all of them.

Is that possible?

1
What data are you trying to append to the training data? random_split just splits the entire dataset you provide into two random parts, train data and test data, according to the lengths you give it. If your additional data has the same format as the original data, you can append it to train_data and train your model. - Rishabh Mishra
Why? Just use the original dataframe that you split. - Gedas Miksenas
@GedasMiksenas I'm trying to run several experiments, that's why, but I want to keep the test data the same for all of them - manlike
@RishabhMishra it is the same format as the data I already split - manlike
You can definitely join additional data with the training data. It is similar to joining train_data and test_data to get back the full supervised_data in your case. So you will be training your model on (train_data + additional_data) and testing it on test_data. - Rishabh Mishra

1 Answer

0
votes

If train_data and test_data are pandas DataFrames and you want to join them, you can do that with:

import pandas as pd
joined_df = pd.concat([train_data, test_data])
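
Note that pd.concat only works on pandas objects, while torch.utils.data.random_split returns torch.utils.data.Subset objects. For the setup in the question, a minimal sketch using torch.utils.data.ConcatDataset might look like the following. The toy TensorDataset contents, the name additional_data, the name extended_train_data, and the fixed seed are assumptions for illustration only, not taken from the question.

```python
import torch
from torch.utils.data import ConcatDataset, TensorDataset, random_split

# Hypothetical stand-ins for the asker's data: 10 labelled samples plus 5 extra ones.
supervised_data = TensorDataset(torch.arange(10).float().unsqueeze(1), torch.zeros(10))
additional_data = TensorDataset(torch.arange(100, 105).float().unsqueeze(1), torch.ones(5))

# Split once with a fixed seed so test_data is identical across experiments.
n_train = int(len(supervised_data) * 0.8)
generator = torch.Generator().manual_seed(0)
train_data, test_data = random_split(
    supervised_data, [n_train, len(supervised_data) - n_train], generator=generator
)

# ConcatDataset chains the datasets without copying them; test_data is left untouched.
extended_train_data = ConcatDataset([train_data, additional_data])

print(len(train_data), len(additional_data), len(extended_train_data), len(test_data))
```

The extended_train_data object can be passed to a DataLoader like any other dataset, and each experiment can build its own ConcatDataset from the same train_data while evaluating on the same test_data.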