If the number of groups is known in advance, you could write a U-SQL stored procedure that takes two parameters: 1) the value of the group and 2) a prefix for the name of the output file.
In the pseudo-code below, the name of the final file is derived from the value of the group. The data to be split is sourced from a U-SQL table (referred to in the pseudo-code as <MyTable>).
DROP PROCEDURE IF EXISTS splitByGroups;
CREATE PROCEDURE splitByGroups(@groupValue string, @file_name_prefix string = "extract")
AS
BEGIN
DECLARE @OUTPUT string = "/output/" + @file_name_prefix + "_" + @groupValue + ".csv";
OUTPUT (
SELECT *
FROM <MyTable>
WHERE <MyGroup> == @groupValue
)
TO @OUTPUT
USING Outputters.Csv(outputHeader : true);
END;
You would then execute the stored procedure as many times as you have groups:
splitByGroups("group1", DEFAULT);
splitByGroups("group2", DEFAULT);
Alternatively, if you wish to analyse the multiple files offline, I would download the full file and use the shell (PowerShell or Linux Shell) to split the file.
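For the offline route, here is a minimal sketch of what the split could look like in a Linux shell using awk. It assumes a comma-separated file with a header row and no quoted fields containing commas; the function name `split_by_group`, its arguments, and the file names are made up for illustration. Output files follow the same `<prefix>_<group>.csv` naming as the procedure above.

```shell
#!/bin/sh
# Split a CSV (with a header row) into one file per distinct value of a column.
# Usage: split_by_group <input.csv> <group_column_number> <output_prefix>
# Assumption: fields are plain comma-separated values with no embedded commas.
split_by_group() {
    input="$1"
    col="$2"
    prefix="$3"
    awk -F',' -v col="$col" -v prefix="$prefix" '
        NR == 1 { header = $0; next }      # remember the header row
        {
            out = prefix "_" $col ".csv"   # e.g. extract_group1.csv
            if (!(out in seen)) {          # first row seen for this group
                print header > out         # start the file with the header
                seen[out] = 1
            }
            print $0 > out                 # awk keeps the file open, so this appends
        }
    ' "$input"
}
```

For example, `split_by_group full_extract.csv 3 extract` would write `extract_<value>.csv` for each distinct value found in the third column. With a very large number of groups, some awk implementations may hit the limit on open files, so this is best suited to a modest number of groups.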