I have data in Cloud Storage that I want to load into BigQuery so I can compute statistics on it. Currently I'm using a JobConfigurationLoad to load a single file. Here is a sample of the code:
JobConfigurationLoad jobconfigurationqLoad = new JobConfigurationLoad();
jobconfigurationqLoad.setSkipLeadingRows(1); // The first line contains the column names
jobconfigurationqLoad.setSourceUris(Lists.newArrayList("gs://my_app/folder_name/test_file.csv"));
jobconfigurationqLoad.setWriteDisposition("WRITE_APPEND");
jobconfigurationqLoad.setEncoding(PlatformConstants.DEFAULT_ENCODING);
jobconfigurationqLoad.setCreateDisposition("CREATE_IF_NEEDED");
jobconfigurationqLoad.setDestinationTable(tableReference);
// tableReference points to my destination table in BigQuery
jobconfigurationqLoad.setSchemaInline("field1:STRING,field2:STRING");
// JobConfiguration
JobConfiguration jobConfiguration = new JobConfiguration();
jobConfiguration.setLoad(jobconfigurationqLoad);
// JobReference
JobReference jobreference = new JobReference();
jobreference.setProjectId(PROJECT_ID);
// Job
Job insertJob = new Job();
insertJob.setConfiguration(jobConfiguration);
insertJob.setJobReference(jobreference);
In "setSourceUris" I wanted to pass only the folder and have all the files in it loaded, but that doesn't seem to work. I saw some documentation in the Google API about listing the contents of a bucket, but nothing about listing only one folder inside the bucket. Something similar is in this answer. I'm using GAE with Java.
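To make the goal concrete, here is a minimal sketch of the URIs I would like to end up passing to "setSourceUris". It assumes that load jobs accept a single `*` wildcard in the object name of a gs:// URI, which I have not verified against the API version I'm using. `SourceUris`, `folderWildcard` and `explicitUris` are hypothetical helper names of my own, not part of the BigQuery client library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class (not part of any Google library).
class SourceUris {

    // Builds a URI that would match every object under a folder,
    // ASSUMING the load job accepts one '*' wildcard in the object name.
    static String folderWildcard(String bucket, String folder) {
        // Strip leading and trailing slashes so "/folder_name/" also works.
        String trimmed = folder.replaceAll("^/+|/+$", "");
        if (trimmed.isEmpty()) {
            return "gs://" + bucket + "/*";
        }
        return "gs://" + bucket + "/" + trimmed + "/*";
    }

    // Fallback if wildcards are not accepted: one URI per object name.
    // The object names would have to come from listing the bucket contents.
    static List<String> explicitUris(String bucket, List<String> objectNames) {
        List<String> uris = new ArrayList<>();
        for (String name : objectNames) {
            uris.add("gs://" + bucket + "/" + name);
        }
        return uris;
    }
}
```

With this, the load configuration above would call something like jobconfigurationqLoad.setSourceUris(Lists.newArrayList(SourceUris.folderWildcard("my_app", "folder_name"))) instead of naming test_file.csv directly.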