I know that work is being done to improve the Join feature on Bigquery, not to rant here but it will be hard to analyze 'Terabyte' sets of data as 'advertised' if Joins can not be used properly.
OK, back to the problem, I have two tables one is 600 Megs and the other one is 50Megs, I did try to make a join and I got an error about smaller table must be left. I did some research and I found out that Bigquery considers both tables as big if they are greater than 7MB?
So based on some advice I found online I did select the dataset in question of the smaller table and saved it in a new table, the new data set is 12MB and 600K rows over two columns. Then I tried my query again but I am still getting the same error :
Query Failed : Error: Large table cdrs_test.geoIP_Left must appear as the leftmost table in a join query
geoIP_Left is 12 Megs, the other table is 600Megs..
Can this be fixed somehow or am I busted? If so anyone knows about any other service that I can use to analyse big sets of data that does support joins?
EDIT: This is the actual query;
SELECT COUNT(results.cc_card) AS count,
sum(results.sessiontime) AS time,
geoIP_Left.place AS place
FROM cdrs_test.cdrs_2010_5 AS results
JOIN cdrs_test.geoIP_Left AS geoIP_table
ON results.cc_card = geoIP_table.vcard
WHERE results.sessiontime > 0 AND results.countryName Contains 'India'
GROUP BY place;