
I am converting a SQL Server query into Spark, and I am having trouble converting this clause:

and not exists (Select 1 from @TT t2 where t2.TID = f.ID)

I have worked on it and understood that Spark does not support the NOT EXISTS clause. I used EXCEPT instead, but I get this error:

pyspark.sql.utils.AnalysisException: u'Except can only be performed on tables with the same number of columns, but the left table has 7 columns and the right has 31;'

I tried an inner join as well, so EXCEPT does not work when the two tables have different numbers of columns. What would be a compatible alternative to this query in Spark? Kindly help me; I am using PySpark 2.0.

Please provide example data and expected output, so that your question can be reproduced. - mtoto
Here is my code, though it's messy: spark.sql("select f.id as TID,f.BldgID as TBldgID,f.LeaseID as TLeaseID,f.Period as TPeriod,f.ChargeAmt as TChargeAmt1,0 as TChargeAmt2,l.EXPIR as TBreakDate from Fact_CMCharges f join LEAS l on l.BLDGID = f.BldgID and l.LEASID = f.LeaseID and date_format(date_add(l.EXPIR,1),'D')<>1 and f.Period=Expirvalue(l.EXPIR) and (l.VACATE is null or l.VACATE >('select RunDate from RunDate')) where f.ActualProjected='Lease' except("Select 1 from TT t2 where t2.TID = f.ID ")").show() — I used the except command hoping to get the result described above. - Kalyan
Please provide a minimal dataset and expected output for your question, instead of copy-pasting irrelevant code into comments. - mtoto
Fortunately, I have solved my problem. Thanks for your reply :) The part below did the trick: "....except(select * from TT1 t2 left semi join Fact_CMCharges f2 on t2.TID=f2.ID)....." - Kalyan

1 Answer


From Spark 2.0 onwards, you can use the "left_anti" join type. A left anti join returns only the rows from the left table that have no match in the right table, which is exactly the semantics of NOT EXISTS.
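
For example, a minimal sketch in PySpark (the DataFrames fact and tt and their sample values are assumptions, standing in for Fact_CMCharges (f) and @TT (t2) from the question):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical stand-ins for Fact_CMCharges (f) and @TT (t2)
fact = spark.createDataFrame([(1, 100.0), (2, 200.0), (3, 300.0)], ["ID", "ChargeAmt"])
tt = spark.createDataFrame([(2,)], ["TID"])

# left_anti keeps only the rows of fact with NO matching TID in tt,
# i.e. the same result as NOT EXISTS (select 1 from @TT t2 where t2.TID = f.ID)
result = fact.join(tt, fact.ID == tt.TID, "left_anti")
result.show()

The same join type is also available in Spark SQL, e.g. select * from fact f left anti join tt t2 on t2.TID = f.ID, so you can keep the rest of your query as-is and just replace the NOT EXISTS subquery with a LEFT ANTI JOIN.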