I'm currently using Spark to join several tables, and it is very slow...
val joinDF = DC_LAB_RESULT_MASTERDF
  .join(DC_ENCOUNTERDF, Seq("PERSON_ID", "PATIENT_ID"), "full")
  .join(DC_VITAL_SIGNSDF, Seq("PERSON_ID", "PATIENT_ID"), "full")
  .join(DC_DIAGNOSISDF, Seq("PERSON_ID", "PATIENT_ID"), "full")
A statement like this takes basically a whole afternoon to run, even though there isn't much data.