I have an HDFS location that contains a zip file:
/development/staging/b8baf3f4-abce-11eb-8592-0242ac110032/records.zip
scala> val loc = "/development/staging/b8baf3f4-abce-11eb-8592-0242ac110032/"
loc: String = "/development/staging/b8baf3f4-abce-11eb-8592-0242ac110032/"
scala> val rdd = sc.textFile(loc)
rdd: org.apache.spark.rdd.RDD[String] = /development/staging/b8baf3f4-abce-11eb-8592-0242ac110032/ MapPartitionsRDD[1] at textFile at <console>:26
scala> rdd.take(2)
res0: Array[String] = Array(PK????????]R�R��*�????�??? ???2972120.dat�S�r�0?
��*�0����?t?�]T�Ж??����
`�6ط�kU;P�M�� rSO�;G��p��?��?�Z1^3@�^�� ��F��ٕb�?~,ٖ
�u6�D��'�@�??��L*�Gp?�kcL�7!r�p1�1e�� a*.{?
�.;��������s�(�)�, ?�=�9U<"*!?5��?;�?�?�مd{h}
��gG���� �?�Z)
However, this returns the raw compressed bytes of the archive instead of the contents of the file inside it.
How can I read the file inside the zip using a Spark RDD? The zip contains only one file.
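For reference, here is a minimal sketch of one possible approach, not a verified solution. sc.textFile does not understand the zip container format, which would explain the PK header showing up in the output above. The sketch reads the archive as a binary stream with sc.binaryFiles and decompresses it with java.util.zip.ZipInputStream. The helper name firstEntryLines is hypothetical, and the code assumes the single entry is UTF-8 text that fits in executor memory.

```scala
import java.io.{BufferedReader, InputStream, InputStreamReader}
import java.nio.charset.StandardCharsets
import java.util.zip.ZipInputStream

// Hypothetical helper: returns the text lines of the first entry in a zip stream.
// Assumes the entry is UTF-8 text and small enough to hold in memory.
def firstEntryLines(in: InputStream): Vector[String] = {
  val zis = new ZipInputStream(in)
  try {
    if (zis.getNextEntry() == null) Vector.empty // archive has no entries
    else {
      // Reads from zis stop at the end of the current entry.
      val reader = new BufferedReader(new InputStreamReader(zis, StandardCharsets.UTF_8))
      Iterator.continually(reader.readLine()).takeWhile(_ != null).toVector
    }
  } finally zis.close()
}

// In spark-shell, where sc is already defined.
val loc = "/development/staging/b8baf3f4-abce-11eb-8592-0242ac110032/"

// binaryFiles yields (path, PortableDataStream) pairs, one per file.
val rdd = sc.binaryFiles(loc + "records.zip").flatMap {
  case (_, stream) => firstEntryLines(stream.open())
}

rdd.take(2)
```

Because a zip archive is not splittable, the whole file is handled by a single task, so this only suits archives that one executor can process on its own.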