I have a DataFrame containing longitude and latitude coordinates for each point. I want to convert the geographic coordinates of each point to UTM coordinates.
I tried to use the utm module (https://pypi.org/project/utm/):
import utm
from pyspark.sql import functions as fn
df=df.withColumn('UTM',utm.from_latlon(fn.col('lat'),fn.col('lon')))
but I get this error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-39-8b21f98738ca> in <module>()
----> 1 df=df.withColumn('UTM',utm.from_latlon(fn.col('lat'),fn.col('lon')))
~\Anaconda3\lib\site-packages\utm\conversion.py in from_latlon(latitude, longitude, force_zone_number)
152 .. _[1]: http://www.jaworski.ca/utmzones.htm
153 """
--> 154 if not -80.0 <= latitude <= 84.0:
155 raise OutOfRangeError('latitude out of range (must be between 80 deg S and 84 deg N)')
156 if not -180.0 <= longitude <= 180.0:
F:\spark\spark\python\pyspark\sql\column.py in __nonzero__(self)
633
634 def __nonzero__(self):
--> 635 raise ValueError("Cannot convert column into bool: please use '&' for 'and', '|' for 'or', "
636 "'~' for 'not' when building DataFrame boolean expressions.")
637 __bool__ = __nonzero__
ValueError: Cannot convert column into bool: please use '&' for 'and', '|' for 'or', '~' for 'not' when building DataFrame boolean expressions.
Update:
After creating a UDF that applies the utm or pyproj function, the result is:
+--------------------+
|                 UTM|
+--------------------+
|[Ljava.lang.Objec...|
|[Ljava.lang.Objec...|
|[Ljava.lang.Objec...|
|[Ljava.lang.Objec...|
|[Ljava.lang.Objec...|
+--------------------+
only showing top 5 rows
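A likely cause of the `[Ljava.lang.Objec...` output is that the UDF was registered without a return type. In that case Spark assumes `StringType`, and the tuple returned by `utm.from_latlon` is shown as the string form of a Java object array. Below is a minimal sketch of a UDF that declares a struct return type instead. The names `to_plain_types`, `latlon_to_utm`, `add_utm_column`, and the struct field names are hypothetical, chosen for illustration. The `float`/`int`/`str` casts are there on the assumption that `utm` may return NumPy scalars, which Spark may not be able to serialize.

```python
def to_plain_types(result):
    """Cast an (easting, northing, zone_number, zone_letter) tuple to plain Python types."""
    easting, northing, zone_number, zone_letter = result
    return (float(easting), float(northing), int(zone_number), str(zone_letter))


def latlon_to_utm(lat, lon):
    """Convert one latitude/longitude pair to UTM; nulls pass through as None."""
    if lat is None or lon is None:
        return None
    import utm  # imported inside the function so it is resolved on the executors
    return to_plain_types(utm.from_latlon(lat, lon))


def add_utm_column(df, lat_col="lat", lon_col="lon"):
    """Return df with a struct column 'UTM' built from the given lat/lon columns."""
    # Imported here so the pure-Python helpers above stay usable without Spark.
    from pyspark.sql import functions as fn
    from pyspark.sql.types import (
        DoubleType, IntegerType, StringType, StructField, StructType,
    )

    # Declaring the return type tells Spark how to decode the returned tuple;
    # without it the UDF result is treated as a string.
    utm_schema = StructType([
        StructField("easting", DoubleType()),
        StructField("northing", DoubleType()),
        StructField("zone_number", IntegerType()),
        StructField("zone_letter", StringType()),
    ])
    utm_udf = fn.udf(latlon_to_utm, utm_schema)
    return df.withColumn("UTM", utm_udf(fn.col(lat_col), fn.col(lon_col)))
```

With this sketch, `df = add_utm_column(df)` would add the struct column, and individual fields could then be selected with `df.select("UTM.easting", "UTM.northing")`. Note that `utm.from_latlon` raises an error for latitudes outside the range 80 deg S to 84 deg N, so such rows would need to be filtered out or handled first.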