as the title states, I'm having trouble converting a column on a Dynamic Frame from Epoch to a timestamp.
I have tried onverting in into a Data Frame and back to Dynamic Frame but it is not working.
This is what I'm working with:
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from pyspark.sql import functions as f
from pyspark.sql import types as t
from pyspark.sql.functions import udf
from awsglue.job import Job
## @params: [JOB_NAME]
args = getResolvedOptions(sys.argv, ['JOB_NAME'])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)
datasource0 = glueContext.create_dynamic_frame.from_catalog(database = "s3-sat-dth-prd", table_name = "s3_sat_dth_prd_vehicle", transformation_ctx = "datasource0")
applymapping1 = ApplyMapping.apply(frame = datasource0, mappings = [("in", "int", "in", "int"), ("out", "int", "out", "int"), ("ts", "long", "ts", "long"), ("cam", "string", "cam", "string"), ("subclass", "string", "subclass", "string")], transformation_ctx = "applymapping1")
selectfields2 = SelectFields.apply(frame = applymapping1, paths = ["in", "out", "ts", "cam", "subclass"], transformation_ctx = "selectfields2")
resolvechoice3 = ResolveChoice.apply(frame = selectfields2, choice = "MATCH_CATALOG", database = "s3-sat-dth-prd", table_name = "test_split_array_into_records_json", transformation_ctx = "resolvechoice3")
datasink4 = glueContext.write_dynamic_frame.from_catalog(frame = resolvechoice3, database = "s3-sat-dth-prd", table_name = "test_split_array_into_records_json", transformation_ctx = "datasink4")
job.commit()
What I've tried was creating a Data Frame tsconvert = resolvechoice3.toDF()
and turn it back into Dynamic Frame resolvechoice4 = DynamicFrame.fromDF(tsconvert, GlueContext, resolvechoice4)
; I get a syntax error at the last code snippet i pasted right at the end of resolvechoice4
.
COuld not find if there is anything built into Glue to convert to timestamp. When Iìll make sure the data is correctly written to S3, Redshift will be my Target.
Has anybody ever done anything like this and could lead me the way?
Thanks in advance.