Skip to content

Conversation

@elliVM
Copy link
Contributor

@elliVM elliVM commented May 14, 2025

Description

Use spark's functions.from_unixtime to parse epochs into a timestamp string replacing the old custom UDF.
Resulting timestamp strings are now returned in ISO 8601 standard and include zone information.

Example:
epoch 1 results into 1970-01-01T00:00:01Z.

I chose this format but we can use that is supported by the DateTimeFormatter. The old format was not supported

@elliVM elliVM self-assigned this May 14, 2025
@elliVM elliVM linked an issue May 14, 2025 that may be closed by this pull request
@elliVM elliVM requested a review from eemhu May 14, 2025 13:14
Copy link
Contributor

@eemhu eemhu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this remove the support for the timeformat parameter?

@elliVM
Copy link
Contributor Author

elliVM commented May 15, 2025

Hey @eemhu

Indeed this change does remove the support for a custom timeformat. But when you look at how the timeformat is given to the call it is set directly in the AbstractConvertStep class, in addition that format was not of any standard.

From this I concluded that the feature was not used but please correct me if I was wrong.

Sparks from_unixtime does support different formats so it can be added to the call method if we think it's better that way?

@eemhu
Copy link
Contributor

eemhu commented May 20, 2025

Hey @eemhu

Indeed this change does remove the support for a custom timeformat. But when you look at how the timeformat is given to the call it is set directly in the AbstractConvertStep class, in addition that format was not of any standard.

From this I concluded that the feature was not used but please correct me if I was wrong.

Sparks from_unixtime does support different formats so it can be added to the call method if we think it's better that way?

The format set directly in AbstractConvertStep is the DPL default timeformat. It was used if none other was specified with the timeformat parameter. I don't think it should be removed.

@elliVM
Copy link
Contributor Author

elliVM commented May 20, 2025

Okey thank you, I will refactor so that different time formats are supported including default is supported

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

optimize ctime transform statement

2 participants