Java.sql for Spark Scala: Examples using Dates, Times, etc

WIP Alert: This is a work in progress. The current information is correct, but more content may be added in the future.

These examples were run with Spark 2.4.8. All code is available in this Jupyter notebook.

Spark dataframes use java.sql.Date and java.sql.Timestamp to represent dates and datetimes, respectively.

Below are examples of common operations on these two types.

java.time: In practice you will often also have to deal with objects from the java.time API [1], so this page includes common ways to convert between java.time types and java.sql.Date / java.sql.Timestamp.
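As a rough sketch of the conversions mentioned above, the snippet below round-trips a LocalDate through java.sql.Date and a LocalDateTime through java.sql.Timestamp. It relies only on methods available since Java 8 (Date.valueOf, toLocalDate, Timestamp.valueOf, toLocalDateTime); the value names are illustrative and not taken from the original notebook. All of these conversions use the JVM's default time zone.

```scala
import java.sql.{Date, Timestamp}
import java.time.{LocalDate, LocalDateTime}

// java.time.LocalDate <-> java.sql.Date
val localDate = LocalDate.of(2022, 5, 12)
val sqlDate = Date.valueOf(localDate)            // LocalDate -> sql.Date
val localDateAgain = sqlDate.toLocalDate         // sql.Date -> LocalDate

// java.time.LocalDateTime <-> java.sql.Timestamp
val localDateTime = LocalDateTime.of(2022, 5, 12, 12, 0, 0)
val sqlTimestamp = Timestamp.valueOf(localDateTime)     // LocalDateTime -> sql.Timestamp
val localDateTimeAgain = sqlTimestamp.toLocalDateTime   // sql.Timestamp -> LocalDateTime
```

Both round trips should give back a value equal to the one you started with.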

String to java.sql.Timestamp

Use Timestamp.valueOf with a string in the format "yyyy-mm-dd hh:mm:ss[.fff]". The fractional seconds are optional.

import java.sql.Timestamp

val ts = Timestamp.valueOf("2022-05-12 12:00:00.123")
// 2022-05-12 12:00:00.123
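Timestamp.valueOf throws an IllegalArgumentException when the string is not in the expected format, which matters when the strings come from untrusted data. One possible way to guard against that is sketched below; parseTimestamp is a hypothetical helper name, not part of java.sql or Spark.

```scala
import java.sql.Timestamp
import scala.util.Try

// Hypothetical helper: returns None instead of throwing on malformed input.
def parseTimestamp(s: String): Option[Timestamp] =
  Try(Timestamp.valueOf(s)).toOption

// Fractional seconds omitted: still a valid input.
val parsedOk = parseTimestamp("2022-05-12 12:00:00")

// ISO-8601 style 'T' separator: rejected by Timestamp.valueOf, so we get None.
val parsedBad = parseTimestamp("2022-05-12T12:00:00")
```

Returning an Option keeps the failure case explicit at the call site. Whether to drop, log, or default the bad values is left to the caller.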

java.time.Instant to java.sql.Timestamp

Use Timestamp.from(instantObject)

import java.time.Instant
import java.sql.Timestamp

val instantObj = Instant.now()

val ts = Timestamp.from(instantObj)
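The reverse direction is available through the toInstant method on Timestamp. The sketch below uses a fixed instant (an arbitrary illustrative value) rather than Instant.now(), so that the round trip is easy to check. It also suggests that nanosecond precision survives the conversion.

```scala
import java.time.Instant
import java.sql.Timestamp

// A fixed instant with nanosecond precision (illustrative value).
val originalInstant = Instant.parse("2022-05-12T12:00:00.123456789Z")

// Instant -> Timestamp
val tsFromInstant = Timestamp.from(originalInstant)

// Timestamp -> Instant
val instantAgain = tsFromInstant.toInstant

// The nanosecond field is carried over in both directions.
val roundTripOk = instantAgain == originalInstant
```

An Instant is always in UTC, so unlike Timestamp.valueOf this conversion does not depend on the JVM's default time zone.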

References

1: The java.time API was introduced in Java 8 and is now the de facto date/time API for Java and Scala. For examples, see: Java.time API: Examples and Reference in Scala