
Naming time zones by region instead of by raw offset adds a level of abstraction that makes life easier but brings complications of its own. For example, a special time zone database has to be maintained to map time zone names to offsets. Since Spark runs on the JVM, it delegates the mapping to the Java standard library, which loads data from the Internet Assigned Numbers Authority Time Zone Database (IANA TZDB).

The hour, minute, and second fields have standard ranges: 0–23 for hours and 0–59 for minutes and seconds. Spark supports fractional seconds with up to microsecond precision. The valid range for fractions is from 0 to 999,999 microseconds.

At any concrete instant, depending on time zone, you can observe many different wall clock values. Conversely, a wall clock value can represent many different time instants. The time zone offset allows you to unambiguously bind a local timestamp to a time instant. Usually, time zone offsets are defined as offsets in hours from Greenwich Mean Time (GMT) or UTC+0 (Coordinated Universal Time). This representation of time zone information eliminates ambiguity, but it is inconvenient. Most people prefer to point out a location such as America/Los_Angeles or Europe/Paris.
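The relationship between instants, wall clock values, and offsets can be sketched in plain Python. This is only an illustration using the standard `datetime` module, not Spark code; the two fixed offsets (UTC-8 and UTC+1) and all variable names are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offsets, chosen only for illustration.
UTC_MINUS_8 = timezone(timedelta(hours=-8))
UTC_PLUS_1 = timezone(timedelta(hours=1))

# One instant, many wall clock values.
instant = datetime(2020, 7, 1, 12, 0, 0, 123456, tzinfo=timezone.utc)
wall_minus_8 = instant.astimezone(UTC_MINUS_8)
wall_plus_1 = instant.astimezone(UTC_PLUS_1)
print(wall_minus_8.isoformat())  # 2020-07-01T04:00:00.123456-08:00
print(wall_plus_1.isoformat())   # 2020-07-01T13:00:00.123456+01:00

# One wall clock value, many instants: attaching an offset
# is what binds the local timestamp to a single instant.
local = datetime(2020, 7, 1, 12, 0, 0)  # no offset yet, so ambiguous
as_minus_8 = local.replace(tzinfo=UTC_MINUS_8)
as_plus_1 = local.replace(tzinfo=UTC_PLUS_1)
gap = as_minus_8 - as_plus_1
print(gap)  # 9:00:00
```

The two renderings of `instant` compare equal because aware datetimes are compared by instant, while the same local reading under the two offsets ends up nine hours apart.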


The Timestamp type extends the Date type with new fields: hour, minute, and second (which can have a fractional part), together with a global (session-scoped) time zone. When writing timestamp values out to non-text data sources like Parquet, the values are just instants (like a timestamp in UTC) that have no time zone information. If you write a timestamp value and then read it back with a different session time zone, you may see different values of the hour, minute, and second fields, but they represent the same concrete time instant.
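The write-then-read behavior can be mimicked with a toy encoding. The sketch below assumes the stored value is a count of microseconds since the Unix epoch; that is an illustrative simplification, not a description of Spark's actual Parquet layout. The helpers `to_micros` and `from_micros` and the two "session" offsets are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_micros(ts):
    """Encode an aware datetime as microseconds since the epoch; the zone is dropped."""
    return (ts - EPOCH) // timedelta(microseconds=1)

def from_micros(micros, session_tz):
    """Decode a stored instant and render it in the reader's session time zone."""
    return (EPOCH + timedelta(microseconds=micros)).astimezone(session_tz)

# Hypothetical session time zones for the writer and the reader.
writer_tz = timezone(timedelta(hours=-8))
reader_tz = timezone(timedelta(hours=1))

written = datetime(2020, 7, 1, 9, 30, 15, 500000, tzinfo=writer_tz)
stored = to_micros(written)                 # a plain integer, no zone attached
read_back = from_micros(stored, reader_tz)

print(written.isoformat())    # 2020-07-01T09:30:15.500000-08:00
print(read_back.isoformat())  # 2020-07-01T18:30:15.500000+01:00
```

The hour field changes from 9 to 18 between writer and reader, yet `read_back == written` holds because both describe the same instant.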
