Understanding of TimescaleDB

Edison Devadoss
3 min read · Apr 12, 2022

Hi friends, in this blog I am going to share my understanding of TimescaleDB.

TimescaleDB is for storing time-series data. Then what is time-series data?

What is a time series:

A time series is a data set that tracks a sample over time. Time-series data, also referred to as time-stamped data, is a sequence of data points (any single fact is a data point) indexed in time order.

Uses of time-series data:

  • Forecasting (predictions) is one of the most common use cases.
  • Anomaly detection.
  • IoT systems use it to manage the huge volume of data coming from millions of devices.
  • DevOps teams use it to track system health and trends.
  • Time-series data is also used by scientists, engineers, and thinkers.

List of various Time-series databases:

With so many databases available for time-series data, why choose TimescaleDB?

TimescaleDB can be queried with SQL, one of the most popular languages for working with datasets. It also supports running the database across multiple nodes, along with various other features.

TimescaleDB:

It is an open-source time-series database developed by Timescale Inc. It is written in C and extends PostgreSQL, so it supports standard SQL queries and the relational model.

TimescaleDB was founded by Ajay Kulkarni and Michael J. Freedman in response to their need for a database solution to support IoT workloads.

TimescaleDB is a relational database for time-series data, implemented as a PostgreSQL extension. This extension model lets the database take advantage of the richness of PostgreSQL, including its 40+ data types.

For a better understanding of TimescaleDB, we need to understand two main concepts: Hypertables and Chunks.

Hypertables:

Virtually all user interactions with TimescaleDB are with Hypertables. Under the hood, a Hypertable is an abstraction, or virtual view, over many individual tables, called chunks, that actually store the data.


-- Step 1: Define a regular table
CREATE TABLE IF NOT EXISTS weather_metrics (
    time            TIMESTAMP WITHOUT TIME ZONE NOT NULL,
    timezone_shift  INT NULL,
    city_name       TEXT NULL,
    temp_c          DOUBLE PRECISION NULL,
    temp_min_c      DOUBLE PRECISION NULL,
    temp_max_c      DOUBLE PRECISION NULL,
    weather_type_id INT NULL
);
-- Step 2: Turn it into a hypertable, partitioned on the "time" column
SELECT create_hypertable('weather_metrics', 'time');
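
Once the Hypertable exists, it can be used like any other table. Here is a small sketch of inserting and querying data with plain SQL. The sample rows and city names are made up for illustration; time_bucket() is a TimescaleDB function that groups rows into fixed time intervals.

```sql
-- Insert a few sample rows (values are invented for this example)
INSERT INTO weather_metrics
    (time, timezone_shift, city_name, temp_c, temp_min_c, temp_max_c, weather_type_id)
VALUES
    ('2022-04-01 06:00:00', 0, 'Chennai', 28.0, 26.5, 29.0, 1),
    ('2022-04-01 18:00:00', 0, 'Chennai', 32.0, 30.0, 33.5, 1),
    ('2022-04-02 06:00:00', 0, 'Chennai', 27.0, 26.0, 28.5, 2);

-- Average temperature per city per day
SELECT time_bucket(INTERVAL '1 day', time) AS bucket,
       city_name,
       avg(temp_c) AS avg_temp_c
FROM weather_metrics
GROUP BY bucket, city_name
ORDER BY bucket, city_name;
```

The query is ordinary SQL; only time_bucket() comes from the extension.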

Chunks:

Chunks are the partitions of a Hypertable. They are created by partitioning a Hypertable's data along one or more dimensions. All Hypertables are partitioned by the values of a time column, which may be a timestamp, a date, or one of various integer types.

TimescaleDB creates these chunks automatically as rows are inserted into the database.

Each chunk is implemented using a standard database table. In PostgreSQL internals, the chunk is actually a “child table” of the “parent” Hypertable. All chunks in a Hypertable are disjoint in their partitioning, meaning their ranges never overlap, so each row belongs to exactly one chunk.
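
To see the chunks behind a Hypertable, something like the following should work on TimescaleDB 2.x. It assumes the weather_metrics Hypertable from above and that some rows have already been inserted; with no data there are no chunks to list.

```sql
-- List the chunk tables that back the hypertable
SELECT show_chunks('weather_metrics');

-- Show the time range covered by each chunk
SELECT chunk_name, range_start, range_end
FROM timescaledb_information.chunks
WHERE hypertable_name = 'weather_metrics'
ORDER BY range_start;
```

The range_start and range_end values of different chunks should not overlap, which is the disjoint partitioning described above.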

Advantages of Hypertables and Chunks:

In-memory:

Chunks can be configured so that the most recent chunks fit in memory. Inserts into recent time intervals, as well as queries on recent data, can then be fast because that data is served from memory rather than from disk.
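
The main setting for this is the chunk time interval. Below is a minimal sketch, assuming that one day of data for weather_metrics fits comfortably in memory; the right interval depends on your ingest rate and available RAM.

```sql
-- Assumption: one day of weather_metrics data (plus its indexes) fits in memory.
-- The new interval applies only to chunks created after this call.
SELECT set_chunk_time_interval('weather_metrics', INTERVAL '1 day');

-- The interval can also be chosen when the hypertable is first created:
-- SELECT create_hypertable('weather_metrics', 'time',
--                          chunk_time_interval => INTERVAL '1 day');
```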

Local indexes:

Indexes are built on each chunk independently, rather than as a single global index across all data.
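
As a rough illustration, an index is created once on the Hypertable and TimescaleDB builds a matching index on each chunk. The index name here is my own choice, and the second query assumes chunks live in the _timescaledb_internal schema, which is where I have seen them.

```sql
-- Create the index on the hypertable; each chunk gets its own local copy
CREATE INDEX IF NOT EXISTS weather_metrics_city_time_idx
    ON weather_metrics (city_name, time DESC);

-- Inspect the per-chunk indexes (assumes chunks are stored in _timescaledb_internal)
SELECT schemaname, tablename, indexname
FROM pg_indexes
WHERE schemaname = '_timescaledb_internal'
ORDER BY tablename, indexname;
```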

Easy data retention:

With TimescaleDB, users can quickly delete whole chunks based on their time range. Users can also create a data retention policy to make this automatic.
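
Both approaches can be sketched as follows for TimescaleDB 2.x. The six-month window is only an example value.

```sql
-- Manual: drop every chunk whose data is entirely older than six months
SELECT drop_chunks('weather_metrics', older_than => INTERVAL '6 months');

-- Automatic: register a background policy that keeps doing the same thing
SELECT add_retention_policy('weather_metrics', drop_after => INTERVAL '6 months');
```

Dropping a chunk removes a whole table at once, which is much cheaper than deleting the same rows one by one.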

Instant multi-node elasticity:

TimescaleDB supports horizontal scaling across multiple nodes. When a new server is added, existing chunks can remain at their current location, while chunks created for future time intervals are partitioned across the new set of servers. The TimescaleDB planner can handle queries across these reconfigurations, always knowing which nodes are storing which chunks.

It also provides further features such as data reordering, data replication, and data migration.
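
Here is a rough sketch of a multi-node setup using the TimescaleDB 2.x functions available around the time of writing. The node names and hostnames are placeholders, and the statements are meant to run on the access node against a plain weather_metrics table that has not yet been converted to a Hypertable. As far as I know, newer releases have moved away from multi-node, so check the documentation for your version.

```sql
-- Register data nodes with the access node (names and hosts are hypothetical)
SELECT add_data_node('dn1', host => 'dn1.example.com');
SELECT add_data_node('dn2', host => 'dn2.example.com');

-- Create a distributed hypertable: partitioned by time, and by city_name across nodes
SELECT create_distributed_hypertable('weather_metrics', 'time', 'city_name');
```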

While TimescaleDB provides more features than plain PostgreSQL, there are certain limitations when we use Hypertables and Chunks, especially with distributed Hypertables.

Limitations:

  • Foreign keys that reference a Hypertable are not supported.
  • The time column used for partitioning cannot contain null values.
  • UPDATE statements that would move a row from one chunk to another are not supported.
  • Unique indexes must include all columns that are partitioning dimensions.
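
The unique index rule is the one I find easiest to trip over, so here is a small sketch using the weather_metrics Hypertable. The index names are my own, and the first statement assumes the table holds no duplicate (city_name, time) pairs.

```sql
-- Accepted: the index includes "time", the partitioning column
CREATE UNIQUE INDEX weather_metrics_city_time_uniq
    ON weather_metrics (city_name, time);

-- Rejected by TimescaleDB: the partitioning column "time" is missing.
-- Left commented out so the script runs cleanly.
-- CREATE UNIQUE INDEX weather_metrics_city_uniq
--     ON weather_metrics (city_name);
```

The reason follows from local indexes: each chunk can only enforce uniqueness within itself, so the partitioning column has to be part of the key.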

Distributed Hypertables also have various limitations.

Installation:

To install TimescaleDB, follow the official installation guide.

Thank you for reading. Have a nice day!
