In my previous post, we looked at various partitioning techniques in PostgreSQL for efficient IoT data management using IoT Solution. The basic objective behind time-based partitions is better performance, especially in IoT environments, where the active data is usually the most recent data. New data is usually append-only, and it can grow quickly depending on the frequency of the data points.
Some might ask why multiple write nodes are needed (as a BDR cluster inherently provides) when a single node can effectively handle incoming IoT data using features such as time-based partitioning. Gartner estimated 8.4 billion connected devices in 2017, and it expects that this number will grow to over 20 billion by 2020. At the scale at which connected devices operate, it becomes imperative to introduce additional write nodes into the cluster at some point, just to handle the sheer number of devices connecting to the databases.
In a conventional Master-Standby setup, adding write nodes isn’t trivial. There are techniques one might use to introduce additional write nodes (as a separate cluster), but those techniques introduce complexities that time-critical IoT applications cannot afford. As an example, consider a city’s smart grid station that needs to run analytics on past data to adjust power supply for a specific time of the year.
A couple of key concepts with respect to data in an IoT environment are:
- Data localization
- Local and global analytics
Edge devices write to their local data store. Data security regulations might dictate data localization in many cases, so the ability to store data locally is important. Another important reason for storing data locally is reduced read/write latency. However, it is equally important to be able to run analytics on the reported data in order to make decisions.
Enterprises increasingly operate at a global level, and the ability to handle geographically distributed data is becoming more important. This is where PostgreSQL with Postgres-BDR has you covered. You can achieve write scalability by writing to multiple nodes at the same time within a given geographic location, and you can also rely on PostgreSQL with the Postgres-BDR extension for your local/global data integration needs when running analytics in IoT environments.
The diagram below gives a blueprint of how write scalability, data localization, and integration can all be achieved using PostgreSQL with Postgres-BDR in IoT environments.
Some of the key points to note are:
- Edge devices write to any of the local BDR nodes, and BDR makes sure all nodes in the cluster have the latest changes as long as there are no conflicts.
- BDR logical replica(s) integrate data from local BDR nodes.
- Integrated data is summarized for local analytics needs. BDR takes care of replicating the summarized data to the global analytics store (central BDR logical replica node(s) for global analytics).
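To make the summarize-locally, analyze-globally idea more concrete, here is a minimal Python sketch of the data flow only. It does not use PostgreSQL or Postgres-BDR at all; the `Reading` and `HourlySummary` types and the `summarize` and `merge_sites` functions are hypothetical names invented for illustration. In a real deployment the rollup would more likely be done in SQL on the local logical replica, with BDR handling the replication.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Reading:
    """One raw data point reported by an edge device (hypothetical shape)."""
    device_id: str
    recorded_at: datetime
    value: float


@dataclass
class HourlySummary:
    """Mergeable rollup of all readings that fall within one hour."""
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def add(self, value: float) -> None:
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def merge(self, other: "HourlySummary") -> None:
        self.count += other.count
        self.total += other.total
        self.minimum = min(self.minimum, other.minimum)
        self.maximum = max(self.maximum, other.maximum)

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


def summarize(readings):
    """Local step: roll raw readings up into per-hour summaries."""
    buckets = defaultdict(HourlySummary)
    for reading in readings:
        hour = reading.recorded_at.replace(minute=0, second=0, microsecond=0)
        buckets[hour].add(reading.value)
    return dict(buckets)


def merge_sites(*site_summaries):
    """Global step: combine per-site summaries without the raw readings."""
    merged = defaultdict(HourlySummary)
    for summaries in site_summaries:
        for hour, summary in summaries.items():
            merged[hour].merge(summary)
    return dict(merged)


# Two sites summarize their own data; only the summaries are combined.
site_a = summarize([
    Reading("meter-1", datetime(2018, 1, 1, 9, 15), 10.0),
    Reading("meter-2", datetime(2018, 1, 1, 9, 45), 20.0),
])
site_b = summarize([
    Reading("meter-9", datetime(2018, 1, 1, 9, 30), 30.0),
])
global_view = merge_sites(site_a, site_b)
```

The summary stores a count and a running total rather than a precomputed average, so summaries from different sites can be combined exactly. Only these small rollups need to reach the global store, while the raw readings stay local.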
The discussion of how PostgreSQL with Postgres-BDR helps solve IoT data storage problems continues. Stay tuned!
If you want to share your feedback, or if you would like to know more about what Postgres-BDR can do for your IoT data storage needs, please do not hesitate to comment or write to us at info@2ndQuadrant.com.