The Future of Database Management with EDB Postgres AI

Lætitia Avrot

August 01, 2024

Leverage artificial intelligence to unlock new possibilities in database management. Explore EDB Postgres AI, our data intelligence platform.

Let's talk about something I'm very excited about – tech. We’ve all seen tech inventions rise and fall fast in IT, and it's difficult to predict which will make it and which won’t. Some tech, however, can blow your mind when you first discover it and make you feel certain it’ll be part of the future. That's the case with artificial intelligence (AI), even though AI is not new. What's new is that we have a tremendous amount of data that makes this technology even more precise and impressive. So, let's talk about data management and AI!

The Importance of Data Management and Data Integrity

By analyzing data growth between 2010 and 2020, Statista estimates the volume of data created, captured, copied, and consumed worldwide will reach 181 zettabytes by 2025 (1 ZB = 1 million PB). Given this vast amount of data, I don't see why companies wouldn't invest in data management capabilities – because what's the point of spending money to obtain this data if we don't leverage it?

That's always been the dream behind business analytics: trying to guess the future from past and present data. Do you think the name "Oracle" was created on a whim? Similarly, Cassandra was named after the Trojan priestess who could utter accurate prophecies (even though she was cursed and no one believed her).

Without going that far, I worked on a very old project a few years ago. Right after World War II in France, child mortality was very high. The French government ordered a study comparing the situation with other European countries. The lack of milk was identified as one of the main reasons. France needed more cows and more efficiency to increase milk production.

Since then, farmers have been collecting data, which public organizations centralize and analyze. We have so much data, and the information is so detailed, that when a female calf is born, we can predict not only the quantity of milk she will produce over her whole life but also the quality of that milk (percentage of fat, proteins, etc.) with minimal error. The results are precise because the sheer breadth of data compensates for individual errors. Today, people would label this project AI even though it's SQL queries in a database and statistics. But when you think deeply about what's behind AI, isn't it simply querying data and calculating statistics?

Of course, you need a lot of data that is as clean as possible, or your predictions for the future will be completely wrong! This is where relational database management systems (RDBMS) can be invaluable: they manage concurrent access (data can easily become corrupted if concurrent changes are not handled correctly) and enforce data integrity through domains, data types, constraints, etc.
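To make the data-integrity point concrete, here is a minimal Python sketch. It uses SQLite's in-memory database from the standard library purely as a stand-in for Postgres, so that it runs without a server. The `milk_record` table, its columns, and the bounds in the CHECK constraints are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for Postgres.
conn = sqlite3.connect(":memory:")

# Hypothetical table: column names and bounds are illustrative assumptions.
conn.execute(
    """
    CREATE TABLE milk_record (
        cow_id      INTEGER NOT NULL,
        liters      REAL    NOT NULL CHECK (liters >= 0),
        fat_percent REAL    NOT NULL CHECK (fat_percent BETWEEN 0 AND 100)
    )
    """
)


def try_insert(row):
    """Insert one row; return True if accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO milk_record VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert((1, 25.5, 4.1))    # plausible measurement
rejected = try_insert((2, -3.0, 4.0))    # negative volume violates a CHECK
also_bad = try_insert((3, 20.0, 140.0))  # fat percentage above 100 violates a CHECK

# Only the valid row is stored; the bad rows never reach the table.
stored_rows = conn.execute("SELECT COUNT(*) FROM milk_record").fetchone()[0]
print(accepted, rejected, also_bad, stored_rows)
```

The point of the design is that the database itself refuses the bad rows, so every application writing to the table gets the same guarantees without repeating the validation logic.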

Here’s one example of bad data leading to inaccurate predictions: during the early stage of the COVID era, doctors were overworked, especially the ones able to read chest X-rays to determine if a patient had COVID. Researchers chose to train AI to differentiate COVID-damaged lungs from healthy lungs. The problem was that it was tough to find healthy adult chest X-rays. Researchers decided to train their AI on children’s X-rays, as children’s lungs are typically much healthier than an adult’s. After several days of training, the AI could distinguish children’s chest X-rays from an adult’s with a high success rate. In other words, it had learned to recognize age, not COVID.

The Context behind Postgres AI

Postgres is an old project: its design note was published in 1986, and its first version was released in 1989. This project has more than 30 years of intense reflection, refactoring, re-designing, and extending capabilities! This explains why the PostgreSQL Global Development Group includes very senior, experienced developers who know what they’re doing and can help less experienced developers write better code.

Postgres was designed for stability first. That's why its first versions were feature-poor. Postgres has reached the point where everyone knows it's stable. No wonder developers voted Postgres the most popular, admired, desired, and used database in Stack Overflow’s 2023 survey! To maintain stability, the project has several rules without exception:

● There will never be new features in a minor release.
● A feature will be added to a major version if it is stable enough (meaning thorough testing before it's considered safe with no known bugs).

For Postgres 15, we were looking forward to embedding JSON tables (a standard SQL feature) but removed them at the last minute as they weren’t ready for production. We added some of these changes to Postgres 16 and hope the rest will make it to Postgres 17, but there's no guarantee. It will be there if it's production ready.

Finally, Postgres is extensible. That's the second point of the design note from 1986. This paper emphasizes that Postgres "will provide extensibility for data types, operators, and access methods". The next section of this note explains that the main goal of extensibility is so "the DBMS can be used in new application domains". It would be only natural to make Postgres AI-compliant, right?

The Road to Today’s Database AI

As mentioned, it's sometimes difficult to predict whether or not tech trends will stay. However, AI is here to stay because it's not new at all!

In addition to working as a database consultant, I also teach databases at a university. This year marks a change in how I’ll be grading my students because in June 2023, 75% of my students’ reports were totally or partly written by ChatGPT! As my job does not consist of grading ChatGPT on its knowledge of Postgres (which could be better), I decided this year's students will have to present a tech talk on an assigned topic.

In engineering school in the early 2000s, my team's project was to train a neural network to play checkers. Due to our lack of time and resources (CPUs weren't what they are now), we could only train our AI for 30 moves. Our AI was quite good for the first 30 moves but played at random after that. Any reasonable human could beat it as long as they avoided losing during those first 30 moves.

Another old example of AI is the computer Deep Blue, famous for beating Garry Kasparov in an epic six-game chess match. You might not know that this was a rematch, as Kasparov had already beaten Deep Blue in a previous contest. Some moves are still considered controversial, in particular a pawn move where the computer passed up a material advantage for a strategic one, which is something no computer had ever done before (see a complete analysis here). This happened during the ‘90s!

There’s more than one kind of artificial intelligence. The different types include:

Reactive AI: This type of AI will only react to some events (like how Deep Blue reacts to a chess move or an autonomous car reacts to outside events like a speed limit, a pedestrian, etc.)

Generic AI: This type of AI is used for chatbots. It will identify keywords and give predefined answers based on them.

Limited AI: This is when an AI is limited to a specific domain, like banking fraud detection. Another good example of limited AI is Amazon Alexa. It can only perform a very specific and limited set of actions.

Super Smart AI: This is what every human imagines when reading or hearing the words artificial intelligence (like HAL in the iconic movie 2001: A Space Odyssey). This kind of AI is supposed to be better than the human brain. It does not exist yet and won't for a long time.

New Possibilities in Data Management with EDB Postgres AI

The human brain is limited when analyzing large amounts of data. It will try to summarize the data, whereas AI can find patterns in a huge dataset for us to process. However, we still have to be careful with the patterns AI finds, as AI might have difficulty excluding some hypotheses and might confuse correlation with causation. For example, in France, 57% of deaths occur in a hospital bed. Does this mean hospital beds are dangerous for humanity? We all know that's not the case, and AI must be trained to learn that.

Still, AI can help analyze your data and find patterns. As a general-purpose database, Postgres is an ideal candidate to support this work. We already have tools built on top of Postgres to perform these queries. For example, the popular pgvector extension enables you to store your vectors and perform similarity searches, with exact and approximate nearest neighbor search and support for L2 distance, inner product, and cosine distance.
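To give a feel for what those distance measures mean, here is a small pure-Python sketch of L2 distance, inner product, cosine distance, and an exact (brute-force) nearest neighbor search. It illustrates the math only, not how pgvector is implemented or called; the function names and the toy three-dimensional "embeddings" are assumptions made up for this example.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)


def nearest_neighbors(query, items, distance, k=1):
    """Exact search: measure the distance to every stored vector, keep the k closest."""
    ranked = sorted(items, key=lambda name: distance(query, items[name]))
    return ranked[:k]


# Hypothetical toy embeddings; real ones typically have hundreds of dimensions.
embeddings = {
    "cat": [1.0, 0.0, 0.0],
    "kitten": [0.9, 0.1, 0.0],
    "car": [0.0, 1.0, 0.0],
}
query = [1.0, 0.05, 0.0]

closest_l2 = nearest_neighbors(query, embeddings, l2_distance, k=2)
closest_cos = nearest_neighbors(query, embeddings, cosine_distance, k=2)
print(closest_l2, closest_cos)
```

Exact search compares the query against every stored vector, which is why approximate nearest neighbor search becomes attractive as the number of vectors grows: it trades a little accuracy for much less work per query.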

We also have tools like EvaDB that connect to your relational database and let you run SQL queries against pre-trained models from sources such as Hugging Face, OpenAI, YOLO, and PyTorch.

But what if we look at the problem the other way around and use AI to make Postgres better? As a Postgres expert, I'm excited by this idea! For example, optimization is one of the most difficult tasks since the human brain has to focus on a small set of queries to optimize a system – but we know optimizing a small part of the system can lead to global performance degradations. With its large-scale view, AI could suggest better ideas.

What could we do with automatic, continuous tuning of Postgres, automatic indexing (creating and dropping indexes), and a better optimizer? AI could also suggest an architecture or data model. We could even create a more natural language to query our data.

Combining database AI with PostgreSQL will result in endless opportunities, especially for enhancing the extensibility and flexibility of PostgreSQL. It will make PostgreSQL better and more relevant for new domains and use cases. AI's evolution from Reactive AI to Limited AI opens the door to enhanced data analysis capabilities. EDB Postgres AI is the ideal tool for making this future a reality.

FAQs: Artificial Intelligence and Data Management

What is artificial intelligence?

Artificial intelligence (AI) is the simulation of cognitive functions associated with the human mind such as recognizing speech, identifying patterns, and making decisions.

What’s the difference between artificial intelligence, machine learning and deep learning?

Artificial intelligence is the broadest term and refers to the simulation of cognitive functions associated with the human mind. Machine learning is the application of artificial intelligence so systems can automatically learn and improve with experience. Deep learning is the application of machine learning using large volumes of data and complex algorithms to train a model.

How can a business benefit from artificial intelligence?

Artificial intelligence can optimize business operations in many ways. AI can automate routine tasks and avoid human error to make processes more efficient. AI can generate data to aid in decision-making, such as planning for labor, inventory, and financial resources. It can also boost marketing capabilities by analyzing data and identifying quality leads. Customers can directly benefit as well, enjoying more personalized services thanks to AI-driven insights.

Why use a database for artificial intelligence?

Artificial intelligence involves making inferences from data to simulate cognitive functions. Databases facilitate the storage, retrieval, modification, and deletion of large amounts of data, which are quite beneficial for AI processes.

Why use Postgres for artificial intelligence?

Postgres is a powerful open source data management system that can handle complex data workloads, which makes it a good fit for AI. Developers can also build applications and services on it while relying on the database to protect data integrity. Postgres also stands out for its extensibility, featuring foreign data wrappers that can link other databases and streams through a regular SQL interface. Its robust access-control system also helps keep data secure.

How can artificial intelligence transform data management?

Five main aspects of data management can benefit greatly from AI:

Classification: Obtaining, extracting, and structuring data from all kinds of media, such as photos, text documents, and even handwriting
Cataloging: Locating data
Quality: Reducing errors in data
Security: Ensuring data is safe from threats and used in line with current laws and policies
Data integration: Helping build data master lists

What is the role of data integrity in artificial intelligence?

Data integrity is needed to manage the increasingly complex applications and ecosystems related to AI. If AI models use training data that is compromised, inaccurate, or has errors, the model can generate erroneous results that could lead to poor performance for apps and services. This could have costly consequences for businesses. Stakeholders may lose trust in the company, which could reduce revenue.

What are cloud AI services?

These services feature the merging of artificial intelligence with cloud computing. They allow businesses to access AI capabilities like machine learning and predictive analytics purely online, without the hassle of on-premises infrastructure.

What is an AI workload?

AI workloads are the services and processes performed through fundamental AI techniques like machine learning. One example of such a process is feeding AI models large amounts of data and training them to identify patterns and make predictions. Another example is running a trained AI model and incorporating new data. Such workloads deal with analyzing unstructured data such as photos and text.

How can AI increase profit?

There are many ways AI can help grow revenue:

Identify market niches: It can determine key demographics and customer preferences, leading to new marketing opportunities
Support adoption of products and services: AI can identify product issues, pain points, and customer needs, informing managers on how to improve products and what new offerings can be developed
Forecast demand: AI can predict when best to boost stock to meet demand, like when natural events affect supply chains
Create new products: AI can bring in additional revenue by crafting brand-new products based on trend analyses
Optimize pricing: AI can analyze market data and determine the best price for every product and service in every market


