Wharton Research Data Services Leverages Postgres and EDB Expertise

Key WRDS Benefits

  • Democratizes data access so that all disciplines can easily search for concepts across the data repository.
  • Expands on and improves capabilities with the help of core Postgres team members working at EDB, leveraging their support to enhance performance and navigate database issues.
  • Enables access to key documentation and hosts 350+TB of data on the most robust computing infrastructure to give users the power to analyze complex information at speeds of up to 400MB per second.
  • Supports 75,000+ users at 500+ institutions in 35+ countries.

Industry

  • Research/Analytics, Academia, Government, Commercial


About Wharton Research Data Services (WRDS)

A part of the Wharton School of the University of Pennsylvania, Wharton Research Data Services (WRDS) provides global corporations, universities and regulatory agencies the thought leadership, data access and insights needed to enable impactful research. WRDS democratizes data access within the data repository and provides curated data, guided Classroom tools, Research Applications and Learning Pathways for Researchers, Instructors and Information Professionals.

The Challenge

For 25+ years, Wharton Research Data Services (WRDS) has supported users with targeted solutions that underpin research, reinforce learning, and enable discovery. To provide these vital services, WRDS provides access to key documentation and hosts 350+TB of data on the most robust computing infrastructure to give users the power to analyze complex information at speeds of up to 400MB per second.

“At one point, we had SAS, MySQL, Microsoft SQL Server, Oracle and BDB all backing various services. We had five databases. That’s not a situation we wanted to be in,” says Tim Allen, Director of Advanced Initiatives at WRDS.

Understandably, WRDS began looking for a single database that could handle the wide variety of services they had to support while delivering both high availability and scalability, essential requirements for WRDS data support.

“We tried Oracle for a few years, but the authentication mechanism wasn’t efficient enough for us,” explains Russell Ney, IT Director of Platform Integration for WRDS.

Authentication was just the tip of the iceberg, however. WRDS’ Senior Database Administrator Alex Malek adds, “Vendor lock-in was a major concern. We have about 350,000 user records in LDAP and more than 75,000 are active at any given time. We want to make sure that whatever database we use can manage access and permission for our users.” 

Malek continues: “It’s a major engineering project to get everything into a database and would be a major engineering project to get everything out if we had issues with authentication and authorization. That’s really at the center of everything: making sure that users get access to only the data they’re supposed to, what their institution has paid for.”

Managing user access can be challenging for any organization, but for one whose database infrastructure is as large and complex as WRDS’, it becomes even more difficult. Working with a wide range of institutions, including Harvard, MIT, the University of Chicago, the Federal Reserve Bank and the SEC, means needing a database that can both handle all of the information WRDS deals with and provide total control over that data.

“With closed-source databases, they essentially want to own your data and can govern your users. That was not something that appealed to us. I’ve been a big proponent of open source for a long time,” Allen emphasizes. “I contribute to Django and I also work in the Python ecosystem. So, moving away from the closed-source databases was something I was very excited about.”

The Solution

Based on these concerns, open-source options moved immediately to the front of the pack for WRDS.

So, WRDS looked to Postgres.

“The thing that excited us most about Postgres was it seemed to be the only database that could handle the kind of workloads for the services which used the five separate databases in our ecosystem,” Malek relates. “It was also incredibly important for us to have a full-text search built in, which Postgres provided. There are thousands of other things we love about Postgres, but that’s the tip of the iceberg.”

Ney is similarly enthusiastic: “I think the root of why we love Postgres is that it's extensible and new features are coming all the time. There’s a whole community of contributors, it’s open and there’s no bureaucracy trying to figure out how to monetize it.”

While the enterprise opted to keep LDAP on BDB, they combined the remaining four databases into Postgres in 2015.

WRDS and EDB: a natural fit

With a multitude of open source experts within their ranks, WRDS initially planned to handle their new Postgres database entirely on their own.

“We’re very much a ‘learn things on our own’ group,” explains Ney. “So, we built the system and everything was great; but, then we reached the point where we had to release this database in production to users. Because it had become a foundational element for WRDS, we wanted to partner with a company that could help increase the technical depth of our bench and provide us with the 24/7 [support] we need to support our users.”

That’s when EDB entered the picture.

“I was at one of the Postgres conferences and I came across the EDB booth,” Ney relates. “I saw that EDB offered a wide range of support and noted how many core Postgres developers worked with EDB. EDB seemed to be the leader. That was the natural fit for us.”

EDB helps WRDS supercharge Postgres

Because so many members of WRDS’ team knew and liked Postgres before the organization adopted it, they looked for every way to leverage the database to their advantage, even beyond what a business might immediately think to attempt.

With so many Postgres luminaries on the EDB roster, WRDS knew they could experiment and innovate with the backing of the people who understood their database best.

“Access to EDB’s technical bench has been invaluable,” says Malek. “If we ever get a really obscure error, we know that someone at EDB will know exactly how to fix it and will respond promptly. That expertise has helped us with a number of innovative projects.”

Allen agrees: “An example that comes to mind is this huge permissions script we wrote, designed to read what data sets a user or customer should have permission to access and which data sets are contained in each WRDS product.”

As Allen explains, this massive script is foundational to how WRDS handles permissions. It was originally written in Perl, but after consulting with EDB, the organization opted to rewrite it in Python because they realized that the script in its original form was bloating the Postgres pg_class catalog table and slowing the entire database to a crawl.

 “This was a really important piece of code for our infrastructure and rewriting it was a huge project that Alex [Malek] and I undertook,” Allen explains. “Over the course of fixing and optimizing it, we reached out to EDB multiple times and got some great suggestions.”
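The article does not show the permissions script itself. The sketch below only illustrates the general idea Allen describes: working out which data sets a user is entitled to from the products their institution subscribes to, then comparing that with what the user can currently access. The product and data set names, the `entitled_datasets` and `plan_changes` helpers, and the choice to map each data set to a single table are all hypothetical. Emitting only the difference between entitled and current access is one plausible way to limit how often permission statements touch the system catalogs, not necessarily what WRDS or EDB did.

```python
from typing import Dict, List, Set

# Hypothetical mapping: which data sets each product contains.
PRODUCT_DATASETS: Dict[str, Set[str]] = {
    "product_a": {"dataset_1", "dataset_2"},
    "product_b": {"dataset_2", "dataset_3"},
}


def quote_ident(name: str) -> str:
    """Quote a SQL identifier by doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def entitled_datasets(
    subscribed_products: Set[str],
    product_datasets: Dict[str, Set[str]],
) -> Set[str]:
    """Union of the data sets contained in every subscribed product."""
    entitled: Set[str] = set()
    for product in subscribed_products:
        entitled |= product_datasets.get(product, set())
    return entitled


def plan_changes(role: str, entitled: Set[str], current: Set[str]) -> List[str]:
    """Build only the GRANT/REVOKE statements needed to move `role`
    from its current access to its entitled access.

    Assumption: each data set corresponds to one table of the same name.
    """
    statements = [
        f"GRANT SELECT ON TABLE {quote_ident(ds)} TO {quote_ident(role)};"
        for ds in sorted(entitled - current)
    ]
    statements += [
        f"REVOKE SELECT ON TABLE {quote_ident(ds)} FROM {quote_ident(role)};"
        for ds in sorted(current - entitled)
    ]
    return statements
```

For a user whose access already matches their entitlements, `plan_changes` returns an empty list, so a sync run would have nothing to execute for that user.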

The Results 

With the assistance of EDB, WRDS has been able to empower its teams and clientele alike, making it easier to access the full breadth of its data.

“One of the key goals of WRDS is to provide institutions with the ability to access key research, even if they’re not themselves able to have a research department,” Ney says. “For smaller or midsize schools that have business programs or are teaching business classes, this research is super valuable, both in teaching students and attracting faculty to do research.”

Postgres provides WRDS with the foundational DBMS to continue providing these services, while EDB ensures that Postgres is constantly meeting the needs of both WRDS and their client institutions.

The value of support

“We actually had a time-sensitive issue a while ago,” Malek explains. “EDB brought in Co-Founder of the PostgreSQL Global Development Group, Bruce Momjian, to help. During this chaotic and critical period, Bruce was in constant contact and able to find a solution to get us back to where we needed to be. Access to that expertise is crucial.”

Allen concurs: “Google can give you a million answers to your question, but if you go to the library, a librarian will give you the right answer to your question. That’s what working with EDB support is like.”

With this support, WRDS has been able to more effectively sync their users on a regular basis and avoid any issues that would make their database unusable, all thanks to deep expertise and lightning-fast response times.

The power of expertise

“From where I sit, EDB has really helped us take Postgres to the next level,” Ney says. “If we have an idea for something, if we’re wondering ‘is this possible,’ we can bounce it off EDB and get that insight.”

Echoing this, Malek adds: “The stuff we’re doing with Postgres doesn’t fit any typical usage profile. There aren’t many places that give 75,000 active users direct access to a database. With a lot of what we do pushing the limits of Postgres, it is crucial that we can talk to the people who eat, live and breathe this stuff.”

The Future

With a database containing nearly 100 terabytes, WRDS has one of the largest installations of data on Postgres that EDB knows about. And that number is only climbing. To accommodate this, WRDS has begun to put major focus into setting up additional replicas and backups to continue to protect their growing datasets. 

But it’s not just the amount of data that is expanding, it’s how WRDS is looking to leverage it. “We’ve been pushing the limits and we’re looking to continue to do so,” Allen says. “Having a partner like EDB here to support us is what makes that possible.”

This spirit of evolution and exploration is what makes WRDS such a valuable partner for academic and financial institutions, and Postgres such a valuable partner for WRDS. “I’ve always seen the spirit of open source as very similar to the spirit of academia,” Ney says. “It’s that collaboration and open sharing of knowledge for everyone’s benefit.”

So many businesses invested in growth, innovation and freedom choose Postgres because of the flexibility and scalability it provides. As one of the best database management systems for the challenges faced by businesses with giant or complex database infrastructures, it provides control over your data and control over what you can achieve.

With the help of EDB’s deep Postgres expertise, WRDS unlocked more uses for Postgres, further expanding its reach to the WRDS global community of researchers.

Are you interested in sharing your Postgres success? We'd love to tell your story! Learn how by emailing our team at success-stories@enterprisedb.com to get started!
