In this article, we are going to complete the MapReduce job started in the [previous article](https://www.2ndquadrant.com/en/2011/10/mapreduce-in-greenplum.html). ## Take up the problem from the previous article In the [previous article](https://www.2ndquadrant.com/en/2011/10/mapreduce-in-greenplum.html), we...
Scenario: We have a remote data source, served by a gpfdist server. We need to import the data into a Greenplum database, while performing some ETL manipulation during the import. It...
MapReduce is a very trendy software framework. It was introduced by Google in 2004. It is a large topic, and it is not possible to cover all of...
In the first part of this article, we created a job and a database connection, and defined the flow in Kettle. In the second part, we’ll see how Kettle manages...
In this article, I am going to upgrade a Greenplum cluster from version 4.0 to 4.1 using `gpmigrator`. `gpmigrator` is a utility shipped with Greenplum Community Edition whose purpose is...
The Call for Papers for the Italian PGDay has been extended by a week. The new deadline for submitting a paper is October 23. English speakers can send their proposals...
I recently showed you how to perform a data import from a CSV file into a Greenplum database, using Talend Community Edition. In this article I’m going to perform...
I’m going to demonstrate how to use dblink in Greenplum 4.0.4.0. What’s dblink? dblink is a PostgreSQL contrib module that allows you to execute queries on another...
The Italian PGDay 2011 will take place in Prato, on Friday November 25th, at the Monash University Prato Centre. Exactly where it all started. The event, organised by the Italian...
In the first part of this tutorial, we set up all the connections required for creating the job. Now we can proceed with the data import. Let’s drag and drop...