Apache Nifi - Combine MySQL and PostgreSQL records over REST API

Prasad Jayasinghe
5 min readAug 11, 2020

Introduction

Here I’m using Apache NiFi ETL tool to assess 2 DBs and combined the output together. Also I’m running MySQL and PostgreSQL Docker containers with sample data to simulate this use case.

MySQL Table

PostgreSQL table

Use Case

Need to query 2 databases with unique ID and combined results, remove similar fields and output as json object.

Solution

Using Apache NifI as a ETL tool and combined 2 DBs records and output as json object.

Big Picture

Required Components — Nifi Processors

1. HandleHttpRequest

2. HandleHttpResponse

3. GenerateTableFetch

4. ExecuteSQLRecord

5. SplitRecord

6. ConvertRecord

7. MergeContent

8. ExtractText

9. ReplaceText

10. LogAttribute

LLD

Nifi Processor Configuration

  1. Processor Name : HandleHttpRequest

Configuration :

Note:

We can give any available port under Listening Port property

Under Hostname property we can give NiFi server IP or localhost. (If you give Server IP make sure to configure it on nifi.properties file)

Under HTTP Context Map select “StabdardHttpContextMap” keep below configurations

2) Processor Name : HandleHttpResponse

Configuration :

Note:

Same as HandleHttpRequest process we can use “StabdardHttpContextMap”

3) Processor Name : GenerateTableFetch

Configuration :

Note:

Under Table name we have to provide our RDBMS table name

Here I have configure NiFi expression language in Additional WHERE clause property, What it does is it can read the REST request and get some values to use as WHERE clause property

Eg : http://localhost :8008/field/?id=111111 -> this id=111111 will take as WHERE clause

Under Database Connection Pooling Service select the “DBCPConnectionPool” and clink on -> icon

Then you will move forward to

Here select the “DBCPConnectionPool” configuration and click on * icon, Then you can have the configuration screen as below

On this window we have provide

· Databases connection url

· DB Driver class name

· DB driver location (Local NiFi server)

· Database user and Password

Once all done we can enable the configuration as below

Same procedure you can do for the PostgreSQL as well. Only change you have to do is, Provide

· Databases connection url

· DB Driver class name

· DB driver location (Local NiFi server)

· Database user and Password

4) Processor Name : ExecuteSQLRecord

Configuration :

Note :

Likewise earlier we have to configure same configuration for MySQL and PostgreSQL.

Record Writer Property we can configure as below

5) Processor Name : SplitRecord

Configuration :

Note:

Record Reader configurations

Record Writer configurations

6) Processor Name : ConvertRecord

Configuration :

Note :

We can use same Record Reader, Record Writer configurations where we have user for “SplitRecord” process

7) Processor Name : MergeContent

Configuration :

Note :

This Process is merge 2 DB query records together

8) Processor Name : ExtractText

Configuration :

Note :

This process will extract the CSV record as an attributes by the help of NiFi expression language

Eg:

CSV Property values

(.+),(.+)

(.+),(.+)

(.+),(.+)

(.+),(.+)

9) Processor Name : ReplaceText

Configuration :

Note :

This process will replace the attribute according to our required format

Eg : {

${“csv.1”} : ${“csv.3”},

${“csv.2”} : ${“csv.4”},

${“csv.6”} : ${“csv.8”}

}

10) Processor Name : LogAttribute

Configuration :

Note :

We can forward failure logs in to LogAttribute (if needed) and it will help us to T-shoot the process.

Results over Postman

Once configuration is done, we can start all the processors and access the configured URL from the postman tool

--

--