Apache NiFi Cloudera Manager Integration
Apache NiFi is an application for managing dataflows between multiple systems. However, the Apache NiFi project does not release parcels for the Cloudera distribution, so NiFi cannot be added to Cloudera Manager out of the box. This step-by-step guide covers building an Apache NiFi parcel and integrating it with Cloudera Manager.
What is Apache NiFi?
Put simply, NiFi was built to automate the flow of data between systems. While the term ‘dataflow’ is used in a variety of contexts, we use it here to mean the automated and managed flow of information between systems. This problem space has been around ever since enterprises had more than one system, where some of the systems created data and some of the systems consumed data.
Apache NiFi Cloudera Manager Prerequisites
- Maven version must be higher than 3.1.x (I used Apache Maven 3.5.4)
- Java 8.x must be installed
- Python version must be 2.7.x or above
- The Git package must be installed
Note: The Apache NiFi parcel built here is not supported on RHEL 7 / CentOS 7.
# yum install maven-*
# yum install git
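Before starting the build, it can help to confirm that the prerequisites above are actually present. The following is a minimal sketch, not part of the original procedure: `version_ge` and `check_tool` are hypothetical helper names, the 3.1.0 threshold is my reading of "higher than 3.1.x", and the comparison assumes a `sort` that supports `-V` (GNU coreutils).

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: report whether a required command is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found:   $1"
    else
        echo "missing: $1"
    fi
}

for tool in mvn java python git; do
    check_tool "$tool"
done

# If Maven is installed, compare its version with the assumed 3.1.0 minimum.
if command -v mvn >/dev/null 2>&1; then
    # The first line of `mvn -version` looks like "Apache Maven 3.5.4 (...)".
    mvn_version="$(mvn -version 2>/dev/null | awk 'NR==1 {print $3}')"
    if version_ge "$mvn_version" "3.1.0"; then
        echo "Maven $mvn_version is new enough"
    else
        echo "Maven $mvn_version is older than 3.1.0"
    fi
fi
```

The script only reports what it finds; it does not install or change anything.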
Install Apache NiFi Prerequisites
cd /tmp
git clone https://github.com/cloudera/cm_ext
cd cm_ext/validator
mvn install
This process will take some time to complete. After it finishes, follow the steps below to build the Apache NiFi package.
cd /tmp
git clone http://github.com/apache/nifi
cd /tmp/nifi
Now edit the pom.xml file to avoid test failures like the one below.
Example of Maven build failure
nifi-lookup-services ............................... SUCCESS [ 3.845 s]
nifi-standard-utils ................................ SUCCESS [ 0.950 s]
nifi-standard-processors ........................... FAILURE [04:30 min]
nifi-standard-reporting-tasks ...................... SKIPPED
nifi-standard-content-viewer ....................... SKIPPED
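In the maven-surefire-plugin configuration, set skipTests to true:
<artifactId>maven-surefire-plugin</artifactId>
<configuration>
<skipTests>true</skipTests>
</configuration>
Also add the entry below to the properties section: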
<properties>
<maven.test.skip>true</maven.test.skip>
<!-- Vendor-specific version number included here as default,
should be overridden on the command-line <hadoop.version>2.7.1.2.4.0.0-169</hadoop.version> -->
</properties>
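As an alternative to editing pom.xml, the same Surefire property can be passed on the Maven command line. The sketch below only prints the command for review; the variable name `maven_args` is an assumption for illustration, and I have not verified that this alone avoids every failure the author encountered.

```shell
#!/usr/bin/env bash
# -DskipTests sets the Surefire "skipTests" property for this invocation only,
# so pom.xml stays unmodified. Tests are still compiled, just not run.
maven_args="-T 2.0C clean install -DskipTests"

# Print the command instead of running it; run the printed line from /tmp/nifi.
echo "mvn $maven_args"
```

Using `-Dmaven.test.skip=true` instead would also skip compiling the tests, which mirrors the property added to pom.xml above.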
Create Apache NiFi Parcel for Cloudera
cd /tmp/nifi
mvn -T 2.0C clean install
The build takes a long time. Wait until the Apache NiFi build completes:
[INFO] dockerhub 1.8.0-SNAPSHOT ........................... SUCCESS [ 1.505 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 09:15 min (Wall Clock)
[INFO] Finished at: 2018-10-08T14:42:07+05:30
[INFO] ------------------------------------------------------------------------
Download parcel build script
$ cd /tmp
$ git clone http://github.com/prateek/nifi-parcel
$ cd nifi-parcel
$ POINT_VERSION=5 VALIDATOR_DIR=/tmp/cm_ext ./build-parcel.sh /tmp/nifi/nifi-assembly/target/nifi-*-SNAPSHOT-bin.tar.gz
$ VALIDATOR_DIR=/tmp/cm_ext ./build-csd.sh
If you receive an error like the one below while generating the parcel package, modify the build script.
nifi-1.8.0-SNAPSHOT/docs/html/images/zero-master-node.png
sed: can't read s/<VERSION-FULL>/0.0.5.nifi.p0.5/g: No such file or directory

vi /tmp/nifi-parcel/build-parcel.sh

From:
sed -i "" "s/<VERSION-FULL>/$FULL_VERSION/g" $file
sed -i "" "s/<VERSION-SHORT>/${SHORT_VERSION}/g" $file

To:
for file in `ls $PARCEL_NAME/meta/**`
do
  sed -i -e "s/<VERSION-FULL>/$FULL_VERSION/g" $file
  sed -i -e "s/<VERSION-SHORT>/${SHORT_VERSION}/g" $file
done
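The error comes from a difference between sed implementations. `sed -i ""` is the BSD/macOS form, where the backup suffix is a separate argument. GNU sed expects any suffix attached directly to `-i`, so it reads the separate `""` as an empty script and then treats the substitution expression as a file name, which is exactly the "can't read" message above. The sketch below tries the corrected form on a throwaway file rather than the real parcel metadata. `fill_versions` is a hypothetical helper name, and the short version value `0.0.5` is an assumption for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace both version placeholders in a single file.
# Arguments: file path, full version, short version.
# Uses `sed -i -e`, the form from the corrected build script.
fill_versions() {
    sed -i -e "s/<VERSION-FULL>/$2/g" "$1"
    sed -i -e "s/<VERSION-SHORT>/$3/g" "$1"
}

# Create a temporary file containing the two placeholders.
demo_file="$(mktemp)"
printf '%s\n' '{"version": "<VERSION-FULL>", "short": "<VERSION-SHORT>"}' > "$demo_file"

# The short version "0.0.5" is an assumed value, not taken from the build script.
fill_versions "$demo_file" "0.0.5.nifi.p0.5" "0.0.5"
cat "$demo_file"
```

If the printed line no longer contains either placeholder, the same sed form should behave correctly inside build-parcel.sh on that machine.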
Then run the parcel build again.
NIFI-0.0.5.nifi.p0.5/README
Validating: NIFI-0.0.5.nifi.p0.5-el6.parcel
Validating: NIFI-0.0.5.nifi.p0.5/meta/parcel.json
Validation succeeded.
Scanning directory: .
Found parcel NIFI-0.0.5.nifi.p0.5-el6.parcel
Now start a small HTTP server in the parcel directory so that Cloudera Manager can download the Apache NiFi parcel.
# python -m SimpleHTTPServer 14641
Serving HTTP on 0.0.0.0 port 14641 ...
192.168.2.5 - - [08/Oct/2018 15:04:29] "GET /manifest.json HTTP/1.1" 200 -
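The `SimpleHTTPServer` module exists only in Python 2; in Python 3 the equivalent module is `http.server`. The sketch below picks the module name from the interpreter's major version and prints the command to run, without starting the server. `http_server_module` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: name of the built-in HTTP server module
# for a given Python major version.
http_server_module() {
    if [ "$1" -ge 3 ]; then
        echo "http.server"
    else
        echo "SimpleHTTPServer"
    fi
}

# Detect the major version of whichever `python` is on PATH, if any,
# and print the matching command for port 14641.
if command -v python >/dev/null 2>&1; then
    py_major="$(python -c 'import sys; print(sys.version_info[0])')"
    echo "Run: python -m $(http_server_module "$py_major") 14641"
fi
```

Run the printed command from the directory that contains the generated parcel and manifest.json.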
Cloudera Manager Apache NiFi Distribution
Log in to your Cloudera Manager console and navigate to Hosts -> Parcels -> Configuration.
Add a new entry for the parcel server:
http://192.168.2.5:14641
Now click Check for New Parcels.
Download the parcel and distribute it to the Cloudera cluster.
Move Package to Cloudera Manager
$ cp NIFI-1.0.jar /opt/cloudera/csd
$ mkdir /opt/cloudera/csd/NIFI-1.0
$ cp NIFI-1.0.jar /opt/cloudera/csd/NIFI-1.0
$ cd /opt/cloudera/csd/NIFI-1.0
$ jar xvf NIFI-1.0.jar
$ rm -f NIFI-1.0.jar
$ sudo service cloudera-scm-server restart
# Wait a minute, then go to Cloudera Manager -> Add a Service -> NiFi
That’s it for building an Apache NiFi parcel for Cloudera Manager.
Conclusion:
This Apache NiFi parcel does not support RHEL 7 / CentOS 7. Check compatibility before compiling the package.
Thanks for your wonderful Support and Encouragement