Streamsets: JDBC Connection To MSSQL New Pipeline

In this article I am going to show you how to create your first streamsets pipeline JDBC connection to MSSQL and query the data and process the data as you need. 

Prerequisites to Create New Pipeline

Login into your streamsets web application using URL

http://Streamsets-IP-Address:18630

Click on Create New Pipeline

Streamsets New Pipeline

New Pipeline

Save the Information

Now Select source as JDBC Consumer Query and Destination as Trash for some time

JDBC Consumer Query

Click on JDBC Consumer Query then see below Tabs fill required inputs

General Tab

JDBC Connection String for MSSQL – SQLSERVER

In JDBC Connection string for MSSQL is below

jdbc:sqlserver://IPAddressOfMSSQLServer:PORT;databaseName=DBName;

Write sql query to retrieve the data from database

Credentials

Credentials tab provide username and password to connect SQL server

Advanced Tab

Here in advanced tab sometimes you have to remove the tick mark Enforce Read-Only Connection 

error records

If you have error handling technique then add that one or else discard errors 

Streamsets: JDBC Connection To MSSQL From StreamSets

Now click on Validate Tick mark

JDBC_00 - Cannot connect to specified database: com.streamsets.pipeline.api.StageException: JDBC_06 - Failed to initialize connection pool: java.lang.RuntimeException: Unable to get driver instance for jdbcUrl=jdbc:sqlserver://192.168.2.5:1433;

You may see above error if your environment is not configured properly with required JDBC drivers

then follow below steps to resolve JDBC driver not found in streamsets

SSH to your streamsets server

Version support for Java Runtime Environments (JRE) and the associated JDBC API versions are determined by the included JAR files as follows:

  • Sqljdbc41.jar requires a JRE of 7 and supports the JDBC 4.1 API
  • Sqljdbc42.jar requires a JRE of 8 and supports the JDBC 4.2 API

Download and copy the sql jdbc driver to JDBC library path. In my case below is the path

/opt/streamsets-datacollector/streamsets-libs/streamsets-datacollector-jdbc-lib/lib

-rwxrwxrwx. 1 sdc sdc  660079 Feb 16 11:03 sqljdbc42.jar
-rwxrwxrwx. 1 sdc sdc 1166240 Jul 31 22:53 mssql-jdbc-7.0.0.jre8.jar

# chown sdc:sdc sqljdbc42.jar
# chown sdc:sdc mssql-jdbc-7.0.0.jre8.jar

Edit Stream Sets Deanmon environment shell script and add string to export the library path

vi /opt/streamsets-datacollector/libexec/sdcd-env.sh

## For SQL Database Connection
export STREAMSETS_LIBRARIES_EXTRA_DIR=/opt/streamsets-datacollector/streamsets-libs/streamsets-datacollector-jdbc-lib/lib/

and run the script

Restart SDC service

# systemctl restart sdc

Now validate back the JDBC Connection to sqlserver it will be good and ready to fetch data

That’s it about creating New MSSQL Pipeline in Streamsets

Related Articles

Data Collector Installation Step by Step Guide Ubuntu

How To Install GNOME 

Thanks for your wonderful Support and Encouragement