Streamsets: JDBC Connection To MSSQL New Pipeline
In this article I am going to show you how to create your first streamsets pipeline JDBC connection to MSSQL and query the data and process the data as you need.
Prerequisites to Create New Pipeline
- Download JDBC driver Microsoft JDBC Download URL
- Open firewall port 1433 from streamsets server to MSSQL Server
Login into your streamsets web application using URL
http://Streamsets-IP-Address:18630
Click on Create New Pipeline
Save the Information
Now Select source as JDBC Consumer Query and Destination as Trash for some time
Click on JDBC Consumer Query then see below Tabs fill required inputs
In JDBC Connection string for MSSQL is below
jdbc:sqlserver://IPAddressOfMSSQLServer:PORT;databaseName=DBName;
Write sql query to retrieve the data from database
Credentials tab provide username and password to connect SQL server
Here in advanced tab sometimes you have to remove the tick mark Enforce Read-Only Connection
If you have error handling technique then add that one or else discard errors
Streamsets: JDBC Connection To MSSQL From StreamSets
Now click on Validate Tick mark
JDBC_00 - Cannot connect to specified database: com.streamsets.pipeline.api.StageException: JDBC_06 - Failed to initialize connection pool: java.lang.RuntimeException: Unable to get driver instance for jdbcUrl=jdbc:sqlserver://192.168.2.5:1433;
You may see above error if your environment is not configured properly with required JDBC drivers
then follow below steps to resolve JDBC driver not found in streamsets
SSH to your streamsets server
Version support for Java Runtime Environments (JRE) and the associated JDBC API versions are determined by the included JAR files as follows:
- Sqljdbc41.jar requires a JRE of 7 and supports the JDBC 4.1 API
- Sqljdbc42.jar requires a JRE of 8 and supports the JDBC 4.2 API
Download and copy the sql jdbc driver to JDBC library path. In my case below is the path
/opt/streamsets-datacollector/streamsets-libs/streamsets-datacollector-jdbc-lib/lib -rwxrwxrwx. 1 sdc sdc 660079 Feb 16 11:03 sqljdbc42.jar -rwxrwxrwx. 1 sdc sdc 1166240 Jul 31 22:53 mssql-jdbc-7.0.0.jre8.jar # chown sdc:sdc sqljdbc42.jar # chown sdc:sdc mssql-jdbc-7.0.0.jre8.jar
Edit Stream Sets Deanmon environment shell script and add string to export the library path
vi /opt/streamsets-datacollector/libexec/sdcd-env.sh ## For SQL Database Connection export STREAMSETS_LIBRARIES_EXTRA_DIR=/opt/streamsets-datacollector/streamsets-libs/streamsets-datacollector-jdbc-lib/lib/
and run the script
Restart SDC service
# systemctl restart sdc
Now validate back the JDBC Connection to sqlserver it will be good and ready to fetch data
That’s it about creating New MSSQL Pipeline in Streamsets
Related Articles
Data Collector Installation Step by Step Guide Ubuntu
Thanks for your wonderful Support and Encouragement