
How to authenticate StreamSets with Snowflake: Using SSO for Snowflake key pair authentication
Key pair authentication can provide an added layer of security over basic authentication (i.e., username and password) and multi-factor authentication when accessing your Snowflake warehouse. It is the approach to use for accessing a Snowflake environment that is configured for single sign-on (SSO).
Why use Snowflake key pair authentication with StreamSets or other data integration providers?
Key pair authentication assigns an asymmetric pair of public and private keys to a user. You assign the public key you generate to your Snowflake user. Then, when establishing a connection to your Snowflake table within your StreamSets pipeline, you configure the private key associated with that public key to establish the connection to Snowflake. Your private key lives only in your local directory, making this a much more secure method of protecting your data.
Setting this up takes five steps. I’ve included the relevant information from the Snowflake documentation and added the final step necessary for StreamSets Data Collector as well as StreamSets Transformer for Snowflake, built on Snowpark.
StreamSets Data Collector enables you to connect to many enterprise data sources and ingest their data into Snowflake. StreamSets Transformer for Snowflake allows you to design simple to complex data transformation pipelines that then run natively in Snowflake. The two services are decoupled, allowing you to run enterprise-grade ELT processes.
Step 1: Create private key
Open a Terminal window to generate your private key. Depending on your security and governance requirements, you can generate either an encrypted or an unencrypted key. If you are unsure what your security guidelines require, an encrypted key is generally the safer choice.
For an encrypted key, use the following command:
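The command for this step might look like the sketch below, which follows the OpenSSL pattern used in Snowflake's key pair documentation. The file name `rsa_key.p8` and the `PRIVATE_KEY_PASSPHRASE` environment variable (with its placeholder value) are assumptions for illustration; drop the `-passout` option if you would rather have OpenSSL prompt you for the passphrase.

```shell
# Hypothetical placeholder passphrase for illustration only; use your own value.
export PRIVATE_KEY_PASSPHRASE="${PRIVATE_KEY_PASSPHRASE:-change-me}"

# Encrypted key: 2048-bit RSA, written as a PEM-encoded PKCS#8 file.
openssl genrsa 2048 | openssl pkcs8 -topk8 -v2 des3 -inform PEM \
  -out rsa_key.p8 -passout env:PRIVATE_KEY_PASSPHRASE

# Unencrypted alternative (no passphrase), if your guidelines allow it:
# openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out rsa_key.p8 -nocrypt
```

Either variant writes the key to the current directory, so run it from the folder where you want the key to live.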
Step 2: Create public key
Enter the following command in your Terminal window:
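A minimal sketch of this step, assuming the private key from Step 1 is saved as `rsa_key.p8` in the current directory and that its passphrase is available in the hypothetical `PRIVATE_KEY_PASSPHRASE` environment variable. The guard at the top is only there so the snippet runs on its own: it creates a throwaway unencrypted key if no key file is found.

```shell
# Hypothetical placeholder passphrase; it is ignored if the key is unencrypted.
export PRIVATE_KEY_PASSPHRASE="${PRIVATE_KEY_PASSPHRASE:-change-me}"

# Stand-alone guard: create a throwaway unencrypted key if Step 1 was skipped.
if [ ! -f rsa_key.p8 ]; then
  openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out rsa_key.p8 -nocrypt
fi

# Derive the public key from the private key and write it to rsa_key.pub.
openssl rsa -in rsa_key.p8 -passin env:PRIVATE_KEY_PASSPHRASE -pubout -out rsa_key.pub
```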
Step 3: Store private and public keys securely
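Make sure you know where your keys are stored, because you will need the file path later. If you generated an encrypted key, the private key is stored in encrypted form, protected by the passphrase you specified in Step 1. Both files are PEM-encoded text: an encrypted private key begins with the line -----BEGIN ENCRYPTED PRIVATE KEY----- and the public key begins with -----BEGIN PUBLIC KEY-----.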
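One way to lock down the private key and confirm what each file contains is sketched below. The file names `rsa_key.p8` and `rsa_key.pub` are assumed from the earlier steps, and `show_pem_header` is a hypothetical helper name.

```shell
# Print the first line (the PEM header) of a key file.
show_pem_header() {
  head -n 1 "$1"
}

# Make the private key readable and writable by the current user only.
if [ -f rsa_key.p8 ]; then
  chmod 600 rsa_key.p8
fi

# Show the header of each key file that exists in the current directory.
for key_file in rsa_key.p8 rsa_key.pub; do
  if [ -f "$key_file" ]; then
    show_pem_header "$key_file"
  else
    echo "not found: $key_file" >&2
  fi
done
```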
Step 4: Set your public key to your Snowflake user
You need a role with the rights to edit users, such as ACCOUNTADMIN. You can view which role you have in the top right corner under your name. If you have the rights, you can switch roles by running a USE ROLE command in your worksheet or by using the user settings menu in the top right corner of your worksheet.
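Once you are in the right role, the public key is attached to the user with Snowflake's `ALTER USER ... SET RSA_PUBLIC_KEY` statement, and the key value must be pasted without its `-----BEGIN PUBLIC KEY-----` and `-----END PUBLIC KEY-----` delimiter lines. The sketch below prints SQL you could paste into a worksheet; the user name `example_user` and the helper `build_alter_user_sql` are hypothetical, and `rsa_key.pub` is assumed from Step 2.

```shell
# Hypothetical Snowflake user name; replace with your own.
SNOWFLAKE_USER="${SNOWFLAKE_USER:-example_user}"

# Strip the PEM delimiter lines, join the key body into one line,
# and print the SQL statements for a Snowflake worksheet.
build_alter_user_sql() {
  pub_body=$(grep -v -- '-----' "$1" | tr -d '\n')
  printf "USE ROLE ACCOUNTADMIN;\nALTER USER %s SET RSA_PUBLIC_KEY='%s';\nDESC USER %s;\n" \
    "$2" "$pub_body" "$2"
}

if [ -f rsa_key.pub ]; then
  build_alter_user_sql rsa_key.pub "$SNOWFLAKE_USER"
fi
```

The trailing `DESC USER` statement lets you check that the user now has a public key fingerprint set.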
Step 5a: Configure your Snowflake destination: StreamSets Data Collector ingestion pipeline
In your Snowflake destination, click on the tab labeled “Snowflake Connection Info” in the Properties panel.
You will need to configure two keys: one called “private_key_file”, whose value is the file path where your private key is saved locally, and another called “private_key_file_pwd”, whose value is the passphrase you set when you created the key in Step 1.
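Before entering these values, it can help to confirm that the file path and passphrase actually match. The sketch below is a pre-flight check, not part of StreamSets itself: `check_private_key` is a hypothetical helper, and the default path and the `PRIVATE_KEY_PASSPHRASE` variable are assumptions carried over from Step 1.

```shell
# Hypothetical defaults; point these at your own key file and passphrase.
PRIVATE_KEY_FILE="${PRIVATE_KEY_FILE:-$PWD/rsa_key.p8}"
export PRIVATE_KEY_PASSPHRASE="${PRIVATE_KEY_PASSPHRASE:-change-me}"

# Succeeds only if the key file can be read (and decrypted, if encrypted).
check_private_key() {
  openssl pkey -in "$1" -passin env:PRIVATE_KEY_PASSPHRASE -noout 2>/dev/null
}

if check_private_key "$PRIVATE_KEY_FILE"; then
  # These are the two keys to enter in the destination's connection properties.
  echo "private_key_file = $PRIVATE_KEY_FILE"
  echo "private_key_file_pwd = <the passphrase from Step 1>"
else
  echo "could not read $PRIVATE_KEY_FILE with the given passphrase" >&2
fi
```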
Step 5b: Configure your Snowflake destination: StreamSets Transformer for Snowflake transformation pipeline
In the upper right corner of your StreamSets profile, click on the person icon to open a dropdown menu. Click on “My Account”, which will take you to your account settings.