Data warehouses are used for storing structured, processed data.
This type of data is used by Peak for analysis and decision making.
Peak requires a data warehouse to be configured so that core features such as Data Sources and SQL Explorer can be used.

This article describes how to connect your Snowflake data warehouse to Peak.




Process overview

To connect Peak to your Snowflake data warehouse, you need to complete steps in both your Snowflake cluster account and your Peak organization.

In your Snowflake cluster account:

  1. Get details of your account.
    See below for a list of the information you’ll need.

  2. Set up storage integration between your Snowflake cluster account and the Amazon S3 data lake.
    This involves running a script in your account. See below for details. 

In Peak:

There are four steps to complete during this process.

  1. Details
    This step lets you name your data warehouse and specify the type of data warehouse that you want to use (in this case, Snowflake).

  2. Configuration
    This step lets you specify the Snowflake cluster details for your organization.
    Use the details that you have copied from your Snowflake account.

  3. Data Lake
    This step lets you link your data warehouse to a data lake.
    The data warehouse and the data lake must be in the same region.

  4. Review
    Once you have entered all of your configuration details, this step lets you review everything before saving.


Get details of your Snowflake cluster

Go to your Snowflake account and get the following details for your Snowflake cluster:

  • account ID

  • password

  • user

  • default_role

  • owner
    Customer managed is the default value for both Peak managed and customer managed.

  • region
    This must be the same region in which your data lake is located.

  • database

  • default_schema
    This is the schema that Peak will have read/write access to.

  • default_warehouse

  • default_storage_integration
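
It can help to collect these details in one place before you start the steps in Peak. The following is a minimal Python sketch of such a checklist. The class and function names are hypothetical and are not part of Peak or Snowflake, and the region check assumes that region names can be compared without regard to case or surrounding spaces.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass
class SnowflakeDetails:
    """Hypothetical container for the details listed above (not a Peak API)."""
    account_id: str = ""
    password: str = ""
    user: str = ""
    default_role: str = ""
    owner: str = "Customer managed"  # default value named in the list above
    region: str = ""
    database: str = ""
    default_schema: str = ""
    default_warehouse: str = ""
    default_storage_integration: str = ""


def missing_fields(details: SnowflakeDetails) -> list[str]:
    """Return the names of any details that are still blank."""
    return [f.name for f in fields(details) if not getattr(details, f.name).strip()]


def region_matches(details: SnowflakeDetails, data_lake_region: str) -> bool:
    """Assumption: regions match if equal after trimming spaces and ignoring case."""
    return details.region.strip().lower() == data_lake_region.strip().lower()
```

Calling missing_fields on a partly completed record returns the names of the details you still need to copy from your Snowflake account.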


Set up storage integration

Before you configure your Snowflake data warehouse to work with Peak, you must set up storage integration between your Snowflake cluster account and the Amazon S3 data lake.


For more information on this process, see Snowflake docs: Create storage integration


To set up storage integration, go to your Snowflake cluster account and run the following script:


-- Use the accountadmin role to create the storage integration.
USE ROLE accountadmin;

-- Create the storage integration for the Amazon S3 data lake.
CREATE STORAGE INTEGRATION SNOWFLAKEUSER_STORAGE_INTEGRATION
  type = external_stage
  storage_provider = s3
  enabled = true
  storage_aws_role_arn = 'arn:aws:iam::<Prod Account number>:role/prod-<tenant Name>-System'
  storage_allowed_locations = ('s3://<Bucket Name>/<Root Path>')
  storage_aws_object_acl = 'bucket-owner-full-control';

-- Grant usage on the storage integration to the role.
USE ROLE accountadmin;
GRANT usage
  ON integration SNOWFLAKEUSER_STORAGE_INTEGRATION
  TO role SNOWFLAKEUSER_GP_ROLE;

Script details

  • Prod Account number
    The AWS account number for the S3 storage.

  • tenant Name
    This must be entered in lower case with no special characters or spaces.

  • Bucket Name
    This is the name of the S3 bucket that was entered when your organization’s data lake was configured.
    To find the name, go to Dock > Data Bridge > Data Lake and click the expand icon.

  • Root Path
    This is the root path of the S3 bucket that was entered when your organization’s data lake was configured.
    To find the path, go to Dock > Data Bridge > Data Lake and click the expand icon.

  • Account admin role
    Use the accountadmin role to create the storage integration.

  • Storage integration name
    The name SNOWFLAKEUSER_STORAGE_INTEGRATION can be changed.
    If you do this, make sure you also change the name:
    • in the script at GRANT usage
    • when setting up the data warehouse connection, under default_storage_integration
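
Because the storage integration name appears in more than one place in the script, one way to avoid a mismatch is to generate the script from a single set of values. The sketch below does this in Python. The function name and its parameters are illustrative assumptions, not a Peak or Snowflake tool, and the tenant name check assumes that "no special characters" means lower-case letters and digits only.

```python
import re

# Same statements as the script above, with the placeholders made explicit.
SCRIPT_TEMPLATE = """\
USE ROLE accountadmin;

CREATE STORAGE INTEGRATION {integration}
type = external_stage
storage_provider = s3
enabled = true
storage_aws_role_arn = 'arn:aws:iam::{account}:role/prod-{tenant}-System'
storage_allowed_locations = ('s3://{bucket}/{root}')
storage_aws_object_acl = 'bucket-owner-full-control';

USE ROLE accountadmin;
GRANT usage
ON integration {integration}
TO role {role};
"""


def render_storage_integration_script(
    account_number,
    tenant_name,
    bucket_name,
    root_path,
    integration_name="SNOWFLAKEUSER_STORAGE_INTEGRATION",
    role_name="SNOWFLAKEUSER_GP_ROLE",
):
    """Fill in the script placeholders (hypothetical helper)."""
    # Assumption: a valid tenant name is lower-case letters and digits only.
    if not re.fullmatch(r"[a-z0-9]+", tenant_name):
        raise ValueError("tenant name must be lower case, with no special characters or spaces")
    return SCRIPT_TEMPLATE.format(
        integration=integration_name,  # used in both CREATE and GRANT
        account=account_number,
        tenant=tenant_name,
        bucket=bucket_name,
        root=root_path,
        role=role_name,
    )
```

If you pass a different integration_name, the same value is used in both the CREATE STORAGE INTEGRATION and GRANT usage statements. You still need to enter that name under default_storage_integration when you set up the connection in Peak.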

Getting to the screens

To connect to a new data warehouse:

  1. Go to Dock > Data Bridge.

  2. Click ADD DATA WAREHOUSE.



Entering the data warehouse details

  1. Name your data warehouse connection.
    The name must be unique to your Peak organization. 
    Only alphanumeric characters and underscores are allowed.
    The name cannot be changed after the connection has been set up.

  2. Choose Snowflake, then click NEXT to move to the Configuration stage.
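
Because the name cannot be changed later, it can be worth checking it against the naming rules before you enter it. This is a minimal Python sketch, not Peak's own validation. The function name is hypothetical, and it assumes that "alphanumeric" means ASCII letters and digits and that uniqueness is an exact, case-sensitive match against names you already know are in use.

```python
import re

# Assumption: only ASCII letters, digits and underscores are accepted.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_connection_name(name, existing_names=()):
    """Return True if the name follows the rules above and is not already in use."""
    if not _NAME_PATTERN.fullmatch(name):
        return False  # empty, or contains a character that is not allowed
    return name not in set(existing_names)
```

For example, a name containing a hyphen or a space is rejected, as is a name that already appears in existing_names.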



Snowflake configuration

For this step, you will need the Snowflake cluster details that you gathered at the start of the process.

Once you have entered the details, check the connection by clicking the Test button.

The storage integration isn't tested as part of this process.



Data lake

This step lets you link your data warehouse to a data lake.
Linking the two gives you faster speeds, reduced latency and more flexibility.

Select the required data lake from the drop-down and click NEXT.

Your data lake and data warehouse must be in the same region.


Reviewing your connection

Before you complete the configuration process, you can review the details you have given at each stage of the process.

  1. To make changes, click Edit next to the option you want to change.



  2. Once the details are correct, click FINISH.
    You will be taken back to the Data Bridge listing screen and your newly configured data warehouse will be shown as ‘Active’.