Configuring a Snowflake Storage Integration to Access Amazon S3
To Nha Notes | March 29, 2022, 11:38 a.m.
This section describes how to use storage integrations to allow Snowflake to read data from and write data to an Amazon S3 bucket referenced in an external (i.e. S3) stage. Integrations are named, first-class Snowflake objects that avoid the need for passing explicit cloud provider credentials such as secret keys or access tokens. Integration objects store an AWS identity and access management (IAM) user ID. An administrator in your organization grants the integration IAM user permissions in the AWS account.
https://docs.snowflake.com/en/user-guide/data-load-s3-config-storage-integration.html
https://aws.amazon.com/blogs/awsmarketplace/automate-snowflake-integration-with-amazon-s3-using-aws-service-catalog/
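The integration itself is created once by an account administrator and references an IAM role that AWS is configured to trust. A minimal sketch, in which the role ARN, bucket, and path are placeholders rather than values from this setup:

```sql
-- Create the integration (run as ACCOUNTADMIN); role ARN and bucket are placeholders.
CREATE STORAGE INTEGRATION MP_S3_INTEGRATION
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/my-snowflake-role'
  STORAGE_ALLOWED_LOCATIONS = ('s3://bucket-name/test/');

-- Retrieve STORAGE_AWS_IAM_USER_ARN and STORAGE_AWS_EXTERNAL_ID from the output,
-- then add them to the IAM role's trust policy on the AWS side.
DESC INTEGRATION MP_S3_INTEGRATION;
```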
Unloading into Amazon S3
Stored procedure code example:
CREATE OR REPLACE PROCEDURE unload_data_to_s3()
RETURNS VARIANT NOT NULL
LANGUAGE JAVASCRIPT
AS
$$
var returnValue = {"status": "OK"};

// Iterate over all projects and unload each one to its own JSON file on S3.
var projectsStmt = snowflake.createStatement({
    sqlText: "SELECT ID, NAME FROM MP_COMMON.PUBLIC.PROJECTS;"
});
var res = projectsStmt.execute();

while (res.next()) {
    var projectId = res.getColumnValue("ID");
    // The target file name is built from the project ID; the same ID is
    // bound into the WHERE clause of the inner SELECT.
    var copySQL = "COPY INTO s3://bucket-name/test/" + projectId + ".json";
    copySQL += `
        FROM (
            SELECT OBJECT_CONSTRUCT('project_id', ID, 'organization_id', ORGANIZATION_ID, 'name', NAME)
            FROM MP_COMMON.PUBLIC.PROJECTS WHERE ID = ?
        )
        STORAGE_INTEGRATION = MP_S3_INTEGRATION
        FILE_FORMAT = (TYPE = JSON COMPRESSION = NONE) SINGLE = TRUE OVERWRITE = TRUE;
    `;
    var copyStmt = snowflake.createStatement({ sqlText: copySQL, binds: [projectId] });
    copyStmt.execute();
}

return returnValue;
$$;
Execute the stored procedure:
CALL unload_data_to_s3();
Unload data from Snowflake to S3 with partitioning:
-- PARTITIONING AND UNLOAD TO S3
COPY INTO s3://bucket-name/test/
FROM
(
SELECT
OBJECT_CONSTRUCT(*) P
FROM
<TABLE>
)
PARTITION BY P:ID::STRING
STORAGE_INTEGRATION=MY_S3_INTEGRATION
FILE_FORMAT = (TYPE = JSON COMPRESSION = NONE);
SET UNLOAD_UUID = LAST_QUERY_ID();
-- RENAME S3 FILES: build a manifest mapping each unloaded file to its target name
CREATE OR REPLACE STAGE MY_COMMON.MY_DATALOADER_RAW.test_stage
STORAGE_INTEGRATION = MY_S3_INTEGRATION
URL = 's3://bucket-name/test/'
FILE_FORMAT = (TYPE = JSON, STRIP_OUTER_ARRAY = TRUE);
SET PATTERN=CONCAT('.*data_', $UNLOAD_UUID, '_.*');
LIST @MY_COMMON.MY_DATALOADER_RAW.test_stage PATTERN=$PATTERN;
COPY INTO s3://bucket-name/test/renaming_manifest.json
FROM (
WITH
TEMP AS
(
SELECT
"name"::STRING AS NAME
, SPLIT(NAME, '/') AS ARR
, REPLACE(GET(ARR, 4), '"', '') AS ID
, REPLACE(GET(ARR, 5), '"', '') AS FILE
FROM
TABLE(RESULT_SCAN(LAST_QUERY_ID()))
)
, ID_FILE_MAPPING AS
(
SELECT
ID
, FILE
, NAME AS FROM_S3_URI
, CONCAT('s3://bucket-name/test/', ID, '/', ID, '.json') AS TO_S3_URI
FROM
TEMP
)
SELECT
OBJECT_CONSTRUCT('from_s3_uri', FROM_S3_URI, 'to_s3_uri', TO_S3_URI)
FROM
ID_FILE_MAPPING
)
STORAGE_INTEGRATION=MY_S3_INTEGRATION
FILE_FORMAT = (TYPE = JSON COMPRESSION = NONE)
OVERWRITE = TRUE
SINGLE = TRUE;
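The COPY above writes a newline-delimited JSON manifest of (from, to) URI pairs; the rename itself then happens outside Snowflake, typically a copy-and-delete per pair with an S3 client such as boto3. A minimal sketch of parsing that manifest, assuming one JSON object per line as produced by a `TYPE = JSON` unload (the file contents below are illustrative, not real pipeline output):

```python
import json

def parse_rename_manifest(manifest_text):
    """Parse a newline-delimited JSON rename manifest into (from, to) URI pairs.

    Each line is an object like
    {"from_s3_uri": "s3://.../data_..._0_0_0.json", "to_s3_uri": "s3://.../<id>.json"}.
    An S3 client (e.g. boto3 copy_object + delete_object) would apply each pair.
    """
    pairs = []
    for line in manifest_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        pairs.append((record["from_s3_uri"], record["to_s3_uri"]))
    return pairs

# Illustrative manifest contents (hypothetical IDs and query UUID):
manifest = """
{"from_s3_uri": "s3://bucket-name/test/42/data_ab12_0_0_0.json", "to_s3_uri": "s3://bucket-name/test/42/42.json"}
{"from_s3_uri": "s3://bucket-name/test/43/data_ab12_0_1_0.json", "to_s3_uri": "s3://bucket-name/test/43/43.json"}
"""

pairs = parse_rename_manifest(manifest)
```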
https://docs.snowflake.com/en/user-guide/data-unload-s3.html
https://docs.snowflake.com/en/sql-reference/sql/copy-into-location.html