To Nha Notes | Feb. 21, 2023, 11:57 a.m.
This series shows you the various ways you can use Python within Snowflake.
Snowpark for Python is the name for the new Python functionality integration that Snowflake has recently developed. At the Snowflake Summit in June 2022, Snowpark for Python was officially released into Public Preview, which means anybody is able to get started using Python in their own Snowflake environment.
I would understand if you wanted to skip this post and cut directly to the Snowflake’s monolithic documentation page; however, I would strongly recommend reading through this post first as it will give you an introduction to both the possibilities and the limitations of this new functionality. We will also walk through the steps you can take to create your own Python stored procedure from within Snowflake itself. There’s no need to set up anything on your local machine for this; all you need is a Snowflake account and the relevant privileges, most notably the CREATE PROCEDURE privilege at the schema level.
If you are interested in finding out about Python UDFs instead of stored procedures, be sure to check out my matching post, “A Definitive Guide to Creating Python UDFs Directly with the Snowflake User Interface.”
Most Snowflake users are far more comfortable with SQL than they are with Python or other advanced programming languages, simply because SQL is the core language required to leverage the Snowflake platform. With that in mind, it is easy to consider using SQL-based stored procedures instead of other languages; such as Python. However, there are several strong reasons to leverage a non-SQL language instead, especially when compared to Python:
To summarise, Python is simply more versatile than SQL and unlocks a wider range of functionality. Here are a few examples of powerful functionality that is possible with Python and not with SQL:
- Load SQLs which has multiple statements separated by ; from staged S3 files via a Snowflake UDF function as blow.
create or replace function my_udf()
returns string
language python
runtime_version=3.8
imports=('@my_stage/file.txt')
handler='compute'
as
$$
import sys
import os
with open(os.path.join(sys._xoptions["snowflake_import_directory"], 'file.txt'), "r") as f:
s = f.read()
def compute():
return s
$$;
- Create a common stored procedure in python language to split loaded SQL statements and iterative each SQL to execute it.
CREATE OR REPLACE PROCEDURE <PROCEDURE_NAME> (input_param STRING)
RETURNS STRING
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
PACKAGES = ('snowflake-snowpark-python')
HANDLER = 'run'
EXECUTE AS CALLER
AS
$$
def execute_sql(session, sql):
if sql:
session.sql(sql.strip()).collect()
def run(session, output_bucket):
sqls_str = """
<STATEMENT_1>;
<STAGEMENT_2>;
"""
sqls = sqls_str.split(';')
for sql in sqls:
execute_sql(session, sql.strip())
return "SUCCESS"
$$;
https://pkg.go.dev/github.com/snowflakedb/gosnowflake
https://docs.snowflake.com/en/developer-guide/udf/python/udf-python-creating