ice_pick#

Contains core classes of Ice Pick

Submodules#

Package Contents#

Classes#

SchemaObject

Represents a Snowflake Schema object.

SchemaObjectFilter

A filter that can be used to return multiple SchemaObjects

AccountObjectFilter

A filter that can be used to return multiple AccountObjects

AccountObject

Account

(TODO - see how this is handled in snowflake)

Warehouse

Role

User

Database

Schema

Integration

NetworkPolicy

ResourceMonitor

Privilege

can pass any schema object or account object

Grant

Functions#

extend_session(→ extend_session.Session)

Returns the extended Session class

concat_standalone(→ snowflake.snowpark.DataFrame)

Returns a unioned dataframe from the input list of dataframes based on column names.

melt_standalone(→ snowflake.snowpark.DataFrame)

Returns a unioned dataframe from the input list of dataframes based on column names.

class ice_pick.SchemaObject#

Represents a Snowflake Schema object.

Schema Objects Include: ALERTS, EXTERNAL FUNCTIONS, EXTERNAL TABLES, FILE FORMATS, MATERIALIZED VIEWS, MASKING POLICIES, PASSWORD POLICIES, PIPES, PROCEDURES, ROW ACCESS POLICIES, SECRETS, SESSION POLICIES, SEQUENCES, STAGES, STREAMS, TABLES, TAGS, TASKS, USER FUNCTIONS, VIEWS, *EXTERNAL FUNCTIONS,

PROCEDURES, *USER FUNCTIONS * Also note some of these objects require the enterprise account

session#

Snowpark Session

Type:

Session

database#

database that the object is in

Type:

str

schema#

schema that the object is in

Type:

str

object_name#

the name of the object

Type:

str

object_type#

the type of schema object

Type:

str

session: snowflake.snowpark.Session#
database: str = 'SNOWFLAKE'#
schema: str = ''#
object_name: str = ''#
object_type: str = ''#
get_ddl(save: bool = False, fully_qualified: bool = True) str#

Return the ddl of the schema object as a string if save = True: save the ddl locally The default save path is: DDL/database/schema/object_type/database.schema.object_name.sql

Parameters:

save (bool = False) – save the ddl as a file locally

Returns:

A string with the ddl

Return type:

str

get_description() str#

Return the description of the schema object as a string

get_grants_on() list#

Return a list of grants on the schema object as a list

get_grant_objects() list#
grant(privilege: list, grantee: str) str#

grant access on object, return status

– For TABLE
{ SELECT | INSERT | UPDATE | DELETE | TRUNCATE | REFERENCES } [ , … ]
– For VIEW
{ SELECT | REFERENCES } [ , … ]
– For MATERIALIZED VIEW
{ SELECT | REFERENCES } [ , … ]
– For SEQUENCE, FUNCTION (UDF or external function), PROCEDURE, or FILE FORMAT
USAGE
– For internal STAGE
READ [ , WRITE ]
– For external STAGE
USAGE
– For PIPE
{ MONITOR | OPERATE } [ , … ]
– For STREAM
SELECT
– For TASK
{ MONITOR | OPERATE } [ , … ]
– For MASKING POLICY
APPLY
– For PASSWORD POLICY
APPLY
– For ROW ACCESS POLICY
APPLY
– For SESSION POLICY
APPLY
– For TAG
APPLY
– For ALERT
OPERATE
– For SECRET
USAGE
create(create_method: str = 'default', ddl: str = '', sql_ext: str = '', create_if_exists: bool = False)#

create in snowflake if not exists. For now this is very dependant on object type. Usualy the additional stuff comes after the object name, which can be provided by the sql_ext param. (todo: sql ext could just make more confusing - maybe need to create specific extension objects)

create_methods:
  • default:

  • ddl: user provided ddl string

  • ddl_state: use ddl from get_ddl() function

class ice_pick.SchemaObjectFilter#

A filter that can be used to return multiple SchemaObjects

Apply selection first, then filter out ingore objects.

Filters are applied by:

database -> ignore_dbs -> schema -> ignore_schemas -> object type -> object name

“.*” can be used to return all (regex supported)

session#

Snowpark Session

Type:

Session

databases#

databases that wil be searched

Type:

list

schemas#

schemas that wil be searched

Type:

list

object_names#

the name of the objects to be searched for

Type:

list

object_types#

the type of schema objects to be searched for

Type:

list

ignore_dbs#

databases to be ignored in search

Type:

list

ignore_schemas#

schemas to be ignored in search

Type:

list

session: snowflake.snowpark.Session#
databases: list#
schemas: list#
object_names: list#
object_types: list#
ignore_dbs: list#
ignore_schemas: list#
_filter_schema_objects_helper(objects_df: pandas.DataFrame, filtered_dbs: str, filtered_schemas: str, obj_type: str) pandas.DataFrame#

a helper function for filtering dataframe for objects

_filter_schema_objects(filtered_dbs: str, filtered_schemas: str) pandas.DataFrame#
helper function to get all schema level object info
  • get the object types that are selected

  • get the object info (database, schema, object type, object name)

  • The object info returned depends on the selected object types (see schema_level_exceptions)

return_schema_objects() List[ice_pick.schema_object.SchemaObject]#

Filter objects based on input objects If the property is a wildcard “.*”, then search all objects at that level (inputs are passed as regex)

If exclude is set to true, everything matched will be ignored, and all non-matches are returned

Parameters:

None

Returns:

a list of schema objects that matched the filter cases

Return type:

List[SchemaObjects]

Example

Get all procedures in all databases:
>> SchemaObjectFilter([“.*”], [“.*”], [“.*”], [“procedure”])
Get all tables and vies in a single database:
>> SchemaObjectFilter([“TEST_DB”], [“.*”], [“.*”], [“table”, “view”])
Get all tables except for the sample tables:
>> SchemaObjectFilter([“.*”], [“.*”],[“.*”], [“table”], ingore_dbs = [“SNOWFLAKE”, “SNOWFLAKE_SAMPLE_DATA”]
Get specific tables:
>> SchemaObjectFilter([“snowflake”], [“sample_data”], [“customer”, “transactions”], [“table”])
class ice_pick.AccountObjectFilter#

A filter that can be used to return multiple AccountObjects

Apply selection first, then filter out ingore objects.

Filters are applied by:

object_types -> object_names -> ignore_names

“.*” can be used to return all (regex supported)

session#

Snowpark Session

Type:

Session

object_names#

objects that will be searched

Type:

list

object_types#

object types that will be searched

Type:

list

ignore_names#

names to be ignored in search

Type:

list

session: snowflake.snowpark.Session#
object_names: list#
object_types: list#
ignore_names: list#
_query_account_object_helper() Dict[str, pandas.DataFrame]#

get all avialable account objects based on type Return a dictionary of the object type and the assocated pandas dataframe (todo - compare vs just getting everything then filtering)

_filter_name_account_objects(account_object_collection: Dict[str, pandas.DataFrame]) Dict[str, pandas.DataFrame]#

Filter objects to only selected names

_filter_ignore_account_objects(account_object_collection: Dict[str, pandas.DataFrame]) Dict[str, pandas.DataFrame]#

Filter objects to remove “ignore” names

__create_account_object_instances(object_type: str, object_names: list) List[ice_pick.account_object.AccountObject]#

create a single account object helper for creating all account objects

_create_account_objects(account_object_collection: Dict[str, pandas.DataFrame]) List[ice_pick.account_object.AccountObject]#

create the objects frm the account object collection

return_account_objects() List[ice_pick.account_object.AccountObject]#

Return all account objects matching the filter

class ice_pick.AccountObject(session, name: str, object_type: str)#
__repr__()#

Return repr(self).

get_description()#

Return the description of the object as a string

Supports: - DATABASE - SCHEMA - INTEGRATION - NETWORK POLICY - SHARE - USER - WAREHOUSE

get_ddl(fully_qualified: bool = True) str#

Return the ddl of the account object as a string

Supports: - DATABASE - SCHEMA

mock_ddl() str#

Return the ddl of objects that don’t work with the Snowflake GET_DDL() function

Supports: - WAREHOUSE - USER - ROLE

get_grants_on()#

Supports:

get_grant_objects() list#
drop()#

drops object

Supports: most objects

un_drop()#

undrops a dropped object Supports - database - schema

create(replace: bool = False)#

Supports:

class ice_pick.Account(session)#

(TODO - see how this is handled in snowflake) For now here just as a reminder and to help with global privileges Some things can only be accessed at the account level

class ice_pick.Warehouse(session, name: str)#

Bases: AccountObject

mock_ddl()#

Return the ddl of objects that don’t work with the Snowflake GET_DDL() function

Supports: - WAREHOUSE - USER - ROLE

suspend()#
resume()#
load_history(date_range_start: int, date_range_end: int, interval: str = 'hour') pandas.DataFrame#

This function returns warehouse activity within the last 14 days. This funciton requires elevated privileges to run, either: | - The ACCOUNTADMIN role can get results from this function as it has all of the global account permissions. | - A role with the MONITOR USAGE global privilege on the ACCOUNT can query this function for any warehouses in the account. | - A role with the MONITOR privilege on the WAREHOUSE can query this function for the warehouse it has permissions on. | - A role with the OWNERSHIP privilege on the WAREHOUSE has all permissions on the warehouse including MONITOR. | (https://docs.snowflake.com/en/sql-reference/functions/warehouse_load_history)

Parameters:
  • date_range_start (int) – start of hours/days ago interval

  • date_range_end (int) – end of hours/days ago interval

  • interval (str) – set the date range to either “days” or “hours”

Returns:

A snowpark dataframe with the unioned input dataframes

Return type:

snowpark.DataFrame

Example

Get the load history for the warehouse for the last 12 hours

>> warehouse.load_history(12, 0, interval = “hour”)

Returns Pandas DataFrame With:
- START_TIME
- END_TIME
- WAREHOUSE_NAME
- AVG_RUNNING
- AVG_QUEUED_LOAD
- AVG_QUEUED_PROVISIONING
- AVG_BLOCKED
metering_history(date_range_start: int, date_range_end: int, interval: str = 'hour')#

This table function can be used in queries to return the hourly credit usage for a single warehouse (or all the warehouses in your account) within a specified date range. This funciton requires elevated privileges to run, either: | - The ACCOUNTADMIN role can get results from this function as it has all of the global account permissions. | - A role with the MONITOR USAGE global privilege on the ACCOUNT can query this function for any warehouses in the account. (https://docs.snowflake.com/en/sql-reference/functions/warehouse_metering_history)

Parameters:
  • date_range_start (int) – start of hours/days ago interval

  • date_range_end (int) – end of hours/days ago interval

  • interval (str) – set the date range to either “days” or “hours”

Returns:

A snowpark dataframe with the unioned input dataframes

Return type:

snowpark.DataFrame

Example

Get the load history for the warehouse for the last 12 hours

>> warehouse.metering_history(12, 0, interval = “hour”)

Returns Pandas DataFrame With:
- START_TIME
- END_TIME
- WAREHOUSE_NAME
- CREDITS_USED
- CREDITS_USED_COMPUTE
- CREDITS_USED_CLOUD_SERVICES
query_history(date_range_start: int, date_range_end: int, interval: str = 'hour', result_limit: int = 1000)#
resize_recommendation(auto_apply: bool = False) str#

Keeping this simple as a starting point. In most cases rules should be customized for individual use cases. Rules: | - If: Local disk spillage over the last 3 days > 30% for the longest running queries | - Then: recommend size up warehouse | | - If: Local disk spillage over the last 3 days < 2% for the longest running queries | - Then: recommend downsizing warehouse | | - If: Max queued overload time > 10 minutes | - And: Local disk spillage over the last 3 days < 2% for the longest running queries | - Then: recommend scaling out | | - If: Max queued overload time > 10 minutes | - And: Local disk spillage over the last 3 days > 2% for the longest running queries | - Then: recommend scaling up

resize(wh_size: str)#

Resize the warehouse to specified size Warehouse sizes: - XSMALL , ‘X-SMALL’ - SMALL - MEDIUM - LARGE - XLARGE , ‘X-LARGE’ - XXLARGE , X2LARGE , ‘2X-LARGE’ - XXXLARGE , X3LARGE , ‘3X-LARGE’ - X4LARGE , ‘4X-LARGE’ - X5LARGE , ‘5X-LARGE’ - X6LARGE , ‘6X-LARGE’

Parameters:

wh_size (str) – specified warehouse size

Returns:

a string with execution status

Return type:

str

Example

Resize the warehouse to “SMALL” size

>> warehouse.resize(“SMALL”)
set_auto_suspend(seconds: int)#

Specifies the number of seconds of inactivity after which a warehouse is automatically suspended. | - Setting a value less than 60 is allowed, but may not result in the desired/expected behavior because the background process that suspends a warehouse runs approximately every 60 seconds and, therefore, is not intended for enabling exact control over warehouse suspension. | - Setting a 0 or NULL value means the warehouse never suspends

Parameters:

seconds (int) – the number of seconds of inactivity after which a warehouse is automatically suspended

Returns:

a string with execution status

Return type:

str

Example

Set the warehouse timeout to 60 seconds

>> warehouse.auto_suspend(60)
class ice_pick.Role(session, name: str)#

Bases: AccountObject

mock_ddl() str#

Return the ddl of objects that don’t work with the Snowflake GET_DDL() function

Supports: - WAREHOUSE - USER - ROLE

show_grants_of()#
show_grants_to()#
show_future_grants()#
show_grants_recursive()#

If a role is granted to another role, this function will also look at the privileges of the granted roles

class ice_pick.User(session, name: str)#

Bases: AccountObject

mock_ddl() str#

Return the ddl of objects that don’t work with the Snowflake GET_DDL() function

Supports: - WAREHOUSE - USER - ROLE

show_grants_to() pandas.DataFrame#

return dataframe with grant information

get_roles() list#

return granted roles as Role objects

get_all_privileges() pandas.DataFrame#

return all privileges for a user

check_privilege(schema_object: ice_pick.schema_object.SchemaObject) list#

maybe a quick way to show privilege on object?

class ice_pick.Database(session, name: str)#

Bases: AccountObject

class ice_pick.Schema(session, name: str)#

Bases: AccountObject

class ice_pick.Integration(session, name: str)#

Bases: AccountObject

class ice_pick.NetworkPolicy(session, name: str)#

Bases: AccountObject

class ice_pick.ResourceMonitor(session, name: str)#

Bases: AccountObject

class ice_pick.Privilege#

can pass any schema object or account object Mostly here just to help validation

object: ice_pick.schema_object.SchemaObject | ice_pick.account_object.AccountObject | ice_pick.account_object.Account#
definition: str#
__post_init__()#
class ice_pick.Grant(session: snowflake.snowpark.Session, privilege: Privilege, role: ice_pick.account_object.Role, grant_option: str = None, future_str: str = None)#
__repr__()#

Return repr(self).

check_exists() bool#
execute_grant()#

create the grant

revoke()#
validate(GrantTree)#
ice_pick.extend_session(Session: extend_session.Session) extend_session.Session#

Returns the extended Session class

Parameters:

session (Session) – Snowpark Session

Returns:

The exteneded Snowpark Session

Return type:

Session

Example

>> session = extend_session(Session).builder.configs(connection_parameters).create()

ice_pick.concat_standalone(session: snowflake.snowpark.Session, union_dfs: list) snowflake.snowpark.DataFrame#

Returns a unioned dataframe from the input list of dataframes based on column names. Primarly to handle cases where the number of columns do not match, which is not suppored by the base union function. If columns do not match, non-matching columns are added with null values to the base dataframes.

Parameters:
  • session (Session) – session object

  • union_dfs (list) – A list of the input snowpark dataframes to union

Returns:

A snowpark dataframe with the unioned input dataframes

Return type:

snowpark.DataFrame

Example

>> schema_1 = StructType([StructField(“a”, IntegerType()), StructField(“b”, StringType())])
>> schema_2 = StructType([StructField(“a”, FloatType()), StructField(“c”, StringType())])
>> schema_3 = StructType([StructField(“a”, IntegerType()), StructField(“c”, StringType())])
>> schema_4 = StructType([StructField(“c”, StringType()), StructField(“d”, StringType())])
>> df_1 = session.create_dataframe([[1, “snow”], [3, “flake”]], schema_1)
>> df_2 = session.create_dataframe([[2.0, “ice”], [4.0, “pick”]], schema_2)
>> df_3 = session.create_dataframe([[6, “test_1”], [7, “test_2”]], schema_3)
>> df_4 = session.create_dataframe([[“testing_d”, “testing_f”], [“testing_g”, “testing_h”]], schema_4)
>> union_dfs = [df_1, df_2, df_3, df_4]
>> unioned_df = auto_union(session, union_dfs)
>> unioned_df.show()
—————————————-
|”A” |”B” |”C” |”D” |
—————————————-
|1.0 |snow |NULL |NULL |
|3.0 |flake |NULL |NULL |
|2.0 |NULL |ice |NULL |
|4.0 |NULL |pick |NULL |
|6.0 |NULL |test_1 |NULL |
|7.0 |NULL |test_2 |NULL |
|NULL |NULL |testing_d |testing_f |
|NULL |NULL |testing_g |testing_h |
—————————————-
ice_pick.melt_standalone(session: snowflake.snowpark.Session, df: snowflake.snowpark.DataFrame, id_vars: list, value_vars: list, var_name: str = 'variable', value_name: str = 'value') snowflake.snowpark.DataFrame#

Returns a unioned dataframe from the input list of dataframes based on column names. Primarly to handle cases where the number of columns do not match, which is not suppored by the base union function. If columns do not match, non-matching columns are added with null values to the base dataframes.

Parameters:
  • session (Session) – session object

  • df (snowpark.DataFrame) – A snowpark dataframe to melt

  • id_vars (list) – Column names to use as identifiers

  • value_vars (list) – Column names to unpivot

  • var_name (str, default 'variable') – Name of the variable column

  • value_name (str, default 'value') – Name of the value column

Returns:

A snowpark dataframe with the unpivoted dataframes

Return type:

snowpark.DataFrame

Example

>> schema = StructType([StructField(“A”, StringType()), StructField(“B”, IntegerType()), StructField(“C”, IntegerType())])
>> df = session.create_dataframe([[‘a’, 1, 2], [‘b’, 3, 4], [‘c’, 5, 6]], schema)
>> melt_df = session.melt(df, [‘A’], [‘B’, ‘C’])
>> melt_df.show()
——————————
|”A” |”VALUE” |”VARIABLE” |
——————————
|a |1 |B |
|b |3 |B |
|c |5 |B |
|a |2 |C |
|b |4 |C |
|c |6 |C |
——————————