ice_pick#
Contains core classes of Ice Pick
Submodules#
Package Contents#
Classes#
Represents a Snowflake Schema object. |
|
A filter that can be used to return multiple SchemaObjects |
|
A filter that can be used to return multiple AccountObjects |
|
(TODO - see how this is handled in snowflake) |
|
can pass any schema object or account object |
|
Functions#
|
Returns the extended Session class |
|
Returns a unioned dataframe from the input list of dataframes based on column names. |
|
Returns a unioned dataframe from the input list of dataframes based on column names. |
- class ice_pick.SchemaObject#
Represents a Snowflake Schema object.
Schema Objects Include: ALERTS, EXTERNAL FUNCTIONS, EXTERNAL TABLES, FILE FORMATS, MATERIALIZED VIEWS, MASKING POLICIES, PASSWORD POLICIES, PIPES, PROCEDURES, ROW ACCESS POLICIES, SECRETS, SESSION POLICIES, SEQUENCES, STAGES, STREAMS, TABLES, TAGS, TASKS, USER FUNCTIONS, VIEWS, *EXTERNAL FUNCTIONS,
PROCEDURES, *USER FUNCTIONS * Also note some of these objects require the enterprise account
- session#
Snowpark Session
- Type:
Session
- database#
database that the object is in
- Type:
str
- schema#
schema that the object is in
- Type:
str
- object_name#
the name of the object
- Type:
str
- object_type#
the type of schema object
- Type:
str
- session: snowflake.snowpark.Session#
- database: str = 'SNOWFLAKE'#
- schema: str = ''#
- object_name: str = ''#
- object_type: str = ''#
- get_ddl(save: bool = False, fully_qualified: bool = True) str#
Return the ddl of the schema object as a string if save = True: save the ddl locally The default save path is: DDL/database/schema/object_type/database.schema.object_name.sql
- Parameters:
save (bool = False) – save the ddl as a file locally
- Returns:
A string with the ddl
- Return type:
str
- get_description() str#
Return the description of the schema object as a string
- get_grants_on() list#
Return a list of grants on the schema object as a list
- get_grant_objects() list#
- grant(privilege: list, grantee: str) str#
grant access on object, return status
– For TABLE{ SELECT | INSERT | UPDATE | DELETE | TRUNCATE | REFERENCES } [ , … ]– For VIEW{ SELECT | REFERENCES } [ , … ]– For MATERIALIZED VIEW{ SELECT | REFERENCES } [ , … ]– For SEQUENCE, FUNCTION (UDF or external function), PROCEDURE, or FILE FORMATUSAGE– For internal STAGEREAD [ , WRITE ]– For external STAGEUSAGE– For PIPE{ MONITOR | OPERATE } [ , … ]– For STREAMSELECT– For TASK{ MONITOR | OPERATE } [ , … ]– For MASKING POLICYAPPLY– For PASSWORD POLICYAPPLY– For ROW ACCESS POLICYAPPLY– For SESSION POLICYAPPLY– For TAGAPPLY– For ALERTOPERATE– For SECRETUSAGE
- create(create_method: str = 'default', ddl: str = '', sql_ext: str = '', create_if_exists: bool = False)#
create in snowflake if not exists. For now this is very dependant on object type. Usualy the additional stuff comes after the object name, which can be provided by the sql_ext param. (todo: sql ext could just make more confusing - maybe need to create specific extension objects)
- create_methods:
default:
ddl: user provided ddl string
ddl_state: use ddl from get_ddl() function
- class ice_pick.SchemaObjectFilter#
A filter that can be used to return multiple SchemaObjects
Apply selection first, then filter out ingore objects.
- Filters are applied by:
database -> ignore_dbs -> schema -> ignore_schemas -> object type -> object name
“.*” can be used to return all (regex supported)
- session#
Snowpark Session
- Type:
Session
- databases#
databases that wil be searched
- Type:
list
- schemas#
schemas that wil be searched
- Type:
list
- object_names#
the name of the objects to be searched for
- Type:
list
- object_types#
the type of schema objects to be searched for
- Type:
list
- ignore_dbs#
databases to be ignored in search
- Type:
list
- ignore_schemas#
schemas to be ignored in search
- Type:
list
- session: snowflake.snowpark.Session#
- databases: list#
- schemas: list#
- object_names: list#
- object_types: list#
- ignore_dbs: list#
- ignore_schemas: list#
- _filter_schema_objects_helper(objects_df: pandas.DataFrame, filtered_dbs: str, filtered_schemas: str, obj_type: str) pandas.DataFrame#
a helper function for filtering dataframe for objects
- _filter_schema_objects(filtered_dbs: str, filtered_schemas: str) pandas.DataFrame#
- helper function to get all schema level object info
get the object types that are selected
get the object info (database, schema, object type, object name)
The object info returned depends on the selected object types (see schema_level_exceptions)
- return_schema_objects() List[ice_pick.schema_object.SchemaObject]#
Filter objects based on input objects If the property is a wildcard “.*”, then search all objects at that level (inputs are passed as regex)
If exclude is set to true, everything matched will be ignored, and all non-matches are returned
- Parameters:
None –
- Returns:
a list of schema objects that matched the filter cases
- Return type:
List[SchemaObjects]
Example
Get all procedures in all databases:>> SchemaObjectFilter([“.*”], [“.*”], [“.*”], [“procedure”])Get all tables and vies in a single database:>> SchemaObjectFilter([“TEST_DB”], [“.*”], [“.*”], [“table”, “view”])Get all tables except for the sample tables:>> SchemaObjectFilter([“.*”], [“.*”],[“.*”], [“table”], ingore_dbs = [“SNOWFLAKE”, “SNOWFLAKE_SAMPLE_DATA”]Get specific tables:>> SchemaObjectFilter([“snowflake”], [“sample_data”], [“customer”, “transactions”], [“table”])
- class ice_pick.AccountObjectFilter#
A filter that can be used to return multiple AccountObjects
Apply selection first, then filter out ingore objects.
- Filters are applied by:
object_types -> object_names -> ignore_names
“.*” can be used to return all (regex supported)
- session#
Snowpark Session
- Type:
Session
- object_names#
objects that will be searched
- Type:
list
- object_types#
object types that will be searched
- Type:
list
- ignore_names#
names to be ignored in search
- Type:
list
- session: snowflake.snowpark.Session#
- object_names: list#
- object_types: list#
- ignore_names: list#
- _query_account_object_helper() Dict[str, pandas.DataFrame]#
get all avialable account objects based on type Return a dictionary of the object type and the assocated pandas dataframe (todo - compare vs just getting everything then filtering)
- _filter_name_account_objects(account_object_collection: Dict[str, pandas.DataFrame]) Dict[str, pandas.DataFrame]#
Filter objects to only selected names
- _filter_ignore_account_objects(account_object_collection: Dict[str, pandas.DataFrame]) Dict[str, pandas.DataFrame]#
Filter objects to remove “ignore” names
- __create_account_object_instances(object_type: str, object_names: list) List[ice_pick.account_object.AccountObject]#
create a single account object helper for creating all account objects
- _create_account_objects(account_object_collection: Dict[str, pandas.DataFrame]) List[ice_pick.account_object.AccountObject]#
create the objects frm the account object collection
- return_account_objects() List[ice_pick.account_object.AccountObject]#
Return all account objects matching the filter
- class ice_pick.AccountObject(session, name: str, object_type: str)#
- __repr__()#
Return repr(self).
- get_description()#
Return the description of the object as a string
Supports: - DATABASE - SCHEMA - INTEGRATION - NETWORK POLICY - SHARE - USER - WAREHOUSE
- get_ddl(fully_qualified: bool = True) str#
Return the ddl of the account object as a string
Supports: - DATABASE - SCHEMA
- mock_ddl() str#
Return the ddl of objects that don’t work with the Snowflake GET_DDL() function
Supports: - WAREHOUSE - USER - ROLE
- get_grants_on()#
Supports:
- get_grant_objects() list#
- drop()#
drops object
Supports: most objects
- un_drop()#
undrops a dropped object Supports - database - schema
- create(replace: bool = False)#
Supports:
- class ice_pick.Account(session)#
(TODO - see how this is handled in snowflake) For now here just as a reminder and to help with global privileges Some things can only be accessed at the account level
- class ice_pick.Warehouse(session, name: str)#
Bases:
AccountObject- mock_ddl()#
Return the ddl of objects that don’t work with the Snowflake GET_DDL() function
Supports: - WAREHOUSE - USER - ROLE
- suspend()#
- resume()#
- load_history(date_range_start: int, date_range_end: int, interval: str = 'hour') pandas.DataFrame#
This function returns warehouse activity within the last 14 days. This funciton requires elevated privileges to run, either: | - The ACCOUNTADMIN role can get results from this function as it has all of the global account permissions. | - A role with the MONITOR USAGE global privilege on the ACCOUNT can query this function for any warehouses in the account. | - A role with the MONITOR privilege on the WAREHOUSE can query this function for the warehouse it has permissions on. | - A role with the OWNERSHIP privilege on the WAREHOUSE has all permissions on the warehouse including MONITOR. | (https://docs.snowflake.com/en/sql-reference/functions/warehouse_load_history)
- Parameters:
date_range_start (int) – start of hours/days ago interval
date_range_end (int) – end of hours/days ago interval
interval (str) – set the date range to either “days” or “hours”
- Returns:
A snowpark dataframe with the unioned input dataframes
- Return type:
snowpark.DataFrame
Example
Get the load history for the warehouse for the last 12 hours
>> warehouse.load_history(12, 0, interval = “hour”)Returns Pandas DataFrame With:- START_TIME- END_TIME- WAREHOUSE_NAME- AVG_RUNNING- AVG_QUEUED_LOAD- AVG_QUEUED_PROVISIONING- AVG_BLOCKED
- metering_history(date_range_start: int, date_range_end: int, interval: str = 'hour')#
This table function can be used in queries to return the hourly credit usage for a single warehouse (or all the warehouses in your account) within a specified date range. This funciton requires elevated privileges to run, either: | - The ACCOUNTADMIN role can get results from this function as it has all of the global account permissions. | - A role with the MONITOR USAGE global privilege on the ACCOUNT can query this function for any warehouses in the account. (https://docs.snowflake.com/en/sql-reference/functions/warehouse_metering_history)
- Parameters:
date_range_start (int) – start of hours/days ago interval
date_range_end (int) – end of hours/days ago interval
interval (str) – set the date range to either “days” or “hours”
- Returns:
A snowpark dataframe with the unioned input dataframes
- Return type:
snowpark.DataFrame
Example
Get the load history for the warehouse for the last 12 hours
>> warehouse.metering_history(12, 0, interval = “hour”)Returns Pandas DataFrame With:- START_TIME- END_TIME- WAREHOUSE_NAME- CREDITS_USED- CREDITS_USED_COMPUTE- CREDITS_USED_CLOUD_SERVICES
- query_history(date_range_start: int, date_range_end: int, interval: str = 'hour', result_limit: int = 1000)#
- resize_recommendation(auto_apply: bool = False) str#
Keeping this simple as a starting point. In most cases rules should be customized for individual use cases. Rules: | - If: Local disk spillage over the last 3 days > 30% for the longest running queries | - Then: recommend size up warehouse | | - If: Local disk spillage over the last 3 days < 2% for the longest running queries | - Then: recommend downsizing warehouse | | - If: Max queued overload time > 10 minutes | - And: Local disk spillage over the last 3 days < 2% for the longest running queries | - Then: recommend scaling out | | - If: Max queued overload time > 10 minutes | - And: Local disk spillage over the last 3 days > 2% for the longest running queries | - Then: recommend scaling up
- resize(wh_size: str)#
Resize the warehouse to specified size Warehouse sizes: - XSMALL , ‘X-SMALL’ - SMALL - MEDIUM - LARGE - XLARGE , ‘X-LARGE’ - XXLARGE , X2LARGE , ‘2X-LARGE’ - XXXLARGE , X3LARGE , ‘3X-LARGE’ - X4LARGE , ‘4X-LARGE’ - X5LARGE , ‘5X-LARGE’ - X6LARGE , ‘6X-LARGE’
- Parameters:
wh_size (str) – specified warehouse size
- Returns:
a string with execution status
- Return type:
str
Example
Resize the warehouse to “SMALL” size
>> warehouse.resize(“SMALL”)
- set_auto_suspend(seconds: int)#
Specifies the number of seconds of inactivity after which a warehouse is automatically suspended. | - Setting a value less than 60 is allowed, but may not result in the desired/expected behavior because the background process that suspends a warehouse runs approximately every 60 seconds and, therefore, is not intended for enabling exact control over warehouse suspension. | - Setting a 0 or NULL value means the warehouse never suspends
- Parameters:
seconds (int) – the number of seconds of inactivity after which a warehouse is automatically suspended
- Returns:
a string with execution status
- Return type:
str
Example
Set the warehouse timeout to 60 seconds
>> warehouse.auto_suspend(60)
- class ice_pick.Role(session, name: str)#
Bases:
AccountObject- mock_ddl() str#
Return the ddl of objects that don’t work with the Snowflake GET_DDL() function
Supports: - WAREHOUSE - USER - ROLE
- show_grants_of()#
- show_grants_to()#
- show_future_grants()#
- show_grants_recursive()#
If a role is granted to another role, this function will also look at the privileges of the granted roles
- class ice_pick.User(session, name: str)#
Bases:
AccountObject- mock_ddl() str#
Return the ddl of objects that don’t work with the Snowflake GET_DDL() function
Supports: - WAREHOUSE - USER - ROLE
- show_grants_to() pandas.DataFrame#
return dataframe with grant information
- get_roles() list#
return granted roles as Role objects
- get_all_privileges() pandas.DataFrame#
return all privileges for a user
- check_privilege(schema_object: ice_pick.schema_object.SchemaObject) list#
maybe a quick way to show privilege on object?
- class ice_pick.Database(session, name: str)#
Bases:
AccountObject
- class ice_pick.Schema(session, name: str)#
Bases:
AccountObject
- class ice_pick.Integration(session, name: str)#
Bases:
AccountObject
- class ice_pick.NetworkPolicy(session, name: str)#
Bases:
AccountObject
- class ice_pick.ResourceMonitor(session, name: str)#
Bases:
AccountObject
- class ice_pick.Privilege#
can pass any schema object or account object Mostly here just to help validation
- object: ice_pick.schema_object.SchemaObject | ice_pick.account_object.AccountObject | ice_pick.account_object.Account#
- definition: str#
- __post_init__()#
- class ice_pick.Grant(session: snowflake.snowpark.Session, privilege: Privilege, role: ice_pick.account_object.Role, grant_option: str = None, future_str: str = None)#
- __repr__()#
Return repr(self).
- check_exists() bool#
- execute_grant()#
create the grant
- revoke()#
- validate(GrantTree)#
- ice_pick.extend_session(Session: extend_session.Session) extend_session.Session#
Returns the extended Session class
- Parameters:
session (Session) – Snowpark Session
- Returns:
The exteneded Snowpark Session
- Return type:
Session
Example
>> session = extend_session(Session).builder.configs(connection_parameters).create()
- ice_pick.concat_standalone(session: snowflake.snowpark.Session, union_dfs: list) snowflake.snowpark.DataFrame#
Returns a unioned dataframe from the input list of dataframes based on column names. Primarly to handle cases where the number of columns do not match, which is not suppored by the base union function. If columns do not match, non-matching columns are added with null values to the base dataframes.
- Parameters:
session (Session) – session object
union_dfs (list) – A list of the input snowpark dataframes to union
- Returns:
A snowpark dataframe with the unioned input dataframes
- Return type:
snowpark.DataFrame
Example
>> schema_1 = StructType([StructField(“a”, IntegerType()), StructField(“b”, StringType())])>> schema_2 = StructType([StructField(“a”, FloatType()), StructField(“c”, StringType())])>> schema_3 = StructType([StructField(“a”, IntegerType()), StructField(“c”, StringType())])>> schema_4 = StructType([StructField(“c”, StringType()), StructField(“d”, StringType())])>> df_1 = session.create_dataframe([[1, “snow”], [3, “flake”]], schema_1)>> df_2 = session.create_dataframe([[2.0, “ice”], [4.0, “pick”]], schema_2)>> df_3 = session.create_dataframe([[6, “test_1”], [7, “test_2”]], schema_3)>> df_4 = session.create_dataframe([[“testing_d”, “testing_f”], [“testing_g”, “testing_h”]], schema_4)>> union_dfs = [df_1, df_2, df_3, df_4]>> unioned_df = auto_union(session, union_dfs)>> unioned_df.show()—————————————-—————————————-—————————————-
- ice_pick.melt_standalone(session: snowflake.snowpark.Session, df: snowflake.snowpark.DataFrame, id_vars: list, value_vars: list, var_name: str = 'variable', value_name: str = 'value') snowflake.snowpark.DataFrame#
Returns a unioned dataframe from the input list of dataframes based on column names. Primarly to handle cases where the number of columns do not match, which is not suppored by the base union function. If columns do not match, non-matching columns are added with null values to the base dataframes.
- Parameters:
session (Session) – session object
df (snowpark.DataFrame) – A snowpark dataframe to melt
id_vars (list) – Column names to use as identifiers
value_vars (list) – Column names to unpivot
var_name (str, default 'variable') – Name of the variable column
value_name (str, default 'value') – Name of the value column
- Returns:
A snowpark dataframe with the unpivoted dataframes
- Return type:
snowpark.DataFrame
Example
>> schema = StructType([StructField(“A”, StringType()), StructField(“B”, IntegerType()), StructField(“C”, IntegerType())])>> df = session.create_dataframe([[‘a’, 1, 2], [‘b’, 3, 4], [‘c’, 5, 6]], schema)>> melt_df = session.melt(df, [‘A’], [‘B’, ‘C’])>> melt_df.show()