Starburst Galaxy

  •  Get started

  •  Working with data

  •  Data engineering

  •  Developer tools

  •  Cluster administration

  •  Security and compliance

  •  Troubleshooting

  • Galaxy status

  •  Reference

  • Schedule jobs #

    Starburst Galaxy provides several ways to perform maintenance tasks on a set schedule:

    SQL maintenance jobs #

    SQL maintenance job scheduling is best suited for data maintenance statements, such as CREATE, DROP, REFRESH, MERGE and TRUNCATE. Statements that return results are not supported, which excludes statements that begin with SELECT; however, SELECT as part of a statement is supported.

    In the navigation menu, click Jobs. The SQL tab is where you create, view, search for, and manage SQL maintenance jobs.

    All of the SQL maintenance jobs you create are displayed in a list, and each job in the list includes the following columns:

    • Name: The name of the SQL job. Click the column heading to sort in ascending or descending alphabetical order.
    • Description: The description provided for the SQL job.
    • Last run status: When the SQL job was last run.
    • Executing role: The role running the SQL job.
    • Last run ended: The date and time the last SQL job run ended.
    • Next run starts: The next date and time the SQL job is scheduled to start running. If the schedule is paused, the status of the schedule also appears here.

      All SQL jobs pane

    Create a SQL maintenance job #

    To run a statement on a recurring schedule, click Create SQL job, then provide the following information in the New SQL Job dialog:

    • In the Name and description section, enter a name for the job and a useful description.

    • In the Query section:

      1. Expand the first drop-down menu, and choose a cluster to run the statement on. We recommend using a fault tolerant cluster.

      2. From the drop-down menu, select a role to run the statement. The role must have the Manage Security privilege.

      3. Enter the statement you want to run.

    • In the Schedule section:

      1. Choose a time zone from the drop-down menu.

      2. Choose the Select frequency or Enter cron expression recurring interval format.

      For Select frequency: Choose an hourly, daily, weekly, monthly, or annual schedule from the drop-down menu. The corresponding values depend on the schedule selected:

      • Hourly: Enter a value between 1 minute and 59 minutes.
      • Daily: Enter a time in the format hh:mm, then specify AM or PM.
      • Weekly: Enter a time in the format hh:mm, specify AM or PM, then select a day of the week.
      • Monthly: Enter a time in the format hh:mm, specify AM or PM, then select a date.
      • Annually: Enter a month, day, hour, and minutes in the format MM/DD hh:mm. Specify AM or PM.

      For Enter cron expression: Enter the desired schedule in the form of a cron expression. For example, a SQL job run weekly at 9:30 AM on Monday, Wednesday, and Friday:

    30 9 * * 1,3,5
    
    • Click Create SQL job.

      New SQL job dialog

    View SQL job details #

    To view the details of a SQL job, click the name of the job. The header of the Job details pane displays the following information about your SQL job:

    • Description: The description provided for the SQL job.
    • Next run: The next date and time the SQL job is scheduled to run. If the schedule is paused, the status of the schedule also appears here.
    • Cluster: The chosen cluster.
    • Executing role: The role running the SQL job.

    Run now allows you to run the SQL job instantly.

    Completed statements and statements in progress appear in the Job history section, which displays the following information:

    • Query ID: A unique identifier for each statement. Click the Query ID to view statement details.
    • Status: check_circle for successfully completed statements and close for failed statements.
    • Started: Date and time the statement started running.
    • Elapsed time: Total duration for processing the statement.
    • Query progress: How much of the process is completed; shown as a percentage.

    To see statement details, click the Query ID.

    Scheduled query sql jobs job details

    Manage SQL maintenance jobs #

    You can manage SQL maintenance jobs in the SQL jobs pane and Job details pane. Click themore_vertoptions menu to edit, delete, pause, or resume the selected SQL maintenance job.

    MV refresh jobs #

    Materialized view refresh is only available for object storage catalogs.

    In the navigation menu, click Jobs. The Materialized view refresh tab is where you view, search for, and manage materialized view refresh jobs. Create a materialized view refresh job with SQL commands as described in this section.

    All of the materialized view refresh jobs you create are displayed in a list with the following information:

    • Job name: The name of the materialized view refresh job. Click the column heading to sort in ascending or descending alphabetical order.
    • Last run status: When the materialized view refresh job was last run.
    • Executing role: The role running the materialized view refresh job.
    • Last run ended: The date and time the last materialized view refresh job run ended.
    • Next run starts: The next date and time the materialized view refresh job is scheduled to start running. If the schedule is paused, the status of the schedule also appears here.

    Create a MV refresh job #

    To create a materialized view refresh job, go to the query editor, and write a statement using CREATE MATERIALIZED VIEW that includes the refresh_schedule property. The refresh_schedule property must be set with the frequency of the refresh as a cron expression:

    CREATE MATERIALIZED VIEW my_refresh_materialization
    WITH (
      refresh_schedule = '0 0 1 * *'
    ) AS
        SELECT *
        FROM lakehouse.burst_bank.customer
    

    Runs that are completed successfully appear automatically on the Materialized refresh view tab.

    To make changes to a materialized view refresh job or to schedule a refresh job for an existing materialized view, use the ALTER MATERIALIZED VIEW statement.

    To drop a materialized view refresh job, use the DROP MATERIALIZED VIEW statement. To remove a refresh job from a materialized view without dropping it, set the refresh_schedule property to DEFAULT.

    View MV refresh job details #

    To view the details of a materialized view refresh job, click the name of the job. The header of the Job details pane displays the following information about your materialized view refresh job:

    • Description: A read-only, auto generated description for the materialized view refresh job.
    • Next run: The next date and time the materialized view refresh job is scheduled to run. If the schedule is paused, the status of the schedule also appears here.
    • Cluster: The chosen cluster.
    • Executing role: The role running the materialized view refresh job.

    Run now allows you to run the materialized view refresh job instantly.

    Completed statements and statements in progress appear in the Job history section, which displays the following information:

    • Query ID: A unique identifier for each statement. Click the Query ID to view statement details.
    • Status: check_circle for successfully completed statements and close for failed statements.
    • Started: Date and time the statement started running.
    • Elapsed time: Total duration for processing the statement.
    • Query progress: How much of the process is completed; shown as a percentage.

    To see statement details, click the Query ID.

    Scheduled tasks MV refresh job details

    Manage MV refresh jobs #

    You can manage materialized view refresh jobs in the Materialized view refresh jobs pane and Job details pane. Click themore_vertoptions menu to edit, delete, pause, or resume the selected SQL job.

    Data optimization jobs #

    For object storage catalogs only, you can run table maintenance tasks that improve performance and reduce storage. The available maintenance tasks differ, depending on the table type.

    For further information, see Schedule optimization jobs.

    Data classification jobs #

    You can automatically classify data in catalogs, schemas, tables, or views in order to apply attribute tags to that data.

    For details, see Automatic data tagging.