Clusters
A cluster is a set of computation applications and configurations that lets you run data transformation workflows such as ETL pipelines, batch processing, and machine learning.
This section covers creating clusters in the different cluster modes, along with the configuration options and management tools that vary by mode. Learn how to create and manage your clusters.
Getting started
Create clusters
Manage clusters
Create clusters
To create a cluster, click the "Cluster" tab at the top of the navigation bar, then click "Create cluster" in the sidebar.
This opens the create cluster page, where you name and configure the cluster for your workflow. The configuration options are described in detail below.
After filling in all the required information, click the "Create cluster" button at the bottom of the page to submit. The create cluster page has the following configuration parameters:
Cluster name: Enter a name for the cluster in the cluster name field.
Cluster mode: Specify the cluster mode you intend to use. The cluster mode you choose determines which other configuration fields appear. To select a cluster mode, click the drop-down and choose "Light Job computes", "Job computes", or "All-purpose computes", depending on what best suits your data and workflow.
Autopilot Options
When you create a cluster on our data platform, you can use the autopilot options to reduce the operational cost of managing clusters, optimize clusters for production, and achieve higher workload availability.
The autopilot options for your clusters are "Enable Autoscaling" and "Manual Scaling".
Enable and Configure Autoscaling
To let our data platform resize your cluster automatically, enable autoscaling for the cluster and provide the minimum and maximum number of workers.
Enable Autoscaling
- Job computes: When you choose the "Job computes" cluster mode, you can either enable autoscaling or manually set the termination time. On the create cluster page, select the "Enable autoscaling" checkbox under the autopilot options to activate autoscaling.
  Terminate after: To set the termination time manually, select the "Terminate after" checkbox and choose the number of minutes of inactivity.
- All-purpose computes: When you select the "All-purpose computes" cluster mode, you can use any of the autopilot options. On the create cluster page, select the "Enable autoscaling" checkbox under the autopilot options to activate autoscaling.
Note
You can provide a minimum and maximum number of workers only if you enable the autoscaling option when creating your cluster. If you choose the auto termination option alone, you can set only a single number of workers.
Tips
Autoscaling is not available for Light Job computes, but it is available for Job computes and All-purpose computes.
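The autoscaling rules above can be sketched as a small validation routine. This is only an illustration under stated assumptions, not the platform's API: `AutopilotOptions`, `validate_autopilot`, and every field name are hypothetical, and the check that the minimum does not exceed the maximum is an assumption that this guide does not state.

```python
from dataclasses import dataclass
from typing import List, Optional

# Cluster modes that offer autoscaling, as described in this guide.
AUTOSCALING_MODES = {"Job computes", "All-purpose computes"}


@dataclass
class AutopilotOptions:
    """Hypothetical container for the autopilot settings of one cluster."""
    cluster_mode: str
    autoscaling: bool = False
    min_workers: Optional[int] = None
    max_workers: Optional[int] = None
    num_workers: Optional[int] = None  # fixed size when autoscaling is off


def validate_autopilot(opts: AutopilotOptions) -> List[str]:
    """Return a list of problems; an empty list means the options are consistent."""
    problems = []
    if opts.autoscaling:
        if opts.cluster_mode not in AUTOSCALING_MODES:
            problems.append(f"autoscaling is not available for {opts.cluster_mode}")
        if opts.min_workers is None or opts.max_workers is None:
            problems.append("autoscaling needs both a minimum and a maximum number of workers")
        elif opts.min_workers > opts.max_workers:
            # Assumption: the guide does not state this rule explicitly.
            problems.append("minimum workers cannot exceed maximum workers")
    elif opts.min_workers is not None or opts.max_workers is not None:
        problems.append("minimum and maximum workers require autoscaling")
    return problems
```

For example, `validate_autopilot(AutopilotOptions("Job computes", autoscaling=True, min_workers=2, max_workers=8))` returns an empty list, while the same options with the "Light Job computes" mode report that autoscaling is not available.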
Worker Type
To select the worker type, click "Workers type" and choose the worker type for your cluster from the drop-down list. You can also provide a minimum and maximum number of workers for your cluster. The Job computes and All-purpose computes cluster modes support worker types.
To set the minimum and maximum number of workers, use the drop-down arrows to choose the numbers that best fit your cluster.
Supervisor Type
To select the supervisor type, click "Supervisor type" and choose a type for your cluster from the drop-down list. The Job computes and All-purpose computes cluster modes support supervisor types.
Tips
Light Job computes does not support worker types or supervisor types.
Node Type
When you create a cluster and choose the Light Job computes mode, the "Node Type" field appears in the configuration. To select a node type, click the "Node Type" drop-down list and choose a node for your cluster.
Tips
The node type is not available for the Job computes and All-purpose computes cluster modes.
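The way the cluster mode decides which configuration fields apply can be summarized as a lookup table. The sketch below is a hypothetical model of the rules in this section; the field identifiers (`node_type`, `worker_type`, `supervisor_type`, `autoscaling`) and both function names are invented for illustration and are not part of the platform.

```python
# Mode-specific fields per cluster mode, following the rules in this guide.
# The identifiers are hypothetical stand-ins for the form fields.
FIELDS_BY_MODE = {
    "Light Job computes": {"node_type"},
    "Job computes": {"worker_type", "supervisor_type", "autoscaling"},
    "All-purpose computes": {"worker_type", "supervisor_type", "autoscaling"},
}


def visible_fields(cluster_mode: str) -> set:
    """Return the mode-specific fields that apply to the given cluster mode."""
    try:
        return FIELDS_BY_MODE[cluster_mode]
    except KeyError:
        raise ValueError(f"unknown cluster mode: {cluster_mode!r}") from None


def unsupported_fields(cluster_mode: str, provided: set) -> set:
    """Return the provided fields that the given cluster mode does not support."""
    return set(provided) - visible_fields(cluster_mode)
```

Calling `unsupported_fields("Job computes", {"node_type", "worker_type"})` flags only `node_type`, which matches the tip that the node type applies to Light Job computes alone.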
Manage clusters
This section describes how to manage the clusters you have created, including displaying, cloning, restarting, terminating, and deleting clusters and monitoring their performance.
Display cluster
Click "Cluster" in the navigation bar to display your clusters.
This page shows the details of every cluster you have created and lets you manage them. The display page includes the following fields:
Cluster name
State of cluster
Nodes
Workers
Supervisor
Autoscaling
Created on
Action
Cloning a cluster
You can clone a cluster from the cluster display page. Click the "Clone" icon under "Action".
You can also clone a cluster from the cluster detail page when creating a new cluster.
Restart a cluster
You can restart a previously terminated cluster to resume using it.
You can restart a cluster from the cluster display list or from the cluster detail page.
Terminate/Stop a cluster
You can manually terminate a cluster or configure it to terminate automatically after a specified period of inactivity. A terminated cluster cannot run jobs.
Manual termination
You can manually terminate a cluster from the cluster display list.
You can also manually terminate a cluster from the cluster detail page.
Automatic termination
You can have a cluster terminate automatically after a certain period of inactivity. You configure this when creating the cluster, in the auto termination field under the autopilot options on the create cluster page.
Select the auto termination checkbox and set the inactivity period in minutes.
To opt out of auto termination, clear the auto termination checkbox or set the inactivity period to 0.
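The auto termination rule can be expressed as a single check. The following is a minimal sketch, not the platform's implementation: `should_auto_terminate` and its parameters are hypothetical, and treating "idle time equal to the limit" as a reason to terminate is an assumption.

```python
from typing import Optional


def should_auto_terminate(idle_minutes: float, inactivity_limit: Optional[int]) -> bool:
    """Decide whether an idle cluster has reached its auto termination limit.

    inactivity_limit is the configured inactivity period in minutes.
    None (checkbox cleared) or 0 means auto termination is turned off.
    """
    if inactivity_limit is not None and inactivity_limit < 0:
        raise ValueError("inactivity limit cannot be negative")
    if not inactivity_limit:  # None or 0: the user opted out
        return False
    # Assumption: the cluster terminates once idle time reaches the limit.
    return idle_minutes >= inactivity_limit
```

With a limit of 30 minutes, a cluster that has been idle for 45 minutes would be terminated, while a limit of 0 keeps the cluster running no matter how long it stays idle.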
Delete a cluster
When you delete a cluster, the cluster is permanently terminated and its configuration is removed.
Warning
Once you delete a cluster, you cannot undo the action.
You can delete a cluster from the cluster display list or from the cluster detail page.
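The restart, terminate, and delete actions in this section can be read as a small state model. The sketch below is a hypothetical illustration: the state names, the action strings, and both functions are assumptions, and it models only the transitions this guide mentions (for instance, it does not say whether a running cluster can be restarted).

```python
from enum import Enum


class ClusterState(Enum):
    """Hypothetical cluster states; the platform's actual state names may differ."""
    RUNNING = "running"
    TERMINATED = "terminated"
    DELETED = "deleted"


# (current state, action) -> resulting state, limited to what this guide describes.
TRANSITIONS = {
    (ClusterState.RUNNING, "terminate"): ClusterState.TERMINATED,
    (ClusterState.TERMINATED, "restart"): ClusterState.RUNNING,
    (ClusterState.RUNNING, "delete"): ClusterState.DELETED,
    (ClusterState.TERMINATED, "delete"): ClusterState.DELETED,
}


def apply_action(state: ClusterState, action: str) -> ClusterState:
    """Return the new state, or raise ValueError if the action is not allowed."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action} a {state.value} cluster") from None


def can_run_jobs(state: ClusterState) -> bool:
    """A terminated or deleted cluster cannot run jobs."""
    return state is ClusterState.RUNNING
```

Because no transition leaves `DELETED`, any action on a deleted cluster raises an error, which mirrors the warning that deletion cannot be undone.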