elasticluster.cluster

class elasticluster.cluster.Cluster(name, cloud_provider, setup_provider, user_key_name, user_key_public, user_key_private, repository=None, **extra)

This class is the heart of elasticluster and handles all cluster-related behavior: starting, setting up, and stopping a cluster. It also provides factory methods to add nodes to the cluster. A typical workflow is as follows:

  • create a new cluster
  • add nodes to fit your computing needs
  • start cluster; start all instances in the cloud
  • setup cluster; configure all nodes to fit your computing cluster
  • eventually stop cluster; destroys all instances in the cloud
Parameters:
Variables:

nodes – dict [node_type] = [Node] that represents all nodes in this cluster

add_node(kind, image_id, image_user, flavor, security_group, image_userdata='', name=None, **extra)

Adds a new node to the cluster. This factory method provides an easy way to add a new node by specifying all relevant parameters. The node is neither started nor set up automatically; this has to be done manually afterwards.

Parameters:
  • kind (str) – kind of node to start. This refers to the groups defined in the Ansible setup provider elasticluster.providers.AnsibleSetupProvider. Note that this must match the [a-zA-Z0-9-] regexp, as it is used to build a valid hostname.
  • image_id (str) – image ID to use for the cloud instance (e.g. an AMI on Amazon)
  • image_user (str) – user to log in as on the given image
  • flavor (str) – machine type to use for cloud instance
  • security_group (str) – security group that defines firewall rules to the instance
  • image_userdata (str) – commands to execute after instance starts
  • name (str) – name of this node, automatically generated if None
Raises:

ValueError: kind argument is an invalid string.

Returns:

created Node
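
The constraint on kind can be checked before calling add_node. The following is a minimal sketch, not elasticluster's own validation code: the helper names is_valid_kind and check_kind are hypothetical, and it assumes a valid kind is a non-empty string made up only of the characters in [a-zA-Z0-9-].

```python
import re

# Assumption: a kind is valid when it is non-empty and contains only the
# characters named in the documentation ([a-zA-Z0-9-]).
_KIND_PATTERN = re.compile(r"[a-zA-Z0-9-]+")


def is_valid_kind(kind):
    """Return True if `kind` uses only letters, digits and hyphens."""
    return bool(_KIND_PATTERN.fullmatch(kind))


def check_kind(kind):
    """Raise ValueError for an invalid kind, mirroring the documented error."""
    if not is_valid_kind(kind):
        raise ValueError("invalid node kind: %r" % (kind,))
    return kind
```

Under this assumption, a kind such as "compute" or "frontend-01" passes, while "front_end" is rejected because the underscore is not allowed in the hostname that is built from it.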

add_nodes(kind, num, image_id, image_user, flavor, security_group, image_userdata='', **extra)

Helper method to add multiple nodes of the same kind to a cluster.

Parameters:
  • kind (str) – kind of node to start. This refers to the groups defined in the Ansible setup provider elasticluster.providers.AnsibleSetupProvider.
  • num (int) – number of nodes of this kind to add
  • image_id (str) – image ID to use for the cloud instance (e.g. an AMI on Amazon)
  • image_user (str) – user to log in as on the given image
  • flavor (str) – machine type to use for cloud instance
  • security_group (str) – security group that defines firewall rules to the instance
  • image_userdata (str) – commands to execute after instance starts
get_all_nodes()

Returns a list of all nodes in this cluster as a mixed list of different node kinds.

Returns: list of Node
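
Because nodes is a dict that maps each node kind to a list of nodes, producing a mixed list amounts to flattening that dict. The sketch below illustrates the idea on plain data; flatten_nodes is a hypothetical helper, not part of elasticluster, and plain strings stand in for Node objects.

```python
from itertools import chain


def flatten_nodes(nodes_by_kind):
    """Flatten a {kind: [node, ...]} mapping into one list of nodes."""
    return list(chain.from_iterable(nodes_by_kind.values()))


# Strings stand in for Node objects in this illustration.
example = {"frontend": ["frontend001"], "compute": ["compute001", "compute002"]}
all_nodes = flatten_nodes(example)
```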
get_frontend_node()

Returns the first node of the class specified in the configuration file as ssh_to, or the first node of the first class in alphabetical order.

Returns: Node
Raises: elasticluster.exceptions.NodeNotFound if no valid frontend node is found
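
The selection rule can be sketched on plain data as follows. This is an illustration of the documented behavior rather than the library's code: pick_frontend is a hypothetical helper, LookupError stands in for elasticluster.exceptions.NodeNotFound, and skipping empty kinds (or an ssh_to kind that has no nodes) is an assumption.

```python
def pick_frontend(nodes_by_kind, ssh_to=None):
    """Pick a frontend node from a {kind: [node, ...]} mapping."""
    # Prefer the first node of the kind named by ssh_to, if it has any nodes.
    if ssh_to and nodes_by_kind.get(ssh_to):
        return nodes_by_kind[ssh_to][0]
    # Otherwise fall back to the first non-empty kind in alphabetical order.
    for kind in sorted(nodes_by_kind):
        if nodes_by_kind[kind]:
            return nodes_by_kind[kind][0]
    # Stand-in for elasticluster.exceptions.NodeNotFound.
    raise LookupError("no valid frontend node found")
```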
remove_node(node)

Removes a node from the cluster, but does not stop it. Use this method with caution.

Parameters: node (Node) – node to remove
setup()

Configures the cluster nodes. This is delegated to the provided elasticluster.providers.AbstractSetupProvider.

Returns: bool - True on success, False otherwise
start(min_nodes=None)

Starts all the instances in the cloud. To speed things up, each instance is started in a separate thread. To make sure elasticluster is not stopped during the creation of an instance, this method overrides the SIGINT handler. As soon as the last started instance has been returned and saved to the repository, SIGINT is handled as usual again. An instance counts as up and running as soon as an SSH connection to it can be established. If the startup timeout is reached before all instances are started, the cluster is stopped and all instances are destroyed.

This method blocks and might take some time, depending on the number of instances to start.

Parameters: min_nodes (dict [node_kind] = number) – minimum number of nodes to start in case the quota is reached before all instances are up
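
The general pattern of starting instances in separate threads and waiting up to a timeout can be sketched with the standard library. This is not elasticluster's implementation: start_all and its start_one callback are hypothetical, the SIGINT handling is left out, and the sketch only reports which names started instead of destroying anything.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def start_all(names, start_one, timeout):
    """Call start_one(name) for each name in its own worker thread.

    Returns (started, missing): the names whose call finished without an
    exception before the timeout, and all remaining names.
    """
    names = list(names)
    if not names:
        return [], []
    pool = ThreadPoolExecutor(max_workers=len(names))
    try:
        futures = {pool.submit(start_one, name): name for name in names}
        # Block until every call has finished or the timeout has expired.
        done, _pending = wait(futures, timeout=timeout)
    finally:
        # Do not block here on workers that are still running.
        pool.shutdown(wait=False)
    started = sorted(futures[f] for f in done if f.exception() is None)
    missing = sorted(set(names) - set(started))
    return started, missing
```

A caller could compare the started names against a minimum per kind, in the spirit of min_nodes, before deciding whether to keep or tear down the cluster.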
startup_timeout = 600

timeout in seconds to start all nodes

stop(force=False)

Destroys all instances of this cluster and calls delete on the repository.

Parameters: force (bool) – force termination of instances in any case
update()

Updates the connection information of all nodes in this cluster. Public IPs, for example, are not always available immediately, so calling this method may help.

class elasticluster.cluster.Node(name, cluster_name, kind, cloud_provider, user_key_public, user_key_private, user_key_name, image_user, security_group, image, flavor, image_userdata=None, **extra)

A node represents an instance in a cluster. It holds all the information needed to connect to the instance and also manages the cloud instance. It provides the basic functionality to interact with the cloud instance, such as starting, stopping, checking whether the instance is up, and connecting via SSH.

Parameters:
  • name (str) – identifier of the node
  • kind (str) – kind of node with regard to the cluster. This usually refers to a specified group in the elasticluster.providers.AbstractSetupProvider
  • cloud_provider (elasticluster.providers.AbstractCloudProvider) – cloud provider to manage the instance
  • user_key_public (str) – path to the ssh public key
  • user_key_private (str) – path to the ssh private key
  • user_key_name (str) – name of the ssh key
  • image_user (str) – user to connect to the instance via ssh
  • security_group (str) – security group to set up firewall rules
  • image (str) – image id to launch instance with
  • flavor (str) – machine type to launch instance
  • image_userdata (str) – commands to execute after instance start
Variables:
  • instance_id – id of the node instance on the cloud
  • preferred_ip – IP address used to connect to the node.
  • ips – list of all the IPs defined for this node.
connect()

Connect to the node via ssh using the paramiko library.

Returns: paramiko.SSHClient - SSH connection, or None on failure
connection_ip()

Returns the IP to be used to connect to this node.

If the instance has a public IP address, that address is returned; otherwise, its private IP is returned.
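
The fallback rule is small enough to state as a function. The sketch below is illustrative only: pick_connection_ip is a hypothetical helper, and it assumes a missing public IP is represented by None or an empty string.

```python
def pick_connection_ip(public_ip, private_ip):
    """Prefer the public IP; fall back to the private IP if it is missing."""
    # Assumption: a missing public IP is None or an empty string.
    return public_ip or private_ip
```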

connection_timeout = 5

timeout in seconds to connect to host via ssh

is_alive()

Checks whether the current node is up and running in the cloud. It only checks the status reported by the cloud interface, so a node might be running but not yet ready to accept SSH connections.
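
To tell "running according to the cloud" apart from "reachable over SSH", a caller can probe the SSH port directly. The sketch below is a plain TCP check using the standard library, not what elasticluster does internally (connect() uses paramiko). The helper name ssh_port_open is hypothetical; port 22 is an assumption, and the 5 second default simply mirrors connection_timeout.

```python
import socket


def ssh_port_open(host, port=22, timeout=5):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A successful TCP connection only shows that something is listening on the port; authentication can still fail afterwards.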

pprint()

Pretty print information about the node.

Returns: str - representation of a node in pretty print
start()

Starts the node on the cloud using the given instance properties. This method is non-blocking: it returns as soon as the cloud provider returns the node ID. The is_alive and update_ips methods can then be used to gather further details about the state of the node.

stop()

Destroys the instance launched on the cloud for this specific node.

update_ips()

Retrieves the public and private IPs of the instance from the cloud provider. In some cases the public IP assignment takes some time, but this method is non-blocking. To wait for a public IP, consider calling this method repeatedly until a timeout expires.
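
Calling a non-blocking method repeatedly until a timeout expires is a polling loop. The sketch below shows one way to write it with the standard library. It does not call elasticluster: wait_for_public_ip is a hypothetical helper, and refresh is an assumed callable that would call update_ips on a node and return its public IP, or None while none is assigned.

```python
import time


def wait_for_public_ip(refresh, timeout=60.0, interval=1.0):
    """Poll refresh() until it returns an IP or the timeout expires.

    refresh is a caller-supplied callable (an assumption of this sketch)
    that returns the public IP, or None if it is not yet available.
    """
    deadline = time.monotonic() + timeout
    while True:
        ip = refresh()
        if ip:
            return ip
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None
        # Never sleep past the deadline.
        time.sleep(min(interval, remaining))
```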