Clusters

Create, destroy, and otherwise interact with Rackspace CloudBigData clusters

You’ll spend most of your time with lavaclient spinning up, tearing down, and otherwise manipulating your clusters.

For example:

>>> [key.name for key in lava.credentials.list_ssh_keys()]
[u'scott@myhostname']

>>> stack = lava.stacks.get('HADOOP_HDP2_2')
>>> stack
StackDetail(id='HADOOP_HDP2_2', name='Hadoop HDP 2.2', created, description, distro, links, node_groups, services)

>>> stack.node_groups
[StackNodeGroup(id='gateway', flavor_id, components, count, resource_limits),
 StackNodeGroup(id='master', flavor_id, components, count, resource_limits),
 StackNodeGroup(id='secondary', flavor_id, components, count, resource_limits),
 StackNodeGroup(id='slave', flavor_id, components, count, resource_limits)]

>>> lava.flavors.list()
[Flavor(id='hadoop1-15', name='Medium Hadoop Instance', disk, links, ram, vcpus),
 Flavor(id='hadoop1-30', name='Large Hadoop Instance', disk, links, ram, vcpus),
 Flavor(id='hadoop1-60', name='XLarge Hadoop Instance', disk, links, ram, vcpus),
 Flavor(id='hadoop1-7', name='Small Hadoop Instance', disk, links, ram, vcpus)]

>>> cluster = lava.clusters.create(
        'my_hadoop_cluster',
        'HADOOP_HDP2_2',
        username='scott',
        ssh_keys=['scott@myhostname'],
        node_groups=[{'id': 'slave', 'count': 1, 'flavor_id': 'hadoop1-7'}],
        wait=True)
>>> cluster
ClusterDetail(id='a12093dc-845b-4cfc-8b12-cec920695ccc', name='my_hadoop_cluster', stack_id, cbd_version, created, links, node_groups, progress, scripts, status, updated, username)

# Look at cluster nodes
>>> cluster.nodes
[Node(id='58329654-09f5-45c2-86bc-a5469836c38d', name='master-1', flavor_id, addresses, components, created, node_group, private_ip, public_ip, status, updated),
 Node(id='7595cdb7-5cb9-4cde-b033-84709998a6e0', name='secondary-1', flavor_id, addresses, components, created, node_group, private_ip, public_ip, status, updated),
 Node(id='9831887a-88d6-4e35-9046-4c5ce0765b29', name='gateway-1', flavor_id, addresses, components, created, node_group, private_ip, public_ip, status, updated),
 Node(id='b32e94eb-ba88-43ef-a833-27824446b48e', name='slave-1', flavor_id, addresses, components, created, node_group, private_ip, public_ip, status, updated)]

>>> slave = cluster.node_groups[-1]
>>> slave.count
1

# Add two more slave nodes
>>> cluster = cluster.resize(node_groups=[{'id': 'slave', 'count': 3}], wait=True)
>>> [node for node in cluster.nodes if node.node_group == 'slave']
[Node(id='b32e94eb-ba88-43ef-a833-27824446b48e', name='slave-1', flavor_id, addresses, components, created, node_group, private_ip, public_ip, status, updated),
 Node(id='26126a8e-cf2f-4e0a-a5b8-0d5c56f25036', name='slave-2', flavor_id, addresses, components, created, node_group, private_ip, public_ip, status, updated),
 Node(id='16e72fa1-7113-4020-81f6-de20a8c49b98', name='slave-3', flavor_id, addresses, components, created, node_group, private_ip, public_ip, status, updated)]

# Execute command on gateway
>>> cluster.execute_on_node('gateway-1', 'whoami')
'scott\n'

# Delete the cluster
>>> cluster.delete()
ClusterDetail(id='a12093dc-845b-4cfc-8b12-cec920695ccc', name='my_hadoop_cluster', stack_id, cbd_version, created, links, node_groups, progress, scripts, status, updated, username)

On the command line, you have a few additional useful commands available:

$ lava clusters ssh a12093dc-845b-4cfc-8b12-cec920695ccc
Last login: Thu Jun 18 18:25:50 2015 from some.host
[scott@gateway-1 ~]$ exit
logout
Connection to xxx.xxx.xxx.xxx closed.

$ lava clusters ssh_proxy a12093dc-845b-4cfc-8b12-cec920695ccc
Starting SOCKS proxy via node gateway-1 (xxx.xxx.xxx.xxx)
Successfully created SOCKS proxy on localhost:12345
Use Ctrl-C to stop proxy
^CSOCKS proxy closed

Note

The ssh_proxy command lets you access the web interfaces of the services installed on your cluster, e.g. the YARN Web UI. You can find the URLs for those services using the nodes() method or CLI command. You must configure your browser to use the SOCKS proxy at the host and port that the command prints. Note that this is a SOCKS 5 proxy, not SOCKS 4.
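
Before changing browser settings, it can help to check the proxy from a terminal. The sketch below only builds a curl command line; curl's --socks5-hostname option sends the request through a SOCKS 5 proxy and resolves the hostname on the proxy side. The helper name and the YARN URL are hypothetical, and the port is the one printed in the example above.

```python
def curl_via_socks5(url, proxy_host='localhost', proxy_port=12345):
    """Build a curl command that sends the request through a SOCKS 5 proxy.

    --socks5-hostname asks the proxy to resolve the hostname, which matters
    when the URL uses a name that only resolves inside the cluster.
    """
    proxy = '{0}:{1}'.format(proxy_host, proxy_port)
    return ['curl', '--socks5-hostname', proxy, url]


# Hypothetical service URL; take the real one from the nodes() output.
cmd = curl_via_socks5('http://master-1.local:8088/cluster')
print(' '.join(cmd))
```

The resulting list can be passed to subprocess.call() while the proxy is running.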

API Reference

class lavaclient.api.clusters.Resource

Clusters API methods

create(name, stack_id, username=None, ssh_keys=None, user_scripts=None, node_groups=None, connectors=None, wait=False, credentials=None, region=None)

Create a cluster

Parameters:
  • name – Cluster name
  • stack_id – Valid stack identifier
  • username – User to create on the cluster; defaults to local user
  • ssh_keys – List of SSH keys; if none is specified, the key user@hostname is used, and it is created from $HOME/.ssh/id_rsa.pub if it doesn’t exist
  • node_groups – dict of (node_group_id, attrs) pairs, in which attrs is a dict of node group attributes. Instead of a dict, you may give a list of dicts, each containing the id key. Currently supported attributes are flavor_id and count
  • user_scripts – List of user script IDs; see lavaclient.api.scripts.Resource.create()
  • credentials – List of credentials to use. Each item must be a dictionary of (type, name) pairs
  • connectors – List of connector credentials to use. Each item must be a dictionary of (type, name) pairs. Deprecated in favor of credentials
  • wait – If True, wait for the cluster to become active before returning
  • region – The region to create the cluster in
Returns:

ClusterDetail
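
The two accepted shapes of node_groups describe the same data. The sketch below converts either shape into a list of dicts so the equivalence is easy to see. The helper normalize_node_groups is hypothetical and is not part of lavaclient; it only illustrates the argument formats described above.

```python
def normalize_node_groups(node_groups):
    """Return node groups as a list of dicts, each containing an 'id' key."""
    if node_groups is None:
        return []
    if isinstance(node_groups, dict):
        # {'slave': {'count': 3}} -> [{'id': 'slave', 'count': 3}]
        return [dict(attrs, id=group_id)
                for group_id, attrs in sorted(node_groups.items())]
    # Already a list of dicts; copy each so the caller's data is untouched.
    return [dict(group) for group in node_groups]


as_dict = {'slave': {'count': 3, 'flavor_id': 'hadoop1-7'}}
as_list = [{'id': 'slave', 'count': 3, 'flavor_id': 'hadoop1-7'}]

same = normalize_node_groups(as_dict) == normalize_node_groups(as_list)
print(same)
```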

delete(cluster_id)

Delete a cluster

Parameters:
  • cluster_id – Cluster ID
Returns:

ClusterDetail

delete_ssh_credentials(cluster_id, keynames, wait)

Remove the specified SSH credentials from a cluster.

Parameters:
  • cluster_id – Cluster ID
  • keynames – The SSH key names to remove from the cluster
  • wait – If True, wait for the cluster to become active before returning
Returns:

ClusterDetail

get(cluster_id)

Get the cluster corresponding to the cluster ID

Parameters:
  • cluster_id – Cluster ID
Returns:

ClusterDetail

list()

List clusters that belong to the tenant specified in the client

Returns:

List of Cluster objects

nodes(cluster_id)

Get the cluster nodes

Parameters:
  • cluster_id – Cluster ID
Returns:

List of Node objects

resize(cluster_id, node_groups=None, wait=False)

Resize a cluster

Parameters:
  • cluster_id – Cluster ID
  • node_groups – dict of (node_group_id, attrs) pairs, in which attrs is a dict of node group attributes. Instead of a dict, you may give a list of dicts, each containing the id key. Currently supported attributes are flavor_id and count
  • wait – If True, wait for the cluster to become active before returning
Returns:

ClusterDetail
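
Because count is the new total for a node group rather than an increment, adding nodes means computing the target count first. A minimal sketch, assuming you already know the current counts; the helper grow_request is hypothetical and only builds the node_groups argument.

```python
def grow_request(current_counts, group_id, extra):
    """Build a node_groups argument that adds `extra` nodes to one group."""
    target = current_counts[group_id] + extra
    return [{'id': group_id, 'count': target}]


# One slave node today, two more wanted.
request = grow_request({'slave': 1}, 'slave', 2)
print(request)
```

The result has the list-of-dicts shape accepted by node_groups.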

ssh_execute(cluster_id, node_name, command, ssh_command=None, wait=False)

Execute a command over SSH to the specified node in the cluster.

Parameters:
  • cluster_id – Cluster ID
  • node_name – Name of node on which to make the SSH connection. By default, use first available node.
  • command – Shell command to execute remotely
  • ssh_command – SSH shell command to execute locally (default: ‘ssh’)
  • wait – If True, wait for the cluster to become active before executing the command
Returns:

The output of running the command

ssh_proxy(cluster_id, port=None, node_name=None, ssh_command=None, wait=False)

Set up a SOCKS5 proxy over SSH to a node in the cluster. Returns the SSH process (via Popen), which can be stopped via the kill() method.

Parameters:
  • cluster_id – Cluster ID
  • port – Local port on which to create the proxy (default: 12345)
  • node_name – Name of node on which to make the SSH connection. By default, use first available node.
  • wait – If True, wait for the cluster to become active before creating the proxy
Returns:

Popen object representing the SSH connection.
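
Since the proxy is an ordinary child process, it keeps running until it is killed. One way to make sure that happens is a small context manager around the returned Popen object. This is a sketch, not part of lavaclient: the name killed_on_exit is hypothetical, and a sleeping Python child stands in for the SSH process so the example runs without a cluster.

```python
import contextlib
import subprocess
import sys


@contextlib.contextmanager
def killed_on_exit(process):
    """Yield a Popen object and make sure it is stopped afterwards."""
    try:
        yield process
    finally:
        if process.poll() is None:  # still running
            process.kill()
        process.wait()  # reap the child so it does not linger


# Stand-in for the Popen object that ssh_proxy() returns.
stand_in = subprocess.Popen(
    [sys.executable, '-c', 'import time; time.sleep(60)'])

with killed_on_exit(stand_in) as proxy:
    still_running = proxy.poll() is None  # use the proxy here

stopped = stand_in.poll() is not None
print(still_running, stopped)
```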

ssh_tunnel(cluster_id, local_port, remote_port, node_name=None, component=None, ssh_command=None, wait=False)

Create an SSH tunnel from the local host to a particular port on a cluster node. Returns the SSH process (via Popen), which can be stopped via the kill() method.

Parameters:
  • cluster_id – Cluster ID
  • local_port – Port on which to bind on localhost
  • remote_port – Port on which to bind on the cluster node
  • node_name – Name of node on which to make the SSH connection. Mutually exclusive with component
  • component – Name of a component installed on a node in the cluster, e.g. HiveServer2. SSH tunnel will be set up on the first node containing the component. Mutually exclusive with node_name
  • wait – If True, wait for the cluster to become active before creating the tunnel
Returns:

Popen object representing the SSH connection.
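
The node_name and component rules can be pictured as a small selection function. The sketch below is an illustration only, not lavaclient's implementation. It assumes each node is a dict with a name and a list of component names, and it treats passing neither argument as an error, which the text above does not specify.

```python
def pick_tunnel_node(nodes, node_name=None, component=None):
    """Pick the node to tunnel to, by name or by installed component."""
    if node_name is not None and component is not None:
        raise ValueError('node_name and component are mutually exclusive')
    if node_name is None and component is None:
        # Assumption of this sketch: one of the two must be given.
        raise ValueError('pass either node_name or component')
    for node in nodes:
        if node_name is not None and node['name'] == node_name:
            return node
        # First node containing the component wins.
        if component is not None and component in node['components']:
            return node
    raise LookupError('no matching node')


# Hypothetical node data for illustration.
nodes = [
    {'name': 'gateway-1', 'components': []},
    {'name': 'master-1', 'components': ['HiveServer2']},
]
chosen = pick_tunnel_node(nodes, component='HiveServer2')
print(chosen['name'])
```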

update_credentials(cluster_id, credentials=None, wait=False)

Update the credentials on a cluster

Parameters:
  • cluster_id – Cluster ID
  • credentials – The credentials to update on the cluster. Must be a dict of {type: [name]} values
  • wait – If True, wait for the cluster to become active before returning
Returns:

ClusterDetail
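
The {type: [name]} shape groups credential names under their type. A minimal sketch of building that dict from (type, name) pairs follows. The helper group_credentials is hypothetical, and the type names ssh_keys and cloud_files are used here as illustrative values.

```python
from collections import defaultdict


def group_credentials(pairs):
    """Group (type, name) pairs into a {type: [name, ...]} dict."""
    grouped = defaultdict(list)
    for cred_type, name in pairs:
        grouped[cred_type].append(name)
    return dict(grouped)


credentials = group_credentials([
    ('ssh_keys', 'scott@myhostname'),
    ('cloud_files', 'scott'),
    ('ssh_keys', 'scott@laptop'),
])
print(credentials)
```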

wait(cluster_id, timeout=None, interval=None)

Wait (blocking) for a cluster to either become active or fail.

Parameters:
  • cluster_id – Cluster ID
  • timeout – Wait timeout in minutes (default: no timeout)
Returns:

ClusterDetail
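
Blocking until a cluster becomes active or fails amounts to a polling loop with an optional deadline. The sketch below shows that pattern in isolation; it is not lavaclient's code. The status names ACTIVE and ERROR, the interval unit (seconds), and the fetch_status callable are assumptions made for this illustration.

```python
import time


def wait_for_cluster(fetch_status, timeout=None, interval=1.0,
                     sleep=time.sleep, clock=time.time):
    """Poll fetch_status() until it reports a final state.

    timeout is in minutes (None means wait forever); interval is in seconds.
    """
    deadline = None if timeout is None else clock() + timeout * 60
    while True:
        status = fetch_status()
        if status in ('ACTIVE', 'ERROR'):  # assumed final states
            return status
        if deadline is not None and clock() >= deadline:
            raise RuntimeError('timed out waiting for cluster')
        sleep(interval)


# Fake status sequence standing in for repeated get() calls.
statuses = iter(['BUILDING', 'CONFIGURING', 'ACTIVE'])
final = wait_for_cluster(lambda: next(statuses), timeout=1,
                         sleep=lambda seconds: None)
print(final)
```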

class lavaclient.api.response.Cluster

Basic cluster information

Only returned by list(). Has the same methods and attributes as ClusterDetail except for node_groups, progress, and scripts.

class lavaclient.api.response.ClusterDetail

Detailed cluster information

delete()

Delete this cluster.

describe()

Return a pretty-formatted string that describes the format of the data

execute_on_node(node_name, command, ssh_command=None, wait=False)

Execute a command on a cluster node. See: ssh_execute().

refresh()

Refresh the cluster. If this object was returned from list(), it will return the same amount of detail as get().

Returns:

ClusterDetail

ssh_proxy(port=None, node_name=None, ssh_command=None, wait=False)

Start a SOCKS5 proxy over SSH to this cluster. See ssh_proxy().

to_dict()

Convert the config to a plain Python dictionary

wait(timeout=None, interval=None)

Wait for this cluster to become active. See: wait()

Returns:

ClusterDetail

cbd_version

API version at which cluster was created

created

datetime corresponding to creation date

credentials
id
name
node_groups

See: NodeGroup

nodes

See: nodes()

progress
scripts

See: NodeScript

stack_id
status
updated

datetime corresponding to date last updated

username

class lavaclient.api.response.NodeGroup

Group of nodes that share the same flavor and installed services

describe()

Return a pretty-formatted string that describes the format of the data

to_dict()

Convert the config to a plain Python dictionary

components
count
flavor_id
id