Clusters
Create, destroy, and otherwise interact with Rackspace CloudBigData clusters.
You’ll spend most of your time with lavaclient spinning up, tearing down, and otherwise manipulating your clusters. For example:
>>> [key.name for key in lava.credentials.list_ssh_keys()]
[u'scott@myhostname']
>>> stack = lava.stacks.get('HADOOP_HDP2_2')
>>> stack
StackDetail(id='HADOOP_HDP2_2', name='Hadoop HDP 2.2', created, description, distro, links, node_groups, services)
>>> stack.node_groups
[StackNodeGroup(id='gateway', flavor_id, components, count, resource_limits),
StackNodeGroup(id='master', flavor_id, components, count, resource_limits),
StackNodeGroup(id='secondary', flavor_id, components, count, resource_limits),
StackNodeGroup(id='slave', flavor_id, components, count, resource_limits)]
>>> lava.flavors.list()
[Flavor(id='hadoop1-15', name='Medium Hadoop Instance', disk, links, ram, vcpus),
Flavor(id='hadoop1-30', name='Large Hadoop Instance', disk, links, ram, vcpus),
Flavor(id='hadoop1-60', name='XLarge Hadoop Instance', disk, links, ram, vcpus),
Flavor(id='hadoop1-7', name='Small Hadoop Instance', disk, links, ram, vcpus)]
>>> cluster = lava.clusters.create(
...     'my_hadoop_cluster',
...     'HADOOP_HDP2_2',
...     username='scott',
...     ssh_keys=['scott@myhostname'],
...     node_groups=[{'id': 'slave', 'count': 1, 'flavor_id': 'hadoop1-7'}],
...     wait=True)
>>> cluster
ClusterDetail(id='a12093dc-845b-4cfc-8b12-cec920695ccc', name='my_hadoop_cluster', stack_id, cbd_version, created, links, node_groups, progress, scripts, status, updated, username)
# Look at cluster nodes
>>> cluster.nodes
[Node(id='58329654-09f5-45c2-86bc-a5469836c38d', name='master-1', flavor_id, addresses, components, created, node_group, private_ip, public_ip, status, updated),
Node(id='7595cdb7-5cb9-4cde-b033-84709998a6e0', name='secondary-1', flavor_id, addresses, components, created, node_group, private_ip, public_ip, status, updated),
Node(id='9831887a-88d6-4e35-9046-4c5ce0765b29', name='gateway-1', flavor_id, addresses, components, created, node_group, private_ip, public_ip, status, updated),
Node(id='b32e94eb-ba88-43ef-a833-27824446b48e', name='slave-1', flavor_id, addresses, components, created, node_group, private_ip, public_ip, status, updated)]
>>> slave = cluster.node_groups[-1]
>>> slave.count
1
# Add two more slave nodes
>>> cluster = cluster.resize(node_groups=[{'id': 'slave', 'count': 3}], wait=True)
>>> [node for node in cluster.nodes if node.node_group == 'slave']
[Node(id='b32e94eb-ba88-43ef-a833-27824446b48e', name='slave-1', flavor_id, addresses, components, created, node_group, private_ip, public_ip, status, updated),
Node(id='26126a8e-cf2f-4e0a-a5b8-0d5c56f25036', name='slave-2', flavor_id, addresses, components, created, node_group, private_ip, public_ip, status, updated),
Node(id='16e72fa1-7113-4020-81f6-de20a8c49b98', name='slave-3', flavor_id, addresses, components, created, node_group, private_ip, public_ip, status, updated)]
# Execute command on gateway
>>> cluster.execute_on_node('gateway-1', 'whoami')
'scott\n'
# Delete the cluster
>>> cluster.delete()
ClusterDetail(id='a12093dc-845b-4cfc-8b12-cec920695ccc', name='my_hadoop_cluster', stack_id, cbd_version, created, links, node_groups, progress, scripts, status, updated, username)
On the command line, you have a few additional useful commands available:
$ lava clusters ssh a12093dc-845b-4cfc-8b12-cec920695ccc
Last login: Thu Jun 18 18:25:50 2015 from some.host
[scott@gateway-1 ~]$ exit
logout
Connection to xxx.xxx.xxx.xxx closed.
$ lava clusters ssh_proxy a12093dc-845b-4cfc-8b12-cec920695ccc
Starting SOCKS proxy via node gateway-1 (xxx.xxx.xxx.xxx)
Successfully created SOCKS proxy on localhost:12345
Use Ctrl-C to stop proxy
^CSOCKS proxy closed
Note
The ssh_proxy command lets you access the web interfaces of the various services installed on your cluster, e.g. the YARN Web UI (you can find the URLs for the services installed on your cluster using the nodes() method or CLI command). However, you will have to set up your browser to use the SOCKS proxy with the information provided. Note that this is a SOCKS 5 proxy, not SOCKS 4.
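The same proxy can also be used from scripts rather than a browser. The sketch below builds a curl command line and a proxies mapping of the kind the requests library accepts when its optional SOCKS support is installed. The helper names, the example URL, and the port are illustrative assumptions and are not part of lavaclient.

```python
def socks5_url(host='localhost', port=12345):
    """Return a SOCKS5 proxy URL; the 'socks5h' scheme resolves DNS via the proxy."""
    return 'socks5h://{0}:{1}'.format(host, port)


def proxies_for_requests(host='localhost', port=12345):
    """Build a mapping that could be passed as requests.get(url, proxies=...)."""
    url = socks5_url(host, port)
    return {'http': url, 'https': url}


def curl_command(target_url, host='localhost', port=12345):
    """Build a curl argv that sends the request through the SOCKS5 proxy."""
    return ['curl', '--socks5-hostname', '{0}:{1}'.format(host, port), target_url]


# Hypothetical service URL; substitute one reported for your own cluster.
print(' '.join(curl_command('http://master-1.local:8088/')))
```

Resolving host names through the proxy matters here because the service URLs usually refer to names that are only meaningful inside the cluster's network.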
API Reference
class lavaclient.api.clusters.Resource
Clusters API methods

create(name, stack_id, username=None, ssh_keys=None, user_scripts=None, node_groups=None, connectors=None, wait=False, credentials=None, region=None)
Create a cluster
Parameters:
- name – Cluster name
- stack_id – Valid stack identifier
- username – User to create on the cluster; defaults to local user
- ssh_keys – List of SSH keys; if none is specified, it will use the key user@hostname, creating the key from $HOME/.ssh/id_rsa.pub if it doesn’t exist.
- node_groups – dict of (node_group_id, attrs) pairs, in which attrs is a dict of node group attributes. Instead of a dict, you may give a list of dicts, each containing the id key. Currently supported attributes are flavor_id and count
- user_scripts – List of user script IDs; see lavaclient.api.scripts.Resource.create()
- credentials – List of credentials to use. Each item must be a dictionary of (type, name) pairs
- connectors – List of connector credentials to use. Each item must be a dictionary of (type, name) pairs. Deprecated in favor of credentials
- wait – If True, wait for the cluster to become active before returning
- region – The region to create the cluster in
Returns:
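The two accepted shapes for node_groups carry the same information. As an illustration only, the hypothetical helper below (not part of lavaclient) converts the dict form into the list-of-dicts form.

```python
def node_groups_as_list(node_groups):
    """Convert {'slave': {'count': 1}} into [{'id': 'slave', 'count': 1}].

    A list of dicts is returned as a new list after checking each has an id.
    """
    if isinstance(node_groups, dict):
        # Copy each attrs dict and add the group ID under the 'id' key.
        return [dict(attrs, id=group_id)
                for group_id, attrs in sorted(node_groups.items())]
    for group in node_groups:
        if 'id' not in group:
            raise ValueError('each node group needs an id key: %r' % (group,))
    return list(node_groups)


as_dict = {'slave': {'count': 1, 'flavor_id': 'hadoop1-7'}}
as_list = [{'id': 'slave', 'count': 1, 'flavor_id': 'hadoop1-7'}]
print(node_groups_as_list(as_dict) == as_list)  # prints True
```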

delete(cluster_id)
Delete a cluster
Parameters: cluster_id – Cluster ID
Returns: ClusterDetail

delete_ssh_credentials(cluster_id, keynames, wait)
Remove the specified SSH credentials from a cluster.
Parameters:
- cluster_id – Cluster ID
- keynames – The SSH key names to remove from the cluster
- wait – If True, wait for the cluster to become active before returning
Returns:

get(cluster_id)
Get the cluster corresponding to the cluster ID
Parameters: cluster_id – Cluster ID
Returns: ClusterDetail

list()
List clusters that belong to the tenant specified in the client
Returns: List of Cluster objects

nodes(cluster_id)
Get the cluster nodes
Parameters: cluster_id – Cluster ID
Returns: List of Node objects

resize(cluster_id, node_groups=None, wait=False)
Resize a cluster
Parameters:
- cluster_id – Cluster ID
- node_groups – dict of (node_group_id, attrs) pairs, in which attrs is a dict of node group attributes. Instead of a dict, you may give a list of dicts, each containing the id key. Currently supported attributes are flavor_id and count
Returns:
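As the session at the top of this page shows, count is the target size of a node group rather than an increment, so adding nodes means computing the new total first. A minimal sketch, using a hypothetical helper and plain dicts in place of the objects found in cluster.node_groups:

```python
def grow_node_group(current_groups, group_id, extra):
    """Return a node_groups argument that adds `extra` nodes to one group."""
    for group in current_groups:
        if group['id'] == group_id:
            # The resize request carries the new total, not the difference.
            return [{'id': group_id, 'count': group['count'] + extra}]
    raise KeyError('no node group named %r' % (group_id,))


current = [{'id': 'master', 'count': 1}, {'id': 'slave', 'count': 1}]
print(grow_node_group(current, 'slave', 2))
```

The returned list has the same shape as the node_groups argument passed to resize in the example session.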

ssh_execute(cluster_id, node_name, command, ssh_command=None, wait=False)
Execute a command over SSH to the specified node in the cluster.
Parameters:
- cluster_id – Cluster ID
- node_name – Name of node on which to make the SSH connection. By default, use first available node.
- command – Shell command to execute remotely
- ssh_command – SSH shell command to execute locally (default: ‘ssh’)
- wait – If True, wait for the cluster to become active before executing the command
Returns: The output of running the command

ssh_proxy(cluster_id, port=None, node_name=None, ssh_command=None, wait=False)
Set up a SOCKS5 proxy over SSH to a node in the cluster. Returns the SSH process (via Popen), which can be stopped via the kill() method.
Parameters:
- cluster_id – Cluster ID
- port – Local port on which to create the proxy (default: 12345)
- node_name – Name of node on which to make the SSH connection. By default, use first available node.
- wait – If True, wait for the cluster to become active before creating the proxy
Returns: Popen object representing the SSH connection.
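Since the return value is an ordinary Popen object, it is easy to leave the SSH process running by accident. One way to guard against that is a small context manager. The sketch below is not part of lavaclient, and it uses a sleeping Python child process as a stand-in for the SSH process.

```python
import contextlib
import subprocess
import sys


@contextlib.contextmanager
def stopping(process):
    """Yield a Popen object and make sure it is stopped on exit."""
    try:
        yield process
    finally:
        if process.poll() is None:  # still running
            process.kill()
        process.wait()  # reap the child so it does not linger


# Stand-in for the process returned by ssh_proxy(): a child that just sleeps.
child = subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(30)'])
with stopping(child) as proxy:
    pass  # work that needs the proxy would go here
print(child.poll() is not None)  # prints True
```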

ssh_tunnel(cluster_id, local_port, remote_port, node_name=None, component=None, ssh_command=None, wait=False)
Create an SSH tunnel from the local host to a particular port on a cluster node. Returns the SSH process (via Popen), which can be stopped via the kill() method.
Parameters:
- cluster_id – Cluster ID
- local_port – Port on which to bind on localhost
- remote_port – Port on which to bind on the cluster node
- node_name – Name of node on which to make the SSH connection. Mutually exclusive with component
- component – Name of a component installed on a node in the cluster, e.g. HiveServer2. SSH tunnel will be set up on the first node containing the component. Mutually exclusive with node_name
- wait – If True, wait for the cluster to become active before creating the tunnel
Returns: Popen object representing the SSH connection.
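For orientation, a tunnel of this kind corresponds to OpenSSH local port forwarding. The sketch below shows how such a command line could be assembled and how the mutual exclusion of node_name and component might be checked. It is an assumption about the general shape, not lavaclient's actual implementation; the function names and the host address are hypothetical.

```python
def check_target(node_name=None, component=None):
    """Reject calls that name both a node and a component."""
    if node_name is not None and component is not None:
        raise ValueError('node_name and component are mutually exclusive')


def tunnel_command(local_port, remote_port, host, username, ssh_command='ssh'):
    """Build an argv for OpenSSH local port forwarding."""
    forward = '{0}:localhost:{1}'.format(local_port, remote_port)
    # -N: run no remote command; -L: forward local_port to remote_port
    return [ssh_command, '-N', '-L', forward,
            '{0}@{1}'.format(username, host)]


check_target(component='HiveServer2')  # fine: only one selector given
print(' '.join(tunnel_command(10000, 10000, '203.0.113.10', 'scott')))
```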

update_credentials(cluster_id, credentials=None, wait=False)
Update the credentials on a cluster
Parameters:
- cluster_id – Cluster ID
- credentials – The credential names to update on the cluster. Credentials must be a dict of {type: [name]} values
- wait – If True, wait for the cluster to become active before returning
Returns:
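Note the difference in shape from create(): there, credentials is a list of dictionaries of (type, name) pairs, while here it is a single dictionary mapping each type to a list of names. The hypothetical helper below (not part of lavaclient) converts the first shape into the second. The type names used are examples only and may not match the types your account supports.

```python
from collections import defaultdict


def group_credentials(credential_list):
    """Turn [{type: name}, ...] into {type: [name, ...]}, dropping repeats."""
    grouped = defaultdict(list)
    for item in credential_list:
        for cred_type, name in item.items():
            if name not in grouped[cred_type]:
                grouped[cred_type].append(name)
    return dict(grouped)


# Example type names are assumptions made for illustration.
creds = [{'ssh_keys': 'scott@myhostname'},
         {'cloud_files': 'my_cloud_files'},
         {'ssh_keys': 'backup_key'}]
print(group_credentials(creds))
```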

wait(cluster_id, timeout=None, interval=None)
Wait (blocking) for a cluster to either become active or fail.
Parameters:
- cluster_id – Cluster ID
- timeout – Wait timeout in minutes (default: no timeout)
Returns:
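The blocking behaviour can be pictured as a polling loop. The sketch below is a generic illustration driven by a fake status source, not lavaclient's implementation. The status strings, the seconds-based interval, and the helper name are all assumptions.

```python
import itertools
import time


def wait_until_done(get_status, timeout=None, interval=1.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll get_status() until it reports ACTIVE or ERROR.

    timeout is in minutes (None means wait forever); interval is in seconds.
    """
    deadline = None if timeout is None else clock() + timeout * 60
    while True:
        status = get_status()
        if status == 'ACTIVE':
            return status
        if status == 'ERROR':
            raise RuntimeError('cluster failed')
        if deadline is not None and clock() >= deadline:
            raise RuntimeError('timed out waiting for cluster')
        sleep(interval)


# Fake status source: two intermediate states, then ACTIVE forever.
statuses = itertools.chain(['BUILDING', 'CONFIGURING'],
                           itertools.repeat('ACTIVE'))
print(wait_until_done(lambda: next(statuses), timeout=1, interval=0.01))
```

Passing the clock and sleep functions as parameters keeps the loop easy to exercise without real delays.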

class lavaclient.api.response.Cluster
Basic cluster information
Only returned by list(). Has the same methods/attributes as ClusterDetail except for node_groups, progress, and scripts.

class lavaclient.api.response.ClusterDetail
Detailed cluster information

delete()
Delete this cluster.

describe()
Return a pretty-formatted string that describes the format of the data

execute_on_node(node_name, command, ssh_command=None, wait=False)
Execute a command on a cluster node. See: Resource.ssh_execute().

refresh()
Refresh the cluster. If this object was returned from list(), it will return the same amount of detail as get().
Returns: ClusterDetail

ssh_proxy(port=None, node_name=None, ssh_command=None, wait=False)
Start a SOCKS5 proxy over SSH to this cluster. See: Resource.ssh_proxy().

to_dict()
Convert the config to a plain Python dictionary

wait(timeout=None, interval=None)
Wait for this cluster to become active. See: Resource.wait().
Returns: ClusterDetail

- cbd_version – API version at which cluster was created
- credentials
- id
- links
- name
- progress
- scripts – See: NodeScript
- stack_id
- status
- username