The OpenStack plugin lets Cloudera Director deploy and manage clusters on a given OpenStack platform. With the plugin, users can customize the environment configuration to match their OpenStack setup, create a Cloudera Manager instance, and then create a Cloudera cluster managed by that Cloudera Manager.
Currently, the OpenStack plugin supports the following features:
- Create Nova instances using the given credentials and the assigned image, network, and security groups.
- In a nova-network setup, allocate floating IPs from a specified floating IP pool and associate them with the specified instances.
- Allocate Cinder volumes and attach them to the specified instances.
- Search for or release resources by their IDs.
-
Users must ensure that every instance can reach the other instances by hostname, because the instances in a Cloudera cluster use hostnames to connect to each other. There are two cases:
-
A stock OpenStack setup using nova-network has a flaw in hostname resolution: an instance can only resolve the hostnames of other instances that use the same DHCP service, that is, instances on the same host. In this case, users need to set up a global DNS service themselves so that all hostnames can be resolved and reached from all instances.
-
A stock OpenStack setup using Neutron can resolve all hostnames. However, Neutron has a bug: the hostnames kept in the DNS server do not match the hostname each instance actually receives. For example, when a user creates an instance named "MyInst" whose private IP address is 10.0.0.2, "MyInst" is passed to the instance as its hostname, but the hostname recorded in the DNS server is generated from the allocated address, as host-10-0-0-2.openstacklocal. To work around this issue, use a customized image that changes its own hostname on boot to the value recorded in the DNS server. This can be achieved by appending the following lines to the end of the /etc/rc.local file in the image.
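# Take the first IPv4 address reported by ifconfig.
MYIP=`ifconfig | awk '/inet addr/{print substr($2,6)}'`
array=(${MYIP// /\n})
add1=${array[0]}
# Replace dots with dashes to match the DNS record, e.g. 10.0.0.2 -> host-10-0-0-2.openstacklocal.
hn=(${add1//\./-})
hostn=host-$hn.openstacklocal
hostname $hostn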
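The naming rule used by the DNS server can also be checked away from the instance. The following is a minimal Python sketch, assuming the recorded name always has the form host-<address with dots replaced by dashes>.openstacklocal, as in the example above. The function name `neutron_dns_hostname` and its `domain` parameter are hypothetical and not part of the plugin.

```python
import ipaddress


def neutron_dns_hostname(private_ip: str, domain: str = "openstacklocal") -> str:
    """Build the hostname that the DNS server is described as recording for an IPv4 address."""
    # Parse the address first so that a malformed value raises ValueError
    # instead of silently producing a wrong hostname.
    addr = ipaddress.IPv4Address(private_ip)
    return "host-" + str(addr).replace(".", "-") + "." + domain


if __name__ == "__main__":
    print(neutron_dns_hostname("10.0.0.2"))  # prints host-10-0-0-2.openstacklocal
```

Comparing this value with the output of `hostname` on a booted instance shows whether the rc.local change took effect.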
-
This OpenStack plugin does not yet support the Neutron floating IP APIs, so with Neutron the plugin cannot perform floating IP operations such as allocating and associating. The private network still works. To use Neutron with floating IPs, create the Cloudera Manager node and the cluster without floating IPs first, and then associate floating IPs in the cloud provider dashboard.
-
Cloudera Director currently hard-codes the disk device names to /dev/xvd*. This is inherited from AWS, where instances are based on Xen and expose their disks as /dev/xvd*, but it is a bug for virtual machines based on KVM, and the bootstrap script runs into problems on KVM-based OpenStack environments. Users on KVM have to do the following to avoid this issue:
- Use Cloudera Director 1.5.0 instead of the latest 2.0 until the issue is resolved.
- Use a specific image for the Cloudera Manager and cluster instances. This image automatically resizes the instance root partition to fill the whole disk. Refer to the Fast Guide for how to build such an image.
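To see which device naming an instance actually uses, the check can be sketched in Python. This is an illustrative helper, not part of Cloudera Director. It assumes that Xen disks appear as xvd* (as stated above) and that KVM virtio disks appear as vd*, which is the usual Linux naming but should be verified on your own image. The function name `classify_disks` is hypothetical.

```python
import os


def classify_disks(device_names):
    """Group block device names by the naming scheme suggested by their prefix."""
    groups = {"xen": [], "virtio": [], "other": []}
    for name in sorted(device_names):
        if name.startswith("xvd"):
            groups["xen"].append(name)       # Xen-style names, e.g. xvda
        elif name.startswith("vd"):
            groups["virtio"].append(name)    # virtio-style names, e.g. vda
        else:
            groups["other"].append(name)     # everything else, e.g. sda, loop0
    return groups


if __name__ == "__main__":
    # /sys/block lists block devices on Linux; use an empty list elsewhere.
    names = os.listdir("/sys/block") if os.path.isdir("/sys/block") else []
    print(classify_disks(names))
```

An instance that reports only entries under "virtio" is one where a hard-coded /dev/xvd* path would not match any disk.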
-
This OpenStack plugin does not fully support creating an external database (Trove support) for Cloudera Manager. Trove (Database as a Service for OpenStack) has no API for setting a specific password for the root database user, so the current Trove version is incompatible with the Cloudera Director API. To create an external database, the user has to patch the Trove Python code so that the command trove root-enable instance_id returns a constant password instead of a random one. This can be done by editing the file trove/guestagent/datastore/mysql/service.py in the Trove library directory and modifying:
def enable_root(cls, root_password=None):
    ...
    # Change the line below to the line that follows it.
    # user.password = root_password or utils.generate_random_password()
    user.password = "root" or utils.generate_random_password()
    ...
Trove support is currently disabled in this plugin. To try it, the user also needs to enable it in the plugin, either by setting the system environment variable TROVE_ENABLE=true or by modifying the Java code in OpenStackProvider.java directly.
To use the OpenStack plugin, the user needs an OpenStack setup capable of running the instances allocated for Cloudera Manager and for the roles in the Cloudera cluster, as well as the volumes, floating IPs, and other resources required.
The instances allocated in the OpenStack setup must be able to access the internet (access via a proxy is fine), because Cloudera Director downloads and installs files, parcels, and packages from the internet. Without internet access, this process will fail.
The user must ensure that the security groups are already set up properly, so that the instances can be reached via SSH and the Cloudera Director and Cloudera Manager dashboards are accessible. The ports for the Cloudera Manager server, the Cloudera Manager agent, and the other Big Data services must also be allowed in the security groups for the corresponding roles.
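Whether a security group actually lets traffic through can be probed from the machine that will run Cloudera Director. The following is a minimal Python sketch using plain TCP connection attempts. The helper names `port_is_open` and `check_ports` are hypothetical. Port 22 is the default SSH port, and 7189 and 7180 are the Cloudera Director and Cloudera Manager ports used in the URLs of this document; ports for other services are not covered and would have to be added.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port to whether it accepted a TCP connection."""
    return {port: port_is_open(host, port, timeout) for port in ports}


if __name__ == "__main__":
    # 127.0.0.1 is only a stand-in; replace it with an instance's private IP address.
    print(check_ports("127.0.0.1", [22, 7189, 7180], timeout=1.0))
```

A closed port only shows that nothing answered. It does not distinguish a blocking security group from a service that is not running yet.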
- An instance running Cloudera Director. The user needs a server (a bare-metal machine or a virtual instance) to run the Cloudera Director service. This server must be able to connect to the instances in the OpenStack cloud by their private IP addresses.
- java-1.8.0-openjdk (CentOS, RedHat) or openjdk-8-jdk (Ubuntu), or a later version, is required on the Cloudera Director server to support the OpenStack plugin.
- The user should know the following information about the OpenStack platform:
- KeyStone endpoint, a URL in a format like "http://172.16.0.1:5000/v2.0/"
- User name
- Tenant name
- User password
- Region name
- Availability zone name (if any)
- Floating IP pool name
- Private network ID
- Image IDs for the Cloudera Manager instance and Cluster node instances
- Keypair for the instances to be created
- Security group names for the instances to be created
- User names of the instances to be created. They are determined by the chosen images.
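The items above can be gathered in one place before filling in the Cloudera Director forms. The following Python sketch is only a checklist helper under assumed names: the class `OpenStackSettings`, its field names, and the `missing` method are hypothetical and do not correspond to any plugin or Cloudera Director API. The availability zone is treated as optional because the list marks it as conditional.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class OpenStackSettings:
    """Hypothetical container for the values listed above."""
    keystone_endpoint: str
    user_name: str
    tenant_name: str
    password: str
    region: str
    floating_ip_pool: str
    private_network_id: str
    image_ids: List[str]
    keypair: str
    security_groups: List[str]
    ssh_user_names: List[str]
    availability_zone: Optional[str] = None

    def missing(self) -> List[str]:
        """Return the names of required fields that are still empty."""
        return [
            f.name
            for f in fields(self)
            if f.name != "availability_zone" and not getattr(self, f.name)
        ]


if __name__ == "__main__":
    settings = OpenStackSettings(
        keystone_endpoint="http://172.16.0.1:5000/v2.0/",  # example URL from the list above
        user_name="", tenant_name="", password="", region="",
        floating_ip_pool="", private_network_id="", image_ids=[],
        keypair="", security_groups=[], ssh_user_names=[],
    )
    print("Still missing:", settings.missing())
```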
The following quick start shows how to deploy Cloudera Director, enable the OpenStack plugin, and use it to deploy a Cloudera cluster:
-
Enable OpenStack Plugin
-
Download the plugin source and build the plugin with mvn.
git clone https://github.com/cloudera/director-openstack-plugin/
cd director-openstack-plugin && mvn clean package
The jar file is generated at director-openstack-plugin/target/openstack-1.0.0-SNAPSHOT.jar.
-
Upload the jar file to the Cloudera Director server and place it at /var/lib/cloudera-director-plugins/openstack-provider-1.0.0/openstack-provider-1.0.0.jar
-
Set the ownership of the jar file and restart the Cloudera Director service by running the following commands on the Cloudera Director server.
sudo chown cloudera-director:cloudera-director /var/lib/cloudera-director-plugins/openstack-provider-1.0.0/openstack-provider-1.0.0.jar
sudo service cloudera-director-server restart
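The built jar and the installed jar have different file names, so the copy step is easy to get wrong. The following Python sketch shows the renaming copy under the directory layout given above. The function name `install_plugin` and the constant `PLUGIN_NAME` are hypothetical, and the sketch does not change ownership or restart the service; those remain the commands shown above.

```python
import shutil
import tempfile
from pathlib import Path

# Directory and file name taken from the upload step above.
PLUGIN_NAME = "openstack-provider-1.0.0"


def install_plugin(built_jar: Path, plugins_root: Path) -> Path:
    """Copy the built jar to <plugins_root>/<PLUGIN_NAME>/<PLUGIN_NAME>.jar and return that path."""
    dest_dir = plugins_root / PLUGIN_NAME
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / (PLUGIN_NAME + ".jar")
    shutil.copyfile(built_jar, dest)
    return dest


if __name__ == "__main__":
    # Dry run in a temporary directory with a stand-in file instead of the real jar.
    with tempfile.TemporaryDirectory() as tmp:
        fake_jar = Path(tmp) / "openstack-1.0.0-SNAPSHOT.jar"
        fake_jar.write_bytes(b"placeholder")
        print(install_plugin(fake_jar, Path(tmp) / "plugins"))
```

On a real server, `plugins_root` would be /var/lib/cloudera-director-plugins and the script would need permission to write there.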
-
Log in to the Cloudera Director Dashboard
- Open a browser and enter the Cloudera Director URL: http://<cloudera_director_ip>:7189/
- Log in with the default user name and password: admin/admin.
-
Add a User Environment
- Click "Add Environment", and select "OpenStack" for "Cloud Provider".
- Enter the required information and click "Continue":
- Environment Name
- KeyStone Endpoint
- OpenStack Tenant Name
- OpenStack User Name
- OpenStack User Password
- Region
- Instance SSH User Name
- Instance SSH Private Key
-
Create a Cloudera Manager Instance
- On the environment page, click the pull-down menu and select "Add Cloudera Manager".
- Enter a Cloudera Manager name.
- Click the pull-down menu to create a new instance template.
- Enter the information required by the instance template:
- Instance template name
- Instance flavor name (the selected flavor should have at least 8 GB of memory; more than 12 GB is recommended)
- Image ID
- Security group names
- Network ID
- Keypair name
- The user may also need to enter the following additional information:
- Instance name prefix
- Availability zone
- Floating IP pool name
- Volume number
- Volume size
- SSH user name
- Bootstrap script
- Alternatively, the user can click the pull-down menu to create a DB Server instance instead of using the embedded DB.
- Enter the information required by the DB Server:
- DB Name
- Master username
- Master user password (see Known Limitations; currently the user has to enter a constant value)
- DB engine
- Flavor ID
- Volume Size
- Save the template and click "Continue" to create the Cloudera Manager instance. Wait until the process succeeds and Cloudera Manager becomes ready.
-
Create a Cloudera Cluster
- On the environment page, click the pull-down menu and select "Add Cluster".
- Enter a cluster name.
- Choose the services to enable. Here we just choose the default, "Core Hadoop".
- Create instance templates for the master, worker, and gateway nodes, as for the Cloudera Manager instance.
- Choose the instance counts for master, worker, and gateway. We suggest 1 master, 1 gateway, and at least 3 workers.
- Click "Continue" to create the cluster. Wait until the process succeeds and the Cloudera cluster becomes healthy.
-
Set Up an Environment Using a Proxy to Connect to the Internet
If you are using an OpenStack setup in which instances must access the internet through a proxy, follow these steps:
-
In the images for the instances, add "proxy=http://<proxy_host>:" to the file /etc/yum.conf so that the instances can install packages from the internet.
-
On the Cloudera Director server, add the proxy information to the Director service with the following commands:
echo lp.proxy.http.host: <proxy_host> >> /etc/cloudera-director-server/application.properties
echo lp.proxy.http.port: <port> >> /etc/cloudera-director-server/application.properties
service cloudera-director-server restart
-
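After creating the Cloudera Manager instance (step 4 of the quick start) and before creating the cluster (step 5), log in to the Cloudera Manager UI at http://:7180 with the user name and password admin/admin, and change the proxy settings as follows: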
-
Click Administration->Settings
-
Search for "proxy"
-
Set the "Proxy Server" and "Proxy Port" values to your proxy, and click "Save Changes"
-
SSH into the Cloudera Manager instance and restart the Cloudera Manager Server service with the following command:
sudo service cloudera-scm-server restart
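The Cloudera Director proxy settings in the second step above are appended with echo, so running that step twice leaves duplicate entries in application.properties. The following Python sketch shows one way to set a key so that repeating the step gives the same result. The function name `set_property` is hypothetical, and the parsing is a simplification that only recognizes "key: value" and "key=value" lines. It works on text only; reading and writing /etc/cloudera-director-server/application.properties is left to the caller.

```python
def set_property(text: str, key: str, value: str) -> str:
    """Return properties text with `key: value` set, replacing an existing entry if present."""
    lines = text.splitlines()
    new_line = f"{key}: {value}"
    replaced = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # leave comments untouched
        # Treat the first "=" like ":" so that both separators are recognized.
        name = stripped.replace("=", ":", 1).split(":", 1)[0].strip()
        if name == key:
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # proxy.example.com and 3128 are placeholder values.
    text = set_property("", "lp.proxy.http.host", "proxy.example.com")
    text = set_property(text, "lp.proxy.http.port", "3128")
    print(text, end="")
```

The Cloudera Director service still has to be restarted after the file changes, as in the commands above.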
-
How to build an auto-resize root partition image
Here we use CentOS 6.7 as an example. You can download the original image from http://cloud.centos.org/centos/6/images/CentOS-6-x86_64-GenericCloud-1509.qcow2. Use the fresh OS image to launch a VM instance, SSH into the instance, and switch to the root user with su to run the following commands:
# Set proxy if needed
# export http_proxy=http://<proxy_host>:<port>/
# export https_proxy=http://<proxy_host>:<port>/
# echo proxy=http://<proxy_host>:<port>/ >> /etc/yum.conf
yum update -y
yum install -y epel-release
# If internet access via https is limited, replace https with http in the repo files.
# sed -i "s/https/http/g" /etc/yum.repos.d/*.repo
yum install -y nscd wget cloud-init cloud-utils cloud-utils-growpart dracut-modules-growroot
rpm -qa kernel | perl -pe 's/^kernel-//' | xargs -I {} dracut -f /boot/initramfs-{}.img {}
touch /root/firstrun
Now you can keep a snapshot of the instance as an auto-resize image for future use.
-
How to set up a Cloudera Director server
You can easily create a VM instance running the Cloudera Director service in your OpenStack platform from a fresh OS image. We again use CentOS 6.7 as an example. After running the image-building commands above, continue with the following commands:
wget -O /etc/yum.repos.d/cloudera-director.repo http://archive.cloudera.com/director/redhat/6/x86_64/director/cloudera-director.repo
wget -O /etc/yum.repos.d/cloudera-cdh5.repo http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/cloudera-cdh5.repo
wget -O /etc/yum.repos.d/cloudera-manager.repo http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/cloudera-manager.repo
# sed -i "s/https/http/g" /etc/yum.repos.d/*.repo
yum install -y java-1.8.0-openjdk cloudera-director-server cloudera-director-client
# Add proxy info to the Cloudera Director configuration file.
# echo lp.proxy.http.host: <proxy_host> >> /etc/cloudera-director-server/application.properties
# echo lp.proxy.http.port: <port> >> /etc/cloudera-director-server/application.properties
service cloudera-director-server restart
Now you can use this instance as the Cloudera Director server. You can also keep a snapshot of the instance as a Cloudera Director image for future use.
Copyright © 2016 Intel Corp. Licensed under the Apache License.