Dockerized HDP Cluster
Built and tested with the latest version of Docker for Mac and CentOS. Older versions of Docker provided by docker-machine and/or Docker Toolbox will not work.
Trying this on Windows? Please let me know how it works out.
##Project Goals:
These containers are not pushed to DockerHub, thus you'll need to build them locally:
docker-compose -f examples/compose/single-container.yml build
A successful build looks like:
docker-hdp randy> docker images
REPOSITORY          TAG      IMAGE ID       CREATED          SIZE
hdp/node            latest   cacb20b1b0d3   15 seconds ago   7.682 GB
hdp/ambari-server   latest   b0fad41dd49c   15 minutes ago   2.492 GB
hdp/postgres        latest   ad42250d5c8b   23 minutes ago   320.2 MB
centos              6        cf2c3ece5e41   3 weeks ago      194.6 MB
postgres            latest   7ee9d2061970   6 weeks ago      275.3 MB
##Running HDP 2.5:
To run 3 containers (postgres, ambari-server, and a "single container HDP cluster"):
docker-compose -f examples/compose/single-container.yml up
After a minute or so, you can access Ambari's Web UI at localhost:8080. Default User/PW is admin/admin.
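If you script the startup, polling Ambari is more reliable than waiting a fixed minute. The following is a minimal sketch and not part of this repo: `wait_for_ambari` is a hypothetical helper name, and it assumes the default admin/admin credentials and that Ambari's `/api/v1/clusters` endpoint answers once the server is ready.

```shell
# Hypothetical helper (not part of this repo): poll Ambari until it answers.
# Usage: wait_for_ambari [host:port] [max_attempts] [sleep_seconds]
wait_for_ambari() {
  ambari="${1:-localhost:8080}"
  attempts="${2:-30}"
  pause="${3:-5}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors (4xx/5xx)
    if curl -sf -o /dev/null --user admin:admin "http://$ambari/api/v1/clusters"; then
      echo "Ambari is up at $ambari"
      return 0
    fi
    sleep "$pause"
    i=$((i + 1))
  done
  echo "Ambari did not respond after $attempts attempts" >&2
  return 1
}
```

With the defaults, `wait_for_ambari` gives up after roughly two and a half minutes; pass a larger attempt count on slower machines.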
##Using Ambari Blueprints:
To snapshot your cluster's configuration into a blueprint:
# You can extract a blueprint as soon as you click Deploy. No need to wait for install to complete.
curl --user admin:admin -H 'X-Requested-By:admin' 'localhost:8080/api/v1/clusters/dev?format=blueprint' > examples/blueprints/single-container.json
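Because the export is redirected straight into a file, an Ambari error response (for example, a wrong cluster name) ends up saved as if it were a blueprint. A quick sanity check can catch that. This is a sketch under assumptions: `check_blueprint` is a hypothetical helper, and it relies on exported blueprints containing `"Blueprints"` and `"host_groups"` sections.

```shell
# Hypothetical helper: sanity-check a blueprint exported from Ambari.
# Usage: check_blueprint examples/blueprints/single-container.json
check_blueprint() {
  file="$1"
  # -s is true only if the file exists and is not empty
  if [ ! -s "$file" ]; then
    echo "missing or empty: $file" >&2
    return 1
  fi
  # An exported blueprint is expected to carry both of these sections
  for key in '"Blueprints"' '"host_groups"'; do
    if ! grep -q "$key" "$file"; then
      echo "no $key section in $file (is it an Ambari error response?)" >&2
      return 1
    fi
  done
  echo "ok: $file"
}
```

This only checks for the expected top-level keys; it does not validate the JSON itself.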
Note: I give Docker 7 cores and 14GB of RAM. If you're running with less, you should generate your own Ambari Blueprints with the recommendations Ambari provides (it should auto-detect your environment's available resources).
To submit your blueprint to Ambari and have it install your cluster:
# Can swap "single-container" for multi-container, or any type saved in examples/blueprints and examples/hostgroups
sh submit-blueprint.sh single-container examples/blueprints/single-container.json
There are additional blueprints for common test-beds in examples/blueprints, including Hive-LLAP and HBase-Phoenix.
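For reference, a blueprint submission against Ambari's REST API generally takes two calls: register the blueprint, then create a cluster from a template that maps hosts to host groups. The sketch below illustrates that flow; it is not a copy of submit-blueprint.sh, which may work differently. The function name `register_and_deploy`, the `AMBARI_HOST` variable, and the idea that a matching host-group template lives in examples/hostgroups are assumptions.

```shell
# Sketch only: a blueprint submission against Ambari's REST API.
# register_and_deploy is a hypothetical name; submit-blueprint.sh may differ.
# Usage: register_and_deploy <blueprint-name> <blueprint.json> <hostgroups.json> [cluster]
register_and_deploy() {
  name="$1"; blueprint="$2"; hostgroups="$3"; cluster="${4:-dev}"
  # AMBARI_HOST is an assumed override; defaults to the local Ambari server
  api="http://${AMBARI_HOST:-localhost:8080}/api/v1"

  # 1. Register the blueprint under the given name.
  curl -sf --user admin:admin -H 'X-Requested-By:admin' \
    -X POST -d @"$blueprint" "$api/blueprints/$name" || return 1

  # 2. Create the cluster from a template that maps hosts to host groups.
  #    The template must name the same blueprint in its "blueprint" field.
  curl -sf --user admin:admin -H 'X-Requested-By:admin' \
    -X POST -d @"$hostgroups" "$api/clusters/$cluster"
}
```

The cluster name defaults to `dev` to match the export URL used above.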
##Notes:
##Helpful Hints:
If HDFS is having issues starting up or is not leaving SafeMode, it's probably because docker-compose is re-using containers from a previous run.
To start with fresh containers, before each run do:
docker-compose -f examples/compose/multi-container.yml rm
Going to remove compose_ambari-server.dev_1, compose_dn0.dev_1, compose_master0.dev_1, compose_postgres.dev_1
Are you sure? [yN] y
Removing compose_ambari-server.dev_1 ... done
Removing compose_dn0.dev_1 ... done
Removing compose_master0.dev_1 ... done
Removing compose_postgres.dev_1 ... done
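To skip the confirmation prompt and make the fresh start a single step, the two commands can be wrapped together. This is a sketch: `fresh_up` and the `COMPOSE` override are hypothetical names, not part of this repo. It uses `docker-compose rm -f -v`, which removes stopped containers without asking and also removes their anonymous volumes.

```shell
# Hypothetical helper: discard old containers, then start fresh ones.
# COMPOSE can be overridden to point at a different docker-compose binary.
# Usage: fresh_up examples/compose/multi-container.yml
fresh_up() {
  compose_file="$1"
  compose="${COMPOSE:-docker-compose}"
  # -f skips the confirmation prompt, -v also removes anonymous volumes
  "$compose" -f "$compose_file" rm -f -v &&
    "$compose" -f "$compose_file" up
}
```

Note that `rm` only removes stopped containers, so stop a running cluster first.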
Docker for Mac sometimes runs out of storage space. I recommend adding the following to your ~/.bash_profile and restarting your terminal:
function docker-cleanup(){
  # remove untagged (dangling) images
  docker rmi $(docker images -q -f dangling=true)
  # remove unused volumes
  docker volume rm $(docker volume ls -q -f dangling=true)
  # `shotgun` remove compose-created networks (pass IDs only; networks still in use are left alone)
  docker network rm $(docker network ls -q -f name=_default)
  # remove containers that exited with a non-zero code; I skip exit 0 as I have old scripts using data containers.
  docker rm -v $(docker ps -a | grep -E "Exited \([1-9]" | awk '{ print $1 }')
}
Run "docker-cleanup" if you run into Docker errors or "No space left on device" issues inside containers.
Since Hadoop UIs often link to hostnames, add the following to your hosts file (writing to /etc/hosts requires root, hence sudo tee):
echo "127.0.0.1 ambari-server ambari-server.dev" | sudo tee -a /etc/hosts
echo "127.0.0.1 master0 master0.dev" | sudo tee -a /etc/hosts
echo "127.0.0.1 dn0 dn0.dev" | sudo tee -a /etc/hosts
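Running the hosts commands more than once leaves duplicate lines behind. A minimal sketch of an idempotent variant follows; `add_host_entry` is a hypothetical helper, and falling back to `sudo tee` when the file is not writable is an assumption about your setup.

```shell
# Hypothetical helper: append a hosts entry only if it is not already there.
# Usage: add_host_entry "127.0.0.1 dn0 dn0.dev" [hosts-file]
add_host_entry() {
  entry="$1"
  hosts_file="${2:-/etc/hosts}"
  # -x matches the whole line, -F treats the entry as a literal string
  if grep -qxF "$entry" "$hosts_file" 2>/dev/null; then
    echo "already present: $entry"
  elif [ -w "$hosts_file" ]; then
    echo "$entry" >> "$hosts_file" && echo "added: $entry"
  else
    # Not writable as the current user: append through sudo instead
    echo "$entry" | sudo tee -a "$hosts_file" > /dev/null && echo "added: $entry"
  fi
}
```

Call it once per entry, for example `add_host_entry "127.0.0.1 master0 master0.dev"`.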
TODO: