Last modified: 2014-07-07 20:45:51 UTC
In a fresh labs CDH 5 cluster installed by our puppet, Oozie may not be functional.

Basic hive workflows failed for me with

  IndexOutOfBoundsException: Index: 0, Size: 0

The cause turned out to be that Oozie's sharelibs had not been properly installed. Puppet had installed only the stax jar [1] in HDFS for oozie, hence the 'unless' guard of cdh::oozie::server [2] prevented the proper install of oozie's sharelibs.

----------------------------------

* Steps to reproduce:

Create a fresh Ubuntu 12.04 CDH5 cluster (from production puppet of 2014-07-04 ~15:00:00)

  master:
    role::analytics::hadoop::master
    role::analytics::hive::server
    role::analytics::oozie::server

  worker 1, and worker 2:
    role::analytics::hadoop::worker
    role::analytics::hive::client
    role::analytics::oozie::client
    role::analytics::pig

All three instances have hadoop_namenodes set to the FQDN of the master.

* Expected Result:

Oozie has its sharelibs installed in HDFS.

* Actual Result:

Only stax-api-*.jar is present in oozie's sharelib directory [1].

* Steps to recover from broken Oozie

On master node, as root run:

  1. /usr/bin/oozie-setup sharelib create -fs hdfs:// -locallib /usr/lib/oozie/oozie-sharelib-yarn.tar.gz
  2. /etc/init.d/oozie restart

Then, running

  oozie admin -shareliblist

should give you

  [Available ShareLib]
  oozie
  hive
  distcp
  hcatalog
  sqoop
  mapreduce-streaming
  hive2
  pig

-----------------------------------

[1] hdfs dfs -ls -R /user/oozie
gives

  drwxr-xr-x   - oozie hadoop          0 2014-07-04 16:05 /user/oozie/share
  drwxr-xr-x   - oozie hadoop          0 2014-07-04 16:05 /user/oozie/share/lib
  drwxr-xr-x   - oozie hadoop          0 2014-07-04 16:05 /user/oozie/share/lib/lib_20140704160518
  drwxr-xr-x   - oozie hadoop          0 2014-07-04 16:05 /user/oozie/share/lib/lib_20140704160518/hive
  -rw-r--r--   3 oozie hadoop          0 2014-07-04 17:05 /user/oozie/share/lib/lib_20140704160518/hive/stax-api-1.0.1.jar

[2] http://git.wikimedia.org/blob/operations%2Fpuppet%2Fcdh.git/69b6d3e853d248c5977a6909eaceab72a9620284/manifests%2Foozie%2Fserver.pp#L125
It seems the issue only appears when the role::analytics::oozie::server role is added before the workers have deployed their role::analytics::hadoop::worker role. This requirement seems sane. Puppet/Oozie could be more open about this, or give a proper error message. But meh. Hence, marking the bug as invalid.
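
For anyone who wants to check quickly whether a cluster is in this state, here is a rough sketch, not something that exists in puppet. It reads the output of hdfs dfs -ls -R /user/oozie on stdin and prints every sharelib that has no directory under /user/oozie/share/lib/lib_<timestamp>/. The function name missing_sharelibs and the EXPECTED_SHARELIBS variable are made up for illustration. The list of names is simply copied from the oozie admin -shareliblist output quoted in the report, so it is an assumption about what a healthy install looks like. The check only looks at directory names, so it will not notice a sharelib directory that exists but is incomplete.

```shell
#!/bin/sh
# Sketch only: list expected Oozie sharelibs that are absent from an
# `hdfs dfs -ls -R /user/oozie` listing supplied on stdin.
# EXPECTED_SHARELIBS is an assumption, copied from the
# `oozie admin -shareliblist` output quoted in the report.
EXPECTED_SHARELIBS="oozie hive distcp hcatalog sqoop mapreduce-streaming hive2 pig"

# Hypothetical helper; not part of Oozie, CDH, or the puppet module.
missing_sharelibs() {
    # Keep only directory entries whose path has the shape
    # /user/oozie/share/lib/lib_<timestamp>/<name> and print <name>.
    # Splitting on "/" yields an empty first element, so <name> is element 7.
    present=$(awk '$1 ~ /^d/ {
        n = split($NF, p, "/")
        if (n == 7 && p[6] ~ /^lib_/) print p[7]
    }')

    # Print every expected name that has no matching directory.
    for lib in $EXPECTED_SHARELIBS; do
        printf '%s\n' "$present" | grep -qxF "$lib" || printf '%s\n' "$lib"
    done
}
```

On the master this would be used as

  hdfs dfs -ls -R /user/oozie | missing_sharelibs

With the listing from [1] it prints every name except hive, because the hive directory exists even though it only holds the stax jar. Empty output means all expected directories are there, not that their contents are complete.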