
Showing posts with the label Bash scripting

Apache Spark :: Error Resolution :: 'value join is not a member of org.apache.spark.rdd.RDD'

ERROR DESCRIPTION

Consider two Spark RDDs that need to be joined. Say rdd1.first is of the form (Int, Int, Float) = (1,957,299.98), while rdd2.first is of the form (Int, Int) = (25876,1), and the join is supposed to take place on the first field of both RDDs.

scala> rdd1.join(rdd2)

results in an error:

<console>:**: error: value join is not a member of org.apache.spark.rdd.RDD[(Int, Int, Float)]

REASON

join is only available on RDDs of key-value pairs, that is, RDDs whose elements are two-element tuples. Here rdd1, whose elements look like (1,957,299.98), does not obey this rule, while rdd2, whose elements look like (25876,1), does.

RESOLUTION

Convert each element of rdd1 from (1,957,299.98) to a key-value pair of the form (1,(957,299.98)) before joining it with rdd2, as shown below:

scala> val rdd1KV = rdd1.map(x=>(x.split(",")(1).toInt,(x.split(",...

Automated bash script to export all Hive DDLs from an existing environment at one go!

Exporting all Hive DDLs from an existing environment

The script below scans through all the databases on your Hive system and writes the CREATE TABLE statements for all of their tables to a single file. This is helpful when you need to set up a new environment based on an existing one. It has been tested on databases with over 500 tables.

The steps are as follows:

1. Create a script, say hive_ddls_export.sh, on a Hadoop box that has the Hive CLI installed, with the following content:

    #!/bin/bash
    databases=`hive -e "show databases;"`
    all_db_names=${databases}
    datetoday=`date +%Y-%m-%d_%H:%M:%S`
    touch dev_hive_ext_tables_$datetoday.sql
    chmod 744 dev_hive_ext_tables_$datetoday.sql
    for listofdatabases in $all_db_names
    do
      tables=`hive -e "use $listofdatabases;show tables;"`
      all_tab_names=`echo "${tables}"`
      echo " /****  Start DDLs for Tables in ${listofdatabases} ****/ " >> dev_hive...
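The excerpt above is cut off before the loop over tables, so the following is only a minimal sketch of how such an export could be structured, not the post's actual script. It assumes the Hive CLI accepts -S (silent) and -e (inline query) and that SHOW CREATE TABLE is available. The function name export_hive_ddls and the HIVE_BIN override variable are hypothetical names introduced here for illustration.

```shell
#!/bin/bash
# Sketch: write the CREATE TABLE statement of every table in every
# Hive database to one output file.
# HIVE_BIN is a hypothetical override hook (not part of the original post);
# it defaults to the Hive CLI on the PATH.
HIVE_BIN="${HIVE_BIN:-hive}"

# export_hive_ddls OUTFILE  (hypothetical helper name)
export_hive_ddls() {
  local outfile="$1"
  local db table

  : > "$outfile"   # create the output file, or truncate it if it exists

  # Outer loop: one pass per database returned by "show databases".
  for db in $("$HIVE_BIN" -S -e "show databases;"); do
    echo "/**** Start DDLs for tables in ${db} ****/" >> "$outfile"

    # Inner loop: one "show create table" call per table in this database.
    for table in $("$HIVE_BIN" -S -e "use ${db}; show tables;"); do
      "$HIVE_BIN" -S -e "use ${db}; show create table ${table};" >> "$outfile"
      echo ";" >> "$outfile"   # terminate each statement so the file can be replayed
    done

    echo "/**** End DDLs for tables in ${db} ****/" >> "$outfile"
  done
}
```

To run it, call the function with an output file name, for example export_hive_ddls "dev_hive_ext_tables_$(date +%Y-%m-%d_%H-%M-%S).sql". Each table costs one Hive CLI start-up in this sketch, so it can be slow on large systems.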