In this section we will configure Spark 1.3.1 on Hortonworks Sandbox with HDP 2.2.
Log in as root to your Sandbox and add the repo that contains Spark 1.3.1 using the following command:
wget -nv http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.2.4.4/hdp.repo -O /etc/yum.repos.d/HDP-TP.repo
Install the Spark package with the following command:
yum install spark_2_2_4_4_16-master
It will take a few minutes to complete the installation of Spark and all its dependencies.
Let's also install PySpark with the command:
yum install spark-python
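Before moving on, it can be worth confirming that both packages actually landed. The following is a minimal sketch, not part of the original steps: check_installed is a hypothetical helper name, and it assumes that rpm -q recognizes the same package names that were passed to yum above (yum may resolve a name to a differently named RPM, in which case the query reports it as missing).

```shell
# Hypothetical helper: report whether each named RPM package is installed.
# Assumption: the names given to yum are also the installed RPM names.
check_installed() {
    for pkg in "$@"; do
        if rpm -q "$pkg" >/dev/null 2>&1; then
            echo "$pkg: installed"
        else
            echo "$pkg: missing"
        fi
    done
}
```

On the Sandbox it could be called as check_installed spark_2_2_4_4_16-master spark-python.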
Use the hdp-select command to configure the history server and client to point to the version we just installed:
hdp-select set spark-historyserver 2.2.4.4-16
hdp-select set spark-client 2.2.4.4-16
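To see which version each component now points to, hdp-select can also report its current selection. The sketch below assumes the status subcommand is available in your hdp-select build and accepts a component name. check_spark_selection is a hypothetical wrapper that simply bails out when the tool is not on the PATH.

```shell
# Hypothetical helper: print what hdp-select reports for the Spark components.
# Assumption: "hdp-select status <component>" is supported on this Sandbox.
check_spark_selection() {
    if ! command -v hdp-select >/dev/null 2>&1; then
        echo "hdp-select not found; run this on the Sandbox" >&2
        return 1
    fi
    hdp-select status spark-historyserver
    hdp-select status spark-client
}
```

Both lines of output should show the same version string that was passed to hdp-select set.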
Let’s check if all is well by running a sample.
First, let's switch to the spark user that was created during the RPM install and then cd to the $SPARK_HOME directory:
su spark
cd /usr/hdp/current/spark-client
Now let's run the SparkPi sample, which estimates the value of pi:
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client --num-executors 3 --driver-memory 512m --executor-memory 512m --executor-cores 1 lib/spark-examples*.jar 10
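For intuition about what the job computes, SparkPi uses a Monte Carlo estimate: it samples random points in a square and counts how many fall inside the inscribed unit circle. The trailing 10 on the command line is the number of slices the sampling is split into. The sketch below reproduces the same idea on a single machine with awk. It is an illustration only, not the Spark implementation. estimate_pi and the fixed seed are assumptions made for this example.

```shell
# Estimate pi by sampling random points in the square [-1, 1] x [-1, 1]
# and counting how many fall inside the unit circle.
# Hypothetical helper; the sample count defaults to 100000.
estimate_pi() {
    samples="${1:-100000}"
    awk -v n="$samples" 'BEGIN {
        srand(42)                      # fixed seed so runs are repeatable
        inside = 0
        for (i = 0; i < n; i++) {
            x = 2 * rand() - 1
            y = 2 * rand() - 1
            if (x * x + y * y < 1) inside++
        }
        # circle area / square area = pi / 4, so scale the hit ratio by 4
        printf "Pi is roughly %.4f\n", 4 * inside / n
    }'
}

estimate_pi 100000
```

The Spark job distributes this sampling across the executors requested with --num-executors and sums the per-slice counts, so its printed estimate should likewise land close to 3.14.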