What are the prerequisites for using Spark in Yarn mode with Diyotta?
Diyotta supports running Spark executions in Yarn mode. In the Spark data point, choose “Yarn Mode” to launch Spark applications created by Diyotta in Yarn mode. In Yarn mode, Diyotta always uses the Spark version that is packaged in Diyotta’s Spark extension installer. Apart from the availability of a Yarn cluster, there are two prerequisites.
First, the Spark library files should be available on HDFS. To do this, copy the contents extracted from the Diyotta Spark extension package to a location in HDFS, and provide that location in the Spark data point property “Spark Yarn Jar Location”. Second, the configuration XML files from the Hadoop environment should be copied to the Diyotta agent nodes.
The path to these files should be provided in the Spark data point property “Hadoop Conf Path”.
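The two prerequisites above can be checked with a small script before the data point is configured. The following is a minimal sketch, not part of Diyotta: the helper names are hypothetical, and the file names `core-site.xml`, `hdfs-site.xml`, and `yarn-site.xml` are assumed to be the Hadoop configuration files in question, since the text only says "configuration XML files". The second helper only builds the standard `hdfs dfs` commands for the copy to HDFS; it does not run them.

```python
from pathlib import Path

# Assumption: these are the Hadoop client configuration files needed.
# Adjust the tuple to match the files used in your Hadoop environment.
EXPECTED_CONF_FILES = ("core-site.xml", "hdfs-site.xml", "yarn-site.xml")


def missing_conf_files(conf_dir, expected=EXPECTED_CONF_FILES):
    """Return the expected file names that are absent from conf_dir.

    conf_dir is the local directory on the agent node that would be
    entered in the "Hadoop Conf Path" property.
    """
    conf_path = Path(conf_dir)
    return [name for name in expected if not (conf_path / name).is_file()]


def hdfs_upload_commands(local_lib_dir, hdfs_target_dir):
    """Build (without running) the commands that copy the extracted
    Spark library files into an HDFS directory.

    hdfs_target_dir is the location that would be entered in the
    "Spark Yarn Jar Location" property.
    """
    # List every entry of the extracted package directory explicitly,
    # so no shell wildcard expansion is needed.
    sources = sorted(str(p) for p in Path(local_lib_dir).iterdir())
    return [
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_target_dir],
        ["hdfs", "dfs", "-put", "-f", *sources, hdfs_target_dir],
    ]
```

For example, `missing_conf_files("/etc/hadoop/conf")` returns an empty list when all three files are present, and each command returned by `hdfs_upload_commands` can be passed to `subprocess.run(cmd, check=True)` on a node where the `hdfs` client is installed. The paths here are placeholders, not Diyotta defaults.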