This article explains how to write, build, and run a simple Scala program with Apache Spark on Windows. It is intended for testing purposes only.
Prerequisites: Apache Spark 1.5.x, Scala 2.11, and sbt installed on your Windows machine.
Create a simple Scala file, SimpleApp.scala
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {
  def main(args: Array[String]) {
    // Required on Windows: point Hadoop at the directory containing bin\winutils.exe
    System.setProperty("hadoop.home.dir", "C:\\winutil\\")
    val logFile = "C:/spark-1.5.1-bin-hadoop2.6/README.md"
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    // Read the file as an RDD with 2 partitions and cache it, since it is scanned twice
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}
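The filter/count logic above can be checked without a Spark installation, since the same operations exist on plain Scala collections. The sketch below uses made-up sample lines purely for illustration:

```scala
// Minimal sketch of the same filter/count logic on plain Scala
// collections; the sample lines are invented for illustration only.
object FilterCountSketch {
  def main(args: Array[String]): Unit = {
    val lines = Seq("apples and bananas", "oranges", "bread and butter")
    // Equivalent to logData.filter(...).count() in the Spark version
    val numAs = lines.count(line => line.contains("a"))
    val numBs = lines.count(line => line.contains("b"))
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}
```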
You need to set the hadoop.home.dir system property to the path of the Hadoop home directory. This is required only on Windows. Download winutils.exe from the following link and place it in C:\winutil\bin\ (the property points to the parent of the bin directory):
http://public-repo-1.hortonworks.com/hdp-win-alpha/winutils.exe
Create an sbt build file, SimpleApp.sbt
name := "Simple App project"
version := "1.0"
scalaVersion := "2.11.5"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.1"
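The %% operator in the dependency line appends the Scala binary version from scalaVersion to the artifact name, so it is equivalent to the explicit single-% form:

```scala
// Equivalent explicit dependency: %% expands to the artifact name
// with the Scala binary version (here 2.11) appended.
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "1.5.1"
```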
Your directory structure should be as follows:
SimpleApp.sbt
src\main\scala\SimpleApp.scala
Building and deploying the package
sbt package
spark-submit --class "SimpleApp" --master local[4] target\scala-2.11\simple-app-project_2.11-1.0.jar

sbt package compiles the sources and writes the jar under target\scala-2.11; the file name is derived from the project name, the Scala binary version, and the project version. The --master local[4] option runs Spark locally with four worker threads.