10.14
Fill in the blanks below to complete the following Apache Spark program, which computes the number of occurrences of each word in a file. For simplicity, we assume that words occur only in lowercase and that there are no punctuation marks.
JavaRDD<String> textFile = sc.textFile("hdfs://...");
JavaPairRDD<String, Integer> counts = textFile.____(s -> Arrays.asList(s.split(" "))._____())
    .mapToPair(word -> new _______).reduceByKey((a, b) -> a + b);
Answer:
JavaRDD<String> textFile = sc.textFile("hdfs://...");
JavaPairRDD<String, Integer> counts = textFile.flatMap(s -> Arrays.asList(s.split(" ")).iterator())
    .mapToPair(word -> new Tuple2<>(word, 1)).reduceByKey((a, b) -> a + b);
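To see what each filled-in step contributes, the same pipeline can be imitated on a small in-memory list. The sketch below is plain Java and does not use Spark: java.util.stream stands in for the RDD operations, and the class name WordCountSketch, the method countWords, and the sample lines are hypothetical, chosen only for illustration. Splitting each line into words plays the role of flatMap, and the toMap collector combines the roles of mapToPair (each word starts with the value 1) and reduceByKey (values for equal keys are merged with a + b).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Plain-Java sketch (no Spark): imitates flatMap -> mapToPair -> reduceByKey
// on an in-memory list of lines. All names here are hypothetical.
class WordCountSketch {

    static Map<String, Integer> countWords(List<String> lines) {
        return lines.stream()
                // Like flatMap: every line is expanded into its individual words.
                .flatMap(line -> Arrays.stream(line.split(" ")))
                // Like mapToPair + reduceByKey: key by word, start each at 1,
                // and merge the values of equal keys with a + b.
                // A TreeMap is used only so that the printed order is sorted.
                .collect(Collectors.toMap(
                        word -> word,
                        word -> 1,
                        (a, b) -> a + b,
                        TreeMap::new));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("to be or not to be", "to do");
        System.out.println(countWords(lines)); // prints {be=2, do=1, not=1, or=1, to=3}
    }
}
```

Unlike this sketch, the Spark version evaluates lazily and distributes the work across partitions, with reduceByKey combining counts within each partition before shuffling them.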