Scala Exercises: Student Scores Case Study - CodingNote.cc

July 20, 2022
Notes
scala

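I. Background
Tasks:
1. Count the number of students in each class
2. Compute each student's total score
3. List the per-subject scores of the ten students with the highest total score in the grade
4. Find the students whose total score is above the grade average
5. Find the students who passed every subject
6. Find the 100 students whose scores are the most uneven across subjects
Sample data (excerpts):
1. Student information: students.txt (student ID, name, age, gender, class)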

1500100001,施笑槐,22,女,文科六班
1500100002,呂金鵬,24,男,文科七班
1500100003,單樂蕊,22,女,理科六班
1500100004,葛德曜,24,男,理科三班
1500100005,宣谷芹,22,女,理科五班
1500100006,邊昂雄,21,男,理科二班
1500100007,尚孤風,23,女,文科六班
1500100008,符半雙,22,女,理科六班
1500100009,沈德昌,21,男,理科一班
1500100010,羿彥昌,23,男,理科六班
1500100011,宰運華,21,男,理科三班
1500100012,梁易槐,21,女,理科一班
1500100013,逯君昊,24,男,文科二班
1500100014,羿旭炎,23,男,理科五班
1500100015,宦懷綠,21,女,理科一班
1500100016,潘訪煙,23,女,文科一班

2. Student scores (excerpt): score.txt (student ID, subject ID, score)

1500100001,1000001,98
1500100001,1000002,5
1500100001,1000003,0
1500100001,1000004,29
1500100001,1000005,85
1500100001,1000006,52
1500100002,1000001,139
1500100002,1000002,102
1500100002,1000003,44
1500100002,1000004,18
1500100002,1000005,46
1500100002,1000006,91
1500100003,1000001,48

3. Subject information (excerpt): subject ID, subject name, full marks

1000001,語文,150
1000002,數學,150
1000003,英語,150
1000004,政治,100
1000005,歷史,100
1000006,物理,100
1000007,化學,100
1000008,地理,100
1000009,生物,100
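
The three layouts above can be modeled as small record types before any statistics are computed. The sketch below is only an illustration: the case classes `Student`, `Score` and `Subject`, their field names and the `parse*` helpers are assumptions that do not appear in the original code, which works directly on tuples. Rows that do not match the expected layout are turned into `None` instead of throwing.

```scala
import scala.util.Try

object SampleRecords {
  // Hypothetical record types; the field names are guesses based on the sample rows.
  final case class Student(id: String, name: String, age: Int, gender: String, clazz: String)
  final case class Score(studentId: String, subjectId: String, score: Int)
  final case class Subject(id: String, name: String, fullMarks: Int)

  // Each parser returns None for a row with the wrong number of fields
  // or with a non-numeric value where a number is expected.
  def parseStudent(line: String): Option[Student] = line.split(",") match {
    case Array(id, name, age, gender, clazz) =>
      Try(Student(id, name, age.toInt, gender, clazz)).toOption
    case _ => None
  }

  def parseScore(line: String): Option[Score] = line.split(",") match {
    case Array(studentId, subjectId, score) =>
      Try(Score(studentId, subjectId, score.toInt)).toOption
    case _ => None
  }

  def parseSubject(line: String): Option[Subject] = line.split(",") match {
    case Array(id, name, fullMarks) =>
      Try(Subject(id, name, fullMarks.toInt)).toOption
    case _ => None
  }

  def main(args: Array[String]): Unit = {
    println(parseStudent("1500100001,施笑槐,22,女,文科六班"))
    println(parseScore("1500100001,1000002,5"))
    println(parseSubject("1000001,語文,150"))
    println(parseScore("broken line"))
  }
}
```

With this shape, `lines.flatMap(parseScore)` would parse and drop dirty rows in one step, which is one possible alternative to the split-then-filter pattern used in the solutions.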

II. Solutions

1. Count the number of students in each class

package shujia

import scala.io.Source

// 1. Count the number of students in each class
/**
 * Every method below returns a new collection; none of them modifies the original.
 * Set offers the same methods, except for sorting.
 * foreach: iterate over the elements
 * map: transform the elements one at a time
 * filter: keep the elements that match a predicate
 * flatMap: turn one element into several
 * sortBy: sort
 * groupBy: group
 */
object Test2 {
  def main(args: Array[String]): Unit = {
    // 1. Read the student file (students.txt is assumed to sit in data/ next to score.txt)
    val students: List[String] = Source.fromFile("data/students.txt").getLines().toList

    // 2. Split each line on commas
    val stringses: List[Array[String]] = students.map(line => line.split(","))

    // 3. Drop malformed rows: a valid row is id,name,age,gender,class
    val listFilter: List[Array[String]] = stringses.filter(line => line.length == 5)

    // 4. Keep only the class name
    val classes: List[String] = listFilter.map {
      case Array(_, _, _, _, clazz) => clazz
    }

    // 5. Group by class
    val group: Map[String, List[String]] = classes.groupBy(clazz => clazz)

    // 6. Count the students in each class
    val classCount: Map[String, Int] = group.map {
      case (clazz, list) => (clazz, list.size)
    }
    classCount.foreach(println)
  }
}
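
The collection methods listed in the comment at the top of the program above can be tried on a few in-memory lines before touching any file. This is a minimal sketch: the object name `CollectionMethodsDemo` and the four input lines are made up for illustration (three of them follow the score layout from the sample data, one is deliberately malformed).

```scala
object CollectionMethodsDemo {
  val lines: List[String] = List(
    "1500100001,1000001,98",
    "1500100001,1000002,5",
    "1500100002,1000001,139",
    "bad line"
  )

  // map + filter: split every line, then keep only the rows with three fields
  val rows: List[Array[String]] = lines.map(_.split(",")).filter(_.length == 3)

  // flatMap: every line becomes several fields, all collected into one flat list
  val fields: List[String] = lines.flatMap(_.split(","))

  // sortBy: order the rows by score, highest first
  val byScoreDesc: List[Array[String]] = rows.sortBy(row => -row(2).toInt)

  // groupBy: one entry per student ID
  val byStudent: Map[String, List[Array[String]]] = rows.groupBy(row => row(0))

  def main(args: Array[String]): Unit = {
    // foreach: visit the elements for their side effect (printing here)
    byScoreDesc.foreach(row => println(row.mkString(",")))
    // The original list still has all four lines; every call above built a new collection
    println(lines.size)
  }
}
```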

2. Compute each student's total score

package com.shujia.scala

import scala.io.Source

object Demo22SumScore {
  def main(args: Array[String]): Unit = {
    /**
     * 2. Compute each student's total score
     */

    // 1. Read the score file
    val scoresList: List[String] = Source.fromFile("data/score.txt").getLines().toList

    // 2. Drop malformed rows: a valid row has exactly three fields
    val filterList: List[String] = scoresList.filter((line: String) => {
      val length: Int = line.split(",").length
      length == 3
    })

    // 3. Extract the student ID and the score
    val idAndScore: List[(String, Int)] = filterList.map(line => {
      val split: Array[String] = line.split(",")
      // Student ID
      val id: String = split.head
      // Score
      val score: Int = split.last.toInt
      (id, score)
    })

    // 4. Group by student ID
    val groupByList: Map[String, List[(String, Int)]] = idAndScore.groupBy(kv => kv._1)

    // 5. Sum the scores of each student
    val sumScoMap: Map[String, Int] = groupByList.map((kv: (String, List[(String, Int)])) => {
      val id: String = kv._1
      val scores: List[(String, Int)] = kv._2
      // All the scores of one student
      val scos: List[Int] = scores.map(sco => sco._2)
      // Total score
      val sumSco: Int = scos.sum

      (id, sumSco)
    })

    sumScoMap.foreach(println)
  }

}
/*
 * Second approach: pattern matching with case
 */

// Renamed from Demo22SumScore so that both objects can live in the same package
object Demo22SumScoreCase {
  def main(args: Array[String]): Unit = {
    /**
     * 2. Compute each student's total score
     */

    // 1. Read the score file
    val scoresList: List[String] = Source.fromFile("data/score.txt").getLines().toList

    // 2. Split each line on commas
    val splitList: List[Array[String]] = scoresList.map(line => line.split(","))

    // 3. Drop malformed rows
    val filterList: List[Array[String]] = splitList.filter(arr => arr.length == 3)

    // 4. Extract the student ID and the score
    val idAndScore: List[(String, Int)] = filterList.map {
      case Array(id: String, _: String, sco: String) =>
        (id, sco.toInt)
    }

    // 5. Group by student ID
    val groupByList: Map[String, List[(String, Int)]] = idAndScore.groupBy(kv => kv._1)

    // 6. Sum the scores of each student
    val sumScoMap: Map[String, Int] = groupByList.map {
      case (id, list) =>
        val scos: List[Int] = list.map { case (_, sco) => sco }
        (id, scos.sum)
    }

    sumScoMap.foreach(println)
  }

}
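
As a quick check of the total-score logic, the sketch below runs it on the sample score rows from section I instead of on a file. It assumes that the first sample row belongs to student 1500100001, and it swaps groupBy followed by sum for a single foldLeft pass. That is an alternative technique, not the method used above. The names `TotalScoreSketch`, `totals` and `sample` are made up for this example.

```scala
object TotalScoreSketch {
  // Sums the scores per student in one pass; rows without exactly three fields are skipped.
  def totals(lines: List[String]): Map[String, Int] =
    lines
      .map(_.split(","))
      .collect { case Array(id, _, sco) => (id, sco.toInt) }
      .foldLeft(Map.empty[String, Int]) { case (acc, (id, sco)) =>
        acc.updated(id, acc.getOrElse(id, 0) + sco)
      }

  // Sample rows copied from the score excerpt in section I
  val sample: List[String] = List(
    "1500100001,1000001,98",
    "1500100001,1000002,5",
    "1500100001,1000003,0",
    "1500100001,1000004,29",
    "1500100001,1000005,85",
    "1500100001,1000006,52",
    "1500100002,1000001,139",
    "1500100002,1000002,102",
    "1500100002,1000003,44",
    "1500100002,1000004,18",
    "1500100002,1000005,46",
    "1500100002,1000006,91",
    "1500100003,1000001,48"
  )

  def main(args: Array[String]): Unit =
    totals(sample).toList.sortBy(_._1).foreach(println)
}
```

On these rows the totals are 269 for 1500100001 and 440 for 1500100002; 1500100003 only has one row in the excerpt, so its total is just that score.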

3. List the per-subject scores of the ten students with the highest total score in the grade

package shujia

import scala.io.Source

// 3. List the per-subject scores of the ten students with the highest total score in the grade
object Test3Top10 {
  def main(args: Array[String]): Unit = {

    // 1. Read the score file
    val lines: List[String] = Source.fromFile("data/score.txt").getLines().toList

    // 2. Split each line on commas
    val scoreArr: List[Array[String]] = lines.map(line => line.split(","))

    // 3. Drop malformed rows
    val scoreFilter: List[Array[String]] = scoreArr.filter(arr => arr.length == 3)

    // 4. Extract the student ID, subject ID and score
    val scoFilter: List[(String, String, Int)] = scoreFilter.map {
      case Array(id: String, subject: String, sco: String) =>
        (id, subject, sco.toInt)
    }

    // 5. Group by student ID
    val scoGroupBy: Map[String, List[(String, String, Int)]] = scoFilter.groupBy(kv => kv._1)

    // 6. Compute each student's total and keep the per-subject rows next to it
    val sSos: List[(String, Int, List[(String, String, Int)])] = scoGroupBy.map {
      case (id, list) =>
        val scores: List[Int] = list.map { case (_, _, sco) => sco }

        val scoSum: Int = scores.sum
        (id, scoSum, list)
    }.toList

    // 7. Sort by total score in descending order and take the first ten
    val lists: List[(String, Int, List[(String, String, Int)])] = sSos.sortBy(kv => -kv._2)
    val top10: List[(String, Int, List[(String, String, Int)])] = lists.take(10)
    top10.foreach(println)
  }
}


4. Find the students whose total score is above the grade average

import com.shujia.spark.util.HdfsUtil
import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

// 4. Find the students whose total score is above the grade average
// Grade average = sum of all students' total scores / number of students
object Test2ScoreAvg {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
    conf.setAppName("AVG")
    //    conf.setMaster("local")
    val sc = new SparkContext(conf)

    // Read the file, split each line and drop malformed rows
    val scoFilter: RDD[Array[String]] = sc.textFile("/data/student/score.txt").map(_.split(",")).filter(_.length == 3)
    // Extract (student ID, score)
    val scoRDD: RDD[(String, Int)] = scoFilter.map {
      case Array(id: String, _, sco: String) =>
        (id, sco.toInt)
    }
    // Total score per student
    val sumStusRDD: RDD[(String, Int)] = scoRDD.reduceByKey(_ + _)
    // The totals feed several actions below, so cache them to avoid recomputation
    sumStusRDD.cache()
    // Sum of all the totals in the grade
    val sumNJ: Double = sumStusRDD.map(_._2).sum()
    // Grade average: sumStusRDD has one row per student, so count() is the number of students
    val avgSum: Double = sumNJ / sumStusRDD.count()
    // Keep the students whose total is strictly above the average
    val avgtoSum: RDD[(String, Int)] = sumStusRDD.filter { case (_, sco: Int) => sco > avgSum }
    val l1: Long = avgtoSum.count()
    println(s"$l1 students scored above the average; the average total is $avgSum")
    //    avgtoSum.foreach(println)
    // HdfsUtil is a project-local helper, not part of Spark; it is used here to clear the output path before saving
    HdfsUtil.delete("/data/sum_avgToSum")
    avgtoSum.saveAsTextFile("/data/sum_avgToSum")
  }

}
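
The same rule (grade average = sum of the students' totals divided by the number of students, then keep the totals above it) can be checked without a Spark cluster. The sketch below uses plain Scala collections; the object name `AboveAverageSketch`, the function `aboveAverage` and the three totals in `main` are made up for illustration.

```scala
object AboveAverageSketch {
  // Returns the grade average together with the students whose total is strictly above it.
  def aboveAverage(totals: Map[String, Int]): (Double, Map[String, Int]) = {
    val avg: Double =
      if (totals.isEmpty) 0.0
      else totals.values.sum.toDouble / totals.size
    (avg, totals.filter { case (_, total) => total > avg })
  }

  def main(args: Array[String]): Unit = {
    // Made-up totals, only to exercise the function
    val totals = Map("s1" -> 100, "s2" -> 200, "s3" -> 300)
    val (avg, above) = aboveAverage(totals)
    println(s"${above.size} students scored above the average; the average total is $avg")
    above.foreach(println)
  }
}
```

A student whose total equals the average exactly is not reported, which matches the wording "above the average".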



5. Find the students who passed every subject

package shujia

import scala.io.Source

// 5. Find the students who passed every subject
object Test5_60fen {
  def main(args: Array[String]): Unit = {
    // Read the score file
    val list: List[String] = Source.fromFile("data/score.txt").getLines().toList
    //    list.foreach(println)
    // Split each line on commas
    val listSplit: List[Array[String]] = list.map(line => line.split(","))
    // Drop malformed rows
    val listFilter: List[Array[String]] = listSplit.filter(line => line.length == 3)
    // Extract (student ID, subject ID, score)
    val lists: List[(String, String, Int)] = listFilter.map {
      case Array(id, sub, sco) =>
        (id, sub, sco.toInt)
    }
    // Group by student ID
    val listGroup: Map[String, List[(String, String, Int)]] = lists.groupBy(line => line._1)

    // Keep only the students whose every score reaches the pass mark.
    // The pass mark is a flat 60 for all subjects, regardless of their full marks.
    val list1: List[(String, List[(String, String, Int)])] = listGroup.filter {
      case (_, scores) => scores.forall { case (_, _, sco) => sco >= 60 }
    }.toList

    list1.foreach(println)
  }
}
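
The solution above applies one pass mark of 60 to every subject, although the subject table lists full marks of 150 for some subjects and 100 for others. The sketch below shows one possible refinement under a stated assumption: a score passes when it reaches 60% of the full marks of its subject. That rule is not given in the original exercise. The names `PassAllSketch`, `passed` and `passedAll` are hypothetical, and the rows in `main` are made up.

```scala
object PassAllSketch {
  // Full marks per subject ID, copied from the first six rows of the subject excerpt in section I
  val fullMarks: Map[String, Int] = Map(
    "1000001" -> 150, "1000002" -> 150, "1000003" -> 150,
    "1000004" -> 100, "1000005" -> 100, "1000006" -> 100
  )

  // Assumption: passing means reaching 60% of the subject's full marks.
  // Integer arithmetic avoids rounding issues; unknown subjects count as not passed.
  def passed(subjectId: String, score: Int): Boolean =
    fullMarks.get(subjectId).exists(full => score * 100 >= full * 60)

  // IDs of the students whose every (student ID, subject ID, score) row is a pass
  def passedAll(scores: List[(String, String, Int)]): Set[String] =
    scores
      .groupBy(_._1)
      .filter { case (_, rows) => rows.forall { case (_, sub, sco) => passed(sub, sco) } }
      .keySet

  def main(args: Array[String]): Unit = {
    // Made-up rows: s1 passes both subjects, s2 misses the 150-point subject by one point
    val rows = List(
      ("s1", "1000001", 90), ("s1", "1000004", 60),
      ("s2", "1000001", 89), ("s2", "1000004", 100)
    )
    passedAll(rows).foreach(println)
  }
}
```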


6. Find the 100 students whose scores are the most uneven across subjects

package com.shujia.scala

import scala.io.Source

object Demo31Student {
  def main(args: Array[String]): Unit = {
    /**
     * 6. Find the 100 students whose scores are the most uneven across subjects
     *
     * Unevenness is measured by the variance of a student's scores
     */

    // 1. Read the score file
    val lines: List[String] = Source.fromFile("data/score.txt").getLines().toList

    // 2. Split each line on commas
    val scoreArr: List[Array[String]] = lines.map(line => line.split(","))

    // 3. Drop malformed rows
    val scoreFilter: List[Array[String]] = scoreArr.filter(arr => arr.length == 3)

    // 4. Extract the student ID and the score
    val idAndScore: List[(String, Int)] = scoreFilter.map {
      case Array(id: String, _, sco: String) =>
        (id, sco.toInt)
    }

    // 5. Group by student ID
    val groupBy: Map[String, List[(String, Int)]] = idAndScore.groupBy(kv => kv._1)

    // 6. Compute the variance of each student's scores
    val std: List[(String, Double, List[(String, Int)])] = groupBy.map {
      case (id, list) =>
        // All the scores of one student
        val scores: List[Int] = list.map { case (_, sco) => sco }

        /**
         * Variance in three steps:
         * 1. count the scores
         * 2. compute their mean
         * 3. average the squared deviations from the mean
         */
        // Number of subjects
        val N: Double = scores.length.toDouble
        // Mean score
        val avg: Double = scores.sum / N

        // Variance
        val variance: Double = scores.map((sco: Int) => (sco - avg) * (sco - avg)).sum / N

        (id, variance, list)
    }.toList

    // 7. Sort by variance in descending order
    val sortByStd: List[(String, Double, List[(String, Int)])] = std.sortBy(kv => -kv._2)

    // 8. Take the first 100
    val top100: List[(String, Double, List[(String, Int)])] = sortByStd.take(100)

    top100.foreach(println)
  }

}
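
The variance used above is the population variance: variance = (1/N) * sum of (score_i - mean)^2 over a student's N scores. The sketch below isolates that formula in a small helper and applies it to the two complete students from the score excerpt in section I, assuming the first sample row belongs to student 1500100001. The names `VarianceSketch` and `variance` are made up for this example.

```scala
object VarianceSketch {
  // Population variance: the mean of the squared deviations from the mean
  def variance(scores: Seq[Int]): Double = {
    require(scores.nonEmpty, "at least one score is needed")
    val n: Double = scores.length.toDouble
    val avg: Double = scores.sum / n
    scores.map(s => (s - avg) * (s - avg)).sum / n
  }

  def main(args: Array[String]): Unit = {
    // Scores of students 1500100001 and 1500100002 from the sample data
    val first = Seq(98, 5, 0, 29, 85, 52)
    val second = Seq(139, 102, 44, 18, 46, 91)
    println(f"1500100001: ${variance(first)}%.2f")
    println(f"1500100002: ${variance(second)}%.2f")
  }
}
```

Because the subjects have different full marks (150 or 100), raw scores are not directly comparable; dividing each score by the full marks of its subject before computing the variance would be one way to account for that, at the cost of reading the subject table as well.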
