Using Java Streams

November 28, 2021
Notes
JAVA

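Creating streams

There are many ways to create a stream. Since JDK 8, many classes have gained methods that create a corresponding stream, for example the lines() method of BufferedReader and the splitAsStream method of Pattern. In day-to-day development, however, streams are used almost exclusively on collections, so knowing the following ways of creating them is enough: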

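import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Collections and arrays
List<String> list = new ArrayList<>();
String[] arr = new String[]{};

Stream<String> listStream = list.stream();
Stream<String> arrayStream = Arrays.stream(arr);

// Static factory method: create a stream of the given elements
Stream<String> programStream = Stream.of("java", "c++", "c");

// Merge several collections into one stream
List<String> sub1 = new ArrayList<>();
List<String> sub2 = new ArrayList<>();
Stream<String> concatStream = Stream.concat(sub1.stream(), sub2.stream());

// Pass a Supplier that keeps producing data to create an infinite stream.
// The stream below yields random integers below 10 without end
// (RandomUtil looks like Hutool's utility class; new Random().nextInt(10) is a JDK alternative).
Stream<Integer> generateStream = Stream.generate(() -> RandomUtil.randomInt(10));
// Calling forEach directly on generateStream would never terminate, and a stream
// can be consumed only once, so bound it with limit before printing.
// limit returns the first n elements of the stream; only 2 random numbers are printed here
Stream<Integer> limitStream = generateStream.limit(2);
limitStream.forEach(System.out::println);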
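
The other creation methods mentioned at the top of this section can be sketched with the JDK alone. The example below is only an illustration: it shows Pattern.splitAsStream and BufferedReader.lines, plus Stream.iterate as another way to build an infinite stream that limit then bounds. The class and method names (StreamSourcesDemo, splitTokens, readLines, firstPowersOfTwo) are made up for this sketch and do not come from the original notes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class StreamSourcesDemo {

    // Pattern.splitAsStream: split a string into a stream of tokens.
    static List<String> splitTokens(String csv) {
        return Pattern.compile(",").splitAsStream(csv).collect(Collectors.toList());
    }

    // BufferedReader.lines: one stream element per line of text.
    static List<String> readLines(String text) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return reader.lines().collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Stream.iterate builds an infinite stream from a seed; limit makes it finite.
    static List<Integer> firstPowersOfTwo(int n) {
        return Stream.iterate(1, x -> x * 2).limit(n).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(splitTokens("java,c++,c")); // [java, c++, c]
        System.out.println(readLines("one\ntwo"));     // [one, two]
        System.out.println(firstPowersOfTwo(5));       // [1, 2, 4, 8, 16]
    }
}
```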

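Intermediate operations

Filtering

filter: takes a predicate (Predicate<T>) and keeps only the elements for which it returns true

limit: truncates the stream, keeping the first n elements

skip: skips the first n elements

distinct: removes duplicate elements, compared using the elements' equals method

These four filtering methods are straightforward. An example: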

List<Map<String, Object>> source = ImmutableList.of(
  ImmutableMap.of("id", 1),
  ImmutableMap.of("id", 2),
  ImmutableMap.of("id", 2),
  ImmutableMap.of("id", 3),
  ImmutableMap.of("id", 4)
);
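Stream<Map<String, Object>> target = source.stream()
  // Keep only the entries with id > 1 (Convert appears to be Hutool's conversion utility)
  .filter(item -> Convert.toInt(item.get("id")) > 1)
  .distinct()  // remove duplicates
  .skip(1)     // skip one element
  .limit(1);   // keep only the first remaining element
target.forEach(System.out::println); // prints: {id=3}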
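
Since distinct relies on equals (and, in practice, hashCode), it only removes duplicates of your own classes if those methods are overridden. The following is a minimal JDK-only sketch of the same pipeline as above; the User class and the pick method are hypothetical names introduced just for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

final class FilterOpsDemo {

    // Hypothetical value class: two users are equal when their ids are equal.
    static final class User {
        final int id;

        User(int id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof User && ((User) o).id == id;
        }

        @Override
        public int hashCode() {
            return Objects.hash(id);
        }

        @Override
        public String toString() {
            return "User(" + id + ")";
        }
    }

    static List<User> pick(List<User> users) {
        return users.stream()
                .filter(u -> u.id > 1) // keeps ids 2, 2, 3, 4
                .distinct()            // equals/hashCode remove the second id 2
                .skip(1)               // drops id 2
                .limit(1)              // keeps only id 3
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(
                new User(1), new User(2), new User(2), new User(3), new User(4));
        System.out.println(pick(users)); // [User(3)]
    }
}
```

Without the equals/hashCode overrides, the two User(2) instances would count as different elements and the pipeline would return the second User(2) instead.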

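Mapping

map

map takes a Function<T, R> and maps each element of the stream to another element.

The example below takes each map in source and builds a sentence from its id and age properties as the new element: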

List<Map<String, Object>> source = ImmutableList.of(
  ImmutableMap.of("id", 1, "age", 10),
  ImmutableMap.of("id", 2, "age", 12),
  ImmutableMap.of("id", 3, "age", 16)
);
Stream<String> target = source.stream()
  .map(item -> "Guest No. " + item.get("id") + ", age: " + item.get("age"));
target.forEach(System.out::println);

Output:

Guest No. 1, age: 10
Guest No. 2, age: 12
Guest No. 3, age: 16

flatMap

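map converts a single object into another single object, whereas flatMap converts a single object into several objects. These are returned as a stream, and all the resulting streams are finally merged into one stream.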

List<List<String>> source = ImmutableList.of(
  ImmutableList.of("A_B"),
  ImmutableList.of("C_D")
);

Stream<String> target = source.stream().flatMap(item -> {
  String data = item.get(0);
  // Map the data to an array, i.e. one-to-many
  String[] parts = data.split("_");
  // As the generic signature shows, the result must be returned as a stream
  return Arrays.stream(parts);
});
target.forEach(System.out::println); // prints A, B, C, D in turn
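
To make the difference concrete, here is a small JDK-only sketch that runs the same split with map and with flatMap. FlatMapDemo, splitWithMap and splitWithFlatMap are names invented for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class FlatMapDemo {

    // map: one element in, one element out, so the result is a list of arrays.
    static List<String[]> splitWithMap(List<String> lines) {
        return lines.stream()
                .map(line -> line.split("_"))
                .collect(Collectors.toList());
    }

    // flatMap: one element in, a stream out; all the streams are merged into one.
    static List<String> splitWithFlatMap(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split("_")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("A_B", "C_D");
        System.out.println(splitWithMap(lines).size()); // 2 (two arrays)
        System.out.println(splitWithFlatMap(lines));    // [A, B, C, D]
    }
}
```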

Sorting

sorted

Sorts the elements of the stream. By default the elements' natural ordering is used; a Comparator can also be passed in to define the ordering.

List<Integer> source = ImmutableList.of(
  10, 1, 5, 3
);
source.stream()
  .sorted()
  .forEach(System.out::println); // prints: 1, 3, 5, 10

source.stream()
  // Comparator.reverseOrder() avoids the overflow risk of (o1, o2) -> o2 - o1
  .sorted(Comparator.reverseOrder())
  .forEach(System.out::println); // prints: 10, 5, 3, 1
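
For anything beyond natural or reverse order, the Comparator factory methods are usually easier to read than a hand-written lambda. A small sketch under that assumption; SortedDemo and its method names are illustrative only.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class SortedDemo {

    // Sort by length first, then alphabetically among words of equal length.
    static List<String> byLengthThenAlphabet(List<String> words) {
        return words.stream()
                .sorted(Comparator.comparingInt(String::length)
                        .thenComparing(Comparator.<String>naturalOrder()))
                .collect(Collectors.toList());
    }

    // Descending order without subtraction, which can overflow for large ints.
    static List<Integer> descending(List<Integer> numbers) {
        return numbers.stream()
                .sorted(Comparator.reverseOrder())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(byLengthThenAlphabet(
                Arrays.asList("pear", "fig", "apple", "kiwi"))); // [fig, kiwi, pear, apple]
        System.out.println(descending(Arrays.asList(10, 1, 5, 3))); // [10, 5, 3, 1]
    }
}
```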

Consuming

peek

This method exists mainly for debugging. It takes a Consumer, and the example in its Javadoc is:

Stream.of("one", "two", "three", "four")
  .filter(e -> e.length() > 3)
  .peek(e -> System.out.println("Filtered value: " + e))
  .map(String::toUpperCase)
  .peek(e -> System.out.println("Mapped value: " + e))
  .collect(Collectors.toList());

It is rarely needed in practice; problems can usually be debugged without it.
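
One thing worth knowing if you do use peek: like every intermediate operation it is lazy, so nothing is "peeked" until a terminal operation runs. The sketch below records the peek calls in a list instead of printing them; PeekDemo and trace are hypothetical names, and writing to an outside list from peek is only acceptable here because the stream is sequential.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class PeekDemo {

    // Builds the Javadoc pipeline and optionally runs a terminal operation on it.
    static List<String> trace(boolean runTerminalOp) {
        List<String> log = new ArrayList<>();
        Stream<String> pipeline = Stream.of("one", "two", "three", "four")
                .filter(e -> e.length() > 3)
                .peek(e -> log.add("filtered " + e))
                .map(String::toUpperCase)
                .peek(e -> log.add("mapped " + e));
        if (runTerminalOp) {
            pipeline.collect(Collectors.toList());
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(trace(false)); // []
        // Each element travels through the whole pipeline before the next one starts.
        System.out.println(trace(true));  // [filtered three, mapped THREE, filtered four, mapped FOUR]
    }
}
```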

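Terminal operations

Finding and matching

allMatch: checks whether every element of the stream matches the predicate

anyMatch: checks whether at least one element matches the predicate

noneMatch: checks whether no element matches the predicate

findFirst: returns the first element

findAny: returns any one element

max/min/count: return the largest element, the smallest element, and the number of elements in the stream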

List<Integer> source = ImmutableList.of(
  10, 1, 5, 3
);
// Are all elements greater than 0?
System.out.println(source.stream().allMatch(s -> s > 0));
// Is there an element greater than 9?
System.out.println(source.stream().anyMatch(s -> s > 9));
// Is there no element greater than 10?
System.out.println(source.stream().noneMatch(s -> s > 10));
// Return the first element, or 0 if there is none
System.out.println(source.stream().findFirst().orElse(0));
// Return any element, or 0 if there is none
System.out.println(source.stream().findAny().orElse(0));
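
max, min and count are listed above but not shown, so here is a short sketch of them on the same numbers. max and min need a Comparator and return an Optional because the stream may be empty. MinMaxCountDemo and its method names are made up for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

final class MinMaxCountDemo {

    static Optional<Integer> largest(List<Integer> numbers) {
        return numbers.stream().max(Integer::compare);
    }

    static Optional<Integer> smallest(List<Integer> numbers) {
        return numbers.stream().min(Integer::compare);
    }

    static long countGreaterThan(List<Integer> numbers, int threshold) {
        return numbers.stream().filter(n -> n > threshold).count();
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(10, 1, 5, 3);
        System.out.println(largest(source));                  // Optional[10]
        System.out.println(smallest(source));                 // Optional[1]
        System.out.println(countGreaterThan(source, 3));      // 2
        System.out.println(largest(Collections.emptyList())); // Optional.empty
    }
}
```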

reduce

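This method aggregates the elements of a stream into a single value. It has three overloads:

reduce(BinaryOperator<T> accumulator)
reduce(T identity, BinaryOperator<T> accumulator)
reduce(U identity, BiFunction<U, ? super T, U> accumulator, BinaryOperator<U> combiner)

Start with the first overload.
When the only argument is a BinaryOperator<T>, it defines how the elements are aggregated, i.e. how all elements of the stream are combined into one. Note that BinaryOperator<T> extends BiFunction<T,T,T>, so the generics show that both inputs and the output have the same type.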

// Wealth held at different ages
List<Map<String, Object>> source = ImmutableList.of(
  ImmutableMap.of("age", 10, "money", "21.2"),
  ImmutableMap.of("age", 20, "money", "422.14"),
  ImmutableMap.of("age", 30, "money", "3312.16")
);
// Compute the sum of the ages and the sum of the money
Map<String, Object> res = source.stream().reduce((firstMap, secondMap) -> {
  Map<String, Object> tempRes = new HashMap<>();
  tempRes.put("age", Integer.parseInt(firstMap.get("age").toString())
              + Integer.parseInt(secondMap.get("age").toString()));
  tempRes.put("money", Double.parseDouble(firstMap.get("money").toString())
              + Double.parseDouble(secondMap.get("money").toString()));
  // The stream elements are maps, so the aggregate can only be a map as well
  return tempRes;
}).orElseGet(HashMap::new);
System.out.println(JSONUtil.toJsonPrettyStr(res));

// Output
{
    "money": 3755.5,
    "age": 60
}

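How should the two parameters of the BinaryOperator<T> be understood? The first is the result accumulated so far (on the first call, the first element of the stream), and the second is the next element of the stream.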
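
A simple way to see the two parameters in action is to record every call of the accumulator. The sketch below does this for a sum over a sequential stream; ReduceTraceDemo and trace are names invented for this illustration, and recording calls in an outside list is only meant for demonstration on a sequential stream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ReduceTraceDemo {

    // Sums the numbers and records the arguments of every accumulator call.
    static List<String> trace(List<Integer> numbers) {
        List<String> calls = new ArrayList<>();
        numbers.stream().reduce((acc, next) -> {
            calls.add("acc=" + acc + " next=" + next);
            return acc + next;
        });
        return calls;
    }

    public static void main(String[] args) {
        // Prints, one per line:
        // acc=1 next=2
        // acc=3 next=3
        // acc=6 next=4
        trace(Arrays.asList(1, 2, 3, 4)).forEach(System.out::println);
    }
}
```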

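It is easy to see that the operation folds the elements of the stream together. With this overload, however, the starting value is the first element of the stream. Can the starting value be customized?

That is what the second overload is for: its first parameter specifies the initial value of the aggregation.

Continuing the previous example, suppose everyone is born with 100 yuan (not even 100 for me?):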

List<Map<String, Object>> source = ImmutableList.of(
  ImmutableMap.of("age", 10, "money", "21.2"),
  ImmutableMap.of("age", 20, "money", "422.14"),
  ImmutableMap.of("age", 30, "money", "3312.16")
);
// Compute the sum of the ages and the sum of the money
Map<String, Object> res = source.stream().reduce(ImmutableMap.of("age", 0, "money", "100"),
  (firstMap, secondMap) -> {
    Map<String, Object> tempRes = new HashMap<>();
    tempRes.put("age", Integer.parseInt(firstMap.get("age").toString())
        + Integer.parseInt(secondMap.get("age").toString()));
    tempRes.put("money", Double.parseDouble(firstMap.get("money").toString())
        + Double.parseDouble(secondMap.get("money").toString()));
    // The stream elements are maps, so the aggregate can only be a map as well
    return tempRes;
  });
System.out.println(JSONUtil.toJsonPrettyStr(res));

// Output
{
    "money": 3855.5,
    "age": 60
}

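Note that the first overload has no initial value, so it returns an Optional. The second overload has a fixed initial value, so the result can never be absent and does not need to be wrapped in an Optional.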
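
The difference in return types is easiest to see on an empty stream. A minimal sketch, with ReduceEmptyDemo and its method names chosen only for illustration:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

final class ReduceEmptyDemo {

    // First overload: no initial value, so an empty stream yields Optional.empty().
    static Optional<Integer> sumWithoutInitialValue(List<Integer> numbers) {
        return numbers.stream().reduce(Integer::sum);
    }

    // Second overload: an empty stream simply yields the initial value.
    static int sumWithInitialValue(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> empty = Collections.emptyList();
        System.out.println(sumWithoutInitialValue(empty));                  // Optional.empty
        System.out.println(sumWithInitialValue(empty));                     // 0
        System.out.println(sumWithoutInitialValue(Arrays.asList(1, 2, 3))); // Optional[6]
        System.out.println(sumWithInitialValue(Arrays.asList(1, 2, 3)));    // 6
    }
}
```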

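So now both the initial value and the aggregation logic can be defined. What is still missing?

A small shortcoming is that both overloads return the same type as the elements of the source stream. The result type cannot be customized, as the generic parameter of BinaryOperator shows.

This is where the third overload comes in. It allows a custom result type and also lets you define how the partial results of multiple threads are combined in a parallel stream.

The following returns a Pair: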

List<Map<String, Object>> source = ImmutableList.of(
  ImmutableMap.of("age", 10, "money", "21.2"),
  ImmutableMap.of("age", 20, "money", "422.14"),
  ImmutableMap.of("age", 30, "money", "3312.16")
);
// Compute the sum of the ages and the sum of the money
// (getLeft/getRight suggest that Pair is org.apache.commons.lang3.tuple.Pair)
Pair<Integer, Double> reduce = source.stream().reduce(Pair.of(0, 100d),
  (firstPair, secondMap) -> {
    int left = firstPair.getLeft()
        + Integer.parseInt(secondMap.get("age").toString());
    double right = firstPair.getRight()
        + Double.parseDouble(secondMap.get("money").toString());
    return Pair.of(left, right);
  }, (o, n) -> o);
System.out.println(JSONUtil.toJsonPrettyStr(reduce));

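The (o, n) -> o combiner here is just a placeholder: in a sequential stream it is never called, so only the first two arguments matter, namely the initial value (which also fixes the result type) and the aggregation logic. In a parallel stream this placeholder would discard partial results, so a real combiner is needed there.

Pitfalls:

With a parallel stream and either of the first two overloads, the result may differ from the examples above. The examples assume that each step depends on the result of the previous step, which cannot run in parallel as is. A parallel stream instead reduces segments of the data separately and then combines the partial results, so the answer only stays the same if the accumulator is associative and the initial value is a true identity. An initial value such as 100 may be counted once per segment.
With a parallel stream and the third overload, the result may also differ from what is expected unless the combiner correctly merges the partial results produced by the different threads.

I have no real experience with parallel streams in day-to-day development yet, so this remains to be studied.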
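
As a starting point for that study, the sketch below contrasts a reduce whose initial value is not a true identity with one that follows the rules. It is only an illustration under simple assumptions (integer sums over 1..1000); ParallelReduceDemo and its methods are hypothetical names. The parallel result of sumFrom100 depends on how the runtime splits the data, so no fixed value is shown for it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

final class ParallelReduceDemo {

    static List<Integer> oneToThousand() {
        return IntStream.rangeClosed(1, 1000).boxed().collect(Collectors.toList());
    }

    // 100 is not an identity for addition: a parallel run may add it once per segment.
    static int sumFrom100(List<Integer> numbers, boolean parallel) {
        Stream<Integer> stream = parallel ? numbers.parallelStream() : numbers.stream();
        return stream.reduce(100, Integer::sum);
    }

    // Third overload done properly: 0 is a true identity, and the combiner
    // (Integer::sum) merges the partial sums of the segments.
    static int totalLength(List<String> words, boolean parallel) {
        Stream<String> stream = parallel ? words.parallelStream() : words.stream();
        return stream.reduce(0, (acc, word) -> acc + word.length(), Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> numbers = oneToThousand();
        System.out.println(sumFrom100(numbers, false)); // 500600
        System.out.println(sumFrom100(numbers, true));  // at least 500600, often more

        List<String> words = Arrays.asList("java", "c++", "c");
        System.out.println(totalLength(words, false));  // 8
        System.out.println(totalLength(words, true));   // 8
    }
}
```

If a starting amount like 100 is needed, a safer pattern is to reduce with a real identity and add the amount once afterwards.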

collect

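collect is a very useful operation: it gathers the elements of a stream into another form that is convenient to work with. The method takes a Collector. Implementing one by hand is fairly tedious, so the Collectors utility class is normally used to build the desired result:

Factory method	Description
Collectors.toList()	Collects the stream into a List
Collectors.toSet()	Collects the stream into a Set
Collectors.toCollection()	Collects into a specified collection type, such as LinkedList
Collectors.toMap()	Collects into a Map
Collectors.collectingAndThen()	Applies one more operation to the collected result
Collectors.joining()	Concatenates the elements of the stream, for example separated by commas
Collectors.counting()	Counts the elements
Collectors.summarizingDouble/Long/Int()	Produces summary statistics for the elements, returned as a statistics object
Collectors.averagingDouble/Long/Int()	Averages the elements of the stream
Collectors.maxBy()/minBy()	Returns the largest/smallest element according to the given Comparator
Collectors.groupingBy()	Groups by one or more properties
Collectors.partitioningBy()	Splits the elements into two groups (true/false) by the given condition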
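
Several collectors in the table (joining, counting, averagingInt, maxBy) get no section of their own in these notes, so here is a brief sketch of them. SimpleCollectorsDemo, WORDS and the method names are made up for this example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

final class SimpleCollectorsDemo {

    static final List<String> WORDS = Arrays.asList("java", "c++", "c");

    // joining: concatenate the elements with a separator.
    static String joined() {
        return WORDS.stream().collect(Collectors.joining(", "));
    }

    // counting: number of elements, as a Long.
    static long count() {
        return WORDS.stream().collect(Collectors.counting());
    }

    // averagingInt: average of an int-valued property, as a Double.
    static double averageLength() {
        return WORDS.stream().collect(Collectors.averagingInt(String::length));
    }

    // maxBy: largest element according to a Comparator, wrapped in an Optional.
    static Optional<String> longest() {
        return WORDS.stream()
                .collect(Collectors.maxBy(Comparator.comparingInt(String::length)));
    }

    public static void main(String[] args) {
        System.out.println(joined());        // java, c++, c
        System.out.println(count());         // 3
        System.out.println(averageLength()); // (4 + 3 + 1) / 3, about 2.67
        System.out.println(longest());       // Optional[java]
    }
}
```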

The sections below show some of them in use:

toList/toSet/toCollection

Collect the elements into a collection

List<Map<String, Object>> source = ImmutableList.of(
  ImmutableMap.of("name", "小明", "grade", "a1", "sex", "1"),
  ImmutableMap.of("name", "小紅", "grade", "a2", "sex", "2"),
  ImmutableMap.of("name", "小白", "grade", "a1", "sex", "1"),
  ImmutableMap.of("name", "小黑", "grade", "a3", "sex", "1"),
  ImmutableMap.of("name", "小黃", "grade", "a4", "sex", "2")
);
// toSet works the same way; for toCollection, pass a supplier of the target collection
List<String> toList = source.stream()
  .map(map -> map.get("name").toString())
  .collect(Collectors.toList());
System.out.println(toList); // [小明, 小紅, 小白, 小黑, 小黃]
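
For completeness, a short sketch of toSet and toCollection, which the comment above only mentions. ToCollectionDemo and its methods are illustrative names; TreeSet and LinkedList are just two possible targets for toCollection.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

final class ToCollectionDemo {

    // toSet: duplicates disappear; the iteration order is not specified.
    static Set<String> toSet(List<String> items) {
        return items.stream().collect(Collectors.toSet());
    }

    // toCollection: the supplier decides the concrete collection type.
    static LinkedList<String> toLinkedList(List<String> items) {
        return items.stream().collect(Collectors.toCollection(LinkedList::new));
    }

    static TreeSet<String> toTreeSet(List<String> items) {
        return items.stream().collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("b", "a", "b", "c");
        System.out.println(toSet(items).size()); // 3
        System.out.println(toLinkedList(items)); // [b, a, b, c]
        System.out.println(toTreeSet(items));    // [a, b, c]
    }
}
```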

collectingAndThen

Perform one extra operation after collecting into a collection

List<String> andThen = source.stream()
  .map(map -> map.get("name").toString())
  .collect(Collectors.collectingAndThen(Collectors.toList(), x -> {
    // Reverse the list
    Collections.reverse(x);
    return x;
  }));
// A function that reverses the list runs after toList, so the result is the reverse of the previous example
System.out.println(andThen); // [小黃, 小黑, 小白, 小紅, 小明]
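
Another common use of collectingAndThen, shown here only as a sketch, is to wrap the collected list so that callers cannot modify it, or to convert the result type. CollectingAndThenDemo and its method names are invented for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

final class CollectingAndThenDemo {

    // Collect into a list, then wrap it in an unmodifiable view.
    static List<String> unmodifiableNames(List<String> names) {
        return names.stream().collect(
                Collectors.collectingAndThen(Collectors.toList(), Collections::unmodifiableList));
    }

    // Count the elements (a Long), then convert the count to an int.
    static int countAsInt(List<String> names) {
        return names.stream().collect(
                Collectors.collectingAndThen(Collectors.counting(), Long::intValue));
    }

    public static void main(String[] args) {
        List<String> names = unmodifiableNames(Arrays.asList("a", "b"));
        System.out.println(names); // [a, b]
        try {
            names.add("c");
        } catch (UnsupportedOperationException e) {
            System.out.println("read-only"); // read-only
        }
        System.out.println(countAsInt(Arrays.asList("a", "b"))); // 2
    }
}
```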

toMap

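Collects into a Map. Always pass the third argument, which says what to do when keys collide; without it, a duplicate key makes toMap throw an IllegalStateException.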

Map<String, String> toMap = source.stream()
  .collect(Collectors.toMap(
    // Use the grade field as the key
    item -> item.get("grade").toString(),
    // Use the name as the value
    item -> item.get("name").toString(),
    // Decides how to combine the old and new values when a key repeats
    (oldV, newV) -> oldV + "_" + newV));
System.out.println(toMap); // {a1=小明_小白, a2=小紅, a3=小黑, a4=小黃}
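
The sketch below shows what happens without the merge function, and also the four-argument form of toMap, which additionally lets you choose the Map implementation. ToMapDemo, ROWS and the romanized names are hypothetical stand-ins for the data above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class ToMapDemo {

    // Each row is {grade, name}; the grade "a1" appears twice.
    static final List<String[]> ROWS = Arrays.asList(
            new String[]{"a1", "Ming"},
            new String[]{"a2", "Hong"},
            new String[]{"a1", "Bai"});

    // Two-argument toMap: a duplicate key ends in an IllegalStateException.
    static boolean failsWithoutMerge() {
        try {
            Map<String, String> ignored = ROWS.stream()
                    .collect(Collectors.toMap(row -> row[0], row -> row[1]));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Four-argument toMap: merge function plus a supplier for the Map type.
    static Map<String, String> mergedInEncounterOrder() {
        return ROWS.stream().collect(Collectors.toMap(
                row -> row[0],
                row -> row[1],
                (oldV, newV) -> oldV + "_" + newV,
                LinkedHashMap::new));
    }

    public static void main(String[] args) {
        System.out.println(failsWithoutMerge());      // true
        System.out.println(mergedInEncounterOrder()); // {a1=Ming_Bai, a2=Hong}
    }
}
```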

summarizingDouble/Long/Int

Produces summary statistics for primitive values; rarely used

// Gather some statistics on the sex field of the collection
IntSummaryStatistics statistics = source.stream()
  .collect(Collectors.summarizingInt(map -> Integer.parseInt(map.get("sex").toString())));
System.out.println("Average: " + statistics.getAverage()); // 1.4
System.out.println("Max: " + statistics.getMax()); // 2
System.out.println("Min: " + statistics.getMin()); // 1
System.out.println("Count: " + statistics.getCount()); // 5
System.out.println("Sum: " + statistics.getSum()); // 7

groupingBy

Grouping is one of the more frequently used collectors. It has three overloads:

groupingBy(Function<? super T, ? extends K> classifier)
groupingBy(Function<? super T, ? extends K> classifier, Collector<? super T, A, D> downstream)
groupingBy(Function<? super T, ? extends K> classifier, Supplier<M> mapFactory, Collector<? super T, A, D> downstream)

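A look at the source shows that the first two are just special cases of the third.

Start with the first overload:

Function classifier: the classifier, a Function. It specifies the value to group by, and the value it returns becomes the key in the resulting Map.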

List<Map<String, Object>> source = ImmutableList.of(
  ImmutableMap.of("name", "小明", "grade", "a1", "sex", "1"),
  ImmutableMap.of("name", "小紅", "grade", "a2", "sex", "2"),
  ImmutableMap.of("name", "小白", "grade", "a1", "sex", "1"),
  ImmutableMap.of("name", "小黑", "grade", "a3", "sex", "1"),
  ImmutableMap.of("name", "小黃", "grade", "a4", "sex", "2")
);

Map<String, List<Map<String, Object>>> grade = source.stream()
  .collect(Collectors.groupingBy(item -> item.get("grade") + "-" + item.get("sex")));
System.out.println(JSONUtil.toJsonPrettyStr(grade));

Grouping by the combined grade + sex field gives the following result (partly omitted):

{
    "a4-2": [
        {
            "sex": "2",
            "grade": "a4",
            "name": "小黃"
        }
    ],
    "a1-1": [
        {
            "sex": "1",
            "grade": "a1",
            "name": "小明"
        },
        {
            "sex": "1",
            "grade": "a1",
            "name": "小白"
        }
    ],
   ...
}

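The key is the grade + sex value chosen in the Function, and the value is a list of the matching elements from the source data.
The form of the key can be controlled, so how about the form of the value? That is what the second overload is for.

Collector downstream: this parameter controls what form the values take after grouping. It is a Collector, so the methods of the Collectors utility class can be used to shape the result. This also makes multi-level grouping possible.

The source of the first overload shows that when the second argument is omitted, Collectors.toList() is used by default, which is why the values of the returned Map are lists. Any other form can be specified instead.

For instance, in the previous example it may be enough to return how many elements each group has, without the elements themselves: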

List<Map<String, Object>> source = ImmutableList.of(
  ImmutableMap.of("name", "小明", "grade", "a1", "sex", "1"),
  ImmutableMap.of("name", "小紅", "grade", "a2", "sex", "2"),
  ImmutableMap.of("name", "小白", "grade", "a1", "sex", "1"),
  ImmutableMap.of("name", "小黑", "grade", "a3", "sex", "1"),
  ImmutableMap.of("name", "小黃", "grade", "a4", "sex", "2")
);

Map<String, Long> collect = source.stream()
  .collect(Collectors.groupingBy(item -> item.get("grade") + "-" + item.get("sex"),
                                 Collectors.counting()));
System.out.println(JSONUtil.toJsonPrettyStr(collect));

The values of the returned Map are now element counts:

{
    "a4-2": 1,
    "a1-1": 2,
    "a3-1": 1,
    "a2-2": 1
}
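
The multi-level grouping mentioned above can be sketched by passing another groupingBy as the downstream collector, here combined with Collectors.mapping so that only the names are kept. The Student class, its accessors and the romanized sample data are hypothetical and only mirror the maps used in these notes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class GroupingDownstreamDemo {

    // Hypothetical record-like class standing in for the maps used above.
    static final class Student {
        private final String name;
        private final String grade;
        private final String sex;

        Student(String name, String grade, String sex) {
            this.name = name;
            this.grade = grade;
            this.sex = sex;
        }

        String name() { return name; }
        String grade() { return grade; }
        String sex() { return sex; }
    }

    static List<Student> sample() {
        return Arrays.asList(
                new Student("Ming", "a1", "1"),
                new Student("Hong", "a2", "2"),
                new Student("Bai", "a1", "1"),
                new Student("Hei", "a3", "1"));
    }

    // Group by grade, then by sex, and keep only the names in each inner group.
    static Map<String, Map<String, List<String>>> namesByGradeThenSex(List<Student> students) {
        return students.stream().collect(
                Collectors.groupingBy(Student::grade,
                        Collectors.groupingBy(Student::sex,
                                Collectors.mapping(Student::name, Collectors.toList()))));
    }

    public static void main(String[] args) {
        Map<String, Map<String, List<String>>> result = namesByGradeThenSex(sample());
        System.out.println(result.get("a1").get("1")); // [Ming, Bai]
        System.out.println(result.get("a2").get("2")); // [Hong]
        System.out.println(result.keySet().size());    // 3
    }
}
```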

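Both the keys and the values of the resulting Map can now be shaped freely. But which Map implementation is actually returned? That is what the third overload controls.

Supplier mapFactory: as the parameter name suggests, this is a factory that creates the map. It is a Supplier, so it simply specifies the Map type to return.

If this parameter is not used, the JDK implementation returns a HashMap by default, which is easy to see in the method's source.

Continuing the previous example, this returns a LinkedHashMap: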

List<Map<String, Object>> source = ImmutableList.of(
  ImmutableMap.of("name", "小明", "grade", "a1", "sex", "1"),
  ImmutableMap.of("name", "小紅", "grade", "a2", "sex", "2"),
  ImmutableMap.of("name", "小白", "grade", "a1", "sex", "1"),
  ImmutableMap.of("name", "小黑", "grade", "a3", "sex", "1"),
  ImmutableMap.of("name", "小黃", "grade", "a4", "sex", "2")
);

Map<String, Long> collect = source.stream()
  .collect(Collectors.groupingBy(item -> item.get("grade") + "-" + item.get("sex"),
                                 LinkedHashMap::new,
                                 Collectors.counting()));
System.out.println(JSONUtil.toJsonPrettyStr(collect));
System.out.println(collect instanceof LinkedHashMap);

Output:

{
    "a4-2": 1,
    "a1-1": 2,
    "a2-2": 1,
    "a3-1": 1
}
true
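
The map factory is also handy when the keys should come back sorted. A minimal sketch using TreeMap, printed directly with toString rather than through a JSON utility; TreeMapGroupingDemo and countByLength are names made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class TreeMapGroupingDemo {

    // Count words per length; TreeMap keeps the keys in ascending order.
    static TreeMap<Integer, Long> countByLength(List<String> words) {
        return words.stream().collect(
                Collectors.groupingBy(String::length, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        TreeMap<Integer, Long> counts =
                countByLength(Arrays.asList("pear", "fig", "kiwi", "apple"));
        System.out.println(counts);            // {3=1, 4=2, 5=1}
        System.out.println(counts.firstKey()); // 3
    }
}
```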

partitioningBy

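Partitioning is a special case of grouping. The keys of the resulting Map are fixed to true and false: the data is split into two classes according to the Predicate passed in. It works analogously to grouping.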
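
Since the notes give no example for partitioningBy, here is a small sketch. Like groupingBy, it also accepts a downstream collector. PartitioningDemo and its method names are illustrative only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class PartitioningDemo {

    // Split the numbers into even (true) and odd (false).
    static Map<Boolean, List<Integer>> evenAndOdd(List<Integer> numbers) {
        return numbers.stream().collect(Collectors.partitioningBy(n -> n % 2 == 0));
    }

    // Same split, but with a downstream collector that counts each side.
    static Map<Boolean, Long> countEvenAndOdd(List<Integer> numbers) {
        return numbers.stream().collect(
                Collectors.partitioningBy(n -> n % 2 == 0, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(10, 1, 5, 3);
        Map<Boolean, List<Integer>> parts = evenAndOdd(source);
        System.out.println(parts.get(true));  // [10]
        System.out.println(parts.get(false)); // [1, 5, 3]

        Map<Boolean, Long> counts = countEvenAndOdd(source);
        System.out.println(counts.get(true));  // 1
        System.out.println(counts.get(false)); // 3
    }
}
```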
