
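Collection, Collections, collect, Collector, Collectors

Collection is the root interface of the Java collection hierarchy.
Collections is a utility class in the java.util package that holds static methods for working with collections.
java.util.stream.Stream#collect(java.util.stream.Collector<? super T,A,R>) is a Stream method that collects the elements of the stream.
java.util.stream.Collector is the interface for a collecting function; it declares what a collector does.
java.util.stream.Collectors is the utility class for collectors, with a series of built-in collector implementations.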
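To keep the five names apart, here is a minimal sketch that touches each of them once. The class name FiveNamesDemo, the method upperCaseNames and the sample strings are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class FiveNamesDemo {
    // Collection: the root interface, so any List or Set can be passed in.
    static List<String> upperCaseNames(Collection<String> names) {
        // Collectors is the factory class; Collector is the interface its methods return.
        Collector<String, ?, List<String>> toList = Collectors.toList();
        // collect is the terminal Stream operation that runs the Collector.
        List<String> result = names.stream().map(String::toUpperCase).collect(toList);
        // Collections offers static helpers for collections that already exist.
        return Collections.unmodifiableList(result);
    }

    public static void main(String[] args) {
        System.out.println(upperCaseNames(Arrays.asList("pork", "beef", "chicken")));
    }
}
```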

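What collectors do

You can think of Java 8 streams as fancy, lazy iterators over a data set. They support two kinds of operations: intermediate operations (e.g. filter, map) and terminal operations (e.g. count, findFirst, forEach, reduce). Intermediate operations can be chained, turning one stream into another. They do not consume the stream; their purpose is to build a pipeline. Terminal operations, by contrast, consume the stream and produce a final result. collect is a reduction operation: like reduce, it accepts different strategies as its argument and accumulates the elements of the stream into a summary result. The strategy itself is defined by an implementation of the Collector interface.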
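The laziness described above can be observed directly. The sketch below records which elements the filter actually looked at; the class LazyPipelineDemo and its visited list are hypothetical helpers, not part of the original article.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyPipelineDemo {
    // Records every element the filter has inspected so far.
    static final List<Integer> visited = new ArrayList<>();

    static Stream<Integer> buildPipeline(List<Integer> source) {
        // Intermediate operations only describe the pipeline; nothing runs yet.
        return source.stream()
                .filter(n -> { visited.add(n); return n % 2 == 0; })
                .map(n -> n * 10);
    }

    public static void main(String[] args) {
        Stream<Integer> pipeline = buildPipeline(Arrays.asList(1, 2, 3, 4));
        System.out.println("visited before the terminal operation: " + visited);
        // The terminal operation consumes the stream and produces the result.
        List<Integer> result = pipeline.collect(Collectors.toList());
        System.out.println("visited after the terminal operation: " + visited);
        System.out.println(result);
    }
}
```

Before collect is called the visited list is still empty; only the terminal operation pulls the elements through filter and map.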

Predefined collectors

The following briefly demonstrates the basic built-in collectors. The sample data source is:

	// Lists.newArrayList comes from Guava (com.google.common.collect.Lists)
	final ArrayList<Dish> dishes = Lists.newArrayList(
	    new Dish("pork", false, 800, Type.MEAT),
	    new Dish("beef", false, 700, Type.MEAT),
	    new Dish("chicken", false, 400, Type.MEAT),
	    new Dish("french fries", true, 530, Type.OTHER),
	    new Dish("rice", true, 350, Type.OTHER),
	    new Dish("season fruit", true, 120, Type.OTHER),
	    new Dish("pizza", true, 550, Type.OTHER),
	    new Dish("prawns", false, 300, Type.FISH),
	    new Dish("salmon", false, 450, Type.FISH)
	);

Maximum, minimum, average

	// Why is an Optional returned? The stream may be empty, and then there is no maximum; that is when Optional is meaningful
	Optional<Dish> mostCalorieDish = dishes.stream().max(Comparator.comparingInt(Dish::getCalories));
	Optional<Dish> minCalorieDish = dishes.stream().min(Comparator.comparingInt(Dish::getCalories));
	double avgCalories = dishes.stream().collect(Collectors.averagingInt(Dish::getCalories));
	IntSummaryStatistics summaryStatistics = dishes.stream().collect(Collectors.summarizingInt(Dish::getCalories));
	double average = summaryStatistics.getAverage();
	long count = summaryStatistics.getCount();
	int max = summaryStatistics.getMax();
	int min = summaryStatistics.getMin();
	long sum = summaryStatistics.getSum();

All of these simple statistics have built-in collector functions in Collectors. The variants specialized for primitive numeric types in particular avoid boxing, so they cost much less than working on the wrapper types directly.
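The Optional question raised in the comment above is easiest to see with an empty stream. A minimal sketch, using plain Integer calorie values instead of Dish objects; the class EmptyStreamDemo and its method names are assumptions for illustration.

```java
import java.util.Collections;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class EmptyStreamDemo {
    static Optional<Integer> maxOf(List<Integer> calories) {
        // With no elements there is no maximum, so the result is Optional.empty().
        return calories.stream().max(Integer::compare);
    }

    static IntSummaryStatistics statsOf(List<Integer> calories) {
        return calories.stream().collect(Collectors.summarizingInt(Integer::intValue));
    }

    public static void main(String[] args) {
        List<Integer> none = Collections.emptyList();
        System.out.println(maxOf(none).isPresent());   // false
        IntSummaryStatistics stats = statsOf(none);
        System.out.println(stats.getCount());          // 0
        System.out.println(stats.getAverage());        // 0.0
        System.out.println(stats.getMax());            // -2147483648 (Integer.MIN_VALUE)
    }
}
```

Note that the summary statistics do not use Optional: on an empty stream getMax falls back to Integer.MIN_VALUE, so check getCount before trusting it.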

Joining collector

Want to concatenate the elements of a stream?

	// plain concatenation
	String join1 = dishes.stream().map(Dish::getName).collect(Collectors.joining());
	// comma-separated
	String join2 = dishes.stream().map(Dish::getName).collect(Collectors.joining(", "));
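Collectors.joining also has a three-argument overload that takes a delimiter, a prefix and a suffix. A small sketch; the class JoiningDemo and the method bracketed are made-up names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class JoiningDemo {
    static String bracketed(List<String> names) {
        // Arguments: delimiter, prefix, suffix.
        return names.stream().collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        System.out.println(bracketed(Arrays.asList("pork", "beef", "chicken"))); // [pork, beef, chicken]
        System.out.println(bracketed(Collections.<String>emptyList()));          // []
    }
}
```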

toList

	List<String> names = dishes.stream().map(Dish::getName).collect(toList());

This maps the original stream to a stream of a single attribute (the names) and then collects it into a List.

toSet

	Set<Type> types = dishes.stream().map(Dish::getType).collect(Collectors.toSet());

This collects the Type values into a Set, which removes duplicates.

toMap

	Map<Type, Dish> byType = dishes.stream().collect(toMap(Dish::getType, d -> d));

Sometimes you need to turn an array into a Map to use as a cache for repeated lookups. toMap takes the functions that produce the key and the value. (Note: the demo above is a trap, do not use it like this!!! Use toMap(Function, Function, BinaryOperator) instead.)
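The trap and its fix can be reproduced with three dishes. This is a minimal sketch under simplifying assumptions: the Dish class is reduced to the fields needed here, and the class ToMapMergeDemo with its methods is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapMergeDemo {
    enum Type { MEAT, FISH, OTHER }

    // Simplified stand-in for the article's Dish class.
    static final class Dish {
        final String name;
        final int calories;
        final Type type;

        Dish(String name, int calories, Type type) {
            this.name = name;
            this.calories = calories;
            this.type = type;
        }

        int getCalories() { return calories; }
        Type getType() { return type; }
    }

    static final List<Dish> DISHES = Arrays.asList(
            new Dish("pork", 800, Type.MEAT),
            new Dish("beef", 700, Type.MEAT),
            new Dish("prawns", 300, Type.FISH));

    // Two-argument toMap: fails as soon as a key (here MEAT) shows up twice.
    static boolean unsafeThrows() {
        try {
            DISHES.stream().collect(Collectors.toMap(Dish::getType, Function.identity()));
            return false;
        } catch (IllegalStateException duplicateKey) {
            return true;
        }
    }

    // Three-argument toMap: the merge function decides which value survives.
    static Map<Type, Dish> keepMostCaloric() {
        return DISHES.stream().collect(Collectors.toMap(
                Dish::getType,
                Function.identity(),
                (a, b) -> a.getCalories() >= b.getCalories() ? a : b));
    }

    public static void main(String[] args) {
        System.out.println(unsafeThrows());
        System.out.println(keepMostCaloric().get(Type.MEAT).name);
    }
}
```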

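The collectors above are the most commonly used ones, and they cover most needs. For a beginner, though, understanding them takes time. To really see why they manage to collect anything, you have to look at the implementation: all of them are built on java.util.stream.Collectors.CollectorImpl, an implementation class of the Collector interface mentioned at the start. Custom collectors, covered later, show how it is used.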
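CollectorImpl is internal to the JDK, but the public factory Collector.of assembles a collector from the same parts. The sketch below is a hand-written stand-in for toList, not the JDK's actual source; the names HandRolledToListDemo and toListSketch are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

class HandRolledToListDemo {
    static <T> Collector<T, List<T>, List<T>> toListSketch() {
        return Collector.of(
                ArrayList::new,      // supplier: creates the empty container
                List::add,           // accumulator: folds one element into the container
                (left, right) -> {   // combiner: merges two partial containers
                    left.addAll(right);
                    return left;
                });
        // No finisher is given, so the container itself is the result.
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("pork", "beef", "chicken").stream()
                .collect(HandRolledToListDemo.<String>toListSketch());
        System.out.println(names);
    }
}
```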

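Custom reduction with reducing

The collectors above are all special cases of the reduction process defined by the reducing factory method; you can create such collectors with Collectors.reducing directly. For example, summing:

	Integer totalCalories = dishes.stream().collect(reducing(0, Dish::getCalories, (i, j) -> i + j));
	// use a built-in method reference instead of the lambda
	Integer totalCalories2 = dishes.stream().collect(reducing(0, Dish::getCalories, Integer::sum));

Of course you can also use reduce directly:

	Optional<Integer> totalCalories3 = dishes.stream().map(Dish::getCalories).reduce(Integer::sum);

All of these work, but if efficiency matters, choose this one:

	int sum = dishes.stream().mapToInt(Dish::getCalories).sum();

Choose the best solution for the situation

The demos above show that functional programming usually offers several ways to do the same thing. Using a collector with collect is more involved than calling the Stream API directly; the upside is that collect offers a higher level of abstraction and generalization and is easier to reuse and customize.

Our advice is to explore different solutions to the problem at hand and always choose the most specialized one; in terms of both readability and performance, that is usually the best decision.

Besides taking an initial value, reducing can also use the first element as the initial value:

	Optional<Dish> mostCalorieDish = dishes.stream()
	    .collect(reducing((d1, d2) -> d1.getCalories() > d2.getCalories() ? d1 : d2));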

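reducing

reducing is fairly complicated to use; its goal is to merge two values into one.

	public static <T, U>
	  Collector<T, ?, U> reducing(U identity,
	                Function<? super T, ? extends U> mapper,
	                BinaryOperator<U> op)

First, note the three generic positions.

U is the type of the result. In the calorie demo above, U is Integer.

T is the element type of the stream. From the Function you can tell that mapper takes an argument of type T and returns a result of type U. In the demo, T is Dish.

The ? sits in the middle of the type arguments of the returned Collector. It stands for the container type: a collector naturally needs a container to hold its data. The ? means the container type is not exposed. In fact, the container here is a U[].

As for the parameters:

identity is the initial value of the result type; think of it as the starting point of the accumulator.

mapper does what map does: it converts the stream into a stream of the type you want.

op is the core function; it defines how two values are combined. The first is the accumulated value (think of it as sum), the second is the next element to process. That is how the accumulation happens.

reducing also has an overload that omits the identity (and the mapper); it uses the first element of the stream as the initial value.

	public static <T> Collector<T, ?, Optional<T>>
	  reducing(BinaryOperator<T> op)

Look at the return type first: T is both the input type and the result type, i.e. the two are the same. The other difference is the Optional. There is no initial value, and the stream may turn out to be empty, in which case there is nothing to return; that is when returning an Optional makes sense.

Now the parameter list: only a BinaryOperator is left. BinaryOperator is a functional interface that takes two arguments of the same type, computes on them, and returns a value of that same type. You can think of it as 1 > 2 ? 1 : 2, i.e. taking the larger of two numbers. The maximum is just an easy way to picture it; you can write your own lambda to decide what is returned. Here it takes two values of the stream's element type T and returns a T. Thinking of it as a running sum works too.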
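To make the BinaryOperator description concrete, here is a minimal sketch with three operators passed to the one-argument reducing. It works on plain Integer values; the class BinaryOperatorDemo and its constant names are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

class BinaryOperatorDemo {
    // Written by hand: two values of one type in, one value of that type out.
    static final BinaryOperator<Integer> LARGER = (a, b) -> a > b ? a : b;
    // The same idea obtained from the JDK factory method BinaryOperator.maxBy.
    static final BinaryOperator<Integer> LARGER_JDK =
            BinaryOperator.maxBy(Comparator.<Integer>naturalOrder());
    // A running sum is a BinaryOperator as well.
    static final BinaryOperator<Integer> SUM = Integer::sum;

    static Optional<Integer> reduceWith(List<Integer> values, BinaryOperator<Integer> op) {
        // No identity is supplied, so an empty input yields Optional.empty().
        return values.stream().collect(Collectors.reducing(op));
    }

    public static void main(String[] args) {
        List<Integer> calories = Arrays.asList(300, 800, 450);
        System.out.println(reduceWith(calories, LARGER));
        System.out.println(reduceWith(calories, LARGER_JDK));
        System.out.println(reduceWith(calories, SUM));
        System.out.println(reduceWith(Collections.<Integer>emptyList(), SUM));
    }
}
```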

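The demos above suggest that reduce and collect do almost the same thing: both return one final result. For example, we can get the effect of toList with reduce:

	// Hand-rolled toList collector: an abuse of reduce. reduce is meant for immutable reduction, so this must not run in parallel
	List<Integer> calories = dishes.stream().map(Dish::getCalories)
	    .reduce(new ArrayList<Integer>(),
	        (List<Integer> l, Integer e) -> {
	          l.add(e);
	          return l;
	        },
	        (List<Integer> l1, List<Integer> l2) -> {
	          l1.addAll(l2);
	          return l1;
	        }
	    );

Some explanation of the approach above.

	<U> U reduce(U identity,
	         BiFunction<U, ? super T, U> accumulator,
	         BinaryOperator<U> combiner);

U is the return type, here List.

BiFunction<U, ? super T, U> accumulator is the accumulator; it defines how the accumulated value and a single element are combined. Here it operates on the List and an element and returns the List, i.e. it adds one element to the List.

BinaryOperator<U> combiner is the combiner; it merges two values of the return type into one. Here it merges two Lists.

This solution has two problems: a semantic one and a practical one. The semantic problem is that reduce is meant to combine two values into a new value; it is an immutable reduction. collect, by contrast, is designed to mutate a container in order to accumulate the result. So the snippet above abuses reduce, because it modifies in place the List that serves as the accumulator. Using reduce with the wrong semantics also causes a practical problem: the reduction cannot run in parallel, because several threads concurrently modifying the same data structure may corrupt the List. To make it thread-safe you would have to allocate a new List every time, and that object allocation hurts performance. This is why collect is the right way to express a reduction over a mutable container and, more importantly, why it suits parallel execution.

Summary: reduce fits reductions over immutable values, collect fits reductions over mutable containers, and collect is the one that parallelizes.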
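For comparison, the same list building expressed as a mutable reduction uses the three-argument Stream.collect(supplier, accumulator, combiner). A minimal sketch on a range of integers; the class MutableReductionDemo is a made-up name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.IntStream;

class MutableReductionDemo {
    static List<Integer> collectInParallel(int upperBound) {
        return IntStream.rangeClosed(1, upperBound)
                .parallel()
                .boxed()
                // supplier: every partial reduction starts with its own fresh ArrayList
                // accumulator: adds one element to that partial container
                // combiner: appends one partial container to another
                .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
    }

    public static void main(String[] args) {
        List<Integer> numbers = collectInParallel(1000);
        System.out.println(numbers.size());
        System.out.println(numbers.get(0) + " .. " + numbers.get(numbers.size() - 1));
    }
}
```

Because the supplier is called once per partial result, no two threads ever share a container, which is exactly what the reduce version above cannot guarantee.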

Grouping

Databases often need grouped aggregation and provide the GROUP BY primitive for it. In Java, doing this in imperative style (hand-written loops) is tedious and error-prone. Java 8 offers a functional solution.

For example, group the dishes by type. This is similar to the earlier toMap, but the value of each group is a List rather than a single Dish.

	Map<Type, List<Dish>> dishesByType = dishes.stream().collect(groupingBy(Dish::getType));

Here:

	public static <T, K> Collector<T, ?, Map<K, List<T>>>
	  groupingBy(Function<? super T, ? extends K> classifier)

The classifier parameter is a Function, which takes an argument and converts it to another type. The demo above converts each stream element, a Dish, into a Type and then groups the stream by that Type. Internally the grouping is done with a HashMap: groupingBy(classifier, HashMap::new, downstream);
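The three-argument form mentioned above is public, so the HashMap can be swapped for another map. A minimal sketch that uses TreeMap::new to get the keys in sorted order; the simplified Dish class and the name GroupingMapFactoryDemo are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GroupingMapFactoryDemo {
    enum Type { MEAT, FISH, OTHER }

    // Simplified stand-in for the article's Dish class.
    static final class Dish {
        final String name;
        final Type type;

        Dish(String name, Type type) {
            this.name = name;
            this.type = type;
        }

        String getName() { return name; }
        Type getType() { return type; }
    }

    static final List<Dish> DISHES = Arrays.asList(
            new Dish("prawns", Type.FISH),
            new Dish("rice", Type.OTHER),
            new Dish("pork", Type.MEAT),
            new Dish("beef", Type.MEAT));

    static Map<Type, List<String>> namesByTypeSorted() {
        return DISHES.stream().collect(Collectors.groupingBy(
                Dish::getType,                                            // classifier: produces the key
                TreeMap::new,                                             // map factory: replaces the default HashMap
                Collectors.mapping(Dish::getName, Collectors.toList()))); // downstream: produces the value
    }

    public static void main(String[] args) {
        // Enum keys in a TreeMap are ordered by declaration order.
        System.out.println(namesByTypeSorted());
    }
}
```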

Besides grouping by a property accessor of the stream element, you can define your own grouping criterion, for example grouping by calorie range.

Since groupingBy takes a Function whose parameter type is Dish, the classifier can be defined as:

	private CaloricLevel getCaloricLevel(Dish d) {
	  if (d.getCalories() <= 400) {
	   return CaloricLevel.DIET;
	  } else if (d.getCalories() <= 700) {
	   return CaloricLevel.NORMAL;
	  } else {
	   return CaloricLevel.FAT;
	  }
	}

Then pass it in:

	Map<CaloricLevel, List<Dish>> dishesByLevel = dishes.stream()
	    .collect(groupingBy(this::getCaloricLevel));

Multi-level grouping

groupingBy has several other overloads, for example:

	public static <T, K, A, D>
	  Collector<T, ?, Map<K, D>> groupingBy(Function<? super T, ? extends K> classifier,
	                     Collector<? super T, A, D> downstream)

A frightening number of generics. A brief tour: classifier is still the classifier; it takes the stream's element type and returns the value you want to group by, i.e. it provides the keys of the grouping. So T is the element type of the stream and K is the type of the grouping key. The second parameter, downstream, is a Collector. Its element type is T or a supertype of T (? super T), its container type is A, and its reduction result type is D. In other words, the key of each group comes from the classifier, and the value is reduced by the collector passed as the second argument. As it happens, the source of the previous demo is:

	public static <T, K> Collector<T, ?, Map<K, List<T>>>
	  groupingBy(Function<? super T, ? extends K> classifier) {
	    return groupingBy(classifier, toList());
	  }

toList serves as the downstream collector, and what it collects is a List<Dish>, so the value type of the grouping is List<Dish>. By the same reasoning, the value type depends on the downstream collector, and there are countless downstream collectors. For example, I may want to group the values again; grouping is itself a reduction.

	// multi-level grouping
	Map<Type, Map<CaloricLevel, List<Dish>>> byTypeAndCalory = dishes.stream().collect(
	  groupingBy(Dish::getType, groupingBy(this::getCaloricLevel)));
	byTypeAndCalory.forEach((type, byCalory) -> {
	 System.out.println("----------------------------------");
	 System.out.println(type);
	 byCalory.forEach((level, dishList) -> {
	  System.out.println("\t" + level);
	  System.out.println("\t\t" + dishList);
	 });
	});

The verified result:

	----------------------------------
	FISH
	  DIET
	    [Dish(name=prawns, vegetarian=false, calories=300, type=FISH)]
	  NORMAL
	    [Dish(name=salmon, vegetarian=false, calories=450, type=FISH)]
	----------------------------------
	MEAT
	  FAT
	    [Dish(name=pork, vegetarian=false, calories=800, type=MEAT)]
	  DIET
	    [Dish(name=chicken, vegetarian=false, calories=400, type=MEAT)]
	  NORMAL
	    [Dish(name=beef, vegetarian=false, calories=700, type=MEAT)]
	----------------------------------
	OTHER
	  DIET
	    [Dish(name=rice, vegetarian=true, calories=350, type=OTHER), Dish(name=season fruit, vegetarian=true, calories=120, type=OTHER)]
	  NORMAL
	    [Dish(name=french fries, vegetarian=true, calories=530, type=OTHER), Dish(name=pizza, vegetarian=true, calories=550, type=OTHER)]

Summary: the core parameters of groupingBy are a key generator and a value generator. The value generator can be any Collector.

For example, the value generator can count elements, which gives the SQL statement select count(*) from table a group by type:

	Map<Type, Long> typesCount = dishes.stream().collect(groupingBy(Dish::getType, counting()));
	System.out.println(typesCount);
	-----------
	{FISH=2, MEAT=3, OTHER=4}

The SQL for the highest value per group, select max(id) from table a group by type:

	Map<Type, Optional<Dish>> mostCaloricByType = dishes.stream()
	    .collect(groupingBy(Dish::getType, maxBy(Comparator.comparingInt(Dish::getCalories))));

The Optional here is pointless, because the value can never be absent: a group only exists once it contains at least one element. So we might as well unwrap it, using collectingAndThen:

	Map<Type, Dish> mostCaloricByType = dishes.stream()
	  .collect(groupingBy(Dish::getType,
	    collectingAndThen(maxBy(Comparator.comparingInt(Dish::getCalories)), Optional::get)));

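At this point the result seems to be there, but IDEA disagrees and flags the code with a yellow warning. After following its suggestion the code becomes:

	Map<Type, Dish> mostCaloricByType = dishes.stream()
	  .collect(toMap(Dish::getType, Function.identity(),
	    BinaryOperator.maxBy(comparingInt(Dish::getCalories))));

Yes, groupingBy has turned into toMap. The key is still the Type and the value is still the Dish, but there is one more parameter! This answers the trap from the beginning: that toMap demo was written to be easy to understand, and using it like that for real will get you killed. Rebuilding a List as a Map inevitably runs into duplicate keys. When a key repeats, should the value be overwritten or ignored? The earlier demo does neither: when the key already exists, inserting it again throws an exception:

	java.lang.IllegalStateException: Duplicate key Dish(name=pork, vegetarian=false, calories=800, type=MEAT)
	  at java.util.stream.Collectors.lambda$throwingMerger$0(Collectors.java:133)

The right way is to supply a function that resolves the conflict. In this demo the rule is to keep the larger one, which is exactly what grouping by maximum requires. (I really don't feel like studying Java 8 functional style any more; it feels like performance pitfalls everywhere.)

Continuing the SQL mapping, a grouped sum, select sum(score) from table a group by type:

	Map<Type, Integer> totalCaloriesByType = dishes.stream()
	    .collect(groupingBy(Dish::getType, summingInt(Dish::getCalories)));

Another collector that is often used together with groupingBy is the one produced by the mapping method. It takes two arguments: a function that transforms the elements of the stream, and a collector that accumulates the transformed objects. Its purpose is to apply a mapping function to every input element before accumulation, so that a collector accepting elements of one type can be adapted to objects of a different type. Let's look at a practical example. Say you want to know which CaloricLevels appear on the menu for each type of Dish. We can combine the groupingBy and mapping collectors as follows:

	Map<Type, Set<CaloricLevel>> caloricLevelsByType = dishes.stream()
	    .collect(groupingBy(Dish::getType, mapping(this::getCaloricLevel, toSet())));

toSet here ends up with a HashSet, although its Javadoc makes no guarantee about the type; to pick the implementation explicitly, use toCollection(HashSet::new).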
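toCollection accepts any collection constructor, so the same pattern can yield sorted sets. A minimal sketch that collects dish names per type into a TreeSet; the simplified Dish class and the name ToCollectionDemo are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

class ToCollectionDemo {
    enum Type { MEAT, FISH, OTHER }

    // Simplified stand-in for the article's Dish class.
    static final class Dish {
        final String name;
        final Type type;

        Dish(String name, Type type) {
            this.name = name;
            this.type = type;
        }

        String getName() { return name; }
        Type getType() { return type; }
    }

    static final List<Dish> DISHES = Arrays.asList(
            new Dish("rice", Type.OTHER),
            new Dish("pizza", Type.OTHER),
            new Dish("french fries", Type.OTHER),
            new Dish("pork", Type.MEAT),
            new Dish("beef", Type.MEAT));

    static Map<Type, Set<String>> sortedNamesByType() {
        return DISHES.stream().collect(Collectors.groupingBy(
                Dish::getType,
                // TreeSet::new picks the concrete Set; names come out alphabetically sorted.
                Collectors.mapping(Dish::getName, Collectors.toCollection(TreeSet::new))));
    }

    public static void main(String[] args) {
        System.out.println(sortedNamesByType().get(Type.OTHER)); // [french fries, pizza, rice]
    }
}
```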

Partitioning

Partitioning is a special case of grouping: a predicate (a function returning a boolean) serves as the classification function, called the partitioning function. Since it returns a boolean, the key type of the resulting Map is Boolean, so there are at most two groups: true and false. For example, if you are a vegetarian you may want to split the menu into vegetarian and non-vegetarian dishes:

	Map<Boolean, List<Dish>> partitionedMenu = dishes.stream().collect(partitioningBy(Dish::isVegetarian));

Of course, filter achieves the same effect:

	List<Dish> vegetarianDishes = dishes.stream().filter(Dish::isVegetarian).collect(Collectors.toList());

The advantage of partitioning is that it keeps both halves, which is handy when you want to classify a List. And like groupingBy, partitioningBy has an overload that lets you specify the type of the group values.

	Map<Boolean, Map<Type, List<Dish>>> vegetarianDishesByType = dishes.stream()
	  .collect(partitioningBy(Dish::isVegetarian, groupingBy(Dish::getType)));
	Map<Boolean, Integer> vegetarianDishesTotalCalories = dishes.stream()
	  .collect(partitioningBy(Dish::isVegetarian, summingInt(Dish::getCalories)));
	Map<Boolean, Dish> mostCaloricPartitionedByVegetarian = dishes.stream()
	  .collect(partitioningBy(Dish::isVegetarian,
	    collectingAndThen(maxBy(comparingInt(Dish::getCalories)), Optional::get)));
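One consequence of keeping both halves is that the resulting Map contains both keys even when one side has no elements, whereas groupingBy simply omits empty groups. A minimal sketch with a meat-only menu; the simplified Dish class and the name PartitionBothKeysDemo are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitionBothKeysDemo {
    // Simplified stand-in for the article's Dish class.
    static final class Dish {
        final String name;
        final boolean vegetarian;

        Dish(String name, boolean vegetarian) {
            this.name = name;
            this.vegetarian = vegetarian;
        }

        String getName() { return name; }
        boolean isVegetarian() { return vegetarian; }
    }

    static final List<Dish> MEAT_ONLY = Arrays.asList(
            new Dish("pork", false),
            new Dish("beef", false));

    static Map<Boolean, List<String>> partitionNames(List<Dish> dishes) {
        return dishes.stream().collect(Collectors.partitioningBy(
                Dish::isVegetarian,
                Collectors.mapping(Dish::getName, Collectors.toList())));
    }

    public static void main(String[] args) {
        Map<Boolean, List<String>> partitioned = partitionNames(MEAT_ONLY);
        System.out.println(partitioned.get(false));   // the non-vegetarian names
        System.out.println(partitioned.get(true));    // an empty list, not null
    }
}
```

Note that the Optional::get variant in the snippet above relies on both halves being non-empty: on a meat-only menu the true side would have no maximum and Optional::get would throw.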

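As a last example of the partitioningBy collector, let's set the menu data model aside and look at a more complex and more interesting example: partitioning numbers into primes and non-primes.

First, define a prime partitioning function:

	private boolean isPrime(int candidate) {
	  // only divisors up to the square root need to be tested
	  int candidateRoot = (int) Math.sqrt((double) candidate);
	  // guard: without it, 0 and 1 would be reported as prime because the range below is empty
	  return candidate > 1
	      && IntStream.rangeClosed(2, candidateRoot).noneMatch(i -> candidate % i == 0);
	}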
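With that predicate in place, the partitioning itself is one call to partitioningBy. The following is a sketch of how it might be wired up, not necessarily the author's version; the class PrimePartitionDemo and the method partitionPrimes are made-up names, and isPrime is repeated as a static method so the example is self-contained.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class PrimePartitionDemo {
    static boolean isPrime(int candidate) {
        int candidateRoot = (int) Math.sqrt((double) candidate);
        return candidate > 1
                && IntStream.rangeClosed(2, candidateRoot).noneMatch(i -> candidate % i == 0);
    }

    // Splits the numbers 2..n into primes (key true) and non-primes (key false).
    static Map<Boolean, List<Integer>> partitionPrimes(int n) {
        return IntStream.rangeClosed(2, n)
                .boxed()
                .collect(Collectors.partitioningBy(PrimePartitionDemo::isPrime));
    }

    public static void main(String[] args) {
        Map<Boolean, List<Integer>> result = partitionPrimes(10);
        System.out.println(result.get(true));    // [2, 3, 5, 7]
        System.out.println(result.get(false));   // [4, 6, 8, 9, 10]
    }
}
```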