Re: Presto+CarbonData optimization work discussion

Posted by Liang Chen on Jul 20, 2017; 2:34am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Presto-CarbonData-optimization-work-discussion-tp18509p18522.html

Hi

Regarding item 4), lazy decoding of the dictionary: I just tested 18 million rows of data (the Presto output below reports 18M rows) with this query:
"select province,sum(age),count(*) from presto_carbondata group by province order by province"

The Spark integration module has "dictionary lazy decode" and the Presto integration does not. The performance difference is about 4.5 times, so "dictionary lazy decode" might help a lot to improve aggregation performance.
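To make the idea concrete, here is a minimal sketch of eager versus lazy dictionary decoding for a group-by aggregation. It is only an illustration and not the CarbonData, Spark, or Presto implementation: the dictionary contents, the sample rows, and the function names `aggregate_eager` and `aggregate_lazy` are all hypothetical. The point it shows is that the lazy variant aggregates on the integer dictionary codes and decodes once per group instead of once per row.

```python
from collections import defaultdict

# Hypothetical dictionary for the "province" column: surrogate code -> value.
DICTIONARY = {1: "AB", 2: "BC", 3: "MB"}

# Hypothetical dictionary-encoded rows: (province_code, age).
ROWS = [(1, 30), (2, 41), (1, 25), (3, 52), (2, 19)]


def aggregate_eager(rows, dictionary):
    """Decode every row first, then group on the decoded string."""
    groups = defaultdict(lambda: [0, 0])  # province -> [sum(age), count]
    decodes = 0
    for code, age in rows:
        province = dictionary[code]  # one dictionary lookup per row
        decodes += 1
        groups[province][0] += age
        groups[province][1] += 1
    result = sorted((p, s, c) for p, (s, c) in groups.items())
    return result, decodes


def aggregate_lazy(rows, dictionary):
    """Group on the integer codes, then decode only the final groups."""
    groups = defaultdict(lambda: [0, 0])  # code -> [sum(age), count]
    for code, age in rows:
        groups[code][0] += age
        groups[code][1] += 1
    decodes = 0
    decoded = []
    for code, (s, c) in groups.items():
        decoded.append((dictionary[code], s, c))  # one lookup per group
        decodes += 1
    # ORDER BY province is applied to the decoded values.
    return sorted(decoded), decodes
```

Both functions return the same result set, but the eager one performs one lookup per row while the lazy one performs one lookup per distinct province. With 13 distinct provinces and millions of rows, as in the test here, the lookup count differs by several orders of magnitude, which is one plausible reason for the gap between the two engines.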

The detailed test results are below:

1. Presto+CarbonData: 9 seconds
presto:default> select province,sum(age),count(*) from presto_carbondata group by province order by province;
 province |  _col1   |  _col2
----------+----------+---------
 AB       | 57442740 | 1385010
 BC       | 57488826 | 1385580
 MB       | 57564702 | 1386510
 NB       | 57599520 | 1386960
 NL       | 57446592 | 1383774
 NS       | 57448734 | 1384272
 NT       | 57534228 | 1386936
 NU       | 57506844 | 1385346
 ON       | 57484956 | 1384470
 PE       | 57325164 | 1379802
 QC       | 57467886 | 1385076
 SK       | 57385152 | 1382364
 YT       | 57377556 | 1383900
(13 rows)

Query 20170720_022833_00004_c9ky2, FINISHED, 1 node
Splits: 55 total, 55 done (100.00%)
0:09 [18M rows, 34.3MB] [1.92M rows/s, 3.65MB/s]

2. Spark+CarbonData: about 2 seconds
scala> benchmark { carbon.sql("select province,sum(age),count(*) from presto_carbondata group by province order by province").show }
+--------+--------+--------+
|province|sum(age)|count(1)|
+--------+--------+--------+
|      AB|57442740| 1385010|
|      BC|57488826| 1385580|
|      MB|57564702| 1386510|
|      NB|57599520| 1386960|
|      NL|57446592| 1383774|
|      NS|57448734| 1384272|
|      NT|57534228| 1386936|
|      NU|57506844| 1385346|
|      ON|57484956| 1384470|
|      PE|57325164| 1379802|
|      QC|57467886| 1385076|
|      SK|57385152| 1382364|
|      YT|57377556| 1383900|
+--------+--------+--------+

2109.346231ms
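
As a cross-check of the numbers above, the following sketch recomputes the row total and the speed ratio from the figures in the two outputs. It assumes that "M" in the Presto summary means 10^6 and that the Presto elapsed time can be estimated from the reported row rate, since the "0:09" display is only given to the whole second. The variable names are illustrative.

```python
# count(*) per province, copied from the result tables above.
counts = [
    1385010, 1385580, 1386510, 1386960, 1383774, 1384272, 1386936,
    1385346, 1384470, 1379802, 1385076, 1382364, 1383900,
]
total_rows = sum(counts)

# Spark elapsed time as printed by the benchmark helper (milliseconds).
spark_seconds = 2109.346231 / 1000.0

# Presto elapsed time: displayed as 0:09, and also estimated from the
# reported throughput of 1.92M rows/s (assumption: M = 10**6).
presto_seconds_displayed = 9.0
presto_seconds_estimated = total_rows / 1.92e6

ratio_displayed = presto_seconds_displayed / spark_seconds
ratio_estimated = presto_seconds_estimated / spark_seconds

# Spark throughput in rows per second, for comparison with Presto's 1.92M.
spark_rows_per_second = total_rows / spark_seconds

print(total_rows, round(ratio_displayed, 2), round(ratio_estimated, 2))
```

The per-province counts add up to exactly 18,000,000 rows. Using the displayed 9 seconds gives a ratio of roughly 4.3, and using the throughput-based estimate of about 9.4 seconds gives roughly 4.4, which is consistent with the "about 4.5 times" figure quoted above.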