Posted by
aaron on
Sep 05, 2018; 5:22am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/master-timeSeries-DATAMAP-does-not-work-well-as-1-4-1-tp61399.html
Hi All,
It seems that master timeSeries DATAMAP does not work well as 1.4.1, could
you please have a look?
Demo data:
|market_code|device_code|
date|country_code|category_id|product_id|revenue|
+-----------+-----------+-------------------+------------+-----------+----------+-------+
|apple-store| ios-phone|2018-02-01 00:00:00| CA|
1|1000000000| 73481|
|apple-store| ios-phone|2018-02-01 00:00:00| CA|
1|1000000001| 713316|
|apple-store| ios-phone|2018-02-01 00:00:00| CA|
1|1000000002| 657503|
|apple-store| ios-phone|2018-02-01 00:00:00| CA|
1|1000000003| 764930|
|apple-store| ios-phone|2018-02-01 00:00:00| CA|
1|1000000004| 835665|
|apple-store| ios-phone|2018-02-01 00:00:00| CA|
1|1000000005| 599234|
|apple-store| ios-phone|2018-02-01 00:00:00| CA|
1|1000000006| 22451|
|apple-store| ios-phone|2018-02-01 00:00:00| CA|
1|1000000007| 17284|
|apple-store| ios-phone|2018-02-01 00:00:00| CA|
1|1000000008| 118846|
|apple-store| ios-phone|2018-02-01 00:00:00| CA|
1|1000000009| 735783|
|apple-store| ios-phone|2018-02-01 00:00:00| CA|
1|1000000010| 698596|
|apple-store| ios-phone|2018-02-01 00:00:00| CA|
1|1000000011| 788919|
|apple-store| ios-phone|2018-02-01 00:00:00| CA|
1|1000000012| 817443|
|apple-store| ios-phone|2018-02-01 00:00:00| CA|
1|1000000013| 839801|
|apple-store| ios-phone|2018-02-01 00:00:00| CA|
1|1000000014| 880020|
|apple-store| ios-phone|2018-02-01 00:00:00| CA|
1|1000000015| 808019|
|apple-store| ios-phone|2018-02-01 00:00:00| CA|
1|1000000016| 740226|
|apple-store| ios-phone|2018-02-01 00:00:00| CA|
1|1000000017| 473469|
|apple-store| ios-phone|2018-02-01 00:00:00| CA|
1|1000000018| 322765|
SQL:
carbon.sql("DROP TABLE IF EXISTS test_store_int")
val createMainTableSql = s"""
| CREATE TABLE test_store_int(
| market_code VARCHAR(50),
| device_code VARCHAR(50),
| date TIMESTAMP,
| country_code CHAR(2),
| category_id INTEGER,
| product_id LONG,
| revenue INTEGER
| )
| STORED BY 'carbondata'
| TBLPROPERTIES(
| 'SORT_COLUMNS'='market_code, device_code, country_code, category_id,
date',
| 'DICTIONARY_INCLUDE'='market_code, device_code, country_code,
category_id, date, product_id',
| 'NO_INVERTED_INDEX'='revenue',
| 'SORT_SCOPE'='GLOBAL_SORT'
| )
""".stripMargin
print(createMainTableSql)
carbon.sql(createMainTableSql)
carbon.sql("DROP DATAMAP test_store_int_agg_by_month ON TABLE
test_store_int")
val createTimeSeriesTableSql = s"""
| CREATE DATAMAP test_store_int_agg_by_month ON TABLE test_store_int
| USING 'timeSeries'
| DMPROPERTIES (
| 'EVENT_TIME'='date',
| 'MONTH_GRANULARITY'='1')
| AS SELECT date, market_code, device_code, country_code, category_id,
product_id, sum(revenue), count(revenue), min(revenue), max(revenue) FROM
test_store_int
| GROUP BY date, market_code, device_code, country_code, category_id,
product_id
""".stripMargin
print(createTimeSeriesTableSql)
carbon.sql(createTimeSeriesTableSql)
Query plan:
1. By month, work
carbon.sql(s"""explain select market_code, device_code, country_code,
category_id, product_id, sum(revenue), timeseries(date, 'month') from
test_store_int group by timeseries(date, 'month'), market_code, device_code,
country_code, category_id, product_id""".stripMargin).show(200,
truncate=false)
|== CarbonData Profiler ==
Query rewrite based on DataMap:
- test_store_int_agg_by_month (timeseries)
Table Scan on test_store_int_test_store_int_agg_by_month
- total blocklets: 4
- filter: none
- pruned by Main DataMap
- skipped blocklets: 0
2. By year, not work
carbon.sql(s"""explain select market_code, device_code, country_code,
category_id, product_id, sum(revenue), timeseries(date, 'year') from
test_store_int group by timeseries(date, 'year'), market_code, device_code,
country_code, category_id, product_id""".stripMargin).show(200,
truncate=false)
|== CarbonData Profiler ==
Table Scan on test_store_int
- total blocklets: 16
- filter: none
- pruned by Main DataMap
- skipped blocklets: 0
Thanks
Aaron
--
Sent from:
http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/