Login  Register

master timeSeries DATAMAP does not work well as 1.4.1

classic Classic list List threaded Threaded
2 messages Options Options
Embed post
Permalink
Reply | Threaded
Open this post in threaded view
| More
Print post
Permalink

master timeSeries DATAMAP does not work well as 1.4.1

aaron
Hi All,

It seems that master timeSeries DATAMAP does not work well as 1.4.1, could
you please have a look?


Demo data:

|market_code|device_code|              
date|country_code|category_id|product_id|revenue|
+-----------+-----------+-------------------+------------+-----------+----------+-------+
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000000|  73481|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000001| 713316|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000002| 657503|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000003| 764930|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000004| 835665|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000005| 599234|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000006|  22451|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000007|  17284|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000008| 118846|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000009| 735783|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000010| 698596|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000011| 788919|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000012| 817443|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000013| 839801|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000014| 880020|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000015| 808019|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000016| 740226|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000017| 473469|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000018| 322765|

SQL:

carbon.sql("DROP TABLE IF EXISTS test_store_int")
    val createMainTableSql = s"""
      | CREATE TABLE test_store_int(
      | market_code VARCHAR(50),
      | device_code VARCHAR(50),
      | date TIMESTAMP,
      | country_code CHAR(2),
      | category_id INTEGER,
      | product_id LONG,
      | revenue INTEGER
      | )
      | STORED BY 'carbondata'
      | TBLPROPERTIES(
      | 'SORT_COLUMNS'='market_code, device_code, country_code, category_id,
date',
      | 'DICTIONARY_INCLUDE'='market_code, device_code, country_code,
category_id, date, product_id',
      | 'NO_INVERTED_INDEX'='revenue',
      | 'SORT_SCOPE'='GLOBAL_SORT'
      | )
    """.stripMargin
    print(createMainTableSql)
    carbon.sql(createMainTableSql)

    carbon.sql("DROP DATAMAP test_store_int_agg_by_month ON TABLE
test_store_int")
    val createTimeSeriesTableSql = s"""
      | CREATE DATAMAP test_store_int_agg_by_month ON TABLE test_store_int
      | USING 'timeSeries'
      | DMPROPERTIES (
      | 'EVENT_TIME'='date',
      | 'MONTH_GRANULARITY'='1')
      | AS SELECT date, market_code, device_code, country_code, category_id,
product_id, sum(revenue), count(revenue), min(revenue), max(revenue) FROM
test_store_int
      | GROUP BY date, market_code, device_code, country_code, category_id,
product_id
    """.stripMargin
    print(createTimeSeriesTableSql)
    carbon.sql(createTimeSeriesTableSql)

Query plan:

1. By month, work
carbon.sql(s"""explain select market_code, device_code, country_code,
category_id, product_id, sum(revenue), timeseries(date, 'month') from
test_store_int group by timeseries(date, 'month'), market_code, device_code,
country_code, category_id, product_id""".stripMargin).show(200,
truncate=false)

|== CarbonData Profiler ==
Query rewrite based on DataMap:
 - test_store_int_agg_by_month (timeseries)
Table Scan on test_store_int_test_store_int_agg_by_month
 - total blocklets: 4
 - filter: none
 - pruned by Main DataMap
    - skipped blocklets: 0

2. By year,  not work
carbon.sql(s"""explain select market_code, device_code, country_code,
category_id, product_id, sum(revenue), timeseries(date, 'year') from
test_store_int group by timeseries(date, 'year'), market_code, device_code,
country_code, category_id, product_id""".stripMargin).show(200,
truncate=false)

|== CarbonData Profiler ==
Table Scan on test_store_int
 - total blocklets: 16
 - filter: none
 - pruned by Main DataMap
    - skipped blocklets: 0


Thanks
Aaron                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              



--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
Reply | Threaded
Open this post in threaded view
| More
Print post
Permalink

Re: master timeSeries DATAMAP does not work well as 1.4.1

aaron
My mistake. MONTH and YEAR should roll up from DAY



--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/