master timeSeries DATAMAP does not work well as 1.4.1

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

master timeSeries DATAMAP does not work well as 1.4.1

aaron
Hi All,

It seems that master timeSeries DATAMAP does not work well as 1.4.1, could
you please have a look?


Demo data:

|market_code|device_code|              
date|country_code|category_id|product_id|revenue|
+-----------+-----------+-------------------+------------+-----------+----------+-------+
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000000|  73481|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000001| 713316|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000002| 657503|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000003| 764930|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000004| 835665|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000005| 599234|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000006|  22451|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000007|  17284|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000008| 118846|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000009| 735783|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000010| 698596|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000011| 788919|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000012| 817443|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000013| 839801|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000014| 880020|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000015| 808019|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000016| 740226|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000017| 473469|
|apple-store|  ios-phone|2018-02-01 00:00:00|          CA|        
1|1000000018| 322765|

SQL:

carbon.sql("DROP TABLE IF EXISTS test_store_int")
    val createMainTableSql = s"""
      | CREATE TABLE test_store_int(
      | market_code VARCHAR(50),
      | device_code VARCHAR(50),
      | date TIMESTAMP,
      | country_code CHAR(2),
      | category_id INTEGER,
      | product_id LONG,
      | revenue INTEGER
      | )
      | STORED BY 'carbondata'
      | TBLPROPERTIES(
      | 'SORT_COLUMNS'='market_code, device_code, country_code, category_id,
date',
      | 'DICTIONARY_INCLUDE'='market_code, device_code, country_code,
category_id, date, product_id',
      | 'NO_INVERTED_INDEX'='revenue',
      | 'SORT_SCOPE'='GLOBAL_SORT'
      | )
    """.stripMargin
    print(createMainTableSql)
    carbon.sql(createMainTableSql)

    carbon.sql("DROP DATAMAP test_store_int_agg_by_month ON TABLE
test_store_int")
    val createTimeSeriesTableSql = s"""
      | CREATE DATAMAP test_store_int_agg_by_month ON TABLE test_store_int
      | USING 'timeSeries'
      | DMPROPERTIES (
      | 'EVENT_TIME'='date',
      | 'MONTH_GRANULARITY'='1')
      | AS SELECT date, market_code, device_code, country_code, category_id,
product_id, sum(revenue), count(revenue), min(revenue), max(revenue) FROM
test_store_int
      | GROUP BY date, market_code, device_code, country_code, category_id,
product_id
    """.stripMargin
    print(createTimeSeriesTableSql)
    carbon.sql(createTimeSeriesTableSql)

Query plan:

1. By month, work
carbon.sql(s"""explain select market_code, device_code, country_code,
category_id, product_id, sum(revenue), timeseries(date, 'month') from
test_store_int group by timeseries(date, 'month'), market_code, device_code,
country_code, category_id, product_id""".stripMargin).show(200,
truncate=false)

|== CarbonData Profiler ==
Query rewrite based on DataMap:
 - test_store_int_agg_by_month (timeseries)
Table Scan on test_store_int_test_store_int_agg_by_month
 - total blocklets: 4
 - filter: none
 - pruned by Main DataMap
    - skipped blocklets: 0

2. By year,  not work
carbon.sql(s"""explain select market_code, device_code, country_code,
category_id, product_id, sum(revenue), timeseries(date, 'year') from
test_store_int group by timeseries(date, 'year'), market_code, device_code,
country_code, category_id, product_id""".stripMargin).show(200,
truncate=false)

|== CarbonData Profiler ==
Table Scan on test_store_int
 - total blocklets: 16
 - filter: none
 - pruned by Main DataMap
    - skipped blocklets: 0


Thanks
Aaron                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              



--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
Reply | Threaded
Open this post in threaded view
|

Re: master timeSeries DATAMAP does not work well as 1.4.1

aaron
My mistake. MONTH and YEAR should roll up from DAY



--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/