I have some suggestions and questions.
1. In DMPROPERTIES, instead of 'timestamp_column' I suggest using
'timeseries_column'. Also, please mention which datatypes this column
supports and update the document with all the supported datatypes (a rough
DDL sketch follows after point 3).
2. Querying directly on this datamap table is also supported, right?
3. If the user has not created a day granularity datamap, but only an hour
granularity datamap, and a query comes with day granularity, will the data
be fetched from the hour granularity datamap and rolled up?
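
To make points 1 and 3 concrete, below is a rough sketch (in Spark/Scala)
of the DDL and query I have in mind; the 'timeseries_column' and
'granularity' properties, the timeseries UDF usage and the table/column
names are only illustrative for this proposal, not a final syntax:

  import org.apache.spark.sql.SparkSession

  object TimeseriesMvSketch {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder()
        .appName("timeseries-mv-sketch").getOrCreate()

      // Hour-granularity MV datamap on the main table, using the proposed
      // 'timeseries_column' property name instead of 'timestamp_column'.
      spark.sql(
        """CREATE DATAMAP sales_hour_agg ON TABLE sales
          |USING 'mv'
          |DMPROPERTIES ('timeseries_column'='order_time',
          |              'granularity'='hour')
          |AS SELECT timeseries(order_time, 'hour') AS ts, SUM(amount) AS amt
          |   FROM sales GROUP BY timeseries(order_time, 'hour')""".stripMargin)

      // Point 3: a day-granularity query when only the hour datamap exists;
      // the expectation is that it is answered by rolling up the hour datamap.
      spark.sql(
        """SELECT timeseries(order_time, 'day') AS day, SUM(amount)
          |FROM sales GROUP BY timeseries(order_time, 'day')""".stripMargin).show()
    }
  }
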
> Hi xuchuanyin,
>
> Thanks for the comments/suggestions.
>
> 1. Preaggregate is productized, but not timeseries with preaggregate; I
> think you may have confused the two.
> 2. Limitations like auto sampling/rollup, which we will be supporting now,
> and retention policies, etc.
> 3. segmentTimestampMin: I will consider this in the design.
> 4. RP is added as a separate task; I thought that instead of maintaining
> two variables it is better to maintain one and parse it. But I will
> consider your point based on feasibility during implementation.
> 5. We use an accumulator which takes a list; before writing the index
> files we take the min/max of the timestamp column, fill the accumulator,
> and then access accumulator.value in the driver after the load is
> finished.
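>
> Roughly, the idea looks like the sketch below (just an illustration of the
> accumulator usage, not the final code; the accumulator name and the sample
> values are placeholders):
>
>   import org.apache.spark.sql.SparkSession
>   import scala.collection.JavaConverters._
>
>   object MinMaxAccumulatorSketch {
>     def main(args: Array[String]): Unit = {
>       val spark = SparkSession.builder()
>         .appName("minmax-sketch").getOrCreate()
>       val sc = spark.sparkContext
>
>       // One (min, max) pair per load task is collected here.
>       val minMax =
>         sc.collectionAccumulator[(Long, Long)]("segmentTimestampMinMax")
>
>       // Stand-in for the per-task write path: each partition computes the
>       // min/max of its timestamp values and adds the pair to the
>       // accumulator before its index files are written.
>       val timestamps = sc.parallelize(
>         Seq(1569628800000L, 1569632400000L, 1569636000000L), numSlices = 2)
>       timestamps.foreachPartition { iter =>
>         val values = iter.toSeq
>         if (values.nonEmpty) minMax.add((values.min, values.max))
>       }
>
>       // Back on the driver, after the load finishes, fold the per-task
>       // pairs into the segment-level min/max.
>       val pairs = minMax.value.asScala
>       println(s"min=${pairs.map(_._1).min}, max=${pairs.map(_._2).max}")
>     }
>   }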
>
> Regards,
> Akash R Nilugal
>
> On 2019/09/28 10:46:31, xuchuanyin <[hidden email]> wrote:
> > Hi Akash, glad to see the feature proposed, and I have some questions
> > about this. Please note that some of the following items quote the
> > design document attached in the corresponding JIRA, with my comments
> > following after '==='.
> >
> > 1.
> > "Currently carbondata supports timeseries on preaggregate datamap, but
> > its an alpha feature"
> > ===
> > It has been some time since the preaggregate datamap was introduced and
> > it is still **alpha**. Why is it still not product-ready? Will the new
> > feature also run into a similar situation?
> >
> > 2.
> > "there are so many limitations when we compare and analyze the existing
> > timeseries database or projects which supports time series like apache
> > druid or influxdb"
> > ===
> > What are the actual limitations? Also, please give an example.
> >
> > 3.
> > "Segment_Timestamp_Min"
> > ===
> > Suggest using camel-case style like 'segmentTimestampMin'
> >
> > 4.
> > "RP is way of telling the system, for how long the data should be kept"
> > ===
> > Since the function is simple, I'd suggest using 'retentionTime'=15 and
> > 'timeUnit'='day' instead of 'RP'='15_days' (a tiny sketch of the
> > difference follows after these points).
> >
> > 5.
> > "When the data load is called for main table, use an spark accumulator to
> > get the maximum value of timestamp in that load and return to the load."
> > ===
> > How can you get the spark accumulator? The load is launched using
> > loading-by-dataframe, not global-sort-by-spark.
> >
> > 6.
> > For the rest of the content, I am still reading.
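> >
> > To illustrate point 4, a tiny sketch of the difference (the key names
> > here are only the ones proposed in this mail, nothing final):
> >
> >   // With two explicit keys no string parsing is needed, while the
> >   // combined 'RP'='15_days' form has to be split and validated.
> >   object RetentionPropertySketch {
> >     def main(args: Array[String]): Unit = {
> >       // Suggested alternative: two explicit properties.
> >       val props = Map("retentionTime" -> "15", "timeUnit" -> "day")
> >       val retention = props("retentionTime").toInt
> >       val unit = props("timeUnit")
> >
> >       // Combined form from the design doc: one property to be parsed.
> >       val Array(valueStr, unitStr) = "15_days".split("_", 2)
> >
> >       println(s"two keys     : $retention $unit")
> >       println(s"combined key : ${valueStr.toInt} $unitStr")
> >     }
> >   }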
> >
> >
> >
> >
> > --
> > Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/