Apache CarbonData Dev Mailing List archive

Re: Help, carbondata issues on spark

Posted by Jacky Li on
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Help-carbondata-issues-on-spark-tp38089p38546.html

> On Feb 2, 2018, at 11:30 AM, ilegend <[hidden email]> wrote:
>
> Hi guys
> We're testing carbondata for our project. The performance of carbondata is better than parquet under the special rules, but there are some problems. Do you have any solutions for our issues?
> HDFS 2.6, Spark 2.1, carbondata 1.3
> 1. No multi-level partitions; we need three levels of partitions, like year, day, hour

If you are looking for OLAP on timeseries data, you can try the timeseries feature in 1.3. See the timeseries section in https://github.com/apache/carbondata/blob/master/docs/data-management-on-carbondata.md#pre-aggregate-tables
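
As a rough sketch of what that could look like, the snippet below creates a yearly rollup with a timeseries datamap. It assumes the 1.3 DDL form described in the linked document (`USING "timeseries"` with `event_time` and `year_granularity` properties); the table `sales` and its columns `order_time`, `country`, `quantity` are hypothetical, and the exact property names should be checked against the docs for your version.

```scala
import org.apache.spark.sql.SparkSession

object TimeseriesRollupSketch {
  // `carbon` is assumed to be a session created via CarbonSession.
  // Table and column names here are hypothetical placeholders.
  def createYearlyRollup(carbon: SparkSession): Unit = {
    carbon.sql(
      """
        |CREATE DATAMAP agg_year
        |ON TABLE sales
        |USING "timeseries"
        |DMPROPERTIES (
        |  'event_time'='order_time',
        |  'year_granularity'='1')
        |AS SELECT order_time, country, sum(quantity)
        |FROM sales
        |GROUP BY order_time, country
      """.stripMargin)
  }
}
```

Similar datamaps could be created for other granularities (for example day or hour), which may cover the year/day/hour use case without multi-level partitions.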

> 2. Spark needs to import the carbondata jar; we would rather not modify the existing SQL algorithm

I think if you are using CarbonSession, you get all the built-in SQL optimization support from carbon. You do not need to modify your spark jar.
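
For reference, a minimal sketch of creating a CarbonSession and running existing SQL through it unchanged. The store path and the table name `existing_table` are hypothetical; the master URL is assumed to come from spark-submit, and the carbondata assembly jar is assumed to be on the classpath (for example via `--jars`).

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.CarbonSession._ // adds getOrCreateCarbonSession to the builder

object CarbonSessionSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical HDFS location for the carbon store
    val storePath = "hdfs://namenode:8020/carbon/store"

    val carbon = SparkSession
      .builder()
      .appName("carbon-session-sketch")
      .getOrCreateCarbonSession(storePath)

    // Existing SQL can be submitted as-is through the carbon session
    carbon.sql("SELECT count(*) FROM existing_table").show()

    carbon.stop()
  }
}
```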

> 3. Low stability; inserts fail frequently

Is it a memory issue?

>
> Look forward to your reply.
>
> Sent from my iPhone