Why not support global sort in partition table?

carbondata-newuser
Such a table can be created, but inserting data into it throws an error like:
org.apache.carbondata.spark.exception.MalformedCarbonCommandException: Don't support use global sort on partitioned table.




--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

Re: Why not support global sort in partition table?

Jacky Li
I think there is no technical reason it can't be supported; it just has not been implemented yet. I think it has not been implemented because:
1. With partitioning plus sorting, a query that puts the partition column in its predicate gets partition pruning, so the data it reads already looks much like globally sorted data.
2. Partitioning with global sort leads to two shuffles during data loading, so loading will be slower.
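
Point 1 can be illustrated with a toy model in plain Python. This is only a sketch of the idea, not CarbonData code: the rows, the column names (country as the partition column, name as the sort column) and the two load functions are all made up for illustration. It checks that scanning one pruned partition, sorted only within itself, gives the same ordered rows as filtering a globally sorted table.

```python
# Toy model, not CarbonData code: rows are (country, name) tuples.
# "country" stands in for the partition column, "name" for the sort column.
from collections import defaultdict

rows = [
    ("US", "mia"), ("CN", "wei"), ("US", "bob"),
    ("CN", "an"), ("US", "zoe"), ("CN", "li"),
]

def load_partitioned_sorted(rows):
    """Group rows by partition value, then sort each partition on its own."""
    partitions = defaultdict(list)
    for country, name in rows:
        partitions[country].append((country, name))
    return {key: sorted(part, key=lambda r: r[1])
            for key, part in partitions.items()}

def load_global_sorted(rows):
    """Sort the whole table by the sort column, ignoring partitions."""
    return sorted(rows, key=lambda r: r[1])

partitioned = load_partitioned_sorted(rows)
global_sorted = load_global_sorted(rows)

# Query with the partition column in the predicate: country = 'US'.
pruned_scan = partitioned["US"]                           # read one partition only
global_scan = [r for r in global_sorted if r[0] == "US"]  # filter the sorted table

print(pruned_scan == global_scan)  # True
```

The equivalence only holds when the predicate pins the partition column. A query that spans partitions would see several independently sorted runs rather than one global order, and that is the case where global sort on a partitioned table would add something.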

If users can tolerate these impacts and want to use partitions to make changing and deleting data easier, I think this feature would be welcome. So feel free to raise a JIRA for this.

Regards,
Jacky


> On July 19, 2018, at 3:00 PM, carbondata-newuser <[hidden email]> wrote:
>
> Such table can be created but if you insert data to the table.
> It will throw an error like:
> org.apache.carbondata.spark.exception.MalformedCarbonCommandException: Don't
> support use global sort on partitioned table.