[Discussion] SI support Complex Array Type

Posted by Indhumathi on Jul 30, 2020; 7:19am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Discussion-SI-support-Complex-Array-Type-tp98066.html

Hi community,

I am currently working on supporting secondary index (SI) on the complex array type.
To support it, we need to decide how to store the Array type
in the SI table to get better performance.

Solution 1:
Store Array as complex(ARRAY) type in secondary index table.

Cons:
Pruning large array data on both the SI table and the main table adds overhead
and might not give much performance benefit.

Solution 2:
Flatten the Array data and store each element as its child DataType in the secondary
index table, which can provide a benefit in some scenarios compared to
Solution 1. (I have raised a PR with this solution.) Initially, only one level of
Array will be supported.
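
To make the flattening idea concrete, here is a minimal sketch in Python. It is not CarbonData code: the table contents, the position labels ("row0" etc.) and the helper names flatten_to_si and lookup are all hypothetical, chosen only to show one SI row being produced per array element.

```python
from typing import List, Tuple

# Hypothetical main-table rows: (position reference, array column value).
main_table: List[Tuple[str, List[str]]] = [
    ("row0", ["red", "blue"]),
    ("row1", ["blue"]),
    ("row2", ["green", "red"]),
]


def flatten_to_si(rows: List[Tuple[str, List[str]]]) -> List[Tuple[str, str]]:
    """Emit one SI row per array element: (element, position reference)."""
    return [(element, pos) for pos, values in rows for element in values]


def lookup(si_rows: List[Tuple[str, str]], value: str) -> List[str]:
    """Return the main-table positions whose array contains `value`."""
    return sorted({pos for element, pos in si_rows if element == value})


si_rows = flatten_to_si(main_table)
print(lookup(si_rows, "red"))
```

With the array flattened, a single array_contains style lookup becomes a plain equality filter on the child type, which is where the benefit over Solution 1 would come from in this sketch.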

With this solution, I have also added support to prune the SI on rowId (keeping the
position id up to rowId instead of blockletId) for complex types, for better
performance.
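
The effect of keeping the position id up to rowId can be sketched as follows. This is only an illustration under assumptions: the blocklet layout, the tuple-shaped position references and the helpers build_si and rows_to_scan are hypothetical and do not reflect the actual CarbonData position id format.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical main table: blocklet id -> rows, each row holding one array value.
blocklets: Dict[str, List[List[str]]] = {
    "blocklet0": [["a", "b"], ["c"], ["d"]],
    "blocklet1": [["e"], ["a", "f"], ["g"]],
}


def build_si(data: Dict[str, List[List[str]]], row_level: bool) -> Dict[str, Set[Tuple]]:
    """Map each array element to position references.

    row_level=True keeps (blocklet id, row id); otherwise only (blocklet id,).
    """
    si: Dict[str, Set[Tuple]] = {}
    for blocklet_id, rows in data.items():
        for row_id, values in enumerate(rows):
            for value in values:
                ref = (blocklet_id, row_id) if row_level else (blocklet_id,)
                si.setdefault(value, set()).add(ref)
    return si


def rows_to_scan(data: Dict[str, List[List[str]]], refs: Set[Tuple]) -> int:
    """Count main-table rows that must be read for the given references."""
    total = 0
    for ref in refs:
        # A row-level reference reads one row; a blocklet-level one reads the whole blocklet.
        total += 1 if len(ref) == 2 else len(data[ref[0]])
    return total


blocklet_si = build_si(blocklets, row_level=False)
row_si = build_si(blocklets, row_level=True)
print(rows_to_scan(blocklets, blocklet_si["a"]), rows_to_scan(blocklets, row_si["a"]))
```

In this toy layout a lookup for "a" touches every row of both blocklets with blocklet-level references, but only the two matching rows with row-level references.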

Cons:
With this solution, a query that combines more than one array_contains filter
with an expression such as AND cannot be supported on SI, since the array data
is flattened in the SI table.
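
One way to see this limitation is the small Python sketch below. It assumes the filter is evaluated row by row on the flattened SI table; the data and variable names are hypothetical and are not taken from the PR.

```python
from typing import List, Tuple

# Hypothetical main-table rows: (position reference, array column value).
main_table: List[Tuple[str, List[str]]] = [
    ("row0", ["red", "blue"]),
    ("row1", ["blue"]),
    ("row2", ["green", "red"]),
]

# Flattened SI rows: one (element, position reference) pair per array element.
si_rows = [(element, pos) for pos, values in main_table for element in values]

# Expected answer for: array_contains(col, 'red') AND array_contains(col, 'blue')
expected = [pos for pos, values in main_table if "red" in values and "blue" in values]

# The same predicate applied per flattened SI row: each row holds a single
# element, so it can never equal both 'red' and 'blue' at once.
si_result = [pos for element, pos in si_rows if element == "red" and element == "blue"]

print(expected, si_result)
```

The main table has a matching row, but the row-wise evaluation on the flattened SI table returns nothing, so such a filter cannot be pushed to the SI as-is.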

Inputs and suggestions for a new solution, or changes to the above solutions, are
most welcome.

Regards,
Indhumathi