Posted by
Ajantha Bhat on
Jul 30, 2020; 9:34am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Discussion-SI-support-Complex-Array-Type-tp98066p98104.html
Hi David & Indhumathi,
Storing an Array of String as a plain String column in the SI by flattening
[with row level position reference] can result in slow performance for the
following reasons:
* Queries can contain multiple array_contains() filters or multiple
array[0] = 'x' filters.
* The join solution mentioned can result in multiple scans (one for every
complex filter condition), which can slow down SI performance.
* Row level SI can slow down SI performance when the filter returns a huge
result set.
* To support multiple SIs on a single table, the complex SI will use row
level position references and the primitive SI will use blocklet level
position references. The join needs extra logic and time.
* Solution 2 cannot support struct column SI in the future, so it cannot
be a generic solution.
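To make the multiple-scan concern concrete, here is a minimal sketch in
Python. It is a toy in-memory model, not CarbonData's SI implementation:
main_table, build_flattened_si and rows_matching_all are hypothetical names
used only for illustration. It assumes the flattened SI maps each array
element to row level references, so every array_contains() condition costs
one separate SI lookup and the results have to be intersected afterwards.

```python
from collections import defaultdict

# Hypothetical main table: row id -> value of an array<string> column.
main_table = {
    0: ["a", "b"],
    1: ["b", "c"],
    2: ["a", "c"],
    3: ["c"],
}


def build_flattened_si(table):
    """Flatten every array into (element -> set of row ids) entries."""
    si = defaultdict(set)
    for row_id, values in table.items():
        for value in values:
            si[value].add(row_id)
    return si


def rows_matching_all(si, wanted):
    """Evaluate array_contains(col, w) for every w in wanted, AND-ed together.

    Returns the sorted matching row ids and the number of SI lookups made.
    """
    lookups = 0
    result = None
    for value in wanted:
        hits = si.get(value, set())  # one SI lookup per filter condition
        lookups += 1
        # The intersection stands in for the join between per-filter results.
        result = hits if result is None else result & hits
    return sorted(result or []), lookups


si = build_flattened_si(main_table)
rows, lookups = rows_matching_all(si, ["a", "c"])
print(rows, lookups)
```

In this toy model the number of lookups grows with the number of filter
conditions, and each intermediate set is as large as the number of rows
matching that single condition.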
Considering the above points, *solution 2 is a very good solution if only
one filter exists* for the complex column, *but it is not a good solution
for all scenarios.*
*So, I have to go with solution 1, or wait for other people's opinions
or new solutions.*
Thanks,
Ajantha
On Thu, Jul 30, 2020 at 1:19 PM David CaiQiang <
[hidden email]> wrote: