Apache CarbonData Dev Mailing List archive

Re: [Discussion]Presto Queries leveraging Secondary Index

Posted by akashrn5 on Jan 18, 2021; 4:12am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/Discussion-Presto-Queries-leveraging-Secondary-Index-tp105291p105429.html

Hi Venu,

Thanks for the suggestions.

1. Option 1 is not a good idea; I think performance will be bad.
2. For option 2: we already have other indexes, Lucene and bloom, where
distributed pruning happens. Lucene is also an index stored along with the
table, not as a separate table like SI, so we scan Lucene in a distributed
job and then return the pruned index for the filter expression. Similarly,
we can call SI to scan and prune, but since that needs a Spark job, the
index server is the only option.
So we can use the index server for scanning, but I am afraid it may impact
other concurrent queries. I would suggest doing a POC with the index server
first; that should show us any other bottlenecks with this approach, and
then we can decide and start the design.
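
To make the flow in point 2 a bit more concrete, here is a rough sketch of
distributed pruning as a toy model. It is not CarbonData or index server
code: the partition layout, the blocklet id strings, and the function names
(prune_partition, distributed_prune) are all made up for illustration, and a
thread pool stands in for the executors that would scan the index.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy model, not CarbonData's API: each index partition maps a
# column value to the main-table blocklet ids that contain that value.
INDEX_PARTITIONS = [
    {"bangalore": {"seg0/blk1"}, "delhi": {"seg0/blk2"}},
    {"bangalore": {"seg1/blk0"}, "pune": {"seg1/blk3"}},
]


def prune_partition(partition, value):
    """Scan one index partition and return the matching blocklet ids."""
    return partition.get(value, set())


def distributed_prune(partitions, value, workers=2):
    """Fan the index scan out over workers, then merge the pruned blocklets."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker handles one partition, like one task of a distributed job.
        results = list(pool.map(lambda p: prune_partition(p, value), partitions))
    pruned = set()
    for matched in results:
        pruned |= matched
    return pruned


# Only these blocklets would then be read from the main table.
print(sorted(distributed_prune(INDEX_PARTITIONS, "bangalore")))
```

The concern about concurrent queries shows up here too: every filter query
occupies workers in the shared pool, which is the kind of contention a POC
would need to measure.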

If you have already done a POC, have some results, and the design is ready,
we can review that.

Thanks

Regards
Akash
