Apache CarbonData Dev Mailing List archive

Re: [DISCUSSION] Distributed Index Cache Server

Posted by kunalkapoor on Feb 17, 2019; 11:37am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/DISCUSSION-Distributed-Index-Cache-Server-tp75008p75185.html

Hi Dhatchayani,
1. The next query will take care of removing the cache for the deleted
segments. The request is designed to contain the invalid segments as well,
so that the corresponding datamaps can be removed from the cache.
2. There is no impact on the CLEAN FILES command.
3. The column cache (COLUMN_META_CACHE) will behave the same way as it does
today. If an ALTER command is fired, the cache will be changed accordingly.
4. This is a valid point. The query retry configuration should be disabled
so that the datamaps are cached in the assigned executor only. Even if the
query fails, the carbon driver will take care of the pruning.
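
To make points 1 and 4 concrete, here is a rough Java sketch. It is not
CarbonData code: IndexCacheSketch, PruneRequest, ExecutorCache and
pruneOrFallBack are hypothetical names, datamaps are stood in for by plain
strings, and an executor failure is simulated with a flag. It only shows the
idea that each request carries the invalid segments so the cache can evict
them, and that the driver prunes on its own when the assigned executor fails.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IndexCacheSketch {

    /** Hypothetical request: segments to prune plus segments that are no longer valid. */
    static final class PruneRequest {
        final List<String> validSegments;
        final List<String> invalidSegments;

        PruneRequest(List<String> validSegments, List<String> invalidSegments) {
            this.validSegments = validSegments;
            this.invalidSegments = invalidSegments;
        }
    }

    /** Hypothetical executor-side cache, keyed by segment id. */
    static final class ExecutorCache {
        private final Map<String, String> datamapBySegment = new HashMap<>();
        boolean failed = false; // set to true to simulate an executor failure

        List<String> prune(PruneRequest request) {
            if (failed) {
                throw new IllegalStateException("executor unavailable");
            }
            // Point 1: evict entries for the invalid (deleted) segments first.
            for (String segment : request.invalidSegments) {
                datamapBySegment.remove(segment);
            }
            // Then load (or reuse) the datamap of every valid segment.
            List<String> result = new ArrayList<>();
            for (String segment : request.validSegments) {
                result.add(datamapBySegment.computeIfAbsent(segment, s -> "cached-datamap-" + s));
            }
            return result;
        }

        boolean isCached(String segment) {
            return datamapBySegment.containsKey(segment);
        }
    }

    /** Point 4: no retry on another executor; the driver prunes if the request fails. */
    static List<String> pruneOrFallBack(ExecutorCache assigned, PruneRequest request) {
        try {
            return assigned.prune(request);
        } catch (IllegalStateException e) {
            // Driver-side pruning; nothing is cached on any other executor.
            List<String> result = new ArrayList<>();
            for (String segment : request.validSegments) {
                result.add("driver-datamap-" + segment);
            }
            return result;
        }
    }

    public static void main(String[] args) {
        ExecutorCache cache = new ExecutorCache();
        pruneOrFallBack(cache, new PruneRequest(Arrays.asList("0", "1"), Collections.emptyList()));
        // Segment 1 has been deleted, so the next query reports it as invalid.
        pruneOrFallBack(cache, new PruneRequest(Arrays.asList("0"), Arrays.asList("1")));
        System.out.println(cache.isCached("0") + " " + cache.isCached("1"));
        // Simulated executor failure: the driver does the pruning instead.
        cache.failed = true;
        System.out.println(
            pruneOrFallBack(cache, new PruneRequest(Arrays.asList("0"), Collections.emptyList())));
    }
}
```

The eviction is done lazily, as part of the next query, so the delete
operation itself does not need to contact the cache.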

Thanks
Kunal Kapoor

On Thu, Feb 14, 2019 at 11:56 PM dhatchayani <[hidden email]>
wrote:

> Hi Kunal,
>
> This feature looks great from the design.
>
> Still, I need some more clarifications on the below points.
>
> (1) How will segment deletion be handled? Will the next query take care
> of clearing this segment and updating the driver map, or will the delete
> operation do the update?
> (2) Is there any impact on the CLEAN FILES command?
> (3) Is there any impact on the COLUMN_META_CACHE property? This is a
> session property and can be changed through the ALTER command. If this
> property is altered, the current cache implementation invalidates the
> datamap cache accordingly, where required.
> (4) Will executor shutdowns/failures be handled by the Spark cluster
> manager? If an executor fails during query execution, will the tasks
> be re-launched on any other available executor?