Re: [DISCUSSION] Distributed Index Cache Server

Posted by kunalkapoor on Feb 13, 2019; 5:22am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/DISCUSSION-Distributed-Index-Cache-Server-tp75008p75102.html

Hi Manish,
Thank you for the suggestions.

1. I will add the impacted areas to the design document.
2. Yes, the mapping has to be updated when an executor goes down, and when it
comes back up the splits have to be scheduled accordingly. I will update the
design to cover this.
3. I think the distribution should be based on the index files and not the
segments, so that the work is still distributed across executors even when
the user has set only 1 segment for the query.
4. This is already updated in the design.
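
To make points 2 and 3 concrete, here is a minimal sketch of a driver-side mapping from index files to executors that is repaired when an executor goes down. It is only an illustration, not the CarbonData implementation: the function name `assign_index_files`, the executor IDs and the index file paths are all hypothetical, and the least-loaded placement policy is an assumption.

```python
def assign_index_files(index_files, executors, existing=None):
    """Map each index file to one live executor.

    Hypothetical helper: placements from `existing` are kept when their
    executor is still alive; everything else goes to the least-loaded
    live executor (ties broken by executor name).
    """
    if not executors:
        raise ValueError("no live executors to assign index files to")

    alive = set(executors)
    wanted = set(index_files)

    # Keep placements whose executor is still alive.
    mapping = {
        f: e for f, e in (existing or {}).items() if e in alive and f in wanted
    }

    # Count how many index files each live executor already holds.
    load = {e: 0 for e in executors}
    for e in mapping.values():
        load[e] += 1

    # Place the unassigned (new or orphaned) index files.
    for f in sorted(wanted):
        if f not in mapping:
            target = min(executors, key=lambda e: (load[e], e))
            mapping[f] = target
            load[target] += 1
    return mapping


# One segment with four index files is still spread over two executors.
files = [
    "Segment_0/a.carbonindex",
    "Segment_0/b.carbonindex",
    "Segment_0/c.carbonindex",
    "Segment_0/d.carbonindex",
]
mapping = assign_index_files(files, ["exec-1", "exec-2"])

# exec-2 goes down: only its index files are moved.
after_failure = assign_index_files(files, ["exec-1"], existing=mapping)
```

This sketch does not rebalance when an executor comes back up; in that case the existing placements are simply kept. Whether and how to move index files back is a policy decision for the design document.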

On Tue, Feb 12, 2019 at 10:18 PM manishgupta88 <[hidden email]>
wrote:

> +1
>
> 1. Add the impacted areas in design document.
> 2. If any executor goes down, then update the index-cache-to-executor
> mapping in the driver accordingly.
> 3. Even though the cache would be divided based on index files, the minimum
> unit of cache needs to be fixed. Example: 1 segment cache should belong to 1
> executor only.
> 4. One possible suggestion: Instead of reading the splits in the Carbon
> driver, let each executor in the index server write to a file and pass the
> List[FilePaths] to the driver, and let each Carbon executor read from
> that path. This is for the case when the number of splits exceeds the
> threshold.
>
> Regards
> Manish Gupta
>
>
>
> --
> Sent from:
> http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
>
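
For reference, suggestion 4 in the quoted message could look roughly like the sketch below. This is only an illustration under stated assumptions: `publish_splits`, `read_splits`, the JSON file format and the per-executor threshold are all hypothetical and are not taken from the design document or from CarbonData code.

```python
import json
import os
import tempfile

SPLIT_THRESHOLD = 3  # hypothetical per-executor limit, for illustration only


def publish_splits(splits, out_dir, executor_id, threshold=SPLIT_THRESHOLD):
    """Index-server executor side (hypothetical).

    Small results are returned inline; larger ones are written to a file
    and only the file path is handed back to the driver.
    """
    if len(splits) <= threshold:
        return {"inline": list(splits), "paths": []}
    path = os.path.join(out_dir, f"splits-{executor_id}.json")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(list(splits), fh)
    return {"inline": [], "paths": [path]}


def read_splits(paths):
    """Carbon executor side (hypothetical): load splits from forwarded paths."""
    splits = []
    for path in paths:
        with open(path, encoding="utf-8") as fh:
            splits.extend(json.load(fh))
    return splits


# The driver only forwards the paths; it never loads the splits itself.
out_dir = tempfile.mkdtemp()
result = publish_splits(["s1", "s2", "s3", "s4"], out_dir, "exec-1")
recovered = read_splits(result["paths"])
```

A real implementation would write to storage that every executor can reach, and would need to clean up the files after the query finishes.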