[jira] [Created] (CARBONDATA-1213) Removed rowCountPercentage check and fixed IUD data load issue


Akash R Nilugal (Jira)
Manish Gupta created CARBONDATA-1213:
----------------------------------------

             Summary: Removed rowCountPercentage check and fixed IUD data load issue
                 Key: CARBONDATA-1213
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1213
             Project: CarbonData
          Issue Type: Bug
            Reporter: Manish Gupta
            Assignee: Manish Gupta
             Fix For: 1.2.0


Problems:
1. The row count percentage check is not required alongside the high cardinality threshold check.
2. IUD returns incorrect results when an update is performed on a high cardinality column.

Analysis:
1. Even when a column is identified as a high cardinality column, it is not converted to a no dictionary column because of a second parameter check, rowCountPercentage. The default value of rowCountPercentage is 80%. As a result, a column identified as high cardinality is still treated as a dictionary column if its cardinality is less than 80% of the total number of rows. This can still lead to executor lost failures due to memory constraints.
2. RLE on a column is not being set correctly. Due to incorrect code design, whether RLE is applicable to a column is decided in a different part of the code from the one that actually applies RLE to the column. Because of this, the footer is filled with incorrect RLE information and the query fails.
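
The first analysis point can be illustrated with a small sketch. This is not CarbonData code: the function names, parameter names, and the threshold value of 1,000,000 are assumptions chosen for illustration; only the 80% default for rowCountPercentage comes from the description above.

```python
# Illustrative sketch only; names and the threshold value are hypothetical.

def is_no_dictionary_with_row_check(distinct_count, total_rows,
                                    cardinality_threshold,
                                    row_count_percentage=80):
    """Behaviour described in the analysis: both checks must pass."""
    exceeds_threshold = distinct_count > cardinality_threshold
    # Integer arithmetic avoids floating point rounding in the percentage test.
    exceeds_row_share = distinct_count * 100 >= total_rows * row_count_percentage
    return exceeds_threshold and exceeds_row_share


def is_no_dictionary_threshold_only(distinct_count, cardinality_threshold):
    """Behaviour after removing the rowCountPercentage check."""
    return distinct_count > cardinality_threshold


# A column with 5 million distinct values in a 10 million row load,
# checked against an assumed threshold of 1 million.
distinct, rows, threshold = 5_000_000, 10_000_000, 1_000_000

old_decision = is_no_dictionary_with_row_check(distinct, rows, threshold)
new_decision = is_no_dictionary_threshold_only(distinct, threshold)
print(old_decision, new_decision)
```

In this example the column exceeds the threshold but covers only 50% of the rows, so the combined check keeps it as a dictionary column, while the threshold-only check converts it to no dictionary.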



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)