[GitHub] carbondata pull request #3059: [HOTFIX][DataLoad]fix task assignment issue u...

qiuchenjian-2
Github user kevinjmh commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/3059#discussion_r246376652
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java ---
    @@ -609,6 +609,14 @@ public static Dictionary getDictionary(AbsoluteTableIdentifier absoluteTableIden
               blockAssignmentStrategy = BlockAssignmentStrategy.BLOCK_SIZE_FIRST;
             } else {
               blockAssignmentStrategy = BlockAssignmentStrategy.BLOCK_NUM_FIRST;
    +          // fall back to BLOCK_NUM_FIRST strategy need to reset
    +          // the average expected size for each node
    +          if (blockInfos.size() > 0) {
    --- End diff --
   
    It could already be set to some value if NODE_MIN_SIZE_FIRST is used but the strategy falls back to BLOCK_NUM_FIRST.


---

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user KanakaKumar commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/3059#discussion_r246826911
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java ---
    @@ -609,6 +609,14 @@ public static Dictionary getDictionary(AbsoluteTableIdentifier absoluteTableIden
               blockAssignmentStrategy = BlockAssignmentStrategy.BLOCK_SIZE_FIRST;
             } else {
               blockAssignmentStrategy = BlockAssignmentStrategy.BLOCK_NUM_FIRST;
    +          // fall back to BLOCK_NUM_FIRST strategy need to reset
    +          // the average expected size for each node
    +          if (blockInfos.size() > 0) {
    +            sizePerNode = blockInfos.size() / numOfNodes;
    --- End diff --
   
    This logic can be simplified as follows:
    sizePerNode = blockInfos.size() / numOfNodes;
    sizePerNode = sizePerNode == 0 ? 1 : sizePerNode;
   
    Also, please guard against division by zero (an ArithmeticException in Java) when numOfNodes is zero, as can happen in a faulty cluster.
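
    A minimal, self-contained sketch of the two suggestions combined is shown below. The class and method names (SizePerNodeSketch, expectedBlocksPerNode) are hypothetical and not part of CarbonLoaderUtil, and returning 1 when there are no usable nodes is an assumption made for illustration; the actual patch may prefer to throw or log instead.

```java
final class SizePerNodeSketch {

  /**
   * Hypothetical helper: average number of blocks expected per node
   * under the BLOCK_NUM_FIRST fallback, never less than 1.
   */
  static long expectedBlocksPerNode(int blockCount, int numOfNodes) {
    if (numOfNodes <= 0) {
      // Assumption: in a faulty cluster with no usable nodes, return the
      // minimum of 1 rather than dividing by zero.
      return 1L;
    }
    // Integer division truncates, so fewer blocks than nodes yields 0.
    long sizePerNode = blockCount / numOfNodes;
    // Clamp to 1 so each node is still expected to take at least one block.
    return sizePerNode == 0 ? 1L : sizePerNode;
  }

  public static void main(String[] args) {
    System.out.println(expectedBlocksPerNode(10, 3)); // 3
    System.out.println(expectedBlocksPerNode(2, 5));  // 1
    System.out.println(expectedBlocksPerNode(10, 0)); // 1
  }
}
```

    Because an empty block list divided by a positive node count also gives 0 and is clamped to 1, the separate blockInfos.size() > 0 check is not needed in this form.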


---