VenuReddy2103 opened a new pull request #3723: URL: https://github.com/apache/carbondata/pull/3723

### Why is this PR needed?
BlockletIndexInputFormat object instantiation failed due to a mismatch between the parameters passed to the reflection-based constructor invocation and the actual parameters of the BlockletIndexInputFormat constructor.

### What changes were proposed in this PR?
1. Pass the correct parameters when instantiating BlockletIndexInputFormat through reflection.
2. Perform segment min-max based pruning when CARBON_LOAD_INDEXES_PARALLEL is enabled.

### Does this PR introduce any user interface change?
- No

### Is any new testcase added?
- No

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email]
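The class of failure described above can be illustrated with a minimal, hypothetical sketch (the `IndexFormat` class below is a stand-in for `BlockletIndexInputFormat`, not CarbonData code): reflective instantiation only succeeds when the parameter types handed to `getDeclaredConstructor` exactly match a declared constructor.

```java
import java.lang.reflect.Constructor;

public class ReflectionMismatchDemo {
    // Hypothetical stand-in for BlockletIndexInputFormat; its only
    // constructor takes (String, int).
    static class IndexFormat {
        final String tableName;
        final int segmentCount;
        IndexFormat(String tableName, int segmentCount) {
            this.tableName = tableName;
            this.segmentCount = segmentCount;
        }
    }

    public static void main(String[] args) throws Exception {
        // getDeclaredConstructor must be given the exact declared parameter
        // types; a mismatched list throws NoSuchMethodException at runtime --
        // the kind of instantiation failure this PR fixes.
        Constructor<IndexFormat> ctor =
            IndexFormat.class.getDeclaredConstructor(String.class, int.class);
        IndexFormat format = ctor.newInstance("parallel_index_load", 3);
        System.out.println(format.tableName + ":" + format.segmentCount);

        // A wrong parameter list fails fast:
        try {
            IndexFormat.class.getDeclaredConstructor(String.class);
        } catch (NoSuchMethodException e) {
            System.out.println("constructor lookup failed");
        }
    }
}
```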
CarbonDataQA1 commented on issue #3723: URL: https://github.com/apache/carbondata/pull/3723#issuecomment-617658724 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1100/
CarbonDataQA1 commented on issue #3723: URL: https://github.com/apache/carbondata/pull/3723#issuecomment-617669025 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2813/
Indhumathi27 commented on a change in pull request #3723: URL: https://github.com/apache/carbondata/pull/3723#discussion_r412931653

########## File path: core/src/main/java/org/apache/carbondata/core/index/IndexUtil.java ##########
@@ -198,25 +198,17 @@ public static void loadIndexes(CarbonTable carbonTable, IndexExprWrapper indexEx
     String clsName = "org.apache.spark.sql.secondaryindex.Jobs.SparkBlockletIndexLoaderJob";
     IndexJob indexJob = (IndexJob) createIndexJob(clsName);
     String className = "org.apache.spark.sql.secondaryindex.Jobs.BlockletIndexInputFormat";
-    SegmentStatusManager.ValidAndInvalidSegmentsInfo validAndInvalidSegmentsInfo =
-        getValidAndInvalidSegments(carbonTable, FileFactory.getConfiguration());
-    List<Segment> invalidSegments = validAndInvalidSegmentsInfo.getInvalidSegments();
     FileInputFormat indexFormat =
-        createIndexJob(carbonTable, indexExprWrapper, validSegments, invalidSegments,
-            partitionsToPrune, className, false);
+        createIndexJob(carbonTable, indexExprWrapper, validSegments, className);

Review comment: Please add a UT with Carbon load indexes parallel enabled, to avoid these issues in future
Indhumathi27 edited a comment on issue #3723: URL: https://github.com/apache/carbondata/pull/3723#issuecomment-617746551

@VenuReddy2103 Please also check and fix the retrieval of the load index property in `CarbonProperties.isIndexParallelLoadingEnabled` in this PR
akashrn5 commented on a change in pull request #3723: URL: https://github.com/apache/carbondata/pull/3723#discussion_r413625305

########## File path: integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/Jobs/BlockletIndexInputFormat.java ##########
@@ -123,8 +127,10 @@ public BlockletIndexInputFormat(CarbonTable table, IndexExprWrapper indexExprWra
   Cache<TableBlockIndexUniqueIdentifierWrapper, BlockletIndexWrapper> cache =
       CacheProvider.getInstance().createCache(CacheType.DRIVER_BLOCKLET_INDEX);
   private Iterator<TableBlockIndexUniqueIdentifier> iterator;
+  private Map<String, Map<String, BlockMetaInfo>> segInfoCache = new HashMap<>();

Review comment: please add a comment, stating why this map is required.
CarbonDataQA1 commented on pull request #3723: URL: https://github.com/apache/carbondata/pull/3723#issuecomment-618890989 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1117/
CarbonDataQA1 commented on pull request #3723: URL: https://github.com/apache/carbondata/pull/3723#issuecomment-618892748 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2830/
VenuReddy2103 commented on a change in pull request #3723: URL: https://github.com/apache/carbondata/pull/3723#discussion_r414650596

########## File path: core/src/main/java/org/apache/carbondata/core/index/IndexUtil.java ##########
@@ -198,25 +198,17 @@ public static void loadIndexes(CarbonTable carbonTable, IndexExprWrapper indexEx
     String clsName = "org.apache.spark.sql.secondaryindex.Jobs.SparkBlockletIndexLoaderJob";
     IndexJob indexJob = (IndexJob) createIndexJob(clsName);
     String className = "org.apache.spark.sql.secondaryindex.Jobs.BlockletIndexInputFormat";
-    SegmentStatusManager.ValidAndInvalidSegmentsInfo validAndInvalidSegmentsInfo =
-        getValidAndInvalidSegments(carbonTable, FileFactory.getConfiguration());
-    List<Segment> invalidSegments = validAndInvalidSegmentsInfo.getInvalidSegments();
     FileInputFormat indexFormat =
-        createIndexJob(carbonTable, indexExprWrapper, validSegments, invalidSegments,
-            partitionsToPrune, className, false);
+        createIndexJob(carbonTable, indexExprWrapper, validSegments, className);

Review comment: Added a testcase
VenuReddy2103 commented on a change in pull request #3723: URL: https://github.com/apache/carbondata/pull/3723#discussion_r414650854

########## File path: integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/Jobs/BlockletIndexInputFormat.java ##########
@@ -123,8 +127,10 @@ public BlockletIndexInputFormat(CarbonTable table, IndexExprWrapper indexExprWra
   Cache<TableBlockIndexUniqueIdentifierWrapper, BlockletIndexWrapper> cache =
       CacheProvider.getInstance().createCache(CacheType.DRIVER_BLOCKLET_INDEX);
   private Iterator<TableBlockIndexUniqueIdentifier> iterator;
+  private Map<String, Map<String, BlockMetaInfo>> segInfoCache = new HashMap<>();

Review comment: Added a comment
CarbonDataQA1 commented on pull request #3723: URL: https://github.com/apache/carbondata/pull/3723#issuecomment-619188674 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2832/
CarbonDataQA1 commented on pull request #3723: URL: https://github.com/apache/carbondata/pull/3723#issuecomment-619189998 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1120/
Indhumathi27 commented on a change in pull request #3723: URL: https://github.com/apache/carbondata/pull/3723#discussion_r415518496

########## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/TestQueryWithColumnMetCacheAndCacheLevelProperty.scala ##########
@@ -340,5 +340,15 @@ class TestQueryWithColumnMetCacheAndCacheLevelProperty extends QueryTest with Be
     sql("DROP table IF EXISTS carbonCahe")
   }
-
+  test("Test query with parallel index load") {
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_LOAD_INDEXES_PARALLEL, "true")
+    sql("drop table if exists parallel_index_load")
+    sql("CREATE table parallel_index_load (a STRING, b STRING, c INT) STORED AS carbondata")
+    sql("insert into parallel_index_load select 'aa', 'bb', 1")
+    sql("insert into parallel_index_load select 'cc', 'dd', 2")
+    sql("insert into parallel_index_load select 'ee', 'ff', 3")
+    sql("select a, b from parallel_index_load").show()

Review comment: can change it to `.collect()`
Indhumathi27 commented on a change in pull request #3723: URL: https://github.com/apache/carbondata/pull/3723#discussion_r415519312

########## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/TestQueryWithColumnMetCacheAndCacheLevelProperty.scala ##########
@@ -340,5 +340,15 @@ class TestQueryWithColumnMetCacheAndCacheLevelProperty extends QueryTest with Be
     sql("DROP table IF EXISTS carbonCahe")
   }
-
+  test("Test query with parallel index load") {
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_LOAD_INDEXES_PARALLEL, "true")
+    sql("drop table if exists parallel_index_load")

Review comment: Please also add the drop table to `afterAll`, and revert `CARBON_LOAD_INDEXES_PARALLEL` to its default at the end of the testcase
VenuReddy2103 commented on a change in pull request #3723: URL: https://github.com/apache/carbondata/pull/3723#discussion_r415575181

########## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/TestQueryWithColumnMetCacheAndCacheLevelProperty.scala ##########
@@ -340,5 +340,15 @@ class TestQueryWithColumnMetCacheAndCacheLevelProperty extends QueryTest with Be
     sql("DROP table IF EXISTS carbonCahe")
   }
-
+  test("Test query with parallel index load") {
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_LOAD_INDEXES_PARALLEL, "true")
+    sql("drop table if exists parallel_index_load")
+    sql("CREATE table parallel_index_load (a STRING, b STRING, c INT) STORED AS carbondata")
+    sql("insert into parallel_index_load select 'aa', 'bb', 1")
+    sql("insert into parallel_index_load select 'cc', 'dd', 2")
+    sql("insert into parallel_index_load select 'ee', 'ff', 3")
+    sql("select a, b from parallel_index_load").show()

Review comment: done
VenuReddy2103 commented on a change in pull request #3723: URL: https://github.com/apache/carbondata/pull/3723#discussion_r415579010

########## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/TestQueryWithColumnMetCacheAndCacheLevelProperty.scala ##########
@@ -340,5 +340,15 @@ class TestQueryWithColumnMetCacheAndCacheLevelProperty extends QueryTest with Be
     sql("DROP table IF EXISTS carbonCahe")
   }
-
+  test("Test query with parallel index load") {
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_LOAD_INDEXES_PARALLEL, "true")
+    sql("drop table if exists parallel_index_load")

Review comment: Done. Have added `afterEach` to ensure all the tables are dropped at the end of each testcase, and reverted `CARBON_LOAD_INDEXES_PARALLEL` to its default as well
Indhumathi27 commented on a change in pull request #3723: URL: https://github.com/apache/carbondata/pull/3723#discussion_r415633357

########## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/TestQueryWithColumnMetCacheAndCacheLevelProperty.scala ##########
@@ -189,7 +195,6 @@ class TestQueryWithColumnMetCacheAndCacheLevelProperty extends QueryTest with Be
   }
   test("test queries with column_meta_cache and cache_level='BLOCKLET'") {
-    dropSchema

Review comment: removing this might cause testcase failure. Please revert
Indhumathi27 commented on a change in pull request #3723: URL: https://github.com/apache/carbondata/pull/3723#discussion_r415633999

########## File path: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ##########
@@ -1561,6 +1561,8 @@ private CarbonCommonConstants() {
   @CarbonProperty(dynamicConfigurable = true)
   public static final String CARBON_LOAD_INDEXES_PARALLEL = "carbon.load.indexes.parallel.";
+  public static final String CARBON_LOAD_INDEXES_PARALLEL_DEFAULT = "false";

Review comment: please add comment
VenuReddy2103 commented on a change in pull request #3723: URL: https://github.com/apache/carbondata/pull/3723#discussion_r415650431

########## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/TestQueryWithColumnMetCacheAndCacheLevelProperty.scala ##########
@@ -189,7 +195,6 @@ class TestQueryWithColumnMetCacheAndCacheLevelProperty extends QueryTest with Be
   }
   test("test queries with column_meta_cache and cache_level='BLOCKLET'") {
-    dropSchema

Review comment: In practice, one testcase shouldn't depend on another testcase. `afterEach` ensures that the previous test's tables are removed. Have run all the tests in the file to ensure there are no failures.