[GitHub] [carbondata] jackylk opened a new pull request #3522: [WIP] Remove preaggregate and timeseries datamap

classic Classic list List threaded Threaded
39 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on issue #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap

GitBox
ajantha-bhat commented on issue #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap
URL: https://github.com/apache/carbondata/pull/3522#issuecomment-568336392
 
 
   By removing preaggregate we lose preaggregate + streaming feature. But we can implement it on mv (currently doesn't support)
   
   @akashrn5, @kumarvishal09  : please check the changes once. I will also review.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on issue #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap

GitBox
In reply to this post by GitBox
ajantha-bhat commented on issue #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap
URL: https://github.com/apache/carbondata/pull/3522#issuecomment-568336817
 
 
   @akashrn5 , @Indhumathi27 : Is all pre-aggreagte and timeseries testcase scenario we have converted to MV test cases?  If not instead of removing pre-aggregate test case, may be we can convert them to mv test case.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] Indhumathi27 commented on issue #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap

GitBox
In reply to this post by GitBox
Indhumathi27 commented on issue #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap
URL: https://github.com/apache/carbondata/pull/3522#issuecomment-568364284
 
 
   @ajantha-bhat I have checked MV test cases. It contains all preaggregate scenarios. So, I think Preaggregate test cases can be removed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap

GitBox
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap
URL: https://github.com/apache/carbondata/pull/3522#discussion_r360773039
 
 

 ##########
 File path: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/datamap/CarbonDropDataMapCommand.scala
 ##########
 @@ -33,7 +32,6 @@ import org.apache.carbondata.core.datamap.{DataMapProvider, DataMapStoreManager}
 import org.apache.carbondata.core.datamap.status.DataMapStatusManager
 import org.apache.carbondata.core.locks.{CarbonLockUtil, ICarbonLock, LockUsage}
 import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier
-import org.apache.carbondata.core.metadata.converter.ThriftWrapperSchemaConverterImpl
 
 Review comment:
   In `Schema.Thrift`
   `DataMapSchema` has `childTableIdentifier`, `childTableSchema`.
   
   I think we can check and remove those if it is not used for MV

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap

GitBox
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap
URL: https://github.com/apache/carbondata/pull/3522#discussion_r360773592
 
 

 ##########
 File path: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDropTableCommand.scala
 ##########
 @@ -69,7 +69,7 @@ case class CarbonDropTableCommand(
                 CarbonLockUtil.getLockObject(identifier, lock)
       }
       // check for directly drop datamap table
-      if (carbonTable.isChildTable && !dropChildTable) {
+      if (carbonTable.isChildTableForMV && !dropChildTable) {
 
 Review comment:
   This file has a big section for pre-aggregate, we can remove that code
   
   ` if (carbonTable.getTableInfo.getParentRelationIdentifiers.size() == 1) {`

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap

GitBox
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap
URL: https://github.com/apache/carbondata/pull/3522#discussion_r360773888
 
 

 ##########
 File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
 ##########
 @@ -116,54 +116,35 @@
     SegmentStatusManager segmentStatusManager = new SegmentStatusManager(identifier,
         readCommittedScope.getConfiguration());
     SegmentStatusManager.ValidAndInvalidSegmentsInfo segments = segmentStatusManager
-        .getValidAndInvalidSegments(carbonTable.isChildTable(), loadMetadataDetails,
+        .getValidAndInvalidSegments(carbonTable.isChildTableForMV(), loadMetadataDetails,
             this.readCommittedScope);
 
-    // to check whether only streaming segments access is enabled or not,
-    // if access streaming segment is true then data will be read from streaming segments
-    boolean accessStreamingSegments = getAccessStreamingSegments(job.getConfiguration());
-    if (getValidateSegmentsToAccess(job.getConfiguration())) {
-      if (!accessStreamingSegments) {
-        List<Segment> validSegments = segments.getValidSegments();
-        streamSegments = segments.getStreamSegments();
-        streamSegments = getFilteredSegment(job, streamSegments, readCommittedScope);
-        if (validSegments.size() == 0) {
-          return getSplitsOfStreaming(job, streamSegments, carbonTable);
-        }
-        List<Segment> filteredSegmentToAccess =
-            getFilteredSegment(job, segments.getValidSegments(), readCommittedScope);
-        if (filteredSegmentToAccess.size() == 0) {
-          return getSplitsOfStreaming(job, streamSegments, carbonTable);
-        } else {
-          setSegmentsToAccess(job.getConfiguration(), filteredSegmentToAccess);
-        }
-      } else {
-        List<Segment> filteredNormalSegments =
-            getFilteredNormalSegments(segments.getValidSegments(),
 
 Review comment:
   getFilteredNormalSegments definition also we can remove, it is unused now.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap

GitBox
In reply to this post by GitBox
ajantha-bhat commented on a change in pull request #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap
URL: https://github.com/apache/carbondata/pull/3522#discussion_r360774327
 
 

 ##########
 File path: README.md
 ##########
 @@ -58,8 +58,6 @@ CarbonData is built using Apache Maven, to [build CarbonData](https://github.com
 * [CarbonData DataMap Management](https://github.com/apache/carbondata/blob/master/docs/datamap/datamap-management.md)
  * [CarbonData BloomFilter DataMap](https://github.com/apache/carbondata/blob/master/docs/datamap/bloomfilter-datamap-guide.md)
  * [CarbonData Lucene DataMap](https://github.com/apache/carbondata/blob/master/docs/datamap/lucene-datamap-guide.md)
- * [CarbonData Pre-aggregate DataMap](https://github.com/apache/carbondata/blob/master/docs/datamap/preaggregate-datamap-guide.md)
- * [CarbonData Timeseries DataMap](https://github.com/apache/carbondata/blob/master/docs/datamap/timeseries-datamap-guide.md)
 
 Review comment:
   In `introduction.md` also need to remove.
   
   **please search for "preAgg", "pre Agg", "pre-Agg".
   I still see few more places need to removed and modified. Can handle comment section also**
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] jackylk commented on a change in pull request #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap

GitBox
In reply to this post by GitBox
jackylk commented on a change in pull request #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap
URL: https://github.com/apache/carbondata/pull/3522#discussion_r361050057
 
 

 ##########
 File path: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/datamap/CarbonDropDataMapCommand.scala
 ##########
 @@ -33,7 +32,6 @@ import org.apache.carbondata.core.datamap.{DataMapProvider, DataMapStoreManager}
 import org.apache.carbondata.core.datamap.status.DataMapStatusManager
 import org.apache.carbondata.core.locks.{CarbonLockUtil, ICarbonLock, LockUsage}
 import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier
-import org.apache.carbondata.core.metadata.converter.ThriftWrapperSchemaConverterImpl
 
 Review comment:
   This will break compatibility, better do not delete them

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] jackylk commented on a change in pull request #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap

GitBox
In reply to this post by GitBox
jackylk commented on a change in pull request #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap
URL: https://github.com/apache/carbondata/pull/3522#discussion_r361050141
 
 

 ##########
 File path: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDropTableCommand.scala
 ##########
 @@ -69,7 +69,7 @@ case class CarbonDropTableCommand(
                 CarbonLockUtil.getLockObject(identifier, lock)
       }
       // check for directly drop datamap table
-      if (carbonTable.isChildTable && !dropChildTable) {
+      if (carbonTable.isChildTableForMV && !dropChildTable) {
 
 Review comment:
   fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] jackylk commented on a change in pull request #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap

GitBox
In reply to this post by GitBox
jackylk commented on a change in pull request #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap
URL: https://github.com/apache/carbondata/pull/3522#discussion_r361050163
 
 

 ##########
 File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
 ##########
 @@ -116,54 +116,35 @@
     SegmentStatusManager segmentStatusManager = new SegmentStatusManager(identifier,
         readCommittedScope.getConfiguration());
     SegmentStatusManager.ValidAndInvalidSegmentsInfo segments = segmentStatusManager
-        .getValidAndInvalidSegments(carbonTable.isChildTable(), loadMetadataDetails,
+        .getValidAndInvalidSegments(carbonTable.isChildTableForMV(), loadMetadataDetails,
             this.readCommittedScope);
 
-    // to check whether only streaming segments access is enabled or not,
-    // if access streaming segment is true then data will be read from streaming segments
-    boolean accessStreamingSegments = getAccessStreamingSegments(job.getConfiguration());
-    if (getValidateSegmentsToAccess(job.getConfiguration())) {
-      if (!accessStreamingSegments) {
-        List<Segment> validSegments = segments.getValidSegments();
-        streamSegments = segments.getStreamSegments();
-        streamSegments = getFilteredSegment(job, streamSegments, readCommittedScope);
-        if (validSegments.size() == 0) {
-          return getSplitsOfStreaming(job, streamSegments, carbonTable);
-        }
-        List<Segment> filteredSegmentToAccess =
-            getFilteredSegment(job, segments.getValidSegments(), readCommittedScope);
-        if (filteredSegmentToAccess.size() == 0) {
-          return getSplitsOfStreaming(job, streamSegments, carbonTable);
-        } else {
-          setSegmentsToAccess(job.getConfiguration(), filteredSegmentToAccess);
-        }
-      } else {
-        List<Segment> filteredNormalSegments =
-            getFilteredNormalSegments(segments.getValidSegments(),
 
 Review comment:
   fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] jackylk commented on a change in pull request #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap

GitBox
In reply to this post by GitBox
jackylk commented on a change in pull request #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap
URL: https://github.com/apache/carbondata/pull/3522#discussion_r361052278
 
 

 ##########
 File path: README.md
 ##########
 @@ -58,8 +58,6 @@ CarbonData is built using Apache Maven, to [build CarbonData](https://github.com
 * [CarbonData DataMap Management](https://github.com/apache/carbondata/blob/master/docs/datamap/datamap-management.md)
  * [CarbonData BloomFilter DataMap](https://github.com/apache/carbondata/blob/master/docs/datamap/bloomfilter-datamap-guide.md)
  * [CarbonData Lucene DataMap](https://github.com/apache/carbondata/blob/master/docs/datamap/lucene-datamap-guide.md)
- * [CarbonData Pre-aggregate DataMap](https://github.com/apache/carbondata/blob/master/docs/datamap/preaggregate-datamap-guide.md)
- * [CarbonData Timeseries DataMap](https://github.com/apache/carbondata/blob/master/docs/datamap/timeseries-datamap-guide.md)
 
 Review comment:
   searched all place and fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap
URL: https://github.com/apache/carbondata/pull/3522#issuecomment-568642476
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1278/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap
URL: https://github.com/apache/carbondata/pull/3522#issuecomment-568642477
 
 
   Build Failed with Spark 2.2.1, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/1267/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap
URL: https://github.com/apache/carbondata/pull/3522#issuecomment-568642475
 
 
   Build Failed  with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/1257/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap
URL: https://github.com/apache/carbondata/pull/3522#issuecomment-568663426
 
 
   Build Success with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/1259/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap
URL: https://github.com/apache/carbondata/pull/3522#issuecomment-568674344
 
 
   Build Success with Spark 2.2.1, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/1269/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap
URL: https://github.com/apache/carbondata/pull/3522#issuecomment-568675312
 
 
   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1280/
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on issue #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap

GitBox
In reply to this post by GitBox
ajantha-bhat commented on issue #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap
URL: https://github.com/apache/carbondata/pull/3522#issuecomment-568681736
 
 
   LGTM

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] asfgit closed pull request #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap

GitBox
In reply to this post by GitBox
asfgit closed pull request #3522: [CARBONDATA-3609][CARBONDATA-3610] Remove preaggregate and timeseries datamap
URL: https://github.com/apache/carbondata/pull/3522
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
12