CarbonDataQA1 commented on issue #3614: [CARBONDATA-3693] Separate Index command from DataMap command
URL: https://github.com/apache/carbondata/pull/3614#issuecomment-591454698 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2200/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on issue #3614: [CARBONDATA-3693] Separate Index command from DataMap command
URL: https://github.com/apache/carbondata/pull/3614#issuecomment-591734910 retest this please ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3614: [CARBONDATA-3693] Separate Index command from DataMap command
URL: https://github.com/apache/carbondata/pull/3614#issuecomment-591739324 Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/504/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3614: [CARBONDATA-3693] Separate Index command from DataMap command
URL: https://github.com/apache/carbondata/pull/3614#issuecomment-591748167 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2203/ ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
akashrn5 commented on a change in pull request #3614: [CARBONDATA-3693] Separate Index command from DataMap command
URL: https://github.com/apache/carbondata/pull/3614#discussion_r384033343 ########## File path: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ########## @@ -460,10 +460,9 @@ private CarbonCommonConstants() { public static final String LOCAL_DICTIONARY_EXCLUDE = "local_dictionary_exclude"; /** - * DMPROPERTY for Index DataMap, like lucene, bloomfilter DataMap, - * to indicate a list of column name to be indexed + * Internal property to store for index column names */ - public static final String INDEX_COLUMNS = "INDEX_COLUMNS"; + public static final String INDEX_COLUMNS = "_INDEX_COLUMNS"; Review comment: why this change is required? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
akashrn5 commented on a change in pull request #3614: [CARBONDATA-3693] Separate Index command from DataMap command
URL: https://github.com/apache/carbondata/pull/3614#discussion_r384281103 ########## File path: core/src/main/java/org/apache/carbondata/core/datamap/status/DiskBasedDataMapStatusProvider.java ########## @@ -31,11 +31,11 @@ import org.apache.carbondata.common.logging.LogServiceFactory; import org.apache.carbondata.core.constants.CarbonCommonConstants; -import org.apache.carbondata.core.datamap.dev.DataMapSyncStatus; import org.apache.carbondata.core.datastore.impl.FileFactory; import org.apache.carbondata.core.fileoperations.AtomicFileOperationFactory; import org.apache.carbondata.core.fileoperations.AtomicFileOperations; import org.apache.carbondata.core.fileoperations.FileWriteOperation; +import org.apache.carbondata.core.index.dev.DataMapSyncStatus; Review comment: can revert this change ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
akashrn5 commented on a change in pull request #3614: [CARBONDATA-3693] Separate Index command from DataMap command
URL: https://github.com/apache/carbondata/pull/3614#discussion_r384035745 ########## File path: core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java ########## @@ -168,6 +186,38 @@ public DataMapSchema getDataMapSchema(String dataMapName) return provider.retrieveSchema(dataMapName); } + public List<DataMapSchema> getAllMVSchemas() throws IOException { + return provider.retrieveAllSchemas().stream() + .filter(schema -> !schema.isIndexDataMap()) + .collect(Collectors.toList()); + } + + public DataMapSchema getIndexSchema(String indexName) + throws NoSuchIndexException, IOException { + try { + DataMapSchema schema = provider.retrieveSchema(indexName); + if (!schema.isIndexDataMap()) { + throw new NoSuchIndexException(indexName); + } + return schema; + } catch (NoSuchDataMapException e) { Review comment: please add `LOGGER.error()` here and if no schema file is present, it will throw IO exception, if while reading schema failed IOexception is thrown, always saying NoSuchIndexException is misleading i think ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
akashrn5 commented on a change in pull request #3614: [CARBONDATA-3693] Separate Index command from DataMap command
URL: https://github.com/apache/carbondata/pull/3614#discussion_r384281012 ########## File path: core/src/main/java/org/apache/carbondata/core/datamap/status/DataMapStatusManager.java ########## @@ -142,13 +142,13 @@ public static DataMapSchema getDataMapSchema(String dataMapName) } /** - * This method will remove all segments of dataMap table in case of Insert-Overwrite/Update/Delete + * This method will remove all segments of MV table in case of Insert-Overwrite/Update/Delete * operations on main table * * @param allDataMapSchemas of main carbon table * @throws IOException */ - public static void truncateDataMap(List<DataMapSchema> allDataMapSchemas) + public static void truncateMVTable(List<DataMapSchema> allDataMapSchemas) Review comment: can we change change to IndexSchema ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
akashrn5 commented on a change in pull request #3614: [CARBONDATA-3693] Separate Index command from DataMap command
URL: https://github.com/apache/carbondata/pull/3614#discussion_r384282806 ########## File path: core/src/main/java/org/apache/carbondata/core/index/IndexLevel.java ########## @@ -15,17 +15,17 @@ * limitations under the License. */ -package org.apache.carbondata.core.datamap; +package org.apache.carbondata.core.index; import org.apache.carbondata.common.annotations.InterfaceAudience; import org.apache.carbondata.common.annotations.InterfaceStability; /** - * Index level of the datamap + * Index level Review comment: no need of this comment i think ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
akashrn5 commented on a change in pull request #3614: [CARBONDATA-3693] Separate Index command from DataMap command
URL: https://github.com/apache/carbondata/pull/3614#discussion_r384036122 ########## File path: core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java ########## @@ -168,6 +186,38 @@ public DataMapSchema getDataMapSchema(String dataMapName) return provider.retrieveSchema(dataMapName); } + public List<DataMapSchema> getAllMVSchemas() throws IOException { + return provider.retrieveAllSchemas().stream() + .filter(schema -> !schema.isIndexDataMap()) + .collect(Collectors.toList()); + } + + public DataMapSchema getIndexSchema(String indexName) + throws NoSuchIndexException, IOException { + try { + DataMapSchema schema = provider.retrieveSchema(indexName); + if (!schema.isIndexDataMap()) { + throw new NoSuchIndexException(indexName); + } + return schema; + } catch (NoSuchDataMapException e) { + throw new NoSuchIndexException(indexName); + } + } + + public DataMapSchema getMVSchema(String mvName) + throws NoSuchMaterializedViewException, IOException { + try { + DataMapSchema schema = provider.retrieveSchema(mvName); + if (schema == null || schema.isIndexDataMap()) { + throw new NoSuchMaterializedViewException(mvName); + } + return schema; + } catch (NoSuchDataMapException e) { + throw new NoSuchMaterializedViewException(mvName); Review comment: please add LOGGER.error() here and if no schema file is present, it will throw IO exception, if while reading schema failed IOexception is thrown, always saying NoSuchMaterializedViewException is misleading i think ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
akashrn5 commented on a change in pull request #3614: [CARBONDATA-3693] Separate Index command from DataMap command
URL: https://github.com/apache/carbondata/pull/3614#discussion_r384036324 ########## File path: core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java ########## @@ -168,6 +186,38 @@ public DataMapSchema getDataMapSchema(String dataMapName) return provider.retrieveSchema(dataMapName); } + public List<DataMapSchema> getAllMVSchemas() throws IOException { + return provider.retrieveAllSchemas().stream() + .filter(schema -> !schema.isIndexDataMap()) + .collect(Collectors.toList()); + } + + public DataMapSchema getIndexSchema(String indexName) + throws NoSuchIndexException, IOException { + try { + DataMapSchema schema = provider.retrieveSchema(indexName); + if (!schema.isIndexDataMap()) { + throw new NoSuchIndexException(indexName); + } + return schema; + } catch (NoSuchDataMapException e) { + throw new NoSuchIndexException(indexName); + } + } + + public DataMapSchema getMVSchema(String mvName) + throws NoSuchMaterializedViewException, IOException { + try { + DataMapSchema schema = provider.retrieveSchema(mvName); + if (schema == null || schema.isIndexDataMap()) { + throw new NoSuchMaterializedViewException(mvName); Review comment: please do Logger.error before throwing exception ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
akashrn5 commented on a change in pull request #3614: [CARBONDATA-3693] Separate Index command from DataMap command
URL: https://github.com/apache/carbondata/pull/3614#discussion_r384033877 ########## File path: core/src/main/java/org/apache/carbondata/core/datamap/DataMapProvider.java ########## @@ -48,25 +49,26 @@ import org.apache.log4j.Logger; /** - * DataMap is a accelerator for certain type of query. Developer can add new DataMap + * Index is a accelerator for certain type of query. Developer can add new Index * implementation to improve query performance. * - * Currently two types of DataMap are supported + * Currently two types of Index are supported * <ol> - * <li> MVDataMap: materialized view type of DataMap to accelerate olap style query, + * <li> MVDataMap: materialized view type of Index to accelerate olap style query, * like SPJG query (select, predicate, join, groupby) </li> - * <li> DataMap: index type of DataMap to accelerate filter query </li> + * <li> Index: index type of Index to accelerate filter query </li> * </ol> * * <p> * In following command <br> - * {@code CREATE DATAMAP dm ON TABLE main USING 'provider'}, <br> - * the <b>provider</b> string can be a short name or class name of the DataMap implementation. + * {@code CREATE INDEX index ON main AS 'provider'}, <br> Review comment: `on table ` should not be removed here, its required ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
akashrn5 commented on a change in pull request #3614: [CARBONDATA-3693] Separate Index command from DataMap command
URL: https://github.com/apache/carbondata/pull/3614#discussion_r384281648 ########## File path: core/src/main/java/org/apache/carbondata/core/index/DistributableIndexFormat.java ########## @@ -29,16 +29,17 @@ import org.apache.carbondata.common.logging.LogServiceFactory; import org.apache.carbondata.core.constants.CarbonCommonConstants; -import org.apache.carbondata.core.datamap.dev.expr.DataMapDistributableWrapper; +import org.apache.carbondata.core.datamap.DataMapStoreManager; +import org.apache.carbondata.core.datamap.Segment; import org.apache.carbondata.core.datastore.impl.FileFactory; +import org.apache.carbondata.core.index.dev.expr.IndexDistributableWrapper; import org.apache.carbondata.core.indexstore.ExtendedBlocklet; import org.apache.carbondata.core.indexstore.PartitionSpec; import org.apache.carbondata.core.metadata.schema.table.CarbonTable; import org.apache.carbondata.core.readcommitter.LatestFilesReadCommittedScope; import org.apache.carbondata.core.readcommitter.ReadCommittedScope; import org.apache.carbondata.core.readcommitter.TableStatusReadCommittedScope; import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf; - Review comment: revert ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
akashrn5 commented on a change in pull request #3614: [CARBONDATA-3693] Separate Index command from DataMap command
URL: https://github.com/apache/carbondata/pull/3614#discussion_r384286761 ########## File path: core/src/main/java/org/apache/carbondata/core/index/dev/BlockletSerializer.java ########## @@ -15,19 +15,19 @@ * limitations under the License. */ -package org.apache.carbondata.core.datamap.dev; +package org.apache.carbondata.core.index.dev; import java.io.DataInputStream; import java.io.DataOutputStream; import java.io.IOException; import org.apache.carbondata.common.annotations.InterfaceAudience; -import org.apache.carbondata.core.datamap.dev.fgdatamap.FineGrainBlocklet; import org.apache.carbondata.core.datastore.impl.FileFactory; +import org.apache.carbondata.core.index.dev.fgindex.FineGrainBlocklet; /** * A serializer/deserializer for {@link FineGrainBlocklet}, it is used after prune the data - * by {@link org.apache.carbondata.core.datamap.dev.fgdatamap.FineGrainDataMap} + * by {@link FineGrainIndex} Review comment: @link is not working for it, please correct it ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
akashrn5 commented on a change in pull request #3614: [CARBONDATA-3693] Separate Index command from DataMap command
URL: https://github.com/apache/carbondata/pull/3614#discussion_r384034368 ########## File path: core/src/main/java/org/apache/carbondata/core/datamap/DataMapProvider.java ########## @@ -178,7 +180,7 @@ public boolean rebuild() throws IOException, NoSuchDataMapException { == SegmentStatus.INSERT_OVERWRITE_IN_PROGRESS) && SegmentStatusManager .isLoadInProgress(dataMapTableAbsoluteTableIdentifier, loadMetaDetail.getLoadName())) { - throw new RuntimeException("Rebuild to datamap " + dataMapSchema.getDataMapName() + throw new RuntimeException("Rebuild to " + dataMapSchema.getDataMapName() Review comment: `Rebuild to` can be renamed to `Refresh` ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
akashrn5 commented on a change in pull request #3614: [CARBONDATA-3693] Separate Index command from DataMap command
URL: https://github.com/apache/carbondata/pull/3614#discussion_r384280832 ########## File path: core/src/main/java/org/apache/carbondata/core/datamap/status/DataMapStatus.java ########## @@ -18,7 +18,7 @@ package org.apache.carbondata.core.datamap.status; /** - * DataMap status + * Index status */ public enum DataMapStatus { Review comment: can make this IndexStatus ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
akashrn5 commented on a change in pull request #3614: [CARBONDATA-3693] Separate Index command from DataMap command
URL: https://github.com/apache/carbondata/pull/3614#discussion_r384283718 ########## File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java ########## @@ -338,7 +339,7 @@ public Void call() throws IOException { try { executorService.awaitTermination(2, TimeUnit.HOURS); } catch (InterruptedException e) { - LOG.error("Error in pruning datamap in multi-thread: " + e.getMessage()); + LOG.error("Error in pruning in multi-thread: " + e.getMessage()); Review comment: change message , include pruning index ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on a change in pull request #3614: [CARBONDATA-3693] Separate Index command from DataMap command
URL: https://github.com/apache/carbondata/pull/3614#discussion_r385609150 ########## File path: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ########## @@ -460,10 +460,9 @@ private CarbonCommonConstants() { public static final String LOCAL_DICTIONARY_EXCLUDE = "local_dictionary_exclude"; /** - * DMPROPERTY for Index DataMap, like lucene, bloomfilter DataMap, - * to indicate a list of column name to be indexed + * Internal property to store for index column names */ - public static final String INDEX_COLUMNS = "INDEX_COLUMNS"; + public static final String INDEX_COLUMNS = "_INDEX_COLUMNS"; Review comment: This is a temporary change, index metadata should also unified using main table's table property. This will be done in another PR ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on a change in pull request #3614: [CARBONDATA-3693] Separate Index command from DataMap command
URL: https://github.com/apache/carbondata/pull/3614#discussion_r385609753 ########## File path: core/src/main/java/org/apache/carbondata/core/datamap/DataMapProvider.java ########## @@ -48,25 +49,26 @@ import org.apache.log4j.Logger; /** - * DataMap is a accelerator for certain type of query. Developer can add new DataMap + * Index is a accelerator for certain type of query. Developer can add new Index * implementation to improve query performance. * - * Currently two types of DataMap are supported + * Currently two types of Index are supported * <ol> - * <li> MVDataMap: materialized view type of DataMap to accelerate olap style query, + * <li> MVDataMap: materialized view type of Index to accelerate olap style query, * like SPJG query (select, predicate, join, groupby) </li> - * <li> DataMap: index type of DataMap to accelerate filter query </li> + * <li> Index: index type of Index to accelerate filter query </li> * </ol> * * <p> * In following command <br> - * {@code CREATE DATAMAP dm ON TABLE main USING 'provider'}, <br> - * the <b>provider</b> string can be a short name or class name of the DataMap implementation. + * {@code CREATE INDEX index ON main AS 'provider'}, <br> Review comment: I have make ON TABLE optional so user can give `CREATE INDEX index ON TABLE main` or `CREATE INDEX index ON main` ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
In reply to this post by GitBox
jackylk commented on a change in pull request #3614: [CARBONDATA-3693] Separate Index command from DataMap command
URL: https://github.com/apache/carbondata/pull/3614#discussion_r385609753 ########## File path: core/src/main/java/org/apache/carbondata/core/datamap/DataMapProvider.java ########## @@ -48,25 +49,26 @@ import org.apache.log4j.Logger; /** - * DataMap is a accelerator for certain type of query. Developer can add new DataMap + * Index is a accelerator for certain type of query. Developer can add new Index * implementation to improve query performance. * - * Currently two types of DataMap are supported + * Currently two types of Index are supported * <ol> - * <li> MVDataMap: materialized view type of DataMap to accelerate olap style query, + * <li> MVDataMap: materialized view type of Index to accelerate olap style query, * like SPJG query (select, predicate, join, groupby) </li> - * <li> DataMap: index type of DataMap to accelerate filter query </li> + * <li> Index: index type of Index to accelerate filter query </li> * </ol> * * <p> * In following command <br> - * {@code CREATE DATAMAP dm ON TABLE main USING 'provider'}, <br> - * the <b>provider</b> string can be a short name or class name of the DataMap implementation. + * {@code CREATE INDEX index ON main AS 'provider'}, <br> Review comment: I have make the syntax to `CREATE INDEX index_name ON [TABLE] main_table` so user can give `CREATE INDEX index ON TABLE main` or `CREATE INDEX index ON main` ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [hidden email] With regards, Apache Git Services |
Free forum by Nabble | Edit this page |