[GitHub] carbondata pull request #2113: [WIP][LUCENE_DATAMAP]load issue in lucene dat...

classic Classic list List threaded Threaded
124 messages Options
1234567
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2113#discussion_r181661716
 
    --- Diff: datamap/lucene/pom.xml ---
    @@ -141,6 +141,34 @@
               </execution>
             </executions>
           </plugin>
    +      <plugin>
    --- End diff --
   
    Can you move the testcase to spark2 module and add `<scope>test</scope>` for spark2 in the pom


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2113#discussion_r181662181
 
    --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapFactory.java ---
    @@ -62,7 +62,17 @@
       public List<FineGrainDataMap> getDataMaps(DataMapDistributable distributable,
           ReadCommittedScope readCommittedScope)
           throws IOException {
    -    return getDataMaps(distributable.getSegment(), readCommittedScope);
    +    List<FineGrainDataMap> lstDataMap = new ArrayList<>();
    +    FineGrainDataMap dataMap = new LuceneFineGrainDataMap(analyzer);
    +    String indexPath = ((LuceneDataMapDistributable) distributable).getIndexPath();
    +    try {
    +      dataMap.init(new DataMapModel(indexPath));
    +    } catch (MemoryException e) {
    +      LOGGER.error("failed to get lucene datamap , detail is {}" + e.getMessage());
    --- End diff --
   
    you can use String.format() to format the log message


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2113#discussion_r181662903
 
    --- Diff: datamap/lucene/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala ---
    @@ -57,7 +65,7 @@ class LuceneFineGrainDataMapSuite extends QueryTest with BeforeAndAfterAll {
         var exception = intercept[MalformedDataMapCommandException](sql(
           s"""
              | CREATE DATAMAP dm1 ON TABLE datamap_test
    -         | USING 'org.apache.carbondata.datamap.lucene.LuceneFineGrainDataMapFactory'
    +         | USING 'lucenefg'
    --- End diff --
   
    please use `lucene` only


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2113#discussion_r181663104
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datamap/CGDataMapTestCase.scala ---
    @@ -402,7 +402,7 @@ class CGDataMapTestCase extends QueryTest with BeforeAndAfterAll {
     
         sql(s"create datamap test_cg_datamap1 on table datamap_store_test1 using '${classOf[CGDataMapFactory].getName}' as select  id, name from datamap_store_test")
     
    -    val loc = CarbonProperties.getInstance().getSystemFolderLocation + "/test_cg_datamap1.dmschema"
    +    val loc = CarbonProperties.getInstance().getSystemFolderLocation + "/datamap_store_test1_test_cg_datamap1.dmschema"
    --- End diff --
   
    Please make a utility function to construct the dmschema path


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2113#discussion_r181663757
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/datamap/CarbonDataMapShowCommand.scala ---
    @@ -40,35 +42,48 @@ case class CarbonDataMapShowCommand(tableIdentifier: Option[TableIdentifier])
       override def output: Seq[Attribute] = {
         Seq(AttributeReference("DataMapName", StringType, nullable = false)(),
           AttributeReference("ClassName", StringType, nullable = false)(),
    -      AttributeReference("Associated Table", StringType, nullable = false)())
    +      AttributeReference("Associated Table", StringType, nullable = false)(),
    +      AttributeReference("DMProperties", StringType, nullable = false)())
       }
     
       override def processData(sparkSession: SparkSession): Seq[Row] = {
    +    val finalSchemaList: util.List[DataMapSchema] = new util.ArrayList[DataMapSchema]()
    --- End diff --
   
    rename to datamapSchemaList


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2113#discussion_r181665755
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/datamap/CarbonDataMapShowCommand.scala ---
    @@ -40,35 +42,48 @@ case class CarbonDataMapShowCommand(tableIdentifier: Option[TableIdentifier])
       override def output: Seq[Attribute] = {
         Seq(AttributeReference("DataMapName", StringType, nullable = false)(),
           AttributeReference("ClassName", StringType, nullable = false)(),
    -      AttributeReference("Associated Table", StringType, nullable = false)())
    +      AttributeReference("Associated Table", StringType, nullable = false)(),
    +      AttributeReference("DMProperties", StringType, nullable = false)())
       }
     
       override def processData(sparkSession: SparkSession): Seq[Row] = {
    +    val finalSchemaList: util.List[DataMapSchema] = new util.ArrayList[DataMapSchema]()
         tableIdentifier match {
           case Some(table) =>
             Checker.validateTableExists(table.database, table.table, sparkSession)
             val carbonTable = CarbonEnv.getCarbonTable(table)(sparkSession)
             if (carbonTable.hasDataMapSchema) {
    -          val schemaList = carbonTable.getTableInfo.getDataMapSchemaList
    -          convertToRow(schemaList)
    -        } else {
    -          convertToRow(DataMapStoreManager.getInstance().getAllDataMapSchemas(carbonTable))
    +          finalSchemaList.addAll(carbonTable.getTableInfo.getDataMapSchemaList)
    +        }
    +        val indexSchemas = DataMapStoreManager.getInstance().getAllDataMapSchemas(carbonTable)
    +        if (!indexSchemas.isEmpty) {
    +          finalSchemaList.addAll(indexSchemas)
             }
    +        convertToRow(finalSchemaList)
           case _ =>
             convertToRow(DataMapStoreManager.getInstance().getAllDataMapSchemas)
         }
    -
       }
     
       private def convertToRow(schemaList: util.List[DataMapSchema]) = {
         if (schemaList != null && schemaList.size() > 0) {
           schemaList.asScala.map { s =>
             var table = "(NA)"
             val relationIdentifier = s.getRelationIdentifier
    -        if (relationIdentifier != null) {
    +        var dmProperties = "(NA)"
    +        val isFGorCGdm =
    --- End diff --
   
    For show datamap command, I think just show the shortname and the main table
    And we should have desc datamap command, which I think we should add in other PR


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2113#discussion_r181665850
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForDeleteCommand.scala ---
    @@ -17,15 +17,18 @@
     
     package org.apache.spark.sql.execution.command.mutation
     
    +import scala.collection.JavaConverters._
    --- End diff --
   
    seems no need to modify this file


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2113#discussion_r181665909
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForUpdateCommand.scala ---
    @@ -17,6 +17,8 @@
     
     package org.apache.spark.sql.execution.command.mutation
     
    +import scala.collection.JavaConverters._
    --- End diff --
   
    seems no need to modify this file


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2113#discussion_r181666147
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/DDLStrategy.scala ---
    @@ -244,6 +244,13 @@ class DDLStrategy(sparkSession: SparkSession) extends SparkStrategy {
               throw new MalformedCarbonCommandException("Unsupported operation on unmanaged table")
             }
     
    +        // TODO remove this limitation after streaming table support 'Lucene' DataMap
    +        // if the table has 'Lucene' DataMap, it doesn't support streaming now
    +        if (carbonTable.hasLuceneDataMap) {
    --- End diff --
   
    no need to do this now, do it later


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue in lucen...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2113
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3827/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue in lucen...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2113
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5044/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue in lucen...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2113
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3836/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue in lucen...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2113
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5057/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue in lucen...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2113
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5056/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue in lucen...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user akashrn5 commented on the issue:

    https://github.com/apache/carbondata/pull/2113
 
    retest this please


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2113#discussion_r181713300
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/DiskBasedDMSchemaStorageProvider.java ---
    @@ -129,9 +130,11 @@ public DiskBasedDMSchemaStorageProvider(String storePath) {
         return dataMapSchemas;
       }
     
    -  @Override public void dropSchema(String dataMapName) throws IOException {
    -    String schemaPath =
    -        storePath + CarbonCommonConstants.FILE_SEPARATOR + dataMapName + ".dmschema";
    +  @Override public void dropSchema(String dataMapName, String tableName, String dataMapProviderName)
    +      throws IOException {
    +    String schemaPath = storePath + CarbonCommonConstants.FILE_SEPARATOR + tableName
    --- End diff --
   
    make an utility function and use it in all place including testcase


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2113#discussion_r181713494
 
    --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneDataMapFactoryBase.java ---
    @@ -205,9 +234,12 @@ public void clear(Segment segment) {
       /**
        * Clear all datamaps from memory
        */
    -  @Override
    -  public void clear() {
    -
    +  @Override public void clear() {
    --- End diff --
   
    move @override to previous line, please follow this in future


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata issue #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue in lucen...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2113
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3840/



---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2113#discussion_r181725781
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/IndexDataMapProvider.java ---
    @@ -99,25 +103,24 @@ private DataMapFactory createIndexDataMapFactory(DataMapSchema dataMapSchema)
         return dataMapFactory;
       }
     
    -  private DataMapFactory getDataMapFactoryByShortName(String providerName)
    +  public static DataMapFactory getDataMapFactoryByShortName(String providerName)
           throws MalformedDataMapCommandException {
    +    DataMapRegistry.registerDataMap(DataMapClassProvider.LUCENE.getClassName(),
    --- End diff --
   
    What is the need of specifing the LUCENE here, these classes are supposed to be independent on any specific datamap.


---
Reply | Threaded
Open this post in threaded view
|

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2113#discussion_r181727647
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/datamap/CarbonCreateDataMapCommand.scala ---
    @@ -69,8 +69,29 @@ case class CarbonCreateDataMapCommand(
         }
     
         dataMapSchema = new DataMapSchema(dataMapName, dmClassName)
    -    if (mainTable != null &&
    -        mainTable.isStreamingTable &&
    +    if (dataMapSchema.getProviderName.equalsIgnoreCase(DataMapClassProvider.LUCENE.toString)) {
    --- End diff --
   
    This all validation code should be inside the luceneDatamapFactory init method. Please don't add any lucene specific code outside of lucene classes


---
1234567