Apache CarbonData Dev Mailing List archive › Apache CarbonData JIRA issues

[GitHub] carbondata pull request #2113: [WIP][LUCENE_DATAMAP]load issue in lucene dat...

Classic

List

124 messages Options

Options

1234567

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2113#discussion_r181661716

--- Diff: datamap/lucene/pom.xml ---
@@ -141,6 +141,34 @@
</execution>
</executions>
</plugin>
+ <plugin>
--- End diff --

Can you move the testcase to spark2 module and add `<scope>test</scope>` for spark2 in the pom

---

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2113#discussion_r181662181

--- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapFactory.java ---
@@ -62,7 +62,17 @@
public List<FineGrainDataMap> getDataMaps(DataMapDistributable distributable,
ReadCommittedScope readCommittedScope)
throws IOException {
- return getDataMaps(distributable.getSegment(), readCommittedScope);
+ List<FineGrainDataMap> lstDataMap = new ArrayList<>();
+ FineGrainDataMap dataMap = new LuceneFineGrainDataMap(analyzer);
+ String indexPath = ((LuceneDataMapDistributable) distributable).getIndexPath();
+ try {
+ dataMap.init(new DataMapModel(indexPath));
+ } catch (MemoryException e) {
+ LOGGER.error("failed to get lucene datamap , detail is {}" + e.getMessage());
--- End diff --

you can use String.format() to format the log message

---

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2113#discussion_r181662903

--- Diff: datamap/lucene/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala ---
@@ -57,7 +65,7 @@ class LuceneFineGrainDataMapSuite extends QueryTest with BeforeAndAfterAll {
var exception = intercept[MalformedDataMapCommandException](sql(
s"""
| CREATE DATAMAP dm1 ON TABLE datamap_test
- | USING 'org.apache.carbondata.datamap.lucene.LuceneFineGrainDataMapFactory'
+ | USING 'lucenefg'
--- End diff --

please use `lucene` only

---

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2113#discussion_r181663104

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datamap/CGDataMapTestCase.scala ---
@@ -402,7 +402,7 @@ class CGDataMapTestCase extends QueryTest with BeforeAndAfterAll {

sql(s"create datamap test_cg_datamap1 on table datamap_store_test1 using '${classOf[CGDataMapFactory].getName}' as select id, name from datamap_store_test")

- val loc = CarbonProperties.getInstance().getSystemFolderLocation + "/test_cg_datamap1.dmschema"
+ val loc = CarbonProperties.getInstance().getSystemFolderLocation + "/datamap_store_test1_test_cg_datamap1.dmschema"
--- End diff --

Please make a utility function to construct the dmschema path

---

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2113#discussion_r181663757

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/datamap/CarbonDataMapShowCommand.scala ---
@@ -40,35 +42,48 @@ case class CarbonDataMapShowCommand(tableIdentifier: Option[TableIdentifier])
override def output: Seq[Attribute] = {
Seq(AttributeReference("DataMapName", StringType, nullable = false)(),
AttributeReference("ClassName", StringType, nullable = false)(),
- AttributeReference("Associated Table", StringType, nullable = false)())
+ AttributeReference("Associated Table", StringType, nullable = false)(),
+ AttributeReference("DMProperties", StringType, nullable = false)())
}

override def processData(sparkSession: SparkSession): Seq[Row] = {
+ val finalSchemaList: util.List[DataMapSchema] = new util.ArrayList[DataMapSchema]()
--- End diff --

rename to datamapSchemaList

---

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2113#discussion_r181665755

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/datamap/CarbonDataMapShowCommand.scala ---
@@ -40,35 +42,48 @@ case class CarbonDataMapShowCommand(tableIdentifier: Option[TableIdentifier])
override def output: Seq[Attribute] = {
Seq(AttributeReference("DataMapName", StringType, nullable = false)(),
AttributeReference("ClassName", StringType, nullable = false)(),
- AttributeReference("Associated Table", StringType, nullable = false)())
+ AttributeReference("Associated Table", StringType, nullable = false)(),
+ AttributeReference("DMProperties", StringType, nullable = false)())
}

override def processData(sparkSession: SparkSession): Seq[Row] = {
+ val finalSchemaList: util.List[DataMapSchema] = new util.ArrayList[DataMapSchema]()
tableIdentifier match {
case Some(table) =>
Checker.validateTableExists(table.database, table.table, sparkSession)
val carbonTable = CarbonEnv.getCarbonTable(table)(sparkSession)
if (carbonTable.hasDataMapSchema) {
- val schemaList = carbonTable.getTableInfo.getDataMapSchemaList
- convertToRow(schemaList)
- } else {
- convertToRow(DataMapStoreManager.getInstance().getAllDataMapSchemas(carbonTable))
+ finalSchemaList.addAll(carbonTable.getTableInfo.getDataMapSchemaList)
+ }
+ val indexSchemas = DataMapStoreManager.getInstance().getAllDataMapSchemas(carbonTable)
+ if (!indexSchemas.isEmpty) {
+ finalSchemaList.addAll(indexSchemas)
}
+ convertToRow(finalSchemaList)
case _ =>
convertToRow(DataMapStoreManager.getInstance().getAllDataMapSchemas)
}
-
}

private def convertToRow(schemaList: util.List[DataMapSchema]) = {
if (schemaList != null && schemaList.size() > 0) {
schemaList.asScala.map { s =>
var table = "(NA)"
val relationIdentifier = s.getRelationIdentifier
- if (relationIdentifier != null) {
+ var dmProperties = "(NA)"
+ val isFGorCGdm =
--- End diff --

For show datamap command, I think just show the shortname and the main table
And we should have desc datamap command, which I think we should add in other PR

---

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2113#discussion_r181665850

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForDeleteCommand.scala ---
@@ -17,15 +17,18 @@

package org.apache.spark.sql.execution.command.mutation

+import scala.collection.JavaConverters._
--- End diff --

seems no need to modify this file

---

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2113#discussion_r181665909

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForUpdateCommand.scala ---
@@ -17,6 +17,8 @@

package org.apache.spark.sql.execution.command.mutation

+import scala.collection.JavaConverters._
--- End diff --

seems no need to modify this file

---

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2113#discussion_r181666147

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/DDLStrategy.scala ---
@@ -244,6 +244,13 @@ class DDLStrategy(sparkSession: SparkSession) extends SparkStrategy {
throw new MalformedCarbonCommandException("Unsupported operation on unmanaged table")
}

+ // TODO remove this limitation after streaming table support 'Lucene' DataMap
+ // if the table has 'Lucene' DataMap, it doesn't support streaming now
+ if (carbonTable.hasLuceneDataMap) {
--- End diff --

no need to do this now, do it later

---

[GitHub] carbondata issue #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue in lucen...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2113

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3827/

---

[GitHub] carbondata issue #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue in lucen...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2113

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5044/

---

[GitHub] carbondata issue #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue in lucen...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2113

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3836/

---

[GitHub] carbondata issue #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue in lucen...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2113

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5057/

---

[GitHub] carbondata issue #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue in lucen...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2113

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5056/

---

[GitHub] carbondata issue #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue in lucen...

In reply to this post by qiuchenjian-2

Github user akashrn5 commented on the issue:

https://github.com/apache/carbondata/pull/2113

retest this please

---

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2113#discussion_r181713300

--- Diff: core/src/main/java/org/apache/carbondata/core/metadata/schema/table/DiskBasedDMSchemaStorageProvider.java ---
@@ -129,9 +130,11 @@ public DiskBasedDMSchemaStorageProvider(String storePath) {
return dataMapSchemas;
}

- @Override public void dropSchema(String dataMapName) throws IOException {
- String schemaPath =
- storePath + CarbonCommonConstants.FILE_SEPARATOR + dataMapName + ".dmschema";
+ @Override public void dropSchema(String dataMapName, String tableName, String dataMapProviderName)
+ throws IOException {
+ String schemaPath = storePath + CarbonCommonConstants.FILE_SEPARATOR + tableName
--- End diff --

make an utility function and use it in all place including testcase

---

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2113#discussion_r181713494

--- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneDataMapFactoryBase.java ---
@@ -205,9 +234,12 @@ public void clear(Segment segment) {
/**
* Clear all datamaps from memory
*/
- @Override
- public void clear() {
-
+ @Override public void clear() {
--- End diff --

move @override to previous line, please follow this in future

---

[GitHub] carbondata issue #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue in lucen...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2113

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3840/

---

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

In reply to this post by qiuchenjian-2

Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2113#discussion_r181725781

--- Diff: core/src/main/java/org/apache/carbondata/core/datamap/IndexDataMapProvider.java ---
@@ -99,25 +103,24 @@ private DataMapFactory createIndexDataMapFactory(DataMapSchema dataMapSchema)
return dataMapFactory;
}

- private DataMapFactory getDataMapFactoryByShortName(String providerName)
+ public static DataMapFactory getDataMapFactoryByShortName(String providerName)
throws MalformedDataMapCommandException {
+ DataMapRegistry.registerDataMap(DataMapClassProvider.LUCENE.getClassName(),
--- End diff --

What is the need of specifing the LUCENE here, these classes are supposed to be independent on any specific datamap.

---

[GitHub] carbondata pull request #2113: [CARBONDATA-2347][LUCENE_DATAMAP]load issue i...

In reply to this post by qiuchenjian-2

Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2113#discussion_r181727647

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/datamap/CarbonCreateDataMapCommand.scala ---
@@ -69,8 +69,29 @@ case class CarbonCreateDataMapCommand(
}

dataMapSchema = new DataMapSchema(dataMapName, dmClassName)
- if (mainTable != null &&
- mainTable.isStreamingTable &&
+ if (dataMapSchema.getProviderName.equalsIgnoreCase(DataMapClassProvider.LUCENE.toString)) {
--- End diff --

This all validation code should be inside the luceneDatamapFactory init method. Please don't add any lucene specific code outside of lucene classes

---

1234567