[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

qiuchenjian-2

[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1774

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2824/

---

qiuchenjian-2

[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1774#discussion_r161241835

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala ---
@@ -167,13 +167,20 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
val carbonSchema = schema.map { field =>
s"${ field.name } ${ convertToCarbonType(field.dataType) }"
}
+ val isStreaming = if (options.isStreaming) Some("true") else None
+
val property = Map(
"SORT_COLUMNS" -> options.sortColumns,
"DICTIONARY_INCLUDE" -> options.dictionaryInclude,
"DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
- "TABLE_BLOCKSIZE" -> options.tableBlockSize
- ).filter(_._2.isDefined).map(p => s"'${p._1}' = '${p._2.get}'").mkString(",")
+ "TABLE_BLOCKSIZE" -> options.tableBlockSize,
+ "STREAMING" -> isStreaming
+ )
+ .filter(_._2.isDefined).
--- End diff --

Don't move `.filter`; instead, move the trailing `.` to the start of the next line.

---

qiuchenjian-2

[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1774#discussion_r161242096

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala ---
@@ -167,13 +167,19 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
val carbonSchema = schema.map { field =>
s"${ field.name } ${ convertToCarbonType(field.dataType) }"
}
+ val isStreaming = if (options.isStreaming) Some("true") else None
+
val property = Map(
"SORT_COLUMNS" -> options.sortColumns,
"DICTIONARY_INCLUDE" -> options.dictionaryInclude,
"DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
- "TABLE_BLOCKSIZE" -> options.tableBlockSize
- ).filter(_._2.isDefined).map(p => s"'${p._1}' = '${p._2.get}'").mkString(",")
+ "TABLE_BLOCKSIZE" -> options.tableBlockSize,
+ "STREAMING" -> isStreaming
--- End diff --

It is OK to add `options.isStreaming` directly; it defaults to false if the user does not specify it.

---

qiuchenjian-2

[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

In reply to this post by qiuchenjian-2

Github user anubhav100 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1774#discussion_r161367337

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala ---
@@ -167,13 +167,20 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
val carbonSchema = schema.map { field =>
s"${ field.name } ${ convertToCarbonType(field.dataType) }"
}
+ val isStreaming = if (options.isStreaming) Some("true") else None
+
val property = Map(
"SORT_COLUMNS" -> options.sortColumns,
"DICTIONARY_INCLUDE" -> options.dictionaryInclude,
"DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
- "TABLE_BLOCKSIZE" -> options.tableBlockSize
- ).filter(_._2.isDefined).map(p => s"'${p._1}' = '${p._2.get}'").mkString(",")
+ "TABLE_BLOCKSIZE" -> options.tableBlockSize,
+ "STREAMING" -> isStreaming
+ )
+ .filter(_._2.isDefined).
--- End diff --

Using it directly will cause this test case to fail:

test("test datasource table with specified table path") {
  val path = "./source"
  df2.write
    .format("carbondata")
    .option("tableName", "carbon10")
    .option("tablePath", path)
    .mode(SaveMode.Overwrite)
    .save()
  assert(new File(path).exists())
  checkAnswer(
    sql("select count(*) from carbon10 where c3 > 500"), Row(500)
  )
  sql("drop table carbon10")
  assert(!new File(path).exists())
  assert(intercept[AnalysisException](
    sql("select count(*) from carbon10 where c3 > 500"))
    .message
    .contains("not found"))
}

This is because of a problem at the parsing level: table properties followed by a table path are not parsed correctly. A separate JIRA has already been created for it; here is the link:

https://issues.apache.org/jira/browse/CARBONDATA-2005

Are we going to support it?

That is why, when `isStreaming` is false, I do not include it in the property map; I make it `None`.
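
The approach described above can be sketched in isolation. This is a minimal illustration, not the actual `CarbonDataFrameWriter` code: `WriterOptions`, `TablePropertiesSketch` and `render` are hypothetical names, and only a subset of the properties from the diff is modelled. It shows how mapping `false` to `None` keeps the `STREAMING` key out of the generated property string.

```scala
// Hypothetical stand-in for the writer options; only the fields used in this sketch.
final case class WriterOptions(
    sortColumns: Option[String] = None,
    tableBlockSize: Option[String] = None,
    isStreaming: Boolean = false)

object TablePropertiesSketch {
  // Renders only the properties that were actually set.
  def render(options: WriterOptions): String = {
    // false becomes None, so the filter below drops the STREAMING key entirely.
    val isStreaming = if (options.isStreaming) Some("true") else None
    Map(
      "SORT_COLUMNS" -> options.sortColumns,
      "TABLE_BLOCKSIZE" -> options.tableBlockSize,
      "STREAMING" -> isStreaming
    )
      .filter(_._2.isDefined)
      .map(p => s"'${p._1}' = '${p._2.get}'")
      .mkString(",")
  }
}
```

With the default options the rendered string is empty, so under this sketch no table properties would be emitted for a write that only sets `tableName` and `tablePath`, as in the test above.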

---

qiuchenjian-2

[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

In reply to this post by qiuchenjian-2

Github user anubhav100 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1774#discussion_r161367429

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala ---
@@ -167,13 +167,19 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
val carbonSchema = schema.map { field =>
s"${ field.name } ${ convertToCarbonType(field.dataType) }"
}
+ val isStreaming = if (options.isStreaming) Some("true") else None
+
val property = Map(
"SORT_COLUMNS" -> options.sortColumns,
"DICTIONARY_INCLUDE" -> options.dictionaryInclude,
"DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
- "TABLE_BLOCKSIZE" -> options.tableBlockSize
- ).filter(_._2.isDefined).map(p => s"'${p._1}' = '${p._2.get}'").mkString(",")
+ "TABLE_BLOCKSIZE" -> options.tableBlockSize,
+ "STREAMING" -> isStreaming
--- End diff --

Using it directly will cause this test case to fail:

test("test datasource table with specified table path") {
  val path = "./source"
  df2.write
    .format("carbondata")
    .option("tableName", "carbon10")
    .option("tablePath", path)
    .mode(SaveMode.Overwrite)
    .save()
  assert(new File(path).exists())
  checkAnswer(
    sql("select count(*) from carbon10 where c3 > 500"), Row(500)
  )
  sql("drop table carbon10")
  assert(!new File(path).exists())
  assert(intercept[AnalysisException](
    sql("select count(*) from carbon10 where c3 > 500"))
    .message
    .contains("not found"))
}

This is because of a problem at the parsing level: table properties followed by a table path are not parsed correctly. A separate JIRA has already been created for it; here is the link:

https://issues.apache.org/jira/browse/CARBONDATA-2005

Are we going to support it?

That is why, when `isStreaming` is false, I do not include it in the property map; I make it `None`.

---

qiuchenjian-2

[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

In reply to this post by qiuchenjian-2

Github user anubhav100 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1774#discussion_r161367434

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala ---
@@ -167,13 +167,20 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
val carbonSchema = schema.map { field =>
s"${ field.name } ${ convertToCarbonType(field.dataType) }"
}
+ val isStreaming = if (options.isStreaming) Some("true") else None
+
val property = Map(
"SORT_COLUMNS" -> options.sortColumns,
"DICTIONARY_INCLUDE" -> options.dictionaryInclude,
"DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
- "TABLE_BLOCKSIZE" -> options.tableBlockSize
- ).filter(_._2.isDefined).map(p => s"'${p._1}' = '${p._2.get}'").mkString(",")
+ "TABLE_BLOCKSIZE" -> options.tableBlockSize,
+ "STREAMING" -> isStreaming
+ )
+ .filter(_._2.isDefined).
--- End diff --

done and pushed

---

qiuchenjian-2

[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1774

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2739/

---

qiuchenjian-2

[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1774

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1507/

---

qiuchenjian-2

[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

In reply to this post by qiuchenjian-2

Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1774

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2864/

---

qiuchenjian-2

[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

In reply to this post by qiuchenjian-2

Github user anubhav100 commented on the issue:

https://github.com/apache/carbondata/pull/1774

retest this please

---

qiuchenjian-2

[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

In reply to this post by qiuchenjian-2

Github user anubhav100 commented on the issue:

https://github.com/apache/carbondata/pull/1774

retest sdv please

---

qiuchenjian-2

[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

In reply to this post by qiuchenjian-2

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1774

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2758/

---

qiuchenjian-2

[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

In reply to this post by qiuchenjian-2

Github user anubhav100 commented on the issue:

https://github.com/apache/carbondata/pull/1774

retest this please

---

qiuchenjian-2

[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

In reply to this post by qiuchenjian-2

Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1774

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2866/

---

qiuchenjian-2

[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1774#discussion_r161438842

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala ---
@@ -167,13 +167,19 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
val carbonSchema = schema.map { field =>
s"${ field.name } ${ convertToCarbonType(field.dataType) }"
}
+ val isStreaming = if (options.isStreaming) Some("true") else None
+
val property = Map(
"SORT_COLUMNS" -> options.sortColumns,
"DICTIONARY_INCLUDE" -> options.dictionaryInclude,
"DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
- "TABLE_BLOCKSIZE" -> options.tableBlockSize
- ).filter(_._2.isDefined).map(p => s"'${p._1}' = '${p._2.get}'").mkString(",")
+ "TABLE_BLOCKSIZE" -> options.tableBlockSize,
+ "STREAMING" -> isStreaming
--- End diff --

Yes, "CREATE TABLE LOCATION" is an external table feature. It is supported in #1749, but currently that is planned to be merged into the carbonstore branch, not master.

---

qiuchenjian-2

[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

In reply to this post by qiuchenjian-2

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1774#discussion_r161439054

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala ---
@@ -167,13 +167,19 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
val carbonSchema = schema.map { field =>
s"${ field.name } ${ convertToCarbonType(field.dataType) }"
}
+ val isStreaming = if (options.isStreaming) Some("true") else None
+
val property = Map(
"SORT_COLUMNS" -> options.sortColumns,
"DICTIONARY_INCLUDE" -> options.dictionaryInclude,
"DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
- "TABLE_BLOCKSIZE" -> options.tableBlockSize
- ).filter(_._2.isDefined).map(p => s"'${p._1}' = '${p._2.get}'").mkString(",")
+ "TABLE_BLOCKSIZE" -> options.tableBlockSize,
+ "STREAMING" -> isStreaming
--- End diff --

I still suspect it should work. Did you try:
```
"STREAMING" -> options.isStreaming.toString
```
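
For comparison, a minimal sketch of what always emitting the flag would produce. It assumes the other map values are `Option[String]`, as the `.filter(_._2.isDefined)` call in the diff suggests, so the string is wrapped in `Some` here to keep the value type uniform. `AlwaysEmitStreamingSketch` and `render` are hypothetical names, not part of CarbonData.

```scala
object AlwaysEmitStreamingSketch {
  // isStreaming is a plain Boolean; Some(...) keeps every map value an Option[String].
  def render(tableBlockSize: Option[String], isStreaming: Boolean): String =
    Map(
      "TABLE_BLOCKSIZE" -> tableBlockSize,
      "STREAMING" -> Some(isStreaming.toString)
    )
      .filter(_._2.isDefined)
      .map { case (key, value) => s"'$key' = '${value.get}'" }
      .mkString(",")
}
```

In this variant the property string is never empty: even when the user sets nothing, it contains `'STREAMING' = 'false'`. Whether that interacts with a table path in the generated DDL is the question the thread is debating; the sketch only shows the string that would be built.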

---

qiuchenjian-2

[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

In reply to this post by qiuchenjian-2

Github user anubhav100 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1774#discussion_r161439320

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala ---
@@ -167,13 +167,19 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
val carbonSchema = schema.map { field =>
s"${ field.name } ${ convertToCarbonType(field.dataType) }"
}
+ val isStreaming = if (options.isStreaming) Some("true") else None
+
val property = Map(
"SORT_COLUMNS" -> options.sortColumns,
"DICTIONARY_INCLUDE" -> options.dictionaryInclude,
"DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
- "TABLE_BLOCKSIZE" -> options.tableBlockSize
- ).filter(_._2.isDefined).map(p => s"'${p._1}' = '${p._2.get}'").mkString(",")
+ "TABLE_BLOCKSIZE" -> options.tableBlockSize,
+ "STREAMING" -> isStreaming
--- End diff --

@jackylk Yes, I did. The test case fails with an exception similar to the one in JIRA https://issues.apache.org/jira/browse/CARBONDATA-2005

This test case is failing:

test("test datasource table with specified table path") {
  val path = "./source"
  df2.write
    .format("carbondata")
    .option("tableName", "carbon10")
    .option("tablePath", path)
    .mode(SaveMode.Overwrite)
    .save()
  assert(new File(path).exists())
  checkAnswer(
    sql("select count(*) from carbon10 where c3 > 500"), Row(500)
  )
  sql("drop table carbon10")
  assert(!new File(path).exists())
  assert(intercept[AnalysisException](
    sql("select count(*) from carbon10 where c3 > 500"))
    .message
    .contains("not found"))
}

---
