[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1774
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2824/



---

[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1774#discussion_r161241835
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala ---
    @@ -167,13 +167,20 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
         val carbonSchema = schema.map { field =>
           s"${ field.name } ${ convertToCarbonType(field.dataType) }"
         }
    +  val isStreaming = if (options.isStreaming) Some("true") else None
    +
         val property = Map(
           "SORT_COLUMNS" -> options.sortColumns,
           "DICTIONARY_INCLUDE" -> options.dictionaryInclude,
           "DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
    -      "TABLE_BLOCKSIZE" -> options.tableBlockSize
    -    ).filter(_._2.isDefined).map(p => s"'${p._1}' = '${p._2.get}'").mkString(",")
    +      "TABLE_BLOCKSIZE" -> options.tableBlockSize,
    +      "STREAMING" -> isStreaming
    +    )
    +      .filter(_._2.isDefined).
    --- End diff --
   
    Do not move `.filter`; instead, move the trailing `.` to the next line.


---

[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1774#discussion_r161242096
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala ---
    @@ -167,13 +167,19 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
         val carbonSchema = schema.map { field =>
           s"${ field.name } ${ convertToCarbonType(field.dataType) }"
         }
    +  val isStreaming = if (options.isStreaming) Some("true") else None
    +
         val property = Map(
           "SORT_COLUMNS" -> options.sortColumns,
           "DICTIONARY_INCLUDE" -> options.dictionaryInclude,
           "DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
    -      "TABLE_BLOCKSIZE" -> options.tableBlockSize
    -    ).filter(_._2.isDefined).map(p => s"'${p._1}' = '${p._2.get}'").mkString(",")
    +      "TABLE_BLOCKSIZE" -> options.tableBlockSize,
    +      "STREAMING" -> isStreaming
    --- End diff --
   
    It is OK to use `options.isStreaming` directly; it defaults to false if the user does not specify it.


---

[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user anubhav100 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1774#discussion_r161367337
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala ---
    @@ -167,13 +167,20 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
         val carbonSchema = schema.map { field =>
           s"${ field.name } ${ convertToCarbonType(field.dataType) }"
         }
    +  val isStreaming = if (options.isStreaming) Some("true") else None
    +
         val property = Map(
           "SORT_COLUMNS" -> options.sortColumns,
           "DICTIONARY_INCLUDE" -> options.dictionaryInclude,
           "DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
    -      "TABLE_BLOCKSIZE" -> options.tableBlockSize
    -    ).filter(_._2.isDefined).map(p => s"'${p._1}' = '${p._2.get}'").mkString(",")
    +      "TABLE_BLOCKSIZE" -> options.tableBlockSize,
    +      "STREAMING" -> isStreaming
    +    )
    +      .filter(_._2.isDefined).
    --- End diff --
   
    using it directly will cause this test case to fail
   
    test("test datasource table with specified table path") {
        val path = "./source"
        df2.write
          .format("carbondata")
          .option("tableName", "carbon10")
          .option("tablePath", path)
          .mode(SaveMode.Overwrite)
          .save()
        assert(new File(path).exists())
        checkAnswer(
          sql("select count(*) from carbon10 where c3 > 500"), Row(500)
        )
        sql("drop table carbon10")
        assert(!new File(path).exists())
        assert(intercept[AnalysisException](
          sql("select count(*) from carbon10 where c3 > 500"))
          .message
          .contains("not found"))
      }
   
    This is caused by a problem at the parsing level: table properties followed by a table path are not parsed correctly. A separate JIRA has already been created for it; here is the link:
   
    https://issues.apache.org/jira/browse/CARBONDATA-2005
   
    Are we going to support it?
   
    That is why, when isStreaming is false, I do not include it in the property map and set it to None instead.
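
The approach described in this comment, leaving STREAMING out of the generated table properties unless it is enabled, can be sketched in isolation. This is a minimal sketch and not the PR's code: `WriterOptions` and `TablePropertiesSketch` are hypothetical stand-ins for the real options class, and a `Seq` is used in place of the `Map` only to keep the output order deterministic.

```scala
// Hypothetical stand-in for the writer options; not the real CarbonData class.
final case class WriterOptions(
    sortColumns: Option[String] = None,
    tableBlockSize: Option[String] = None,
    isStreaming: Boolean = false)

object TablePropertiesSketch {
  // Renders the body of a TBLPROPERTIES clause, skipping unset properties.
  def render(options: WriterOptions): String = {
    // Emit STREAMING only when it was explicitly switched on.
    val streaming: Option[String] =
      if (options.isStreaming) Some("true") else None

    Seq(
      "SORT_COLUMNS" -> options.sortColumns,
      "TABLE_BLOCKSIZE" -> options.tableBlockSize,
      "STREAMING" -> streaming
    )
      // Keep only the pairs whose value is defined, then format them.
      .collect { case (key, Some(value)) => s"'$key' = '$value'" }
      .mkString(",")
  }
}
```

With default options the rendered string is empty, so no table properties are generated at all; that is the case the failing test (table path without explicit properties) appears to rely on.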


---

[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user anubhav100 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1774#discussion_r161367429
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala ---
    @@ -167,13 +167,19 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
         val carbonSchema = schema.map { field =>
           s"${ field.name } ${ convertToCarbonType(field.dataType) }"
         }
    +  val isStreaming = if (options.isStreaming) Some("true") else None
    +
         val property = Map(
           "SORT_COLUMNS" -> options.sortColumns,
           "DICTIONARY_INCLUDE" -> options.dictionaryInclude,
           "DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
    -      "TABLE_BLOCKSIZE" -> options.tableBlockSize
    -    ).filter(_._2.isDefined).map(p => s"'${p._1}' = '${p._2.get}'").mkString(",")
    +      "TABLE_BLOCKSIZE" -> options.tableBlockSize,
    +      "STREAMING" -> isStreaming
    --- End diff --
   
    using it directly will cause this test case to fail
   
    test("test datasource table with specified table path") {
      val path = "./source"
      df2.write
        .format("carbondata")
        .option("tableName", "carbon10")
        .option("tablePath", path)
        .mode(SaveMode.Overwrite)
        .save()
      assert(new File(path).exists())
      checkAnswer(
        sql("select count(*) from carbon10 where c3 > 500"), Row(500)
      )
      sql("drop table carbon10")
      assert(!new File(path).exists())
      assert(intercept[AnalysisException](
        sql("select count(*) from carbon10 where c3 > 500"))
        .message
        .contains("not found"))
    }
   
    This is caused by a problem at the parsing level: table properties followed by a table path are not parsed correctly. A separate JIRA has already been created for it; here is the link:
   
    https://issues.apache.org/jira/browse/CARBONDATA-2005
   
    Are we going to support it?
   
    That is why, when isStreaming is false, I do not include it in the property map and set it to None instead.


---

[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user anubhav100 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1774#discussion_r161367434
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala ---
    @@ -167,13 +167,20 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
         val carbonSchema = schema.map { field =>
           s"${ field.name } ${ convertToCarbonType(field.dataType) }"
         }
    +  val isStreaming = if (options.isStreaming) Some("true") else None
    +
         val property = Map(
           "SORT_COLUMNS" -> options.sortColumns,
           "DICTIONARY_INCLUDE" -> options.dictionaryInclude,
           "DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
    -      "TABLE_BLOCKSIZE" -> options.tableBlockSize
    -    ).filter(_._2.isDefined).map(p => s"'${p._1}' = '${p._2.get}'").mkString(",")
    +      "TABLE_BLOCKSIZE" -> options.tableBlockSize,
    +      "STREAMING" -> isStreaming
    +    )
    +      .filter(_._2.isDefined).
    --- End diff --
   
    done and pushed


---

[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1774
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2739/



---

[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1774
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1507/



---

[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1774
 
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2864/



---

[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user anubhav100 commented on the issue:

    https://github.com/apache/carbondata/pull/1774
 
    retest this please


---

[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user anubhav100 commented on the issue:

    https://github.com/apache/carbondata/pull/1774
 
    retest sdv please


---

[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1774
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2758/



---

[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user anubhav100 commented on the issue:

    https://github.com/apache/carbondata/pull/1774
 
    retest this please


---

[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1774
 
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2866/



---

[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1774
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1527/



---

[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1774
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2762/



---

[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1774
 
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1531/



---

[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1774#discussion_r161438842
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala ---
    @@ -167,13 +167,19 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
         val carbonSchema = schema.map { field =>
           s"${ field.name } ${ convertToCarbonType(field.dataType) }"
         }
    +  val isStreaming = if (options.isStreaming) Some("true") else None
    +
         val property = Map(
           "SORT_COLUMNS" -> options.sortColumns,
           "DICTIONARY_INCLUDE" -> options.dictionaryInclude,
           "DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
    -      "TABLE_BLOCKSIZE" -> options.tableBlockSize
    -    ).filter(_._2.isDefined).map(p => s"'${p._1}' = '${p._2.get}'").mkString(",")
    +      "TABLE_BLOCKSIZE" -> options.tableBlockSize,
    +      "STREAMING" -> isStreaming
    --- End diff --
   
    Yes, "CREATE TABLE LOCATION" is the external table feature. It is supported in #1749, but currently that is planned to be merged into the carbonstore branch, not master.


---

[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1774#discussion_r161439054
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala ---
    @@ -167,13 +167,19 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
         val carbonSchema = schema.map { field =>
           s"${ field.name } ${ convertToCarbonType(field.dataType) }"
         }
    +  val isStreaming = if (options.isStreaming) Some("true") else None
    +
         val property = Map(
           "SORT_COLUMNS" -> options.sortColumns,
           "DICTIONARY_INCLUDE" -> options.dictionaryInclude,
           "DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
    -      "TABLE_BLOCKSIZE" -> options.tableBlockSize
    -    ).filter(_._2.isDefined).map(p => s"'${p._1}' = '${p._2.get}'").mkString(",")
    +      "TABLE_BLOCKSIZE" -> options.tableBlockSize,
    +      "STREAMING" -> isStreaming
    --- End diff --
   
    I still suspect it should work. Did you try the following?
    ```
    "STREAMING" -> options.isStreaming.toString
    ```
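
For comparison, here is a rough sketch of what the suggested change might amount to. It assumes the other map values are `Option[String]`, as the `.filter(_._2.isDefined)` call in the diff implies; under that assumption a bare `String` value would likely not type-check, so the flag would have to be wrapped in `Some`. `StreamingPropertySketch` and its method names are hypothetical and exist only for illustration.

```scala
object StreamingPropertySketch {
  // Formats only the properties whose value is defined.
  private def render(props: Seq[(String, Option[String])]): String =
    props.collect { case (key, Some(value)) => s"'$key' = '$value'" }.mkString(",")

  // Variant suggested in the review: always pass the flag through.
  // The property is emitted even when streaming is off.
  def always(isStreaming: Boolean): String =
    render(Seq("STREAMING" -> Some(isStreaming.toString)))

  // Variant used in the PR: drop the property unless streaming is on.
  def onlyWhenTrue(isStreaming: Boolean): String =
    render(Seq("STREAMING" -> (if (isStreaming) Some("true") else None)))
}
```

The difference only shows when streaming is off: the first variant still produces a table property, while the second produces none, which matters if a properties clause combined with a table path is what trips the parser.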


---

[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user anubhav100 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1774#discussion_r161439320
 
    --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala ---
    @@ -167,13 +167,19 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
         val carbonSchema = schema.map { field =>
           s"${ field.name } ${ convertToCarbonType(field.dataType) }"
         }
    +  val isStreaming = if (options.isStreaming) Some("true") else None
    +
         val property = Map(
           "SORT_COLUMNS" -> options.sortColumns,
           "DICTIONARY_INCLUDE" -> options.dictionaryInclude,
           "DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
    -      "TABLE_BLOCKSIZE" -> options.tableBlockSize
    -    ).filter(_._2.isDefined).map(p => s"'${p._1}' = '${p._2.get}'").mkString(",")
    +      "TABLE_BLOCKSIZE" -> options.tableBlockSize,
    +      "STREAMING" -> isStreaming
    --- End diff --
   
    @jackylk Yes, I did. The test case fails with an exception similar to the one in JIRA https://issues.apache.org/jira/browse/CARBONDATA-2005
   
    This test case is failing:
   
    test("test datasource table with specified table path") {
      val path = "./source"
      df2.write
        .format("carbondata")
        .option("tableName", "carbon10")
        .option("tablePath", path)
        .mode(SaveMode.Overwrite)
        .save()
      assert(new File(path).exists())
      checkAnswer(
        sql("select count(*) from carbon10 where c3 > 500"), Row(500)
      )
      sql("drop table carbon10")
      assert(!new File(path).exists())
      assert(intercept[AnalysisException](
        sql("select count(*) from carbon10 where c3 > 500"))
        .message
        .contains("not found"))
    }


---