[GitHub] [carbondata] marchpure opened a new pull request #4021: [WIP] Support Complex DataType in DataFrame Save


[GitHub] [carbondata] marchpure opened a new pull request #4021: [WIP] Support Complex DataType in DataFrame Save

GitBox

marchpure opened a new pull request #4021:
URL: https://github.com/apache/carbondata/pull/4021


    ### Why is this PR needed?
   
   
    ### What changes were proposed in this PR?
   
       
    ### Does this PR introduce any user interface change?
    - No
    - Yes. (please explain the change and update document)
   
    ### Is any new testcase added?
    - No
    - Yes
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]



[GitHub] [carbondata] marchpure commented on pull request #4021: [WIP] Support Complex DataType in DataFrame Save

GitBox

marchpure commented on pull request #4021:
URL: https://github.com/apache/carbondata/pull/4021#issuecomment-732811550


   retest this please



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4021: [WIP] Support Complex DataType in DataFrame Save

GitBox

CarbonDataQA2 commented on pull request #4021:
URL: https://github.com/apache/carbondata/pull/4021#issuecomment-732917707


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4884/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4021: [WIP] Support Complex DataType in DataFrame Save

GitBox

CarbonDataQA2 commented on pull request #4021:
URL: https://github.com/apache/carbondata/pull/4021#issuecomment-732920382


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3130/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4021: [CARBONDATA-4057] Support Complex DataType when Save DataFrame with MODE.OVERWRITE

GitBox

CarbonDataQA2 commented on pull request #4021:
URL: https://github.com/apache/carbondata/pull/4021#issuecomment-733149233


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4891/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4021: [CARBONDATA-4057] Support Complex DataType when Save DataFrame with MODE.OVERWRITE

GitBox

CarbonDataQA2 commented on pull request #4021:
URL: https://github.com/apache/carbondata/pull/4021#issuecomment-733149879


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3137/
   



[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4021: [CARBONDATA-4057] Support Complex DataType when Save DataFrame with MODE.OVERWRITE

GitBox

ajantha-bhat commented on a change in pull request #4021:
URL: https://github.com/apache/carbondata/pull/4021#discussion_r530077716



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala
##########
@@ -74,6 +74,9 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
       case decimal: DecimalType => s"decimal(${decimal.precision}, ${decimal.scale})"
       case BooleanType => CarbonType.BOOLEAN.getName
       case BinaryType => CarbonType.BINARY.getName
+      case ArrayType(elementType, _) => sparkType.simpleString
+      case StructType(fields) => sparkType.simpleString
+      case MapType(keyType, valueType, _) => sparkType.simpleString

Review comment:
       Long string columns (CarbonType.VARCHAR) also seem to be missing. Can you test this and add them too if required?
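For readers following the diff, here is a minimal sketch of what the three added cases evaluate to. It assumes only that Spark's `org.apache.spark.sql.types` package is on the classpath; the object name `ComplexTypeNames` and the helper `complexTypeName` are hypothetical and are not part of `CarbonDataFrameWriter`.

```scala
import org.apache.spark.sql.types._

// Hypothetical helper, not CarbonData code: it isolates the three cases
// added in the diff so their output can be inspected on its own.
object ComplexTypeNames {
  // Complex types are rendered with Spark's own DataType.simpleString;
  // every other type is left to the existing cases (None here).
  def complexTypeName(sparkType: DataType): Option[String] = sparkType match {
    case _: ArrayType | _: StructType | _: MapType => Some(sparkType.simpleString)
    case _ => None
  }

  def main(args: Array[String]): Unit = {
    val arr = ArrayType(IntegerType)
    val struct = StructType(Seq(
      StructField("id", IntegerType),
      StructField("name", StringType)))
    val map = MapType(StringType, DoubleType)

    println(complexTypeName(arr).get)    // array<int>
    println(complexTypeName(struct).get) // struct<id:int,name:string>
    println(complexTypeName(map).get)    // map<string,double>
    println(complexTypeName(StringType)) // None
  }
}
```

Because `simpleString` recurses into nested element, field, key and value types, nested complex columns are rendered in a single call.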





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4021: [CARBONDATA-4057] Support Complex DataType when Save DataFrame with MODE.OVERWRITE

GitBox

CarbonDataQA2 commented on pull request #4021:
URL: https://github.com/apache/carbondata/pull/4021#issuecomment-733446467


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4893/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4021: [CARBONDATA-4057] Support Complex DataType when Save DataFrame with MODE.OVERWRITE

GitBox

CarbonDataQA2 commented on pull request #4021:
URL: https://github.com/apache/carbondata/pull/4021#issuecomment-733447024


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3139/
   



[GitHub] [carbondata] marchpure commented on pull request #4021: [CARBONDATA-4057] Support Complex DataType when Save DataFrame with MODE.OVERWRITE

GitBox

marchpure commented on pull request #4021:
URL: https://github.com/apache/carbondata/pull/4021#issuecomment-733543032


   retest this please



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4021: [CARBONDATA-4057] Support Complex DataType when Save DataFrame with MODE.OVERWRITE

GitBox

CarbonDataQA2 commented on pull request #4021:
URL: https://github.com/apache/carbondata/pull/4021#issuecomment-733601660


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4900/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4021: [CARBONDATA-4057] Support Complex DataType when Save DataFrame with MODE.OVERWRITE

GitBox

CarbonDataQA2 commented on pull request #4021:
URL: https://github.com/apache/carbondata/pull/4021#issuecomment-733603194


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3146/
   



[GitHub] [carbondata] marchpure commented on a change in pull request #4021: [CARBONDATA-4057] Support Complex DataType when Save DataFrame with MODE.OVERWRITE

GitBox

marchpure commented on a change in pull request #4021:
URL: https://github.com/apache/carbondata/pull/4021#discussion_r530373867



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala
##########
@@ -74,6 +74,9 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
       case decimal: DecimalType => s"decimal(${decimal.precision}, ${decimal.scale})"
       case BooleanType => CarbonType.BOOLEAN.getName
       case BinaryType => CarbonType.BINARY.getName
+      case ArrayType(elementType, _) => sparkType.simpleString
+      case StructType(fields) => sparkType.simpleString
+      case MapType(keyType, valueType, _) => sparkType.simpleString

Review comment:
       There is no varchar type among Spark's data types:
   https://spark.apache.org/docs/latest/sql-ref-datatypes.html
   
   I have reworked the test case, and all data types are now tested. Please take a look.





[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4021: [CARBONDATA-4057] Support Complex DataType when Save DataFrame with MODE.OVERWRITE

GitBox

ajantha-bhat commented on a change in pull request #4021:
URL: https://github.com/apache/carbondata/pull/4021#discussion_r530444909



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala
##########
@@ -74,6 +74,9 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) {
       case decimal: DecimalType => s"decimal(${decimal.precision}, ${decimal.scale})"
       case BooleanType => CarbonType.BOOLEAN.getName
       case BinaryType => CarbonType.BINARY.getName
+      case ArrayType(elementType, _) => sparkType.simpleString
+      case StructType(fields) => sparkType.simpleString
+      case MapType(keyType, valueType, _) => sparkType.simpleString

Review comment:
       Yes, in Spark every string is effectively a varchar.
   In Carbon, however, if long_string_columns is set in the table properties, we change the column's data type from carbonType.string to carbonType.varchar.
   For DataFrames, I think options("long_string_columns"="c1,c2") is not yet supported, so the current changes are OK.
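The promotion described in this comment can be modelled in a few lines. This is a toy illustration only, not CarbonData's implementation: the object `LongStringSketch`, both helper names, and the plain type-name strings "string" and "varchar" are assumptions made for the example.

```scala
// Toy model (hypothetical names throughout): a string column that is listed
// in the long_string_columns property is treated as varchar, anything else
// keeps its original type name.
object LongStringSketch {
  // Parse a property value such as "c1,c2" into normalized column names.
  def parseLongStringColumns(prop: String): Set[String] =
    prop.split(",").map(_.trim.toLowerCase).filter(_.nonEmpty).toSet

  // Only string columns named in the property are promoted.
  def carbonTypeName(column: String,
                     typeName: String,
                     longStringCols: Set[String]): String =
    if (typeName == "string" && longStringCols.contains(column.toLowerCase)) "varchar"
    else typeName

  def main(args: Array[String]): Unit = {
    val longCols = parseLongStringColumns("c1, c2")
    println(carbonTypeName("c1", "string", longCols)) // varchar
    println(carbonTypeName("c3", "string", longCols)) // string
    println(carbonTypeName("c2", "int", longCols))    // int
  }
}
```

Since the DataFrame writer in this PR receives no such property, its type mapping has no column list to consult, which is consistent with the reviewer's conclusion that no VARCHAR case is needed here.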





[GitHub] [carbondata] ajantha-bhat commented on pull request #4021: [CARBONDATA-4057] Support Complex DataType when Save DataFrame with MODE.OVERWRITE

GitBox

ajantha-bhat commented on pull request #4021:
URL: https://github.com/apache/carbondata/pull/4021#issuecomment-733765223


   LGTM



[GitHub] [carbondata] asfgit closed pull request #4021: [CARBONDATA-4057] Support Complex DataType when Save DataFrame with MODE.OVERWRITE

GitBox

asfgit closed pull request #4021:
URL: https://github.com/apache/carbondata/pull/4021


   

