[GitHub] [carbondata] ajantha-bhat opened a new pull request #3882: [WIP] Support binary data type reading from presto

classic Classic list List threaded Threaded
18 messages Options
Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat opened a new pull request #3882: [WIP] Support binary data type reading from presto

GitBox

ajantha-bhat opened a new pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882


    ### Why is this PR needed?
    when binary store is queried from presto, presto currently give 0 rows.
   
    ### What changes were proposed in this PR?
   Presto can support binary (varBinary) data type reading by using the SliceStreamReader
   and it can put binary byte[] using putByteArray() method of SliceStreamReader
       
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - Yes
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

GitBox

QiangCai commented on a change in pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#discussion_r466249308



##########
File path: integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoTestNonTransactionalTableFiles.scala
##########
@@ -176,24 +188,35 @@ class PrestoTestNonTransactionalTableFiles extends FunSuiteLike with BeforeAndAf
   def buildTestDataOtherDataType(rows: Int, sortColumns: Array[String]): Any = {
     val fields: Array[Field] = new Array[Field](5)
     // same column name, but name as boolean type
-    fields(0) = new Field("name", DataTypes.BOOLEAN)
+    fields(0) = new Field("name", DataTypes.VARCHAR)
     fields(1) = new Field("age", DataTypes.INT)
-    fields(2) = new Field("id", DataTypes.BYTE)
+    fields(2) = new Field("id", DataTypes.BINARY)
     fields(3) = new Field("height", DataTypes.DOUBLE)
     fields(4) = new Field("salary", DataTypes.FLOAT)
 
+    val imagePath = "../../sdk/sdk/src/test/resources/image/carbondatalogo.jpg"
     try {
       val builder = CarbonWriter.builder()
       val writer =
         builder.outputPath(writerPath)
           .uniqueIdentifier(System.currentTimeMillis()).withBlockSize(2).sortBy(sortColumns)
           .withCsvInput(new Schema(fields)).writtenBy("TestNonTransactionalCarbonTable").build()
       var i = 0
+      val bis = new BufferedInputStream(new FileInputStream(imagePath))
+      var hexValue: Array[Char] = null
+      val originBinary = new Array[Byte](bis.available)
+      while (bis.read(originBinary) != -1) {
+        hexValue = Hex.encodeHex(originBinary)
+      }
+      bis.close()
+      val binaryValue = String.valueOf(hexValue)
+

Review comment:
       keep only one blank line.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on a change in pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

GitBox
In reply to this post by GitBox

QiangCai commented on a change in pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#discussion_r466249621



##########
File path: integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoTestNonTransactionalTableFiles.scala
##########
@@ -176,24 +188,35 @@ class PrestoTestNonTransactionalTableFiles extends FunSuiteLike with BeforeAndAf
   def buildTestDataOtherDataType(rows: Int, sortColumns: Array[String]): Any = {
     val fields: Array[Field] = new Array[Field](5)
     // same column name, but name as boolean type
-    fields(0) = new Field("name", DataTypes.BOOLEAN)
+    fields(0) = new Field("name", DataTypes.VARCHAR)
     fields(1) = new Field("age", DataTypes.INT)
-    fields(2) = new Field("id", DataTypes.BYTE)
+    fields(2) = new Field("id", DataTypes.BINARY)
     fields(3) = new Field("height", DataTypes.DOUBLE)
     fields(4) = new Field("salary", DataTypes.FLOAT)
 
+    val imagePath = "../../sdk/sdk/src/test/resources/image/carbondatalogo.jpg"

Review comment:
       better to base on rootPath
   val imagePath = "$rootPath/sdk/sdk/src/test/resources/image/carbondatalogo.jpg"
   
   another question: why presto module need scala?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#discussion_r466252790



##########
File path: integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoTestNonTransactionalTableFiles.scala
##########
@@ -176,24 +188,35 @@ class PrestoTestNonTransactionalTableFiles extends FunSuiteLike with BeforeAndAf
   def buildTestDataOtherDataType(rows: Int, sortColumns: Array[String]): Any = {
     val fields: Array[Field] = new Array[Field](5)
     // same column name, but name as boolean type
-    fields(0) = new Field("name", DataTypes.BOOLEAN)
+    fields(0) = new Field("name", DataTypes.VARCHAR)
     fields(1) = new Field("age", DataTypes.INT)
-    fields(2) = new Field("id", DataTypes.BYTE)
+    fields(2) = new Field("id", DataTypes.BINARY)
     fields(3) = new Field("height", DataTypes.DOUBLE)
     fields(4) = new Field("salary", DataTypes.FLOAT)
 
+    val imagePath = "../../sdk/sdk/src/test/resources/image/carbondatalogo.jpg"

Review comment:
       > another question: why presto module need scala?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#discussion_r466252790



##########
File path: integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoTestNonTransactionalTableFiles.scala
##########
@@ -176,24 +188,35 @@ class PrestoTestNonTransactionalTableFiles extends FunSuiteLike with BeforeAndAf
   def buildTestDataOtherDataType(rows: Int, sortColumns: Array[String]): Any = {
     val fields: Array[Field] = new Array[Field](5)
     // same column name, but name as boolean type
-    fields(0) = new Field("name", DataTypes.BOOLEAN)
+    fields(0) = new Field("name", DataTypes.VARCHAR)
     fields(1) = new Field("age", DataTypes.INT)
-    fields(2) = new Field("id", DataTypes.BYTE)
+    fields(2) = new Field("id", DataTypes.BINARY)
     fields(3) = new Field("height", DataTypes.DOUBLE)
     fields(4) = new Field("salary", DataTypes.FLOAT)
 
+    val imagePath = "../../sdk/sdk/src/test/resources/image/carbondatalogo.jpg"

Review comment:
       > another question: why presto module need scala?
   we can add spark as test dependency and create store from spark and query from presto

##########
File path: integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoTestNonTransactionalTableFiles.scala
##########
@@ -176,24 +188,35 @@ class PrestoTestNonTransactionalTableFiles extends FunSuiteLike with BeforeAndAf
   def buildTestDataOtherDataType(rows: Int, sortColumns: Array[String]): Any = {
     val fields: Array[Field] = new Array[Field](5)
     // same column name, but name as boolean type
-    fields(0) = new Field("name", DataTypes.BOOLEAN)
+    fields(0) = new Field("name", DataTypes.VARCHAR)
     fields(1) = new Field("age", DataTypes.INT)
-    fields(2) = new Field("id", DataTypes.BYTE)
+    fields(2) = new Field("id", DataTypes.BINARY)
     fields(3) = new Field("height", DataTypes.DOUBLE)
     fields(4) = new Field("salary", DataTypes.FLOAT)
 
+    val imagePath = "../../sdk/sdk/src/test/resources/image/carbondatalogo.jpg"

Review comment:
       > another question: why presto module need scala?
   
   we can add spark as test dependency and create store from spark and query from presto




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

GitBox
In reply to this post by GitBox

ajantha-bhat commented on a change in pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#discussion_r466254229



##########
File path: integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoTestNonTransactionalTableFiles.scala
##########
@@ -176,24 +188,35 @@ class PrestoTestNonTransactionalTableFiles extends FunSuiteLike with BeforeAndAf
   def buildTestDataOtherDataType(rows: Int, sortColumns: Array[String]): Any = {
     val fields: Array[Field] = new Array[Field](5)
     // same column name, but name as boolean type
-    fields(0) = new Field("name", DataTypes.BOOLEAN)
+    fields(0) = new Field("name", DataTypes.VARCHAR)
     fields(1) = new Field("age", DataTypes.INT)
-    fields(2) = new Field("id", DataTypes.BYTE)
+    fields(2) = new Field("id", DataTypes.BINARY)
     fields(3) = new Field("height", DataTypes.DOUBLE)
     fields(4) = new Field("salary", DataTypes.FLOAT)
 
+    val imagePath = "../../sdk/sdk/src/test/resources/image/carbondatalogo.jpg"
     try {
       val builder = CarbonWriter.builder()
       val writer =
         builder.outputPath(writerPath)
           .uniqueIdentifier(System.currentTimeMillis()).withBlockSize(2).sortBy(sortColumns)
           .withCsvInput(new Schema(fields)).writtenBy("TestNonTransactionalCarbonTable").build()
       var i = 0
+      val bis = new BufferedInputStream(new FileInputStream(imagePath))
+      var hexValue: Array[Char] = null
+      val originBinary = new Array[Byte](bis.available)
+      while (bis.read(originBinary) != -1) {
+        hexValue = Hex.encodeHex(originBinary)
+      }
+      bis.close()
+      val binaryValue = String.valueOf(hexValue)
+

Review comment:
       ok

##########
File path: integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoTestNonTransactionalTableFiles.scala
##########
@@ -176,24 +188,35 @@ class PrestoTestNonTransactionalTableFiles extends FunSuiteLike with BeforeAndAf
   def buildTestDataOtherDataType(rows: Int, sortColumns: Array[String]): Any = {
     val fields: Array[Field] = new Array[Field](5)
     // same column name, but name as boolean type
-    fields(0) = new Field("name", DataTypes.BOOLEAN)
+    fields(0) = new Field("name", DataTypes.VARCHAR)
     fields(1) = new Field("age", DataTypes.INT)
-    fields(2) = new Field("id", DataTypes.BYTE)
+    fields(2) = new Field("id", DataTypes.BINARY)
     fields(3) = new Field("height", DataTypes.DOUBLE)
     fields(4) = new Field("salary", DataTypes.FLOAT)
 
+    val imagePath = "../../sdk/sdk/src/test/resources/image/carbondatalogo.jpg"

Review comment:
       ok, used root path




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#issuecomment-669828122


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3627/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#issuecomment-669832416


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1888/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#issuecomment-669915582


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1896/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#issuecomment-669945651


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3635/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

GitBox
In reply to this post by GitBox

ajantha-bhat commented on pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#issuecomment-669948010


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#issuecomment-670051650


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3640/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#issuecomment-670055404


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1901/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] ajantha-bhat commented on pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

GitBox
In reply to this post by GitBox

ajantha-bhat commented on pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#issuecomment-670056335


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#issuecomment-670145319


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1908/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

GitBox
In reply to this post by GitBox

CarbonDataQA1 commented on pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#issuecomment-670150976


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3647/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] QiangCai commented on pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

GitBox
In reply to this post by GitBox

QiangCai commented on pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#issuecomment-670493189


   LGTM


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


Reply | Threaded
Open this post in threaded view
|

[GitHub] [carbondata] asfgit closed pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

GitBox
In reply to this post by GitBox

asfgit closed pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]