[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...


[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...

qiuchenjian-2
GitHub user ravikiran23 opened a pull request:

    https://github.com/apache/incubator-carbondata/pull/523

    [CARBONDATA-440] fixing no kettle issue for IUD.

    IUD uses the data load flow, so the NO-KETTLE case also needs to handle data load.
   
    The load count / segment count should be a string because in the compaction case it will be a value such as 2.1.
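
A minimal sketch of why a string is the safer type here, assuming only what the description states: that a compacted segment can carry an id such as 2.1. The object and helper names are hypothetical and do not come from the CarbonData code base.

```scala
import scala.util.Try

// Hypothetical illustration; none of these names exist in CarbonData.
object SegmentIdSketch {
  // Kept as strings, "2.1" stays distinct from "2" and needs no parsing.
  val segmentIds: Seq[String] = Seq("0", "1", "2", "2.1")

  // Treating the id as an Int breaks once a compacted id appears:
  // "2.1".toInt throws NumberFormatException, which Try turns into None.
  def asInt(id: String): Option[Int] = Try(id.toInt).toOption
}
```

With this sketch, `SegmentIdSketch.asInt("2")` yields a value while `SegmentIdSketch.asInt("2.1")` yields `None`, which is the failure mode a string-typed id avoids.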
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ravikiran23/incubator-carbondata IUD-NO-KETTLE

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/523.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #523
   
----
commit 5dd98b38e332b08f11daeaa683950b90172e02a9
Author: ravikiran <[hidden email]>
Date:   2017-01-09T13:28:13Z

    fixing no kettle issue for IUD.
    load count/ segment count should be string because in compaction case it will be 2.1

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #523: [CARBONDATA-440] fixing no kettle issue for...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/523
 
    Build Success with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/559/




[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r95704439
 
    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,51 @@ object CarbonDataRDDFactory {
                   loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
                   val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY +
                                        UUID.randomUUID().toString
    +              if (useKettle) {
    +                try {
    +                  RddInpututilsForUpdate.put(rddIteratorKey,
    +                    new RddIteratorForUpdate(iter, carbonLoadModel))
    +                  carbonLoadModel.setRddIteratorKey(rddIteratorKey)
    +                  CarbonDataLoadForUpdate
    +                    .run(carbonLoadModel, index, storePath, kettleHomePath,
    +                      segId, loadMetadataDetails, executionErrors)
    +                } finally {
    +                  RddInpututilsForUpdate.remove(rddIteratorKey)
    +                }
    +              }
    +              else {
    --- End diff --
   
    move to previous line



[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r95709745
 
    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,51 @@ object CarbonDataRDDFactory {
                   loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
                   val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY +
                                        UUID.randomUUID().toString
    +              if (useKettle) {
    --- End diff --
   
    What about the carbon-spark2 module? Can you check the same there as well?



[GitHub] incubator-carbondata issue #523: [CARBONDATA-440] fixing no kettle issue for...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/523
 
    I verified with `mvn clean verify -Pno-kettle -Pspark-1.6` but it failed in test case `insert from hive-sum expression`



[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravikiran23 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r95751767
 
    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,51 @@ object CarbonDataRDDFactory {
                   loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
                   val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY +
                                        UUID.randomUUID().toString
    +              if (useKettle) {
    --- End diff --
   
    As of now, IUD is supported on Spark 1.6.2. It is not supported on 2.1.



[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravikiran23 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r95752235
 
    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,51 @@ object CarbonDataRDDFactory {
                   loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
                   val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY +
                                        UUID.randomUUID().toString
    +              if (useKettle) {
    +                try {
    +                  RddInpututilsForUpdate.put(rddIteratorKey,
    +                    new RddIteratorForUpdate(iter, carbonLoadModel))
    +                  carbonLoadModel.setRddIteratorKey(rddIteratorKey)
    +                  CarbonDataLoadForUpdate
    +                    .run(carbonLoadModel, index, storePath, kettleHomePath,
    +                      segId, loadMetadataDetails, executionErrors)
    +                } finally {
    +                  RddInpututilsForUpdate.remove(rddIteratorKey)
    +                }
    +              }
    +              else {
    --- End diff --
   
    fixed



[GitHub] incubator-carbondata issue #523: [CARBONDATA-440] fixing no kettle issue for...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/523
 
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/565/




[GitHub] incubator-carbondata issue #523: [CARBONDATA-440] fixing no kettle issue for...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravikiran23 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/523
 
    @jackylk I ran the same test case on the new code without my fix, and it still fails. This may be due to some other PR. My code doesn't affect the insert-into flow.



[GitHub] incubator-carbondata issue #523: [CARBONDATA-440] fixing no kettle issue for...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/523
 
    @jackylk Please review and merge this PR. I will fix the test cases for the no-kettle flow and raise them in another PR.



[GitHub] incubator-carbondata issue #523: [CARBONDATA-440] fixing no kettle issue for...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/523
 
    LGTM



[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r95920624
 
    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,50 @@ object CarbonDataRDDFactory {
                   loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
                   val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY +
                                        UUID.randomUUID().toString
    +              if (useKettle) {
    +                try {
    --- End diff --
   
    Move the `try` so it wraps only `CarbonDataLoadForUpdate.run`; we should limit the try scope. Do the same for the next `try` as well.
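
One possible reading of this request, sketched with stand-in names: the registration step sits outside the `try`, and only the call that does the work is guarded, with cleanup in `finally`. The `registry` map and `runLoad` function are hypothetical placeholders for `RddInpututilsForUpdate` and `CarbonDataLoadForUpdate.run`, not their real signatures.

```scala
import scala.collection.mutable

// Hypothetical stand-ins; not the CarbonData classes quoted in the diff.
object TryScopeSketch {
  // Plays the role of the shared iterator registry.
  val registry = mutable.Map[String, Iterator[Int]]()

  // Plays the role of the load step that may throw.
  def runLoad(key: String): Int = registry(key).sum

  def load(key: String, rows: Iterator[Int]): Int = {
    // Setup stays outside the try: if it fails, nothing was registered.
    registry.put(key, rows)
    try {
      // Only the call that can fail mid-load is inside the try.
      runLoad(key)
    } finally {
      // Cleanup runs whether runLoad returns or throws.
      registry.remove(key)
    }
  }
}
```

Keeping the guarded region this small makes it easier to see which call the `finally` block is protecting.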



[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r95920765
 
    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,50 @@ object CarbonDataRDDFactory {
                   loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
                   val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY +
                                        UUID.randomUUID().toString
    +              if (useKettle) {
    +                try {
    +                  RddInpututilsForUpdate.put(rddIteratorKey,
    +                    new RddIteratorForUpdate(iter, carbonLoadModel))
    +                  carbonLoadModel.setRddIteratorKey(rddIteratorKey)
    +                  CarbonDataLoadForUpdate
    +                    .run(carbonLoadModel, index, storePath, kettleHomePath,
    +                      segId, loadMetadataDetails, executionErrors)
    +                } finally {
    +                  RddInpututilsForUpdate.remove(rddIteratorKey)
    +                }
    +              } else {
    +                try {
    +                  val recordReaders = mutable.Buffer[CarbonIterator[Array[AnyRef]]]()
    +                  val serializer = SparkEnv.get.closureSerializer.newInstance()
    +                  var serializeBuffer: ByteBuffer = null
    +                    recordReaders += new CarbonIteratorImpl(
    +                      new NewRddIterator(iter,
    +                        carbonLoadModel,
    +                        TaskContext.get()))
    +
    +                  val loader = new SparkPartitionLoader(carbonLoadModel,
    +                    index,
    +                    null,
    +                    null,
    +                    segId,
    +                    loadMetadataDetails)
    +                  // Intialize to set carbon properties
    +                  loader.initialize()
    +
    +                  loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
    +                  new DataLoadExecutor()
    +                    .execute(carbonLoadModel, loader.storeLocation, recordReaders.toArray)
    --- End diff --
   
    move to previous line, break the line at parameter list
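
A small sketch of the layout this comment seems to ask for, assuming it means keeping the method call on the receiver's line and wrapping inside the argument list instead. `Executor` here is a hypothetical stand-in, not the real `DataLoadExecutor`.

```scala
object LineBreakSketch {
  // Hypothetical stand-in for an executor; not the real DataLoadExecutor.
  class Executor {
    def execute(model: String, storeLocation: String, readers: Array[Int]): Int =
      readers.length
  }

  def run(): Int = {
    // The call stays on the receiver's line; the break falls inside the
    // parameter list instead of before `.execute`.
    new Executor().execute("model",
      "storeLocation",
      Array(1, 2, 3))
  }
}
```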



[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r95920779
 
    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,50 @@ object CarbonDataRDDFactory {
                   loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
                   val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY +
                                        UUID.randomUUID().toString
    +              if (useKettle) {
    +                try {
    +                  RddInpututilsForUpdate.put(rddIteratorKey,
    +                    new RddIteratorForUpdate(iter, carbonLoadModel))
    +                  carbonLoadModel.setRddIteratorKey(rddIteratorKey)
    +                  CarbonDataLoadForUpdate
    +                    .run(carbonLoadModel, index, storePath, kettleHomePath,
    +                      segId, loadMetadataDetails, executionErrors)
    +                } finally {
    +                  RddInpututilsForUpdate.remove(rddIteratorKey)
    +                }
    +              } else {
    +                try {
    +                  val recordReaders = mutable.Buffer[CarbonIterator[Array[AnyRef]]]()
    +                  val serializer = SparkEnv.get.closureSerializer.newInstance()
    +                  var serializeBuffer: ByteBuffer = null
    +                    recordReaders += new CarbonIteratorImpl(
    +                      new NewRddIterator(iter,
    +                        carbonLoadModel,
    +                        TaskContext.get()))
    +
    +                  val loader = new SparkPartitionLoader(carbonLoadModel,
    +                    index,
    +                    null,
    +                    null,
    +                    segId,
    +                    loadMetadataDetails)
    +                  // Intialize to set carbon properties
    +                  loader.initialize()
    +
    +                  loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
    +                  new DataLoadExecutor()
    +                    .execute(carbonLoadModel, loader.storeLocation, recordReaders.toArray)
    +
    +                } catch {
    +                  case e: BadRecordFoundException =>
    +                    loadMetadataDetails
    +                      .setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_PARTIAL_SUCCESS)
    --- End diff --
   
    move to previous line, break the line at parameter list



[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r95920835
 
    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,50 @@ object CarbonDataRDDFactory {
                   loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
                   val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY +
                                        UUID.randomUUID().toString
    +              if (useKettle) {
    +                try {
    +                  RddInpututilsForUpdate.put(rddIteratorKey,
    +                    new RddIteratorForUpdate(iter, carbonLoadModel))
    +                  carbonLoadModel.setRddIteratorKey(rddIteratorKey)
    +                  CarbonDataLoadForUpdate
    +                    .run(carbonLoadModel, index, storePath, kettleHomePath,
    --- End diff --
   
    move to previous line, break the line at parameter list



[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r95920952
 
    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,50 @@ object CarbonDataRDDFactory {
                   loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
                   val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY +
                                        UUID.randomUUID().toString
    +              if (useKettle) {
    +                try {
    +                  RddInpututilsForUpdate.put(rddIteratorKey,
    +                    new RddIteratorForUpdate(iter, carbonLoadModel))
    +                  carbonLoadModel.setRddIteratorKey(rddIteratorKey)
    +                  CarbonDataLoadForUpdate
    +                    .run(carbonLoadModel, index, storePath, kettleHomePath,
    +                      segId, loadMetadataDetails, executionErrors)
    +                } finally {
    +                  RddInpututilsForUpdate.remove(rddIteratorKey)
    +                }
    +              } else {
    +                try {
    +                  val recordReaders = mutable.Buffer[CarbonIterator[Array[AnyRef]]]()
    +                  val serializer = SparkEnv.get.closureSerializer.newInstance()
    +                  var serializeBuffer: ByteBuffer = null
    +                    recordReaders += new CarbonIteratorImpl(
    +                      new NewRddIterator(iter,
    +                        carbonLoadModel,
    +                        TaskContext.get()))
    +
    +                  val loader = new SparkPartitionLoader(carbonLoadModel,
    +                    index,
    +                    null,
    --- End diff --
   
    You are following a different code style. Can you make it match the rest of the code?



[GitHub] incubator-carbondata issue #523: [CARBONDATA-440] fixing no kettle issue for...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/523
 
    @ravikiran23 Please work on the comments given by @jackylk. We should merge this soon; otherwise it will block testing.



[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravikiran23 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r96169559
 
    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,50 @@ object CarbonDataRDDFactory {
                   loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
                   val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY +
                                        UUID.randomUUID().toString
    +              if (useKettle) {
    +                try {
    --- End diff --
   
    fixed



[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravikiran23 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r96169571
 
    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,50 @@ object CarbonDataRDDFactory {
                   loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
                   val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY +
                                        UUID.randomUUID().toString
    +              if (useKettle) {
    +                try {
    +                  RddInpututilsForUpdate.put(rddIteratorKey,
    +                    new RddIteratorForUpdate(iter, carbonLoadModel))
    +                  carbonLoadModel.setRddIteratorKey(rddIteratorKey)
    +                  CarbonDataLoadForUpdate
    +                    .run(carbonLoadModel, index, storePath, kettleHomePath,
    +                      segId, loadMetadataDetails, executionErrors)
    +                } finally {
    +                  RddInpututilsForUpdate.remove(rddIteratorKey)
    +                }
    +              } else {
    +                try {
    +                  val recordReaders = mutable.Buffer[CarbonIterator[Array[AnyRef]]]()
    +                  val serializer = SparkEnv.get.closureSerializer.newInstance()
    +                  var serializeBuffer: ByteBuffer = null
    +                    recordReaders += new CarbonIteratorImpl(
    +                      new NewRddIterator(iter,
    +                        carbonLoadModel,
    +                        TaskContext.get()))
    +
    +                  val loader = new SparkPartitionLoader(carbonLoadModel,
    +                    index,
    +                    null,
    +                    null,
    +                    segId,
    +                    loadMetadataDetails)
    +                  // Intialize to set carbon properties
    +                  loader.initialize()
    +
    +                  loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
    +                  new DataLoadExecutor()
    +                    .execute(carbonLoadModel, loader.storeLocation, recordReaders.toArray)
    --- End diff --
   
    fixed



[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravikiran23 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/523#discussion_r96169576
 
    --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala ---
    @@ -719,16 +720,50 @@ object CarbonDataRDDFactory {
                   loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
                   val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY +
                                        UUID.randomUUID().toString
    +              if (useKettle) {
    +                try {
    +                  RddInpututilsForUpdate.put(rddIteratorKey,
    +                    new RddIteratorForUpdate(iter, carbonLoadModel))
    +                  carbonLoadModel.setRddIteratorKey(rddIteratorKey)
    +                  CarbonDataLoadForUpdate
    +                    .run(carbonLoadModel, index, storePath, kettleHomePath,
    +                      segId, loadMetadataDetails, executionErrors)
    +                } finally {
    +                  RddInpututilsForUpdate.remove(rddIteratorKey)
    +                }
    +              } else {
    +                try {
    +                  val recordReaders = mutable.Buffer[CarbonIterator[Array[AnyRef]]]()
    +                  val serializer = SparkEnv.get.closureSerializer.newInstance()
    +                  var serializeBuffer: ByteBuffer = null
    +                    recordReaders += new CarbonIteratorImpl(
    +                      new NewRddIterator(iter,
    +                        carbonLoadModel,
    +                        TaskContext.get()))
    +
    +                  val loader = new SparkPartitionLoader(carbonLoadModel,
    +                    index,
    +                    null,
    +                    null,
    +                    segId,
    +                    loadMetadataDetails)
    +                  // Intialize to set carbon properties
    +                  loader.initialize()
    +
    +                  loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
    +                  new DataLoadExecutor()
    +                    .execute(carbonLoadModel, loader.storeLocation, recordReaders.toArray)
    +
    +                } catch {
    +                  case e: BadRecordFoundException =>
    +                    loadMetadataDetails
    +                      .setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_PARTIAL_SUCCESS)
    --- End diff --
   
    fixed

