[GitHub] carbondata pull request #2220: [CARBONDATA-2369] FAQ update related to carbo...

qiuchenjian-2
GitHub user ajantha-bhat opened a pull request:

    https://github.com/apache/carbondata/pull/2220

    [CARBONDATA-2369] FAQ update related to carbon SDK scenario
   
     - [ ] Any interfaces changed? no
     
     - [ ] Any backward compatibility impacted? no
     
     - [ ] Document update required? yes, updated
   
     - [ ] Testing done. NA
           
   
   


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ajantha-bhat/carbondata faq

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2220.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2220
   
----
commit 6ee180f0c4f11207e73b19a46ab48ba01ec7128a
Author: ajantha-bhat <ajanthabhat@...>
Date:   2018-04-24T08:20:35Z

    [CARBONDATA-2369] FAQ update related to SDK scenario

----


---

[GitHub] carbondata issue #2220: [CARBONDATA-2369] FAQ update related to carbon SDK s...

qiuchenjian-2
GitHub user sgururajshetty commented on the issue:

    https://github.com/apache/carbondata/pull/2220
 
    LGTM


---

[GitHub] carbondata issue #2220: [CARBONDATA-2369] FAQ update related to carbon SDK s...

qiuchenjian-2
In reply to this post by qiuchenjian-2
GitHub user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2220
 
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4197/



---

[GitHub] carbondata issue #2220: [CARBONDATA-2369] FAQ update related to carbon SDK s...

qiuchenjian-2
In reply to this post by qiuchenjian-2
GitHub user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2220
 
    SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4508/



---

[GitHub] carbondata issue #2220: [CARBONDATA-2369] FAQ update related to carbon SDK s...

qiuchenjian-2
In reply to this post by qiuchenjian-2
GitHub user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2220
 
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5366/



---

[GitHub] carbondata pull request #2220: [CARBONDATA-2369] FAQ update related to carbo...

qiuchenjian-2
In reply to this post by qiuchenjian-2
GitHub user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2220#discussion_r183798281
 
    --- Diff: docs/faq.md ---
    @@ -182,3 +183,15 @@ select cntry,sum(gdp) from gdp21,pop1 where cntry=ctry group by cntry;
     ## Why all executors are showing success in Spark UI even after Dataload command failed at Driver side?
     Spark executor shows task as failed after the maximum number of retry attempts, but loading the data having bad records and BAD_RECORDS_ACTION (carbon.bad.records.action) is set as “FAIL” will attempt only once but will send the signal to driver as failed instead of throwing the exception to retry, as there is no point to retry if bad record found and BAD_RECORDS_ACTION is set to fail. Hence the Spark executor displays this one attempt as successful but the command has actually failed to execute. Task attempts or executor logs can be checked to observe the failure reason.
     
    +## Why different time zone result for select query output when query SDK writer output?
    +SDK writer is an independent entity, hence SDK writer can generate carbondata files from a non-cluster machine that has different time zones. But at cluster when those files are read, it always takes cluster time-zone. Hence, the value of timestamp and date datatype fields are not original value.
    +If you do not want to see according to time-zone, then set cluster's time-zone in SDK writer by calling below API.
    --- End diff --
   
    If you want to control the time zone of the data while writing, set the cluster's time zone in the SDK writer by calling the API below.
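
The time-zone effect discussed in this review can be illustrated with a small, self-contained Java sketch. It does not use any CarbonData API: the class `TimeZoneShift` and the method `readInZone` are hypothetical names chosen for this illustration, and the zone names are example values. The sketch assumes that a timestamp is stored as an absolute instant derived from the writer's zone and is rendered again in the reader's (cluster's) zone, which is the behaviour the FAQ text describes. The thread does not show the API that the FAQ refers to; the JDK's `TimeZone.setDefault` is one general way to change a JVM's default zone, but whether that is the call intended here is an assumption.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration only; not part of the CarbonData SDK.
class TimeZoneShift {
    static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Interprets a wall-clock string in the writer's zone, converts it to an
    // absolute instant (what would be stored), then renders that instant in
    // the reader's zone (what a query on the cluster would display).
    static String readInZone(String wallClock, String writerZone, String readerZone) {
        LocalDateTime local = LocalDateTime.parse(wallClock, FMT);
        Instant stored = local.atZone(ZoneId.of(writerZone)).toInstant();
        return stored.atZone(ZoneId.of(readerZone)).format(FMT);
    }

    public static void main(String[] args) {
        String value = "2018-04-24 08:20:35";

        // Writer machine in UTC, cluster in Asia/Shanghai: the value shifts by 8 hours.
        System.out.println(readInZone(value, "UTC", "Asia/Shanghai"));
        // prints 2018-04-24 16:20:35

        // Writer configured with the cluster's zone: the original value is preserved.
        System.out.println(readInZone(value, "Asia/Shanghai", "Asia/Shanghai"));
        // prints 2018-04-24 08:20:35
    }
}
```

The second call shows why aligning the writer's zone with the cluster's zone keeps timestamp values unchanged when they are read back.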


---

[GitHub] carbondata pull request #2220: [CARBONDATA-2369] FAQ update related to carbo...

qiuchenjian-2
In reply to this post by qiuchenjian-2
GitHub user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2220#discussion_r183928302
 
    --- Diff: docs/faq.md ---
    @@ -182,3 +183,15 @@ select cntry,sum(gdp) from gdp21,pop1 where cntry=ctry group by cntry;
     ## Why all executors are showing success in Spark UI even after Dataload command failed at Driver side?
     Spark executor shows task as failed after the maximum number of retry attempts, but loading the data having bad records and BAD_RECORDS_ACTION (carbon.bad.records.action) is set as “FAIL” will attempt only once but will send the signal to driver as failed instead of throwing the exception to retry, as there is no point to retry if bad record found and BAD_RECORDS_ACTION is set to fail. Hence the Spark executor displays this one attempt as successful but the command has actually failed to execute. Task attempts or executor logs can be checked to observe the failure reason.
     
    +## Why different time zone result for select query output when query SDK writer output?
    +SDK writer is an independent entity, hence SDK writer can generate carbondata files from a non-cluster machine that has different time zones. But at cluster when those files are read, it always takes cluster time-zone. Hence, the value of timestamp and date datatype fields are not original value.
    +If you do not want to see according to time-zone, then set cluster's time-zone in SDK writer by calling below API.
    --- End diff --
   
    Done. Will take these changes in #2198.


---

[GitHub] carbondata issue #2220: [CARBONDATA-2369] FAQ update related to carbon SDK s...

qiuchenjian-2
In reply to this post by qiuchenjian-2
GitHub user ajantha-bhat commented on the issue:

    https://github.com/apache/carbondata/pull/2220
 
    This will be handled in #2198.
   
    No need for a separate PR.


---

[GitHub] carbondata pull request #2220: [CARBONDATA-2369] FAQ update related to carbo...

qiuchenjian-2
In reply to this post by qiuchenjian-2
GitHub user ajantha-bhat closed the pull request at:

    https://github.com/apache/carbondata/pull/2220


---