[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...


[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...

qiuchenjian-2
GitHub user watermen opened a pull request:

    https://github.com/apache/incubator-carbondata/pull/696

    [CARBONDATA-818] Make the file_name in carbonindex exactly

    The file_name stored in carbonindex is a local path that is used on the executor as a temp dir:
    ```
    /tmp/6937581525189542/0/default/carbon_v3/Fact/Part0/Segment_1/0/part-0-0_batchno0-0-1490344094093.carbondata
    ```
    But I think we want to store the actual carbondata path like
    ```
    /user/hive/warehouse/default/carbon_v3/Fact/Part0/Segment_0/part-0-0-0-1489566284025.carbondata
    ```
   
    I have already checked this with @QiangCai.
   
    cc @jackylk

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/watermen/incubator-carbondata CARBONDATA-818

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/696.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #696
   
----
commit 4da2a705ed8f050a8e89da0780d3c56751208a2e
Author: Yadong Qi <[hidden email]>
Date:   2017-03-24T08:26:20Z

    Make the file_name in carbonindex exactly.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
 
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1323/



---

[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user QiangCai commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
 
    @watermen
    It is unnecessary to store the carbondata file path in the carbonindex file.
    During btree building, only the carbondata file name is used to sort the TableBlockInfos.
    Please check CarbonUtil.readCarbonIndexFile and TableBlockInfo.compareTo.
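
The point about sorting by file name can be illustrated with a minimal Java sketch. It is not the actual `TableBlockInfo.compareTo` logic, which this thread does not show; the class `BlockNameSortSketch`, its helper methods, and the sample paths are all hypothetical. The sketch only shows that an ordering keyed on the file name is unaffected by the directory part of a path.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration; not CarbonData code.
final class BlockNameSortSketch {

    // Returns the last path component, accepting '/' or '\' as separator.
    static String fileNameOf(String path) {
        int idx = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        return path.substring(idx + 1);
    }

    // Returns a copy of the paths ordered by file name only; directories are ignored.
    static List<String> sortByFileName(List<String> paths) {
        List<String> sorted = new ArrayList<>(paths);
        sorted.sort(Comparator.comparing(BlockNameSortSketch::fileNameOf));
        return sorted;
    }

    public static void main(String[] args) {
        // Made-up paths: one under a temp dir, one under a store location.
        List<String> paths = Arrays.asList(
            "/tmp/scratch/part-1-0.carbondata",
            "/store/table/part-0-0.carbondata");
        for (String p : sortByFileName(paths)) {
            System.out.println(fileNameOf(p));
        }
    }
}
```

Under this assumption, whether the index stores a temp-dir path or the final store path makes no difference to the ordering, which is consistent with storing only the file name.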


---

[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
 
    add to whitelist


---

[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
 
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1331/



---

[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user watermen commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
 
    @QiangCai The carbonindex now stores fileName instead of filePath. Please review it again.


---

[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
 
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1332/



---

[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user QiangCai commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
 
    LGTM


---

[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user Sephiroth-Lin commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
 
    LGTM


---

[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
 
    retest this please


---

[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
 
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1336/



---

[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/696#discussion_r108052363
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestDataLoadWithFileName.scala ---
    @@ -0,0 +1,111 @@
    +package org.apache.carbondata.spark.testsuite.dataload
    --- End diff --
   
    Please add the license header.


---

[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/696#discussion_r108052521
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/v1/CarbonFactDataWriterImplV1.java ---
    @@ -373,7 +373,7 @@ protected void writeBlockletInfoToFile(FileChannel channel, String filePath)
           FileFooter convertFileMeta = CarbonMetadataUtil
               .convertFileFooter(blockletInfoList, localCardinality.length, localCardinality,
                   thriftColumnSchemaList, dataWriterVo.getSegmentProperties());
    -      fillBlockIndexInfoDetails(convertFileMeta.getNum_rows(), filePath, currentPosition);
    +      fillBlockIndexInfoDetails(convertFileMeta.getNum_rows(), carbonDataFileName, currentPosition);
    --- End diff --
   
    Please align the parameter name (filePath) of fillBlockIndexInfoDetails in AbstractFactDataWriter.java.
    For example:
     protected void fillBlockIndexInfoDetails(long numberOfRows,
        String carbonDataFileName, long currentPosition)
   
    Please modify all occurrences accordingly.


---

[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/696#discussion_r108052532
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/v3/CarbonFactDataWriterImplV3.java ---
    @@ -528,8 +528,7 @@ protected void fillBlockIndexInfoDetails(long numberOfRows, String filePath,
         org.apache.carbondata.core.metadata.blocklet.index.BlockletIndex blockletIndex =
             new org.apache.carbondata.core.metadata.blocklet.index.BlockletIndex(btree, minmax);
         BlockIndexInfo blockIndexInfo =
    -        new BlockIndexInfo(numberOfRows, filePath.substring(0, filePath.lastIndexOf('.')),
    -            currentPosition, blockletIndex);
    +        new BlockIndexInfo(numberOfRows, filePath, currentPosition, blockletIndex);
    --- End diff --
   
    Can you explain why this change was made?


---

[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user watermen commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/696#discussion_r108081556
 
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestDataLoadWithFileName.scala ---
    @@ -0,0 +1,111 @@
    +package org.apache.carbondata.spark.testsuite.dataload
    --- End diff --
   
    Done


---

[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user watermen commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/696#discussion_r108081571
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/v1/CarbonFactDataWriterImplV1.java ---
    @@ -373,7 +373,7 @@ protected void writeBlockletInfoToFile(FileChannel channel, String filePath)
           FileFooter convertFileMeta = CarbonMetadataUtil
               .convertFileFooter(blockletInfoList, localCardinality.length, localCardinality,
                   thriftColumnSchemaList, dataWriterVo.getSegmentProperties());
    -      fillBlockIndexInfoDetails(convertFileMeta.getNum_rows(), filePath, currentPosition);
    +      fillBlockIndexInfoDetails(convertFileMeta.getNum_rows(), carbonDataFileName, currentPosition);
    --- End diff --
   
    Done


---

[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user watermen commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/696#discussion_r108081845
 
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/v3/CarbonFactDataWriterImplV3.java ---
    @@ -528,8 +528,7 @@ protected void fillBlockIndexInfoDetails(long numberOfRows, String filePath,
         org.apache.carbondata.core.metadata.blocklet.index.BlockletIndex blockletIndex =
             new org.apache.carbondata.core.metadata.blocklet.index.BlockletIndex(btree, minmax);
         BlockIndexInfo blockIndexInfo =
    -        new BlockIndexInfo(numberOfRows, filePath.substring(0, filePath.lastIndexOf('.')),
    -            currentPosition, blockletIndex);
    +        new BlockIndexInfo(numberOfRows, filePath, currentPosition, blockletIndex);
    --- End diff --
   
    # Before
    We passed fileName, which ends with `.inprogress`, so we needed substring to strip that suffix.
    ```java
    this.fileName = dataWriterVo.getStoreLocation() + File.separator + carbonDataFileName + CarbonCommonConstants.FILE_INPROGRESS_STATUS;
    ```
    # After
    We pass carbonDataFileName, so no substring is needed.
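
The before/after behaviour described above can be sketched in isolation. This is a simplified illustration rather than the writer code itself: the class `InProgressNameSketch`, its two methods, the store location, and the sample file name are hypothetical, and `IN_PROGRESS_SUFFIX` stands in for `CarbonCommonConstants.FILE_INPROGRESS_STATUS`, with the `.inprogress` value taken from the comment above.

```java
import java.io.File;

// Hypothetical illustration; not the CarbonData writer classes.
final class InProgressNameSketch {

    // Stand-in for CarbonCommonConstants.FILE_INPROGRESS_STATUS (value assumed from the comment).
    static final String IN_PROGRESS_SUFFIX = ".inprogress";

    // "Before": the full in-progress path is passed in and everything
    // from the last '.' onward is dropped, leaving a path in the index entry.
    static String indexEntryBefore(String inProgressPath) {
        return inProgressPath.substring(0, inProgressPath.lastIndexOf('.'));
    }

    // "After": the bare carbondata file name is passed in and stored unchanged.
    static String indexEntryAfter(String carbonDataFileName) {
        return carbonDataFileName;
    }

    public static void main(String[] args) {
        String storeLocation = "/tmp/scratch";             // made-up temp dir
        String carbonDataFileName = "part-0-0.carbondata"; // made-up file name
        String inProgressPath =
            storeLocation + File.separator + carbonDataFileName + IN_PROGRESS_SUFFIX;

        System.out.println(indexEntryBefore(inProgressPath));
        System.out.println(indexEntryAfter(carbonDataFileName));
    }
}
```

In the "before" form the index entry still carries the executor's temp directory; in the "after" form it is just the file name.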


---

[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user watermen commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
 
    @chenliang613 Thanks for your review, please review it again.


---

[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
 
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1345/



---

[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user chenliang613 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
 
    retest this please


---