[GitHub] incubator-carbondata pull request #852: [WIP] Refactory writer step and add ...

classic Classic list List threaded Threaded
17 messages Options
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata pull request #852: [WIP] Refactory writer step and add ...

qiuchenjian-2
GitHub user jackylk opened a pull request:

    https://github.com/apache/incubator-carbondata/pull/852

    [WIP] Refactory writer step and add ColumnPage

    Add ColumnPage and simplify writer step in data loading

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jackylk/incubator-carbondata page-intf

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/852.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #852
   
----
commit db45d7e9ebed2a1e9e2c5c523a4e9a96665194b8
Author: jackylk <[hidden email]>
Date:   2017-04-19T16:08:46Z

    refactory FactDataHandlerColumnar

commit 575e6c96ac9fb3a63a570517a93adbdf3d34d445
Author: jackylk <[hidden email]>
Date:   2017-04-26T01:29:40Z

    change write to use ColumnPage

commit ad40f763e470ea7ac39bd3e59915fa625feb1c10
Author: jackylk <[hidden email]>
Date:   2017-04-26T01:36:12Z

    fix testcase

commit 6a4e9748f075b04f774bf6ff460054ed0bde74e6
Author: jackylk <[hidden email]>
Date:   2017-04-26T03:00:52Z

    fix testcase

commit 2e82c7ec2100ee274112f6417a843e3a3a39e232
Author: jackylk <[hidden email]>
Date:   2017-04-26T03:26:15Z

    fix testcase

commit 3db32713d0ef89f40829b901c0681f48e0ae5d22
Author: jackylk <[hidden email]>
Date:   2017-04-26T03:40:36Z

    fix testcase

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata issue #852: [WIP] Refactory write step and add ColumnPa...

qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/852
 
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1800/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata issue #852: [WIP] Refactory write step and add ColumnPa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/852
 
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1857/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata issue #852: [WIP] Refactory write step and add ColumnPa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/852
 
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1860/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata issue #852: [WIP] Refactory write step and add ColumnPa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/852
 
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1864/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata issue #852: [WIP] Refactory write step and add ColumnPa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/852
 
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1871/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata issue #852: [WIP] Refactory write step and add ColumnPa...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/852
 
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1872/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata pull request #852: [CARBONDATA-1015] Refactory write st...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/852#discussion_r114752073
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/FixLengthColumnPage.java ---
    @@ -0,0 +1,155 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.core.datastore.page;
    +
    +import java.math.BigDecimal;
    +import java.nio.ByteBuffer;
    +import java.util.BitSet;
    +
    +import org.apache.carbondata.core.constants.CarbonCommonConstants;
    +import org.apache.carbondata.core.metadata.datatype.DataType;
    +import org.apache.carbondata.core.util.DataTypeUtil;
    +
    +// Represent a columnar data in one page for one column.
    +public class FixLengthColumnPage extends ColumnPage {
    +
    +  // Only one of following fields will be used
    +  private byte[] byteData;
    +  private short[] shortData;
    +  private int[] intData;
    +  private long[] longData;
    +  private double[] doubleData;
    +
    +  private byte[][] byteArrayData;
    --- End diff --
   
    why fix length page has two dimensional array?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata pull request #852: [CARBONDATA-1015] Refactory write st...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/852#discussion_r114752587
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/statistics/PageStatistics.java ---
    @@ -0,0 +1,124 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.core.datastore.page.statistics;
    +
    +import java.math.BigDecimal;
    +
    +import org.apache.carbondata.core.metadata.datatype.DataType;
    +import org.apache.carbondata.core.util.DataTypeUtil;
    +
    +/** statics for one column page */
    +public class PageStatistics {
    +  private DataType dataType;
    +
    +  /** min and max value of the measures */
    +  private Object min, max;
    --- End diff --
   
    what about min/max for no dictionary column and dictionary column?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata pull request #852: [CARBONDATA-1015] Refactory write st...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/852#discussion_r114752909
 
    --- Diff: core/src/test/java/org/apache/carbondata/core/util/CarbonMetadataUtilTest.java ---
    @@ -169,7 +171,12 @@
         long[] longArr = { 1, 2, 3, 4, 5 };
         byte[][] maxByteArr = { { 1, 2 }, { 3, 4 }, { 5, 6 }, { 2, 4 }, { 1, 2 } };
         int[] cardinality = { 1, 2, 3, 4, 5 };
    -    char[] charArr = { 'a', 's', 'd', 'g', 'h' };
    +    org.apache.carbondata.core.metadata.datatype.DataType[] charArr = {
    --- End diff --
   
    better rename to ```dataTypes```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata issue #852: [CARBONDATA-1015] Refactory write step and ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user ravipesala commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/852
 
    @jackylk Still wiring is not done for columnpage to writer step right? are you planning to do in this PR only?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata pull request #852: [CARBONDATA-1015] Refactory write st...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/852#discussion_r114771512
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/FixLengthColumnPage.java ---
    @@ -0,0 +1,155 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.core.datastore.page;
    +
    +import java.math.BigDecimal;
    +import java.nio.ByteBuffer;
    +import java.util.BitSet;
    +
    +import org.apache.carbondata.core.constants.CarbonCommonConstants;
    +import org.apache.carbondata.core.metadata.datatype.DataType;
    +import org.apache.carbondata.core.util.DataTypeUtil;
    +
    +// Represent a columnar data in one page for one column.
    +public class FixLengthColumnPage extends ColumnPage {
    +
    +  // Only one of following fields will be used
    +  private byte[] byteData;
    +  private short[] shortData;
    +  private int[] intData;
    +  private long[] longData;
    +  private double[] doubleData;
    +
    +  private byte[][] byteArrayData;
    --- End diff --
   
    It is for decimal data type. I will add comment for it.
    I could not think of a better name for this class, could you suggest one?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata pull request #852: [CARBONDATA-1015] Refactory write st...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/852#discussion_r114771917
 
    --- Diff: core/src/test/java/org/apache/carbondata/core/util/CarbonMetadataUtilTest.java ---
    @@ -169,7 +171,12 @@
         long[] longArr = { 1, 2, 3, 4, 5 };
         byte[][] maxByteArr = { { 1, 2 }, { 3, 4 }, { 5, 6 }, { 2, 4 }, { 1, 2 } };
         int[] cardinality = { 1, 2, 3, 4, 5 };
    -    char[] charArr = { 'a', 's', 'd', 'g', 'h' };
    +    org.apache.carbondata.core.metadata.datatype.DataType[] charArr = {
    --- End diff --
   
    ok


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata pull request #852: [CARBONDATA-1015] Refactory write st...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/852#discussion_r114774020
 
    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/statistics/PageStatistics.java ---
    @@ -0,0 +1,124 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.core.datastore.page.statistics;
    +
    +import java.math.BigDecimal;
    +
    +import org.apache.carbondata.core.metadata.datatype.DataType;
    +import org.apache.carbondata.core.util.DataTypeUtil;
    +
    +/** statics for one column page */
    +public class PageStatistics {
    +  private DataType dataType;
    +
    +  /** min and max value of the measures */
    +  private Object min, max;
    --- End diff --
   
    The min and max for no dictionary and dictionary columns actually are get from `BlockIndexerStorageForInt` and `BlockIndexerStorageForShort` after each column is sorted. So I am not updating them here. Make sense?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata issue #852: [CARBONDATA-1015] Refactory write step and ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user jackylk commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/852
 
    I have not changed Writer interface, so writer is not using ColumnPage yet.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata issue #852: [CARBONDATA-1015] Refactory write step and ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/852
 
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1891/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---
Reply | Threaded
Open this post in threaded view
|

[GitHub] incubator-carbondata issue #852: [CARBONDATA-1015] Refactory write step and ...

qiuchenjian-2
In reply to this post by qiuchenjian-2
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/852
 
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1893/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---