[GitHub] [carbondata] MarvinLitt opened a new pull request #3521: [doc_zh_cn] add sybase iq and carbon data query performance comparison doc chines…


[GitHub] [carbondata] MarvinLitt opened a new pull request #3521: [doc_zh_cn] add sybase iq and carbon data query performance comparison doc chines…

GitBox
MarvinLitt opened a new pull request #3521: [doc_zh_cn] add sybase iq and carbon data query performance comparison doc chines…
URL: https://github.com/apache/carbondata/pull/3521
 
 
   …e doc to carbondata
   As discussed with likun, add the Chinese doc file path.
   
   Be sure to do all of the following checklist to help us incorporate
   your contribution quickly and easily:
   
    - [ ] Any interfaces changed?
   
    - [ ] Any backward compatibility impacted?
   
    - [ ] Document update required?
   
    - [ ] Testing done
           Please provide details on
           - Whether new unit test cases have been added or why no new tests are required?
           - How it is tested? Please attach test report.
           - Is it a performance related change? Please attach the performance test report.
           - Any additional information to help reviewers in testing this change.
         
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3521: [doc_zh_cn] add sybase iq and carbon data query performance comparison doc chines…

GitBox
CarbonDataQA1 commented on issue #3521: [doc_zh_cn] add sybase iq and carbon data query performance comparison doc chines…
URL: https://github.com/apache/carbondata/pull/3521#issuecomment-567864541
 
 
   Build Success with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/1227/
   


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3521: [doc_zh_cn] add sybase iq and carbon data query performance comparison doc chines…

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3521: [doc_zh_cn] add sybase iq and carbon data query performance comparison doc chines…
URL: https://github.com/apache/carbondata/pull/3521#issuecomment-567934963
 
 
   Build Success with Spark 2.2.1, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/1237/
   


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3521: [doc_zh_cn] add sybase iq and carbon data query performance comparison doc chines…

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3521: [doc_zh_cn] add sybase iq and carbon data query performance comparison doc chines…
URL: https://github.com/apache/carbondata/pull/3521#issuecomment-567938154
 
 
   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1246/
   


[GitHub] [carbondata] jackylk commented on a change in pull request #3521: [doc_zh_cn] add sybase iq and carbon data query performance comparison doc chines…

GitBox
In reply to this post by GitBox
jackylk commented on a change in pull request #3521: [doc_zh_cn] add sybase iq and carbon data query performance comparison doc chines…
URL: https://github.com/apache/carbondata/pull/3521#discussion_r360651523
 
 

 ##########
 File path: docs/zh_cn/SybaseIQ和CarbonData查询性能对比.md
 ##########
 @@ -0,0 +1,109 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to you under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+    
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+-->
+
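+## Query performance comparison: replacing Sybase IQ with Carbondata
+
+This document shows users how Carbondata's query performance compares with Sybase IQ when replacing Sybase IQ, along with Carbondata's own strengths and characteristics. The data here comes only from SQL queries modeled on the query patterns of one particular domain, and represents the performance comparison only for those query patterns.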
+
+
+
+
+
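+## 1. Cluster comparison
+
+| Cluster        | Description                                                                      |
+| -------------- | -------------------------------------------------------------------------------- |
+| IQ cluster     | 1 load node, 1 coordinator node, 1 query node, SSD disks, disk array             |
+| Hadoop cluster | 2 namenodes, 6 datanodes, SATA disks, query queue allocated 1/6 of the resources |
+
+## 2. Query SQL models
+
+The SQL for IQ and Carbon differs, so the SQL has to be modified before running the performance test.
+
+```IQ query SQL model:```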
+
+SELECT TOP 5000 SUM(COALESCE(COLUMN_A, 0)) + SUM(COALESCE(COLUMN_B, 0)) AS COLUMN_C , SUM(COALESCE(COLUMN_A, 0)) AS COLUMN_A_A , SUM(COALESCE(COLUMN_B, 0)) AS COLUMN_B_B , SUM(COALESCE(COLUMN_D, 0)) + SUM(COALESCE(COLUMN_E, 0)) AS COLUMN_F , SUM(COALESCE(COLUMN_D, 0)) AS COLUMN_D_D , SUM(COALESCE(COLUMN_E, 0)) AS COLUMN_E_E , (SUM(COALESCE(COLUMN_A, 0)) + SUM(COALESCE(COLUMN_B, 0))) * 8 / 72000 / 1024 AS COLUMN_F , SUM(COALESCE(COLUMN_A, 0)) * 8 / 72000 / 1024 AS COLUMN_G , SUM(COALESCE(COLUMN_B, 0)) * 8 / 72000 / 1024 AS COLUMN_H , MT."202080101" AS "202080101", COUNT(1) OVER () AS countNum FROM ( SELECT COALESCE(SUM("COLUMN_1_A"), 0) AS COLUMN_A , COALESCE(SUM("COLUMN_1_B"), 0) AS COLUMN_B , COALESCE(SUM("COLUMN_1_E"), 0) AS COLUMN_E , COALESCE(SUM("COLUMN_1_D"), 0) AS COLUMN_D , TABLE_A."202080101" AS "202080101" FROM TABLE_B LEFT JOIN ( SELECT "COLUMN_CSI" AS "202050101" , CASE WHEN "TYPE_ID" = 2 THEN "COLUMN_CSI" END AS "202080101" , CASE WHEN "TYPE_ID" = 2 THEN "CLOUMN_NAME" END AS NAME_202080101 FROM DIMENSION_TABLE GROUP BY "COLUMN_CSI", CASE WHEN "TYPE_ID" = 2 THEN "COLUMN_CSI" END, CASE WHEN "TYPE_ID" = 2 THEN "CLOUMN_NAME" END ) TABLE_A ON "COLUMN_CSI" = TABLE_A."202050101" WHERE TABLE_A.NAME_202080101 IS NOT NULL AND "TIME" < 1576087200 AND "TIME" >= 1576015200 GROUP BY TABLE_A."202080101" ) MT GROUP BY MT."202080101" ORDER BY COLUMN_C DESC
+
+Each SUM is referred to below as one counter.
+
+```Spark query SQL model:```
+
+SELECT COALESCE(SUM(COLUMN_A), 0) + COALESCE(SUM(COLUMN_B), 0) AS COLUMN_C , COALESCE(SUM(COLUMN_A), 0) AS COLUMN_A_A , COALESCE(SUM(COLUMN_B), 0) AS COLUMN_B_B , COALESCE(SUM(COLUMN_D), 0) + COALESCE(SUM(COLUMN_E), 0) AS COLUMN_F , COALESCE(SUM(COLUMN_D), 0) AS COLUMN_D_D , COALESCE(SUM(COLUMN_E), 0) AS COLUMN_E_E , (COALESCE(SUM(COLUMN_A), 0) + COALESCE(SUM(COLUMN_B), 0)) * 8 / 72000 / 1024 AS COLUMN_F , COALESCE(SUM(COLUMN_A), 0) * 8 / 72000 / 1024 AS COLUMN_G , COALESCE(SUM(COLUMN_B), 0) * 8 / 72000 / 1024 AS COLUMN_H , MT.`202080101` AS `202080101` FROM ( SELECT `COLUMN_1_A` AS COLUMN_A, `COLUMN_1_E` AS COLUMN_E, `COLUMN_1_B` AS COLUMN_B, `COLUMN_1_D` AS COLUMN_D, TABLE_A.`202080101` AS `202080101` FROM TABLE_B LEFT JOIN ( SELECT `COLUMN_CSI` AS `202050101` , CASE WHEN `TYPE_ID` = 2 THEN `COLUMN_CSI` END AS `202080101` , CASE WHEN `TYPE_ID` = 2 THEN `COLUMN_NAME` END AS NAME_202080101 FROM DIMENSION_TABLE GROUP BY `COLUMN_CSI`, CASE WHEN `TYPE_ID` = 2 THEN `COLUMN_CSI` END, CASE WHEN `TYPE_ID` = 2 THEN `COLUMN_NAME` END ) TABLE_A ON `COLUMN_CSI` = TABLE_A.`202050101` WHERE TABLE_A.NAME_202080101 IS NOT NULL AND `TIME` >= 1576015200 AND `TIME` < 1576087200 ) MT GROUP BY MT.`202080101` ORDER BY COLUMN_C DESC LIMIT 5000
+
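+## 3. Main Carbon configuration parameters
+
+```Main configuration```
+
+| Main Carbon configuration            | Value  | Description                                                  |
+| ------------------------------------ | ------ | ------------------------------------------------------------ |
+| carbon.inmemory.record.size          | 480000 | Total number of rows per table to load into memory for a query. |
+| carbon.number.of.cores               | 4      | Number of threads scanning in parallel during a carbon query. |
+| carbon.number.of.cores.while.loading | 15     | Number of threads scanning in parallel during carbon data loading. |
+| carbon.sort.file.buffer.size         | 20     | Total buffer size used to store each temporary intermediate file during merge sort (read/write). Unit: MB |
+| carbon.sort.size                     | 500000 | Number of records sorted at a time during data loading.      |
+| Main Spark configuration             |        |                                                              |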
+| spark.sql.shuffle.partitions         | 70     |                                                              |
+| spark.executor.instances             | 6      |                                                              |
+| spark.executor.cores                 | 13     |                                                              |
+| spark.locality.wait                  | 0      |                                                              |
+| spark.executor.memory                | 5G     |                                                              |
+| spark.driver.cores                   | 3      |                                                              |
+| spark.driver.memory                  | 50G    |                                                              |
+| spark.sql.codegen.wholeStage         | True   |                                                              |
+| spark.sql.codegen.hugeMethodLimit    | 8000   |                                                              |
+
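+## 4. Query performance comparison at different data scales:
+
+The carbondata timings are averages over multiple runs, excluding the longer first query; the first-query time is currently being optimized.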
 
 Review comment:
   Please modify this
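
The measurement approach described in section 4 of the quoted document (average over several runs, with the slower first query left out) could be sketched roughly as follows. This is an illustrative sketch only: `time_query`, `warm_average`, and the `run_query` callable are hypothetical names, not part of the PR or of CarbonData.

```python
import statistics
import time
from typing import Callable, List


def time_query(run_query: Callable[[], object], runs: int = 5) -> List[float]:
    """Run the query `runs` times and return each elapsed time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()  # hypothetical callable that submits the SQL and waits for the result
        timings.append(time.perf_counter() - start)
    return timings


def warm_average(timings: List[float]) -> float:
    """Average the timings after dropping the first (cold) run."""
    if len(timings) < 2:
        raise ValueError("need at least two runs to exclude the first one")
    return statistics.mean(timings[1:])
```

Reporting the cold first-run time next to the warm average would let readers see both figures.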


[GitHub] [carbondata] jackylk commented on a change in pull request #3521: [doc_zh_cn] add sybase iq and carbon data query performance comparison doc chines…

GitBox
In reply to this post by GitBox
jackylk commented on a change in pull request #3521: [doc_zh_cn] add sybase iq and carbon data query performance comparison doc chines…
URL: https://github.com/apache/carbondata/pull/3521#discussion_r360651612
 
 

 ##########
 File path: docs/zh_cn/SybaseIQ和CarbonData查询性能对比.md
 ##########
 @@ -0,0 +1,109 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to you under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+    
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+-->
+
+## Query performance comparison: replacing Sybase IQ with Carbondata
 
 Review comment:
   I think we can just say performance comparison between CarbonData and Columnar Database


[GitHub] [carbondata] MarvinLitt commented on a change in pull request #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata

GitBox
In reply to this post by GitBox
MarvinLitt commented on a change in pull request #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata
URL: https://github.com/apache/carbondata/pull/3521#discussion_r360832593
 
 

 ##########
 File path: docs/zh_cn/SybaseIQ和CarbonData查询性能对比.md
 ##########
 @@ -0,0 +1,109 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to you under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+    
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+-->
+
+## Query performance comparison: replacing Sybase IQ with Carbondata
 
 Review comment:
   done


[GitHub] [carbondata] MarvinLitt commented on a change in pull request #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata

GitBox
In reply to this post by GitBox
MarvinLitt commented on a change in pull request #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata
URL: https://github.com/apache/carbondata/pull/3521#discussion_r360832602
 
 

 ##########
 File path: docs/zh_cn/SybaseIQ和CarbonData查询性能对比.md
 ##########
 @@ -0,0 +1,109 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to you under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+    
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+-->
+
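+## Query performance comparison: replacing Sybase IQ with Carbondata
+
+This document shows users how Carbondata's query performance compares with Sybase IQ when replacing Sybase IQ, along with Carbondata's own strengths and characteristics. The data here comes only from SQL queries modeled on the query patterns of one particular domain, and represents the performance comparison only for those query patterns.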
+
+
+
+
+
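+## 1. Cluster comparison
+
+| Cluster        | Description                                                                      |
+| -------------- | -------------------------------------------------------------------------------- |
+| IQ cluster     | 1 load node, 1 coordinator node, 1 query node, SSD disks, disk array             |
+| Hadoop cluster | 2 namenodes, 6 datanodes, SATA disks, query queue allocated 1/6 of the resources |
+
+## 2. Query SQL models
+
+The SQL for IQ and Carbon differs, so the SQL has to be modified before running the performance test.
+
+```IQ query SQL model:```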
+
+SELECT TOP 5000 SUM(COALESCE(COLUMN_A, 0)) + SUM(COALESCE(COLUMN_B, 0)) AS COLUMN_C , SUM(COALESCE(COLUMN_A, 0)) AS COLUMN_A_A , SUM(COALESCE(COLUMN_B, 0)) AS COLUMN_B_B , SUM(COALESCE(COLUMN_D, 0)) + SUM(COALESCE(COLUMN_E, 0)) AS COLUMN_F , SUM(COALESCE(COLUMN_D, 0)) AS COLUMN_D_D , SUM(COALESCE(COLUMN_E, 0)) AS COLUMN_E_E , (SUM(COALESCE(COLUMN_A, 0)) + SUM(COALESCE(COLUMN_B, 0))) * 8 / 72000 / 1024 AS COLUMN_F , SUM(COALESCE(COLUMN_A, 0)) * 8 / 72000 / 1024 AS COLUMN_G , SUM(COALESCE(COLUMN_B, 0)) * 8 / 72000 / 1024 AS COLUMN_H , MT."202080101" AS "202080101", COUNT(1) OVER () AS countNum FROM ( SELECT COALESCE(SUM("COLUMN_1_A"), 0) AS COLUMN_A , COALESCE(SUM("COLUMN_1_B"), 0) AS COLUMN_B , COALESCE(SUM("COLUMN_1_E"), 0) AS COLUMN_E , COALESCE(SUM("COLUMN_1_D"), 0) AS COLUMN_D , TABLE_A."202080101" AS "202080101" FROM TABLE_B LEFT JOIN ( SELECT "COLUMN_CSI" AS "202050101" , CASE WHEN "TYPE_ID" = 2 THEN "COLUMN_CSI" END AS "202080101" , CASE WHEN "TYPE_ID" = 2 THEN "CLOUMN_NAME" END AS NAME_202080101 FROM DIMENSION_TABLE GROUP BY "COLUMN_CSI", CASE WHEN "TYPE_ID" = 2 THEN "COLUMN_CSI" END, CASE WHEN "TYPE_ID" = 2 THEN "CLOUMN_NAME" END ) TABLE_A ON "COLUMN_CSI" = TABLE_A."202050101" WHERE TABLE_A.NAME_202080101 IS NOT NULL AND "TIME" < 1576087200 AND "TIME" >= 1576015200 GROUP BY TABLE_A."202080101" ) MT GROUP BY MT."202080101" ORDER BY COLUMN_C DESC
+
+Each SUM is referred to below as one counter.
+
+```Spark query SQL model:```
+
+SELECT COALESCE(SUM(COLUMN_A), 0) + COALESCE(SUM(COLUMN_B), 0) AS COLUMN_C , COALESCE(SUM(COLUMN_A), 0) AS COLUMN_A_A , COALESCE(SUM(COLUMN_B), 0) AS COLUMN_B_B , COALESCE(SUM(COLUMN_D), 0) + COALESCE(SUM(COLUMN_E), 0) AS COLUMN_F , COALESCE(SUM(COLUMN_D), 0) AS COLUMN_D_D , COALESCE(SUM(COLUMN_E), 0) AS COLUMN_E_E , (COALESCE(SUM(COLUMN_A), 0) + COALESCE(SUM(COLUMN_B), 0)) * 8 / 72000 / 1024 AS COLUMN_F , COALESCE(SUM(COLUMN_A), 0) * 8 / 72000 / 1024 AS COLUMN_G , COALESCE(SUM(COLUMN_B), 0) * 8 / 72000 / 1024 AS COLUMN_H , MT.`202080101` AS `202080101` FROM ( SELECT `COLUMN_1_A` AS COLUMN_A, `COLUMN_1_E` AS COLUMN_E, `COLUMN_1_B` AS COLUMN_B, `COLUMN_1_D` AS COLUMN_D, TABLE_A.`202080101` AS `202080101` FROM TABLE_B LEFT JOIN ( SELECT `COLUMN_CSI` AS `202050101` , CASE WHEN `TYPE_ID` = 2 THEN `COLUMN_CSI` END AS `202080101` , CASE WHEN `TYPE_ID` = 2 THEN `COLUMN_NAME` END AS NAME_202080101 FROM DIMENSION_TABLE GROUP BY `COLUMN_CSI`, CASE WHEN `TYPE_ID` = 2 THEN `COLUMN_CSI` END, CASE WHEN `TYPE_ID` = 2 THEN `COLUMN_NAME` END ) TABLE_A ON `COLUMN_CSI` = TABLE_A.`202050101` WHERE TABLE_A.NAME_202080101 IS NOT NULL AND `TIME` >= 1576015200 AND `TIME` < 1576087200 ) MT GROUP BY MT.`202080101` ORDER BY COLUMN_C DESC LIMIT 5000
+
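+## 3. Main Carbon configuration parameters
+
+```Main configuration```
+
+| Main Carbon configuration            | Value  | Description                                                  |
+| ------------------------------------ | ------ | ------------------------------------------------------------ |
+| carbon.inmemory.record.size          | 480000 | Total number of rows per table to load into memory for a query. |
+| carbon.number.of.cores               | 4      | Number of threads scanning in parallel during a carbon query. |
+| carbon.number.of.cores.while.loading | 15     | Number of threads scanning in parallel during carbon data loading. |
+| carbon.sort.file.buffer.size         | 20     | Total buffer size used to store each temporary intermediate file during merge sort (read/write). Unit: MB |
+| carbon.sort.size                     | 500000 | Number of records sorted at a time during data loading.      |
+| Main Spark configuration             |        |                                                              |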
+| spark.sql.shuffle.partitions         | 70     |                                                              |
+| spark.executor.instances             | 6      |                                                              |
+| spark.executor.cores                 | 13     |                                                              |
+| spark.locality.wait                  | 0      |                                                              |
+| spark.executor.memory                | 5G     |                                                              |
+| spark.driver.cores                   | 3      |                                                              |
+| spark.driver.memory                  | 50G    |                                                              |
+| spark.sql.codegen.wholeStage         | True   |                                                              |
+| spark.sql.codegen.hugeMethodLimit    | 8000   |                                                              |
+
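+## 4. Query performance comparison at different data scales:
+
+The carbondata timings are averages over multiple runs, excluding the longer first query; the first-query time is currently being optimized.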
 
 Review comment:
   okay, done.
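
Two of the purely syntactic differences between the IQ and Spark query models quoted above (`SELECT TOP n` versus a trailing `LIMIT n`, and double-quoted versus backtick-quoted identifiers) could be rewritten mechanically, as in the sketch below. The helper name `iq_to_spark_sql` is hypothetical and not part of the PR. The sketch assumes a single statement that starts with `SELECT TOP n`, and it does not reproduce the other changes visible in the quoted queries, such as dropping `COUNT(1) OVER ()` or moving the aggregation out of the inner subquery.

```python
import re


def iq_to_spark_sql(sql: str) -> str:
    """Apply two syntactic rewrites: TOP n -> LIMIT n, and "id" -> `id`."""
    # Swap double-quoted identifiers for backtick-quoted ones.
    sql = re.sub(r'"([^"]+)"', r'`\1`', sql)
    # Move a leading `SELECT TOP n` to a trailing `LIMIT n`.
    match = re.match(r'\s*SELECT\s+TOP\s+(\d+)\s+', sql, flags=re.IGNORECASE)
    if match:
        sql = "SELECT " + sql[match.end():].rstrip() + " LIMIT " + match.group(1)
    return sql
```

A statement without a leading `TOP` clause only has its identifier quoting changed.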


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata
URL: https://github.com/apache/carbondata/pull/3521#issuecomment-568431434
 
 
   Build Success with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/1254/
   


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata
URL: https://github.com/apache/carbondata/pull/3521#issuecomment-568454964
 
 
   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1275/
   


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata
URL: https://github.com/apache/carbondata/pull/3521#issuecomment-568505672
 
 
   Build Success with Spark 2.2.1, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/1264/
   


[GitHub] [carbondata] jackylk commented on a change in pull request #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata

GitBox
In reply to this post by GitBox
jackylk commented on a change in pull request #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata
URL: https://github.com/apache/carbondata/pull/3521#discussion_r361106581
 
 

 ##########
 File path: docs/zh_cn/某商业列存DB和CarbonData查询性能对比.md
 ##########
 @@ -0,0 +1,109 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to you under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+    
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+-->
+
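+## Query performance comparison: replacing a commercial columnar DB with Carbondata
+
+This document shows users the query performance improvement Carbondata delivers over a commercial columnar DB when replacing it, along with Carbondata's own strengths and characteristics. The data here comes only from SQL queries modeled on the query patterns of one particular domain, and represents the performance comparison only for those query patterns.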
+
+
+
+
+
+## 1. Cluster comparison
+
+| Cluster                        | Description                                               |
+| ------------------------------ | --------------------------------------------------------- |
+| Commercial columnar DB cluster | 1 load node, 1 coordinator node, 1 query node, SSD disks  |
 
 Review comment:
   Please describe how many CPU cores there are in both environments.


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata
URL: https://github.com/apache/carbondata/pull/3521#issuecomment-568712325
 
 
   Build Success with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/1263/
   


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata
URL: https://github.com/apache/carbondata/pull/3521#issuecomment-568724810
 
 
   Build Success with Spark 2.2.1, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.2/1273/
   


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata

GitBox
In reply to this post by GitBox
CarbonDataQA1 commented on issue #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata
URL: https://github.com/apache/carbondata/pull/3521#issuecomment-568725395
 
 
   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1284/
   


[GitHub] [carbondata] MarvinLitt commented on a change in pull request #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata

GitBox
In reply to this post by GitBox
MarvinLitt commented on a change in pull request #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata
URL: https://github.com/apache/carbondata/pull/3521#discussion_r361239075
 
 

 ##########
 File path: docs/zh_cn/某商业列存DB和CarbonData查询性能对比.md
 ##########
 @@ -0,0 +1,109 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to you under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+    
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+-->
+
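+## Query performance comparison: replacing a commercial columnar DB with Carbondata
+
+This document shows users the query performance improvement Carbondata delivers over a commercial columnar DB when replacing it, along with Carbondata's own strengths and characteristics. The data here comes only from SQL queries modeled on the query patterns of one particular domain, and represents the performance comparison only for those query patterns.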
+
+
+
+
+
+## 1. Cluster comparison
+
+| Cluster                        | Description                                               |
+| ------------------------------ | --------------------------------------------------------- |
+| Commercial columnar DB cluster | 1 load node, 1 coordinator node, 1 query node, SSD disks  |
 
 Review comment:
   done


[GitHub] [carbondata] jackylk commented on a change in pull request #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata

GitBox
URL: https://github.com/apache/carbondata/pull/3521#discussion_r361771821
 
 

 ##########
 File path: docs/zh_cn/某商业列存DB和CarbonData查询性能对比.md
 ##########
 @@ -0,0 +1,111 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to you under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+    
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+-->
+
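+## Query performance comparison: CarbonData replacing a commercial columnar DB
+
+This document shows users the query performance gains CarbonData achieved when it replaced a commercial columnar DB, along with CarbonData's own strengths and characteristics. The data here comes only from SQL queries that follow the query patterns of one particular domain, so it represents the performance comparison only under those query patterns.
+
+
+
+
+
+## 1. Cluster status comparison
+
+| Cluster                        | Description                                                                      |
+| ------------------------------ | -------------------------------------------------------------------------------- |
+| Commercial columnar DB cluster | 3 nodes, SSD disks                                                               |
+| Hadoop cluster                 | 2 namenodes, 6 datanodes, STAT disks, query queue allocated 1/6 of the resources |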
 
 Review comment:
   Are these two clusters using the same resources?


[GitHub] [carbondata] jackylk commented on a change in pull request #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata

GitBox
URL: https://github.com/apache/carbondata/pull/3521#discussion_r361772009
 
 

 ##########
 File path: docs/zh_cn/某商业列存DB和CarbonData查询性能对比.md
 ##########
 @@ -0,0 +1,111 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to you under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+    
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+-->
+
+## Query performance comparison: CarbonData replacing a commercial columnar DB
 
 Review comment:
   ```suggestion
   ## Query performance comparison between CarbonData and a commercial columnar DB
   ```


[GitHub] [carbondata] jackylk commented on a change in pull request #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata

GitBox
URL: https://github.com/apache/carbondata/pull/3521#discussion_r361772063
 
 

 ##########
 File path: docs/zh_cn/某商业列存DB和CarbonData查询性能对比.md
 ##########
 @@ -0,0 +1,111 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to you under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+    
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+-->
+
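+## Query performance comparison: CarbonData replacing a commercial columnar DB
+
+This document shows users the query performance gains CarbonData achieved when it replaced a commercial columnar DB, along with CarbonData's own strengths and characteristics. The data here comes only from SQL queries that follow the query patterns of one particular domain, so it represents the performance comparison only under those query patterns.
+
+
+
+
+
+## 1. Cluster status comparison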
 
 Review comment:
   ```suggestion
   ## 1. Test cluster
   ```


[GitHub] [carbondata] jackylk commented on a change in pull request #3521: [doc_zh_cn] add a commercial inventory DB and carbon data query performance comparison doc chinese doc to carbondata

GitBox
URL: https://github.com/apache/carbondata/pull/3521#discussion_r361772063
 
 

 ##########
 File path: docs/zh_cn/某商业列存DB和CarbonData查询性能对比.md
 ##########
 @@ -0,0 +1,111 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to you under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+    
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+-->
+
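+## Query performance comparison: CarbonData replacing a commercial columnar DB
+
+This document shows users the query performance gains CarbonData achieved when it replaced a commercial columnar DB, along with CarbonData's own strengths and characteristics. The data here comes only from SQL queries that follow the query patterns of one particular domain, so it represents the performance comparison only under those query patterns.
+
+
+
+
+
+## 1. Cluster status comparison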
 
 Review comment:
   ```suggestion
   ## 1. Test environment
   ```
