[GitHub] carbondata pull request #2788: [Documentation] Readme updated with latest topics and new TOC

GitHub user sgururajshetty opened a pull request:

    https://github.com/apache/carbondata/pull/2788

    [Documentation] Readme updated with latest topics and new TOC

    > Readme updated with the new structure
    > Formatting issue fixed
    > Review comments fixed

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sgururajshetty/carbondata doc_sept

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2788.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2788
   
----
commit c8bc47ec164e43736d6f5b39b7c883d2b11bd7f7
Author: sgururajshetty <sgururajshetty@...>
Date:   2018-09-28T13:43:08Z

    Readme updated with latest topics and new TOC
    Formatting issues fixed

----


---
Github user sgururajshetty commented on the issue:

    https://github.com/apache/carbondata/pull/2788
 
    @sraghunandan review


---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2788
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/638/



---
Github user sraghunandan commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2788#discussion_r221295892
 
    --- Diff: docs/carbon-as-spark-datasource-guide.md ---
    @@ -15,19 +15,20 @@
         limitations under the License.
     -->
     
    -# Carbon as Spark's datasource guide
    +# Carbon as Spark's Datasource
     
    -Carbon fileformat can be integrated to Spark using datasource to read and write data without using CarbonSession.
    +The Carbon fileformat is now integrated as Spark datasource for read and write operation without using CarbonSession. This is useful for the users who wants same Spark datasource.
    --- End diff --
   
    The sentence is not correct. This is useful for users who want to use CarbonData as a Spark datasource.
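
    As context for this comment, here is a minimal spark-shell style sketch of what
    "using CarbonData as a Spark datasource without CarbonSession" means in practice.
    It is an illustration only: the format name "carbon", the path, and the sample
    data are assumptions, not taken from the PR.

        import org.apache.spark.sql.SparkSession

        // Plain SparkSession; the CarbonData Spark integration jar is assumed
        // to be on the classpath so the datasource can be resolved.
        val spark = SparkSession.builder()
          .appName("carbon-datasource-sketch")
          .master("local[*]")
          .getOrCreate()
        import spark.implicits._

        // Write a small DataFrame in the CarbonData file format.
        val df = Seq((1, "alice"), (2, "bob")).toDF("id", "name")
        df.write.format("carbon").mode("overwrite").save("/tmp/carbon_ds_example")

        // Read it back through the same datasource, like any other Spark format.
        spark.read.format("carbon").load("/tmp/carbon_ds_example").show()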


---
Github user sraghunandan commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2788#discussion_r221298094
 
    --- Diff: docs/streaming-guide.md ---
    @@ -157,7 +157,7 @@ ALTER TABLE streaming_table SET TBLPROPERTIES('streaming'='true')
     At the begin of streaming ingestion, the system will try to acquire the table level lock of streaming.lock file. If the system isn't able to acquire the lock of this table, it will throw an InterruptedException.
     
     ## Create streaming segment
    -The input data of streaming will be ingested into a segment of the CarbonData table, the status of this segment is streaming. CarbonData call it a streaming segment. The "tablestatus" file will record the segment status and data size. The user can use “SHOW SEGMENTS FOR TABLE tableName” to check segment status.
    +The input data of streaming will be ingested into a segment of the CarbonData table, the status of this segment is streaming. CarbonData call it a streaming segment. The "tablestatus" file will record the segment status and data size. The user can use "SHOW SEGMENTS FOR TABLE tableName" to check segment status.
    --- End diff --
   
    The streaming data will be ingested into a separate segment of the CarbonData table. This segment is termed a streaming segment. The status of this segment will be recorded as "streaming" in the "tablestatus" file along with its data size.
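
    As a quick, hedged illustration of the statements this comment is about (the
    ALTER TABLE syntax and "SHOW SEGMENTS FOR TABLE" are quoted from the guide
    itself; the table name streaming_table and the bare SparkSession setup are
    assumptions):

        import org.apache.spark.sql.SparkSession

        // Stands in for the CarbonData-enabled session the streaming guide sets up earlier.
        val spark = SparkSession.builder().getOrCreate()

        // Mark the table for streaming ingestion.
        spark.sql("ALTER TABLE streaming_table SET TBLPROPERTIES('streaming'='true')")

        // Lists the segments; the streaming segment appears with its status and
        // data size, which are read from the tablestatus file.
        spark.sql("SHOW SEGMENTS FOR TABLE streaming_table").show(false)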


---
Github user sraghunandan commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2788#discussion_r221295959
 
    --- Diff: docs/carbon-as-spark-datasource-guide.md ---
    @@ -15,19 +15,20 @@
         limitations under the License.
     -->
     
    -# Carbon as Spark's datasource guide
    +# Carbon as Spark's Datasource
    --- End diff --
   
    CarbonData


---
Github user sraghunandan commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2788#discussion_r221299726
 
    --- Diff: docs/carbon-as-spark-datasource-guide.md ---
    @@ -44,22 +45,20 @@ Carbon table can be created with spark's datasource DDL syntax as follows.
     | table_blocksize | 1024 | Size of blocks to write onto hdfs |
     | table_blocklet_size | 64 | Size of blocklet to write |
     | local_dictionary_threshold | 10000 | Cardinality upto which the local dictionary can be generated  |
    -| local_dictionary_enable | false | Enable local dictionary generation  |
    -| sort_columns | all dimensions are sorted | comma separated string columns which to include in sort and its order of sort |
    -| sort_scope | local_sort | Sort scope of the load.Options include no sort, local sort ,batch sort and global sort |
    -| long_string_columns | null | comma separated string columns which are more than 32k length |
    +| local_dictionary_enable | false | Enable local dictionary generation |
    +| sort_columns | all dimensions are sorted | Comma separated string columns which to include in sort and its order of sort |
    --- End diff --
   
    Int columns can also be included in sort_columns, so remove the word "string".
    Also fix the grammar of the sentence.


---
Github user sraghunandan commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2788#discussion_r221296016
 
    --- Diff: docs/carbon-as-spark-datasource-guide.md ---
    @@ -15,19 +15,20 @@
         limitations under the License.
     -->
     
    -# Carbon as Spark's datasource guide
    +# Carbon as Spark's Datasource
     
    -Carbon fileformat can be integrated to Spark using datasource to read and write data without using CarbonSession.
    +The Carbon fileformat is now integrated as Spark datasource for read and write operation without using CarbonSession. This is useful for the users who wants same Spark datasource.
    --- End diff --
   
    CarbonData


---
Github user sraghunandan commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2788#discussion_r221298928
 
    --- Diff: docs/carbon-as-spark-datasource-guide.md ---
    @@ -44,22 +45,20 @@ Carbon table can be created with spark's datasource DDL syntax as follows.
     | table_blocksize | 1024 | Size of blocks to write onto hdfs |
    --- End diff --
   
    Add a sentence that refers to the original definition and explanation, along with a link to it.


---
Github user sraghunandan commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2788#discussion_r221301565
 
    --- Diff: docs/carbon-as-spark-datasource-guide.md ---
    @@ -15,19 +15,20 @@
         limitations under the License.
     -->
    --- End diff --
   
    How does one navigate to this document from carbondata.apache.org/documentation.html?


---
Github user sraghunandan commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2788#discussion_r221296480
 
    --- Diff: docs/ddl-of-carbondata.md ---
    @@ -104,17 +104,18 @@ CarbonData DDL statements are documented here,which includes:
     
      Following are the guidelines for TBLPROPERTIES, CarbonData's additional table options can be set via carbon.properties.
     
    -   - ##### Dictionary Encoding Configuration
    +   - **Dictionary Encoding Configuration**
    --- End diff --
   
    These need to be headings so that links can be created to them. In fact, there are already links to these.


---
Github user sraghunandan commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2788#discussion_r221297098
 
    --- Diff: docs/ddl-of-carbondata.md ---
    @@ -104,17 +104,18 @@ CarbonData DDL statements are documented here,which includes:
     
      Following are the guidelines for TBLPROPERTIES, CarbonData's additional table options can be set via carbon.properties.
     
    -   - ##### Dictionary Encoding Configuration
    +   - **Dictionary Encoding Configuration**
    --- End diff --
   
    Handle these in all the other modified places as well.


---
Github user sraghunandan commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2788#discussion_r221300203
 
    --- Diff: docs/carbon-as-spark-datasource-guide.md ---
    @@ -44,22 +45,20 @@ Carbon table can be created with spark's datasource DDL syntax as follows.
     | table_blocksize | 1024 | Size of blocks to write onto hdfs |
     | table_blocklet_size | 64 | Size of blocklet to write |
     | local_dictionary_threshold | 10000 | Cardinality upto which the local dictionary can be generated  |
    -| local_dictionary_enable | false | Enable local dictionary generation  |
    -| sort_columns | all dimensions are sorted | comma separated string columns which to include in sort and its order of sort |
    -| sort_scope | local_sort | Sort scope of the load.Options include no sort, local sort ,batch sort and global sort |
    -| long_string_columns | null | comma separated string columns which are more than 32k length |
    +| local_dictionary_enable | false | Enable local dictionary generation |
    +| sort_columns | all dimensions are sorted | Comma separated string columns which to include in sort and its order of sort |
    +| sort_scope | local_sort | Sort scope of the load.Options include no sort, local sort, batch sort, and global sort |
    +| long_string_columns | null | Comma separated string columns which are more than 32k length |
    --- End diff --
   
    Not only string columns; char/varchar columns are also supported. Please refer to the 32k feature description to check the supported data types.
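
    To make the option table concrete, a hedged sketch of the datasource DDL the
    options apply to (spark-shell style; the table/column names and the exact
    USING/OPTIONS spelling are illustrative assumptions, while the option keys are
    the ones listed in the table above):

        import org.apache.spark.sql.SparkSession

        val spark = SparkSession.builder().getOrCreate()

        // sort_columns includes the int column "id" as well, per the review
        // comment above; long_string_columns takes string/char/varchar columns
        // whose values exceed 32k characters.
        spark.sql(
          """
            |CREATE TABLE IF NOT EXISTS carbon_sales (
            |  id INT,
            |  name STRING,
            |  remarks STRING
            |)
            |USING carbon
            |OPTIONS (
            |  'table_blocksize'='1024',
            |  'sort_columns'='id,name',
            |  'sort_scope'='local_sort',
            |  'long_string_columns'='remarks'
            |)
          """.stripMargin)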


---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2788
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/834/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2788
 
    Build Failed  with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8903/



---
Github user kunal642 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2788#discussion_r222181198
 
    --- Diff: README.md ---
    @@ -45,23 +45,26 @@ CarbonData file format is a columnar store in HDFS, it has many features that a
     CarbonData is built using Apache Maven, to [build CarbonData](https://github.com/apache/carbondata/blob/master/build)
     
     ## Online Documentation
    --- End diff --
   
    Please remove 'Alpha Feature' tag from s3-guide.md.


---
Github user sgururajshetty commented on the issue:

    https://github.com/apache/carbondata/pull/2788
 
    @sraghunandan & @kunal642 kindly review and merge the doc


---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2788
 
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/684/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2788
 
    Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/880/



---
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2788
 
    Build Failed  with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8948/



---