Looking at the DESC FORMATTED command again, I still feel the table property section is not very clear. For table properties, I think it is not good for the DESC command to print the default value when the user did not specify one at table creation, because the default values in the CarbonCommonConstants file may change from version to version. I think it is better to always write the resolved value into the table properties (in the schema file) when loading the table; then DESC can always read the table properties from the schema file.

So I suggest we do the following:
1. Categorize the properties into file level, table level and system level.
2. Write the file-level properties into the data file's footer, including all file-level properties, whether specified by the user or taken from the default values.
3. Write the table-level properties into the schema file, including all table-level properties, whether specified by the user or taken from the default values.
4. The DESC command should print the properties read from the schema file, which will then contain all table-level properties.

Another suggestion: besides printing the schema and table properties like the standard Hive DESC command, we can introduce another command that prints the output of the CarbonCli tool for more profiling and debugging information, such as how many files the table contains, the average size of a page/blocklet, the min/max percentage, etc. For example, the syntax of this command can be "SUMMARY table_name". A rough usage sketch follows below.

Regards,
Jacky
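A minimal sketch of how the proposal could look from the user's side, assuming a Carbon-enabled Spark session. The table name, its columns, and the exact DDL keywords are illustrative, and SUMMARY is only the proposed syntax, not something CarbonData supports today:

```scala
import org.apache.spark.sql.SparkSession

// Assumes CarbonData is on the classpath; how the session is built and the exact
// CREATE TABLE keyword (STORED AS vs. STORED BY) depend on the version in use.
val spark = SparkSession.builder().appName("desc-formatted-sketch").getOrCreate()

// Create a table specifying only some table-level properties; under the proposal,
// the remaining table-level properties are persisted into the schema file with
// their create-time default values.
spark.sql(
  """CREATE TABLE sales (id INT, name STRING, amount DOUBLE)
    |STORED AS carbondata
    |TBLPROPERTIES ('SORT_SCOPE'='LOCAL_SORT')""".stripMargin)

// DESC FORMATTED then reads every table-level property back from the schema file,
// so the output no longer depends on the current defaults in CarbonCommonConstants.
spark.sql("DESC FORMATTED sales").show(200, truncate = false)

// Proposed profiling command backed by CarbonCli (file count, average page/blocklet
// size, min/max percentage, ...); the syntax is not final.
spark.sql("SUMMARY sales").show(200, truncate = false)
```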
Hi,
I agree with Jacky. Currently, if I create a table with the default blocklet size (64 MB) and load some data into it, and the default blocklet size is later changed to 128 MB, will that affect the table created earlier? I think tables created before the change should still use 64 MB as the blocklet size (see the sketch after this mail).

These properties, whether specified by the user or taken from the default values, need to be saved when the table is created:

File-level properties:

| Property                     | Value      | Default Value |
| Blocklet Size                | 64 MB      | 64 MB         |

Table-level properties:

| Property                     | Value      | Default Value |
| Table Block Size             | 1024 MB    | 1024 MB       |
| SORT_SCOPE                   | LOCAL_SORT | LOCAL_SORT    |
| CACHE_LEVEL                  | BLOCKLET   | BLOCK         |
| AUTO_LOAD_MERGE              | true       | false         |
| COMPACTION_LEVEL_THRESHOLD   | 2,8        | 4,3           |
| COMPACTION_PRESERVE_SEGMENTS | 0          | 0             |
| ALLOWED_COMPACTION_DAYS      | 0          | 0             |
| MAJOR_COMPACTION_SIZE        | 3072 MB    | 1024 MB       |
| Local Dictionary Enabled     | false      | false         |
| Local Dictionary Threshold   | 10000      | 10000         |

Hi Jacky:
I think we need to refactor the CarbonCli module and move some common tools into the core module, so that both the CarbonCli module and the Spark2 module can use them, right?
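A minimal sketch of the blocklet-size scenario described above, assuming the proposal is adopted; the table name and load path are made up:

```scala
import org.apache.spark.sql.SparkSession

// Carbon-enabled session, as in the earlier sketch.
val spark = SparkSession.builder().getOrCreate()

// 1) Table created while the shipped default blocklet size is 64 MB and the user
//    does not set it explicitly; 64 MB would be persisted for this table.
spark.sql("CREATE TABLE t1 (id INT, name STRING) STORED AS carbondata")

// 2) A later CarbonData version (or a carbon.properties change) raises the default
//    to 128 MB. Because 64 MB was saved at create time, new loads keep producing
//    64 MB blocklets for this table...
spark.sql("LOAD DATA INPATH 'hdfs://some/path/data.csv' INTO TABLE t1")

// ...and DESC FORMATTED keeps reporting 64 MB, read from the schema file rather
//    than from the current default.
spark.sql("DESC FORMATTED t1").show(200, truncate = false)
```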
I think that for all the table properties documented in https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.md, we should write their values into the schema file. The default values for these properties may change, and if they change, the user will not know which property values were actually used when the files were written. A sketch of filling in the defaults at create time follows below.

Regards,
Jacky
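A minimal sketch of filling in the defaults at create time; the helper name, the property keys, and the default values below are illustrative (they just mirror the table in the previous mail), not the actual CarbonData code path:

```scala
import scala.collection.mutable

// Documented table-level properties and their current shipped defaults; the values
// only mirror the table in the previous mail and may differ between versions.
val documentedDefaults: Map[String, String] = Map(
  "table_blocksize"              -> "1024",
  "sort_scope"                   -> "LOCAL_SORT",
  "cache_level"                  -> "BLOCK",
  "auto_load_merge"              -> "false",
  "compaction_level_threshold"   -> "4,3",
  "compaction_preserve_segments" -> "0",
  "allowed_compaction_days"      -> "0",
  "major_compaction_size"        -> "1024",
  "local_dictionary_enable"      -> "false",
  "local_dictionary_threshold"   -> "10000"
)

// Hypothetical helper: anything the user did not specify in TBLPROPERTIES is filled
// in with the create-time default before the schema file is written, so DESC
// FORMATTED can always read a complete, version-independent set from the schema file.
def fillDefaults(userProps: Map[String, String]): Map[String, String] = {
  val merged = mutable.Map(userProps.toSeq: _*)
  documentedDefaults.foreach { case (key, default) =>
    if (!merged.contains(key)) merged(key) = default
  }
  merged.toMap
}

// Example: the user only sets SORT_SCOPE; everything else gets the current default.
val persisted = fillDefaults(Map("sort_scope" -> "GLOBAL_SORT"))
```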
Hi,
I revisited this discussion and suggest changing the DESC FORMATTED output to the following. The information is outlined in 6 sections:

1. Table basic information
2. Index information
3. Encoding information
4. Compaction information
5. Partition information (only for partitioned tables)
6. Dynamic information

Please check whether it contains enough information for your needs; I will create a JIRA and a PR soon.

Regards,
Jacky
The example was missing in my last mail; I have now put it in CARBONDATA-3087 (https://issues.apache.org/jira/browse/CARBONDATA-3087). Please go to the JIRA and reply there if you have any comments.

Regards,
Jacky