Apache CarbonData Dev Mailing List archive

Re: carbondata Independent reader

Posted by Liang Chen on Dec 21, 2016; 2:34am
URL: http://apache-carbondata-dev-mailing-list-archive.168.s1.nabble.com/carbondata-Indenpdent-reader-tp4678p4768.html

Hi

For Q1: CarbonData is stored under "storePath", which can be set to any location. Under "storePath" there are two folders: Fact and Metadata. Based on the information you provided, the "storePath" you specified is the load path, which is why you cannot find the data on HDFS.
For Q2: Please refer to the examples (DatasourceExample, DirectSQLExample).
For Q3: Same as Q1; please check the "storePath" you specified.
For Q4: I do not fully understand your question. You can refer to the example (DirectSQLExample) to see how to parse generated carbon data.
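
To check whether a given "storePath" really holds carbon data, one rough option is to look for the Fact and Metadata folders mentioned above. The following is only a sketch in plain Java for a local path. The class StorePathCheck and its methods are hypothetical names and are not part of CarbonData. How deeply the two folders are nested below "storePath" is an assumption, so the sketch walks the whole tree instead of looking only at the top level.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration only; not a CarbonData class.
final class StorePathCheck {

    // True if dir directly contains both a "Fact" and a "Metadata" subdirectory.
    static boolean hasFactAndMetadata(Path dir) {
        return Files.isDirectory(dir.resolve("Fact"))
            && Files.isDirectory(dir.resolve("Metadata"));
    }

    // Walks storePath (including storePath itself) and returns every
    // directory that contains both folders.
    static List<Path> findTableDirs(Path storePath) throws IOException {
        try (Stream<Path> paths = Files.walk(storePath)) {
            return paths.filter(Files::isDirectory)
                        .filter(StorePathCheck::hasFactAndMetadata)
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: storePath is a local directory passed as the first argument.
        Path storePath = Paths.get(args.length > 0 ? args[0] : ".");
        List<Path> found = findTableDirs(storePath);
        if (found.isEmpty()) {
            System.out.println("No Fact/Metadata folders found under " + storePath);
        } else {
            found.forEach(p -> System.out.println("Carbon data found in " + p));
        }
    }
}
```

If nothing is found, the "storePath" passed in is probably not the one the load actually wrote to. For a store on HDFS the same check would need the Hadoop FileSystem API instead of java.nio.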

Regards
Liang

ffpeng90 wrote

Hi all,
Recently I loaded a carbon table in Hive via the carbon-spark plugin. I see that there is nothing in Hive, and all the data is stored in a folder named "storePath".
The Scala code is as follows:

Q1: Does this mean that the carbon-spark plugin just creates an external table in Hive and the raw data can be stored anywhere? I have checked the HDFS path; there is only a table directory, with nothing under it.

Q2: If I want to build an independent reader for a carbondata table, should I read the data from Hive, or just parse the files in the "storePath"?

Q3: I checked the files under "storePath". They are not stored in HDFS format, but as ordinary Linux files. Do I understand this correctly?

Q4: I have finished the basic read logic for my independent reader; all input paths are local.
Test1: [Carbondata-hadoop->Target->store->testdb->testtable], which contains 1K rows generated by a test case. My code extracts the data successfully.
Test2: However, when I try to parse the data generated by the carbon-spark plugin, which contains 1 million (100W) rows, it throws an exception in BlockIndexStore.fillLoadedBlocks().

Thank you for your help.