Hi Community!
When a load/insert command is triggered on a main table that has one or
more SI (secondary index) tables, the new segment is first loaded into the
main table and all of its SI tables. After that,
SILoadEventListenerForFailedSegments compares the segments in the main table
with those in each SI table. If any SI table has mismatched or missing
segments, the listener fires a load for those segments on that SI table. The
load/insert command on the main table finishes only after all the missing
segments in all the SI tables have been reloaded.
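
To make the comparison step concrete, here is a minimal Python sketch of what
the listener effectively has to compute. It is not CarbonData code: the
function name, the use of plain integers for segment IDs, and the dictionary
shape are all assumptions made for illustration.

```python
def find_missing_segments(main_segments, si_segments_by_table):
    """Return, per SI table, the main-table segment IDs that the SI table lacks.

    Hypothetical helper: segment IDs are modelled as plain integers and each
    SI table is represented only by the list of segments it already holds.
    """
    main = set(main_segments)
    missing = {}
    for si_table, si_segments in si_segments_by_table.items():
        diff = sorted(main - set(si_segments))
        if diff:
            # Only SI tables that actually lag behind appear in the result.
            missing[si_table] = diff
    return missing


if __name__ == "__main__":
    main_table = [0, 1, 2, 3]
    si_tables = {"si_a": [0, 1, 2, 3], "si_b": [0, 2]}
    print(find_missing_segments(main_table, si_tables))
```

In the current flow, every segment returned by such a comparison is reloaded
before the load/insert command returns.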
Consider a scenario where an SI table has 10000 missing segments. In this
case, after the new load completes on both the main table and the SI table,
SILoadEventListenerForFailedSegments will try to load all 10000 missing
segments back into the SI table. Because so many segments have to be
reloaded, this step can block the next load command for many hours, if not
days. To solve this problem, I propose the following two-step solution.
Step 1. Add a carbon property that enables/disables the loading of
missing/failed segments during load/insert. It can default to true; the
functionality is disabled only when the user explicitly sets it to false.
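
The default-on behaviour of the Step 1 property could look roughly like the
Python sketch below. The property name "carbon.si.repair.on.load" is purely
hypothetical (no name has been decided yet), and the function is only an
illustration of the intended semantics, not an existing API.

```python
# Hypothetical property name, used only for this illustration.
REPAIR_ON_LOAD_PROPERTY = "carbon.si.repair.on.load"


def should_repair_on_load(properties):
    """Return True unless the user has explicitly set the property to false."""
    value = properties.get(REPAIR_ON_LOAD_PROPERTY)
    if value is None:
        # Property not set: keep today's behaviour (repair during load).
        return True
    return str(value).strip().lower() != "false"


if __name__ == "__main__":
    print(should_repair_on_load({}))
    print(should_repair_on_load({REPAIR_ON_LOAD_PROPERTY: "false"}))
```

With this shape, existing users see no change in behaviour unless they opt out.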
Step 2. Provide a separate SI repair command, making the whole
functionality independent of the load/insert command. We can provide both a
table level and a segment level command to repair the missing segments.
Example table level command:
    REPAIR INDEX ON TABLE MAIN_TABLE
This will check all the SI tables of the main table.
Example segment level command:
    REPAIR INDEX ON MAIN_TABLE WHERE SEGMENT.ID IN (0,1,2,3,4)
This will check only the given segments in all the SI tables.
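
The difference in scope between the table level and segment level commands
can be sketched in Python as follows. This is only an illustration under
assumptions: plan_repair and its arguments are hypothetical names, and
segment IDs are modelled as plain integers.

```python
def plan_repair(main_segments, si_segments_by_table, segment_ids=None):
    """Return, per SI table, the segments a repair would need to reload.

    segment_ids=None models the table level command (consider every segment
    of the main table); a list of IDs models the segment level command.
    """
    candidates = set(main_segments)
    if segment_ids is not None:
        # Segment level repair: consider only the segments the user named.
        candidates &= set(segment_ids)
    plan = {}
    for si_table, si_segments in si_segments_by_table.items():
        missing = sorted(candidates - set(si_segments))
        if missing:
            plan[si_table] = missing
    return plan


if __name__ == "__main__":
    main_table = [0, 1, 2, 3, 4, 5]
    si_tables = {"si_a": [0, 1, 5], "si_b": [0, 1, 2, 3, 4, 5]}
    print(plan_repair(main_table, si_tables))
    print(plan_repair(main_table, si_tables, segment_ids=[0, 1, 2]))
```

A segment level repair lets the user bring a large backlog up to date in
smaller batches instead of one long-running operation.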
Please give your input and suggestions for the above solution.
Rgds
Vikram Ahuja