hadoop - Hive Updates Efficiency (Version 0.14)
How can Hive efficiently handle updates on columns that are not partitioned?
Suppose I want to update a row with a specific transactionid (which is not a partition column). How does Hive handle this internally? As I understand it, Hive must first search for the row (which is slow) and then update the particular partition (if any) in which the row containing that transactionid is stored. Although this abstraction is provided to the user, is the update efficient when I need to perform a lot of updates?
Row-level updates may not be efficient in Hadoop, because Hadoop is designed for large-scale data processing. Hive version 0.14 does support row-level updates, but only on Hive tables that support ACID. Check the Hive language manual for further details on how to implement row-level updates: https://cwiki.apache.org/confluence/display/hive/languagemanual+dml#languagemanualdml-update
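As a rough illustration of what this looks like in practice, here is a minimal HiveQL sketch. The table name `transactions`, its columns, the partition column `txn_date`, the bucket count, and the literal values are all assumptions made up for this example; only `transactionid` comes from the question. It assumes Hive 0.14, where an ACID table has to be stored as ORC, bucketed, and marked transactional, and where the transaction manager must be enabled.

```sql
-- Client-side settings needed for ACID operations in Hive 0.14.
-- (The compactor settings hive.compactor.initiator.on and
-- hive.compactor.worker.threads are configured on the metastore side.)
SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

-- Hypothetical table: ORC storage, bucketing, and the transactional
-- property are what make it eligible for UPDATE.
CREATE TABLE transactions (
  transactionid BIGINT,
  amount        DECIMAL(10,2),
  status        STRING
)
PARTITIONED BY (txn_date STRING)
CLUSTERED BY (transactionid) INTO 8 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

-- Update by a non-partition column: with no partition predicate,
-- Hive has to look through every partition to find the row.
UPDATE transactions
SET status = 'REFUNDED'
WHERE transactionid = 12345;

-- If the partition is known, adding it to the WHERE clause lets Hive
-- prune the scan to that partition only.
UPDATE transactions
SET status = 'REFUNDED'
WHERE txn_date = '2015-01-15' AND transactionid = 12345;
```

Note that partition columns and bucketing columns cannot themselves be changed by `UPDATE`; they can only appear in the `WHERE` clause. Each statement still runs as a full Hive job, so batching many changes into one statement is likely to be cheaper than issuing many single-row updates.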