Hive - How To Efficiently Create Table As Select?
I have a Hive table, htable, that is partitioned on foo and bar. I want to create a small subset of this table for experiments, so I would think the thing to do would be create tabl
Solution 1:
Add distribute by foo, bar:
insert into new_table partition (foo, bar)
select * from htable
where rand() < 0.01 and foo in (a, b)
distribute by foo, bar;
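This reduces memory consumption: distribute by sends all rows with the same (foo, bar) values to the same reducer, so each reducer keeps only a few partition writers open instead of one for every partition it happens to receive.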
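For completeness, here is a minimal sketch of the whole flow, under a few assumptions that are not in the original answer: new_table does not exist yet and is created with the same schema and partitioning as htable, dynamic partitioning is not already enabled in the session, and 'a' and 'b' are placeholder values for foo.

```sql
-- Assumption: new_table does not exist yet. LIKE copies htable's
-- columns and its partitioning on (foo, bar), but no data.
create table if not exists new_table like htable;

-- Dynamic-partition inserts with no static partition value need these settings.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- Sample roughly 1% of the rows from the chosen foo partitions.
-- select * returns the partition columns (foo, bar) last, which is the
-- order a dynamic-partition insert expects.
-- 'a' and 'b' are placeholders; substitute real values of foo.
insert into table new_table partition (foo, bar)
select * from htable
where rand() < 0.01 and foo in ('a', 'b')
distribute by foo, bar;
```

Filtering on foo in the where clause also lets Hive prune partitions, so only the selected foo partitions of htable are scanned.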