Scala/Spark: mapping [String, List[String]] to String pairs


I have an RDD of the form RDD[(String a, List[String] bs)] and want to map it to RDD[(String a, String b)], where each element of the list is paired with string a. Is there an efficient way to do this?

I am using flatMapValues. Is there a more efficient way? (I have a huge dataset.)

rdd.flatMapValues(identity) should get the job done.
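A minimal sketch of what that call does on a tiny dataset. The object name, the toPairs helper, the sample data, and the local[*] master are all assumptions made for illustration; only flatMapValues(identity) comes from the answer.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object FlatMapValuesSketch {
  // Hypothetical helper: emits one (a, b) pair for every b in the list attached to a.
  def toPairs(rdd: RDD[(String, List[String])]): RDD[(String, String)] =
    rdd.flatMapValues(identity)

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a real job would get its master from the cluster.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("flatMapValues-sketch")
      .getOrCreate()

    // Made-up sample data in the shape described in the question.
    val rdd = spark.sparkContext.parallelize(Seq(
      ("a", List("b1", "b2")),
      ("c", List("d1"))
    ))

    // Yields the pairs (a,b1), (a,b2) and (c,d1).
    toPairs(rdd).collect().foreach(println)

    spark.stop()
  }
}
```

Note that a key whose list is empty produces no pairs at all, so it disappears from the result.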

That should be a pretty efficient and simple way. To optimize performance, you can compare it with an implementation using mapPartitions and pick the better of the two. I wouldn't expect a huge difference, though, since in both cases the wrapper objects need to be created anyway.

rdd.mapPartitions(iter => iter.flatMap(elem => elem._2.map(v => (elem._1, v))))
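One way to set up that comparison is sketched below. It only checks that both variants produce the same pairs and shows what happens to the partitioner; it does not measure anything. The object and method names, the sample data, and the HashPartitioner with 2 partitions are assumptions for illustration. One difference worth knowing: flatMapValues retains the parent RDD's partitioner, while mapPartitions drops it unless preservesPartitioning = true is passed, which is safe here because the keys are not changed.

```scala
import org.apache.spark.HashPartitioner
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object FlattenComparisonSketch {
  // Variant 1: the one-liner from the answer.
  def viaFlatMapValues(rdd: RDD[(String, List[String])]): RDD[(String, String)] =
    rdd.flatMapValues(identity)

  // Variant 2: the mapPartitions form, with the partitioner kept explicitly
  // (keys are untouched, so the existing partitioning stays valid).
  def viaMapPartitions(rdd: RDD[(String, List[String])]): RDD[(String, String)] =
    rdd.mapPartitions(
      iter => iter.flatMap { case (a, bs) => bs.map(b => (a, b)) },
      preservesPartitioning = true
    )

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("flatten-comparison-sketch")
      .getOrCreate()

    // Made-up sample data, hash-partitioned so the partitioner is visible.
    val rdd = spark.sparkContext
      .parallelize(Seq(("a", List("b1", "b2")), ("c", List("d1"))))
      .partitionBy(new HashPartitioner(2))

    val left = viaFlatMapValues(rdd)
    val right = viaMapPartitions(rdd)

    // Both variants should contain exactly the same pairs.
    println(left.collect().toSet == right.collect().toSet)

    // Both should still report the HashPartitioner of the parent RDD.
    println(left.partitioner)
    println(right.partitioner)

    spark.stop()
  }
}
```

To decide between the two on real data, run each variant on a representative sample of the dataset and compare the stage durations in the Spark UI.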
