Cloud Dataflow Running really slow when reading/writing from Cloud Storage (GCS) -


since using release of latest build of cloud dataflow (0.4.150414) our jobs running slow when reading cloud storage (gcs). after running 20 minutes 10 vms able read in 20 records when read in millions without issue.

it seems hanging, although no errors being reported console.

we received email informing latest build slower , countered using more vms got similar results 50 vms.

here job id reference: 2015-04-22_22_20_21-5463648738106751600

instance: n1-standard-2
region: us-central1-a

we had similar issue. when side-input reading bigquery table has had data streamed in, rather bulk loaded. when copy table(s), , read copies instead works fine.

if tables streamed, try copying them , reading copies instead. workaround.

see: dataflow performance issues


Popular posts from this blog

c# - ODP.NET Oracle.ManagedDataAccess causes ORA-12537 network session end of file -

matlab - Compression and Decompression of ECG Signal using HUFFMAN ALGORITHM -

utf 8 - split utf-8 string into bytes in python -