Python - Reading a compressed/deflated (CSV) file line by line
This question already has an answer here:
- python: read lines from compressed text files
I'm using the following generator to iterate through a given CSV file row by row in a memory-efficient way:

    import csv

    def csvreader(file):
        # Stream the file row by row instead of loading it all into memory
        with open(file, 'rb') as csvfile:
            reader = csv.reader(csvfile, delimiter=',', quotechar='"')
            for row in reader:
                yield row

This works and handles large files incredibly well. A CSV file of several gigabytes seems to be no problem on a small virtual machine instance with limited RAM.
However, when the files grow too large, disk space becomes a problem. CSV files seem to get high compression rates, which allows me to store them at a fraction of their uncompressed size. But before I can use the above code to handle a file, I have to decompress/inflate it and then run it through my script.
My question: is there a way to build an efficient generator like the one above (given a file, yield CSV rows as arrays) that inflates parts of the file until a newline is reached and runs that through the CSV reader, without ever having to decompress the file as a whole?
Thanks for your consideration!
Try using gzip.
Just replace

    with open(file, 'rb') as csvfile:

with

    with gzip.open(file, 'rb') as csvfile:

and add

    import gzip

at the top of your script.
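For anyone on Python 3: the csv module there expects text rather than bytes, so the file probably needs to be opened in text mode instead of 'rb'. A minimal sketch of the same generator under that assumption follows. The name gzip_csv_reader and the UTF-8 encoding are my own choices for illustration, not something from the question. gzip.open returns a file-like object that decompresses as it is read, so the archive is never inflated to disk as a whole.

```python
import csv
import gzip


def gzip_csv_reader(path, delimiter=',', quotechar='"'):
    """Yield rows from a gzip-compressed CSV file one at a time."""
    # 'rt' opens the decompressed stream in text mode. newline='' leaves
    # line-ending handling to the csv module, as its documentation advises.
    # The UTF-8 encoding is an assumption; adjust it to match the data.
    with gzip.open(path, 'rt', newline='', encoding='utf-8') as csvfile:
        reader = csv.reader(csvfile, delimiter=delimiter, quotechar=quotechar)
        for row in reader:
            yield row
```

Usage is the same as with the original generator, e.g. `for row in gzip_csv_reader('data.csv.gz'): ...`, where 'data.csv.gz' is a placeholder file name.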