bash - Keep some lines of a file according to some conditions


I have a file of this kind:

k1   bla   started
k1   bla   finished
k2   blu   finished
k3   bli   started
k3   bli   died_skipped_permanently
k4   blo   started
k5   ble   started
k5   ble   died_skipped_permanently
k6   blou  started
k6   blou  started

From this, I want to obtain a file where, for each name in column 1, if there is a line with finished or died_skipped_permanently, only that line is kept and not the other ones (with started or anything else). Moreover, if two lines are identical (like those of k6), I want to print only one.

With this example, the output would be:

k1   bla   finished
k2   blu   finished
k3   bli   died_skipped_permanently
k4   blo   started
k5   ble   died_skipped_permanently
k6   blou  started

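I can't simply delete the started lines with

grep -v started

because for some names (k4 in the example) the started line is the only one present, and I need to keep that information to know whether the job was started or not.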

I have a file with the names from column 1, obtained with:

awk '{print $1}' file | sort | uniq > names    # 7,752 lines 

I was thinking of a loop of this kind:

For each name present in the file "names", do:

If one of the lines for that name contains finished or died_skipped_permanently, print that line to the output and don't print the others. Otherwise, keep the lines containing the name. Delete identical lines.

That is the idea, but I don't know how to do this. I would appreciate any help.

We can use the fact that started is lexicographically greater than both finished and died_skipped_permanently, and use

sort filename | awk '!seen[$1,$2]++' 

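Because started is lexicographically the greatest of the three, a started line appears after the finished or died_skipped_permanently line for the same name once the sort is done. The awk code then walks through the sorted lines and prints a line only if it hasn't seen that combination of fields 1 and 2 before. This also collapses identical lines, such as the two k6 lines, into one.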
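To make this concrete, here is a small sketch that runs the same pipeline on the sample data from the question. The temporary file and the variable names input and result are only for illustration, and LC_ALL=C is an assumption added here so that sort uses plain byte order whatever the locale is.

```shell
# Recreate the sample input from the question in a temporary file
# (the file and the variable names are illustrative, not from the original).
input=$(mktemp)
cat > "$input" <<'EOF'
k1   bla   started
k1   bla   finished
k2   blu   finished
k3   bli   started
k3   bli   died_skipped_permanently
k4   blo   started
k5   ble   started
k5   ble   died_skipped_permanently
k6   blou  started
k6   blou  started
EOF

# LC_ALL=C (an added assumption) forces byte-order sorting, so for one name
# "died_skipped_permanently" < "finished" < "started".
# awk then prints only the first line seen for each (field 1, field 2) pair.
result=$(LC_ALL=C sort "$input" | awk '!seen[$1,$2]++')

printf '%s\n' "$result"
rm -f "$input"
```

On the sample data this prints the six lines of the wanted output. Note that the trick depends on the status words themselves: it would stop working if some other status that must be dropped happened to sort before finished or died_skipped_permanently.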

