chr1 26150023 26150023 ncRNA_exonic
chr1 26162313 26162313 exonic
chr1 26349533 26349535 exonic
chr1 26357656 26357656 UTR5
chr1 26487940 26487940 exonic
chr1 26150023 26150023 ncRNA_exonic
chr1 26162353 26162313 splicing
chr1 26349533 26349535 exonic;splicing
chr1 26357656 26357656 exonic
chr1 26487940 26487940 UTR3
chr1 26357656 26357656 intronic
chr1 26487940 26487940 intergenic
I have a very large CSV file with dozens of columns and thousands of rows. I want to delete every row whose 4th column contains anything other than exonic, exonic;splicing, or splicing.
After deleting, my file should look like this:
chr1 26162313 26162313 exonic
chr1 26349533 26349535 exonic
chr1 26487940 26487940 exonic
chr1 26162353 26162313 splicing
chr1 26349533 26349535 exonic;splicing
chr1 26357656 26357656 exonic
I tried sed, but it also deletes rows I want to keep, because each pattern is matched against the whole line rather than just the 4th column. For example, if a row has UTR3 in the 10th column, that row is deleted too, which I don't want. I used this command:
sed -e '/upstream/d' -e '/downstream/d' -e '/intronic/d' -e '/intergenic/d' -e '/ncRNA_exonic/d' -e '/ncRNA_intronic/d' -e '/ncRNA_splicing/d' -e '/ncRNA_UTR5/d' -e '/UTR3/d' -e '/UTR5/d' input.csv > output.csv
Is there any way I can get this to work?
Thanks in advance
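
For reference, a minimal sketch of a column-aware filter, assuming the fields are separated by spaces or tabs as in the sample rows above (a truly comma-separated file would need `-F','`). It uses awk rather than sed so the test applies only to field 4. The sample file contents, including the last row with UTR3 in a later column, are made up for illustration.

```shell
# Hypothetical sample file modelled on the rows in the question.
cat > input.csv <<'EOF'
chr1 26150023 26150023 ncRNA_exonic
chr1 26162313 26162313 exonic
chr1 26349533 26349535 exonic;splicing
chr1 26357656 26357656 UTR5
chr1 26162353 26162313 splicing
chr1 26487940 26487940 exonic UTR3
EOF

# Print a row only when field 4 is exactly one of the wanted values.
# Default awk field splitting (runs of spaces/tabs) is an assumption here;
# other columns are never inspected, so UTR3 in a later column is harmless.
awk '$4 == "exonic" || $4 == "splicing" || $4 == "exonic;splicing"' input.csv > output.csv
```

Because the comparison is an exact string match on `$4`, ncRNA_exonic is dropped even though it contains the substring exonic, and a row with exonic in column 4 is kept regardless of what the other columns hold.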