Munch Lab

Applied programming 2014 week five


At the single lecture I will show you how you can run your Python scripts without Sublime Text, and how you can pass arguments to your scripts, such as options and the names of input and output files. At the double lecture I will introduce you to classes.
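As a small taste of passing arguments, here is a minimal sketch that assumes the script is started from a terminal with two arguments. The script name myscript.py and the function name main are made up for illustration; sys.argv is the standard list that holds whatever was typed on the command line.

```python
import sys

def main(argv):
    # argv[0] is the name of the script itself;
    # the arguments typed after it follow from argv[1] onwards.
    if len(argv) != 3:
        return "usage: python myscript.py <input file> <output file>"
    input_name = argv[1]
    output_name = argv[2]
    return "reading %s, writing %s" % (input_name, output_name)

if __name__ == "__main__":
    print(main(sys.argv))
```

You would run it with something like: python myscript.py input.fasta output.txt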

Reading material

The chapter: “Classes and objects – the basics” from (our version of) How to think like a computer scientist.

Computer exercises

There is not a lot of reading for this week, so you have the opportunity to put more time into the difficult Open Reading Frames exercise.

Weekly assignment

Write a function, parseFasta(filename), that reads the file named filename. This
file should contain multiple sequence entries in Fasta format, and the function must parse
this data and return a list of tuples of the form (header, sequence). Download this file
and use it as input. The first entry in the file is:

>numberOne this is sequence one

so the first tuple in your list should pair this header with the sequence of that entry.


This assignment can be solved in a lot of different ways — of varying complexity — so be
inventive and after you have handed in your own assignment, have a look at what your
friends have done to solve the problem.

Here is one approach: read the entire content of the file into a string using the read()
method of the opened file. Then you can use the split() method to split the string
into a list of individual Fasta entries. Try using '>' as argument for split and print the resulting list. Using a for loop you can iterate over the list and turn each Fasta entry into a tuple. Hint: use the splitlines() method to split each Fasta entry into the individual lines it contains (i.e. the header line and all the sequence lines). Then all that remains is to fish out the header line, concatenate the sequence lines and add the (header, sequence) pairs to a list. Here is some code to get you started:

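fasta_file = open('input.fasta', 'r')
# read the whole file into one string
file_content = fasta_file.read()
fasta_file.close()
# split the string into one piece per Fasta entry
list_of_entries = file_content.split(">")
for entry in list_of_entries:
    print(entry)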

# figure out why the first string in 
# list_of_entries is empty. You need to 
# remove it before entering the for loop.
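If you want to see what split() and splitlines() actually give you before working on the real file, you can try them on a tiny string first. The sketch below uses a made-up two-entry string (the headers and sequences are invented, not taken from the course file) and assumes that you want the header without the leading '>'.

```python
# A made-up example with two Fasta entries; the content is invented.
toy_content = ">first toy entry\nACGT\nTTGA\n>second toy entry\nGGCC\n"

# Splitting on '>' gives one string per entry (look closely at the first one).
pieces = toy_content.split(">")
print(pieces)

# Take one entry and split it into its lines.
entry = pieces[1]
lines = entry.splitlines()

# The first line is the header, the remaining lines are sequence.
header = lines[0]
sequence = "".join(lines[1:])
pair = (header, sequence)
print(pair)
```

Try changing toy_content, for example by adding a third entry, and see how the printed list changes.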

Still, there are many other, and better, ways to do it. You can think about a way once you are done with the large exercise this week.

Handing in

To hand in the assignment, put the code in a file named after yourself and the week. Attach it to an email with subject “Assignment” and send it to Dan. You can see your hand-in deadline on the main course page.
