Bio Editor Tutorial

This tutorial introduces you to what is arguably the richest resource for musical tonal patterns available, the patterns of life itself: DNA sequences and the protein sequences that DNA translates to. Since this is new territory for most musicians, we will take you through the essential steps of searching a public domain genetic database (the National Institutes of Health Genbank), and then adding the sequence you find to the default Bio Sequence Table (.bst) file for use by Bio Sequence modules.

Obtaining the raw genetic sequence

This first step is optional, because the file we will fetch from the internet is already in the Bio Editor examples directory. If you do not have a good internet connection, you can skip this section and return to it later when you are ready to search out your own choice of genetic raw material.

There are several places you can visit to download genetic sequences. The Links Page of the Algorithmic Arts web site lists a few you can try out. For this tutorial we will use the NIH Genbank. Be sure your Internet connection is active, then click on the following URL:

A new page will open in your browser, so you can switch between it and this tutorial.

The page that opened should be the NCBI, or National Center for Biotechnology Information. Near the top of the page is a search form, with a drop down menu that should say "Genbank" on the left. If it does not, go ahead and select Genbank from the drop down. In the search field, put the following text:

homo sapiens blue cone pigment

We are going to look for the DNA that makes blue color receptors in the human eye. There are also green and red receptors as well as gray, or intensity, receptors but we will just use the blue for this. After you have pasted the search text above into the browser search field, click on the GO button.

The search will turn up at least two entries. One of them will say "Human blue cone pigment gene, complete cds" and it probably will have the index number: U53874. This is the one we want to work with. The index number is underlined, indicating a link. Click on it.

The page that is displayed next is the DNA sequence we want. Capture this page either by downloading it (the SAVE button near the top of the page) or by copying the text and pasting it directly into the Bio Editor. We will use copy and paste.

Click on the drop down menu just to the left of the SAVE button near the top of the page, and select TEXT. This displays just the plain text part. Press Control-A on your keyboard (or choose "Select All" from the browser Edit menu), then copy the entire page.

Editing the Raw Sequence Data with Bio Editor

The goal here is to put the raw sequence data you just downloaded into a format the MIDI software can use.

Open the Bio Editor if you have not done so already. It will open with the default Table file which is shown on the program caption bar; if you want to use a different .bst file, use the top line menu: Files/Open Table.

Click on an empty slot from the list to the right of the text edit window. Right click in the empty text area, and Paste. You should now have the complete DNA data file that starts with "LOCUS" on the first line and ends with two slashes "//" on the last line. If you do not have this in the Bio Editor text area, you need to either retrace the steps so far in this tutorial, or just cheat and load the file "bluecone.dna" from the BioEditor examples Source directory.

Although there are several optional steps you can take before using the sequence you just fetched, there is only one step you must take: marking the data area so the Bio Editor will know what part of the text file to translate. This is easy to do: just highlight the last section, where the codes are (gcat...), starting just after the word ORIGIN and going to the end of the page.

When you have the text highlighted, click on the Start/End Marker toolbox button, which shows a tilde (~) icon. After you click on this button, you can see that a tilde was inserted at the start and at the end of the highlighted text. The Bio Editor will look for the first tilde to determine where it should start processing text, then it will continue processing text until it gets to a second tilde or to the end of the file.
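The marker rule just described can be sketched in a few lines of Python. This is only an illustration of the rule as stated here, not the Bio Editor's actual code; the function name marked_region is made up, and returning nothing when no tilde is present is an assumption.

```python
def marked_region(text: str, marker: str = "~") -> str:
    """Return the text after the first marker, up to the second marker or the end."""
    start = text.find(marker)
    if start == -1:
        return ""  # assumption: with no marker at all, nothing is selected
    end = text.find(marker, start + 1)
    if end == -1:
        end = len(text)  # no second marker: run to the end of the file
    return text[start + 1:end]


# A tiny stand-in for a source file with the data area marked.
sample = "ORIGIN\n~        1 gcatgc\n//~"
print(repr(marked_region(sample)))
```

Because only the first two tildes matter in this sketch, moving the markers to a different part of the same source text is all it takes to select different data.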

Now translate the text by clicking on the Translate to DNA button, the rightmost one of the group, with the 3 "sperms." You will see a page with the DNA codons grouped in 3s. This is the translated text - that is, the translation into a binary format the MIDI software can use has taken place, and this page shows the translation of that back to readable text.
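To make the "grouped in 3s" view concrete, here is a minimal Python sketch that strips the line numbers and spaces from a marked data area and splits what remains into codons. The function name codons is hypothetical, and keeping only the letters a, c, g and t is an assumption about how the text is cleaned; the Bio Editor may handle other characters differently.

```python
def codons(region: str) -> list[str]:
    """Keep only the base letters, then split them into groups of three."""
    seq = "".join(ch for ch in region.lower() if ch in "acgt")
    # The last group may hold fewer than three bases if the length is not a multiple of 3.
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


# A made-up fragment laid out like a numbered sequence line.
print(codons("        1 gcatgcatga cc\n//"))
```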

In this sequence, as is fairly typical for DNA sequences, there is a lot of debris called "introns" that does not actually encode the target protein. Click on the Translate to Protein button to get the protein translation view. These two views are just that: views. They do not change how the source text is translated, which in this case is from the DNA data you delimited with the tilde character. However, in this translation from the raw DNA codes there are a lot of error markers (?) scattered throughout the text, which indicates that the sequence was not translated correctly. A correct translation should show no error markers, with the possible exception of one or two at the end of the sequence, where it translates into a "stop" code. The problem here is those pesky introns.
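The following Python sketch shows how a codon-by-codon protein translation works, using the standard genetic code. Printing a "?" for a stop codon or an incomplete trailing codon is only a guess at how the Bio Editor assigns its error markers, and the name translate and the test fragment are made up for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered with bases t, c, a, g; "*" marks a stop codon.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA text to one-letter amino acid codes, three bases at a time."""
    seq = "".join(ch for ch in dna.lower() if ch in "acgt")
    out = []
    for i in range(0, len(seq), 3):
        amino = CODON_TABLE.get(seq[i:i + 3], "?")  # incomplete codon -> "?"
        out.append("?" if amino == "*" else amino)  # assumption: stop shown as "?"
    return "".join(out)


print(translate("atgcgtaaaatgtaa"))  # prints MRKM?
```

When intron text is left in the data, the reading runs straight through material that was never meant to be read as codons, so stop codons turn up in the middle of the output instead of only at the end.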

Since proteins are built from 20 different amino acids, but DNA from only 4 bases (A, C, G, T), the most interesting musical material will be found in the protein translation, not the DNA translation, although the 4 DNA bases can be put to good use in rhythm and drone tracks. But you need to start with proteins to get a sense of the music patterns that are encoded in the DNA.
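As a rough illustration of why a 20-letter alphabet gives more varied material than a 4-letter one, the Python sketch below assigns each symbol its own pitch. The mapping is entirely hypothetical (one semitone per symbol, starting from MIDI note 48) and is not how the Bio Sequence modules assign notes; the name to_notes is made up as well.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter amino acid codes
DNA_BASES = "ACGT"


def to_notes(symbols: str, alphabet: str, lowest_note: int = 48) -> list[int]:
    """Hypothetical mapping: each symbol's position in the alphabet picks a MIDI note."""
    return [lowest_note + alphabet.index(s) for s in symbols.upper() if s in alphabet]


print(to_notes("MRKM", AMINO_ACIDS))  # protein letters span up to 20 different pitches
print(to_notes("atgc", DNA_BASES))    # DNA bases can only ever give 4 pitches
```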

One way to translate correctly to the protein is to carefully trim out the intron debris. The Bio Editor has functions that help greatly with that otherwise tedious job. But for now, in this particular file (indeed, in most DNA files), the protein translation has already been done, and we can tell the Bio Editor to use that instead of the raw DNA sequence simply by enclosing the protein codes in the tilde Start/End markers.

Look at the source file, just above the DNA codes. There is a line that starts with:
This is the protein translation of the DNA, with all the untranslatable stuff (the introns, and the stop codon at the end) trimmed out. We can start making music with this right away, and it will be an accurate musical representation of the blue cone pigment protein.

With the mouse cursor, highlight the text that is between the quotation marks, starting with MRKM, and ending with VGPN. When it is highlighted, click on the Start/End Marker (~) button. The tildes will enclose the text. Now click on the Translate to Protein button. You should see the same protein text now in the protein view page, this time with no error markers (?).
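If you would rather pull the protein text out of a saved record with a script than highlight it by hand, a sketch along these lines may help. It assumes the usual Genbank flat-file layout, in which the protein is stored in a /translation="..." qualifier that can wrap over several lines; the function name and the shortened sample record are made up for illustration.

```python
import re


def protein_from_record(record: str) -> str:
    """Return the text between the quotation marks of the first /translation qualifier."""
    match = re.search(r'/translation="([^"]*)"', record)
    if match is None:
        return ""  # no protein translation found in this record
    # The quoted text wraps across lines, so remove line breaks and indentation.
    return re.sub(r"\s+", "", match.group(1))


# A shortened, made-up record; a real one carries the full protein.
sample = 'CDS   join(1..12)\n      /translation="MRKM\n      VGPN"\nORIGIN\n'
print(protein_from_record(sample))  # prints MRKMVGPN
```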

If you like, you can switch back and forth between the protein amino acid view and the DNA codon view. However, the DNA translation is now inaccurate because it is being back-translated from the protein data: most amino acids can be encoded by more than one codon, so the original codons cannot be recovered from the protein alone. In order to have both the DNA and amino acids exactly right, you have to translate from the DNA, having first removed the introns from the code. This is not difficult to do, but it is more than we will tackle in this tutorial. For now let's just save the source file and go make some music with what we have.
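The short Python sketch below shows why going from protein back to DNA loses information. It uses the standard genetic code to list the codons for each amino acid and to count how many different DNA sequences would translate to the same protein fragment. The names synonyms and back_translation_count are made up for illustration.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons ordered with bases t, c, a, g; "*" marks a stop codon.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Collect every codon that encodes each amino acid.
synonyms = defaultdict(list)
for codon, amino in zip(product(BASES, repeat=3), AMINO):
    synonyms[amino].append("".join(codon))


def back_translation_count(protein: str) -> int:
    """Count the distinct DNA sequences that translate to the given protein."""
    count = 1
    for amino in protein:
        count *= len(synonyms[amino])
    return count


print(synonyms["R"])                   # six different codons all encode R
print(back_translation_count("MRKM"))  # prints 12
```

Even this four-letter fragment has 12 possible DNA spellings, so any back-translation has to pick one of them arbitrarily.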

Click on the View Source button, the one with the bluish "paper" icon, one over to the right of the Show DNA button you currently have pushed in. This will show the original source text again, and it will enable the buttons so you can save the source text. Do this by clicking the Save As button, third from the far left with the 3-disks icon. You will get a regular file save dialog box. Give it a name such as "blue1.dna," and do the save.

You have saved your new Source file, and created a binary sequence in the slot whose number appears next to its name (blue labels on the left of the text screen), but you have not updated the Bio Sequence Table, the .bst file. To save the table file for access by the MIDI software, select File/Save Table As.

Finally, exit Bio Editor and load the bio table from the main menu. Now you can enjoy the process of making music from the DNA sequence you have just created.

Copyright © 2000-2010 by John Dunn and Algorithmic Arts. All Rights Reserved.