Project Title
Plasmid Finder, by Susan Chen, December 2012
Project Objective
Create a graphical user interface that facilitates searches for plasmids and primers in our lab.
Problem: A graduate student had created spreadsheets of plasmids and primers
as well as plasmid sequences in fasta files, but the fasta files were not linked
to any of the spreadsheets, and finding the needed data required tedious searches by hand.
There was no apparent way to quickly search for a primer or plasmid with all the
attributes one needs for experimental work.
Solution: I wrote a Python program to read and pull up fasta files.
I also wrote a script that searches the fasta files, combines
various search requirements, and returns the information that matches
these requirements. Lastly, I developed a graphical user interface to package
these searches in a user-friendly way.
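As a rough illustration of the first step, reading a fasta file into memory might look like the sketch below. The function name read_fasta, the {header: sequence} dictionary, the lower-casing of sequences, and the two example records are all assumptions made for this example, not the project's actual code or data.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a {header: sequence} dictionary."""
    records = {}
    header = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            # A '>' line starts a new record; the rest of the line is its header.
            header = line[1:].strip()
            records[header] = []
        elif header is not None:
            # Sequence lines are stored lower case (an assumption of this sketch).
            records[header].append(line.lower())
    # Join the wrapped sequence lines of each record into one string.
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical two-record input used only to exercise the function.
example = [">JSO1 example plasmid", "ATGC", "GGCC", "", ">JSO2 another", "ttaa"]
plasmids = read_fasta(example)
```

Because a file object yields its lines when iterated, the same function could be applied to a real file with something like `with open("JSO_Plasmids.fasta") as handle: plasmids = read_fasta(handle)`.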
Vision: I had originally envisioned a GUI where a user can choose
to search all primers or plasmids. If the primers button is clicked,
the user should be able to enter information in any combination
of fields. The information is then used to search
the primer list, and any hits are then displayed.
On the plasmid side, the user should be able to specify any combination
of attributes such as promoter, fluorophore, or gene present in the plasmid,
and the identity and sequence of the plasmid hits should then be displayed.
Reality: Code always takes longer to write than you think it will.
For the final project I have been able to extract information
from the data source and store it in two files. Code has been implemented to
search the file containing primer information and obtain the desired results.
These actions have been implemented in a GUI that users can interact with.
Missing is the plasmid side of things. I planned on doing something similar
to what I did for the primer side (i.e., doing a combination of searches).
Also, the plasmid sequence display should have highlighted features that make
it easier to distinguish what the sequences actually are. Error handling
and support for the different forms user input can take are also lacking.
The GUI is also limited in that users cannot click on the outputs of their
search for further information.
Lessons Learned: Being able to write your own programs to further your personal
research efforts is really useful. But I also have to think about what I should
do myself. Is it worth spending time and effort to write my own programs? There's
lots of code out there, and I don't want to reinvent the wheel! I'd like to
do more programming, and I especially want to be able to control the laboratory
instruments I work with.
Program Details
There are three source data files: JSO_MasterList_121212, JSO_Plasmids.fasta, and fluorescent_sequence.fasta.
- plasmid_list.py and primer_list.py read in the source data, reorganize it,
and write the output to a text file. These two scripts take no inputs.
After running them, there should be two text files:
compiled_info_plasmid.txt and compiled_info_primer.txt.
- plasmid_process.py and primer_process.py are the search scripts that are hooked up
to the back end of the GUI. For plasmid_process.py, the "match_id" function takes
the user input "input" and the dictionaries ("description", "selection", "sequence", etc.)
generated by the "make_dictionaries" function. It then returns the values of the
dictionaries that match the user input.
- For primer_process.py, the search is broken into separate functions. "id_search"
finds the primer that matches the id and returns the description and sequence of the match.
"substr_search" takes a partial sequence from user input and finds all sequences that
contain it. "desc_search" matches a fragment of a description against the
primer descriptions and returns the primers that match. Lastly, "combo"
is used when the user enters two fragments of information, one about the sequence and one
about the description; it searches for primers that match both requirements and returns the result.
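The four primer searches described above can be sketched roughly as follows. Only the function names come from this write-up; the signatures, the id -> (description, sequence) dictionary, and the three example primers are assumptions made for illustration, since the layout of compiled_info_primer.txt is not shown here.

```python
# Hypothetical primer table: id -> (description, sequence).
PRIMERS = {
    "27.0": ("yegfp forward", "atggggccctta"),
    "28.0": ("yegfp reverse", "ttacgcgcat"),
    "29.0": ("mcherry forward", "gggcccatgg"),
}


def id_search(primers, primer_id):
    """Return (description, sequence) for an exact id, or None if absent."""
    return primers.get(primer_id)


def substr_search(primers, fragment):
    """Return the ids of primers whose sequence contains the fragment."""
    return [pid for pid, (_, seq) in primers.items() if fragment in seq]


def desc_search(primers, fragment):
    """Return the ids of primers whose description contains the fragment."""
    return [pid for pid, (desc, _) in primers.items() if fragment in desc]


def combo(primers, seq_fragment, desc_fragment):
    """Return the ids of primers matching both the sequence and description."""
    by_seq = set(substr_search(primers, seq_fragment))
    return [pid for pid in desc_search(primers, desc_fragment) if pid in by_seq]
```

For example, `combo(PRIMERS, "gggccc", "yegfp")` keeps only the primers that appear in both the sequence hits and the description hits.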
Instructions
- Run primer_list.py and plasmid_list.py to generate the files to search on
- Run HESdnaReader.py to start the GUI
- On the plasmid side:
  - pressing update with no entry results in an error message
  - entries should be in the form JSO#, such as JSO665
  - if an entry is not present, such as JSO10000, then an error message results
- On the primer side:
  - pressing update with no entry results in an error message
  - a search on id only occurs if the other two fields are empty. Ids should be of the form 27.0
  - a search on a description only occurs if the other two fields are empty.
    Descriptions should be lower case, and any substring of the description will work, such as yegfp.
    If the description does not exist, an error message will result.
  - a search on sequences only occurs if the other two fields are empty.
    A sequence entry should be lower case. Any substring of the sequence will work,
    such as gggccc. If the sequence does not exist, an error message will result.
  - if the id is not known and the user enters both a sequence and a description,
    the intersection of the two searches is found.
    If only the sequence matches, the primers with that sequence
    will be displayed. Likewise, if only the description matches,
    the primers with that description will be displayed. If neither
    input matches, an error message will result.
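The primer-side rules above amount to a small decision procedure, sketched below under stated assumptions. The names choose_search and combine_hits are hypothetical and do not come from the project code. The write-up also does not say what happens when an id is entered together with another field, so treating that case as an error is a guess.

```python
def choose_search(primer_id, description, sequence):
    """Pick which primer search to run from the three form fields."""
    has_id = bool(primer_id.strip())
    has_desc = bool(description.strip())
    has_seq = bool(sequence.strip())

    if not (has_id or has_desc or has_seq):
        # Pressing update with no entry is an error.
        raise ValueError("enter at least one search term")
    if has_id:
        if has_desc or has_seq:
            # Assumption: the write-up leaves this case unspecified.
            raise ValueError("a search on id needs the other fields empty")
        return "id"
    if has_desc and has_seq:
        return "combo"
    return "description" if has_desc else "sequence"


def combine_hits(seq_hits, desc_hits):
    """Merge sequence and description hits following the combined-search rule."""
    if seq_hits and desc_hits:
        # Both inputs matched something: keep only the primers common to both.
        wanted = set(desc_hits)
        return [hit for hit in seq_hits if hit in wanted]
    if seq_hits:
        return list(seq_hits)   # only the sequence matched
    if desc_hits:
        return list(desc_hits)  # only the description matched
    raise LookupError("no primer matches the sequence or the description")
```

Keeping the field check separate from the search functions means the GUI callback only has to read the three entries, call choose_search, and show either the hits or the error text.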
Program Source Code
Data Files Used