Bioinformatics Practical for. Biochemists ... bioinformatics programs to analyze
data. • Open Sessions .... Understanding Bioinformatics, Zvelebil & Baum, 2007.
Bioinformatics Practical for Biochemists Andrei Lupas, Birte Höcker, Steffen Schmidt SS 2012 01. History of DNA
Description
• Lectures about general topics in Bioinformatics & history
• Tutorials will provide you with a toolbox of bioinformatics programs to analyze data
• Open Sessions will give you the opportunity to use these tools
Course Outline
• Mon – DNA & Genomics
• Tue – Introduction to Proteins
• Wed – Annotation of Sequence Features
• Thu – Evolution & Design
• Fri – Protein Classification
Course Outline
• 13:00-14:00 Presentation
• 14:15-17:30 Tutorial (2 x 30 min) & hands-on practical
• You will need to keep an electronic lab notebook
• Fri afternoon: Test Exercises
Software Requirements
• Browser (e.g. Firefox)
• “Advanced” Word Processor
• PyMOL (www.pymol.org – free for teaching)
History of DNA
1953 Model of DNA (F. Crick)
What is the “genetic material”?
• 1865 Gregor Mendel
  • basic rules of heredity
• 1869 Friedrich Miescher
  • discovery of ‘nuclein’ (DNA), only published in 1871 since Hoppe-Seyler
• 1881 Edward Zacharias
  • chromosomes are composed of nuclein
• 1889 Richard Altmann
  • renaming nuclein to nucleic acid
wikipedia.org
DNA is the “transforming material”
• 1928 Frederick Griffith
  • “transforming principle”
  • Str. pneumoniae experiment
• 1944 Avery & McCarty
  • Griffith’s “transforming principle” is DNA
  • Isolation of DNA/RNA
DNA is the genetic material
• 1950 Erwin Chargaff
  • A/T, C/G
  • same amount in different tissues
• 1952 Hershey & Chase
  • DNA is the genetic material, shown with the P32/S35 phage/E. coli experiment
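Chargaff's A/T and C/G observation can be checked on any double-stranded sequence. The following is a minimal sketch, assuming a made-up example strand; the function names `base_fractions` and `reverse_complement` are illustrative and not part of the course material.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the complementary strand, read 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def base_fractions(seq):
    """Return the fraction of A, C, G and T in a DNA sequence."""
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: counts[base] / total for base in "ACGT"}


# Hypothetical single strand, invented for illustration only.
strand = "ATGCGCGTTAACCGGAAAT"

# Counting both strands of the duplex gives equal A and T, and equal C and G.
duplex = strand + reverse_complement(strand)
fractions = base_fractions(duplex)

for base in "ACGT":
    print(base, round(fractions[base], 3))
```

A single strand on its own need not show the equality; it follows from base pairing once both strands are counted.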
Solving the DNA structure
• 1952/53 Linus Pauling
  • beat Cavendish lab in discovery of α-helix
  • Cavendish (Cambridge) allows Watson & Crick to work full-time on DNA
  • Manuscript shared with Cavendish lab before publication
http://osulibrary.oregonstate.edu/specialcollections/coll/pauling/dna/notes/1952a.22-ms-01.html
Solving the DNA structure
• 1952 Franklin & Wilkins
  • X-ray of B-DNA; Wilkins showed results to Watson & Crick
  • periodicity, phosphates are outside
• 1953 Crick & Watson
  • model of B-DNA
NATURE | VOL 421 | 23 JANUARY 2003 | www.nature.com/nature
Solving the DNA structure
DNA structure
Getting the “code”
• 1953 George E. Palade
  • “RNA organelles” (ribosomes)
• 1957 Crick et al.
  • suggest non-overlapping triplets
  • only 20 out of 64 triplets code for an amino acid
  • “comma-free code”
(d) The code is probably ‘degenerate’; that is, in general, one particular amino-acid can be coded by one of several triplets of bases.
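The numbers behind "20 out of 64" can be reproduced by counting. The sketch below enumerates all 4^3 triplets and groups them by cyclic rotation. A comma-free code cannot contain AAA, CCC, GGG or TTT, and can use at most one triplet from each rotation class, which gives the upper bound of 20. It also contrasts overlapping and non-overlapping reading of a sequence. The helper names and the example sequence are assumptions made for illustration; the code does not construct an actual comma-free code.

```python
from itertools import product

BASES = "ACGT"


def all_triplets():
    """Enumerate every possible triplet of bases."""
    return ["".join(p) for p in product(BASES, repeat=3)]


def rotation_class(triplet):
    """Return the set of cyclic rotations of a triplet (e.g. ACG, CGA, GAC)."""
    return frozenset({triplet, triplet[1:] + triplet[0], triplet[2:] + triplet[:2]})


def read_triplets(seq, overlapping=False):
    """Split a sequence into triplets, stepping by 1 (overlapping) or 3."""
    step = 1 if overlapping else 3
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, step)]


triplets = all_triplets()
classes = {rotation_class(t) for t in triplets}
# Classes of size 1 are the homopolymers (AAA, ...); all others have size 3.
full_classes = [c for c in classes if len(c) == 3]

print(len(triplets))       # number of possible triplets
print(len(full_classes))   # upper bound on the size of a comma-free code

example = "ATGGCCTAA"      # hypothetical sequence for illustration
print(read_triplets(example))
print(read_triplets(example, overlapping=True))
```

In the overlapping reading every base takes part in up to three triplets, which is why a single base change would be expected to alter three adjacent amino acids.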
Getting the “code”
[Figure: scanned paper excerpt (“The Reading of the Code”) on acridine (proflavin) mutants of phage T4 and the evidence against overlapping codes, with Fig. 1: “To show the difference between an overlapping code and a non-overlapping code.”]
• 1961 Nirenberg & Matthaei
  • polyU mRNA produces polyF
  • complete genetic code
• 1961 Sydney Brenner
  • no overlapping codes
  • concept of mRNA
  • triplet code (Crick, Brenner, Barnett, Watts-Tobin)
Gene Structure
• 1977 Sharp & Roberts
  • pre-mRNA is processed
• 1982 Cech
  • ribo(nucleic en)zymes
• 1980 Joan A. Steitz
  • role of snRNPs in splicing
Genomic era
• 1975 Frederick Sanger
  • dideoxy sequencing
• 1986 Human Genome Initiative
Genomes
• 1995 H. influenzae, 1.8 Mb, 1.7k genes
• 1997 E. coli, 4.6 Mb, 4.3k genes
• 1996 S. cerevisiae, 12.5 Mb, 5.7k genes
• 1998 C. elegans, 100 Mb, 21.7k genes
• 2000 D. melanogaster, 121 Mb, 17k genes
The human genome
• 2001 Draft H. sapiens, 2.9 Gb, 20-30k genes
Science (2001), Nature (2001)
Gene content
Excursion: Packing of DNA
• human:
  • 2 x 3e9 base pairs
  • packed in a nucleus of 6 µm ∅
[Figure labels: histone tails, histones, chromosome. Qui, Nature 2006]
• E. coli
  • 4.6 Mbp
  • 1 by 2 µm cell size
Kavanoff, Nature Education: Supercoiled chromosome of E. coli.
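The scale of the packing problem becomes clearer with a rough calculation. This sketch assumes a rise of about 0.34 nm per base pair for B-DNA, a value that is not given on the slides. It uses 2 x 3e9 bp and the 6 µm nucleus for human, and for E. coli the 4.6 Mb genome size listed among the sequenced genomes together with the 2 µm cell length.

```python
# Assumption (not from the slides): B-DNA rises about 0.34 nm per base pair.
RISE_PER_BP_M = 0.34e-9

HUMAN_DIPLOID_BP = 2 * 3e9      # 2 x 3e9 base pairs
NUCLEUS_DIAMETER_M = 6e-6       # 6 µm nucleus

ECOLI_BP = 4.6e6                # 4.6 Mb, from the list of sequenced genomes
ECOLI_CELL_LENGTH_M = 2e-6      # long axis of a 1 by 2 µm cell


def contour_length_m(base_pairs):
    """Length of fully stretched double-stranded DNA in metres."""
    return base_pairs * RISE_PER_BP_M


human_length_m = contour_length_m(HUMAN_DIPLOID_BP)
ecoli_length_m = contour_length_m(ECOLI_BP)

print(f"human DNA per cell: {human_length_m:.2f} m")
print(f"  = {human_length_m / NUCLEUS_DIAMETER_M:,.0f} nucleus diameters")
print(f"E. coli chromosome: {ecoli_length_m * 1e3:.2f} mm")
print(f"  = {ecoli_length_m / ECOLI_CELL_LENGTH_M:,.0f} cell lengths")
```

Under these assumptions the human DNA in one nucleus is roughly 2 m long, and the E. coli chromosome is several hundred times longer than the cell, which is why histones and supercoiling matter.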
Eukaryote
Size
• Large (10 Mb – 100,000 Mb)
• There is not generally a relationship between organism complexity and its genome size (many plants have larger genomes than human!)
Content
• Most DNA is non-coding
• Generally small (