Week |
Topic |
Content and assigned readings |
Teaching activities and assignments |
Study hours |
1 |
Introduction |
What is bioinformatics?
Central dogma of molecular biology: DNA, mRNA, protein |
The DNA Journey
觀念生物學 (Conceptual Biology), vols. 1–4, 天下文化 (Commonwealth Publishing)
Canadian Bioinformatics Workshops (all slides and videos are available) |
6 |
2 |
Sequence alignment |
Why do we need sequence alignment?
Applications in structural homology and evolutionary modeling
Dynamic programming |
SEAVIEW : Sequence alignment editor
T-Coffee documentation |
6 |
3 |
Pairwise Sequence alignment |
Global & Local alignment
Linear space algorithm
BLAST |
NCBI BLAST server
BLAST by O'Reilly Media |
6 |
4 |
Multiple Sequence alignment |
Variations of the algorithms: which one is better?
Another issue: huge amounts of data |
T-Coffee web server
PSI/TM-Coffee web server
- PSI/TM-Coffee: Floden, E. W. et al. PSI/TM-Coffee: a web server for fast and accurate multiple sequence alignments of regular and transmembrane proteins using homology extension on reduced databases. Nucleic Acids Res. 44, W339–43 (2016).
- PSI-Coffee: Chang, J.-M., Di Tommaso, P., Taly, J.-F. & Notredame, C. Accurate multiple sequence alignment of transmembrane proteins with PSI-Coffee. BMC Bioinformatics 13 Suppl 4, S1 (2012).
|
6 |
5 |
Sequence alignment post-process |
Uncertainty and its effect on downstream analysis
How to detect uncertainty? |
TCS web server
TCS: Chang, J.-M., Di Tommaso, P. & Notredame, C. TCS: a new multiple sequence alignment reliability measure to estimate alignment accuracy and improve phylogenetic tree reconstruction. Mol. Biol. Evol. 31, 1625–37 (2014). |
6 |
6 |
Phylogenetic tree 1/2 |
Probabilistic and ideal-data models
Character/parsimony-based methods |
Databases of rRNA sequences and associated software summary by Manolo Gouy
The rRNA WWW Server by Antwerp, Belgium
The Ribosomal Database Project by Michigan State University |
6 |
7 |
Phylogenetic tree 2/2 |
Distance-based methods: UPGMA, NJ
Maximum-likelihood methods: PhyML |
Programs for molecular phylogeny summary by Manolo Gouy
PHYLIP: an extensive package of programs for all platforms
PAUP: a high-performance commercial package
PHYLO_WIN: a graphical interface, for Unix only
MrBayes: Bayesian phylogenetic analysis
PhyML: fast maximum likelihood tree building
WWW-interface at Institut Pasteur, Paris |
6 |
8 |
Protein secondary structure prediction |
Neural network approach
Knowledge-based approach |
- HYPROSP: Wu, K.-P., Lin, H.-N., Chang, J.-M., Sung, T.-Y. & Hsu, W.-L. HYPROSP: a hybrid protein secondary structure prediction algorithm—a knowledge-based approach. Nucleic Acids Research 32, 5059–5065 (2004).
- HYPROSP II: Lin, H.-N., Chang, J.-M., Wu, K.-P., Sung, T.-Y. & Hsu, W.-L. HYPROSP II - a knowledge-based hybrid method for protein secondary structure prediction based on local prediction confidence. Bioinformatics 21, 3227–3233 (2005).
|
6 |
9 |
Protein functional classes prediction |
Machine learning
Feature reduction? |
The Critical Assessment of protein Function Annotation algorithms (CAFA)
- CAFA1: Radivojac, P. et al. A large-scale evaluation of computational protein function prediction. Nat. Methods 10, 221–7 (2013).
- CAFA2: Jiang, Y. et al. An expanded evaluation of protein function prediction methods shows an improvement in accuracy. arXiv preprint arXiv:1601.00891 (2016).
PSLDoc: Chang, J.-M. et al. PSLDoc: Protein subcellular localization prediction based on gapped-dipeptides and probabilistic latent semantic analysis. Proteins 72, 693–710 (2008).
PSLDoc2: Chang, J.-M. et al. Efficient and interpretable prediction of protein functional classes by correspondence analysis and compact set relations. PLoS ONE 8, e75542 (2013). |
6 |
10 |
Midterm |
One A4 page |
|
3 |
11 |
Genomics |
What are genes and genomes? How is a gene expressed and regulated?
The Human Genome Project
Gene finding |
The Assemblathon |
6 |
12 |
Next generation sequencing |
RNA-Seq: large amounts of data
How to identify significant expression? |
Applications of next-generation sequencing by Nature Reviews Genetics |
6 |
13 |
Comparative genomics |
Genome alignment
Phylogenomics
Single-nucleotide polymorphisms associated with diseases |
The Alignathon
HaploReg: a tool for exploring annotations of the noncoding genome at variants on haplotype blocks
ClinVar: aggregates information about genomic variation and its relationship to human health |
6 |
14 |
Computational epigenetics |
Chromatin biology
Nuclear organization |
Chromosome conformation capture techniques (染色體結構捕捉技術) by 陳政儀
- Hi-C contact bias: Yaffe, E. & Tanay, A. Probabilistic modeling of Hi-C contact maps eliminates systematic biases to characterize global chromosomal architecture. Nat. Genet. 43, 1059–65 (2011).
- Hi-C peak calling method: Rao, S. S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–80 (2014).
- Genome segmentation: Ernst, J. & Kellis, M. ChromHMM: automating chromatin-state discovery and characterization. Nature Methods 9, 215–216 (2012).
- Review: Sexton, T. & Cavalli, G. The role of chromosome domains in shaping the functional genome. Cell 160, 1049–59 (2015).
|
6 |
15 |
Big data - big projects |
ENCODE: Encyclopedia of DNA Elements
modENCODE: model organism Encyclopedia of DNA Elements
NIH Roadmap Epigenomics
1000 Genomes |
Collected papers for Epigenome Roadmap
Epigenetics by Nature Reviews Genetics
ENCODE
modENCODE
Roadmap Epigenomics project
1000 Genomes project |
6 |
16 |
Big data visualization |
UCSC/Ensembl genome browser
WashU Epigenome Browser |
UCSC Genome browser
Ensembl genome browser
WashU Epigenome browser |
6 |
17 |
Databases |
RCSB Protein Data Bank (PDB)
NCBI Sequence Read Archive (SRA) |
PDB
NCBI SRA |
6 |
18 |
Final project presentation |
Grading rubrics |
|
|