|What is bioinformatics? |
Central dogma of molecular biology: DNA, mRNA, protein
|The DNA Journey |
Canadian Bioinformatics Workshops (all slides and video are available)
|Why do we need sequence alignment? |
Its applications in structural homology and evolutionary modeling
|SEAVIEW : Sequence alignment editor |
|Pairwise Sequence alignment|
|Global & Local alignment |
Linear space algorithm
|NCBI BLAST server |
BLAST by O'Reilly Media
|Multiple Sequence alignment|
|Variations of the algorithms: which one is better? |
Another issue: huge amounts of data
|T-Coffee web server |
PSI/TM-Coffee web server
- PSI/TM-Coffee: Floden, E. W. et al. PSI/TM-Coffee: a web server for fast and accurate multiple sequence alignments of regular and transmembrane proteins using homology extension on reduced databases. Nucleic Acids Res. 44, W339–43 (2016).
- PSI-Coffee: Chang, J.-M., Di Tommaso, P., Taly, J.-F. & Notredame, C. Accurate multiple sequence alignment of transmembrane proteins with PSI-Coffee. BMC Bioinformatics 13 Suppl 4, S1 (2012).
|Sequence alignment post-process|
|Uncertainty and its effect on downstream analysis |
How to detect uncertainty?
|TCS web server |
TCS: Chang, J.-M., Di Tommaso, P. & Notredame, C. TCS: a new multiple sequence alignment reliability measure to estimate alignment accuracy and improve phylogenetic tree reconstruction. Mol. Biol. Evol. 31, 1625–37 (2014).
|Phylogenetic tree 1/2|
|Probabilistic and ideal-data models |
|Databases of rRNA sequences and associated software summary by Manolo Gouy |
The rRNA WWW Server by Antwerp, Belgium
The Ribosomal Database Project by Michigan State University
|Phylogenetic tree 2/2|
|Distance-based methods: UPGMA, NJ |
Maximum-likelihood methods: PhyML
|Programs for molecular phylogeny summary by Manolo Gouy |
PHYLIP: an extensive package of programs for all platforms
PAUP: a high-performance commercial package
PHYLO_WIN: a graphical interface, for Unix only
MrBayes: Bayesian phylogenetic analysis
PhyML: fast maximum likelihood tree building
WWW-interface at Institut Pasteur, Paris
|Protein secondary structure prediction|
|Neural network approach |
- HYPROSP: Wu, K.-P., Lin, H.-N., Chang, J.-M., Sung, T.-Y. & Hsu, W.-L. HYPROSP: a hybrid protein secondary structure prediction algorithm—a knowledge-based approach. Nucleic Acids Research 32, 5059–5065 (2004).
- HYPROSP II: Lin, H.-N., Chang, J.-M., Wu, K.-P., Sung, T.-Y. & Hsu, W.-L. HYPROSP II: a knowledge-based hybrid method for protein secondary structure prediction based on local prediction confidence. Bioinformatics 21, 3227–3233 (2005).
|Protein functional classes prediction|
|Machine learning |
|The Critical Assessment of protein Function Annotation algorithms (CAFA) |
- CAFA1: Radivojac, P. et al. A large-scale evaluation of computational protein function prediction. Nat. Methods 10, 221–7 (2013).
- CAFA2: Jiang, Y. et al. An expanded evaluation of protein function prediction methods shows an improvement in accuracy. arXiv preprint arXiv:1601.00891 (2016).
- PSLDoc: Chang, J.-M. et al. PSLDoc: Protein subcellular localization prediction based on gapped-dipeptides and probabilistic latent semantic analysis. Proteins 72, 693–710 (2008).
- PSLDoc2: Chang, J.-M. et al. Efficient and interpretable prediction of protein functional classes by correspondence analysis and compact set relations. PLoS ONE 8, e75542 (2013).
|One A4 page|
|What are genes and genomes? How is a gene expressed and regulated? |
The Human Genome Project
|Next generation sequencing|
|RNA-Seq: large amounts of data |
How to identify significant expression?
|Applications of next-generation sequencing by Nature Reviews Genetics|
|Genome alignment |
Single-nucleotide polymorphisms associated with diseases
|The Alignathon |
HaploReg: a tool for exploring annotations of the noncoding genome at variants on haplotype blocks
ClinVar: aggregates information about genomic variation and its relationship to human health
|Chromatin biology |
|Chromosome conformation capture techniques by 陳政儀 |
- Hi-C contact bias: Yaffe, E. & Tanay, A. Probabilistic modeling of Hi-C contact maps eliminates systematic biases to characterize global chromosomal architecture. Nat. Genet. 43, 1059–65 (2011).
- Hi-C peak calling method: Rao, S. S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–80 (2014).
- Genome segmentation: Ernst, J. & Kellis, M. ChromHMM: automating chromatin-state discovery and characterization. Nature Methods 9, 215–216 (2012).
- Review: Sexton, T. & Cavalli, G. The role of chromosome domains in shaping the functional genome. Cell 160, 1049–59 (2015).
|Big data - big projects|
|ENCODE: Encyclopedia of DNA Elements |
modENCODE: model organism Encyclopedia of DNA Elements
NIH Roadmap Epigenomics
|Collected papers for Epigenome Roadmap |
Epigenetics by Nature Reviews Genetics
Roadmap Epigenomics project
1000 Genomes project
|Big data visualization|
|UCSC/Ensembl genome browser |
WashU Epigenome Browser
|UCSC Genome browser |
Ensembl genome browser
WashU Epigenome browser
|Databases |
|RCSB Protein Data Bank (PDB) |
NCBI Sequence Read Archive (SRA)
|Final project presentation|