Semester: Fall Semester, 2023
Department: Sophomore Class of BA in Global Governance
Course Name: Data Science
Instructor: LU HSIN-TSE
Course Type: Elective
Course Objective
Course Description
Course Schedule






Introduction to data & ChatGPT

Self-made teaching materials



What Is the Problem with Data?

Self-made teaching materials

  • Lecture

  • Homework: Collecting a tiny dataset.


CV - Introduction to Image Segmentation

Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Fei-Fei, L. (2009, June). ImageNet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition (pp. 248-255). IEEE.

  • Lecture

  • Practice: segmentation with K-Means clustering and KNN

  • Homework: Semantic segmentation.


CV - Introduction to Image Classification

  • Lecture

  • Practice: SVM classification

  • Homework: Parameter tuning.


National Day (Oct-10th)


CV - Image Generation

Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 10684-10695).

  • Lecture

  • Practice: Stable Diffusion & Midjourney

  • Homework: Train a classification model using AI-generated content and compare its performance


NLP - Introduction to Language Modeling (1)

Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI blog, 1(8), 9.

  • Lecture

  • Practice: Word Vector & classification


NLP - Introduction to Language Modeling (2)

  • Lecture

  • Practice: Text processing

  • Homework: Semantic analysis


NLP - Introduction to Text Generation

  • Lecture

  • Practice: Markov Chain, GPT-2

  • Homework: Compare Markov Chain with ChatGPT


NLP - Other Tasks (1)

Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.

  • Lecture

  • Practice: Machine reading comprehension & sentence classification using Hugging Face

  • Homework: Sentence classification


NLP - Other Tasks (2)

  • Lecture

  • Practice: Machine reading comprehension & question answering using Hugging Face

  • Homework: Question answering


Project Proposal




Time Series - Introduction to Time Series Forecasting

Self-made teaching materials

  • Lecture

  • Practice: ARIMA model


Time Series - Introduction to Time Series Forecasting

Self-made teaching materials

  • Lecture

  • Practice: RNN

  • Homework: Stock price prediction and performance comparison


In-class speech

(Tentative) Mr. Veeresh Ittangihal, Data Scientist at Micron Technology - Data Scientist in the Semiconductor Industry


Final Project




Final Project




Final Project


Final Project Presentation

Teaching Methods
Teaching Assistant

  1. Attendance (10%): This is an in-person course, and you are expected to attend class every week. In line with the school's epidemic prevention policies, we will switch to online classes (Google Meet) if another COVID-19 outbreak occurs.

  2. Homework (40%): Homework will be assigned almost every week, and you (or your team) must submit it on time through the learning management system (Google Classroom). Late submissions will not be accepted.

  3. Final Project (50%): We are going to publish one dataset on the Internet. You (or your team) have to collect the data, describe its characteristics, explain the tasks it supports, and give a simple demo on your data to demonstrate its quality.

Textbook & Reference

  1. Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Fei-Fei, L. (2009, June). ImageNet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition (pp. 248-255). IEEE.

  2. Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.

  3. Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., ... & Sutskever, I. (2021, July). Learning transferable visual models from natural language supervision. In International conference on machine learning (pp. 8748-8763). PMLR.

  4. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI blog, 1(8), 9.

  5. Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 10684-10695).

  6. Self-made teaching materials

URLs about Course