Semester: Fall Semester, 2023
Department: Sophomore Class of BA in Global Governance
Course Name: Data Science
Instructor: LU HSIN-TSE
Credit: 3.0
Course Type: Elective
Prerequisite
Course Objective
Course Description
Course Schedule

Week 1
Topic: Introduction to data & ChatGPT
Content and assigned reading: Self-made teaching materials
Teaching activities and homework:
  • Lecture

Week 2
Topic: What is the problem with data
Content and assigned reading: Self-made teaching materials
Teaching activities and homework:
  • Lecture
  • Homework: Collecting a tiny dataset

Week 3
Topic: CV - Introduction to Image Segmentation
Content and assigned reading: Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Fei-Fei, L. (2009, June). ImageNet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition (pp. 248-255). IEEE.
Teaching activities and homework:
  • Lecture
  • Practice: Clustering (K-Means, KNN) segmentation
  • Homework: Semantic segmentation

Week 4
Topic: CV - Introduction to Image Classification
Teaching activities and homework:
  • Lecture
  • Practice: SVM classification
  • Homework: Parameter tuning

Week 5
Topic: National Day (October 10)

Week 6
Topic: CV - Image Generation
Content and assigned reading: Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 10684-10695).
Teaching activities and homework:
  • Lecture
  • Practice: Stable Diffusion & Midjourney
  • Homework: Train a classification model using AI-generated content and compare performance

Week 7
Topic: NLP - Introduction to Language Modeling (1)
Content and assigned reading: Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI blog, 1(8), 9.
Teaching activities and homework:
  • Lecture
  • Practice: Word vectors & classification

Week 8
Topic: NLP - Introduction to Language Modeling (2)
Teaching activities and homework:
  • Lecture
  • Practice: Text processing
  • Homework: Semantic analysis

Week 9
Topic: NLP - Introduction to Text Generation
Teaching activities and homework:
  • Lecture
  • Practice: Markov Chain, GPT-2
  • Homework: Compare Markov Chain with ChatGPT

Week 10
Topic: NLP - Other tasks (1)
Content and assigned reading: Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
Teaching activities and homework:
  • Lecture
  • Practice: Machine reading comprehension & sentence classification using HuggingFace
  • Homework: Sentence classification

Week 11
Topic: NLP - Other tasks (2)
Teaching activities and homework:
  • Lecture
  • Practice: Machine reading comprehension & question answering using HuggingFace
  • Homework: Question answering

Week 12
Topic: Project Proposal
Teaching activities and homework:
  • Presentation

Week 13
Topic: Time Series - Introduction to Time Series Forecasting
Content and assigned reading: Self-made teaching materials
Teaching activities and homework:
  • Lecture
  • Practice: ARIMA model

Week 14
Topic: Time Series - Introduction to Time Series Forecasting
Content and assigned reading: Self-made teaching materials
Teaching activities and homework:
  • Lecture
  • Practice: RNN
  • Homework: Stock price prediction, performance comparison

Week 15
Topic: In-class speech
Content and assigned reading: (Tentative) Mr. Veeresh Ittangihal, Data Scientist at Micron Technology - Data Scientist in the Semiconductor Industry

Week 16
Topic: Final Project

Week 17
Topic: Final Project

Week 18
Topic: Final Project
Teaching activities and homework:
  • Final Project Presentation




Teaching Methods
Teaching Assistant
Requirement/Grading

  1. Attendance (10%): This course is taught in person, and you are expected to attend class every week. In line with the school's epidemic prevention policies, classes will move online (Google Meet) if another COVID-19 outbreak occurs.

  2. Homework (40%): Homework will be assigned almost every week, and you (or your team) must submit it on time to the learning management system (Google Classroom). Late submissions will not be accepted.

  3. Final Project (50%): We will publish one dataset on the Internet. You (or your team) must collect the data, describe its characteristics, explain the tasks it supports, and give a simple demo on the data to demonstrate its quality.


Textbook & Reference

  1. Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Fei-Fei, L. (2009, June). ImageNet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition (pp. 248-255). IEEE.

  2. Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.

  3. Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., ... & Sutskever, I. (2021, July). Learning transferable visual models from natural language supervision. In International conference on machine learning (pp. 8748-8763). PMLR.

  4. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI blog, 1(8), 9.

  5. Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 10684-10695).

  6. Self-made teaching materials


Course URLs
Attachment