Building Foundation Models using Transformers

Hands-on session on building and pre-training a BERT-based language model, covering advancements in representation learning for language and the importance of interdisciplinary research.

Series: Paper Talk Episode 4, Research et al.

Venue: PES University

Date: September 19, 2023

Overview

This session was delivered as part of Research et al.’s Paper Talk series to undergraduate students at PES University’s Department of Computer Science and Engineering. It surveyed advances in representation learning for language and made a case for interdisciplinary research.

Hands-On Session

The hands-on session walked students through building and pre-training a BERT-based language model from scratch using the Hugging Face Transformers library. Students trained their own tokenizer and model on chat data exported from their WhatsApp groups, then ran inference on the trained model in Google Colab.
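The pipeline above can be sketched end to end with the Hugging Face `transformers` and `tokenizers` libraries. This is a minimal illustration, not the session's actual notebook: the chat lines, vocabulary size, and model dimensions below are toy placeholders chosen so it runs on CPU in seconds.

```python
# Toy end-to-end sketch: train a WordPiece tokenizer, pre-train a tiny
# BERT with masked-language modelling (MLM), then run inference.
# Assumes `torch`, `transformers`, and `tokenizers` are installed; the
# corpus and model sizes are illustrative, not the session's real data.
import torch
from tokenizers import Tokenizer, models, pre_tokenizers, trainers
from transformers import BertConfig, BertForMaskedLM, PreTrainedTokenizerFast

# 1. Stand-in lines for messages exported from a WhatsApp group chat.
corpus = [
    "see you at the lab tomorrow",
    "the model finished training last night",
    "can someone share the colab link",
] * 50

# 2. Train a WordPiece tokenizer from scratch on the corpus.
specials = ["[UNK]", "[PAD]", "[CLS]", "[SEP]", "[MASK]"]
wp = Tokenizer(models.WordPiece(unk_token="[UNK]"))
wp.pre_tokenizer = pre_tokenizers.Whitespace()
wp.train_from_iterator(
    corpus, trainers.WordPieceTrainer(vocab_size=500, special_tokens=specials)
)
tokenizer = PreTrainedTokenizerFast(
    tokenizer_object=wp, unk_token="[UNK]", pad_token="[PAD]",
    cls_token="[CLS]", sep_token="[SEP]", mask_token="[MASK]",
)

# 3. A tiny BERT, pre-trained for a few MLM steps.
config = BertConfig(vocab_size=tokenizer.vocab_size, hidden_size=64,
                    num_hidden_layers=2, num_attention_heads=2,
                    intermediate_size=128, max_position_embeddings=64)
model = BertForMaskedLM(config)
batch = tokenizer(corpus[:8], return_tensors="pt", padding=True)
labels = batch["input_ids"].clone()
mask = (torch.rand(labels.shape) < 0.15) & batch["attention_mask"].bool()
mask[0, 1] = True                        # guarantee at least one masked token
batch["input_ids"][mask] = tokenizer.mask_token_id
labels[~mask] = -100                     # compute loss on masked tokens only
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)
for _ in range(3):                       # a few pre-training steps
    out = model(**batch, labels=labels)
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# 4. Inference: ask the model to fill in a masked token.
model.eval()
enc = tokenizer(f"see you at the {tokenizer.mask_token} tomorrow",
                return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits
pos = (enc["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0]
pred = tokenizer.decode(logits[0, pos].argmax(dim=-1))
print("final loss:", round(out.loss.item(), 3), "| prediction:", pred)
```

A full session-scale run would typically stream the whole chat export through `Trainer` with `DataCollatorForLanguageModeling` (which handles the 15% masking) instead of the hand-rolled loop above.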

Materials