The emergence of collaboration platforms in recent years has enabled users to communicate effortlessly across the world. Existing conversational modelling techniques do not learn accurate representations of structured conversations, since they are primarily aimed at causal text generation in a one-to-one setting. Structured conversations, on the other hand, involve multiple turns and store metadata such as authorship, timestamps, membership, and other attributes. Causal language models attend only to previous tokens while generating the next token, and therefore fail to represent the complex bidirectional relationships between concepts, authors and turns in structured conversations that are essential for language understanding and predictive tasks. In this paper, we propose a novel pre-training strategy for multi-turn dialogue modelling that leverages both conversational data and metadata. Our approach combines multiple supervised and unsupervised objectives to learn task- and domain-agnostic representations that capture both the semantics and the structure of conversations. Our experiments show that the resulting language models learn hierarchical relationships between dialogues, concepts and authors in conversations, allowing them to outperform existing conversational models on multiple downstream tasks.
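As a rough illustration of combining a semantic and a structural objective, the sketch below joins a standard masked-LM loss with a hypothetical author-prediction head over each turn's [CLS] state. The checkpoint, head, and label names are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch: masked-LM (semantic) loss plus an auxiliary author-prediction
# (structural) loss. All names here are assumptions for illustration only.
import torch.nn as nn
from transformers import BertForMaskedLM

mlm = BertForMaskedLM.from_pretrained("bert-base-uncased")
num_authors = 16                                    # toy size, dataset-dependent
author_head = nn.Linear(mlm.config.hidden_size, num_authors)

def pretraining_loss(input_ids, attention_mask, mlm_labels, author_labels):
    # Semantic objective: standard masked-token prediction.
    out = mlm(input_ids=input_ids, attention_mask=attention_mask,
              labels=mlm_labels, output_hidden_states=True)
    # Structural objective: predict each turn's author from its [CLS] state.
    cls_state = out.hidden_states[-1][:, 0]         # (batch, hidden)
    author_loss = nn.functional.cross_entropy(author_head(cls_state), author_labels)
    return out.loss + author_loss                   # joint multi-objective loss
```

Further metadata objectives (timestamps, membership) could be added as extra heads in the same way.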
2022
AAAI-MAKE 2022
CalBERT - Code-Mixed Adaptive Language Representations Using BERT
Aditeya Baral, Aronya Baksy, Ansh Sarkar, and 2 more authors
In Proceedings of the AAAI 2022 Spring Symposium on Machine Learning and Knowledge Engineering for Hybrid Intelligence (AAAI-MAKE 2022), Stanford University, Palo Alto, California, USA, March 21-23, 2022
A code-mixed language combines two or more language varieties in its script or speech. Code-mixed text is difficult to analyse because its language is not consistent and existing monolingual approaches do not transfer to it. We propose a novel approach to improve the performance of Transformers by introducing an additional step called "Siamese pre-training", which allows pre-trained monolingual Transformers to adapt their language representations to code-mixed languages from only a few examples of code-mixed data. The proposed architectures beat the state-of-the-art F1 score on the Sentiment Analysis for Indian Languages (SAIL) dataset, with an improvement of up to 5.1 points, and also achieve state-of-the-art accuracy on the IndicGLUE Product Reviews dataset, exceeding the benchmark by 0.4 points.
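The sketch below shows one plausible reading of Siamese pre-training: encode a code-mixed sentence and its monolingual counterpart with the same Transformer and minimise the distance between their pooled embeddings. The checkpoint, mean pooling, and cosine loss are assumptions, not CalBERT's exact recipe.

```python
# Siamese pre-training sketch: pull a code-mixed sentence's embedding towards
# its monolingual counterpart. Checkpoint, pooling and loss are assumptions.
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
encoder = AutoModel.from_pretrained("bert-base-multilingual-cased")

def embed(sentences):
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state     # (batch, seq, hidden)
    mask = batch["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(1) / mask.sum(1)     # mean pooling over tokens

code_mixed = ["mujhe yeh movie bahut achhi lagi"]   # toy Hinglish example
monolingual = ["I liked this movie a lot"]
loss = 1 - F.cosine_similarity(embed(code_mixed), embed(monolingual)).mean()
loss.backward()                                     # adapt the shared encoder
```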
2021
Intel Research
Information Maximization to Overcome Catastrophic Forgetting in Few-Shot Object Detection
Aditeya Baral, Anay Majee, and Anbumani Subramanian
In Work done and published as part of Intel (VSG) Research, 2021
Few-shot object detection encompasses the tasks of localizing and classifying objects in an image given a limited number of training examples. Recent techniques in this domain suffer from confusion between object classes and tend to forget already learnt classes, a phenomenon known as catastrophic forgetting. Our work overcomes catastrophic forgetting through an information-maximization approach, the Information Maximization Network (IMNet), which learns more descriptive feature representations without overfitting to irrelevant ones, while retaining the relevant features of already learnt classes in an input image. Our Cross-Entropy Similarity Loss reduces class confusion by adjusting the embedding space so that instances of homogeneous classes have feature representations close to one another while heterogeneous classes are highly separated. We conduct our experiments on the India Driving Dataset (IDD), which presents a real-world setting with large class imbalance. Our IMNet architecture outperforms existing meta-learning approaches by 0.2 mAP on the base classes and by up to 3 mAP on the novel classes of IDD.
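As a hedged sketch of the Cross-Entropy Similarity Loss idea, the snippet below treats cosine similarities between embeddings and per-class prototypes as logits in a cross-entropy objective, pulling homogeneous classes together and pushing heterogeneous ones apart. The prototype construction and temperature are illustrative assumptions, not the paper's exact formulation.

```python
# Similarity loss sketch: per-class prototype similarities as cross-entropy
# logits. Prototype construction and temperature are assumptions.
import torch
import torch.nn.functional as F

def similarity_loss(embeddings, labels, num_classes, temperature=0.1):
    # Class prototypes: per-class mean embedding; assumes every class
    # appears at least once in the batch (otherwise the mean is undefined).
    protos = torch.stack([embeddings[labels == c].mean(0) for c in range(num_classes)])
    # Similarity of every embedding to every prototype acts as class logits.
    logits = F.cosine_similarity(embeddings.unsqueeze(1), protos.unsqueeze(0), dim=-1)
    return F.cross_entropy(logits / temperature, labels)

emb = F.normalize(torch.randn(32, 128), dim=-1)     # toy RoI feature embeddings
labels = torch.randint(0, 4, (32,))                 # toy class labels
loss = similarity_loss(emb, labels, num_classes=4)
```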
ICNLSP 2021
MAPLE – MAsking words to generate blackout Poetry using sequence-to-sequence LEarning
Aditeya Baral, Himanshu Jain, Deeksha D, and 1 more author
In Proceedings of the 4th International Conference on Natural Language and Speech Processing (ICNLSP 2021), Nov 2021
Poetry has morphed rapidly over changing times, with non-traditional forms stirring the creative minds of people today. One such form is blackout poetry, in which most words in a passage are masked out, leaving only a few which, when read in order, convey some meaning. With recent developments in Natural Language Processing aiming to simulate human creativity, we propose a novel deep-learning approach to blackout poetry generation. We explore four architectures: an encoder-decoder with Bidirectional Long Short-Term Memory (LSTM) and attention, a Bidirectional LSTM-Conditional Random Field (LSTM-CRF) architecture, Bidirectional Encoder Representations from Transformers (BERT), and the Robustly Optimized BERT Pre-training Approach (RoBERTa). The first architecture frames the task as abstractive summarization, while the remaining three frame it as sequence labelling. The Transformer-based architectures prove to be the best-performing models and also pass a Turing test.
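The sequence-labelling formulation can be sketched as binary token classification: tag each token of the passage as kept (part of the poem) or masked out. The checkpoint and label convention below are placeholders, and the model is untrained here, so the output is illustrative only.

```python
# Blackout poetry as token classification: label each token KEEP (1) or MASK (0).
# Checkpoint and label convention are placeholders; weights are untrained.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForTokenClassification.from_pretrained("bert-base-uncased", num_labels=2)

passage = "The rain fell softly on the quiet empty streets of the old town"
batch = tokenizer(passage, return_tensors="pt")
with torch.no_grad():
    preds = model(**batch).logits.argmax(-1).squeeze(0).tolist()  # 1 = KEEP

tokens = tokenizer.convert_ids_to_tokens(batch["input_ids"].squeeze(0))
poem = [t for t, p in zip(tokens, preds) if p == 1 and t not in ("[CLS]", "[SEP]")]
print(" ".join(poem))   # kept words, in original order, form the blackout poem
```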
IEEE CONIT 2021
Analysis of Kepler Objects of Interest using Machine Learning for Exoplanet Identification
Ameya Rajendra Bhamare, Aditeya Baral, and Saarthak Agarwal
In 2021 International Conference on Intelligent Technologies (CONIT), Aug 2021
For several decades, planet identification was performed only by astronomical experts and researchers with the help of specialized equipment. With the advent of computational methods and access to satellite data from space missions, this trend has changed. For instance, NASA’s Exoplanet Exploration program has provided us with vast amounts of data on celestial objects to assist in space exploration. One mission of interest is the Kepler mission: over 4,000 transiting exoplanets have been identified since the mission commenced in 2007. It has produced an extensive database of discoveries that helps compute planet occurrence rates as a function of an object’s parameters, such as size, insolation flux, star type and orbital period. This information is catalogued in the Cumulative Kepler Object of Interest (KOI) dataset. We compare four baseline models: Support Vector Machines, Random Forest classifiers, AdaBoost and Deep Neural Networks. The AdaBoost classifier proved to be the optimal machine learning model, returning an F1 score of 0.98.
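A minimal sketch of such a pipeline on KOI-style features is shown below using scikit-learn's AdaBoost. The CSV path and feature subset are placeholders for the Cumulative KOI table, and the 0.98 F1 is the paper's reported result, not something this toy script reproduces.

```python
# Toy KOI classification pipeline. Path and feature subset are placeholders.
import pandas as pd
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Orbital period, planet radius, insolation flux, stellar effective temperature.
features = ["koi_period", "koi_prad", "koi_insol", "koi_steff"]
df = pd.read_csv("cumulative_koi.csv").dropna(subset=features + ["koi_disposition"])
labels = (df["koi_disposition"] == "CONFIRMED").astype(int)   # planet vs. not

X_train, X_test, y_train, y_test = train_test_split(
    df[features], labels, test_size=0.2, random_state=42)
clf = AdaBoostClassifier(n_estimators=200).fit(X_train, y_train)
print("F1:", f1_score(y_test, clf.predict(X_test)))
```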