General Information
| Full Name | Aditeya Baral |
| Email | aditeyabaral [at] nyu [dot] edu |
| Location | New York, NY |
| Languages | English, Hindi, Bengali |
Education
-
Sep '24 – May '26 Master's in Computer Science
New York University, Courant Institute of Mathematics, Computing and Data Science - GPA: 3.83/4.00
- Concentration: Artificial Intelligence
- Worked as a Research Assistant at the CILVR Lab and the Computation and Psycholinguistics Lab, advised by Shauli Ravfogel, Jackson Petty, and Tal Linzen.
-
Aug '18 - May '22 Bachelor of Technology in Computer Science & Engineering
PES University, Bengaluru, India - GPA: 8.71/10.00 ≈ 3.76/4.00
- Specialization: Machine Intelligence and Data Science
- Received the Undergraduate Researcher Award for my work in the field of Machine Learning.
- Worked as a Research Assistant at the Center for Cloud Computing & Big Data, advised by Dr. KV Subramaniam.
Research Experience
-
Jun '25 – Dec '25 Applied Research Scientist Intern, Redis LangCache
Redis, San Francisco, USA
- Architected a two-stage retrieval and re-ranking pipeline for Redis LangCache, achieving a 12.5% PR-AUC and 8% P-CHR AUC improvement over baselines by integrating cross-encoder re-rankers for full token-level interaction.
- Curated and open-sourced LangCache SentencePairs (v1-v3), a large-scale dataset family spanning 1M to 40M examples from diverse linguistic sources, enabling robust fine-tuning of semantic retrieval and re-ranking models.
- Open-sourced LangCache ReRanker v1 and v2 model families comprising cross-encoder variants fine-tuned with ranking and classification objectives, enabling application-specific score calibration for diverse semantic caching use cases.
- Assisted in the fine-tuning and deployment of LangCache Embed v3, a generalist model for semantic retrieval, achieving 13.5% PR-AUC improvement over v2 and outperforming larger general-purpose models even without re-ranking.
- Developed a comprehensive evaluation framework integrated with RedisVL for LangCache customers, enabling systematic analysis of achievable P-CHR tradeoffs, valid cache-hit rates, and operational thresholds before onboarding.
- Quantified retriever bottlenecks and aggressive vs. conservative re-ranking effectiveness by analyzing recall ceilings and re-ranking movement to optimize operational trade-offs and improve cache-hit quality.
- Supported downstream integration and development of LMCache by building prototypes and conducting performance studies with Redis as an in-memory KV store, demonstrating latency and throughput gains.
-
May '25 – Present Research Assistant
Computational Intelligence, Vision, and Robotics (CILVR) Lab, NYU
- Investigating arithmetic circuit dynamics in LLMs when operators are redefined in-context by analyzing activation representations and attention patterns across transformer layers using Llama-3.3-70B-Instruct.
- Conducting layer-wise analysis of activation geometry using PCA, centroid trajectories, and cluster separability metrics to trace representational evolution under operator semantic redefinition.
- Examining attention circuit reconfiguration at token and head levels to determine whether semantic remapping reuses existing circuits or activates distinct computational pathways.
-
May '25 – Present Research Assistant
Computation and Psycholinguistics Lab, NYU
- Evaluating LLMs on compositional generalization and instruction synthesis by studying their ability to translate synthetic Context-Free Grammars (CFGs) into conforming strings.
- Analyzing model outputs in few- and zero-shot settings to assess grammatical conformity and uncover generation strategies used during translation.
-
Jul '22 – Jul '24 Applied AI Engineer, Webex Media Quality Analytics
Cisco Systems, Bengaluru, India
- Instruction-tuned LLMs such as Mistral and Llama-2 on-prem to enable secure and cost-effective AI solutions such as translation and RAG for engineers and customers, cutting third-party dependency costs by 30%.
- Led the initiative to build a novel pre-training algorithm for conversational data using PyTorch and HuggingFace, achieving a 40% performance gain over standard approaches on benchmark fine-tuning tasks.
- Developed the Webex Contextual Search engine, improving search, ranking, recommendations, and topic modeling by 75% with <10% added latency overhead.
- Integrated OpenAI APIs and on-prem LLMs with the Webex AI Assistant for 15M+ worldwide users to add auto-replies, summarization, querying, and action-item extraction to message threads and meeting transcripts.
-
Aug '21 – Dec '21 Applied Research Scientist Intern, Intel (VSG) Research
Intel Corporation, Bengaluru, India
- Explored Few-Shot Object Detection (FSOD) techniques to reduce catastrophic forgetting in constrained and heterogeneous driving environments.
- Investigated and designed novel representation learning and attention mechanisms to learn inter/intra-object relationships using PyTorch.
- Outperformed the existing approaches at the time on base and novel classes by 0.2 mAP and 3 mAP, respectively, on the Few-Shot India Driving Dataset, a benchmark for FSOD.
-
May '20 – Jul '20 Research Assistant
Center for Cloud Computing & Big Data, PES University
- Compiled and used TailBench to simulate and profile application loads, monitor performance, and analyze results.
- Explored ways to reduce tail latencies in latency-critical applications such as translation and image recognition.
Software Engineering Experience
-
Jul '22 – Jul '24 Big Data Engineer, Webex Media Quality Analytics
Cisco Systems, Bengaluru, India
- Developed and deployed streaming jobs in Scala and Flink to process 1M+ reports/min and compute 1200+ real-time metrics from Calls and Meetings.
- Applied statistical modeling techniques to investigate and report media quality insights to downstream consumers, reducing errors by 30% and analysis time by 15 hrs/week per team member.
- Led the development of real-time (<1 min) auditing pipelines using Kafka and Python to ensure per-minute data consistency between streaming jobs and Iceberg and Pinot data stores, reducing manual effort by >80%.
- Built graphs and dashboards on the Webex Media Quality Analytics Dashboard using Grafana and Kibana to set up alerts and KPIs for 20,000+ clients and customers.
-
Jan '22 – Jun '22 Big Data Engineering Intern, Webex VideoMesh Analytics
Cisco Systems, Bengaluru, India
- Migrated the Meetings Analytics Engine from Java and Spark to Scala and Flink to scale up to 1M+ reports/min and improve real-time report generation by over 40%.
- Built VideoMesh Developer APIs using Java and globally rolled them out for 30,000+ enterprises with customer-facing applications.
Skills
-
Languages
- Python, Scala, Java, C/C++, R, Groovy, SQL, LaTeX
-
ML/Stats Libraries
- PyTorch, TensorFlow, HuggingFace, NLTK, pandas, NumPy, scikit-learn, seaborn, matplotlib, plotly
-
AI/ML Techniques
- Representation Learning, Mechanistic Interpretability, Transfer Learning, Language Models, RAG
-
Big Data/Cloud
- Hadoop, Kafka, Zookeeper, Spark, Flink, Iceberg, Pinot, Redis, ELK
-
Frameworks/Tools
- Git, GitHub, Jenkins, Docker, Kubernetes, Flask, Grafana, PSQL, MongoDB, AWS, Linux
Honors and Awards
-
2024 - Second Place out of 20+ teams at Webex Analytics Datathon 2024
- Containerized and deployed a self-contained, quantized, on-prem LLM-RAG pipeline to assist engineers with engineering queries and incident resolution.
-
2023 - Ranked #1 Internationally out of 300+ teams at the Webex IDEA Hackathon 2023
- Integrated OpenAI LLM APIs with the Webex Assistant to enable summarization of message threads, media and transcripts: https://www.webex.ai/?socialshare=VideoOverlayPlayer
- Developed thread-related user actions like searching, grouping and sorting across Webex.
- Assisted in rolling out these features worldwide.
- Ranked #1 regionally and Top 20 Internationally out of 300+ teams at the Webex Playtime Hackathon 2023
- Developed the Webex Contextual Search engine using novel conversational representation learning techniques and displayed significant improvement in searching, ranking and recommendations.
-
2022 - Awarded the Undergraduate Researcher Award by PES University for my work in the field of Machine Learning.
- 3x Scholarship Recipient (Prof. CNR Rao, MRD & DAC Scholarship Awards) for being in the top 20% among 900+ students at PES University.
- Finalist at Intel Technovation, Flipkart, IBM, and IISc Hackathons, placing among the top 200+ teams.
-
2017 - National newspaper coverage for proposing the currently implemented model to track garbage collection in Bengaluru.
- Received extensive coverage and recognition for developing an Android app to track and schedule garbage collection in Bengaluru.
- The currently implemented model is based on our designs and proposals made to the BBMP.
- The Hindu: https://www.thehindu.com/news/cities/bangalore/waste-disposal-all-garbage-trucks-to-have-gps-devices/article29906398.ece
- India Today: https://www.indiatoday.in/cities/bengaluru/story/app-bangaloreans-track-garbage-vehicles-bbmp-gps-fails-1909736-2022-02-07
- The Times of India: https://bangaloremirror.indiatimes.com/bangalore/others/these-12th-graders-want-app-solutely-no-garbage/articleshow/57619336.cms
Services and Volunteering
-
2023 - Speaker, Guest Lecture on "Building Foundation Models using Transformers"
- Delivered a guest lecture to undergraduate students on the advancements in representation learning techniques for language and highlighted the importance of interdisciplinary research.
-
2021 - Appointed a Teaching Assistant for the CS322 Big Data course at the Department of Computer Science, PES University.
- Designed and graded coursework, assignments, and projects, and delivered hands-on sessions on Hadoop and Spark for a class of 600+ enrolled undergraduate students.