Publications

A brief glimpse into things I've scribbled. For a comprehensive list, please refer to this page.

2026

Building a Foundational Guardrail for General Agentic Systems via Synthetic Data

Yue Huang, Hang Hua, Yujun Zhou, Pengcheng Jing, Manish Nagireddy, Inkit Padhi, Greta Dolcetti, Zhangchen Xu, Subhajit Chaudhury, Ambrish Rawat, Liubov Nedoshivina, Yu Chen, Prasanna Sattigeri, Xiangliang Zhang

International Conference on Learning Representations, 2026

2025

When in Doubt, Cascade: Towards Building Efficient and Capable Guardrails

Manish Nagireddy, Inkit Padhi, Soumya Ghosh, Prasanna Sattigeri

Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 2025

Best Paper

Granite Guardian: Comprehensive LLM Safeguarding

Inkit Padhi, Manish Nagireddy, Giandomenico Cornacchia, Subhajit Chaudhury, Tejaswini Pedapati, Pierre Dognin, Keerthiram Murugesan, Erik Miehling, Martin Santillan Cooper, Kieran Fraser, Giulio Zizzo, Muhammad Zaid Hameed, Mark Purcell, Michael Desmond, Qian Pan, Inge Vejsbjerg, Elizabeth M. Daly, Michael Hind, Werner Geyer, Ambrish Rawat, Kush R. Varshney, Prasanna Sattigeri

Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Industry Track), 2025

Programming Refusal with Conditional Activation Steering

Bruce W. Lee, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Erik Miehling, Pierre Dognin, Manish Nagireddy, Amit Dhurandhar

International Conference on Learning Representations, 2025

Spotlight

2024

Final-Model-Only Data Attribution with a Unifying View of Gradient-Based Methods

Dennis Wei, Inkit Padhi, Soumya Ghosh, Amit Dhurandhar, Karthikeyan Natesan Ramamurthy, Maria Chang

arXiv preprint, 2024

Value Alignment from Unstructured Text

Inkit Padhi, Karthikeyan Natesan Ramamurthy, Prasanna Sattigeri, Manish Nagireddy, Pierre Dognin, Kush Varshney

Conference on Empirical Methods in Natural Language Processing (Industry Track), 2024

When in Doubt, Cascade: Towards Building Efficient and Capable Guardrails

Manish Nagireddy, Inkit Padhi, Soumya Ghosh, Prasanna Sattigeri

arXiv preprint, 2024

WikiContradict: A Benchmark for Evaluating LLMs on Real-World Knowledge Conflicts from Wikipedia

Yufang Hou, Alessandra Pascale, Javier Carnerero-Cano, Tigran Tchrakian, Radu Marinescu, Elizabeth Daly, Inkit Padhi, Prasanna Sattigeri

arXiv preprint, 2024

Split, Unlearn, Merge: Leveraging Data Attributes for More Effective Unlearning in LLMs

Swanand Ravindra Kadhe, Farhan Ahmed, Dennis Wei, Nathalie Baracaldo, Inkit Padhi

arXiv preprint, 2024

Contextual Moral Value Alignment Through Context-Based Aggregation

Pierre Dognin, Jesus Rios, Ronny Luss, Inkit Padhi, Matthew D. Riemer, Miao Liu, Prasanna Sattigeri, Manish Nagireddy, Kush R. Varshney, Djallel Bouneffouf

arXiv preprint, 2024

Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations

Swapnaja Achintalwar, Adriana Alvarado Garcia, Ateret Anaby-Tavor, Ioana Baldini, Sara E. Berger, Bishwaranjan Bhattacharjee, Djallel Bouneffouf, Subhajit Chaudhury, Pin-Yu Chen, Lamogha Chiazor, Elizabeth M. Daly, Kirushikesh DB, Rogerio Abreu de Paula, Pierre Dognin, Eitan Farchi, Soumya Ghosh, Michael Hind, Raya Horesh, George Kour, Ja Young Lee, Nishtha Madaan, Sameep Mehta, Erik Miehling, Keerthiram Murugesan, Manish Nagireddy, Inkit Padhi, David Piorkowski, Ambrish Rawat, Orna Raz, Prasanna Sattigeri, Hendrik Strobelt, Sarathkrishna Swaminathan, Christoph Tillmann, Aashka Trivedi, Kush R. Varshney, Dennis Wei, Shalisha Witherspoon, Marcel Zalmanovici

arXiv preprint, 2024

Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations

Swapnaja Achintalwar, Ioana Baldini, Djallel Bouneffouf, Joan Byamugisha, Maria Chang, Pierre Dognin, Eitan Farchi, Ndivhuwo Makondo, Aleksandra Mojsilovic, Manish Nagireddy, Karthikeyan Natesan Ramamurthy, Inkit Padhi, Orna Raz, Jesus Rios, Prasanna Sattigeri, Moninder Singh, Siphiwe Thwala, Rosario A. Uceda-Sosa, Kush R. Varshney

IEEE Internet Computing, 2024

ComVas: Contextual Moral Values Alignment System

Inkit Padhi, Pierre Dognin, Jesus Rios, Ronny Luss, Swapnaja Achintalwar, Matthew Riemer, Miao Liu, Prasanna Sattigeri, Manish Nagireddy, Kush R. Varshney, Djallel Bouneffouf

International Joint Conference on Artificial Intelligence (Demo Track), 2024

Demo

Auditing and Generating Synthetic Data with Controllable Trust Trade-offs

Brian Belgodere, Pierre Dognin, Adam Ivankay, Igor Melnyk, Youssef Mroueh, Aleksandra Mojsilovic, Jiri Navratil, Apoorva Nitsure, Inkit Padhi, Mattia Rigotti, Jerret Ross, Yair Schiff, Radhika Vedpathak, Richard A. Young

IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 2024

2023

The Impact of Positional Encoding on Length Generalization in Transformers

Amirhossein Kazemnejad, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Payel Das, Siva Reddy

Advances in Neural Information Processing Systems, 2023

Influence Based Approaches to Algorithmic Fairness: A Closer Look

Soumya Ghosh, Prasanna Sattigeri, Inkit Padhi, Manish Nagireddy, Jie Chen

XAI in Action: Past, Present, and Future Applications, 2023

Reprogramming Pretrained Language Models for Antibody Sequence Infilling

Igor Melnyk, Vijil Chenthamarakshan, Pin-Yu Chen, Payel Das, Amit Dhurandhar, Inkit Padhi, Devleena Das

International Conference on Machine Learning, 2023

The Incentive Gap in Data Work in the Era of Large Models

Katy Ilonka Gero, Payel Das, Pierre Dognin, Inkit Padhi, Prasanna Sattigeri, Kush R. Varshney

Nature Machine Intelligence, 2023

Accelerating Material Design with the Generative Toolkit for Scientific Discovery

Matteo Manica, Jannis Born, Joris Cadow, Dimitrios Christofidellis, Ashish Dave, Dean Clarke, Yves Gaetan Nana Teukam, Giorgio Giannone, Samuel C. Hoffman, Matthew Buchan, Vijil Chenthamarakshan, Timothy Donovan, Hsiang Han Hsu, Federico Zipoli, Oliver Schilter, Akihiro Kishimoto, Lisa Hamada, Inkit Padhi, Karl Wehden, Lauren McHugh, Alexy Khrabrov, Payel Das, Seiji Takeda, John R. Smith

npj Computational Materials, 2023

Explainable Cross-Topic Stance Detection for Search Results

Tim Draws, Karthikeyan Natesan Ramamurthy, Ioana Baldini, Amit Dhurandhar, Inkit Padhi, Benjamin Timmermans, Nava Tintarev

Conference on Human Information Interaction and Retrieval, 2023

Cloud-Based Real-Time Molecular Screening Platform with MolFormer

Brian Belgodere, Vijil Chenthamarakshan, Payel Das, Pierre Dognin, Toby Kurien, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jerret Ross, Yair Schiff, Richard A. Young

Machine Learning and Knowledge Discovery in Databases, 2023

2022

Fair Infinitesimal Jackknife: Mitigating the Influence of Biased Training Data Points Without Refitting

Prasanna Sattigeri, Soumya Ghosh, Inkit Padhi, Pierre Dognin, Kush R. Varshney

Advances in Neural Information Processing Systems, 2022

Large-Scale Chemical Language Representations Capture Molecular Structure and Properties

Jerret Ross, Brian Belgodere, Vijil Chenthamarakshan, Inkit Padhi, Youssef Mroueh, Payel Das

Nature Machine Intelligence, 2022

Image Captioning as an Assistive Technology: Lessons Learned from VizWiz 2020 Challenge

Pierre Dognin, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jerret Ross, Yair Schiff, Richard A. Young, Brian Belgodere

Journal of Artificial Intelligence Research, 2022

2021

ReGen: Reinforcement Learning for Text and Knowledge Base Generation using Pretrained Language Models

Pierre Dognin, Inkit Padhi, Igor Melnyk, Payel Das

Conference on Empirical Methods in Natural Language Processing, 2021

Tabular Transformers for Modeling Multivariate Time Series

Inkit Padhi, Yair Schiff, Igor Melnyk, Mattia Rigotti, Youssef Mroueh, Pierre Dognin, Jerret Ross, Ravi Nair, Erik Altman

IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Accelerated Antimicrobial Discovery via Deep Generative Models and Molecular Dynamics Simulations

Payel Das, Tom Sercu, Kahini Wadhawan, Inkit Padhi, Sebastian Gehrmann, Flaviu Cipcigan, Vijil Chenthamarakshan, Hendrik Strobelt, Cicero dos Santos, Pin-Yu Chen, Yi Yan Yang, Jeremy P. K. Tan, James Hedrick, Jason Crain, Aleksandra Mojsilovic

Nature Biomedical Engineering, 2021

Generate Your Counterfactuals: Towards Controlled Counterfactual Generation for Text

Nishtha Madaan, Inkit Padhi, Naveen Panwar, Diptikalyan Saha

AAAI Conference on Artificial Intelligence, 2021

2020

DualTKB: A Dual Learning Bridge between Text and Knowledge Base

Pierre Dognin, Igor Melnyk, Inkit Padhi, Cicero Nogueira dos Santos, Payel Das

Conference on Empirical Methods in Natural Language Processing, 2020

Learning Implicit Text Generation via Feature Matching

Inkit Padhi, Pierre Dognin, Ke Bai, Cicero Nogueira dos Santos, Vijil Chenthamarakshan, Youssef Mroueh, Payel Das

Annual Meeting of the Association for Computational Linguistics, 2020

CogMol: Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models

Vijil Chenthamarakshan, Payel Das, Samuel Hoffman, Hendrik Strobelt, Inkit Padhi, Kar Wai Lim, Benjamin Hoover, Matteo Manica, Jannis Born, Teodoro Laino, Aleksandra Mojsilovic

Advances in Neural Information Processing Systems, 2020

Alleviating Noisy Data in Image Captioning with Cooperative Distillation

Pierre Dognin, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jerret Ross, Yair Schiff

arXiv preprint, 2020

2019

Learning Implicit Generative Models by Matching Perceptual Features

Cicero Nogueira dos Santos, Youssef Mroueh, Inkit Padhi, Pierre Dognin

IEEE/CVF International Conference on Computer Vision, 2019

Sobolev Independence Criterion

Youssef Mroueh, Tom Sercu, Mattia Rigotti, Inkit Padhi, Cicero Nogueira dos Santos

Advances in Neural Information Processing Systems, 2019

Interactive Visual Exploration of Latent Space (IVELS) for Peptide Auto-Encoder Model Selection

Tom Sercu, Sebastian Gehrmann, Hendrik Strobelt, Payel Das, Inkit Padhi, Cicero dos Santos, Kahini Wadhawan, Vijil Chenthamarakshan

ICLR Workshop, 2019

Generative Feature Matching Networks

Cicero Nogueira dos Santos, Inkit Padhi, Pierre Dognin, Youssef Mroueh

ICLR Workshop, 2019

2018

Fighting Offensive Language on Social Media with Unsupervised Text Style Transfer

Cicero Nogueira dos Santos, Igor Melnyk, Inkit Padhi

Annual Meeting of the Association for Computational Linguistics, 2018

Data Driven Techniques for Organizing Scientific Articles Relevant to Biomimicry

Yuanshuo Zhao, Ioana Baldini, Prasanna Sattigeri, Inkit Padhi, Yoong Keok Lee, Ethan Smith

AAAI/ACM Conference on AI, Ethics, and Society, 2018

PepCVAE: Semi-Supervised Targeted Design of Antimicrobial Peptide Sequences

Payel Das, Kahini Wadhawan, Oscar Chang, Tom Sercu, Cicero dos Santos, Matthew Riemer, Vijil Chenthamarakshan, Inkit Padhi, Aleksandra Mojsilovic

arXiv preprint, 2018

2017

Improved Neural Text Attribute Transfer with Non-parallel Data

Igor Melnyk, Cicero Nogueira dos Santos, Kahini Wadhawan, Inkit Padhi, Abhishek Kumar

arXiv preprint, 2017

2016

Does String-Based Neural MT Learn Source Syntax?

Xing Shi, Inkit Padhi, Kevin Knight

Conference on Empirical Methods in Natural Language Processing, 2016