Kishor Jothimurugan

Quantitative Researcher

Two Sigma

About

Hi! I am a Quantitative Researcher at Two Sigma. I earned my PhD in Computer and Information Science from the University of Pennsylvania. I was advised by Prof. Rajeev Alur. My research interests lie at the intersection of Formal Methods and Machine Learning. In particular, I am interested in Neurosymbolic Programming, Reinforcement Learning and Interpretable Machine Learning.

Interests

Formal Methods
Reinforcement Learning
Machine Learning

Education

PhD in Computer and Information Science

University of Pennsylvania

BSc in Mathematics and Computer Science

Chennai Mathematical Institute

PhD Research

Reinforcement Learning (RL) has been shown to be successful in many applications, including robotics and game playing. However, existing approaches do not scale well to complex long-horizon tasks such as controlling an autonomous car to navigate a series of turns or stacking multiple blocks using a robotic arm. My research attempts to tackle such problems using techniques from formal methods. More specifically, my work spans the following themes.

RL from logical specifications. Long-horizon tasks are challenging to express using Markovian rewards. This line of work focuses on designing RL algorithms for learning to perform tasks expressed in logical specification languages such as Linear Temporal Logic (LTL). I have contributed to the theoretical foundations of RL from LTL specifications (Festschrift 22). I have designed a composable specification language for specifying robotics tasks (NeurIPS 19) along with algorithms to train policies to satisfy such specifications (NeurIPS 21, CAV 22).
I co-presented a tutorial on this topic at AAAI 2023!
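As a rough illustration of what a logical task specification can look like, here is a minimal Python sketch. It is not the specification language from the NeurIPS 19 paper: the `Reach` and `Then` constructors, the `near` helper, and the 2D states are all hypothetical names chosen for this example. It only shows how a sequenced task ("reach A, then reach B") can be checked against a trajectory, something a single Markovian reward on the current state cannot express directly.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple, Union

State = Tuple[float, float]
Predicate = Callable[[State], bool]


@dataclass(frozen=True)
class Reach:
    """Hypothetical spec: eventually visit a state satisfying `goal`."""
    goal: Predicate


@dataclass(frozen=True)
class Then:
    """Hypothetical spec: complete `first`, and afterwards complete `second`."""
    first: "Spec"
    second: "Spec"


Spec = Union[Reach, Then]


def completion_index(spec: Spec, traj: Sequence[State], start: int = 0) -> Optional[int]:
    """Earliest index at which `spec` is completed when started at `start`, else None."""
    if isinstance(spec, Reach):
        for i in range(start, len(traj)):
            if spec.goal(traj[i]):
                return i
        return None
    mid = completion_index(spec.first, traj, start)
    if mid is None:
        return None
    return completion_index(spec.second, traj, mid)


def satisfies(spec: Spec, traj: Sequence[State]) -> bool:
    """True if the trajectory completes the whole specification."""
    return completion_index(spec, traj) is not None


def near(point: State, tol: float = 0.1) -> Predicate:
    """Assumed helper: predicate for being within `tol` of `point` on both axes."""
    return lambda s: abs(s[0] - point[0]) <= tol and abs(s[1] - point[1]) <= tol


trajectory = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
in_order = Then(Reach(near((1.0, 0.0))), Reach(near((1.0, 1.0))))
reversed_order = Then(Reach(near((1.0, 1.0))), Reach(near((1.0, 0.0))))
print(satisfies(in_order, trajectory), satisfies(reversed_order, trajectory))
```

The same trajectory satisfies the in-order specification but not the reversed one, which is the kind of history-dependent requirement that motivates specification-based approaches.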

Compositional reinforcement learning. This direction of research aims to build RL algorithms for learning to perform long-horizon tasks by decomposing the given task into simpler subtasks. For example, such decompositions can be obtained using user-provided state abstractions (AISTATS 21) or from the structure in a given logical specification (NeurIPS 21). Compositional RL algorithms also enable us to train policies that generalize to a wide variety of tasks (ICML 23).

Verification of neural network controllers. Verifying safety of neural policies trained using RL is a challenging problem and current techniques do not scale well to long horizons. I have developed a compositional verification framework that leverages existing techniques and inductive reasoning to scale verification to long (potentially infinite) horizons (EMSOFT 21).

Publications

Kishor Jothimurugan, Suguman Bansal, Osbert Bastani, Rajeev Alur (2025). Specification-Guided Reinforcement Learning. International Conference on Neuro-symbolic Systems (NeuS).

Kishor Jothimurugan, Steve Hsu, Osbert Bastani, Rajeev Alur (2023). Robust Subtask Learning for Compositional Generalization. International Conference on Machine Learning (ICML).

Rajeev Alur, Osbert Bastani, Kishor Jothimurugan, Mateo Perez, Fabio Somenzi, Ashutosh Trivedi (2023). Policy Synthesis and Reinforcement Learning for Discounted LTL. International Conference on Computer Aided Verification (CAV).

Rajeev Alur, Suguman Bansal, Osbert Bastani, Kishor Jothimurugan (2022). A Framework for Transforming Specifications in Reinforcement Learning. Springer Festschrift in honor of Prof. Tom Henzinger.

Kishor Jothimurugan, Suguman Bansal, Osbert Bastani, Rajeev Alur (2022). Specification-Guided Learning of Nash Equilibria with High Social Welfare. International Conference on Computer Aided Verification (CAV).

Kishor Jothimurugan, Suguman Bansal, Osbert Bastani, Rajeev Alur (2021). Compositional Reinforcement Learning from Logical Specifications. Advances in Neural Information Processing Systems (NeurIPS).

Radoslav Ivanov, Kishor Jothimurugan, Steve Hsu, Shaan Vaidya, Rajeev Alur, Osbert Bastani (2021). Compositional Learning and Verification of Neural Network Controllers. International Conference on Embedded Software (EMSOFT).

Kishor Jothimurugan, Osbert Bastani, Rajeev Alur (2021). Abstract Value Iteration for Hierarchical Reinforcement Learning. International Conference on Artificial Intelligence and Statistics (AISTATS).

Rajeev Alur, Yu Chen, Kishor Jothimurugan, Sanjeev Khanna (2020). Space-efficient Query Evaluation over Probabilistic Event Streams. ACM/IEEE Symposium on Logic in Computer Science (LICS).

Kishor Jothimurugan, Rajeev Alur, Osbert Bastani (2019). A Composable Specification Language for Reinforcement Learning Tasks. Advances in Neural Information Processing Systems (NeurIPS).

Preprints

Kishor Jothimurugan, Matthew Andrews, Jeongran Lee, Lorenzo Maggi (2020). Learning Algorithms for Regenerative Stopping Problems with Applications to Shipping Consolidation in Logistics. Nokia Bell Labs Intern Report.

Teaching

AAAI Tutorial on Specification-Guided Reinforcement Learning

The unprecedented proliferation of data-driven approaches, especially machine learning, has put the spotlight on building trustworthy AI through the combination of logical reasoning and machine learning. Reinforcement Learning from Logical Specifications is one such topic where formal logical constructs are utilized to overcome challenges faced by modern RL algorithms.

Feb 7, 2023

Internships

Applied Scientist Intern

Amazon Web Services, AI Labs

May 2022 – Aug 2022

New York, NY

I worked on incorporating execution semantics into the training of large language models for code generation. I explored different code generation models that also predict the traces obtained by running the generated code on various inputs. Trace prediction is added as an auxiliary training objective in order to improve the model's semantic understanding of code.
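The idea of an auxiliary objective can be sketched as a weighted sum of two losses. The snippet below is a toy, framework-free illustration rather than the setup used in the project: the function names, the list-of-probabilities representation, and the `aux_weight` parameter are assumptions made for this example.

```python
import math
from typing import Sequence


def mean_cross_entropy(probs: Sequence[Sequence[float]], targets: Sequence[int]) -> float:
    """Average negative log-probability assigned to the target token at each position."""
    losses = [-math.log(p[t]) for p, t in zip(probs, targets)]
    return sum(losses) / len(losses)


def combined_loss(
    code_probs: Sequence[Sequence[float]],
    code_targets: Sequence[int],
    trace_probs: Sequence[Sequence[float]],
    trace_targets: Sequence[int],
    aux_weight: float = 0.5,  # hypothetical weighting of the auxiliary term
) -> float:
    """Main code-generation loss plus a weighted trace-prediction loss."""
    code_loss = mean_cross_entropy(code_probs, code_targets)
    trace_loss = mean_cross_entropy(trace_probs, trace_targets)
    return code_loss + aux_weight * trace_loss


# Toy distributions over a 2-token vocabulary at two positions each.
code_probs = [[0.9, 0.1], [0.2, 0.8]]
trace_probs = [[0.5, 0.5], [0.5, 0.5]]
loss = combined_loss(code_probs, [0, 1], trace_probs, [0, 1], aux_weight=0.5)
print(loss)
```

With `aux_weight=0` the objective reduces to the plain code-generation loss, so the weight controls how strongly trace prediction shapes training.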

Research Intern

Bell Labs

May 2020 – Jul 2020

Remote

I explored applications of deep reinforcement learning and imitation learning to classical regenerative stopping problems and studied the effectiveness of machine-learning-based solutions for logistics optimization.

Software Engineering Intern

Amazon Web Services, Automated Reasoning Group

May 2019 – Aug 2019

Minneapolis, MN

I worked on automatically discovering sinks of sensitive data in Java code. Functions known as sinks are usually given as inputs to a taint analysis tool to check for security vulnerabilities in code. New techniques are needed to find such sinks, since manually examining a large codebase is close to impossible. My project applied machine learning to identify sinks in the codebase of AWS storage services.