Kishor Jothimurugan

Quantitative Researcher

Two Sigma


Hi! I am a Quantitative Researcher at Two Sigma. I earned my PhD in Computer and Information Science from the University of Pennsylvania. I was advised by Prof. Rajeev Alur. My research interests lie at the intersection of Formal Methods and Machine Learning. In particular, I am interested in Neurosymbolic Programming, Reinforcement Learning and Interpretable Machine Learning.

Interests

  • Formal Methods
  • Reinforcement Learning
  • Machine Learning

Education

  • PhD in Computer and Information Science

    University of Pennsylvania

  • BSc in Mathematics and Computer Science

    Chennai Mathematical Institute

PhD Research

Reinforcement Learning (RL) has been shown to be successful in many applications, including robotics and game playing. However, existing approaches do not scale well to complex long-horizon tasks, such as controlling an autonomous car through a series of turns or stacking multiple blocks with a robotic arm. My research tackles such problems using techniques from formal methods. More specifically, my work spans the following themes.

  • RL from logical specifications. Long-horizon tasks are challenging to express using Markovian rewards. This line of work focuses on designing RL algorithms for learning to perform tasks expressed in logical specification languages such as Linear Temporal Logic (LTL). I have contributed to the theoretical foundations of RL from LTL specifications (Festschrift 22). I have designed a composable specification language for specifying robotics tasks (NeurIPS 19) along with algorithms to train policies to satisfy such specifications (NeurIPS 21, CAV 22).
    I co-presented a tutorial on this topic at AAAI 2023!

  • Compositional reinforcement learning. This direction of research aims to build RL algorithms that learn to perform long-horizon tasks by decomposing the given task into simpler subtasks. For example, such decompositions can be obtained from user-provided state abstractions (AISTATS 21) or from the structure of a given logical specification (NeurIPS 21). Compositional RL algorithms also enable us to train policies that generalize to a wide variety of tasks (ICML 23).

  • Verification of neural network controllers. Verifying the safety of neural policies trained using RL is a challenging problem, and current techniques do not scale well to long horizons. I have developed a compositional verification framework that leverages existing techniques and inductive reasoning to scale verification to long (potentially infinite) horizons (EMSOFT 21).


(2023). Robust Subtask Learning for Compositional Generalization. International Conference on Machine Learning (ICML).


(2023). Policy Synthesis and Reinforcement Learning for Discounted LTL. International Conference on Computer Aided Verification (CAV).


(2022). A Framework for Transforming Specifications in Reinforcement Learning. Springer Festschrift in honor of Prof. Tom Henzinger.


(2022). Specification-Guided Learning of Nash Equilibria with High Social Welfare. International Conference on Computer Aided Verification (CAV).


(2021). Compositional Reinforcement Learning from Logical Specifications. Advances in Neural Information Processing Systems (NeurIPS).


(2021). Compositional Learning and Verification of Neural Network Controllers. International Conference on Embedded Software (EMSOFT).


(2021). Abstract Value Iteration for Hierarchical Reinforcement Learning. International Conference on Artificial Intelligence and Statistics (AISTATS).


(2020). Space-efficient Query Evaluation over Probabilistic Event Streams. ACM/IEEE Symposium on Logic in Computer Science (LICS).


(2019). A Composable Specification Language for Reinforcement Learning Tasks. Advances in Neural Information Processing Systems (NeurIPS).



(2020). Learning Algorithms for Regenerative Stopping Problems with Applications to Shipping Consolidation in Logistics. Nokia Bell Labs Intern Report.



Amazon Web Services, AI Labs
Applied Scientist Intern
May 2022 – Aug 2022 · New York, NY
I worked on incorporating execution semantics into the training of large language models for code generation. I explored models for code generation that also predict traces obtained by running the generated code on various inputs. The trace prediction task is added as an auxiliary objective during training to improve the model's semantic understanding of code.
Bell Labs
Research Intern
May 2020 – Jul 2020 · Remote
I explored applications of deep reinforcement learning and imitation learning to classical regenerative stopping problems and studied the effectiveness of machine-learning-based solutions for logistics optimization.
Amazon Web Services, Automated Reasoning Group
Software Engineering Intern
May 2019 – Aug 2019 · Minneapolis, MN
I worked on automatically discovering sinks of sensitive data in Java code. Functions known as sinks are usually given as inputs to a taint analysis tool to check for security vulnerabilities in code. Since manually examining large codebases is close to impossible, new techniques are needed to find such sinks automatically. My project applied machine learning to identify sinks in the codebases of AWS storage services.