Learning Semantic Program Embeddings

Learning distributed representations of source code has been a challenging task for machine learning models. Recently, Graph Neural Networks (GNNs) were proposed to learn embeddings of programs from their graph representations. However, GNNs can suffer from precision issues, especially when dealing with programs rendered into large graphs.

We present a new graph neural architecture, called the Graph Interval Neural Network (GINN), to tackle the weaknesses of existing GNNs. Unlike a standard GNN, GINN focuses exclusively on intervals (generally manifested in looping constructs) to mine the feature representation of a program. Furthermore, GINN operates on a hierarchy of intervals to scale learning to large graphs.
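
To make the notion of an interval concrete, here is a minimal sketch of the classical interval partition from compiler theory, applied to a tiny control flow graph with one loop. It assumes that GINN's intervals follow this textbook definition, which the abstract does not spell out. The function name `interval_partition`, the adjacency-dict graph encoding, and the node names are hypothetical and chosen only for illustration.

```python
from collections import deque


def interval_partition(succ, entry):
    """Partition a control flow graph into intervals.

    succ maps each node to a list of its successors; entry is the start node.
    Returns a dict mapping each interval header to the nodes of its interval.
    Illustrative only: assumes the classical definition of an interval.
    """
    nodes = set(succ) | {m for targets in succ.values() for m in targets}
    pred = {n: set() for n in nodes}
    for n, targets in succ.items():
        for m in targets:
            pred[m].add(n)

    intervals = {}
    assigned = set()           # nodes already placed in some interval
    headers = deque([entry])   # worklist of interval headers
    queued = {entry}

    while headers:
        h = headers.popleft()
        members = [h]
        member_set = {h}
        # Grow the interval: absorb any node whose predecessors all lie inside it.
        changed = True
        while changed:
            changed = False
            for n in sorted(nodes):
                if n in member_set or n in assigned or n == entry:
                    continue
                if pred[n] and pred[n] <= member_set:
                    members.append(n)
                    member_set.add(n)
                    changed = True
        intervals[h] = members
        assigned |= member_set
        # A node outside the interval with a predecessor inside it heads a new interval.
        for n in sorted(nodes):
            if n not in assigned and n not in queued and pred[n] & member_set:
                headers.append(n)
                queued.add(n)
    return intervals


# A while-loop: entry -> loop_head -> body -> loop_head, and loop_head -> exit.
cfg = {
    "entry": ["loop_head"],
    "loop_head": ["body", "exit"],
    "body": ["loop_head"],
    "exit": [],
}
result = interval_partition(cfg, "entry")
```

In this example the loop header becomes the header of an interval that contains the loop body, which matches the remark that intervals generally correspond to looping constructs. Collapsing each interval into a single node and repeating the partition on the resulting graph is one way to obtain a hierarchy of intervals.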

Assistant Researcher

My research interests lie mainly in software testing and programming languages, including static and dynamic analysis. Recently, I have focused on using machine learning to improve the performance of program analysis.