TreeGen
May 31, 2022 ยท View on GitHub
A Tree-Based Transformer Architecture for Code Generation.
Our paper is available at https://arxiv.org/abs/1911.09983. (Accepted by AAAI'20)
The Pytorch version is available at https://github.com/zysszy/TreeGen-Pytorch/tree/main.
News:
TreeGen has been successfully applied to
Code Search (OCoR: An Overlapping-Aware Code Retriever, ASE'20): pdf (https://arxiv.org/pdf/2008.05201.pdf), code (https://github.com/pkuzqh/OCoR)
and
Program Repair (A Syntax-Guided Edit Decoder for Neural Program Repair, ESEC/FSE'21): pdf (https://arxiv.org/pdf/2106.08253v1.pdf), code (https://github.com/pkuzqh/Recoder).
Usage
To train a new model
python3 run.py ATIS|HS-B
To predict the code (ATIS)
python3 predict.py ATIS 5
where 5 is the beam size.
To predict the code (HS)
python3 predict_HS-B.py HS-B 5
To check the results
python3 eval.py ATIS|HS-B
Dependenices
- NLTK 3.2.1
- Tensorflow 1.12.1
- Python 3.7
- Ubuntu 16.04
Attention Visualization
An example of the attention mechanism following the long-distance dependencies in the partial AST reader self-attention in layer 3. The node Name -> id (the parent node of player) in the line elif type(player.hero.power) is hearthbreaker.powers.MindSpike: denotes the information of the value of function type. As shown, this node relies on complex information, which contains the variables ( player and game, in the defination of function), the body of if statement, and the structure of elif. Different colored columns represent different heads.
An example of the attention mechanism in the partial AST reader self-attention in layer 1. The node Expr -> value in the line minion.damage(player.effective_spell_damage(3), self) decides the type of this statement, which mostly relies on the previous statements of the function use and the variable targets and minion as shown.