Codenote is a natural language toolkit built for programming languages. While natural language processing (NLP), a subfield of AI, has produced remarkable results for human languages, we have barely scratched the surface of what can be done with programming languages. Codenote aims to apply state-of-the-art NLP techniques to understand code at a semantic level rather than just a syntactic level. Once code is processed and understood at a high level, a variety of exciting applications become possible. One of the most exciting is helping the millions of people who are learning to code by giving them summaries of large code files, saving them time. Veteran developers will also be able to use an auto-commenting feature that comments their code automatically, as well as a high-level code conversion platform between programming languages (Google Translate for coding)!

The main tools used for the project are the Python programming language and its various libraries. In particular, the fastText library was used to convert extracted code comments, method names, etc. into word embeddings (vectors), the Gensim library was used to visualize this multidimensional vector space, and the PyTorch library will be used to build a custom deep learning model for each type of codebase (deep learning, app development, games, etc.). Once the deep learning model consistently and accurately predicts high-level coding features, a web app and browser extension will be built using React.js and Node.js.
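To make the extraction step more concrete, here is a minimal sketch of how comments and method names could be pulled out of Python source and turned into token lists ready for embedding training. It uses only the standard library (`tokenize`, `ast`, `re`). The helper names (`extract_comments`, `extract_names`, `split_identifier`, `build_corpus`) and the sample snippet are hypothetical and chosen for illustration; this is not Codenote's actual extraction code, and it handles Python input only.

```python
import ast
import io
import re
import tokenize


def extract_comments(source: str) -> list[str]:
    """Return the text of every `#` comment, without the leading marker."""
    comments = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            comments.append(tok.string.lstrip("#").strip())
    return comments


def extract_names(source: str) -> list[str]:
    """Return the names of all functions, methods, and classes in the source."""
    wanted = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    return [node.name for node in ast.walk(ast.parse(source)) if isinstance(node, wanted)]


def split_identifier(name: str) -> list[str]:
    """Split snake_case and camelCase identifiers into lowercase words."""
    words = []
    for part in name.split("_"):
        # Acronym runs (HTTP), capitalized or lowercase words, and digit runs.
        words.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [w.lower() for w in words]


def build_corpus(source: str) -> list[list[str]]:
    """Build one token list per comment and per identifier (assumed corpus shape)."""
    sentences = [re.findall(r"[a-z0-9]+", c.lower()) for c in extract_comments(source)]
    sentences += [split_identifier(n) for n in extract_names(source)]
    return [s for s in sentences if s]


# Hypothetical sample input, used only to demonstrate the helpers.
SAMPLE = '''
# Load user records from disk
def load_user_data(path):
    return open(path).read()  # read the whole file

class UserStore:
    def getUserById(self, user_id):
        pass
'''

if __name__ == "__main__":
    for sentence in build_corpus(SAMPLE):
        print(" ".join(sentence))
```

Writing each token list as one space-separated line gives a plain-text corpus, which is the kind of input that embedding trainers such as fastText typically read. Splitting identifiers into words is a design choice assumed here so that names like `getUserById` share vocabulary with the comments around them.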

What inspired you (or your team)?

As someone learning to code, I often found myself reading other people's code to learn from examples or to understand the codebase of a library. I remember the frustration I felt when going through dozens of lines of someone else's confusing code, which often lacked comments. This wasted time and effort makes the whole learning process confusing and inorganic. Then one day it hit me: why not use computers to understand programming languages better? The concept seems intuitive: since humans are best at understanding human languages, computers should be best at understanding programming languages. Drawing on my experience building several projects in the artificial intelligence and machine learning space, I envisioned a platform or toolkit that would enable a variety of applications for natural programming language processing.

One of the applications that I am personally passionate about is the code summarization tool, which would help coding beginners understand code faster and in a more natural manner. The exciting part is that I could help nearly all of the 700,000 students who are learning to code every year and enable them to build unique and innovative solutions to some of the world's most pressing problems. Beyond quick summaries, providing high-quality explanations of code can accelerate their learning in a way that wouldn't be possible with the traditional techniques used to learn to code.