Codegen is built on top of Tree-sitter and rustworkx and has implemented most language server features from scratch.
Codegen is open source. Check out the source code to learn more!
The Codebase Graph
At the heart of Codegen is a comprehensive graph representation of your code. When you initialize a Codebase, it performs static analysis to construct a rich graph structure connecting code elements.
Building the Graph
Codegen’s graph construction happens in two stages:
- AST Parsing: We use Tree-sitter as our foundation for parsing code into Abstract Syntax Trees. Tree-sitter provides fast, reliable parsing across multiple languages.
- Multi-file Graph Construction: Custom parsing logic, implemented in rustworkx and Python, analyzes these ASTs to construct a more sophisticated graph structure. This graph captures relationships between symbols, files, imports, and more.
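The two stages above can be illustrated with a minimal sketch. This is not Codegen's implementation: Python's standard-library `ast` module stands in for Tree-sitter, a plain set of edge tuples stands in for a rustworkx graph, and the `SOURCES` mapping, `parse_stage`, and `graph_stage` are hypothetical names chosen for illustration.

```python
import ast

# Hypothetical in-memory "codebase": module name -> source text.
SOURCES = {
    "utils": "def helper():\n    return 1\n",
    "app": "from utils import helper\n\ndef main():\n    return helper()\n",
}


def parse_stage(sources):
    """Stage 1: parse each file into an AST (stdlib ast stands in for Tree-sitter)."""
    return {name: ast.parse(text) for name, text in sources.items()}


def graph_stage(trees):
    """Stage 2: walk every AST and link files, symbols, and imports across files."""
    nodes = set()
    edges = set()  # each edge is (source, relation, target)
    for module, tree in trees.items():
        nodes.add(module)
        for node in tree.body:
            if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
                # A top-level definition becomes a symbol node owned by its file.
                symbol = f"{module}.{node.name}"
                nodes.add(symbol)
                edges.add((module, "defines", symbol))
            elif isinstance(node, ast.ImportFrom) and node.module in trees:
                # An import of a known module becomes a cross-file edge.
                for alias in node.names:
                    edges.add((module, "imports", f"{node.module}.{alias.name}"))
    return nodes, edges


trees = parse_stage(SOURCES)
nodes, edges = graph_stage(trees)
for edge in sorted(edges):
    print(edge)
```

The point of the split is that the second stage needs every file's AST at once: the `imports` edge from `app` to `utils.helper` can only be resolved because `utils` has already been parsed.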
Performance Through Pre-computation
Pre-computing a rich index makes certain operations that matter for refactors and code analysis very fast:
- Finding all usages of a symbol
- Detecting circular dependencies
- Analyzing dependency graphs
- Tracing call graphs
- Static analysis-based code retrieval for RAG
- …etc.
Pre-parsing the codebase enables constant-time lookups rather than requiring re-parsing or real-time analysis.
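As a rough illustration of why a pre-computed index helps, the sketch below builds forward and reverse dependency maps once, then answers "who uses this symbol?" with a dictionary lookup and finds a circular dependency with a depth-first search. The edge list and the `build_index` and `find_cycle` helpers are hypothetical and use only the Python standard library; they are not Codegen's data structures.

```python
from collections import defaultdict

# Hypothetical dependency edges: (symbol, symbol it depends on).
DEPENDENCIES = [
    ("app.main", "utils.helper"),
    ("cli.run", "utils.helper"),
    ("a.f", "b.g"),
    ("b.g", "a.f"),
]


def build_index(edges):
    """Build forward (dependencies) and reverse (usages) maps in one pass."""
    forward = defaultdict(set)
    usages = defaultdict(set)
    for src, dst in edges:
        forward[src].add(dst)
        usages[dst].add(src)
    return forward, usages


def find_cycle(forward):
    """Return one dependency cycle as a list of symbols, or None if acyclic."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = defaultdict(int)
    stack = []

    def visit(node):
        color[node] = GRAY
        stack.append(node)
        for nxt in sorted(forward.get(node, ())):
            if color[nxt] == GRAY:
                # Back edge to a symbol on the current path: a cycle.
                return stack[stack.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in sorted(forward):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


forward, usages = build_index(DEPENDENCIES)
print(sorted(usages["utils.helper"]))  # usages come from a dict lookup, no re-parsing
print(find_cycle(forward))
```

Building the index costs one pass over the edges; after that, a usage query is an average constant-time dictionary lookup, while cycle detection still has to traverse the graph but never touches the source files.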
Multi-Language Support
One of Codegen’s core principles is that many programming tasks are fundamentally similar across languages.
Learn which languages Codegen currently supports, and how it handles language specifics, in the Language Support guide.
Build with Us
Codegen is just getting started, and we’re excited about the possibilities ahead. We enthusiastically welcome contributions from the community, whether it’s:
- Adding support for new languages
- Implementing new analysis capabilities
- Improving performance
- Expanding the API
- Adding new transformations
- Improving documentation