language/duchain
Definition-Use Chain Design
Overview | Design | Implementing | Using
Overview
The duchain is a sequence of contexts in a code file, and the associated definitions which occur in those contexts. A simplified way of thinking about it is that for each set of brackets (curly {} or not ()), there is a separate context. Each context is represented by a KDevelop::DUContext. Each context will have one parent context (except in the case of the top level context which has none), and any number of child contexts (including none). Additionally, each context can import any number of other contexts. The reason for this will become clear later. Thus, the KDevelop::DUContext structure resembles a directed acyclic graph, for those familiar with the concept.
Parsing
These DUContexts are created on the first pass after parsing the code to an AST (abstract syntax tree). Also, in this stage the data types are parsed, and any declarations which are encountered are recorded against the context in which they are encountered in. Each declaration is represented by a Declaration.
Parsing code is arranged into builder classes, which subclass the AST visitor pattern. They are designed to be able to subclass each other, thus achieving multiple goals with each pass (as described in the above paragraph).
For most languages, the first pass is accomplished by the AbstractContextBuilder, AbstractTypeBuilder, and AbstractDeclarationBuilder. The customised builder class is a subclass of each of these classes. Thus, in the first pass, the AbstractContextBuilder creates the DUContext tree, the AbstractTypeBuilder records which types are encountered, and the AbstractDeclarationBuilder creates Declaration instances which are associated with the current type and context.
The second pass is the creation of uses, accomplished a subclass of both the KDevelop::AbstractContextBuilder and the KDevelop::AbstractUseBuilder. On the second pass, we only iterate previously parsed contexts (as they are already created). Then, as variable uses are encountered, a KDevelop::Use is created for each. A KDevelop::Declaration is searched for in the current context, and if one is found, they are associated with each other.
Classes and their purposes
- KDevelop::DUChain - a global object which keeps track of all loaded source files and the top level context of their definition-use chains.
- KDevelop::DUContext - an object which represents a single context in a source file, and stores information about parent and child DUContexts, and Declarations, Definitions and Uses which occur in them. Also provides convenience methods for searching the chain.
- KDevelop::Declaration - an object which represents a single declaration. Has several subclasses which store more information specific to the type of declaration which is being represented.
- KDevelop::Definition - an object which represents a definition corresponding to a Declaration.
- KDevelop::Use - an object which represents a use of a particular declaration.
- KDevelop::SymbolTable - a hash which stores identifiers available in the top level context of a source file and their respective Declarations.
- KDevelop::*Builder - objects whose purpose is to iterate the parsed AST and produce instances of the duchain objects.
- KDevelop::AbstractType - the base class for types.
Definition-use chain searching
Because iterating a complete definition-use chain can become expensive when they are large, when a search is being performed (eg. for a declaration corresponding to a certain identifier) it is first performed up to the top level context, then the symbol table is consulted. The symbol table is a hash of all identifiers which are known to the entire duchain. All potential matches are evaluated to see if they are visible from the location of the use.
Locking
The duchain is designed to operate in a multithreaded environment. This means that multiple parse jobs may be operating simultaneously, reading from and writing to the duchain. Thus, locking is required.
A single read-write lock is used to serialise writes to the chain and allow concurrent reads. Thus, to call non-const methods, you must hold a write lock, and for const methods, a read lock. Customised read/write lockers have been created, called DUChainWriteLocker and DUChainReadLocker. You must not request a write lock while holding a read lock, or you could cause a deadlock.
Also, when manipulating text editor ranges, the KTextEditor::SmartInterface must be locked.
- Warning:
- You must never attempt to acquire the duchain read or write lock when holding the smart lock, else you may cause a deadlock. See code in KDevelop::AbstractContextBuilder::openContextInternal and KDevelop::DUChainBase.
Interface for plugins
As plugins will be accessing the KDevelop::DUChain from the main thread, they will need to hold a read lock. In order to be notified of changes to the KDevelop::DUChain, an observer interface is offered. See KDevelop::DUChainObserver.
Text editor integration
The main classes are subclasses of a base class, KDevelop::DUChainBase. This object holds a reference to the text range. When the source file is opened in an editor, the KDevelop::EditorIntegrator will create smart text ranges, which are bound to the editor's copy of the document. From there, highlighting can be applied to these ranges, as well as other advanced functions (see the KTextEditor documentation for possibilities). The language support will convert these ranges to smart ranges when the corresponding document is loaded into an editor.
Future features - ideas
The completed duchain should allow for code refactoring, intelligent navigation, improved automatic code generation (eg. "create switch statement"), context-sensitive code completion, integration of documentation, debugger integration, a code structure view, call graph, static code analysis etc.
KDE 4.4 API Reference