Introduction
Background Information
During development, it is common to modify shared code. For a simple modification, the impact can usually be checked manually. However, if many modifications are made at once, or if the code is widely referenced in the project, it is difficult to determine the impact range by hand or with VSCode's search function. A code analysis tool is therefore needed.
Implementation
config
For example, as the simple demo in the project shows, suppose we changed [Person, Address] in a shared file (IMPORT_FILE_PATH), and the project imports and uses [Person, Address] from the shared directory. We then need to scan every place where these two changed exports are used and output a report (as below).
output
The general idea is to use git diff to obtain the modified code blocks, analyze the changed code through the AST, and finally scan the referencing files to analyze where the changed code is used.
Prerequisites
Tools For Node CLI
- program: a library for building command-line interfaces with Node.js
- chalk: a library to stylize terminal output with colors and formatting
- ora: a library to create elegant terminal spinners
- progress: a library to show a progress bar in the terminal
About AST
AST, or Abstract Syntax Tree, is an abstract representation of the syntax structure of source code. In frontend development, AST can help us with code analysis, code transformation, code optimization, and other operations. It can structure the source code and help us analyze the statements inside the source code.
AST Structure
Here is an example of a TypeScript file being transformed into an AST:
AST visualization: https://ts-ast-viewer.com/
Because the AST exposes the source code as structured nodes, understanding the common node types is important.
AST parse
- ts-morph: a library to manipulate TypeScript ASTs programmatically
Here is an example of how to traverse and modify the AST using ts-morph:
const { Project, SyntaxKind } = require("ts-morph");

const project = new Project();
const sourceFile = project.addSourceFileAtPath("src/index.ts");

// Traverse the AST and find all variable declarations
sourceFile.forEachDescendant((node) => {
  if (node.getKind() === SyntaxKind.VariableDeclaration) {
    // Modify the variable name
    const variableName = node.getName();
    node.rename(`${variableName}Modified`);
  }
});

// Save the changes
project.saveSync();
Generate AST: In this example, we first create a new Project instance and load a source file using the addSourceFileAtPath method.
Traverse and Modify: Then we traverse the AST using the forEachDescendant method and check if each node is a VariableDeclaration. If it is, we modify the variable name by appending the string "Modified" to the original name using the rename method.
Output: Finally, we save the changes using the saveSync method.
Process Design
- Scan the changed files.
- Analyze the changed code in those files.
- Scan the files in the project where the changed code is used.
- Analyze the code calls.
- Output the resulting report.
Implementation Details Of Key Processes
Get the changed files and the changed codes
Git tools can help analyze which files have changed and which lines of code have been modified.
By analyzing the Abstract Syntax Tree (AST) based on the line number, you can determine the scope of the modified code, analyze the affected exported variables, and count the changed files and corresponding variables.
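As a rough sketch of how the changed line numbers can be obtained, the hunk headers of a unified diff already carry the start line and line count for each change. The names `parseChangedLineRanges` and `LineRange` are hypothetical and not taken from the tool, and the sketch assumes the diff text was produced by something like `git diff -U0`, so that each hunk covers only changed lines and no surrounding context.

```typescript
// Hypothetical shape: a changed range in the new version of a file (1-based, inclusive).
export interface LineRange {
  start: number;
  end: number;
}

// Matches a unified-diff hunk header such as "@@ -10,2 +12,3 @@".
const HUNK_HEADER = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@/;

export function parseChangedLineRanges(diffText: string): Record<string, LineRange[]> {
  const rangesByFile: Record<string, LineRange[]> = {};
  let currentFile: string | null = null;
  let previousLine = "";

  for (const line of diffText.split("\n")) {
    // Treat "+++ " as a file header only when it directly follows a "--- " line.
    if (line.startsWith("+++ ") && previousLine.startsWith("--- ")) {
      const target = line.slice(4).trim();
      currentFile = target === "/dev/null" ? null : target.replace(/^b\//, "");
    } else {
      const hunk = HUNK_HEADER.exec(line);
      if (hunk && currentFile !== null) {
        const start = Number(hunk[1]);
        // The count is omitted in the header when it equals 1.
        const count = hunk[2] === undefined ? 1 : Number(hunk[2]);
        // A count of 0 is a pure deletion; keep the neighbouring line so the
        // enclosing declaration can still be located.
        const first = Math.max(start, 1);
        const last = count === 0 ? first : start + count - 1;
        if (!rangesByFile[currentFile]) rangesByFile[currentFile] = [];
        rangesByFile[currentFile].push({ start: first, end: last });
      }
    }
    previousLine = line;
  }
  return rangesByFile;
}
```

The result maps each changed file path to its changed ranges, which is the input the AST step needs in order to find the affected declarations.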
Code implementation
- STEP1. Use git status to get the changed files.
- STEP2. Use git diff to get the changed code lines (start, end).
- STEP3. Traverse the AST, locate the declaration that contains each changed line, and obtain the target definition.
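STEP3 can be sketched with ts-morph as follows. This is a minimal illustration rather than the tool's actual code: `findChangedExports` and `LineRange` are hypothetical names, the file is parsed from a string in an in-memory project, and an export counts as changed when any of its declarations overlaps a changed line range.

```typescript
import { Project } from "ts-morph";

// Hypothetical shape: a changed range in the file (1-based, inclusive).
export interface LineRange {
  start: number;
  end: number;
}

export function findChangedExports(
  fileName: string,
  code: string,
  ranges: LineRange[],
): string[] {
  // An in-memory project keeps the sketch independent of the real file system.
  const project = new Project({ useInMemoryFileSystem: true });
  const sourceFile = project.createSourceFile(fileName, code);
  const changed: string[] = [];

  // getExportedDeclarations() maps each export name to its declaration nodes.
  sourceFile.getExportedDeclarations().forEach((declarations, exportName) => {
    const touched = declarations.some((declaration) => {
      const first = declaration.getStartLineNumber();
      const last = declaration.getEndLineNumber();
      // Two inclusive ranges overlap when each starts before the other ends.
      return ranges.some((range) => range.start <= last && range.end >= first);
    });
    if (touched) changed.push(exportName);
  });
  return changed;
}
```

Matching by overlap instead of exact line equality means a one-line edit inside a multi-line interface still marks the whole exported declaration as changed.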
Finally, output the changed files and their changed variables.
Scan imports in TypeScript files, match changed files and changed variables.
There are various types of references used in our code.
To observe some patterns, paste this code into https://ts-ast-viewer.com/.
import foo from '../shared';
import * as foo from '../shared';
import { Address, Person as Person1 } from '@shared';
- All referenced places are of the ImportDeclaration node type.
- These references include both the file name and the exported variables, which we need to match and analyze separately. The file name may use aliases or relative references, while the exported variables may be default exports, namespace exports, named exports, or alias exports.
Code implementation
- STEP1. Recursively scan source code files
  - Scan the files under the package directory with the following file types: [.js, .ts, .jsx, .tsx], while excluding the node_modules, __tests__, and assets folders.
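A minimal sketch of such a recursive scan, using only Node's built-in modules, might look like this. The function name `scanSourceFiles` and its `rootDir` parameter are assumptions for illustration; the extension and exclusion lists come from the step above.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// File types to collect and folder names to skip, as listed in STEP1.
const SOURCE_EXTENSIONS = new Set([".js", ".ts", ".jsx", ".tsx"]);
const EXCLUDED_DIRS = new Set(["node_modules", "__tests__", "assets"]);

export function scanSourceFiles(rootDir: string): string[] {
  const found: string[] = [];
  for (const entry of fs.readdirSync(rootDir, { withFileTypes: true })) {
    const fullPath = path.join(rootDir, entry.name);
    if (entry.isDirectory()) {
      // Recurse into every folder except the excluded ones.
      if (!EXCLUDED_DIRS.has(entry.name)) {
        found.push(...scanSourceFiles(fullPath));
      }
    } else if (entry.isFile() && SOURCE_EXTENSIONS.has(path.extname(entry.name))) {
      found.push(fullPath);
    }
  }
  return found;
}
```

Excluded folders are matched by name at any depth, so a nested node_modules inside a sub-package is skipped as well.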
- STEP2. Check if a reference file is referencing the target file
- Retrieve the import declaration statement for the file.
const imports = sourceFile.getImportDeclarations();
- Traverse the statement and retrieve the ModuleSpecifierValue, such as '../shared', '@shared', and '~/test/shared/index.ts'.
for (const importDeclaration of imports) {
  // e.g. '../shared', '@shared', './test/shared/index.ts'
  const moduleSpecifier = importDeclaration.getModuleSpecifierValue() || '';
  const isMatchedFile = isMatchedFilePath(importDeclaration);
  ....
}
- Determine whether the file matches the changed files in the scanned common files.
  - We need to determine whether it is an alias reference ('@shared'), an absolute path reference, or a relative path reference ('../shared').
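One possible way to handle the three kinds of specifier is to resolve each one to an absolute path and compare it with the changed file. The sketch below is an assumption about how this could be done, not the tool's own matching logic: `resolveModuleSpecifier`, `matchesChangedFile`, and the `aliases` map (alias prefix to absolute directory) are hypothetical names.

```typescript
import * as path from "node:path";

const SOURCE_EXTENSION = /\.(ts|tsx|js|jsx)$/;

// Resolve a module specifier to an absolute path, or null for package imports.
export function resolveModuleSpecifier(
  specifier: string,
  importingFile: string,
  aliases: Record<string, string>,
): string | null {
  if (specifier.startsWith(".")) {
    // Relative reference: resolve against the importing file's folder.
    return path.resolve(path.dirname(importingFile), specifier);
  }
  if (path.isAbsolute(specifier)) {
    return path.normalize(specifier);
  }
  for (const alias of Object.keys(aliases)) {
    // Match "@shared" and "@shared/x", but not "@sharedness".
    if (specifier === alias || specifier.startsWith(alias + "/")) {
      return path.join(aliases[alias], specifier.slice(alias.length));
    }
  }
  return null;
}

export function matchesChangedFile(
  specifier: string,
  importingFile: string,
  changedFile: string,
  aliases: Record<string, string>,
): boolean {
  const resolved = resolveModuleSpecifier(specifier, importingFile, aliases);
  if (resolved === null) return false;
  // Compare without extensions, and allow a folder import to hit its index file.
  const target = resolved.replace(SOURCE_EXTENSION, "");
  const changed = path.resolve(changedFile).replace(SOURCE_EXTENSION, "");
  return changed === target || changed === path.join(target, "index");
}
```

Resolving everything to absolute paths first keeps the comparison itself trivial, at the cost of needing the alias table (for example from the bundler or tsconfig settings) to be passed in.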
- STEP3. Get the referenced variables from the referenced file
For example, we want to retrieve AFunc, BAlias, CFunc, DAlias from the referenced file. We should also indicate whether each variable is a default export, namespace export, or named export. If there is an alias, we should indicate the original name of the alias.
import AFunc from 'A-file' // DefaultImport
import * as BAlias from 'B-file' // NamespaceImport
import {CFunc, DFunc as DAlias } from 'B-file' // NamedImports
DefaultImport
(The role of pos and end is explained later in the text.)
// getDefaultImport
// import AFunc from 'A-file';
const defaultImport = importClause.getDefaultImport();
if (defaultImport) {
  handleImports({
    useDefault: true,
    name: defaultImport.getText(), // AFunc
    pos: defaultImport.getPos(),
    end: defaultImport.getEnd(),
  });
}
NamespaceImport
// NamespaceImport
// import * as alias from 'A-file';
const namespaceImport = importClause.getNamespaceImport();
if (namespaceImport) {
  handleImports({
    useNameSpace: true,
    name: namespaceImport.getText(),
    pos: namespaceImport.getPos(),
    end: namespaceImport.getEnd(),
  });
}
NamedImports
// NamedImports
// import {A, B as b} from 'A-file';
const namedImports = importClause.getNamedImports();
namedImports.forEach((namedImport) => {
  const nameNode = namedImport.getNameNode();
  const aliasNode = namedImport.getAliasNode();
  handleImports({
    // Local name used in this file: the alias if present, otherwise the exported name
    name: aliasNode?.getText() || nameNode?.getText(),
    originName: nameNode?.getText(),
    pos: namedImport.getPos(),
    end: namedImport.getEnd(),
  });
});
- STEP4. Check whether the imported name matches a changed node in the changed files, and output the result.
this.changedFiles?.forEach((changedFile) => {
  if (changedFile.startsWith(item.importPath)) {
    const changedNodes = this.changedFilesNodes?.[changedFile] || [];
    if (changedNodes.includes(item.originName || item.name)) {
      _addItem();
    }
  }
});
Analysis of code used
We aim to analyze the usage of collected variables in a file. How can we design a strategy for this? Let's take a look at the following code snippet.
Copy the code and paste it into https://ts-ast-viewer.com/.
import { Foo, Bar } from "./types";
import baz from "./baz";

function run(foo: Foo, bar: Bar) {
  return qux(foo) + bar.baz;
  function qux(foo: Foo) {
    return foo + 1;
  }
}

const myFoo: Foo = 1;
const myBar: Bar = { baz: 2 };
baz(myFoo, myBar);
There are two key points to consider:
- All references are called through identifiers.
- To distinguish a reference to the same identifier from a local variable with the same name, we can compare the pos of the declarations of the identifier's symbol. As shown in the figure below, the pos and end of the declarations of symbols with the same reference are the same.
The pos of the identifier from the import.
A local variable with the same name has a different pos and end.
For references to the imported identifier, the pos and end of the symbol's declarations are the same as those of the import.
Code implementation
STEP1. Traverse the AST nodes of the target source file
Use forEachDescendant:
sourceFile.forEachDescendant((node) => {
  ...
})
STEP2. Get all Identifier nodes
if (node.getKind() !== SyntaxKind.Identifier) {
  return;
}
STEP3. Check if the text name of the node matches the text name of the reference node
const name = node.getText();
const matchImportItem = importItems[name];
if (!matchImportItem) {
  return;
}
STEP4. Exclude local variables with the same name that could cause interference
const symbol = node.getSymbol();
if (symbol) {
  const symbolDeclarations = symbol.getDeclarations();
  if (symbolDeclarations && symbolDeclarations.length > 0) {
    const symbolPos = symbolDeclarations[0].getPos();
    const symbolEnd = symbolDeclarations[0].getEnd();
    // Identifier symbol.declarations pos and end must be equal to importItem symbol.declarations pos and end
    if (
      // exclude the reference from importItem
      matchImportItem.symbolPos !== symbolPos &&
      // exclude local variable with the same name
      matchImportItem.pos === symbolPos &&
      matchImportItem.end === symbolEnd
    ) {
      .....
    }
  }
}
STEP5. Count the lines where the code is called and output the result as statistics.
const line = node.getStartLineNumber();
importItemMap[filePath]?.[name]?.callLines?.push(line);
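Putting the five steps together, a self-contained sketch could look like the following. It is a simplified illustration under stated assumptions, not the tool's implementation: `findImportUsageLines` is a hypothetical name, only named imports are collected, the source is parsed from a string in an in-memory project, and identifiers inside the import declaration itself are skipped with an ancestor check.

```typescript
import { Project, SyntaxKind } from "ts-morph";

// pos/end of the import specifier that declares a local name.
interface ImportItem {
  pos: number;
  end: number;
}

export function findImportUsageLines(code: string): Record<string, number[]> {
  const project = new Project({ useInMemoryFileSystem: true });
  const sourceFile = project.createSourceFile("usage.ts", code);

  // Collect named imports, keyed by the local name (the alias if there is one).
  const importItems = new Map<string, ImportItem>();
  for (const importDeclaration of sourceFile.getImportDeclarations()) {
    for (const namedImport of importDeclaration.getNamedImports()) {
      const aliasNode = namedImport.getAliasNode();
      const localName = aliasNode ? aliasNode.getText() : namedImport.getName();
      importItems.set(localName, { pos: namedImport.getPos(), end: namedImport.getEnd() });
    }
  }

  const callLines: Record<string, number[]> = {};
  sourceFile.forEachDescendant((node) => {
    // STEP2: only identifiers can be references.
    if (node.getKind() !== SyntaxKind.Identifier) return;
    // STEP3: the text must match an imported name.
    const name = node.getText();
    const item = importItems.get(name);
    if (!item) return;
    // Skip the identifier inside the import declaration itself.
    if (node.getFirstAncestorByKind(SyntaxKind.ImportDeclaration)) return;
    // STEP4: the symbol's declaration must be the import specifier, not a local variable.
    const symbol = node.getSymbol();
    const declaration = symbol ? symbol.getDeclarations()[0] : undefined;
    if (!declaration) return;
    if (declaration.getPos() === item.pos && declaration.getEnd() === item.end) {
      // STEP5: record the line of the reference.
      if (!callLines[name]) callLines[name] = [];
      callLines[name].push(node.getStartLineNumber());
    }
  });
  return callLines;
}
```

For a file that imports Foo and also declares a local variable named Foo inside a function, only the lines that refer to the imported Foo are recorded, because the local variable's symbol points at a different declaration.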
Further optimization
- Improve the matching of export code changes to cover more edge cases and achieve greater accuracy.
- Conduct further analysis of the affected code to determine impact priorities. For example, prioritize function calls with high impact and type calls with low impact.
- Visualize the results or create a VSCode plugin.