Introduction

Background Information

During development, it is common to modify shared code. For a simple modification, the impact can usually be checked manually. However, if many modifications are made at once, or if the code is widely referenced in the project, it is difficult to determine the impact range manually or with VSCode's search function. Therefore, a code analysis tool is needed.

Implementation

config

For example, as the simple demo in the project shows, suppose we changed [Person, Address] in a shared file (IMPORT_FILE_PATH), and the project imports and uses [Person, Address] from the shared directory. We need to scan every place where these two changed exports are used and output a report (as below).

output

The general idea is to use git diff to obtain the modified code blocks, analyze the changed code through AST, and finally scan the referenced files to analyze where they are used.

Prerequisites

Tools For Node CLI

program: a library for building command-line interfaces with Node.js
chalk: a library to stylize terminal output with colors and formatting
ora: a library to create elegant terminal spinners
progress: a library to show a progress bar in the terminal

About AST

AST, or Abstract Syntax Tree, is an abstract representation of the syntactic structure of source code. In frontend development, an AST supports code analysis, code transformation, code optimization, and similar operations.

AST Structure

Here is an example of a TypeScript file being transformed into an AST:

AST visualization: https://ts-ast-viewer.com/

It is able to structure the source code and help us analyze the statements within the source code.
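To make the tree shape concrete, here is a minimal hand-written sketch of what the AST for `const answer = 42;` roughly looks like, together with a depth-first walk. The `SketchNode` type and the helper names are illustrative assumptions, not the compiler's real node classes, and the tree is simplified (for example, it omits the end-of-file token).

```typescript
// A hand-written, simplified stand-in for an AST node (hypothetical type,
// not the TypeScript compiler's real node class).
interface SketchNode {
  kind: string;
  text?: string;
  children: SketchNode[];
}

// Simplified tree for the statement: const answer = 42;
const tree: SketchNode = {
  kind: "SourceFile",
  children: [
    {
      kind: "VariableStatement",
      children: [
        {
          kind: "VariableDeclarationList",
          children: [
            {
              kind: "VariableDeclaration",
              children: [
                { kind: "Identifier", text: "answer", children: [] },
                { kind: "NumericLiteral", text: "42", children: [] },
              ],
            },
          ],
        },
      ],
    },
  ],
};

// Depth-first walk that visits every descendant of a node.
function forEachDescendantSketch(
  node: SketchNode,
  visit: (n: SketchNode) => void
): void {
  for (const child of node.children) {
    visit(child);
    forEachDescendantSketch(child, visit);
  }
}

// Collect the text of every Identifier node in the tree.
function collectIdentifiers(root: SketchNode): string[] {
  const names: string[] = [];
  forEachDescendantSketch(root, (n) => {
    if (n.kind === "Identifier" && n.text !== undefined) {
      names.push(n.text);
    }
  });
  return names;
}
```

Analysis tools work the same way in principle: walk the tree, filter by node kind, and read the text or position of the nodes of interest.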

Understanding common node types is important

About AST: Common Node Types

AST Parsing

ts-morph: a library to manipulate TypeScript ASTs programmatically

Here is an example of how to traverse and modify the AST using ts-morph:

const { Project, SyntaxKind } = require("ts-morph");

const project = new Project();
const sourceFile = project.addSourceFileAtPath("src/index.ts");

// Traverse the AST and find all variable declarations
sourceFile.forEachDescendant((node) => {
  if (node.getKind() === SyntaxKind.VariableDeclaration) {
    // Modify the variable name
    const variableName = node.getName();
    node.rename(`${variableName}Modified`);
  }
});

// Save the changes
project.saveSync();

Generate AST: In this example, we first create a new Project instance and load a source file using the addSourceFileAtPath method.

Traverse and Modify: Then we traverse the AST using the forEachDescendant method and check whether each node is a VariableDeclaration. If it is, we modify the variable name by appending the string "Modified" to the original name using the rename method.

Output: Finally, we save the changes using the saveSync method.

Process Design

Scan the changed files.
Analyze the changed code in those files.
Scan the files in the project where the changed code is used.
Analyze the code calls.
Output the resulting report.

Implementation Details Of Key Processes

Get the changed files and the changed code

Git tools can help analyze which files have changed and which lines of code have been modified.

By analyzing the Abstract Syntax Tree (AST) based on the line number, you can determine the scope of the modified code, analyze the affected exported variables, and count the changed files and corresponding variables.

Code implementation

STEP1. Use git status to get the changed files.

STEP2. Use git diff to get the changed code lines (start, end).

STEP3. Traverse the AST, locate the declaration that contains each changed line, and obtain the target definition.
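STEP2 and STEP3 above could be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: it assumes the diff text comes from something like `git diff -U0`, and that an earlier AST pass has already produced the line span of each exported declaration. `LineRange`, `DeclarationRange`, `parseChangedRanges`, and `findChangedDeclarations` are hypothetical names.

```typescript
// A changed line range in the new version of a file (1-based, inclusive).
interface LineRange {
  start: number;
  end: number;
}

// Hypothetical shape for a declaration located by the AST pass.
interface DeclarationRange extends LineRange {
  name: string;
}

// Parse hunk headers such as "@@ -10,2 +12,3 @@" from unified diff text.
function parseChangedRanges(diffText: string): LineRange[] {
  const ranges: LineRange[] = [];
  const hunkHeader = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@/;
  for (const line of diffText.split("\n")) {
    const match = hunkHeader.exec(line);
    if (!match) continue;
    const start = Number(match[1]);
    // The count defaults to 1 when omitted; 0 means a pure deletion.
    const count = match[2] === undefined ? 1 : Number(match[2]);
    // Keep pure deletions as a one-line range so the declaration
    // around the deletion point is still flagged.
    ranges.push({ start, end: start + Math.max(count, 1) - 1 });
  }
  return ranges;
}

// Return the names of declarations whose span overlaps any changed range.
function findChangedDeclarations(
  declarations: DeclarationRange[],
  changed: LineRange[]
): string[] {
  return declarations
    .filter((d) => changed.some((r) => r.start <= d.end && r.end >= d.start))
    .map((d) => d.name);
}
```

Matching by line overlap keeps the diff parsing independent of the AST library: the AST pass only has to report the start and end line of each declaration.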

finally output

Scan imports in TypeScript files, match changed files and changed variables.

There are various types of references used in our code.

To observe some patterns, paste this code into https://ts-ast-viewer.com/.

import foo from '../shared';
import * as foo from '../shared';
import { Address, Person as Person1 } from '@shared';

All of these references are nodes of the ImportDeclaration type.
These references include both the file name and exported variables, which we need to match and analyze separately. The file name may have aliases and relative references, while the exported variables may include default exports, namespace exports, named exports, or alias exports.

Code implementation

STEP1. Recursively scan source code files
- Scan the files under the package directory with the following file types: [.js,.ts,.jsx,.tsx], while excluding the node_modules, __tests__, and assets folders.
STEP2. Check if a reference file is referencing the target file
- Retrieve the import declarations of the file.
```
const imports = sourceFile.getImportDeclarations();
```
- Traverse the statement and retrieve the ModuleSpecifierValue, such as '../shared', '@shared', and '~/test/shared/index.ts'.
```
for (const importDeclaration of imports) {
  const moduleSpecifier = importDeclaration.getModuleSpecifierValue() || ''; // e.g. '../shared', '@shared', './test/shared/index.ts'
  const isMatchedFile = isMatchedFilePath(importDeclaration);
  ....
}
```
- Determine whether the imported file matches one of the changed shared files.
  - We need to determine whether it is an alias reference ('@shared'), an absolute path reference, or a relative path reference ('../shared').
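The path matching in STEP2 might look something like the sketch below. It is an assumption-laden illustration rather than the tool's real `isMatchedFilePath`: the alias table is a hypothetical hand-written map (in practice it might mirror the `paths` setting in tsconfig), absolute specifiers are assumed to be rooted at the project root, and all paths are assumed to be POSIX-style and project-relative.

```typescript
// Hypothetical alias table, e.g. { "@shared": "src/shared" }.
type AliasMap = Record<string, string>;

// Collapse "." and ".." segments in a POSIX-style path.
function normalizePath(path: string): string {
  const out: string[] = [];
  for (const segment of path.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") out.pop();
    else out.push(segment);
  }
  return out.join("/");
}

// Drop a trailing source-file extension, if any.
function stripExtension(path: string): string {
  return path.replace(/\.(ts|tsx|js|jsx)$/, "");
}

// Resolve a module specifier to a project-relative path without extension.
// Returns undefined for bare package imports such as "react".
function resolveSpecifier(
  specifier: string,
  importingFile: string,
  aliases: AliasMap
): string | undefined {
  if (specifier.startsWith(".")) {
    // Relative reference: resolve against the importing file's directory.
    const dir = importingFile.split("/").slice(0, -1).join("/");
    return stripExtension(normalizePath(`${dir}/${specifier}`));
  }
  if (specifier.startsWith("/")) {
    // Absolute reference: assumed to be rooted at the project root.
    return stripExtension(normalizePath(specifier));
  }
  for (const alias of Object.keys(aliases)) {
    if (specifier === alias || specifier.startsWith(`${alias}/`)) {
      // Alias reference: swap the alias prefix for its target directory.
      const rest = specifier.slice(alias.length);
      return stripExtension(normalizePath(aliases[alias] + rest));
    }
  }
  return undefined;
}

// True when the resolved import points at the changed file,
// either directly or through its directory's index file.
function matchesChangedFile(resolved: string, changedFile: string): boolean {
  const target = stripExtension(normalizePath(changedFile));
  return target === resolved || target === `${resolved}/index`;
}
```

For example, with the alias `@shared` mapped to `src/shared`, both `'../shared'` imported from `src/pages/home.ts` and `'@shared'` resolve to `src/shared`, which matches a changed file at `src/shared/index.ts`.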

STEP3. Get the referenced variables from the referenced file

For example, we want to retrieve AFunc, BAlias, CFunc, DAlias from the referenced file. We should also indicate whether each variable is a default export, namespace export, or named export. If there is an alias, we should indicate the original name of the alias.

import AFunc from 'A-file' // DefaultImport
import * as BAlias from 'B-file' // NamespaceImport
import {CFunc, DFunc as DAlias } from 'B-file' // NamedImports

DefaultImport

(The functions of pos and end will be mentioned in the following text)

//  getDefaultImport
//  import AFunc from 'A-file';
const defaultImport = importClause.getDefaultImport();
if (defaultImport) {
  handleImports({
    useDefault: true,
    name: defaultImport.getText(), // AFunc
    pos: defaultImport.getPos(),
    end: defaultImport.getEnd(),
  });
}

NamespaceImport

// NamespaceImport
// import * as alias from 'A-file';
const namespaceImport = importClause.getNamespaceImport();
if (namespaceImport) {
  handleImports({
    useNameSpace: true,
    name: namespaceImport.getText(),
    pos: namespaceImport.getPos(),
    end: namespaceImport.getEnd(),
  });
}

NamedImports

// NamedImports
// import {A, B as b} from 'A-file';
const namedImports = importClause.getNamedImports();
namedImports.forEach((namedImport) => {
  const nameNode = namedImport.getNameNode();
  const aliasNode = namedImport.getAliasNode();
  handleImports({
    // Local name: the alias if there is one, otherwise the exported name
    name: aliasNode?.getText() || nameNode?.getText(),
    // Exported name as declared in the source module
    originName: nameNode?.getText(),
    pos: namedImport.getPos(),
    end: namedImport.getEnd(),
  });
});

STEP4. Check whether the imported name matches a changed node recorded for one of the changed files, and output the result.

this.changedFiles?.forEach((changedFile) => {
  if (changedFile.startsWith(item.importPath)) {
    const changedNodes = this.changedFilesNodes?.[changedFile] || [];
    if (changedNodes.includes(item.originName || item.name)) {
      _addItem();
    }
  }
});
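As a self-contained illustration of STEP3 and STEP4 together, the sketch below filters a list of collected import records against the changed exports of each file. The `ImportItem` shape follows the fields used in the snippets above, but the function name and two rules are assumptions added for illustration: a namespace import is treated as affected by any change in the file, and a default export is assumed to be recorded under the name "default".

```typescript
// Record collected for each imported binding (fields follow the snippets above).
interface ImportItem {
  name: string; // local name used in the importing file
  originName?: string; // exported name when an alias is used
  importPath: string; // resolved path of the imported module
  useDefault?: boolean;
  useNameSpace?: boolean;
}

// Changed file path -> names of the exports changed in that file.
type ChangedFilesNodes = Record<string, string[]>;

// Keep only the imports that refer to a changed export.
function findAffectedImports(
  items: ImportItem[],
  changedFilesNodes: ChangedFilesNodes
): ImportItem[] {
  return items.filter((item) =>
    Object.keys(changedFilesNodes).some((changedFile) => {
      if (!changedFile.startsWith(item.importPath)) return false;
      const changedNodes = changedFilesNodes[changedFile];
      // Assumption: a namespace import exposes every export,
      // so any change in the file counts.
      if (item.useNameSpace) return changedNodes.length > 0;
      // Assumption: a changed default export is recorded as "default",
      // because the local name of a default import is arbitrary.
      const exportedName = item.useDefault
        ? "default"
        : (item.originName ?? item.name);
      return changedNodes.includes(exportedName);
    })
  );
}
```

Comparing against `originName` first means that `import { Person as Person1 }` is still matched to a change in `Person`.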

Analysis of code usage

We aim to analyze the usage of collected variables in a file. How can we design a strategy for this? Let's take a look at the following code snippet.

Copy the code and paste it into https://ts-ast-viewer.com/.

import { Foo, Bar } from "./types";
import baz from "./baz";

function run(foo: Foo, bar: Bar) {
  return qux(foo) + bar.baz;

  function qux(foo: Foo) {
    return foo + 1;
  }
}

const myFoo: Foo = 1;
const myBar: Bar = { baz: 2 };

baz(myFoo, myBar);

There are two key points to consider:

All references are called through identifiers.
To distinguish whether it is the same identifier reference or a local variable with the same name, we can identify it using the pos of the identifier's symbol's declarations. As shown in the figure below, the pos and end of the declarations of symbols with the same reference are the same.

The pos of an identifier that comes from an import.

A local variable with the same name has a different pos and end.

For every reference to an imported identifier, the pos and end of the symbol's declarations are the same as those of the import.

Code implementation

STEP1. Traverse the AST nodes of the target source file

Use forEachDescendant:

sourceFile.forEachDescendant((node) => {
  ...
})

STEP2. Get all Identifier nodes

if (node.getKind() !== SyntaxKind.Identifier) {
	 return;
}

STEP3. Check if the text name of the node matches the text name of the reference node

const name = node.getText();
// Look up the collected import with the same local name
const matchImportItem = importItems[name];
if (!matchImportItem) {
  return;
}

STEP4. Exclude local variables with the same name that could cause interference

const symbol = node.getSymbol();
if (symbol) {
  const symbolDeclarations = symbol.getDeclarations();
  if (symbolDeclarations && symbolDeclarations.length > 0) {
    const symbolPos = symbolDeclarations[0].getPos();
    const symbolEnd = symbolDeclarations[0].getEnd();
    // Identifier symbol.declarations pos and end must be equal to importItem symbol.declarations pos and end
    if (
      // exclude the reference from importItem
      matchImportItem.symbolPos !== symbolPos &&
      // exclude local variable with the same name
      matchImportItem.pos === symbolPos &&
      matchImportItem.end === symbolEnd
    ) {
      .....
    }
  }
}

STEP5. Count the lines where the code is called and output the result as statistics.

const line = node.getStartLineNumber();
importItemMap[filePath]?.[name]?.callLines?.push(line);
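The statistics step could be sketched as below. Note that the optional-chaining line above silently skips the push when no entry exists yet, so this sketch creates entries on demand instead. The `UsageReport` shape and the `recordCall` and `summarize` helpers are hypothetical names chosen for illustration, and the summary format is an assumption rather than the tool's real report.

```typescript
// Call lines collected for one imported name.
interface UsageEntry {
  callLines: number[];
}

// filePath -> imported name -> usage (hypothetical report shape).
type UsageReport = Record<string, Record<string, UsageEntry>>;

// Record one usage, creating the nested entries when they are missing.
function recordCall(
  report: UsageReport,
  filePath: string,
  importedName: string,
  line: number
): void {
  if (!report[filePath]) report[filePath] = {};
  const fileEntry = report[filePath];
  if (!fileEntry[importedName]) fileEntry[importedName] = { callLines: [] };
  fileEntry[importedName].callLines.push(line);
}

// Flatten the report into one human-readable row per file and name.
function summarize(report: UsageReport): string[] {
  const rows: string[] = [];
  for (const filePath of Object.keys(report)) {
    for (const importedName of Object.keys(report[filePath])) {
      const lines = report[filePath][importedName].callLines;
      rows.push(
        `${filePath}: ${importedName} used ${lines.length} time(s) at line(s) ${lines.join(", ")}`
      );
    }
  }
  return rows;
}
```

Each call to `recordCall` would be made from the traversal callback with the line returned by `node.getStartLineNumber()`.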

Further optimization

Improve the matching of export code changes to cover more edge cases and achieve greater accuracy.
Conduct further analysis of the affected code to determine impact priorities. For example, prioritize function calls with high impact and type calls with low impact.
Visualize the results or create a VSCode plugin.