Dependency Management: TypeScript Code Analysis

Posted on Fri, Mar 24, 2023. Tags: AST, typescript, Node, cli

Introduction

Background Information

During development, it is common to modify shared code. For a simple modification, the impact can usually be checked manually. However, if many modifications are made at once, or if the code is widely referenced in the project, it is difficult to determine the impact range manually or with VSCode's search function. Therefore, a code analysis tool is needed.

Implementation

config

For example, as the simple demo project shows, suppose we change [Person, Address] in a shared file (IMPORT_FILE_PATH), and the project imports and uses [Person, Address] from the shared directory. We then need to scan every place where these two changed exports are used and output a report (as below).

output

The general idea is to use git diff to obtain the modified code blocks, analyze the changed code through AST, and finally scan the referenced files to analyze where they are used.

Prerequisites

Tools For Node CLI

About AST

AST, or Abstract Syntax Tree, is an abstract representation of the syntax structure of source code. In frontend development, AST can help us with code analysis, code transformation, code optimization, and other operations. It can structure the source code and help us analyze the statements inside the source code.

AST Structure

Here is an example of a TypeScript file being transformed into an AST:

AST visualization: https://ts-ast-viewer.com/


Understanding common node types is important

AST parse

Here is an example of how to traverse and modify the AST (Abstract Syntax Tree) using TS-Morph:

const { Project, SyntaxKind } = require("ts-morph");

const project = new Project();
const sourceFile = project.addSourceFileAtPath("src/index.ts");

// Traverse the AST and find all variable declarations
sourceFile.forEachDescendant((node) => {
  if (node.getKind() === SyntaxKind.VariableDeclaration) {
    // Modify the variable name
    const variableName = node.getName();
    node.rename(`${variableName}Modified`);
  }
});

// Save the changes
project.saveSync();

Generate AST: In this example, we first create a new Project instance and load a source file using the addSourceFileAtPath method.

Traverse and Modify: Then we traverse the AST using the forEachDescendant method and check whether each node is a VariableDeclaration. If it is, we modify the variable name by appending the string "Modified" to the original name using the rename method.

Output: Finally, we save the changes using the saveSync method.

Process Design

  1. Scan the changed files.
  2. Analyze the changed code in those files.
  3. Scan the files in the project where the changed code is used.
  4. Analyze the code calls.
  5. Output the resulting report.

Implementation Details Of Key Processes

Get the changed files and the changed code

Git tools can help analyze which files have changed and which lines of code have been modified.
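As a rough illustration of this step, the sketch below parses the text printed by `git diff --unified=0` into changed line ranges per file. The function name `parseChangedLines`, the `LineRange` shape, and the sample path are assumptions made for this example; they are not taken from the tool described in this post.

```typescript
// Line range touched in the new version of a changed file.
interface LineRange {
  start: number; // first changed line (1-based)
  end: number; // last changed line (inclusive)
}

// Parses the text of `git diff --unified=0` into a map from
// file path to the line ranges changed in that file.
function parseChangedLines(diffText: string): Map<string, LineRange[]> {
  const result = new Map<string, LineRange[]>();
  let currentFile: string | null = null;
  let inHeader = false;

  for (const line of diffText.split("\n")) {
    if (line.startsWith("diff --git ")) {
      // A new file section starts; its "+++" line follows shortly.
      inHeader = true;
      currentFile = null;
      continue;
    }
    if (inHeader && line.startsWith("+++ ")) {
      const path = line.slice(4).trim();
      // "/dev/null" means the file was deleted, so it has no new lines.
      currentFile = path === "/dev/null" ? null : path.replace(/^b\//, "");
      if (currentFile !== null && !result.has(currentFile)) {
        result.set(currentFile, []);
      }
      continue;
    }
    // Hunk header: "@@ -oldStart,oldCount +newStart,newCount @@".
    const hunk = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@/.exec(line);
    if (hunk && currentFile !== null) {
      inHeader = false;
      const start = Number(hunk[1]);
      const count = hunk[2] === undefined ? 1 : Number(hunk[2]);
      // A count of 0 is a pure deletion; this sketch records the line it happened at.
      const end = count === 0 ? start : start + count - 1;
      result.get(currentFile)!.push({ start, end });
    }
  }
  return result;
}

// Hypothetical diff text, shaped like real `git diff --unified=0` output.
const sampleDiff = [
  "diff --git a/src/shared/types.ts b/src/shared/types.ts",
  "index 1111111..2222222 100644",
  "--- a/src/shared/types.ts",
  "+++ b/src/shared/types.ts",
  "@@ -3 +3,2 @@ export interface Person {",
  "-  name: string;",
  "+  firstName: string;",
  "+  lastName: string;",
  "@@ -10,0 +12 @@ export interface Address {",
  "+  zip: string;",
].join("\n");

const changedLines = parseChangedLines(sampleDiff);
// changedLines.get("src/shared/types.ts") holds the ranges 3-4 and 12-12
```

Using zero context lines keeps each hunk limited to the lines that actually changed, which makes the later line-number lookup in the AST more precise.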

By analyzing the Abstract Syntax Tree (AST) based on the line number, you can determine the scope of the modified code, analyze the affected exported variables, and count the changed files and corresponding variables.
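The line-number matching can be pictured as an interval overlap check. The following is a minimal sketch, assuming the exported declarations of a file have already been reduced to a name plus start and end lines (for instance from the AST); `ExportedDeclaration`, `findChangedExports`, and the sample line numbers are hypothetical.

```typescript
// Line range changed in a file, e.g. taken from the git diff.
interface LineRange {
  start: number;
  end: number;
}

// Hypothetical summary of one exported declaration in the changed file.
interface ExportedDeclaration {
  name: string;
  startLine: number;
  endLine: number;
}

// Returns the names of exports whose line span overlaps any changed range.
function findChangedExports(
  exportsInFile: ExportedDeclaration[],
  changedRanges: LineRange[],
): string[] {
  const changed: string[] = [];
  for (const decl of exportsInFile) {
    // Two closed intervals overlap when each one starts before the other ends.
    const touched = changedRanges.some(
      (range) => range.start <= decl.endLine && range.end >= decl.startLine,
    );
    if (touched && changed.indexOf(decl.name) === -1) {
      changed.push(decl.name);
    }
  }
  return changed;
}

// Made-up line spans for illustration only.
const sharedExports: ExportedDeclaration[] = [
  { name: "Person", startLine: 1, endLine: 5 },
  { name: "Address", startLine: 8, endLine: 13 },
  { name: "formatName", startLine: 15, endLine: 20 },
];

const changedExports = findChangedExports(sharedExports, [
  { start: 3, end: 4 },
  { start: 12, end: 12 },
]);
// changedExports is ["Person", "Address"]
```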

Code implementation

finally output

Scan imports in TypeScript files, match changed files and changed variables.

There are various types of references used in our code.

To observe some patterns, paste this code into https://ts-ast-viewer.com/.

import foo from '../shared';
import * as foo from '../shared';
import { Address, Person as Person1 } from '@shared';
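The three forms above differ in how a changed export reaches the importing file: a default import only depends on the default export, a namespace import can reach every export, and a named import may be renamed locally (Person becomes Person1). The sketch below shows one way to match them against the changed exports. It assumes the imports have already been read from the AST into plain records and that module specifiers have been resolved to file paths; `ImportRecord`, `matchChangedImports`, and all sample values are hypothetical.

```typescript
type ImportKind = "default" | "namespace" | "named";

// Hypothetical flattened view of one imported binding.
interface ImportRecord {
  kind: ImportKind;
  resolvedPath: string; // file the module specifier resolves to
  importedName: string; // exported name ("default" or "*" for the first two kinds)
  localName: string; // name used inside the importing file
}

// Returns the affected imports, keyed by the local name used in the file.
function matchChangedImports(
  imports: ImportRecord[],
  changedFile: string,
  changedExports: string[],
): Record<string, ImportRecord> {
  const importItems: Record<string, ImportRecord> = {};
  for (const item of imports) {
    if (item.resolvedPath !== changedFile) continue;
    const affected =
      item.kind === "namespace"
        ? changedExports.length > 0 // every export is reachable as ns.X
        : changedExports.indexOf(item.importedName) !== -1;
    if (affected) {
      importItems[item.localName] = item;
    }
  }
  return importItems;
}

const sharedPath = "src/shared/index.ts"; // assumed resolution target
const records: ImportRecord[] = [
  { kind: "default", resolvedPath: sharedPath, importedName: "default", localName: "foo" },
  { kind: "namespace", resolvedPath: sharedPath, importedName: "*", localName: "ns" },
  { kind: "named", resolvedPath: sharedPath, importedName: "Address", localName: "Address" },
  { kind: "named", resolvedPath: sharedPath, importedName: "Person", localName: "Person1" },
];

const matched = matchChangedImports(records, sharedPath, ["Person", "Address"]);
// Keys of matched: "ns", "Address", "Person1" (the default import is unaffected)
```

Keying the result by local name matters because later steps look up identifiers by the text that appears in the importing file, which is Person1 rather than Person.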

Code implementation

Analysis of code used

We aim to analyze the usage of collected variables in a file. How can we design a strategy for this? Let's take a look at the following code snippet.

Copy the code and paste it into https://ts-ast-viewer.com/.

import { Foo, Bar } from "./types";
import baz from "./baz";

function run(foo: Foo, bar: Bar) {
  return qux(foo) + bar.baz;

  function qux(foo: Foo) {
    return foo + 1;
  }
}

const myFoo: Foo = 1;
const myBar: Bar = { baz: 2 };

baz(myFoo, myBar);

There are two key points to consider:

  1. All references are called through identifiers.
  2. To tell a reference to the import apart from a local variable with the same name, compare the pos and end of the declarations of the identifier's symbol. As shown in the figure below, identifiers that refer to the same import share the same declaration pos and end.

The pos of the identifier from the import

A local variable with the same name has a different pos and end

References to the imported identifier: the pos and end of the symbol's declarations match those of the import

Code implementation

STEP1. Traverse the AST nodes of the target source file

Use forEachDescendant:

sourceFile.forEachDescendant((node) => {
  ...
})

STEP2. Get all Identifier nodes

if (node.getKind() !== SyntaxKind.Identifier) {
  return;
}

STEP3. Check if the text name of the node matches the text name of the reference node

const name = node.getText();
const matchImportItem = importItems[name];
if (!matchImportItem) {
  return;
}

STEP4. Exclude local variables with the same name that could cause interference

const symbol = node.getSymbol();
if (symbol) {
  const symbolDeclarations = symbol.getDeclarations();
  if (symbolDeclarations && symbolDeclarations.length > 0) {
    const symbolPos = symbolDeclarations[0].getPos();
    const symbolEnd = symbolDeclarations[0].getEnd();
    // Identifier symbol.declarations pos and end must be equal to importItem symbol.declarations pos and end
    if (
      // exclude the reference from importItem
      matchImportItem.symbolPos !== symbolPos &&
      // exclude local variable with the same name
      matchImportItem.pos === symbolPos &&
      matchImportItem.end === symbolEnd
    ) {
      .....
    }
  }
}
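To make the filtering rule easier to test in isolation, here is a simplified sketch that works on plain records instead of AST nodes. It reflects one possible reading of the check above: an identifier counts as a usage when the first declaration of its symbol spans the same pos and end as the import, and the identifier is not the one written inside the import statement itself. The types `IdentifierInfo` and `ImportItem`, their field names, and all positions are hypothetical and do not mirror the real shape of importItems.

```typescript
// Source span of a declaration (start and end offsets in the file).
interface Span {
  pos: number;
  end: number;
}

// Hypothetical flattened view of one Identifier node.
interface IdentifierInfo {
  name: string;
  line: number;
  nodePos: number; // position of the identifier itself
  declaration: Span | null; // first declaration of its symbol, if resolved
}

// Hypothetical record kept for one imported name.
interface ImportItem {
  name: string;
  identifierPos: number; // position of the identifier inside the import statement
  declaration: Span; // span of the import specifier that declares it
}

function isImportUsage(id: IdentifierInfo, item: ImportItem): boolean {
  if (id.declaration === null) return false;
  // Skip the identifier that sits in the import statement itself.
  if (id.nodePos === item.identifierPos) return false;
  // A local variable with the same name is declared elsewhere, so its span differs.
  return (
    id.declaration.pos === item.declaration.pos &&
    id.declaration.end === item.declaration.end
  );
}

function collectUsageLines(ids: IdentifierInfo[], item: ImportItem): number[] {
  return ids
    .filter((id) => id.name === item.name && isImportUsage(id, item))
    .map((id) => id.line);
}

// Made-up positions for illustration only.
const fooImport: ImportItem = { name: "Foo", identifierPos: 9, declaration: { pos: 8, end: 12 } };
const identifiers: IdentifierInfo[] = [
  { name: "Foo", line: 1, nodePos: 9, declaration: { pos: 8, end: 12 } }, // in the import statement
  { name: "Foo", line: 4, nodePos: 80, declaration: { pos: 8, end: 12 } }, // usage of the import
  { name: "Foo", line: 9, nodePos: 200, declaration: { pos: 190, end: 215 } }, // local name clash
  { name: "Foo", line: 12, nodePos: 260, declaration: { pos: 8, end: 12 } }, // usage of the import
];

const usageLines = collectUsageLines(identifiers, fooImport);
// usageLines is [4, 12]
```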

STEP5. Count the lines where the code is called and output the result as statistics.

const line = node.getStartLineNumber();
importItemMap[filePath]?.[name]?.callLines?.push(line);
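Note that with optional chaining the push is silently skipped when the entry for the file or the name does not exist yet. The sketch below shows one way to create missing entries on demand and turn the collected lines into a plain-text report. The `ImportItemMap` shape is inferred from the line above; `recordCall`, `formatReport`, the report wording, and the sample paths are assumptions for illustration.

```typescript
interface ImportUsage {
  callLines: number[];
}

// filePath -> imported name -> usage statistics
type ImportItemMap = Record<string, Record<string, ImportUsage>>;

// Records one usage, creating the nested entries when they are missing.
function recordCall(map: ImportItemMap, filePath: string, name: string, line: number): void {
  const fileEntry = map[filePath] ?? (map[filePath] = {});
  const usage = fileEntry[name] ?? (fileEntry[name] = { callLines: [] });
  usage.callLines.push(line);
}

// Produces one report line per file and imported name.
function formatReport(map: ImportItemMap): string[] {
  const report: string[] = [];
  for (const filePath of Object.keys(map)) {
    for (const name of Object.keys(map[filePath])) {
      const callLines = map[filePath][name].callLines;
      report.push(
        `${filePath}: ${name} used ${callLines.length} time(s) on line(s) ${callLines.join(", ")}`,
      );
    }
  }
  return report;
}

// Hypothetical usages collected while scanning one file.
const usageMap: ImportItemMap = {};
recordCall(usageMap, "src/app.ts", "Person", 12);
recordCall(usageMap, "src/app.ts", "Person", 30);
recordCall(usageMap, "src/app.ts", "Address", 18);

const report = formatReport(usageMap);
// report[0] is "src/app.ts: Person used 2 time(s) on line(s) 12, 30"
// report[1] is "src/app.ts: Address used 1 time(s) on line(s) 18"
```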

Further optimization