Currently in the process of developing a parser that generates an AST and then traversing it through different passes. The simplified AST structure is as follows:
type LiteralExpr = {
readonly kind: 'literal',
readonly value: number,
};
type UnaryExpr = {
readonly kind: 'unary',
readonly operator: '!' | '-',
readonly operand: Expr,
};
type BinaryExpr = {
readonly kind: 'binary',
readonly left: Expr,
readonly operator: '+' | '-' | '*' | '/',
readonly right: Expr,
};
/** Expression enclosed in parentheses */
type GroupingExpr = {
readonly kind: 'grouping',
readonly subExpr: Expr,
};
type Expr = LiteralExpr | UnaryExpr | BinaryExpr | GroupingExpr;
Each pass makes slight modifications to the AST, creating a new version. For example, there's a pass that removes grouping
nodes:
class ParensRemover {
doPass(expr: Expr): Expr {
switch (expr.kind) {
case 'literal': return expr;
case 'unary': return { ...expr, operand: this.doPass(expr.operand) };
case 'binary': return { ...expr, left: this.doPass(expr.left), right: this.doPass(expr.right) };
case 'grouping': return this.doPass(expr.subExpr);
}
}
}
However, the code becomes repetitive especially with numerous nodes, prompting a refactoring to a base recursive class utilizing the visitor pattern:
abstract class ASTVisitor {
doPass(expr: Expr): Expr {
switch (expr.kind) {
case 'literal': return this.visitLiteral(expr);
case 'unary': return this.visitUnary(expr);
case 'binary': return this.visitBinary(expr);
case 'grouping': return this.visitGrouping(expr);
}
}
protected visitLiteral(expr: LiteralExpr): Expr {
return expr;
}
protected visitUnary(expr: UnaryExpr): Expr {
return { ...expr, operand: this.doPass(expr.operand) };
}
protected visitBinary(expr: BinaryExpr): Expr {
return { ...expr, left: this.doPass(expr.left), right: this.doPass(expr.right) };
}
protected visitGrouping(expr: GroupingExpr): Expr {
return { ...expr, subExpr: this.doPass(expr.subExpr) };
}
}
class ParensRemover extends ASTVisitor {
protected visitGrouping(expr: GroupingExpr): Expr {
return this.doPass(expr.subExpr);
}
}
The challenge arises when subsequent passes after ParensRemover
need to handle the no longer existing node kind grouping
. This can lead to inefficiencies given the multiple node types undergoing changes throughout various passes, necessitating adjustments to the AST Expr
type:
type LiteralExpr = {
readonly kind: 'literal',
readonly value: number,
};
type UnaryExpr<Addition> = {
readonly kind: 'unary',
readonly operator: '!' | '-',
readonly operand: ExprBase<Addition>,
};
type BinaryExpr<Addition> = {
readonly kind: 'binary',
readonly left: ExprBase<Addition>,
readonly operator: '+' | '-' | '*' | '/',
readonly right: ExprBase<Addition>,
};
/** Parenthesized expression */
type GroupingExpr = {
readonly kind: 'grouping',
readonly subExpr: BeforeRemoveParensExpr,
};
type ExprBase<Addition> = LiteralExpr | UnaryExpr | BinaryExpr | Addition;
type BeforeRemoveParensExpr = ExprBase<GroupingExpr>;
type AfterRemoveParensExpr = ExprBase<never>;
This leads to the question of how the ASTVisitor
will determine the correct type. A possible solution attempted involves:
type AllExprs = BeforeRemoveParensExpr | AfterRemoveParensExpr;
type PickExpr<E extends AllExprs, K extends E['kind']> = /* details not important, this type pulls a specific kind out of Expr */;
abstract class ASTVisitor<InputExpr extends AllExprs, OutputExpr extends AllExprs> {
doPass(expr: InputExpr): OutputExpr {
switch (expr.kind) {
case 'literal': return this.visitLiteral(expr as any);
case 'unary': return this.visitUnary(expr as any);
case 'binary': return this.visitBinary(expr as any);
case 'grouping': return this.visitGrouping(expr as any);
}
}
protected visitLiteral(expr: PickExpr<InputExpr, 'literal'>) {
return expr as unknown OutputExpr;
}
protected visitUnary(expr: PickExpr<InputExpr, 'unary'>) {
return { ...expr, operand: this.doPass(expr.operand) } as unknown as OutputExpr;
}
protected visitBinary(expr: PickExpr<InputExpr, 'binary'>) {
return { ...expr, left: this.doPass(expr.left), right: this.doPass(expr.right) } as unknown as OutputExpr;
}
protected visitGrouping(expr: PickExpr<InputExpr, 'grouping'>) {
return { ...expr, subExpr: this.doPass(expr.subExpr) } as unknown as OutputExpr;
}
}
class ParensRemover extends ASTVisitor<BeforeRemoveParensExpr, AfterRemoveParensExpr> {
protected visitGrouping(expr: GroupingExpr): AfterRemoveParensExpr {
return this.doPass(expr.subExpr);
}
}
Despite this attempt, dissatisfaction remains over losing type safety and requiring multiple casts to any
within the ASTVisitor
. Overlooking a required override of visitX()
for a certain X
that should change between passes could result in program errors without compiler warnings.
The ultimate goal is to maintain TypeScript's safety while achieving the desired functionality. Modifications to the AST representation are feasible if necessary. Your insights and suggestions are greatly appreciated. Apologies for the lengthy post. Thank you!