Deriving implementation design

Digest   2024-09-02 18:00   Shanghai

Mashkov Sergey 


Researcher, St. Petersburg Research Institute, Huawei Programming Languages Lab


@Derive[ToString, Equatable]
class C {
    let a = 1
}


Motivation


For frequently used interfaces, one often needs to write implementations for user-defined types. Even though such implementations are trivial, writing them takes time and can amount to a lot of mechanical work, especially for so-called data classes with many fields. Such work is usually perceived as tedious. Some interfaces also require many functions to be implemented, e.g. Comparable requires implementing 9 functions.


struct UserInfo <: Equatable<UserInfo> {
    var login: String
    var email: String
    var address: String
    var age: Int

    public override operator func ==(other: UserInfo) {
        this.login == other.login &&
            this.email == other.email &&
            this.address == other.address &&
            this.age == other.age
    }

    public override operator func !=(other: UserInfo) {
        !(this == other)
    }
}

enum Constraints <: Equatable<Constraints> & ToString {
    | North | West | East | South | Grid(Int)

    public override operator func ==(other: Constraints) {
        match ((this, other)) {
            case (North, North) => true
            case (West, West) => true
            case (East, East) => true
            case (South, South) => true
            case (Grid(a), Grid(b)) where a == b => true
            case _ => false
        }
    }

    public override operator func !=(other: Constraints) {
        !(this == other)
    }

    public override func toString() {
        match (this) {
            case North => "North"
            case West => "West"
            case East => "East"
            case South => "South"
            case Grid(gap) => "Grid(${gap})"
        }
    }
}


Many languages have solutions for this problem: language features, code processing tools, or compiler plugins.
  • Java: Lombok (source processor)

  • Kotlin: data classes

  • C#: Equ (LINQ+runtime magic), Lombok.NET and others

  • Go: deriving

  • Groovy: groovy.transform.* transformers (compiler plugin)

  • Rust: derive

  • Haskell, OCaml: deriving


Overall architecture


The API part consists of a data model that is constructed from the input declaration and macro attributes and passed into the deriving implementation. The impl core contains all the shared implementation that is common to all derivings. The macro package contains only macro declarations and the basic parts that can't be moved out of the package; it delegates all the work to the core.

The core takes the parsed input from the macro, does analysis and validation, finds the deriving implementations, does the preparations, and then invokes the particular derivings.
The registry contains the mappings between derived interfaces and deriving implementations. Currently the mapping is hardcoded.


Stages


Combine


At this stage we take all options, combine them together, and check them along the way.


Lookup


At the lookup stage we look for deriving implementations. For every interface requested to be derived, we look into the derivings registry and find the implementations. At the moment the registry is hardcoded, but in the future a way to inject implementations could be introduced to make derivings customizable.
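The lookup described above can be pictured as a plain map from interface names to implementations. The following is only a minimal sketch under stated assumptions: the one-function Deriving interface, the buildRegistry and lookup helpers, and the use of strings as keys are all hypothetical stand-ins, since the real core works on AST nodes and its API is not shown in this article.

```cangjie
import std.collection.HashMap

// Hypothetical stand-in for a deriving implementation, reduced to one function.
interface Deriving {
    func name(): String
}

class ToStringDeriving <: Deriving {
    public func name(): String { "ToStringDeriving" }
}

class EquatableDeriving <: Deriving {
    public func name(): String { "EquatableDeriving" }
}

// Hardcoded registry: derived interface name -> deriving implementation.
func buildRegistry(): HashMap<String, Deriving> {
    let registry = HashMap<String, Deriving>()
    registry["ToString"] = ToStringDeriving()
    registry["Equatable"] = EquatableDeriving()
    return registry
}

// Returns None for interfaces without a registered deriving,
// so the caller can report a diagnostic instead of throwing.
func lookup(registry: HashMap<String, Deriving>, interfaceName: String): Option<Deriving> {
    return registry.get(interfaceName)
}

main() {
    let registry = buildRegistry()
    for (requested in ["ToString", "Equatable", "Serializable"]) {
        match (lookup(registry, requested)) {
            case Some(d) => println("${requested} -> ${d.name()}")
            case None => println("${requested} -> no deriving registered")
        }
    }
}
```

Returning an Option rather than throwing keeps the lookup compatible with the rule that the macro should report problems and continue.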


Resolve


For every field without an explicitly specified type, we do simple type resolution based on the field's initializer expression. In the future, once late macros are available, we will simply use the types provided at the late stage.


Since we don't have late macros yet, we can temporarily implement simple type resolution by evaluating the initializer expression AST. This only works for expressions consisting of literals and simple arithmetic operations, which is enough for many cases. If the expression is too complicated or depends on unknown functions or properties, we give up, report an error on the field, and require the user to specify the field type explicitly, which the spec allows.
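To make the idea of literal-based resolution concrete, here is a small sketch. It assumes a hypothetical, simplified expression model (the Expr enum and its constructors are invented for illustration); the real macro would inspect the initializer's AST nodes instead. The sketch resolves literals and same-typed numeric arithmetic and gives up on everything else.

```cangjie
// Hypothetical, simplified expression model; not the real AST.
enum Expr {
    | IntLit(Int64)
    | FloatLit(Float64)
    | StrLit(String)
    | Arith(Expr, String, Expr) // left operand, operator, right operand
    | UnknownCall(String)       // call to a function we know nothing about
}

func isArithmetic(op: String): Bool {
    op == "+" || op == "-" || op == "*" || op == "/"
}

// Returns the inferred type name, or None when the expression is too complex.
func resolveType(e: Expr): Option<String> {
    match (e) {
        case IntLit(_) => Some("Int64")
        case FloatLit(_) => Some("Float64")
        case StrLit(_) => Some("String")
        case Arith(left, op, right) where isArithmetic(op) =>
            match ((resolveType(left), resolveType(right))) {
                // Conservative choice: only same-typed numeric operands are resolved.
                case (Some(a), Some(b)) where a == b && a != "String" => Some(a)
                case _ => None
            }
        // Unknown operators and calls: give up, the user must spell out the type.
        case _ => None
    }
}

main() {
    // Models: let a = 1 + 2 * 3
    let a = Arith(IntLit(1), "+", Arith(IntLit(2), "*", IntLit(3)))
    // Models: let x = createX()
    let x = UnknownCall("createX")
    println(resolveType(a) ?? "unresolved: specify the field type explicitly")
    println(resolveType(x) ?? "unresolved: specify the field type explicitly")
}
```

Returning None instead of guessing matches the described behaviour of reporting an error on the field and asking for an explicit type.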


Verify


At this stage we check types for subtyping wherever possible. Every field type needs to implement the interface we are generating an implementation for. For built-in types such as String, Int64 and others, we know the type hierarchy and can omit the checks for ToString, Equatable and Comparable.


For unknown types, since we have neither late macros nor compile-time reification, we produce "dead" code that assigns the field to a local variable of the corresponding type. We use the field's identifier token in the assignment expression so that the compiler reports the error at the field itself if the type doesn't fit.

The generated code would look like this (see the function checkTypes):


@Derive[ToString]
class C {
    let x = createX() // we don't know the type of x
}

// generated code:
extend C <: ToString {
    public func toString(): String {
        /** implementation goes here **/
    }

    private func checkTypes() {
        let _: ToString = x // type check
    }
}

If there are no fields to verify, the function checkTypes shouldn't be generated. The actual name of the function should be different to avoid a potential clash with names in user code. If produced, the verification code is appended during the next stage.


Generate


At this stage we construct the AST nodes for the implementations and, if necessary, for the generic constraints. We collect the list of extend nodes, then render them to tokens and concatenate them with the original tokens. We never modify the original declaration tokens, so there is no risk of accidentally breaking user code.

Importantly, we always try to produce as much code as possible and never fail fast. Instead, we report diagnostics via the diagReport() facility and try to proceed. This matters because the macro has to complete without exceptions so that the compiler can display the macro diagnostics properly; it also gives the compiler a chance to handle the generated type check functions, which wouldn't appear if we simply failed.
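The "report and proceed" policy can be sketched as follows. This is an illustration under assumptions, not the actual generator: the Field struct and generateBody function are hypothetical, and a plain StringBuilder stands in for the diagnostics that the real macro would send through diagReport().

```cangjie
// Hypothetical field model: a name plus an optionally resolved type.
struct Field {
    let name: String
    let typeName: Option<String>

    init(name: String, typeName: Option<String>) {
        this.name = name
        this.typeName = typeName
    }
}

// Generates one line of code per resolvable field. A problematic field adds
// a diagnostic and is skipped; generation continues with the next field.
func generateBody(fields: Array<Field>, diagnostics: StringBuilder): String {
    let code = StringBuilder()
    for (f in fields) {
        match (f.typeName) {
            case Some(_) => code.append("sb.append(${f.name}.toString())\n")
            case None => diagnostics.append("error: cannot resolve the type of '${f.name}', specify it explicitly\n")
        }
    }
    return code.toString()
}

main() {
    let fields = [Field("a", Some("Int64")), Field("x", None), Field("b", Some("String"))]
    let diagnostics = StringBuilder()
    let code = generateBody(fields, diagnostics)
    print(code)                   // code for the fields that could be handled
    print(diagnostics.toString()) // everything that went wrong, reported together
}
```

Because no exception is thrown, the fields after the failing one are still processed and all problems surface in a single run.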


Generic constraints


For every derived interface we have already found the deriving implementation. Each deriving implementation provides default upper bounds for generics and a function for constructing upper bounds with the required generic arguments if necessary. For example, EquatableDeriving provides a way to construct generics in the form:


@Derive[Equatable]
class C<T> { ... }

// generated
extend <T> C<T> <: Equatable<C<T>>
    where T <: Equatable<T> // this is constructed by the deriving implementation


Only the particular deriving knows how to construct the constraints for its specific derived interface, so we don't have any defaults in the derivings core.


API


Data model



DerivingTarget contains an analysed target declaration enriched with user settings and pure information for derivings, such as an array of named attributes that should be considered during generation. By contrast, classes whose names end with Settings contain only options from users, without analysis.

All the structures are computed by the derivings core and passed into the deriving implementation.


Deriving core and implementations



The Deriving interface represents a deriving implementation. It may handle one or more interfaces and should be able to provide type information about the interface hierarchy, which is used for sorting and combining derivings in cases where the derived interfaces intersect (for example Equatable and Comparable).

The GenericsInjector interface implementation is provided by the deriving implementation and has two functions. injectGenerics should provide generic arguments for the interface if required. constraintsFor should construct the default generic constraints.

All the functions are invoked by the deriving core.
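As a rough illustration of how the two GenericsInjector functions could cooperate, consider the sketch below. The article names the interface and its functions but not their signatures, so the parameters, the string-based representation, the EquatableGenerics class and the extendHeader helper are all assumptions made for this example; the real implementation operates on AST nodes.

```cangjie
// Hypothetical signatures; plain strings stand in for AST nodes.
interface GenericsInjector {
    // Generic arguments for the derived interface, e.g. "<C<T>>" for Equatable.
    func injectGenerics(targetType: String): String
    // Default upper bound for one generic parameter of the target.
    func constraintsFor(typeParameter: String): String
}

class EquatableGenerics <: GenericsInjector {
    public func injectGenerics(targetType: String): String {
        "<" + targetType + ">"
    }

    public func constraintsFor(typeParameter: String): String {
        typeParameter + " <: Equatable<" + typeParameter + ">"
    }
}

// Plays the role of the core: it calls the injector and assembles the header.
func extendHeader(target: String, typeParameter: String, interfaceName: String,
    generics: GenericsInjector): String {
    let derived = interfaceName + generics.injectGenerics(target)
    let bound = generics.constraintsFor(typeParameter)
    return "extend<${typeParameter}> ${target} <: ${derived} where ${bound}"
}

main() {
    // Intended to mirror the generated header shown for class C<T>.
    println(extendHeader("C<T>", "T", "Equatable", EquatableGenerics()))
}
```

Keeping both functions on the deriving side reflects the point that only the particular deriving knows how to build constraints for its interface.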


Format


These are the generation schemas for the built-in derivings.


ToString format


Examples:

User(name: "bot1")
User()
Server(host: "Venus", port: 10443)
E.EnumMember1
E.EnumMember2(773)
E.EnumMember3(test, 123, named: 1) // 'named' stands for enum property

@Derive[ToString]
class C {
    let a = 1

    @DeriveInclude
    prop b: String {
        get() { ... }
    }
}

// generated
extend C <: ToString {
    public func toString(): String {
        let sb = StringBuilder()

        sb.append("C(")
        sb.append("a = ")
        sb.append(a.toString())
        sb.append(", b = ")
        sb.append(b.toString())
        sb.append(")")

        return sb.toString()
    }
}


Equatable and Comparable


@Derive[Comparable]
class User {
    var name = "root"
}

// generated

extend User <: Equatable<User> {
    public operator func ==(other: User): Bool {
        if (name != other.name) {
            return false
        }
        return true
    }

    public operator func !=(other: User): Bool {
        return !(this == other)
    }
}

extend User <: Comparable<User> {
    public func compare(other: User): Ordering {
        match (name.compare(other.name)) {
            case Ordering.EQ => ()
            case other => return other
        }

        return Ordering.EQ
    }

    public operator func <(other: User): Bool {
        return compare(other) == Ordering.LT
    }

    public operator func <=(other: User): Bool {
        return compare(other) <= Ordering.EQ
    }

    public operator func >(other: User): Bool {
        return compare(other) == Ordering.GT
    }

    public operator func >=(other: User): Bool {
        return compare(other) >= Ordering.EQ
    }
}


Hashable


extend User <: Hashable {
    public func hashCode(): Int64 {
        var h = DefaultHasher()
        h.write(name)
        return h.finish()
    }
}


Resource


@Derive[Resource]
class Service {
    let log = openFile(....)
    let connection = openSocket(...)
    let sql = openSql(connection)
}

// generated

extend Service <: core.Resource {
    public func close(): Unit {
        try {
            tryClose(sql)
        } finally {
            try {
                tryClose(connection)
            } finally {
                tryClose(log)
            }
        }
    }

    // these are generated helpers
    // used in close() and isClosed()
    private func tryClose(e: Resource) {
        e.close()
    }

    private func tryClose<T>(_: T) {
        // not a resource
    }
}


