Scala Macros - Annotate your case classes

Chances are, you occasionally need some way to apply a specific behavior to your classes—even more often if you’re managing your domain model.

Scala macros could be the perfect solution. Why? Let’s focus on an individual case to find a real-life example. Let’s assume that your application handles sensitive information and many of your case classes have been modeled with critical data.

What’s our first step? Should we separate our classes into two types? Or override the toString method in all our classes in order to prevent sensitive data from showing up in our system logs? No, there’s a better way.

The approach I recommend is to annotate our classes. Something like this:

@ToStringObfuscate("password")
case class TestWithObfuscation(username : String, password : String)

Here’s how to develop a ToStringObfuscate annotation with Scala macros:

Scala Macros

When we talk about macros, we’re referring to code that runs at compilation time. Scala macros are functions that the compiler invokes while it compiles your program. At that point, the developer has access to compiler APIs, so it’s possible to expand, generate, analyze, and check your code. (You can learn more about macros in Scala Docs or on the Scala Macros website.)

Let’s Configure the sample project

We want to create a macro that overrides the toString method and obfuscates all of the fields marked as sensitive in our case class. With macros there are several ways to achieve this, but in this case we are going to annotate the case class with the names of all the fields that must be obfuscated.

We’ll develop the example in two different sbt modules: one with the macro implementation and the other with some test cases. The split is needed because a macro must be compiled before any code that uses it. We also need some special dependencies related to macros:

  • Scala Reflect: API that powers both compile-time metaprogramming (macros) and runtime metaprogramming (Java-style reflection) in Scala 2.10 and 2.11.
  • Macro paradise: compiler plugin for Scala 2.10 and Scala 2.11 that adds new functionality such as quasiquotes and macro annotations to scala.reflect.

Therefore, the root project could be configured as follows:

scalaVersion := "2.11.5"

name := "annotate-your-case-classes"

version := "1.0"

lazy val root = (project in file(".")).aggregate(macros, examples)

lazy val macros = project.in(file("modules/macros"))

lazy val examples = project.in(file("modules/examples")).dependsOn(macros)

The modules/macros project, in turn, could be configured like this:

scalaVersion := "2.11.5"

name := "macros"

version := "1.0"

resolvers ++= Seq("snapshots", "releases").map(Resolver.sonatypeRepo)

libraryDependencies ++= Seq(
  "org.scala-lang" % "scala-reflect" % "2.11.5"
)

addCompilerPlugin("org.scalamacros" % "paradise" % "2.0.1" cross CrossVersion.full)

You’ll notice both of the special configuration keys mentioned above: the dependency on scala-reflect and the paradise compiler plugin.
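The examples module is not shown above. As a rough sketch, assuming it lives in modules/examples and reuses the same versions (the file name and its contents here are illustrative, not taken from the sample project), it could look like this. The point worth noting is that the paradise plugin has to be enabled in any module that uses a macro annotation, not only in the one that defines it:

```scala
// modules/examples/build.sbt (hypothetical file: name and settings are assumptions)
scalaVersion := "2.11.5"

name := "examples"

version := "1.0"

// Needed here as well: macro annotations are expanded by the paradise plugin
// while compiling the code that uses them, not only the code that defines them.
addCompilerPlugin("org.scalamacros" % "paradise" % "2.0.1" cross CrossVersion.full)
```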

And now, let’s Write the Macro

Before proceeding, keep in mind that everything we write in this section runs at compilation time.

Macro annotations bring textual abstraction to the level of definitions. In our case, we are going to define an annotation called ToStringObfuscate, which receives one or more field names in the case class, indicating which fields should be obfuscated.

import scala.annotation.StaticAnnotation
import scala.language.experimental.macros
import scala.reflect.macros.whitebox

class ToStringObfuscate(fieldsToObfuscate: String*) extends StaticAnnotation {
  def macroTransform(annottees: Any*): Any = macro ToStringObfuscateImpl.impl
}

The macroTransform macro is designed to take a list of untyped annottees (represented as Any) and return one result (also of Any type). The implementation is delegated to a method called impl within a separate object called ToStringObfuscateImpl.

So far so good? We’ve defined a macro annotation in a way that everything can be annotated–a class, a method, an object, a method parameter, etc. Now let’s see how to apply it to case classes specifically.

def impl(c: whitebox.Context)(annottees: c.Expr[Any]*): c.Expr[Any] = ???

The method signature is similar to what we saw above. However, make sure you’re using the whitebox context: if you declare a macro annotation as blackbox, it will not work. (There’s more information related to contexts in Blackbox vs Whitebox.)

Expr is a trait that defines strongly-typed tree wrappers and their operations in Scala Reflection: it wraps an abstract syntax tree and tags it with its type.
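To get a feel for Expr outside of a macro, here is a minimal sketch that uses the runtime reflection universe instead of a macro context (an assumption made purely so it can be tried in a REPL with scala-reflect on the classpath; inside a macro you would use c.Expr from c.universe). ExprSketch and its members are hypothetical names:

```scala
import scala.reflect.runtime.universe._

object ExprSketch {
  // reify turns a typed Scala expression into an Expr that carries
  // both the syntax tree and the expression's static type.
  val greeting: Expr[String] = reify("hello".toUpperCase)

  // The untyped view: the abstract syntax tree wrapped by the Expr.
  def underlying: Tree = greeting.tree

  // The typed view: the type attached to the tree is String.
  def isStringTyped: Boolean = greeting.staticType =:= typeOf[String]
}
```

The tree is what the macro manipulates; the type tag is what lets the compiler treat the result as a typed expression.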

The macro implementation accepts only a single class definition as its annottee and aborts compilation on anything else:

// Assumes import c.universe._ is in scope inside impl
annottees.map(_.tree).toList match {
  case (classDecl: ClassDef) :: Nil => modifiedDeclaration(classDecl)
  case _ => c.abort(c.enclosingPosition, "Invalid annottee")
}

Each AST (abstract syntax tree) that matches a class definition is parsed and processed with the following steps:

  • Extract the different parts of the case class.
  • Redefine the toString method.
  • Duplicate the params to fit them in the new class definition.
  • Create the new case class definition with the new toString method, leaving the remainder of the parts intact. (To accomplish this step we’ll use Quasiquotes; Quasiquotes let us manipulate Scala syntax trees with ease.)

These steps can be translated into the following piece of code:

def modifiedDeclaration(classDecl: ClassDef) = {
  val (className, fields, parents, body) = extractCaseClassesParts(classDecl)
  val newToString = extractNewToString(sensitiveFields)

  val params = fields.asInstanceOf[List[ValDef]] map { p => p.duplicate}

  c.Expr[Any](
    q"""
    case class $className ( ..$params ) extends ..$parents {
      $newToString
      ..$body
    }
  """
  )
}

When you wrap a snippet of code into q"..." quotations, it becomes a tree that represents the given snippet. Although quasiquotes look like strings, they operate on syntactic trees.

As you can see, we are putting things into quasiquotations with the help of $, which is really a way of unquoting. We also use a special form of unquoting, called splicing, which is necessary when we have a variable number of elements.

  • ..$ expects the argument to be an Iterable[Tree]. In our code, we have three use cases: the case class fields, the case class parents, and the case class body.
  • ...$ expects Iterable[Iterable[Tree]]. In our use case, we don’t need it.
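The two splicing forms can be tried outside of a macro. The following sketch uses the runtime reflection universe (an assumption for easy experimentation; it needs Scala 2.11+ with scala-reflect on the classpath). The greet function name and the SplicingSketch object are made up for illustration:

```scala
import scala.reflect.runtime.universe._

object SplicingSketch {
  // Helper: turn plain strings into string-literal trees.
  def literals(values: String*): List[Tree] =
    values.toList.map(v => Literal(Constant(v)))

  // ..$ splices a List[Tree] into a single argument list: greet(a, b, ...)
  def buildCall(args: List[Tree]): Tree = q"greet(..$args)"

  // The same syntax in a pattern pulls the argument trees back out.
  def extractArgs(call: Tree): List[Tree] = call match {
    case q"greet(..$args)" => args
    case _                 => Nil
  }

  // ...$ splices a List[List[Tree]] into several argument lists: greet(a)(b, c)
  def buildCurried(argss: List[List[Tree]]): Tree = q"greet(...$argss)"

  def extractArgLists(call: Tree): List[List[Tree]] = call match {
    case q"greet(...$argss)" => argss
    case _                   => Nil
  }
}
```

Nothing here requires greet to exist: the trees are untyped until they are handed back to the compiler.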

The extractCaseClassesParts method can be implemented as follows:

def extractCaseClassesParts(classDecl: ClassDef) = classDecl match {
  case q"case class $className(..$fields) extends ..$parents { ..$body }" =>
    (className, fields, parents, body)
}

Yet again we’re using quasiquotes, this time to parse a case class in order to extract the parts of its definition. In other words, we are deconstructing a tree using unquoting in a pattern match.
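To see what such a deconstruction yields, here is a small sketch that applies the same pattern to a sample tree built with the runtime reflection universe (again an assumption for experimentation; inside the macro the tree comes from the annottee). The User class and the CaseClassPartsSketch object are hypothetical:

```scala
import scala.reflect.runtime.universe._

object CaseClassPartsSketch {
  // A sample case class definition as an untyped syntax tree.
  val sample: Tree = q"case class User(username: String, password: String)"

  // Same pattern as extractCaseClassesParts, but returning the class name
  // and the field names as plain strings so they are easy to inspect.
  def describe(tree: Tree): Option[(String, List[String])] = tree match {
    case q"case class $className(..$fields) extends ..$parents { ..$body }" =>
      val fieldNames = fields.collect { case v: ValDef => v.name.toString }
      Some((className.toString, fieldNames))
    case _ => None
  }
}
```

Here className is a TypeName and each field is a ValDef, which is why the macro can duplicate the fields and splice them into the new class definition.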

Now we reach the defining moment in the task: learning how to obfuscate those sensitive fields.

val sensitiveFields = extractAnnotationParameters(c.prefix.tree)

val fieldReplacements = sensitiveFields map (replaceCaseClassSensitiveValues(_))

def extractAnnotationParameters(tree: Tree): List[c.universe.Tree] = tree match {
  case q"new $name( ..$params )" => params
  case _ => throw new Exception("ToStringObfuscate annotation must have at least one parameter.")
}

def replaceCaseClassSensitiveValues(tree: Tree) = tree match {
  case Literal(Constant(field: String)) =>
    q"""
        ${TermName(field)} = com.fortysevendeg.macros.ToStringObfuscateImpl.obfuscateValue(this.${TermName(field)})
      """
  case _ => c.abort(c.enclosingPosition, s"[obfuscateValue] Match error with $tree")
}

Okay, keep calm, and let me explain:

  • c.prefix.tree gives us the annotation itself, including all of the sensitive fields specified in it. For instance, for @ToStringObfuscate("password", "pinCode"), c.prefix.tree will be a tree with a structure similar to new ToStringObfuscate("password", "pinCode"), so by unquoting it we can extract the parameter list (also a list of Tree expressions).
  • Each sensitive field is replaced with an obfuscated value: a run of * characters with the same length as the original value, as we will soon see. This is the call to the com.fortysevendeg.macros.ToStringObfuscateImpl.obfuscateValue function within the replaceCaseClassSensitiveValues method. The method body says something like this: given a Tree expression for one field to obfuscate, if it matches Literal(Constant(field: String)), then we redefine it as a new term with the same name but with an obfuscated value.

Literal and Constant are both AST extractors from the reflection API. Here they work together to pull the field name out of the string literal passed to the annotation. In our example, we are dealing only with String fields, but we could extend it to other types.
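The extraction of field names from the annotation tree can be sketched in isolation. This example builds a tree shaped like c.prefix.tree by hand with the runtime reflection universe (an assumption for experimentation; in the macro the tree is provided by the context). AnnotationArgsSketch and fieldNames are hypothetical names:

```scala
import scala.reflect.runtime.universe._

object AnnotationArgsSketch {
  // A tree shaped like the annotation instantiation seen by the macro.
  val sample: Tree = q"""new ToStringObfuscate("password", "pinCode")"""

  // Unquote the constructor arguments, then keep only string literals.
  def fieldNames(annotationTree: Tree): List[String] = annotationTree match {
    case q"new $name(..$params)" =>
      params.collect { case Literal(Constant(field: String)) => field }
    case _ => Nil
  }
}
```

Unlike replaceCaseClassSensitiveValues, this sketch silently skips arguments that are not string literals instead of aborting compilation.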

We’ll conclude this part of our macro by mentioning a third matter which is also of significance: the use of TermName. TermName is a subtype of Name within the reflection API that serves as a simple wrapper for strings. There is an additional subtype of Name, called TypeName. With both subtypes we can refer to or define any term or type in our code at compilation time, which is precisely what we are doing in the function.

The helper function to obfuscate a field value could be defined as follows:

def obfuscateValue(value: String) = "*" * value.length

There are better algorithms to obfuscate values, but that’s not the objective of this post.

Finally, here is the function that redefines the toString method in our case class:

def extractNewToString(sensitiveFields: List[Tree]) = q"""
   override def toString: ${typeOf[String]} = {
    scala.runtime.ScalaRunTime._toString(this.copy(..$fieldReplacements))
   }
"""

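Once more, you’ll notice the use of quasiquotes because of their simplicity. The trick here is the use of the ScalaRunTime object, which provides support methods required by the Scala runtime, including the _toString method that builds the standard string representation of a case class. Calling it directly on the obfuscated copy, rather than calling toString on that copy, avoids re-entering our own overridden toString.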
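To make the result of the expansion concrete, this is roughly what the annotated class from the beginning of the post would look like if written by hand. It is a sketch of the expected shape, not the literal compiler output; obfuscateValue is inlined into a hypothetical wrapper object so the snippet is self-contained:

```scala
object ExpandedByHandSketch {
  // Same helper as in the post: one * per character of the original value.
  def obfuscateValue(value: String): String = "*" * value.length

  case class TestWithObfuscation(username: String, password: String) {
    // Build the string from an obfuscated copy; ScalaRunTime._toString is
    // called directly so that this overridden toString is not invoked again.
    override def toString: String =
      scala.runtime.ScalaRunTime._toString(
        this.copy(password = obfuscateValue(this.password)))
  }
}
```

Only the string representation changes: the password field itself still holds the real value.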

Conclusion

That’s all—you can breathe again. This article represents only a particular use case. In real life—and on production systems—Scala macros might help you:

  • Develop your own Domain Specific Language (DSL).
  • Annotate your code to modify or expand it.
  • Perform static checks.
  • Generate code.
  • Develop for Android with Scala in a functional way (see the Macroid GitHub project).

Scala macros might be a very useful technique to improve your software development, but remember, it’s still an experimental feature.

The Code

You can find all of the sample code used in this blog post here.
