An overview of the 2016 Scala Days Berlin
June 21, 2016 | news • spark • scala • scala days • events | 14 minutes to read

Scala Days Berlin wrapped up last week and was another tremendous showing from the community during three days of excellent presentations, news, networking, and demos of upcoming projects and developments.
47 Degrees had several projects highlighted, including:
- ScalaDex: The Scala Library Index serves as a map of the known Scala ecosystem and contains over 2,500 projects. We’ve been working in conjunction with the Scala Center on this project and are responsible for the design and HTML of the platform.
- Scala Exercises V2: Scala Exercises is an open source project for learning different technologies based on the Scala programming language. Version 2.0 includes a bevy of new features, including the ability to save progress across devices, additional koans, better evaluation, and the ability to write exercises in code. Stay tuned for the official launch on 6/29/16.
Presentation Overviews:
Here are the takeaways from a few of the presentations members of our team attended:
Roll your own shapeless
- By Daniel Spiewak of Verizon
- Summary provided by Diego Alonso:
`shapeless` is a Scala library that aims to provide idioms for type-level programming (TLP), which Daniel Spiewak defines as “using the type system as a programming language”. To introduce the `shapeless` code, Daniel takes us through the process of reimplementing `HList` from scratch, with no prior knowledge of `shapeless` required.
`HList` is a core abstraction in `shapeless` that allows for the representation of all tuple and case-class types. In his talk, Daniel presents the `shapeless` features for `HList` by analogy with the value-level `List` type for finite lists:
- Type constructors: much like `Nil` and `Cons` for finite list values, all `HList` types are built with the `HNil` and `HCons` type constructors.
- Type functions: just as we use the `++` operation to append two value-level lists, `shapeless` provides the `Append` type function as an alias for the `HList` that results from concatenating two `HList`s.
- Maps: just as `map` transforms one `List` value into another, a `Mapper` transforms one `HList` type into another. Whereas `map` takes a function from values to values, `Mapper` needs an instance of `trait Poly`, a type-level pairing of source types to target types and their value-level conversion functions.
- Natural indices: some `HList` functions, like `nth` or `remove`, need to take a number as a type parameter. Since Scala cannot treat integer values as types, `shapeless` uses a Peano-like encoding of natural numbers, plus some macros for the syntactic sugar that turns literals into these types.
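To make the analogy concrete, here is a from-scratch `HList` in the spirit of the talk (a minimal sketch of my own, not Daniel’s exact code):

```scala
// A minimal sketch (not Daniel's exact code): an HList built from scratch,
// with HNil and HCons mirroring Nil and Cons for value-level lists.
sealed trait HList
final case class HCons[H, T <: HList](head: H, tail: T) extends HList {
  def ::[A](a: A): HCons[A, HCons[H, T]] = HCons(a, this)
}
sealed trait HNil extends HList {
  def ::[A](a: A): HCons[A, HNil] = HCons(a, this)
}
case object HNil extends HNil

object HListDemo {
  // A heterogeneous list whose element types are all tracked statically:
  val record: HCons[String, HCons[Int, HNil]] = "Scala Days" :: 2016 :: HNil
}
```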
During the talk, Daniel discusses the differences between some of the features of the Scala type system, and when and how to use each. He begins by comparing type functions against implicit values (including typeclasses) in terms of their flexibility, context-independence, and expressivity, and he mentions how these differences are like those “between lambda-calculus and Prolog”. He then compares type parameters against type members, in regards to how the compiler handles them and where they can be used.
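For reference, the same abstraction can be expressed both ways (my own illustration, not from the slides):

```scala
// My own illustration (not from the slides): the same container abstraction
// expressed with a type parameter versus a type member.
trait ParamContainer[A] {   // type parameter: fixed at the use site
  def value: A
}

trait MemberContainer {     // type member: named and refined later
  type A
  def value: A
}

object IntBox extends MemberContainer {
  type A = Int
  def value: Int = 42
}
```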
In conclusion, Daniel has shown us that type-level programming in Scala is “easier than it sounds”. In regards to the `shapeless` code, he helps newcomers distinguish the essential from the accidental complexity, i.e., the basic concepts and their definitions from the verbose constructions needed to work around infamous compiler bugs.
The slides for Daniel’s talk can be found here: Roll your own shapeless
Dotty Linker: Precise Types Bring Performance
- By Dmitry Petrashko, a PhD student at EPFL
- Summary provided by Diego Alonso:
Dmitry Petrashko developed ScalaBlitz, an alternative, better-performing collections library, and works on making Scala programs faster. In this talk, he spoke about making the `dotty` linker, built on the next-generation Scala compiler, produce faster code.
Scala is more concise and expressive than Java, but that comes at a cost in performance. As an example, Dmitry showed a Scala program that runs much slower than its Java equivalent. The reason, he explained, is that the Scala program heavily uses the compiled code of the Scala libraries. These libraries support the functional programming features, such as generics and higher-order functions, that make Scala more expressive. However, since the Java Virtual Machine lacks these features, the Scala compiler has to translate the Scala library code into a lower-level form. For the example program to use the generic bytecode of the libraries, the compiler has to add extra machine operations to the program’s compiled code.
Dmitry went on to discuss a major optimization technique called type specialization. The essence of this method is to copy the binary code of a generic library into a more efficient non-generic version that only works with a particular type. The problem with this technique is that there can be an exponential number of specializations, so it must be used sparingly. In the current Scala compiler, the programmer controls this using the `@specialized` annotation. Dmitry explained that the `dotty` linker would use a global analysis to detect which specializations a particular program actually needs.
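For context, here is how today’s opt-in approach looks (a minimal sketch of the existing `@specialized` mechanism, not Dmitry’s `dotty` work):

```scala
// A minimal sketch of the current, opt-in mechanism (not dotty's global
// analysis): @specialized asks scalac to emit extra primitive-specific
// copies of the generic code, so Int and Double values avoid boxing.
class Box[@specialized(Int, Double) A](val value: A) {
  def map[@specialized(Int, Double) B](f: A => B): Box[B] = new Box(f(value))
}

object BoxDemo {
  // Uses the Int-specialized copy of Box; the 42 is never boxed.
  val b: Box[Int] = new Box(42).map(_ + 1)
}
```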
After that, Dmitry spoke about another optimization that would appear in `dotty`: rewrite rules. Rewrite rules locate those points in a program’s code that use a less efficient method or solution instead of a more efficient one, and automatically rewrite the compiled code to use the more efficient version. Unlike macros, an experimental feature of Scala that may be discarded soon, rewrite rules do not affect the mechanics of typing or compilation, so they can be applied without being noticed, and they could be used more extensively than macros.
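As an illustration of the kind of pattern such a rule might target (my own example; the talk summary does not show `dotty`’s actual rule syntax):

```scala
// My own illustration of the kind of pattern a rewrite rule might target;
// this is ordinary Scala, not dotty's rule syntax.
object RewriteExample {
  val xs = List(1, 2, 3)

  // Less efficient: two traversals and an intermediate list.
  val slow = xs.map(_ + 1).map(_ * 2)

  // The fused form a rewrite rule could substitute: one traversal.
  val fast = xs.map(x => (x + 1) * 2)
}
```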
Final words? `dotty` is still in early development, and the team welcomes contributions to the compiler.
View his slides here: Dotty Linker: Precise Types Bring Performance
Scala: The Unpredicted Lingua Franca for Data Science
- By Dean Wampler of Lightbend and Andy Petrella of Data Fellas
- Summary provided by Diego Alonso:
In this talk, Dean Wampler and Andy Petrella explained how and why Scala and Spark, along with the Spark Notebook, have become the primary language and tool for developing Data Science applications.
Before Spark, work involving data science was done in Python, R, or Matlab. Dean first pointed out that companies wanted production systems to run on the Java Virtual Machine (JVM), which was regarded as a well-proven and enterprise-ready platform. So much so that data science programs used to be prototyped in R or Python but developed in Java. Dean then discussed how Scala is better suited than Java. For one, Scala is a functional language that allows serializing and sending functions across nodes. When working with lots of data split among nodes, sending the function to the nodes with the data is cheaper than sending the data to the node with the function. Another reason is that Scala’s syntax for tuples is concise and looks like mathematical notation, and its use of pattern matching for value inspection has no equal in R or Python. Also, type inference allows writing type-safe code without polluting it with type annotations.
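Those points are easy to see in even a tiny Spark job (a hedged sketch of my own, not from the talk; the input path is a hypothetical placeholder):

```scala
// My own sketch, not from the talk: a word count showing concise tuple
// syntax, pattern matching, and function literals shipped to worker nodes.
// The input path "input.txt" is a hypothetical placeholder.
import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("wordcount").setMaster("local[*]")
    val sc   = new SparkContext(conf)

    val counts = sc.textFile("input.txt")
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))   // tuples read like mathematical pairs
      .reduceByKey(_ + _)       // this function is serialized to the nodes

    // Pattern matching destructures each (word, count) pair on inspection.
    counts.collect().foreach { case (word, n) => println(s"$word: $n") }
    sc.stop()
  }
}
```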
Andy first recalled that, with regards to data science, Scala lacked the kinds of tools for rapid prototyping, interactive data exploration, and visualization that existed for R or Python. To bridge this gap, he developed the Spark Notebook, a web-based user interface for writing and running Spark programs. Andy gave a tour of the Notebook’s features, including the use of inferred types to keep track of the transformations applied to the data. Whenever possible, the Spark Notebook shows the results of Spark operations not as an amorphous text dump, but as tables, charts, pie graphs, time series, or geocharts. For libraries such as MLlib, other graphical representations are available. On top of that, the Notebook takes care of concerns such as jar dependencies and managing the execution of several Spark shells (kernels). All of these features make the Spark Notebook as good an interactive tool as those that existed for Python, R, or Matlab.
View the slides for their talk here: Scala: The Unpredicted Lingua Franca for Data Science.
Monitoring Reactive Applications
- By Henrik Engström and Duncan DeVore of Lightbend
- Summary provided by Juan Manuel Méndez:
Henrik Engström and Duncan DeVore started their talk by explaining the various types of monitoring that can be done in an application. They each spent time focusing on different aspects, including business-level monitoring, functional use cases, and monitoring at the system level.
In the last few decades, applications have evolved alongside the economics of cloud computing, which has allowed them to scale in bandwidth, CPU power, and storage capabilities. Now that distributed and asynchronous applications are becoming ubiquitous, traditional monitoring (println of stack traces) is no longer adequate.
The team behind Lightbend’s Reactive Platform knew that getting quality information about what’s happening in a message-driven application would require another approach. The company went on to create the Typesafe Console in response to the Dapper paper from Google. Its metrics were saved in a database and queried later. However, because of its slow notification approach, the console is no longer recommended. Now, Lightbend provides another type of monitoring where the events are saved internally in the JVM and can be forwarded to monitoring tools like Takipi, Grafana, and StatsD.
They also mentioned how to mitigate the cost of monitoring when using short-lived actors, or applications that use few actors, since monitoring them directly can provide little insight. A possible solution is to use a delta approach and not generate the information inside the actor. The collected data can later be processed with machine learning techniques for anomaly detection. Having an algorithm for this is better than defining thresholds, because that crude approach stops being valid when the application is migrated or scaled onto faster systems.
Lightbend’s monitoring allows dynamic configuration, so monitoring can follow whichever actors are of interest. They also mentioned Zipkin, a distributed tracing system also based on the Google Dapper paper.
They ended the talk by providing information on the recommended configuration for dispatchers and thread pools, as well as the best practices to follow when defining a monitoring policy that can help DevOps teams diagnose different systems.
You can view the video from this talk here (NYC Version): Monitoring Reactive Applications.
Principles of Elegance
- Jon Pretty of the Scala Center and Propensive
- Summary provided by Juan Manuel Méndez:
This talk reviewed best practices for navigating between the expressive and safe features of Scala to create an application whose code is not only elegant and easy to read, but also safe, with the assistance of the compiler.
He started with the obvious best practices, like using appropriate names, and recommends reading Haoyi Li’s blog post, Strategic Scala Style: Conciseness & Names, for more on this.
Another solid practice is keeping APIs small. Smaller APIs are easier to maintain and adjust and, by keeping them simple, easier to compose. He also recommends not exposing terms you don’t want the end user to see: avoid general-purpose types like `Option`, `Either`, or tuples in public APIs when a case class can be used instead. In the same spirit, avoid primitive types and Strings, and define your own domain terms.
Using value classes can also help, because they allow us to detect errors at compile time. In other words, values whose validity the compiler cannot otherwise check should be promoted to their own types.
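A minimal sketch of the idea (my own example, not from Rapture):

```scala
// My own example, not from Rapture: wrapping raw Strings in value classes
// so the compiler can tell an email apart from any other String.
final case class Email(value: String) extends AnyVal
final case class Username(value: String) extends AnyVal

object Signup {
  def sendWelcome(to: Email, name: Username): Unit =
    println(s"Welcome, ${name.value}! Mail sent to ${to.value}.")

  // Swapping the arguments is now a compile-time error:
  // sendWelcome(Username("ada"), Email("ada@example.com")) // does not compile
  sendWelcome(Email("ada@example.com"), Username("ada"))
}
```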
To put this in context, and show how this looks in code, he uses examples from his library, Rapture.
He also shared interesting tips for printing out nice error messages when type errors occur. This can be done by implementing a low-priority conversion from a generic source to our generic expected type with a macro, then printing the error and forcing the macro to fail.
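A simpler, built-in mechanism points in the same direction (this is the standard `@implicitNotFound` annotation, not the macro technique Jon described):

```scala
// NOT the macro technique from the talk: just the standard
// @implicitNotFound annotation, which customizes the error shown
// when an implicit instance is missing.
import scala.annotation.implicitNotFound

@implicitNotFound("No Serializer found for ${A}. Define a Serializer[${A}].")
trait Serializer[A] { def serialize(a: A): String }

object Storage {
  def save[A](a: A)(implicit s: Serializer[A]): String = s.serialize(a)
  // save(42) now fails to compile with the friendly message above,
  // unless a Serializer[Int] is in scope.
}
```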
The main takeaway: Embrace the type system!
You can view the presentation (NYC version) here: Principles of Elegance.
Implementing Microservices with Scala and Akka
- Vaughn Vernon of for {comprehension}
- Summary provided by Juan Manuel Méndez:
Vaughn Vernon begins his talk with an introduction to microservices, framed by the tenets that all reactive applications should follow.
He explains how, by using a ubiquitous language that models the business, we can quickly identify how entities interact through events, messages, and commands. Domain experts must define these relationships. A command is the stimulus that our service receives; in response, a domain event can be generated and forwarded to another service, where events are classified by topic. Afterward, these topics can be queried or forwarded further to other areas with an interest in the information those events carry. The sketch below shows the shape of this interaction.
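A minimal sketch of that command-in, event-out shape, using classic Akka actors (my own example, not Vaughn’s sample application):

```scala
// My own example, not Vaughn's sample application: a service actor that
// receives a command and publishes the resulting domain event to a topic.
import akka.actor.{Actor, ActorRef, Props}

final case class PlaceOrder(orderId: String, item: String)  // command
final case class OrderPlaced(orderId: String, item: String) // domain event

class OrderService(eventTopic: ActorRef) extends Actor {
  def receive: Receive = {
    case PlaceOrder(id, item) =>
      // validate and persist here, then emit the event for other
      // interested services to consume from the topic
      eventTopic ! OrderPlaced(id, item)
  }
}

object OrderService {
  def props(eventTopic: ActorRef): Props = Props(new OrderService(eventTopic))
}
```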
Another point he emphasizes is that each microservice should own its own database. Services should also follow Command Query Responsibility Segregation (CQRS) so the code stays well organized.
Actors can be seen as asynchronous services, and as a tip for working with them, he recommends the “Rivers, Rapids, and Ponds” analogy from a talk by Fred George.
He concludes by running a sample application that combines Scala and Akka so we can see it in action.
You can view the presentation (NYC version) here: Implementing Microservices with Scala and Akka.
Being Creative with Genetic Algorithms & Typeclasses
- Noel Markham of 47 Degrees
- Summary provided by 47 Degrees:
In his presentation, Noel Markham discusses how typeclasses are a hidden gem of the Scala language, providing power not seen in imperative languages. This means their approach might be unusual or alien to those coming to Scala from an imperative background. He goes on to show how typeclasses allow developers to effectively attach their own interfaces to code written by others.
He reviews what a genetic algorithm is and provides an example implementation in Scala. Using this implementation, he demonstrates how to define a specific typeclass for the example problem. Noel then derives several different implementations, showing how to get rock-solid confidence in testing the algorithm with the help of ScalaCheck, and then supplies a completely different typeclass to produce a fun, visual, and creative solution, illustrating the iterations and improvements as the genetic algorithm’s fitness function runs.
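The core idea looks something like this (a minimal sketch of my own, not Noel’s actual code):

```scala
// A minimal sketch of my own, not Noel's actual code: a typeclass that
// describes how to score candidates, so one genetic algorithm can work
// with any type that supplies an instance.
object FitnessDemo {
  trait Fitness[A] { def score(candidate: A): Double }

  // Attach the interface to an existing type without modifying it:
  implicit val stringFitness: Fitness[String] =
    (candidate: String) => candidate.count(_ == 'a').toDouble

  def fittest[A](population: List[A])(implicit f: Fitness[A]): A =
    population.maxBy(f.score)

  def main(args: Array[String]): Unit =
    println(fittest(List("pear", "banana", "avocado"))) // prints "banana"
}
```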
You can view the presentation (NYC version) here: Being Creative with Genetic Algorithms & Typeclasses.
Interviews:
We spoke with a few speakers about Scala, Spark, and future developments during the event. You can watch those videos here:
We’re already looking forward to Scala Days 2017, but in the meantime, you can catch our team at these functional programming events.