An overview of the 2016 Scala Days Berlin

Scala Days Berlin wrapped up last week and was another tremendous showing from the community during three days of excellent presentations, news, networking, and demos of upcoming projects and developments.

47 Degrees had several projects highlighted including:

  • ScalaDex: The Scala Library Index serves as a map of the known Scala ecosystem and contains over 2500 projects. We’ve been working in conjunction with the Scala Center on this project and are responsible for the design and HTML of the platform.

  • Scala Exercises V.2: Scala Exercises is an Open Source project for learning different technologies based on the Scala Programming Language. Version 2.0 includes a bevy of new features, including the ability to save progress across devices, additional koans, better evaluation, and the ability to write exercises in code. Stay tuned for the official launch on 6/29/16.

Presentation Overviews:

Here are the takeaways from a few of the presentations members of our team attended:

Roll your own shapeless

  • By Daniel Spiewak of Verizon
  • Summary provided by Diego Alonso:

shapeless is a Scala library that aims to provide idioms for type-level programming (TLP), which Daniel Spiewak defines as “using the type system as a programming language”. To introduce us to the shapeless code, Daniel takes us through the process of reimplementing HList from scratch, without assuming any prior knowledge of shapeless.

HList is a core abstraction in shapeless that allows for the representation of all tuple and case-class types. In his talk, Daniel presents shapeless features for HList by using the analogy with the value-level List type for finite lists.

  • Type constructors: much like Nil and Cons for finite list values, all HList types are built with the HNil and HCons type constructors.
  • Type functions: just as we use the ++ operation to append two value-level lists, shapeless provides the Append type function, which computes the HList type that results from concatenating two HLists.
  • Maps: just as map transforms one List value into another, a Mapper transforms one HList type into another. Whereas map takes a function from values to values, Mapper needs an instance of the trait Poly, a type-level pairing of source types to target types together with the value-level conversion functions.
  • Natural indices: some HList functions, like nth or remove, need to take a number as a type parameter. Since Scala cannot use an integer value as a type, shapeless uses a Peano-like encoding of natural numbers, plus some macros that provide syntactic sugar for turning literals into these types.
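To make the first two points concrete, here is a minimal sketch of a hand-rolled HList with an Append type function. It is written in the spirit of the talk, but it is not Daniel’s code and not the shapeless source: the object name MiniHList and the append helper are our own choices for illustration.

```scala
object MiniHList {
  // Type constructors: every HList type is built from HNil and :: (HCons).
  sealed trait HList

  final case class ::[H, T <: HList](head: H, tail: T) extends HList {
    // Prepending an element yields a longer, precisely typed list.
    def ::[A](a: A): A :: H :: T = new ::(a, this)
  }

  sealed trait HNil extends HList {
    def ::[A](a: A): A :: HNil = new ::(a, this)
  }
  case object HNil extends HNil

  // Type function: for given L and R, the type member Out is the type of L ++ R.
  trait Append[L <: HList, R <: HList] {
    type Out <: HList
    def apply(l: L, r: R): Out
  }

  object Append {
    // Exposes the type member as a type parameter so the compiler can chain instances.
    type Aux[L <: HList, R <: HList, O <: HList] = Append[L, R] { type Out = O }

    // Base case: HNil ++ R = R
    implicit def nil[R <: HList]: Aux[HNil, R, R] =
      new Append[HNil, R] {
        type Out = R
        def apply(l: HNil, r: R): R = r
      }

    // Inductive case: (H :: T) ++ R = H :: (T ++ R)
    implicit def cons[H, T <: HList, R <: HList, O <: HList](
        implicit tail: Aux[T, R, O]): Aux[H :: T, R, H :: O] =
      new Append[H :: T, R] {
        type Out = H :: O
        def apply(l: H :: T, r: R): H :: O = new ::(l.head, tail(l.tail, r))
      }
  }

  // Value-level entry point; the result type is computed by implicit resolution.
  def append[L <: HList, R <: HList](l: L, r: R)(implicit ap: Append[L, R]): ap.Out =
    ap(l, r)

  // The static type of `joined` is Int :: String :: Boolean :: HNil.
  val joined = append(1 :: "a" :: HNil, true :: HNil)
  val last: Boolean = joined.tail.tail.head
}
```

Note how the implicit instances nil and cons read like the two clauses of a recursive function over lists, except that the recursion is carried out by the compiler on types.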

During the talk, Daniel discusses the differences between some of the features of the Scala type system, and when and how to use them. He begins by comparing type functions against implicit values (including typeclasses) on their flexibility, context-independence, and expressivity, and he mentions how these differences are like those “between lambda-calculus and Prolog”. He then compares type parameters against type members, with regard to how the compiler handles them and where they can be used.

In conclusion, Daniel has shown us that type-level programming in Scala is “easier than it sounds”. With regard to the shapeless code, he helps newcomers distinguish the essential from the accidental complexity, i.e., the basic concepts and their definitions from the verbose constructions needed to work around infamous compiler bugs.

The slides for Daniel’s talk can be found here: Roll your own shapeless


Dotty Linker

Dotty Linker: Precise Types Bring Performance

  • By Dmitry Petrashko, a PhD Student at EPFL
  • Summary provided by Diego Alonso:

Dmitry Petrashko developed ScalaBlitz, an alternative, better-performing collections library, and works on making Scala programs faster. In this talk, he spoke about how the Dotty Linker, an optimizer built on Dotty, the next-generation Scala compiler, can produce faster code.

Scala is more concise and expressive than Java, but that comes at the cost of performance. As an example, Dmitry showed a Scala program that runs much slower than its Java equivalent. The reason, he explained, is that the Scala program relies heavily on the compiled code of the Scala libraries. These libraries support the functional programming features, such as generics or higher-order functions, that make Scala more expressive. However, since the Java Virtual Machine lacks these features, the Scala compiler has to translate the library code into a lower-level form. For the example program to use the generic bytecode of the libraries, the compiler has to add extra machine operations to the program’s compiled code.

Dmitry went on to discuss a major optimization technique called type specialization. The essence of this method is to copy the binary code of a generic library into a more efficient non-generic version that only works with a particular type. The problem with this technique is that there can be an exponential number of specializations, so it must be used sparingly. In the current Scala compiler, the programmer controls this using the @specialized annotation. Dmitry explained that the Dotty Linker would use a global analysis to detect which specializations are needed for a particular program.
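For readers who have not used it, this is roughly what manual specialization looks like in Scala 2 today. The classes Box and SpecBox are hypothetical names of ours, not examples from the talk.

```scala
object SpecializedSketch {
  // Generic version: T is erased on the JVM, so an Int stored here is boxed.
  class Box[T](val value: T) {
    def map(f: T => T): Box[T] = new Box(f(value))
  }

  // With @specialized, the Scala 2 compiler also emits Int and Double variants
  // of this class that work on unboxed primitives. Every extra type parameter
  // marked this way multiplies the number of generated variants, which is why
  // the annotation has to be used sparingly.
  class SpecBox[@specialized(Int, Double) T](val value: T) {
    def map(f: T => T): SpecBox[T] = new SpecBox(f(value))
  }

  val result: Int = new SpecBox(20).map(_ + 1).value
}
```

The source code of both classes is identical apart from the annotation; the difference is only in the bytecode that gets generated, which is exactly the decision the linker’s global analysis would take over from the programmer.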

After that, Dmitry spoke about another optimization that would appear in dotty: Rewrite Rules. Rewrite Rules would locate those points in the program’s code that use a less efficient method or solution instead of a more efficient one, and automatically rewrite the compiled code to use the most efficient one. Unlike macros, an experimental feature of Scala which may be discarded soon, rewrite rules do not affect the mechanism of typing or compilation, so they may be applied without being noticed, and they could be used more extensively than macros.

Final words? dotty is still in early development, and the team welcomes contributions to the compiler.
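We do not reproduce the linker’s rule syntax here. The sketch below only illustrates, in plain Scala, the kind of equivalence such a rule could rely on: two consecutive filters can be fused into a single pass. The method names are ours, and the equivalence assumes the predicates have no side effects.

```scala
object RewriteSketch {
  // Less efficient form: traverses the data twice and builds an intermediate list.
  def twoPasses(xs: List[Int], p: Int => Boolean, q: Int => Boolean): List[Int] =
    xs.filter(p).filter(q)

  // Equivalent, more efficient form that a rewrite rule could substitute:
  // a single traversal with the predicates combined.
  def onePass(xs: List[Int], p: Int => Boolean, q: Int => Boolean): List[Int] =
    xs.filter(x => p(x) && q(x))
}
```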

View his slides here: Dotty Linker: Precise Types Bring Performance


Data Science

Scala: The Unpredicted Lingua Franca for Data Science

  • By Dean Wampler of Lightbend and Andy Petrella of Data Fellas
  • Summary provided by Diego Alonso:

In this talk, Dean Wampler and Andy Petrella explained how and why Scala and Spark, along with the Spark Notebook, have become the primary language and tool for developing Data Science applications.

Before Spark, work involving Data Science was done in Python, R, or Matlab. Dean first pointed out that companies wanted production systems to be run on the Java Virtual Machine (JVM), which was regarded as a well-proven and enterprise-ready platform. So much so, that data science programs used to be prototyped in R or Python but developed in Java. Dean then discussed how Scala was better than Java. For one, Scala is a functional language that allows serializing and sending functions across nodes. When working with lots of data split among nodes, sending the function to the nodes with the data is cheaper than sending the data to the node with the function. Another reason is that Scala syntax for tuples is concise and looks like the mathematical notation, whereas the use of pattern matching for value inspection has no equal in R or Python. Also, type inference allows writing type-safe code without polluting it with type annotations.
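As a small illustration of the tuple syntax, pattern matching, and type inference mentioned above, here is a sketch using plain Scala collections in place of Spark. The sample data and names are made up by us.

```scala
object TupleSketch {
  // Hypothetical sample data: (city, temperature in degrees Celsius).
  val readings = List(("Berlin", 21.5), ("Madrid", 30.0), ("Berlin", 18.5))

  // Pattern matching destructures each tuple, and no type annotation is needed:
  // the compiler infers Map[String, Double] for the result.
  val averageByCity = readings
    .groupBy { case (city, _) => city }
    .map { case (city, rs) => (city, rs.map(_._2).sum / rs.size) }
}
```

The functions passed to groupBy and map are ordinary values, which is what makes it possible for a framework like Spark to ship them to the nodes that hold the data.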

Andy first recalled that, with regard to data science, Scala lacked the kind of tools for rapid prototyping, interactive data exploration, and visualization that existed for R or Python. To bridge this gap, he developed the Spark Notebook, a web-based user interface for writing and running Spark programs. Andy gave a tour of the features of the Notebook, including the use of inferred types to keep track of the transformations done to the data. Whenever possible, the Spark Notebook can show the results of Spark operations, not as an amorphous text dump, but as tables, chart graphs, pie graphs, time series, or geocharts. For libraries such as MLlib, other graphical representations are available. On top of that, the notebook takes care of concerns such as jar dependencies or managing the execution of several Spark shells (kernels). All of these features make the Spark Notebook as good an interactive tool as those that existed for Python, R, or Matlab.

View the slides for their talk here: Scala: The Unpredicted Lingua Franca for Data Science.

Monitoring Reactive Applications

Henrik Engström and Duncan DeVore started their talk by explaining the various types of monitoring that could be done in an application. They each spent time focusing on different aspects including business level, functional use cases, and monitoring at the system level.

In the last few decades, applications have evolved along with the economics of cloud computing, which has allowed them to scale in bandwidth, CPU power, and storage capabilities. Now, distributed and asynchronous applications are becoming ubiquitous, and traditional monitoring (printing stack traces with println) is no longer sufficient.

The team behind Lightbend’s Reactive Platform knew that receiving quality information about what’s happening in a message-driven application would require another approach. The company went on to create the Typesafe Console in response to the Dapper paper from Google. The metrics were saved in a database and queried later. However, because of a slow notification approach, the console is no longer recommended. Now, Lightbend provides another type of monitoring where the events are internally saved in the JVM and can be forwarded to monitoring tools like Takipi, Grafana, or StatsD.

They also mentioned how to mitigate the cost of monitoring when using short-lived actors, or applications that use few actors, since monitoring them could provide less insight. A possible solution is to use a delta approach and not generate the information inside the actor. The collected data can later be processed with machine learning techniques for anomaly detection. Having an algorithm to accomplish that is better than defining fixed thresholds, because that crude approach stops being valid when the application is migrated or scaled onto faster systems.
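To see why a fixed threshold ages badly, consider the sketch below. It is our own simplification, not something shown in the talk: a basic statistical check (distance from the mean of recent samples, measured in standard deviations) stands in for the machine learning techniques the speakers referred to.

```scala
object AnomalySketch {
  // Flags a sample that lies more than k standard deviations away from the
  // mean of a baseline window of recent measurements.
  def isAnomaly(baseline: Seq[Double], sample: Double, k: Double = 3.0): Boolean = {
    val mean = baseline.sum / baseline.size
    val variance = baseline.map(x => (x - mean) * (x - mean)).sum / baseline.size
    val std = math.sqrt(variance)
    std > 0 && math.abs(sample - mean) > k * std
  }

  // Hypothetical response times in milliseconds, before and after moving
  // the application to hardware that is ten times faster.
  val slowSystem = Seq(100.0, 102.0, 98.0, 101.0, 99.0)
  val fastSystem = Seq(10.0, 10.2, 9.8, 10.1, 9.9)
}
```

A fixed rule such as “alert above 120 ms” would never fire on the faster system, even for a 15 ms response that is clearly abnormal there, whereas the relative check adapts to whatever baseline it is given.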

Lightbend’s monitoring allows a dynamic configuration to follow the actors wherever there is an interest in the information gleaned. They also mentioned Zipkin, a distributed tracing system also based on the Google Dapper paper.

They ended the talk by providing information on the recommended configuration for dispatchers and thread pools, as well as the best practices to follow when defining a monitoring policy that can help DevOps teams diagnose different systems.

You can view the video from this talk here (NYC Version): Monitoring Reactive Applications.


Principles of Elegance

This talk reviewed best practices to navigate between the expressive and safe features of Scala to create an application with code that is not only elegant and easy to read, but that’s safe with the assistance of the compiler.

We can start with the obvious best practices, like using appropriate names. The speaker recommends reading Haoyi Li’s blog post, Strategic Scala Style: Conciseness & Names, for more about this best practice.

Another solid practice is to keep APIs small. Smaller APIs are easier to maintain and adjust and, because they are simple, easier to compose. He also recommends not exposing terms you don’t want the end user to see. This includes avoiding general-purpose types like Option, Either, or tuples when we can use case classes instead, and avoiding primitive types and Strings in favor of our own domain terms.

Using value classes can also help because they let the compiler detect errors at compile time. In other words, when the compiler cannot tell whether a plain value is valid, promote that value to its own type.
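A minimal sketch of that idea follows. The domain types UserId and Email are hypothetical and ours; the examples in the talk came from Rapture.

```scala
object ValueClassSketch {
  // Each value class wraps a String but is a distinct type to the compiler.
  final case class UserId(value: String) extends AnyVal
  final case class Email(value: String) extends AnyVal

  // With plain Strings, swapping the two arguments would compile silently.
  def sendWelcome(id: UserId, to: Email): String =
    s"Welcome ${id.value}, mail sent to ${to.value}"

  val ok = sendWelcome(UserId("u1"), Email("u1@example.com"))

  // sendWelcome(Email("u1@example.com"), UserId("u1"))  // rejected at compile time
}
```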

To put this in context, and show how this looks in code, he uses examples from his library, Rapture.

He also shared interesting tips for printing helpful error messages when type errors occur. This can be done by implementing, with a macro, a low-priority conversion from a generic source type to the generic expected type; the macro prints the error message and then deliberately fails.

The main takeaway: Embrace the type system!

You can view the presentation (NYC version) here: Principles of Elegance.


Implementing Microservices

Implementing Microservices with Scala and Akka

Vaughn Vernon begins his talk with an introduction to microservices, relating them to the tenets that all reactive applications should follow.

He explains how, by using a ubiquitous language that models the business, we can quickly identify how entities interact through events, messages, and commands. Domain experts must define these relationships. A command is the stimulus that our service receives; in response, a domain event can be generated and forwarded to another service, where events are classified by topic. Afterward, these topics can be queried or forwarded further to other areas that have an interest in the information these events carry.

Another point he emphasizes is that each microservice should clearly own its database. Services should also follow Command Query Responsibility Segregation (CQRS), so the code is well organized.
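The relationship between commands, domain events, and the separate write and read sides can be sketched in a few lines of plain Scala. This is our own toy example with a hypothetical ordering domain; it uses neither Akka nor any code from the talk.

```scala
object OrderSketch {
  // Commands: the stimuli the service receives (hypothetical names).
  sealed trait Command
  final case class PlaceOrder(orderId: String, amount: BigDecimal) extends Command
  final case class CancelOrder(orderId: String) extends Command

  // Domain events: facts that the service emits and other services can consume.
  sealed trait Event
  final case class OrderPlaced(orderId: String, amount: BigDecimal) extends Event
  final case class OrderCancelled(orderId: String) extends Event

  // Write side: validate a command against the current state (the set of open
  // order ids) and decide which events it produces.
  def decide(open: Set[String], cmd: Command): List[Event] = cmd match {
    case PlaceOrder(id, amount) if amount > 0 && !open.contains(id) =>
      List(OrderPlaced(id, amount))
    case CancelOrder(id) if open.contains(id) =>
      List(OrderCancelled(id))
    case _ => Nil // rejected commands produce no events
  }

  // Write side: the state changes only by applying events.
  def evolve(open: Set[String], event: Event): Set[String] = event match {
    case OrderPlaced(id, _) => open + id
    case OrderCancelled(id) => open - id
  }

  // Read side: a separate query model built from the same events,
  // here the total amount of all orders ever placed.
  def placedTotal(events: List[Event]): BigDecimal =
    events.collect { case OrderPlaced(_, amount) => amount }.sum
}
```

In an actor-based service, decide and evolve would typically live inside the actor that owns the state, while read models such as placedTotal would be fed by the published events.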

Actors can be seen as asynchronous services, and as a tip to work with them, he recommends the analogy of “Rivers, Rapids and Ponds” from a talk by Fred George.

He concludes by running a sample application that combines Scala and Akka so we can see it in action.

You can view the presentation (NYC version) here: Implementing Microservices with Scala and Akka.


Genetic Algorithms & Typeclasses

Being Creative with Genetic Algorithms & Typeclasses

  • By Noel Markham of 47 Degrees
  • Summary provided by 47 Degrees:

In his presentation, Noel Markham discusses how typeclasses are a hidden gem of the Scala language, providing immense power not seen in imperative languages. This means their approach might seem unusual or alien to those coming to Scala from an imperative background. He goes on to show how typeclasses allow developers to effectively attach their own interfaces to code written by others.
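As a reminder of what “attaching an interface to code written by others” means in practice, here is a minimal typeclass sketch. The Describe typeclass is a made-up example of ours, not one from Noel’s talk.

```scala
object TypeclassSketch {
  // The typeclass: an interface defined apart from the types it applies to.
  trait Describe[A] {
    def describe(a: A): String
  }

  object Describe {
    // Instances for types we do not own and cannot modify.
    implicit val intDescribe: Describe[Int] = new Describe[Int] {
      def describe(a: Int): String = s"the integer $a"
    }

    // An instance for List[A] is derived from the instance for A.
    implicit def listDescribe[A](implicit d: Describe[A]): Describe[List[A]] =
      new Describe[List[A]] {
        def describe(xs: List[A]): String =
          xs.map(d.describe).mkString("a list of [", ", ", "]")
      }
  }

  // Works for any type that has an instance in scope.
  def describe[A](a: A)(implicit d: Describe[A]): String = d.describe(a)
}
```

Neither Int nor List knows anything about Describe, yet both can now be used wherever a Describe instance is required.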

He reviews what a genetic algorithm is and provides an example implementation in Scala. Using this implementation, he demonstrates how to define a specific typeclass for the example problem. Noel then derives several different implementations, showing how to get rock solid confidence in testing the algorithm - with the help of ScalaCheck - and then provides a completely different typeclass to provide a fun, visual and creative solution, illustrating the iterations and improvements as the genetic algorithm’s fitness function runs.
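The combination of both ideas can be sketched as follows. This is our own simplified version, not Noel’s implementation: the Genetic typeclass, the selection strategy (keep the fitter half), and the string-matching problem are all assumptions made for illustration.

```scala
import scala.util.Random

object GeneticSketch {
  // The typeclass captures everything the algorithm needs to know about a candidate.
  trait Genetic[A] {
    def fitness(a: A): Int
    def mutate(a: A, rnd: Random): A
    def crossover(x: A, y: A, rnd: Random): A
  }

  // One generation: keep the fitter half unchanged, refill with mutated offspring.
  def step[A](pop: Vector[A], rnd: Random)(implicit g: Genetic[A]): Vector[A] = {
    val parents = pop.sortBy(a => -g.fitness(a)).take(math.max(2, pop.size / 2))
    val children = Vector.fill(pop.size - parents.size) {
      val x = parents(rnd.nextInt(parents.size))
      val y = parents(rnd.nextInt(parents.size))
      g.mutate(g.crossover(x, y, rnd), rnd)
    }
    parents ++ children
  }

  // The algorithm itself is generic: it works for any A with a Genetic instance.
  def evolve[A: Genetic](pop: Vector[A], generations: Int, rnd: Random): Vector[A] =
    (1 to generations).foldLeft(pop)((p, _) => step(p, rnd))

  // Hypothetical problem: evolve strings towards a target phrase.
  val target = "scala days"

  implicit val stringGenetic: Genetic[String] = new Genetic[String] {
    private val alphabet = "abcdefghijklmnopqrstuvwxyz "

    // Fitness is the number of characters that already match the target.
    def fitness(s: String): Int = s.zip(target).count { case (a, b) => a == b }

    // Mutation replaces one randomly chosen character.
    def mutate(s: String, rnd: Random): String =
      s.updated(rnd.nextInt(s.length), alphabet(rnd.nextInt(alphabet.length)))

    // Crossover joins a prefix of one parent with the suffix of the other.
    def crossover(x: String, y: String, rnd: Random): String = {
      val cut = rnd.nextInt(x.length)
      x.take(cut) + y.drop(cut)
    }
  }
}
```

Because the best candidates are always carried over, the best fitness in the population can never decrease from one generation to the next. That is the kind of property that lends itself to checking with ScalaCheck, and swapping in a different Genetic instance changes the problem being solved without touching the algorithm.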

You can view the presentation (NYC version) here: Being Creative with Genetic Algorithms & Typeclasses.


Interviews:

We spoke with a few speakers about Scala, Spark, and future developments during the event. You can watch those videos here:

 

We’re already looking forward to Scala Days 2017, but in the meantime, you can catch our team at these functional programming events.
