Chapter 4. Language Summaries

The rest of this report examines the purpose of each language and how some of its features reflect that purpose.

Crystal

Although the proponents of many languages claim good performance, developers who make speed of execution a high priority still fall back on C or one of its derivatives. Crystal is the latest attempt to achieve both performance and the highly structured, compact code permitted by Ruby. Thus, Crystal is procedural and object oriented. It is statically typed but allows for both explicit type annotations and type inference, offering the type safety of traditional compiled languages and much of the simplicity of classic scripting languages.

The resemblance to Ruby can be seen in such syntax features as the do blocks in the following calls to a database. The activities in this snippet will be familiar to anyone who has accessed a database from a programming language using something like the Open Database Connectivity standard:

DB.open "sqlite3://./data.db" do |db|                 # (1)
  db.exec "CREATE TABLE contacts (name VARCHAR(30), age INT)"
  db.exec "INSERT INTO contacts VALUES (?, ?)", "Frank", 30
  db.exec "INSERT INTO contacts VALUES (?, ?)", "Alexa", 33

  db.query "SELECT name, age FROM contacts" do |rows| # (2)
    rows.each do
      name, age = rows.read(String, Int32)
      person = Person.new(name)
      person.age = age

      people << person                                # (3)
    end
  end
end
1

The first line opens a connection to an SQLite database and passes it to the outer do block. This block runs four SQL statements, the last one launching the middle block.

2

Because a SELECT query can return multiple rows, a third, inner block loops over the rows and reads the fields from each row into variables. Note that calls can return multiple values; in this case, the two values returned by read are stored into name and age.

3

The final line of the innermost loop pushes a variable onto an array by using the << operator.

All of the common features of an object-oriented language are present in Crystal: inheritance, overloaded functions, chained method invocations, and so on. Concurrency is supported through some unusual features. These include fibers, which are independent tasks that cooperate somewhat like real-time tasks on an embedded system, and bidirectional channels.

Other notable syntax features include the following:

  • A simple interface for invoking C functions

  • Generics, which let you write a class or method once and parameterize it by type, as with Java generics

  • First-class functions that can be passed to other functions through blocks, and closures that let functions use values that were set before the call

  • The ability to pass by value or by reference

  • Macros, which are code expanded at compile-time instead of runtime

  • Packages (under the term shard) that group code for distribution

  • A testing library that promotes behavior-driven development, a practice that supports robust project management and communication

Designers also added features in the hope of making it easy to create domain-specific languages on top of Crystal.

Elixir

Elixir takes the principles of functional programming about as far as any high-level scripting language. We can consider it an alternative syntax for Erlang, offering most of the same design and programming features but in a much simpler format. The Erlang community, eager to drive wider adoption of the functional programming model it has been promoting since the invention of Erlang in the 1980s, has enthusiastically backed Elixir.

Elixir is so strict that it doesn’t even contain conventional loops: no imperative for, no foreach, no while. (Its for construct is a comprehension that builds a new collection, not a loop that updates variables.) If you want to run a function over every element of a list, use a function such as Enum.map or Enum.each. For other types of iterative coding, use recursion.

This calls for a radical relearning for programmers accustomed to procedural languages. Let’s look at a trivial program that finds the smallest power of 2 equal to or greater than an input number. Such a function can be useful: for instance, some memory allocators round up requests to the next power of 2 and allocate memory in blocks of 4096, 8192, and so on.

A procedural language such as Python would accomplish the task through a loop. The following code checks two corner cases (numbers less than or equal to 0, and 1).1 It then runs the loop, which shifts a bit left until the right power of 2 is found:

def find_power(i):
  if i <= 0:
    # Python 3 requires an exception object here, not a bare string
    raise ValueError("Cannot find a power for a non-positive number")
  elif i == 1:
    return 1
  else:
    p = 1
    while p < i:
      p = p << 1
    return p

Elixir uses recursion even for this trivial operation. Note, in the following code, that the problem is solved through three tiny functions that subdivide the work. The following code was provided by Jay Hayes:

defmodule FindPower do
  import Bitwise               # provides the <<< operator

  def find(i) when i > 0 do    # (1)
    find(i, 1)                 # (2)
  end

  def find(i, p) when p < i do # (3)
    find(i, p <<< 1)           # (4) shift left one bit to double p
  end

  def find(_, p) do            # (5)
    p
  end
end
1

This version of the find function runs when there is a single input. Thus, it’s the version called by the program that wants to use the function.

2

Start recursion with an initial value of 1 for p.

3

This version of the function is kicked off by the first version.

4

The function runs recursively, doubling p (by shifting left) until it equals or exceeds the value submitted by the user.

5

This version of the function runs when the desired power is found. It ends the recursion by returning the power p.

This tiny example illustrates the principle that functional programming encourages the creation of multiple small functions.

Going into more depth about functional programming in general, and the Erlang model reflected by Elixir in particular, would go beyond the scope of this report. We end this section by mentioning that Elixir provides tools for project management through a build tool called Mix and a testing kit called ExUnit.

Elm

The Document Object Model (DOM) of HTML is fearsome, so it was not long after the release of JavaScript that libraries emerged to cut down the tedium and frustration engendered by such ordinary tasks as adding an element to a page or checking the values in a form. Angular and React are currently the most popular libraries for creating web pages. Now, Elm aims to provide a common, robust platform for the different elements of web page development, offering all the elements developers need in a new language with strong functional elements. Through adherence to functional principles, and strong typing, it makes sure that web developers cover edge cases, avoiding many common errors.

Elm simply defines a function for every HTML element (div for a <div> tag and so on) as well as a function to handle every DOM event (onClick, onVisibilityChange, and so on). An accompanying CSS library offers an Elm interface to all the CSS attributes, so that you can use Elm to construct all the elements of a web page. Then you compile it to JavaScript. Elm also can be called within JavaScript.

The hurdle Elm must overcome is imposing a whole new syntax and design philosophy on busy web programmers. They almost certainly know HTML, CSS, and JavaScript already, and also realize that these skills open up doors everywhere in web development. Elm is asking them to learn something new that might or might not be transferable.

Furthermore, JavaScript has moved from the web page to the server through Node.js. Programming on the web server has bedeviled programmers for as long as programming the web client. Ruby on Rails was the first major framework to offer simplified programming on the server (by providing a strict set of conventions that work efficiently if followed religiously), and it was quickly imitated by frameworks and libraries in all major languages. But as soon as Node.js came along, it zoomed in popularity because it offered the unmatched advantage of opening up server programming to web developers who already were familiar with JavaScript as a tool for the client. The focus in Elm is on the client, and it does not help programmers conquer the server as JavaScript does through Node.js.

Elm overcomes these limitations by offering a robust functional interface, inspired by the Haskell programming language. If you’re one of the programmers who is intrigued by functional programming and has decided it will produce better programs, you will find some common features in Elm: complex data types, asynchronous tasks, recursion to implement iteration, standardized error handling through Maybe and Result, and so on. Elm organizes web pages around the classic MVC structure (which it calls model, view, and update) and offers optimizations such as JavaScript minification.

The Elm program that follows (provided by Daniel Hinojosa) shows several interesting features. Its task is simply to capitalize the first letter of a string. It does so by treating the string as a cons, a data type that might be recalled (fondly or unfondly) by anyone exposed to the classic Lisp programming language. Basically, languages like Lisp routinely process a linked list by popping off its first element (the head). All of the other elements in the list remain as the tail, which can be processed recursively until all have been handled. Here’s how it works in Elm:

  • The uncons function does the extraction, outputting the head and tail as two values.

  • The cons function does the reverse, taking an element and a list as inputs, and outputting a new list with the first input as its new head.

Here’s the full source code:

import String exposing (cons, uncons)

capFirst : String -> String         -- (1)
capFirst s =
    case uncons s of                -- (2)
        Just ( h, t ) ->            -- (3)
            cons (Char.toUpper h) t -- (4)

        Nothing ->                  -- (5)
            ""
1

Strong typing is provided by the function prototype. In this case, the function accepts one string and outputs another string.

2

The uncons function treats the string as a list, separating the head and tail. It returns Nothing if the string is empty.

3

This line runs when uncons returns a Just value wrapping a pair, meaning that the string contained at least one character. (An empty tail is perfectly acceptable, but there must be at least one character in order to have a head and a tail.) This line assigns the initial character to h (for head) and the rest of the string to t (for tail).

4

The toUpper function capitalizes the character in h, after which the cons function reattaches the tail. Because this expression is the value of the branch, and the case expression is the entire body of the function, the combined string is returned to the caller.

5

This block shows the usefulness of the Nothing concept mentioned in “Syntax”. If the string passed is empty, uncons returns Nothing and this block runs. It prevents a failure that could abort the program. Instead, an empty string is returned.

Julia

Julia is an enormous language with features to match all tastes. It incorporates a number of functional elements, like Elixir and Elm, but also offers for and while loops. For concurrent programming, Julia offers asynchronous tasks and message passing, as well as the more traditional model of threads and mutexes. It has macros and an extended kind of introspection under the term metaprogramming. At the same time, Julia developers claim that it offers performance comparable to C.

All of this is in the service of mathematical, scientific, and statistical programming. It’s therefore likely that some statisticians and data scientists who have checked out MATLAB, Python, and R for these tasks will consider Julia.

In fulfillment of this mission, Julia offers a deep well of resources for matrix manipulation. It standardizes a missing value (a None or Nothing in other languages) under the name missing and builds in numerous features to manipulate it like a value. This is offered to help run statistics over datasets that are missing values, as nearly all do. Julia’s designers try to anticipate how a statistician would deal with a missing value, so you can run arrays with missing values through various mathematical and comparison operations.2

A central programming model in Julia is multiple dispatch, which is similar to polymorphism (method overloading) but more flexible because the version of the function to run is chosen dynamically at runtime, according to the types of all the arguments. Thus, you can pass a variety of values of different data types at runtime and launch a customized function designed for those data types. In the following hypothetical code, the first process_inp function will run whenever the first argument passed is a 64-bit floating-point number and the second is a 64-bit integer. If two arguments of other data types are provided, the second function will run and figure out how to handle them:

process_inp( x::Float64, y::Int64 ) ...

process_inp( x, y ) ...

Kotlin

After the functional excursion represented by the previous sections on Elixir, Elm, and Julia, we turn to a language with more conventional, procedural elements. Kotlin’s major selling point is its use for Android programming. It was adopted by Google for this purpose because it strikes the right balance in several areas: a relatively traditional syntax and design that Java programmers can learn quickly, but a number of functional and modern features, as well, and support for the JVM that makes it easy to integrate Java code written for Android. In a parallel universe, Android might be running Scala programs instead of Kotlin, because the two languages share many traits.

Google announced in May 2019 that “Android development will become increasingly Kotlin-first”, a testimonial to Kotlin’s technical strengths.3

However, Kotlin is turning up in many general-purpose environments, as well. It can produce output in formats besides JVM bytecode: JavaScript and native binaries for popular platforms. Developers are currently working on multiplatform programming, a way for Kotlin source code to wrap implementations compiled for different platforms. Kotlin integrates with Spring and can be used to develop many server applications through Ktor.

Among the features that Kotlin shares with many other languages, coroutines and asynchronous calls are of major importance in Android. Like Julia, Kotlin gives programmers a choice: both a high-level asynchronous abstraction and a low-level threads interface.

As with Apple’s Swift language (see “Environment”), Kotlin must adapt to the same design choices that Java programmers face in Android. For instance, Android requires all user interaction—the display of buttons and tabs, the handling of taps, and so on—to be performed on a single thread. Other tasks such as network code must run on background threads. These threading requirements can significantly increase the complexity of Java code.

Kotlin coroutines greatly simplify the code and use threads more efficiently by allowing multiple pieces of concurrent code to share a thread, or spawn new threads. Asynchronous UI tasks can automatically be scheduled to run on the UI thread, and non-UI tasks can run in the background.

The snippet of code that follows was provided by Kenneth Kousen from his Kotlin Cookbook, and is part of a program that checks time zones. The snippet illustrates Kotlin’s use of collections (a basic design pattern), method chaining (a common technique used in many object-oriented languages), and higher-order functions (an essential functional technique):

val zones = ZoneId.getAvailableZoneIds()
    .filter { regex.matches(it) }
    .map { instant.atZone(ZoneId.of(it)) }
    .sortedBy { it.offset.totalSeconds }
    .toList()

This sequence of chained method calls uses the val construct to set zones as an immutable variable, an important and frequently used feature.

Method chaining involves the dots (periods) that come before getAvailableZoneIds, filter, map, and so on. Each method in the chain operates on the value returned by the one before it. The getAvailableZoneIds method is defined by the ZoneId class in java.time and returns a set of zone ID strings, which the other methods in the chain can treat as a collection. These methods, such as filter, are general-purpose functions provided by Kotlin to work on collections. Some are higher-order functions, meaning that they take other functions as arguments.

Typical method calls, such as getAvailableZoneIds() with no arguments and regex.matches(it) with one argument, enclose arguments in parentheses. But the filter, map, and sortedBy methods take a lambda (anonymous function), indicated by the curly braces. The function or data item inside the lambda determines the outcome of the function; thus:

  • filter returns only the time zones that match the regular expression (defined earlier).

  • map runs the function within the lambda on each remaining zone ID, converting it to the date and time of instant in that zone.

  • sortedBy sorts by the totalSeconds field.

Thus, the code shown reveals Kotlin as an elegant implementation of popular modern programming practices. Furthermore, the code makes heavy use of the java.time package, just by importing it in the same way as a Kotlin package. This shows the convenience of Java/Kotlin integration.

Rust

Like the creators of Crystal, the creators of Rust wanted a better systems programming language that avoided some of the design decisions in C and C++ that have led to widespread problems such as buffer overflow errors. They also wanted the modern features that were appearing in other languages and were not in the C family. Whereas the Crystal developers were concerned with simplicity and streamlined syntax, the Rust developers focused on robust, secure, error-free programming. The Mozilla Foundation, famous as the guardian of the Firefox browser, launched the Rust project and coordinates the work of a large and well-functioning community.

Following are the three goals highlighted on Rust’s web page:

  • Performance, which is addressed on two levels: language constructs that permit compiled speed comparable to C, and support for high levels of asynchronous and parallel computation.

  • Reliability, which is guaranteed through constructs such as size-specific data types described in “Robustness” and other features mentioned later in this section.

  • Productivity, which is provided largely through the Cargo package manager and a kind of packaging called a crate. These features cover dependency downloading, building, testing, documentation generation, continuous integration in a variety of environments, integration into registries, and even fixing common errors.

As a systems programming language, Rust needs to appeal to C and C++ programmers, easing migration. The syntax and basic features, such as control flow, will be familiar to C programmers. Pointers and passing by reference are supported by Rust, with heap allocation implemented by a mechanism called a box. However, Rust ensures that only one variable owns any particular block of memory, allowing that memory to be freed efficiently when the owner goes out of scope and preventing many memory-related errors.
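The single-owner rule can be sketched in a few lines. This is a minimal illustration, not code from the Rust project; the Point type and the functions sum_by_ref, consume, and ownership_demo are hypothetical names made up for the example.

```rust
/// Hypothetical data type used only for this illustration.
struct Point {
    x: i32,
    y: i32,
}

/// Borrows the point through a reference; the caller keeps ownership.
fn sum_by_ref(p: &Point) -> i32 {
    p.x + p.y
}

/// Takes ownership of the box; its heap memory is freed automatically
/// when `p` goes out of scope at the end of this function.
fn consume(p: Box<Point>) -> i32 {
    p.x * p.y
}

/// Shows a borrow followed by a move of the same box.
fn ownership_demo() -> (i32, i32) {
    let boxed = Box::new(Point { x: 3, y: 4 }); // heap allocation owned by `boxed`
    let sum = sum_by_ref(&boxed); // pass by reference: `boxed` is still usable
    let product = consume(boxed); // ownership moves into `consume`
    // Using `boxed` past this point would be rejected by the compiler,
    // because the value has been moved.
    (sum, product)
}
```

Calling ownership_demo() yields the pair (7, 12). The point of the sketch is what the compiler refuses: any use of boxed after the call to consume is a compile-time error, not a runtime crash.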

As further inducement to C and C++ programmers, Rust can call C functions using the Foreign Function Interface and get access to global variables in C programs. In return, C can call Rust through the mechanism of declaring a Rust function as a callback.

In addition to popular everyday platforms, Rust runs on a number of microcontrollers, NVIDIA GPUs, and WebAssembly, as is suitable for a systems programming language.

Rust has also generated a lot of interest among Python programmers. (Check the graphic labeled “What programming languages are you comfortable with?” in Rust’s 2018 survey.)

Rust took the same path as Java and many high-level scripting languages, adding some common internal mechanisms that aren’t in C++ to protect programmers from their own mistakes. Rust proponents claim that the performance impacts of their implementations are minimal. The main mechanism is bounds checking. All indexes are checked against the size of the array. This prevents buffer overflows, probably the most common source of errors and security breaches in programs. If you use a variable as an index, the compiled program checks the index dynamically against the size of the array and halts with an error (a panic, in Rust terms) instead of letting the program access arbitrary data.
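Bounds checking can be seen through two standard ways of reading a slice. The following sketch uses only the standard library; the function names checked_lookup and direct_lookup are invented for the illustration.

```rust
/// Returns the element at `index`, or `None` when the index is out of range.
fn checked_lookup(values: &[i32], index: usize) -> Option<i32> {
    values.get(index).copied()
}

/// Indexes directly. An out-of-range index makes the program panic
/// instead of reading memory past the end of the slice.
fn direct_lookup(values: &[i32], index: usize) -> i32 {
    values[index]
}
```

With a three-element slice, checked_lookup(&v, 7) returns None, whereas direct_lookup(&v, 7) stops the program with an "index out of bounds" panic. Neither call can read arbitrary memory.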

Rust’s requirement that numbers be defined with explicit sizes, a way to prevent some types of errors, was already introduced in “Robustness”.

Both forms of error-checking described in “Syntax” are present in Rust. The None type is supported through a data type called Option, short for optional. Also, you can define errors as data types and return them from functions. Option is used when the returned value might be absent, whereas Result is used when the returned value might be an error.
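The difference between the two types might look like the following sketch. The AgeError type and the functions first_even and parse_age are hypothetical, chosen only to show an error defined as an ordinary data type.

```rust
/// Hypothetical error type, defined as an ordinary enum.
#[derive(Debug, PartialEq)]
enum AgeError {
    NotANumber,
    OutOfRange(i64),
}

/// `Option`: the absence of an even number is normal, not an error.
fn first_even(values: &[i32]) -> Option<i32> {
    values.iter().copied().find(|v| v % 2 == 0)
}

/// `Result`: a failure carries the reason it happened.
fn parse_age(text: &str) -> Result<u8, AgeError> {
    // `?` returns early with the mapped error if parsing fails.
    let n: i64 = text.trim().parse().map_err(|_| AgeError::NotANumber)?;
    if (0..=150).contains(&n) {
        Ok(n as u8) // safe: the range check guarantees the value fits in a u8
    } else {
        Err(AgeError::OutOfRange(n))
    }
}
```

A caller of first_even has to handle None, and a caller of parse_age has to handle both Err variants, before touching the value inside.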

The example that follows illustrates error catching with Result. It also illustrates a popular feature of modern languages called pattern matching, which resembles switch or case statements in other languages but is highly flexible because you can match values in many different ways.

let num: i32 = match my_str.parse() { // (1)
  Ok(x) => x,                         // (2)
  Err(_) => 0,                        // (3)
};
1

This line creates a new integer variable num and assigns it the result of a match expression. The parse method of my_str returns a Result enum containing either the variant Ok, wrapping the integer that was parsed successfully, or Err wrapping an error if the function was unable to parse my_str.

2

This line is the branch of the match expression matching the Ok variant. If it matches (i.e., if the Result is indeed an Ok wrapping some value), the integer is assigned to a local variable x, and x is returned as the value of the branch and assigned to num.

3

This line is the branch of the match expression matching the Err variant. If it matches, the error is ignored (the underscore means we discard the actual error) and instead we return 0 to be assigned to num. Alternatively, we could have chosen to capture the error and do something with it here.
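As a sketch of that alternative, the first function below binds the error to a name instead of discarding it. The second function shows some of the other ways a match can test a value: literals, ranges, and guards. Both function names, parse_with_note and describe, are invented for this illustration.

```rust
/// Parses `my_str`, returning the number plus a note describing any error.
fn parse_with_note(my_str: &str) -> (i32, Option<String>) {
    match my_str.parse::<i32>() {
        Ok(x) => (x, None),
        // Bind the error to `e` instead of discarding it with `_`.
        Err(e) => (0, Some(format!("could not parse {:?}: {}", my_str, e))),
    }
}

/// Matches on a literal, a range, a guarded binding, and a catch-all.
fn describe(n: i32) -> &'static str {
    match n {
        0 => "zero",
        1..=9 => "single digit",
        x if x < 0 => "negative",
        _ => "large",
    }
}
```

The compiler checks that the arms of each match cover every possible value, so removing the final catch-all arm from describe would be a compile-time error.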

Variables are immutable by default. This suggests an emphasis on programming without side effects, as functional languages do. Influenced (as Elm was) by Haskell, Rust also supports other common elements of functional programming, such as higher-order functions, closures, algebraic data types, and iterators with maps, filters, and more.
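A few of these elements fit into a short sketch. The function names are hypothetical; the iterator methods (iter, filter, map, collect) are part of the standard library.

```rust
/// Squares the even numbers using an iterator chain.
fn squares_of_evens(values: &[i32]) -> Vec<i32> {
    values
        .iter()
        .filter(|v| *v % 2 == 0)
        .map(|v| v * v)
        .collect()
}

/// Returns a closure that captures `step` from its environment.
fn make_adder(step: i32) -> impl Fn(i32) -> i32 {
    move |x| x + step
}

/// A higher-order function: it takes another function as an argument.
fn apply_twice(f: impl Fn(i32) -> i32, x: i32) -> i32 {
    f(f(x))
}

/// Contrasts an immutable binding with an explicitly mutable one.
fn mutability_demo() -> i32 {
    let total = 10; // immutable: `total += 1` would not compile
    let mut counter = 0; // mutation has to be requested with `mut`
    counter += total;
    counter
}
```

For example, squares_of_evens(&[1, 2, 3, 4]) produces a vector containing 4 and 16, and apply_twice(make_adder(5), 1) produces 11.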

Rust supports threads and asynchronous calls, which it calls futures. Rust provides immutable data structures that are safe to share among threads without locking, because they cannot change value. The language also offers a form of one-way communication between threads called channels.
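The following sketch combines the two ideas, using the standard library's Arc to share read-only data among threads and an mpsc channel to send results back. The function name parallel_sum and the way the work is divided are assumptions made for the example, not a recommended design.

```rust
use std::sync::mpsc;
use std::sync::Arc;
use std::thread;

/// Sums a vector by handing interleaved elements to worker threads.
fn parallel_sum(values: Vec<i32>, workers: usize) -> i32 {
    let workers = workers.max(1); // avoid a zero step below
    let shared = Arc::new(values); // read-only data, shared without a lock
    let (tx, rx) = mpsc::channel(); // one-way: workers send, this thread receives

    for id in 0..workers {
        let data = Arc::clone(&shared);
        let tx = tx.clone();
        thread::spawn(move || {
            // Each worker sums every `workers`-th element, starting at `id`.
            let partial: i32 = data.iter().skip(id).step_by(workers).sum();
            tx.send(partial).expect("receiver should still be alive");
        });
    }
    drop(tx); // drop the original sender so the receive loop can end

    // The iterator finishes once every sender has been dropped.
    rx.iter().sum()
}
```

Summing the numbers 1 through 100 with four workers returns 5050. No lock is needed because no thread can modify the shared vector.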

Finally, Rust has a variety of collection types, such as Vec (a growable vector that allows some sophisticated memory allocation), HashMap (a key/value map), and queues such as VecDeque.
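A brief sketch shows three of the standard collections at work. The functions word_counts and drain_in_order are hypothetical examples, not part of any library.

```rust
use std::collections::{HashMap, VecDeque};

/// Counts how often each word occurs, ignoring case.
fn word_counts(text: &str) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // `entry` inserts 0 the first time a word is seen, then increments.
        *counts.entry(word.to_lowercase()).or_insert(0) += 1;
    }
    counts
}

/// Processes jobs in first-in, first-out order.
fn drain_in_order(jobs: Vec<&str>) -> Vec<String> {
    let mut queue: VecDeque<&str> = jobs.into_iter().collect();
    let mut done = Vec::with_capacity(queue.len()); // reserve space up front
    while let Some(job) = queue.pop_front() {
        done.push(format!("done: {}", job));
    }
    done
}
```

The call to Vec::with_capacity is one small instance of the control over memory allocation that these collections offer.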

1 The caller could create an error by submitting a floating-point number so big that p overflows the maximum integer size. But this is not supposed to happen in either Python or Elixir, which supposedly support integers of unlimited size.

2 The R language also has a special data type for missing values called NA, which is appropriate for a language created by statisticians for statisticians.

3 Cynics might suggest that the decision was influenced by the long-running copyright lawsuit brought by Oracle against Google’s use of Java, which has serious implications for software development.
