Exploring Gleam with Genetic Algorithms

Friday, March 22, 2024

I have been seeing a new functional programming language all over Twitter lately called Gleam. Despite my limited experience in writing real applications with functional programming, I've always enjoyed Haskell and the purity that other functional programming languages allow.

This post is an excuse to explore the Gleam language and build something fun. Warning, I am not a Gleam or functional programming expert I am certain there are better ways to do everything shown here.

What We Are Building
====================================================================================================================================================================================================================================================================================================================================

We are building a simple word-solving genetic algorithm. The goal is to take in some target text and, given a population of random texts, evolve the population through the use of genetic algorithms to match our target text.

I am not the first person to do this (showing the population evolving in this way in the terminal). I saw a clip of something similar on Twitter but cannot find it again. If anyone knows who originally did this, please let me know so I can credit them.

Our First Gleam Program
====================================================================================================================================================================================================================================================================================================================================

Gleam has great tooling. Starting a new project is as simple as running gleam new. They have their own formatter, package manager, and LSP built into the gleam executable. I really like this as it reduces the overhead of getting started with Gleam.

There are a lot of cool features in Gleam, but here are the three that you will be seeing the most (or not seeing):

More on all of these as we start programming!

The Code
====================================================================================================================================================================================================================================================================================================================================

Warning: Here Be Recursion

Our genetic algorithm will start with a population of Individuals. Let's start by creating these Individuals and a way to score them.

type Individual {
  Individual(text: String, display_text: String, score: Int)
}

fn score_text(text: String, target_text: String) -> #(String, Int) {
  case string.first(text), string.first(target_text) {
    Ok(tt_first), Ok(tt_last) -> {
      let #(colored_char, score) = case string.compare(tt_first, tt_last) {
        order.Eq -> #("\u{001b}[38;5;2m" <> tt_first, 1)
        _ -> #("\u{001b}[38;5;9m" <> tt_first, 0)
      }
      let #(colored_rest, score_rest) =
        score_text(string.drop_left(text, 1), string.drop_left(target_text, 1))
      #(colored_char <> colored_rest, score + score_rest)
    }
    _, _ -> #("", 0)
  }
}

fn score_individual(individual: Individual, target_text: String) -> Individual {
  let #(colored_text, score) = score_text(individual.text, target_text)
  Individual(..individual, display_text: colored_text, score: score)
}

Our score_text function scores each individual character, assigning a value of 1 if the characters match and 0 if they don't. Notice that it also builds a string of specialized escape sequences that color the output. This special display_text is what we print to the terminal. The text field of the Individual is what we use for scoring.

It would probably be better if we compare the numeric distance between the codepoint values of each character, but using 1s and 0s works well enough.

Let's create our population.

let population =
  list.range(0, population_size)
  |> list.map(fn(_) {
    generate_random_text(string.length(target_text))
    |> Individual("", 0)
  })

fn generate_random_text(length: Int) -> String {
  case length {
    0 -> ""
    _ -> {
      // We can unwrap here as we know this will never be an Error
      int.random(58) + 65
      |> string.utf_codepoint()
      |> result.map(fn(a) { string.from_utf_codepoints([a]) })
      |> result.unwrap("")
      <> generate_random_text(length - 1)
    }
  }
}

This function generates a List(Individuals) with a length of population_size, where each Individual's text field is a random string with the same length as the target_text.

Here, we see some use of that intriguing |> operator. Let's break down how we're using it. In the first section, we create our population by initially creating an empty list of size population_size, a constant we set off-screen. We pipe that list into the list.map function. Remember, the pipe operator passes the value on the left to the first argument of the function on the right. We could rewrite that first |> as:

let population =
  list.map(list.range(0, population_size), fn(_) {
    generate_random_text(string.length(target_text))
    |> Individual("", 0)
  })

The |> operator is just syntatic sugar, and I think nice syntatic sugar. When chaining a lot of arguments it is convenient to use!

Now that we have our individuals, we need a way to cross them together and create future generations.

fn cross_strings(s1: String, s2: String) {
  case string.first(s1), string.first(s2) {
    Ok(s1_first), Ok(s2_first) -> {
      let char = case int.random(2) {
        0 -> s1_first
        _ -> s2_first
      }
      use rest <- result.try(cross_strings(
        string.drop_left(s1, 1),
        string.drop_left(s2, 1),
      ))
      // 10% chance to randomly add 1, -1, or 0
      case int.random(10) >= 9 {
        True -> {
          use codepoint <- result.try(
            string.to_utf_codepoints(char)
            |> list.first(),
          )
          let int_val = string.utf_codepoint_to_int(codepoint)
          let add = int.random(3) - 1
          use codepoint <- result.try(string.utf_codepoint(int_val + add))
          Ok(string.from_utf_codepoints([codepoint]) <> rest)
        }
        False -> {
          Ok(char <> rest)
        }
      }
    }
    _, _ -> Ok("")
  }
}

fn cross(p1: Individual, p2: Individual) -> Individual {
  cross_strings(p1.text, p2.text)
  |> result.unwrap("")
  |> Individual(display_text: "", score: 0)
}

Given two Individuals, the cross function produces a new Individual. As part of our mating process, we also introduce the possibility of mutation, where a codepoint can be mutated by plus or minus one.

There are several ways to perform evolution in a population. In our case, we will select the top 50% highest-scoring Individuals and cross them with each other. The remaining 50% of our population is dropped as we need room for our new Individuals.

fn permute_until_done(
  population: List(Individual),
  target_text: String,
  population_size: Int,
  generation: Int,
) {
  // We want to score and sort our population by score DESC
  let population =
    population
    |> list.map(fn(p) { score_individual(p, target_text) })
    |> list.sort(fn(p1, p2) { int.compare(p2.score, p1.score) })

  print_population(list.take(population, population_size / 2))

  // Check if 50% of the items have solved it
  // If not, perform selection, crossover and mutation, and continue trying
  case
    population
    |> list.take(population_size / 2)
    |> list.all(fn(x) { x.score == string.length(target_text) })
  {
    False -> {
      list.take(population, population_size / 2)
      |> fn(p) { list.zip(p, list.reverse(p)) }
      |> list.map(fn(p) {
        let #(p1, p2) = p
        let p3 = cross(p1, p2)
        let p4 = cross(p1, p2)
        [p1, p2, p3, p4]
      })
      |> list.flatten()
      |> permute_until_done(target_text, population_size, generation + 1)
    }
    True -> {
      io.println("\nDone in  " <> int.to_string(generation) <> " generations")
    }
  }
}

Notice that we are only printing the top 50% of our population and only checking if the top 50% of our population has found the correct text. With the way we are performing crossover, it is essentially impossible for 100% of the population to find the right text at the same time, as the mutation will likely disrupt it.

The final piece we need is a main function to run it all.

fn do_work(target_text: String, population_size: Int) {
  list.range(0, population_size)
  |> list.map(fn(_) {
    generate_random_text(string.length(target_text))
    |> Individual("", 0)
  })
  |> permute_until_done(target_text, population_size, 0)
}

pub fn main() {
  case argv.load().arguments {
    ["--text", text, "--population-size", population_size] ->
      do_work(
        text,
        int.parse(population_size)
        |> result.unwrap(1000),
      )
    _ -> do_work("GleamRocks", 1000)
  }
}

Thats it! We ommitted a few of the more boring details like printing the population, but that is available in the code at the repository.

Thanks for reading! I hope you have an awesome day.

the repo

My Links
====================================================================================================================================================================================================================================================================================================================================

You can find me posting random opinions and updates on Twitter or lurking on LinkedIn. I love to meet new people so don't hesitate to reach out or connect.

Github
Twitter
LinkedIn
Newsletter