Using Avalon-MM for FPGA-HPS communication

On our public Slack channel at #clash we got the following question a while ago:

Do you happen to know of any tutorials that show some IP working with the avalon MM bridge on a DE10 nano (from the HPS side)? Currently the clash part is making sense but figuring out how to call even the most basic IP (something that adds two numbers) is eluding me.

As it happens, I had been working on a project to exchange data packets between the FPGA (in Clash) and the HPS on a Terasic DE10-Standard. This board has an Intel Cyclone V SoC, and I also used Avalon-MM interfaces for the communication. The question did not specify which (if any) operating system the HPS would be running; I implemented support for Linux. My project is still lacking part of its documentation, but I decided to release it early in order to write this blog post.

The project can be found on GitHub at That is the main packet-fifo project; you also need a small accompanying Linux kernel driver:

The packet-fifo project communicates between FPGA and HPS through a few standard components that are shipped with Intel Platform Designer (QSys). For the Quartus project, I started with the DE10-Standard Golden Hardware Reference Design, and instantiated the following components between HPS and FPGA:

The links from M to S are Avalon-MM links (master to slave), and the arrows from SRC to SNK are Avalon-ST interfaces, source to sink. The FPGA design is connected by an Avalon Conduit Interface, and sees groups of pins.

More detail can be obtained by opening the project in Quartus and looking around.

With this Golden Hardware Reference Design, the Clash design is not the top entity of the Quartus project. Instead, it is instantiated from a hand-edited Verilog file that ties the components together. The Clash top entity is connected to some LEDs and buttons and the Avalon Conduit Interfaces of the bus bridges (dubbed fifo_f2h_in and fifo_f2h_out). The type of the Clash top entity thus becomes:

    :: Clock System
    -> "rst_n" ::: Signal System Bool
    – ^ Active-low asynchronous reset
    -> "fpga_debounced_buttons" ::: Signal System (BitVector 3)
    -> "fifo_f2h_in" :::
           ( Signal System (Unsigned 32)
           – ^ fifo_f2h_in_mm_external_interface_read_data
           , Signal System Bool
           – ^ fifo_f2h_in_mm_external_interface_acknowledge
    -> "fifo_f2h_out" :::
           ( Signal System (Unsigned 32)
           – ^ fifo_h2f_out_mm_external_interface_read_data
           , Signal System Bool
           – ^ fifo_h2f_out_mm_external_interface_acknowledge
    -> ( "fpga_led_internal" ::: Signal System (Unsigned 10)
       , "fifo_f2h_in" :::
             ( Signal System (Unsigned 3)
             – ^ fifo_f2h_in_mm_external_interface_address
             , Signal System (Unsigned 32)
             – ^ fifo_f2h_in_mm_external_interface_write_data
             , Signal System Bool
             – ^ fifo_f2h_in_mm_external_interface_read
             , Signal System Bool
             – ^ fifo_f2h_in_mm_external_interface_write
             , Signal System (BitVector 4)
             – ^ fifo_f2h_in_mm_external_interface_byte_enable
       , "fifo_f2h_out" :::
             ( Signal System (Unsigned 6)
             – ^ fifo_h2f_out_mm_external_interface_address
             , Signal System (Unsigned 32)
             – ^ fifo_h2f_out_mm_external_interface_write_data
             , Signal System Bool
             – ^ fifo_h2f_out_mm_external_interface_read
             , Signal System Bool
             – ^ fifo_h2f_out_mm_external_interface_write
             , Signal System (BitVector 4)
             – ^ fifo_h2f_out_mm_external_interface_byte_enable

But the question is specifically about the HPS-side of the communication, so that is what I will go into in this blog post.

The first version of the code where I had the communication basically working is commit c86ad02. The directory /src/hps contains the Linux userspace code. I got to this version of the code by basically following chapter 7 of the DE10-Standard User Manual (from titled Examples for using both HPS SoC and FGPA. This version of the code needs to be compiled with the Intel SoC FPGA Embedded Development Suite (SoC EDS).

As the user manual describes, you first need to generate hps_0.h from your Quartus project. The GHRD (in /quartus in my project) contains the script This needs to be invoked from within an Intel FPGA SoC Embedded Command Shell, so it might look like this:

$ ~/intelFPGA/18.1/embedded/
Intel FPGA Embedded Command Shell

Version 18.1 [Build 625]
$ cd packet-fifo/quartus/
$ ./
swinfo2header: Creating macro file 'hps_0.h' for module 'hps_0'

This hps_0.h is already included in /src/hps in my project, but that is how it was generated, and that is what needs to be invoked again after the platform is changed in QSys.

Now writing to the Avalon-MM bridge is as simple as mmap()ing /dev/mem and writing to the proper address constructed from constants in the headers. If you simply had instantiated an *Avalon to External Bus Bridge*, your Clash design would be the addressed slave. The packet-fifo project has the Clash design as a master to its own Avalon-MM slaves, and it also contains a System ID Peripheral. The following code would map the whole Lightweight HPS-to-FPGA AXI bridge in the process’s virtual memory, then read and print the System ID and put the value 42 into the FIFO data buffer:

#include <sys/mman.h>
#include "hwlib.h"
#include "socal/socal.h"
#include "socal/hps.h"
#include "socal/alt_gpio.h"
#include "hps_0.h"

#define HW_REGS_SPAN ( 0x04000000 )
#define HW_REGS_MASK ( HW_REGS_SPAN - 1 )

    int fd;
    char *virtual_base;
    volatile uint32_t *h2p_sysid_addr, *fifo_h2f_base;
    uint32_t id;

    fd = open("/dev/mem", (O_RDWR | O_SYNC));
    virtual_base = mmap(NULL, HW_REGS_SPAN, (PROT_READ | PROT_WRITE),
            MAP_SHARED, fd, HW_REGS_BASE);
    h2p_sysid_addr = (uint32_t *) (virtual_base +
    fifo_h2f_base = (uint32_t *) (virtual_base +

    id = *h2p_sysid_addr;
    printf("%#010x\n", id);
    *fifo_h2f_base = 42;

    return 0;

This is fine for some simple programmed I/O. But as soon as you add more functionality than that, such as interrupts, it seems you necessarily end up in kernel space if you are running Linux. However, Linux does offer you the tools to do just the minimum necessary in kernel space, and do the rest in user space. This is accomplished with the Userspace I/O subsystem. Furthermore, by switching to using a Device Tree, we can almost dispense with the SoC EDS and use a regular compilation environment and cross-platform standards. The latest version of packet-fifo, as in the master branch, does exactly that. See the README for working with that code. Using Quartus and the SoC EDS, one can generate a Device Tree overlay that contains entries that look like this:

fifo_h2f_in: fifo@0x000000020 {
        compatible = "altr,fifo-18.1", "altr,fifo-1.0";
        reg = <0x00000000 0x00000020 0x00000008>,
                <0x00000000 0x00000000 0x00000020>;
        reg-names = "in", "in_csr";
        interrupts = <0 44 4>;
}; //end fifo@0x000000020 (fifo_h2f_in)

The compatible property matches devices to their Linux kernel drivers, and the tiny kernel driver used by packet-fifo matches on the "altr,fifo-1.0" programming model and binds to this device.

The numbers denote addresses and interrupt lines as seen by the direct parent of such a device, and do not correspond to the view from the CPU. But Linux takes care of the mapping for us, and the tiny kernel driver enables exposing them to user space through a device file. This file can be mmap()ed to access the Avalon-MM bus, and select() or a blocking read() is used to wait for an interrupt. All the details of handling the interrupt are done by the kernel, by the device drivers for the SoC platform. The interrupt could even be shared with another device if interrupt lines become a scarce resource.

With this information, it should be possible to start with exchanging words of data from the HPS-side of an Avalon-MM interface. By browsing through the `packet-fifo` project, you can also view an example of the other pieces that are needed to interface the FPGA and the HPS, albeit all still at a pretty basic level. Note that there are still some bits of documentation missing, though.

Starter project: get started with Clash in minutes!

We just released a new and convenient way to get started with Clash: a starter project based on Stack. To get started,install Stack. Next, open a terminal and type:

stack new my-clash-project clash-lang/simple

Your project will be instantiated in my-clash-project/ and is ready to be built. To do so, type:

stack build

To run the tests in the project, run:

stack test

To compile to VHDL, run:

stack run clash – Test.Example – vhdl

And that’s all there is to it! The directory includes a giving you an overview of the project and how to extend it. Alternatively, you can read it online too.
Happy Clashing!

Note: Nix users might be interested in the Clash Nix Starter Project.

New Clash FPGA Starter

For this tutorial, you will need at least Clash version 1.2.3. We’ll be programming the Terasic DE0-Nano FPGA development board using the functional hardware description language Clash. The end result of this tutorial is demonstrated in the video below:

Watch this video on YouTube.

This tutorial is not a general introduction to Clash, nor to programming FPGAs. It is meant to demonstrate how to use Synthesize annotations and SynthesisAttributes to configure your Clash designs for an FPGA, without writing a single line of VHDL or (System)Verilog.

Blinker circuit

We start with some general information about the DE0-Nano board:

  • It has a 50 MHz crystal connected to a clock pin of FPGA, which we will connect to one of the PLLs that is on the FPGA to create a stable 20 MHz clock signal.
  • It has 8 green LEDs that that are active-high (turn on at logic level 1).
  • It has two buttons that are already properly debounced by a Schmitt trigger. The buttons are active-low, meaning that their logic level is 0 when they are pressed, and logic level 1 when they are not pressed.

The circuit that we are making will repeatedly do one of two things:

  • Invert the state of the LEDs
  • Rotate the state of the LEDs

We switch between these two modes when the KEY1 button on the board is pressed and subsequently released. We reset the circuit by pressing the KEY0 button.

Clash circuit specification

The complete description of the circuit is given below, where the description of the core behavior of the circuit only starts at line 95, and everything that comes before it is there to correctly set up the FPGA.

module Blinker where

import Clash.Prelude
import Clash.Intel.ClockGen
import Clash.Annotations.SynthesisAttributes

 – Define a synthesis domain with a clock with a period of 20000 /ps/.
 – i.e. 50 MHz
createDomain vSystem{vName="Input", vPeriod=20000}

 – Define a synthesis domain with a clock with a period of 50000 /ps/.
 – i.e. 20 MHz
createDomain vSystem{vName="Dom20MHz", vPeriod=50000}

{-# ANN topEntity
    { t_name   = "blinker"
    , t_inputs = [ PortName "CLOCK_50"
                 , PortName "KEY0"
                 , PortName "KEY1"
    , t_output = PortName "LED"
    }) #-}
topEntity ::
  – | Incoming clock
  – Annotate with attributes to map the argument to the correct pin,
  – with the correct voltage settings, on the DE0-Nano development kit.
  Clock Input
    `Annotate` 'StringAttr "chip_pin" "R8"
    `Annotate` 'StringAttr
                "altera_attribute" "-name IO_STANDARD \"3.3-V LVTTL\"" ->
  – | Reset signal, straight from KEY0
  Signal Input Bool
    `Annotate` 'StringAttr "chip_pin" "J15"
    `Annotate` 'StringAttr
                "altera_attribute" "-name IO_STANDARD \"3.3-V LVTTL\"" ->
  – | Mode choice, straight from KEY1. See 'LedMode'.
  Signal Dom20MHz Bit
    `Annotate` 'StringAttr "chip_pin" "E1"
    `Annotate` 'StringAttr
                "altera_attribute" "-name IO_STANDARD \"3.3-V LVTTL\"" ->
  – | Output containing 8 bits, corresponding to 8 LEDs
  – Use comma-seperated list in the "chip_pin" attribute to maps the
  – individual bits of the result to the correct pins on the DE0-Nano
  – development kit
  Signal Dom20MHz (BitVector 8)
    `Annotate` 'StringAttr
                "chip_pin" "L3, B1, F3, D1, A11, B13, A13, A15"
    `Annotate` 'StringAttr
                "altera_attribute" "-name IO_STANDARD \"3.3-V LVTTL\""
topEntity clk50Mhz rstBtn modeBtn =
  – Connect our clock, reset, and enable lines to the corresponding
  – /implicit/ arguments of the 'mealy' and 'isRising' function.
    (mealy blinkerT initialStateBlinkerT . isRising 1)
  – Enable line for subcomponents: we'll keep it always running
  en = enableGen

  – Start with the first LED turned on, in rotate mode, with the counter
  – on zero
  initialStateBlinkerT = (1, Rotate, 0)

  – Signal coming from the reset button is low when pressed, and high when
  – not pressed. We convert this signal to the polarity of our domain with
  – 'unsafeFromActiveLow'.
  rst = unsafeFromLowPolarity rstBtn

  – Instantiate a PLL: this stabilizes the incoming clock signal and
  – indicates when the signal is stable. We're also using it to transform
  – an incoming clock signal running at 50 MHz to a clock signal running at
  – 20 MHz.
  (clk20Mhz, pllStable) =
      (SSymbol @"altpll20")

  – Synchronize reset to clock signal coming from PLL. We want the reset to
  – remain active while the PLL is NOT stable, hence the conversion with
  – 'unsafeFromActiveLow'
  rstSync =
      (unsafeFromLowPolarity pllStable)

data LedMode
  – | After some period, rotate active led to the left
  = Rotate
  – | After some period, turn on all disable LEDs, and vice versa
  | Complement
  deriving (Generic, NFDataX)

flipMode :: LedMode -> LedMode
flipMode Rotate = Complement
flipMode Complement = Rotate

 – Finally, the actual behavior of the circuit
blinkerT ::
  (BitVector 8, LedMode, Index 6660001) ->
  Bool ->
  ((BitVector 8, LedMode, Index 6660001), BitVector 8)
blinkerT (leds, mode, cntr) key1R = ((ledsN, modeN, cntrN), leds)
    – clock frequency = 20e6  (20 MHz)
    – led update rate = 333e-3 (every 333ms)
    cnt_max = 6660000 :: Index 6660001 – 20e6 * 333e-3

    cntrN | cntr == cnt_max = 0
          | otherwise       = cntr + 1

    modeN | key1R     = flipMode mode
          | otherwise = mode

    ledsN | cntr == 0 =
              case mode of
                Rotate -> rotateL leds 1
                Complement -> complement leds
          | otherwise = leds

The two things that we wanted to highlight from the “setup” part of the descriptions are:

The Synthesize annotation we see on line 15:

{-# ANN topEntity
    { t_name   = "blinker"
    , t_inputs = [ PortName "CLOCK_50"
                 , PortName "KEY0"
                 , PortName "KEY1"
    , t_output = PortName "LED"
    }) #-}

lets you control the port names of the generated entity (VHDL) or module (Verilog), and the name of the entity/module itself. It also tells the Clash compiler from which function it should start synthesis. So in the above case, the compiler should start from the topEnity function, and should call the generated entity corresponding to that function: blinker. Then the ports corresponding to the arguments clk50MHz, rstBtn and modeBtn should be called: CLOCK_50, KEY0 and KEY1. The output port, corresponding to the result of the function should be named: LED.

Next up are the Annotate attributes, such as the one on line 30:

  Clock Input
    `Annotate` 'StringAttr "chip_pin" "R8"
    `Annotate` 'StringAttr
                "altera_attribute" "-name IO_STANDARD \"3.3-V LVTTL\"" ->

where we annotate the type Clock Input with two string attributes:

  • "chip_pin" "R8", which informs Quartus to connect the port corresponding to this argument to the FPGA pin called R8.
  • "altera_attribute" "-name IO_STANDARD \"3.3-V LVTTL\"", which informs Quartus about the voltage characteristics of the pin associated with the port associated with this argument.

All of this allows us to simply import the generated VHDL later on in our Quartus project without having to do any further FPGA configuration.

VHDL generation

Now it’s time to generate some VHDL. Copy the above description into a file called Blinker.hs. Then from the command line, run the Clash compiler to generate VHDL:

clash – vhdl -fclash-hdlsyn Quartus Blinker.hs

This will create a ./vhdl/Blinker/blinker directory with all the generated files. Note we passed in the additional flag -fclash-hdlsyn Quartus, that is because Quartus is somewhat quirky as to where it expects attributes for entity ports: in the architecture. By passing -fclash-hdlsyn Quartus, we ensure that Clash generates VHDL that is “compliant” with these quirks.

You can inspect ./vhdl/Blinker/blinker/blinker.hs and the entity and its ports are named as specified in the Synthesize ANN pragma.

entity blinker is
  port( – clock
       CLOCK_50 : in blinker_types.clk_input;
       KEY0     : in boolean;
       KEY1     : in std_logic;
       LED      : out std_logic_vector(7 downto 0));

Further down, we can see that the synthesis attributes have propagated to the generated VHDL:

attribute altera_attribute : string;
attribute altera_attribute of LED : signal is "-name IO_STANDARD ""3.3-V LVTTL""";
attribute altera_attribute of KEY1 : signal is "-name IO_STANDARD ""3.3-V LVTTL""";
attribute altera_attribute of KEY0 : signal is "-name IO_STANDARD ""3.3-V LVTTL""";
attribute altera_attribute of CLOCK_50 : signal is "-name IO_STANDARD ""3.3-V LVTTL""";

attribute chip_pin : string;
attribute chip_pin of LED : signal is "L3, B1, F3, D1, A11, B13, A13, A15";
attribute chip_pin of KEY1 : signal is "E1";
attribute chip_pin of KEY0 : signal is "J15";
attribute chip_pin of CLOCK_50 : signal is "R8";

Quartus Project

Next up, we create a Quartus project. We could go through the GUI to set things up, but in this case, we will simply create the project manually. Create the following files, with their given content, and put them in the directory where Clash generated the VHDL: ./vhdl/Blinker/blinker.


create_clock -name {CLOCK_50} -period 20.000 -waveform { 0.000 10.000 } [get_ports { CLOCK_50 }]
set_false_path -from * -to [get_ports {LED[*]}]
set_false_path -from [get_ports {KEY0}] -to *
set_false_path -from [get_ports {KEY1}] -to *


set_global_assignment -name DEVICE EP4CE22F17C6
set_global_assignment -name TOP_LEVEL_ENTITY blinker
set_global_assignment -name VHDL_FILE blinker_types.vhdl
set_global_assignment -name VHDL_FILE blinker.vhdl
set_global_assignment -name QSYS_FILE altpllE588ED61A4853C9C.qsys
set_global_assignment -name SDC_FILE blinker.sdc



You should now have a directory with the following files:

  • altpllE588ED61A4853C9C.qsys
  • blinker.qpf
  • blinker.qsf
  • blinker.sdc
  • blinker.vhdl
  • blinker_types.vhdl

FPGA Bitstream creation

Now start Quartus Prime; once started, in the menu bar, click File -> Open Project and open blinker.qpf. In the menu bar, click: Processing -> Start Compilation. This can take up to a minute depending on your machine. If everything worked as it was supposed to the last messages in the logs should be in the spirit of:

Info (332101): Design is fully constrained for setup requirements
Info (332101): Design is fully constrained for hold requirements
Info: Quartus Prime Timing Analyzer was successful. 0 errors, 1 warning
	Info: Peak virtual memory: 892 megabytes
	Info: Processing ended: Thu Jul  9 23:52:25 2020
	Info: Elapsed time: 00:00:01
	Info: Total CPU time (on all processors): 00:00:01
Info (293000): Quartus Prime Full Compilation was successful. 0 errors, 9 warnings

Programming the FPGA

After synthesis has finished, it is time to program our FPGA board. Connect the FPGA board to a USB port, and start the programmer from the menu bar: Tools -> Programmer. Press the Start button on the left to program your FPGA and wait until the progress bar says 100% (Successful).

Your FPGA should now be fully configured with the Clash generated design for you to play with. So yeah, go press those buttons!

Clash 1.0 released

It’s here! Clash 1.0 is finally released! And it has a binary release for the very first time!
Check out the install instructions instructions on how to get it!

The team at QBayLogic would like to thank all of the Clash supporters and users for making this happen.
Read all about the new features of Clash 1.0 on the clash website.

Solving custom operations in KnownNat constraints

In my previous post I discussed a solver plugin for `KnownNat` constraints, and with regards to user-defined type-level operations, I said:

i.e. it cannot solve constraints involving subtraction (`-`) or user-specified type-level operations on type-level naturals […] For user-defined type-level natural operations we are currently out of luck: unlike for the standard operations in `GHC.TypeLits`, we cannot encode the corresponding _dictionary functions_ beforehand.

well… it turns our we do not have to know these corresponding dictionary functions beforehand. So, I’m hereby announcing version 0.2 of the ghc-typelits-knownnat plugin, which can now also solve `KnownNat` constraints over types consisting of:

  • Subtractions (`GHC.TypeLits.-`)
  • User-defined type-level functions

Supporting user-defined operations

So what is the “trick” for handling user-defined operations? Well, in my previous post, we defined a class, and corresponding instance for every type-level operation. For example, for addition (`GHC.TypeLits.+`), we defined:

class KnownNatAdd (a :: Nat) (b :: Nat) where
  natSingAdd :: SNatKn (a + b)

instance (KnownNat a, KnownNat b) => KnownNatAdd a b where
  natSingAdd = SNatKn (natVal (Proxy @ a) + natVal (Proxy @ b))

And for multiplication (`GHC.TypeLits.*`), we defined:

class KnownNatMul (a :: Nat) (b :: Nat) where
  natSingMul :: SNatKn (a * b)

instance (KnownNat a, KnownNat b) => KnownNatMul a b where
  natSingMul = SNatKn (natVal (Proxy @ a) * natVal (Proxy @ b))

However, what we would want, is a single type class for functions of a certain arity, and subsequently define multiple instances for this single class. We can then export these type-classes, and developers can add an instance for their own type-level operations, and the solver plugin can then just lookup these instances.

Using some singletons trickery we can define such a type class as:

 – | Class for arithmetic functions with /two/ arguments.
 – The 'Symbol' /f/ must correspond to the fully qualified name of the
 – type-level operation.
class KnownNat2 (f :: Symbol) (a :: Nat) (b :: Nat) where
  type KnownNatF2 f :: Nat ~> Nat ~> Nat
  natSing2 :: SNatKn (KnownNatF2 f @@ a @@ b)

As it says in the comments, the first argument of this class is a type-level Symbol corresponding to the fully qualified name of the type-level operation. We need to do this, because, unlike normal type constructors (.i.e. `Maybe`), type families cannot be partially applied. This means we cannot write a class and corresponding instance of the form:

class KnownNat2 (f :: Nat -> Nat -> Nat) (a :: Nat) (b :: Nat) where
  natSing2 :: SNatKn (f a b)

 – this does not work: we cannot partially apply (+)
instance (KnownNat a, KnownNat b) => KnownNat2 (+) a b where
  natSing2 = SNatKn (natVal (Proxy @a) + natVal (Proxy @b))

So, instead, for the first argument of the `KnownNat2`, we use a `Symbol` that corresponds to the name of the type-level operation. The solver plugin can then use the name of type-level operation to find the corresponding `KnownNat2` instance.

Now, again due to the restriction on the partial application of type families, the associated type family `KnownNatF2` cannot be of kind:

type family KnownNatF2 (f :: Symbol) :: Nat -> Nat -> Nat

Instead, we must use a defunctionalised form of our type-level operations, which are provided by the singletons package:

type family KnownNatF2 (f :: Symbol) :: Nat ~> Nat ~> Nat

Where the difference is that we use `~>` (tilde-right_angle) instead of `->` (dash-right_angle).

We can now define our addition and multiplication instances as:

instance (KnownNat a, KnownNat b) => KnownNat2 "GHC.TypeLits.+" a b where
  type KnownNatF2 "GHC.TypeLits.+" = (:+$)
  natSing2 = SNatKn (natVal (Proxy @a) + natVal (Proxy @b))

instance (KnownNat a, KnownNat b) => KnownNat2 "GHC.TypeLits.*" a b where
  type KnownNatF2 "GHC.TypeLits.*" = (:*$)
  natSing2 = SNatKn (natVal (Proxy @a) * natVal (Proxy @b))

And subtraction as:

instance (KnownNat a, KnownNat b, b <= a) =>
  KnownNat2 "GHC.TypeLits.-" a b where
    type KnownNatF2 "GHC.TypeLits.-" = (:-$)
    natSing2 = SNatKn (natVal (Proxy @a) - natVal (Proxy @b))

which gets an extra constraint that `KnownNat (a – b)` only holds when `b <= a`.

Adding operations to the solver

So now that we have a type-class for constraint-level arithmetic, what steps should a developer take so that her type-level functions are supported by the ghc-typelits-knownnat solver? We will give a short example.


Enable some language extensions

{-# LANGUAGE DataKinds, FlexibleInstances, GADTs, KindSignatures,
             MultiParamTypeClasses, ScopedTypeVariables, TemplateHaskell,
             TypeApplications, TypeFamilies, TypeOperators,
             UndecidableInstances #-}


Import the required modules

import Data.Proxy            (Proxy (..))
import Data.Singletons.TH    (genDefunSymbols)
import GHC.TypeLits.KnownNat
import GHC.TypeLits

import Data.Type.Bool        (If) – used just for this example


Define the type-level operation

 – | Get the maximum of two type-level 'Nat's
type family Max (a :: Nat) (b :: Nat) :: Nat where
  Max 0 b = b
  Max a b = If (a <=? b) b a

One restriction is that such a type-level operation must have at least two equations. This restriction is due GHC treating single-equation type families as type synonyms, which are expanded at the Core level.

That is, had we written our `Max` operation as a single-equation operation:

type family Max (a :: Nat) (b :: Nat) :: Nat where
  Max a b = If (a <=? b) b a

then a `KnownNat (Max x y)` constraint would show up as `KnownNat (If (x <=? y) y x)` inside the solver plugin. And, consequently, the solver wouldn’t be able to look up the `KnownNat2` instance of `Max`.


Use Template Haskell to generate the defunctionalised form of the type-level operation:

$(genDefunSymbols [''Max])

This will create the following definitions:

MaxSym0 :: Nat ~> Nat ~> Nat
MaxSym1 :: Nat -> Nat ~> Nat
MaxSym2 :: Nat -> Nat -> Nat

where we need `MaxSym0`.


And finally specify the `KnownNat2` instance:

instance (KnownNat a, KnownNat b) =>
  KnownNat2 $(nameToSymbol ''Max) a b where
    type KnownNatF2 $(nameToSymbol ''Max) = MaxSym0
    natSing2 = let x = natVal (Proxy @a)
                   y = natVal (Proxy @b)
                   z = max x y
               in  SNatKn z

where we use the `nameToSymbol` (exported by `GHC.TypeLits.KnownNat`) function to converts a Template Haskell `Name` to a `Symbol`.

Version 1.0?

The ghc-typelits-knownnat solver plugin is now at version 0.2; and it supports:

  • All the type-level arithmetic functions of GHC.TypeLits
  • User-defined type-level operations

Which might make you wonder: if it basically supports all type-level operations on `Nat`, why isn’t this version 1.0?

Well, I probably should have called it 1.0, and perhaps I will call it that in a next bugfix release. However, there are still some aspects that would be nice to add/fix:

  • Add a `KnownBool` and `KnownOrdering` class and add support for the comparison operators of GHC.TypeLits.
  • Find a way to index the `KnownNat` classes that is less error-prone than a `Symbol` corresponding to the fully qualified name of the type-level operation.

Solving GHCs KnownNat constraints

If you have ever worked with GHC.TypeLits then you have probably encountered an error that is very similar to:

Test.hs:9:7: error:
    • Could not deduce (KnownNat (n + 2))
        arising from a use of ‘natVal’
      from the context: KnownNat n
        bound by the type signature for:
                   f :: KnownNat n => Proxy n -> Integer
        at Test.hs:7:1-48

In our case, the above error originates in the following file:

{-# LANGUAGE DataKinds, ScopedTypeVariables, TypeOperators #-}
module Test where

import GHC.TypeLits
import Data.Proxy

f :: forall n . KnownNat n => Proxy n -> Integer
f _ = natVal (Proxy :: Proxy n) +
      natVal (Proxy :: Proxy (n+2))

The problem is that, even though we have a `KnownNat n` (a type-level natural `n` which we can reflect to a term-level `Integer`), GHC cannot infer `KnownNat (n+2)`. You would think that, in 2016, given a natural number `n`, an advanced compiler such as GHC should surely be able to infer the value corresponding to the natural number `n + 2`: you simply add 2 to already given `n`. Sadly, GHC can not do this… and, surely, complaints were made.

A type-checker plugin for solving KnownNat constraints

As I’ve blogged about in another post, GHC supports so-called type-checker plugins since version 7.10.1. Type checker plugins let us extend GHC’s constraint solver, i.e., extend the range of programs GHC can type check.

So, given that GHC 7.10.1 was released on March 27 2015, surely somebody must have already written a type-checker plugin that can fix our annoying `KnownNat` issue, right? Well, even if someone did, such a plugin has never been released on hackage, until today that is:

By just adding the following line:

`{-# OPTIONS_GHC -fplugin GHC.TypeLits.KnownNat.Solver #-}`

to our program:

{-# LANGUAGE DataKinds, ScopedTypeVariables, TypeOperators #-}

{-# OPTIONS_GHC -fplugin GHC.TypeLits.KnownNat.Solver #-}

module Test where

import GHC.TypeLits
import Data.Proxy

f :: forall n . KnownNat n => Proxy n -> Integer
f _ = natVal (Proxy :: Proxy n) +
      natVal (Proxy :: Proxy (n+2)

we can finally see that “all is right with the world (of KnownNat)”:

$ ghci -XDataKinds -XTypeApplications Test.hs
GHCi, version 8.0.1:  :? for help
[1 of 1] Compiling Test             ( Test.hs, interpreted )
Ok, modules loaded: Test.
*Test> f (Proxy @2)

Creating this type-checker plugin

Now, there are probably a multitude of reasons as to why a plugin for solving `KnownNat` constraints wasn’t build/released before now; one of the likely reasons being that both the GHC API, and type-checker plugins, are somewhat obscure. Then again, I, the author of this post, have written several type-checker plugins, and I am (at least somewhat) familiar with those parts of the GHC API that are involved with constraint solving. So what are the reasons that I didn’t build such a plugin (much) earlier:

  1. To be honest, I was personally never too bothered with typing in the extra `KnownNat` constraints. That is, until I wanted to simplify the API of one of the libraries that I work on.
  2. I thought it couldn’t be done.

I will now focus on point 2, the reason I thought it couldn’t be done.

Evidence for KnownNat constraints

Basically, what Type checker plugins must do is provide evidence for the constraints GHC wants them to solve; where this evidence takes on the form of EvTerm:

data EvTerm
  = EvId EvId
  – ^ Any sort of evidence Id, including coercions

  | EvCoercion TcCoercion
  – ^ coercion bindings
  – See Note [Coercion evidence terms]

  | EvCast EvTerm TcCoercionR
  – ^ d |> co

  | EvDFunApp DFunId [Type] [EvTerm]
  – ^ Dictionary instance application

  | EvDelayedError Type FastString
  – ^Used with Opt_DeferTypeErrors
  – See Note [Deferring coercion errors to runtime] in TcSimplify

  | EvSuperClass EvTerm Int
  – ^ n'th superclass. Used for both equalities and
  – dictionaries, even though the former have no
  – selector Id.  We count up from _0_

  | EvLit EvLit
  – ^ Dictionary for KnownNat and KnownSymbol classes.
  – Note [KnownNat & KnownSymbol and EvLit]

  | EvCallStack EvCallStack
  – ^ Dictionary for CallStack implicit parameters

  | EvTypeable Type EvTypeable
  – ^ Dictionary for (Typeable ty)

Where, for example, the evidence for a `KnownNat 8` constraint, would be encoded by `EvLit (EvNum 8)`. OK, so what’s the problem with `EvTerm`? Well, let me first explain what we are trying to do:

  • Given e.g. evidence for `KnownNat n` and `KnownNat 2`
  • We want to add the two values of the evidence to create evidence for `KnownNat (n+2)`.

And this is where we run into problems: `EvTerm` does not have a constructor for adding two `KnownNat` evidence terms. Actually, the more general problem is that it doesn’t allow us to create (arbitrary) Core expressions which will perform the addition. This problem is discussed in more detail at:

So, this made me think that it was simply not possible to create a `KnownNat` solver given the current (version 8.0.1) state of the GHC API. And then, yesterday (August 9th, 2016), just before bed, after staring for some time at the `EvTerm` definition, I realised I could use the `EvDFunApp` constructor to do what I want!

Building a dictionary (function)

As some of you might already know, class constraints in Haskell are translated to so-called dictionaries by GHC. A dictionary is record, where the fields correspond to the methods (and super-classes) of the class. So, e.g. for the `Eq` class:

class Eq a where
  (==) :: a -> a -> Bool
  (/=) :: a -> a -> Bool

GHC creates the following dictionary:

data DEq a
  = DEq
  { (==$) :: a -> a -> Bool
  , (/=$) :: a -> a -> Bool

Now, what perhaps fewer people know, is that for instances with class constraints, GHC creates dictionary functions: functions that create dictionaries. i.e. given the instance:

instance (Eq a, Eq b) => Eq (a,b) where
  (==) (a,b) (c,d) = a == c && b == d
  (/=) (a,b) (c,d) = a /= c || b /= d

GHC creates a dictionary function:

dfunEqTup d1 d2 =
  DEq { (==$) = \(a,b) (c,d) -> (&&) ((==) d1 a c)
                                     ((==) d2 b d)
      , (/=$) = \(a,b) (c,d) -> (||) ((/=) d1 a c)
                                     ((/=) d2 b d)

Now that we know what dictionary functions are, we can come back to the `EvDFunApp` constructor of `EvTerm`.
`EvDFunApp` applies dictionaries to a given dictionary function. That is, where dictionary functions are our evidence-level lambdas, `EvDFunApp` is our evidence-level application. This means that all we need to do is figure out how to write an instance that gives us our evidence-level arithmetic operations, and we would be in business.

Evidence-level addition

Below follows the code for our specially crafted class, and corresponding instance, so that we get an _evidence-level_ addition operation:

{-# LANGUAGE AllowAmbiguousTypes   #-}
{-# LANGUAGE DataKinds             #-}
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE KindSignatures        #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE ScopedTypeVariables   #-}
{-# LANGUAGE TypeApplications      #-}
{-# LANGUAGE TypeOperators         #-}

import Data.Proxy
import GHC.TypeLits

newtype SNatKn (n :: Nat) = SNatKn Integer

class KnownNatAdd (a :: Nat) (b :: Nat) where
  natSingAdd :: SNatKn (a + b)

instance (KnownNat a, KnownNat b) => KnownNatAdd a b where
  natSingAdd = SNatKn (natVal (Proxy @a) + natVal (Proxy @b))

For the above code, GHC will generate the desired _dictionary function_ to do _evidence-level_ addition:

dfunKnownNatAdd :: forall a b . DKnownNat a -> DKnownNat b
                -> DKnownNatAdd a b
dfunKnownNatAdd kn1 kn2 =
  DKnownNatAdd { natSingAdd = SNatKn (natVal kn1 (Proxy @a) +
                                      natVal kn2 (Proxy @b))

OK, so we can now build a `DKnownNatAdd` dictionary, however, this only gives us _evidence_ for a `KnownNatAdd` constraint, but not for a `KnownNat` constraint. And this is where another lesser-known aspect of GHCs translation of classes comes in:

  • Single method classes are represented by `newtype` dictionaries in GHC.

And, we can safe cast a `newtype` to (and from) its underlying representation using coercions.
Looking at the code for our evidence-level addition, we can now see that:

  • `KnownNatAdd` is nothing more than a `newtype` wrapper around `SNatKn`
  • `SNatKn` is a newtype wrapper around `Integer`.

And, when we take a look at the (hidden) definition of `KnownNat`:

newtype SNat (n :: Nat) = SNat Integer

class KnownNat (n :: Nat) where
  natSing :: SNat n

we see that `KnownNat` is, ultimately, also nothing more than a `newtype` wrapper around `Integer`. This means we can safely cast our `DKnownNatAdd` dictionary to the `DKnownNat` dictionary (the code for these coercions can be found here):

  1. KnownNatAdd a b ~ SNatKn (a + b)
  2. SNatKn (a + b) ~ Integer
  3. Integer ~ SNat (a + b)
  4. SNat (a + b) ~ KnownNat (a + b)

And that’s it:

  1. We have our `DKnownNatAdd` dictionary function for _evidence-level_ addition.
  2. A way to safely cast a `DKnownNatAdd` dictionary to a `DKnownNat` dictionary, the evidence for `KnownNat` constraints.

Closing thoughts

So now finally, on August 10th 2016, using the ghc-typelits-knownnat type checker plugin, GHC can finally solve “complex” `KnownNat` constraints from other simple, single variable, `KnownNat` constraints. So does that mean it can infer all kinds of “complex” `KnownNat` constraints? Well… no, it can (currently) only infer constraints involving:

  • Type-level natural numbers
  • Type variables
  • Applications of the following type-level operations: `+`, `*`, and `^`.

i.e. it cannot solve constraints involving subtraction (`-`) or user-specified type-level operations on type-level naturals, such as:

import Data.Type.Bool
import GHC.TypeLits

type family Min (x :: Nat) (y :: Nat) :: Nat
    Min x y = If (x <=? y) x

Where the problem with subtraction is that it is a partial function on natural numbers. Let’s say someone writes:

g :: forall n . KnownNat n => Proxy n -> Integer
g _ = natVal (Proxy :: Proxy n) +
      natVal (Proxy :: Proxy (n-4))

then we should not just automatically solve the `KnownNat (n-4)` constraint from the given `KnownNat n` constraint, because for `0 <= n < 4`, `n-4` is not a natural number. In this case, we should only automatically infer `KnownNat (n-4)` when an additional `4 <= n` constraint is given:

g0 :: forall n . (KnownNat n, 4 <= n) => Proxy n -> Integer
g0 _ = natVal (Proxy :: Proxy n) +
       natVal (Proxy :: Proxy (n-4))

I think I know how to check for these additional constraints, so a future version of the plugin will automatically solve `KnownNat` constraints involving subtractions. For user-defined type-level natural operations we are currently out of luck: unlike for the standard operations in `GHC.TypeLits`, we cannot encode the corresponding dictionary functions beforehand. Perhaps, once arbitrary Core expressions can be used as evidence (discussed here), we can reflect type-level operations to evidence-level operations, and then also automatically solve `KnownNat` constraints over user-defined type-level operations.

CλaSH FPGA Starter

This version of the tutorial is outdated, please follow the new version.

In this tutorial we’ll be programming the Terasic DE0-Nano FPGA development board using the functional hardware description language CλaSH. The end result of this tutorial is demonstrated in the video below:

Watch this video on YouTube.

This tutorial is not a general introduction to CλaSH, nor to programming FPGAs. It is meant to demonstrate how to use of `TopEntity` annotations (added in version `0.5.5` of CλaSH) to configure your CλaSH designs for an FPGA, without writing a single line of VHDL or (System)Verilog. Even then, this tutorial is already almost too long for single blog post, but here goes:

This tutorial used to be specified in terms of `.topentity` files, a method which is now superseded by `TopEntity` annotions. You can still see the old version on github.

Blinker circuit

We start with some general information about the DE0-Nano board:

  • It has a 50 MHz crystal connected to a clock pin of FPGA, which we will connect to one of the PLLs that is on the FPGA to create a stable 50 MHz clock signal.
  • It has 8 green LEDs that turn on when we put a `1` on the FPGA pin connected to the LED.
  • It has two buttons which are already properly debounced by a Schmitt trigger. The buttons are `0` when they are pressed, and `1` when they are not.

The circuit that we are making will repeatedly do one of two things:

  • Invert the state of the LEDs
  • Rotate the state of the LEDs

We switch between these two modes when the `KEY1` button on the board is pressed and subsequently released. We reset the circuit by pressing the `KEY0` button.

CλaSH circuit specification

The `Blinker.hs` file given below implements the behaviour of circuit (sans reset behaviour) we just described. As the clock is running at `50 MHz`, we shouldn’t update the state of the LEDs every clock cycle, as they will just seem to to all be turned on all the time to the human eye. We hence use a counter that counts up to a certain amount of clock cycles, and only update the state of the LEDs when the desired number of cycles is reached and the count is reset.


module Blinker where

import CLaSH.Prelude

{-# ANN topEntity
    { t_name     = "blinker"
    , t_inputs   = ["KEY1"]
    , t_outputs  = ["LED"]
    , t_extraIn  = [ ("CLOCK_50", 1)
                   , ("KEY0"    , 1)
    , t_clocks   = [ (altpll "altpll50"
                             "not KEY0(0)")
    }) #-}
topEntity :: Signal Bit -> Signal (BitVector 8)
topEntity key1 = leds
    key1R = isRising 1 key1
    leds  = mealy blinkerT (1,False,0) key1R

blinkerT (leds,mode,cntr) key1R = ((leds',mode',cntr'),leds)
    – clock frequency = 50e6   (50 MHz)
    – led update rate = 333e-3 (every 333ms)
    cnt_max = 16650000 – 50e6 * 333e-3

    cntr' | cntr == cnt_max = 0
          | otherwise       = cntr + 1

    mode' | key1R     = not mode
          | otherwise = mode

    leds' | cntr == 0 = if mode then complement leds
                                else rotateL leds 1
          | otherwise = leds

`TopEntity` annotations

Now that we’ve created our CλaSH design, it’s time to move on to the important part of this tutorial, elaboration of the `TopEntity` annotation. The TopEntity is a Haskell data type that guides the CλaSH compiler in generating port names and setting up clocks. These annotations are applied using the `ANN` pragma:

`{-# ANN foo (TopEntity {t_name = .., .., ..}) #-}`

In our example, we extend the minimalist defTop annotation with:

  • `t_name`: the name our component should have. In our case: `blinker`
  • `t_inputs`: a list of names our inputs should have. In our case: `KEY1`
  • `t_outputs`: a list of names our outputs should have. In our case: `LED`
  • `t_extraIn`: a list of extra inputs that do not correspond to the arguments of our `topEntity` function.
    These extra inputs are always bit vectors.
    Every item in the `extra_in` list is a tuple (encoded as a 2-element list) of a name, and the number of bits for that input.
    In our case we add an extra 1-bit input `CLOCK_50` which will correspond to the pin to which the 50MHz crystal is attached, and a 1-bit input `KEY0` which is the button we will use as a reset.
  • `t_clocks`: a list of clock sources.

We create a single clock source by instantiating the _default_ template for the Altera PPL component altpll for the Cyclone IV. The first argument of this function is the name, the second an expression corresponding to the clock pin to connect, and the third the expression corresponding to the reset pin to connect. So to elaborate the arguments in order:

  • `name`: the name of the component generating our clock. In our case `”altpll50″`, an instantiated PLL component we will create later on in this tutorial.
  • `clock port`: Now, I lied a bit that we didn’t have to write any VHDL or SystemVerilog. As you can see, we connect the clock port to the expression `CLOCK_50(0)`, which is a VHDL expression. The `altpll50`s `clock port` expects a single bit, but our `CLOCK_50` port is a 1-bit vector. Hence we need to select the first (and only) bit: `”CLOCK_50(0)”`.
  • `reset port`: Again, we have to write some VHDL: `not KEY0(0)`. Remember that our extra inputs are bit vectors, but the `reset port` expects a single bit. Hence we need to select the first (and only) bit: `KEY0(0)`. Additionally, remember that the keys are `0` when they are pressed, but the reset port expects an active-high signal. Hence we need to negate the signal: `”not KEY0(0)”`.

VHDL generation

Now it’s time to generate some VHDL, yay! Make sure that the `Blinker.hs` file is in your current working directory, and then run the CλaSH compiler to generate VHDL:

`clash – vhdl Blinker.hs`

This will create a `./vhdl/Blinker/` directory, of which, for this tutorial, the most interesting file is within that directory is:


 – Automatically generated VHDL
library IEEE;
use work.all;
use work.types.all;

entity blinker is
  port(KEY1     : in std_logic_vector(0 downto 0);
       CLOCK_50 : in std_logic_vector(0 downto 0);
       KEY0     : in std_logic_vector(0 downto 0);
       LED      : out std_logic_vector(7 downto 0));

architecture structural of blinker is
  signal system1000      : std_logic;
  signal system1000_rstn : std_logic;
  signal altpll50_locked : std_logic;
  altpll50_inst : entity altpll50
    port map
      (inclk0 => CLOCK_50(0)
      ,c0     => system1000
      ,areset => not KEY0(0)
      ,locked => altpll50_locked);

  – reset system1000_rstn is asynchronously asserted,
  – but synchronously de-asserted
  resetSync_n_0 : block
    signal n_1 : std_logic;
    signal n_2 : std_logic;
      if altpll50_locked = '0' then
        n_1 <= '0';
        n_2 <= '0';
      elsif rising_edge(system1000) then
        n_1 <= '1';
        n_2 <= n_1;
      end if;
    end process;

    system1000_rstn <= n_2;
  end block;

  topEntity_0_inst : entity topEntity_0
    port map
      (key1_i1         => KEY1
      ,system1000      => system1000
      ,system1000_rstn => system1000_rstn
      ,topLet_o        => LED);

Port/Pin mappings

It’s almost time to start programming our FPGA. But first we need to setup how the component ports are connected to the right pins on the FPGA.
For this we will use the `DE0_Nano_SystemBuilder` which you can find on the DE0-Nano CD-ROM. This will create both a pin/port map file, and a Quartus II project file. Note that I work on OS X, so both `DE0_Nano_SystemBuilder` and `Quartus II` are running inside a VirtualBox VM running Windows 8.


In `DE0_Nano_SystemBuilder`, call the project `blinker`, and ensure that only `CLOCK`, `Button x 2`, and `LED x 8` are selected.
Then press the `Generate` button, this will open a `save` dialog window: select a location where you want to create the project and hit the `save` button.
You can now close `DE0_Nano_SystemBuilder`.

In the directory to which you saved the project you will find the files:

  • `blinker.htm`
  • `blinker.qpf`
  • `blinker.qsf`
  • `blinker.sdc`
  • `blinker.v`

You can delete `blinker.v` as we will be using our `blinker.vhdl`. The port/pin mappings that are generated by `DE0_Nano_SystemBuilder` do not completely correspond to our port names. We will hence need to update the mapping file `blinker.qsf`:

  • Replace every occurrence of `CLOCK_50` by `CLOCK_50[0]`
  • Replace every occurrence of `KEY[0]` by `KEY0[0]`
  • Replace every occurrence of `KEY[1]` by `KEY1[0]`

The final `blinker.qsf` file will look like:

# Build by Terasic System Builder

set_global_assignment -name FAMILY "Cyclone IV E"
set_global_assignment -name DEVICE EP4CE22F17C6
set_global_assignment -name TOP_LEVEL_ENTITY "blinker"
set_global_assignment -name ORIGINAL_QUARTUS_VERSION "12.0"
set_global_assignment -name LAST_QUARTUS_VERSION "12.0"
set_global_assignment -name PROJECT_CREATION_TIME_DATE "12:43:02 MAY 01,2015"
set_global_assignment -name DEVICE_FILTER_PACKAGE FBGA
set_global_assignment -name DEVICE_FILTER_PIN_COUNT 256
set_global_assignment -name DEVICE_FILTER_SPEED_GRADE 6
set_global_assignment -name SDC_FILE blinker.SDC

set_location_assignment PIN_R8 -to CLOCK_50[0]
set_instance_assignment -name IO_STANDARD "3.3-V LVTTL" -to CLOCK_50[0]

set_location_assignment PIN_A15 -to LED[0]
set_instance_assignment -name IO_STANDARD "3.3-V LVTTL" -to LED[0]
set_location_assignment PIN_A13 -to LED[1]
set_instance_assignment -name IO_STANDARD "3.3-V LVTTL" -to LED[1]
set_location_assignment PIN_B13 -to LED[2]
set_instance_assignment -name IO_STANDARD "3.3-V LVTTL" -to LED[2]
set_location_assignment PIN_A11 -to LED[3]
set_instance_assignment -name IO_STANDARD "3.3-V LVTTL" -to LED[3]
set_location_assignment PIN_D1 -to LED[4]
set_instance_assignment -name IO_STANDARD "3.3-V LVTTL" -to LED[4]
set_location_assignment PIN_F3 -to LED[5]
set_instance_assignment -name IO_STANDARD "3.3-V LVTTL" -to LED[5]
set_location_assignment PIN_B1 -to LED[6]
set_instance_assignment -name IO_STANDARD "3.3-V LVTTL" -to LED[6]
set_location_assignment PIN_L3 -to LED[7]
set_instance_assignment -name IO_STANDARD "3.3-V LVTTL" -to LED[7]

set_location_assignment PIN_J15 -to KEY0[0]
set_instance_assignment -name IO_STANDARD "3.3-V LVTTL" -to KEY0[0]
set_location_assignment PIN_E1 -to KEY1[0]
set_instance_assignment -name IO_STANDARD "3.3-V LVTTL" -to KEY1[0]

# End of pin assignments by Terasic System Builder

Also, since we will be using the `PLL` for clocks, you can replace the entire contents of `blinker.sdc` by:


Quartus II

We now open the `blinker.qpf` file in `Quartus II`. For this tutorial, I will be using version `14.1` of the `Quartus II` software.
The first thing we will do is add the VHDL files generated by CλaSH to our design. In the menu bar, go to, `Project -> Add/Remove Files in Project…`, press the `…` button next to `File name`, in the load dialog, select all the files generated by clash, and press `Open`. Then press `OK` to close the `Settings` dialog.


Next we will create a `PLL` component. On the right of the `Quartus II` main window you will see the `IP Catalog`. Go to `Installed IP > Library > Basic Functions > Clocks; PLLs and Resets > PLL`, and double click on `ALTPLL`.


This opens a dialog called `Save IP Variation`, at the end of the line enter `altpll50` and press `OK`.


This opens the `MegaWizard` dialog for the PLL. Change `What is the frequency of the inclk0 input?` to `50.000`.


Then press `Finish` twice. This closes the `MegaWizard` dialog, and opens a new dialog asking if you want to add this IP block to your project. We want this, so select ‘Yes’.

Synthesis time

Finally, we’re finished with configuration, and we can start creating the configuration file for the FPGA. In the menu bar, click: ‘Processing -> Start Compilation’. In my extremely slow VM this compilation/synthesis process will take a while, if but might be much faster on your machine.

Programming time

After synthesis has finished, it is time to program our FPGA board. Connect the FPGA board to a USB port, and start the programmer from the menu bar: `Tools -> Programmer`. Press the `Start` button on the left to program your FPGA and wait until the progress bar says `100% (Successful)`.


Time to play

Your FPGA should now be fully configured with the CλaSH generated design for you to play with. So yeah, go press those buttons!

This ends the tutorial on the use of `TopEntity` annotations to program your FPGA devices with CλaSH designs. I hope you got the gist of it. If/When I get my hands on a Xilinx FPGA board, I hope to make an alternate version of this tutorial, but then use Xilinx’ ISE/Vivado tool-flow instead.

GHC type checker plugins: adding new type-level operations

Since version 7.10.2, GHC supports so-called type-checker plugins which let us extend GHC’s constraint solver, i.e. extend the the range of programs GHC can type check. Several plugins have already been released, the ones I know are:

  • Adam Gundry’s uom-plugin: for supporting units of measure.
  • Iavor Diatchki’s type-nat-solver: which allows you to hook up SMT solvers to solve numeric constraints.
  • My own plugins:
    • ghc-typelits-natnormalise: which solves numeric constraints using a custom normalisation procedure as opposed to Iavor’s approach, which uses SMT solvers.
    • ghc-typelits-extra: which adds a type-level greatest common denominator (GCD) and (ceiling of) logarithm operator for GHCs type-level `Nat`ural numbers.

This post will be about writing our own type-checker plugin: we will build a plugin that (just like ghc-typelits-extra) allows GHC to perform GCD on types of kind Nat.

Why a type-checker plugin?

So the first question that might pop up: why do we even need to write a type-checker plugin to support GCD over `Nat`? The “problem” is that `Nat` is _not_ inductively defined like so:

data Nat = Z | S Nat

, however, it is simply defined as an `Integer` (see CoreSyn.hs).

For an inductively defined `Nat`s, it is fairly simple to write a (closed) type family symbolising an (arithmetic) operation, e.g.:

{-# LANGUAGE DataKinds, TypeFamilies #-}

import Data.Proxy

data Nat = Z | S Nat

type family Add (x :: Nat) (y :: Nat) :: Nat where
Add Z b = b
Add (S a) b = S (Add a b)

test :: Proxy (S (S (S (S Z)))) -> Proxy (Add (S (S Z)) (S (S Z)))
test = id

However, an inductively defined `Nat`s is not what GHC gives us, GHC gives us `Integer`s. Having only `Integer`s, we might try to start with:

{-# LANGUAGE DataKinds, TypeFamilies #-}

import GHC.TypeLits

type family Add (x :: Nat) (y :: Nat) :: Nat where
Add 0 b = b
Add 1 b = ???

where we have no real way to define the part indicated by “???”. Indeed, all the currently defined type-level operations on types of kind `Nat` as defined in GHC.TypeLits, are all implemented by compiler magic. Using type-checker plugins, that kind of magic is, however, no longer just reserved for the wizards over at GHC HQ!

As to why GHCs native type-level natural numbers are implemented using `Integer`, instead of an inductively defined datatype, I can only guess:

  • Unary encodings (such as the one seen earlier) are “expensive”, in terms of memory, for large numbers. I’m sure there must be a way to internally use a more efficient representation for what is, from the user’s point of view, a unary encoding. To my my knowledge, no efforts have been made to explore this option in GHC.
  • Base-2 encodings, though low-cost in terms of memory, have harder to define arithmetic operations.

With the advent of type-checker plugins, the `Integer`-representation aspect of `Nat` has not been a problem any longer for most of my projects.

@Bodigrim on reddit showed that we can just implement GCD with the tools already in `GHC.TypeLits`:

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}

import Data.Proxy
import GHC.TypeLits

type family Cmp123 (o :: Ordering) (x :: Nat) (y :: Nat) (z :: Nat) :: Nat where
Cmp123 LT x y z = x
Cmp123 EQ x y z = y
Cmp123 GT x y z = z

type family GCD (x :: Nat) (y :: Nat) :: Nat where
GCD 0 x = x
GCD x 0 = x
GCD x y = Cmp123 (CmpNat x y) (GCD x (y - x)) x (GCD (x - y) y)

test :: Proxy (GCD 372 48) -> Proxy 12
test = id

I guess the lesson learned here is that type-checker plugins should always be a last resort measure.

Type-level GCD

OK, enough about why we need to implement GCD using a type-checker plugin. Let’s get on with implementing it. Before we implement the plugin, we first need to define our GCD operation:

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}
module GHC.TypeLits.GCD
( GCD )

import GHC.TypeLits (Nat)

type family GCD (x :: Nat) (y :: Nat) :: Nat where
GCD 0 x = x

The reason I’ve chosen a closed type-family (as opposed to an open one) is that I don’t want our users to pull any shenanigans such as adding additional bogus equations. For GHC 7.10 we need to add at least one equation (see GHC issue #9840) to our closed type family; this should no longer be needed for GHC 8+. Although an empty closed type family is possible in GHC 8+, the current single equation version simplifies the design of the type-checker plugin, as we will see at the end of this blog post.

Type-checker plugin

I guess the reason there are not too many blog posts on GHC type checker plugins is because you need to be sort-of familiar with the GHC API. I’ve been using the GHC API for over 6 years now, so I’ve taught myself how to browse its documentation (the comments in the source code); however, it’s lack of haddock documentation makes it somewhat intimidating for the uninitiated. I try to keep the GHC API code to a minimum, but it can sadly not be avoided.

We first start with some imports:

{-# LANGUAGE CPP, TupleSections #-}
module GHC.TypeLits.GCD.Solver
( plugin )

 – external
import Control.Arrow ((***))
import Data.List (partition)
import Data.Maybe (mapMaybe)
import GHC.TcPluginM.Extra (evByFiat, lookupModule, lookupName)

import FastString (fsLit)
import Module (mkModuleName)
import OccName (mkTcOcc)
import Plugins (Plugin (..), defaultPlugin)
import TcEvidence (EvTerm)
import TcPluginM (TcPluginM, tcLookupTyCon)
import TcRnTypes (Ct, TcPlugin(..), TcPluginResult (..),
ctEvidence, ctEvPred)
import TyCon (TyCon)
import Type (EqRel (NomEq), PredTree (EqPred),
#if __GLASGOW_HASKELL__ >= 711
import TyCoRep (Type (..), TyLit (..))
import TypeRep (Type (..), TyLit (..))

Aside from the GHC API, and some Prelude modules, we also import some functions from my ghc-tcplugins-extra package, `GHC.TcPluginM.Extra`: it provides some type-checker plugin utility functions with the intention to provide a stable API spanning multiple GHC releases. As those of us who use the GHC API often know, the GHC API is fairly unstable, hence my personal need for ghc-tcplugins-extra. Currently, ghc-tcplugins-extra supports GHC 7.10 and GHC 8.0.

Next up, setting up the plugin:

plugin :: Plugin
plugin = defaultPlugin { tcPlugin = const (Just gcdPlugin) }

gcdPlugin :: TcPlugin
gcdPlugin =
TcPlugin { tcPluginInit = lookupGCDTyCon
, tcPluginSolve = solveGCD
, tcPluginStop = const (return ())

We use `defaultPlugin` to create a plugin that does nothing at all, and add our own `gcdPlugin` type-checker plugin. Plugin’s can be given command line options, but we just ignore them.

The `TcPlugin` record has three fields:

  • * `tcPluginInit`: Initialises the state used by our type-checker plugin. We will use it to look up the `TyCon` (type constructor) of the `GCD` type family. One can also perform IO operations in `tcPluginInit`, and have the state contain `IORef`s.
    * `tcPluginSolve`: The main solver function, this is were we will actually perform the _gcd_ calculation.
    * `tcPluginStop`: Cleaning up of the state. We have no use for it, but it can for example be used to remove files that are used as a communication medium with an SMT solver.

Our type-checker plugin solves equations involving our `GCD` type family, so we need to know that the types that we are dealing with actually involve `GCD`. To do that, we need GHC’s internal representation of `GCD`, which is a TyCon. We do the lookup of our `GCD` TyCon in the initialisation of our type-checker plugin, so that it becomes our plugin’s internal state:

lookupGCDTyCon :: TcPluginM TyCon
lookupGCDTyCon = do
    md      <- lookupModule gcdModule gcdPackage
    gcdTcNm <- lookupName md (mkTcOcc "GCD")
    tcLookupTyCon gcdTcNm
    gcdModule  = mkModuleName "GHC.TypeLits.GCD"
    gcdPackage = fsLit "ghc-typelits-gcd"

The name of our package is going to be “ghc-typelits-gcd”, and the module in which we defined `GCD` is “GHC.TypeLits.GCD”. In the above code, we first look up the `GHC.TypeLits.GCD` module in the `ghc-typelits-gcd` package, we then look up the full `Name` of the thing we call “GCD”, and finally look up the actual `GCD` TyCon.

Actually solving something

So we’ve done all the setting up for our type-checker plugin, and are now ready to implement the business end of it. But what does a type-checker plugin actually do? Well, it solves Constraints, the types that in our Haskell code usually appear on the left of the `=>` arrow.

In our case, we will be solving equality constraints, that is, given the following haskell code:

f :: Proxy (GCD 6 8) -> Proxy 2
f = id

we have to say whether the equality `GCD 6 8 ~ 2` can be proven. So let’s start with the type signature of our solver:

solveGCD :: TyCon – ^ GCD's TyCon
         -> [Ct]  – ^ [G]iven constraints
         -> [Ct]  – ^ [D]erived constraints
         -> [Ct]  – ^ [W]anted constraints
         -> TcPluginM TcPluginResult

The first argument is the plugin’s state, which in our case is just the (immutable) `TyCon` of our `GCD` type family. Next come three categories of constraints that GHC’s solver pipeline is giving to us:

  • First are the given constraints, these are the proven facts about types available to the compiler.
  • Second are the derived constraints… to be honest, I never really understood what they do, apparently they help with improving error messages, but I never investigated them in depth.
  • And finally, there are the wanted constraints.

These are the constraints that the compiler wants us to either:

  • Prove, so that the valid program type checks, or,
  • Disprove, so that it can give a good error message.

That is, unsolved constraints also result in errors, but constraints that are outright rejected result in better error messages. The result of our plugin is:

data TcPluginResult
  = TcPluginContradiction [Ct]
  | TcPluginOk [(EvTerm,Ct)] [Ct]

where we can either return `TcPluginContradictions cts`, where `cts` are all the wanted constraints that we deemed insolvable (think `2 + 3 ~ 4` for ordinary natural numbers). Or we can return `TcPluginOk proven newWanted`, where the `proven` are all the wanted constraints (`Ct`) that we solved together with with their proofs (`EvTerm`), and `newWanted` are all the new wanted constraints that must additionally be solved.

An example of the latter would be: given the constraint `x + GCD 6 8 ~ 2 + x`, my [ghc-typelits-natnormalise]( plugin knows that addition is commutative, but doesn’t know anything about `GCD`, so it would return: `TcPluginOK [(“Believe me”,x + GCD 6 8 ~ 2 + x)] [GCD 6 8 ~ 2]`. The new wanted constraint, `GCD 6 8 ~ 2`, would then be picked up by the plugin that we are currently describing in this blog post.

Next, the body of our `solveGCD` function:

solveGCD _     _ _ []      = return (TcPluginOk [] [])
solveGCD gcdTc _ _ wanteds = return $! case failed of
  [] -> TcPluginOk (mapMaybe (\c -> (,c) <$> evMagic c) solved) []
  f  -> TcPluginContradiction f
    gcdWanteds :: [(Ct,(Integer,Integer))]
    gcdWanteds = mapMaybe (toGCDEquality gcdTc) wanteds

    solved, failed :: [Ct]
    (solved,failed) = (map fst *** map fst)
                    $ partition (uncurry (==) . snd) gcdWanteds

The first clause basically says that, if there are no wanted constraints for us to solve we are done. The thing is, our type-checker plugin is used inside two phases of GHC’s solver pipeline. In the first phase, only given (and probably derived) constraints are passed through the pipeline. A type-checker plugin could then extend GHCs knowledge about the constraints by creating more given constraints, which could then help in solving future wanted constraints. In the second phase, we’ll be constantly given new wanted constraints that pop up during iterations of the solver, until all wanted constraints have passed trough the pipeline at least once.

This brings us to the second clause, which is the clause that handles this phase 2 part. It gets all the wanted constraints (involving `GCD`) for which there is some hope that we actually can solve them; these interesting wanted constraints are `gcdWanteds`. As we will see later on, the solving part basically consists of trying to reduce both sides of the equality relation to a number. So given the equality `GCD 6 8 ~ 2`, we reduce it to `2 ~ 2`. Consequently, all the `solved` constraints are the constraints where both sides of the equality reduced to the same number, and `failed` constraints are the constraints where the sides of the equality reduced to a different number.

Finally, when there are no `failed` constraints, we return all our solved constraints, together with a proof that the constraints hold; we don’t return any new wanted constraints. The proofs for the solved constraints are constructed by `evMagic` (which we will discuss later). If there are any failed constraints, we return those in a `TcPluginContradiction`.

Next comes the code by which we find the constraints that are interesting for our plugin:

toGCDEquality :: TyCon -> Ct -> Maybe (Ct,(Integer,Integer))
toGCDEquality gcdTc ct =
  case classifyPredType $ ctEvPred $ ctEvidence ct of
    EqPred NomEq t1 t2
      -> (ct,) <$> ((,) <$> reduceGCD gcdTc t1
                        <*> reduceGCD gcdTc t2)
    _ -> Nothing

reduceGCD :: TyCon -> Type -> Maybe Integer
reduceGCD gcdTc = go
    go (LitTy (NumTyLit i)) = Just i
    go (TyConApp tc [x,y])
      | tc == gcdTc = gcd <$> go x <*> go y
    go _ = Nothing

The `toGCDEquality` first looks at what type of constraint we are dealing with. We are only interested in the equality constraints; other constraints are things like class constraints and implicit parameters. If the constraint is an equality constraint (`EqPred`), we see if we can reduce both sides of the equality to a number using `reduceGCD`.

Finally, `reduceGCD` is the function that applies the actual `gcd` function! If `reduceGCD` finds a type-level natural, it’s done. If it finds an application of our `GCD` type family, it first tries to reduce the argument, and subsequently applies the `gcd` function. If it’s not a type-level natural, or the application `GCD` type family, we give up.

Burden of proof

The last part of our plugin is the evidence generation: the actual proof that the wanted equality relation holds. This is sadly the part where we take the easy way out (the easy way is often pragmatic :-)). We use the `evByFiat` function from ghc-tcplugins-extra, which, as its name implies, gives a proof by fiat. The `evByFiat` function basically uses the same “proof”, as the one that `unsafeCoerce` uses internally in GHC.

evMagic :: Ct -> Maybe EvTerm
evMagic ct = case classifyPredType $ ctEvPred $ ctEvidence ct of
    EqPred NomEq t1 t2 -> Just (evByFiat "ghc-typelits-gcd" t1 t2)
    _                  -> Nothing

Perhaps these “proofs” by fiat are OK for the tiny type-checker plugin we have written here, perhaps it is not. Looking at the Agda code for GCD it seems possible to provide a real proof for GCD. Then again, Agda’s proof uses inductively defined natural numbers, which we didn’t have to begin with.

In the end, I use Haskell for actually building programs that run, or actually, building programs that get translated to digital circuits; and not to prove things about programs. I’m fairly confident that the plugin we have written does not allow for incorrect programs to be type-checked (I know, assumptions are the mother of all …), so I’m personally OK with these by fiat “proofs”.

Wrapping up

The complete code for this blog can be found on It contains the following, very minor, test suite:

test1 :: Proxy (GCD 6 8) -> Proxy 2
test1 = id

test2 :: Proxy (GCD 0 x) -> Proxy x
test2 = id

which, using our type-checker plugin, passes the type-checker, YAY!

When you start writing “real” type-checker plugins, you’ll have to think _at least_ about the following aspects:

  • How am I going to let given and wanted constraints interact?
  • How do I ensure that I do not keep on generating new wanted constraints every time my plugin gets called?
    Because if you keep on generating new constraints, the GHCs solver will never finish, and compiling just becomes an exercise in watching a blinking cursor for all eternity.
  • How do I interact with other type-checker plugins?
    Remember where I said that: given the constraint `x + GCD 6 8 ~ 2 + x`, our ghc-typelits-natnormalise plugin knows that addition is commutative, but doesn’t know anything about `GCD`, so it would return: `TcPluginOK [(“Believe me”,x + GCD 6 8 ~ 2 + x)] [GCD 6 8 ~ 2]`. And the the GCD plugin would do the rest. That was actually a white lie! The evidence it generates is not our simple `evByFiat` (i.e. “Believe me”), but a _conditional_ evidence. We need it to be conditional, because if `GCD 6 8 ~ 2` turned out not to be true, then `x + GCD 6 8 ~ 2 + x` isn’t true either. I used to solve this problem with tricks/hacks, but GHC 8+ seems to allow me to do this in a more sanctioned way.

Remember where I said it was easier to have the single equation definition of the `GCD` type family as opposed to the zero equation definition. Our test case, `test2`, is the proof of that: our solver, as it is currently defined, is completely unable to handle `test2`. Our current solver only works when both arguments of `GCD` reduce to a natural number, and gives up otherwise.

Indeed, this blog post was written with the intention of demonstrating a “working” prototype of a type-checker plugin with a “minimal” amount of code, and I think it serves that purpose well. Given that there isn’t much material on writing type-checker plugins out there, I understand there aren’t too many type-checker plugins in the wild. I hope this blog post serves as that final push for some of you to start writing your own type-checker plugins and further explore its design space.

Relocatable Cabal sandboxes

Cabal 1.22 has preliminary support for relocatable packages, which is actually undocumented because I’m a bad patch writer.

Regardless, this preliminary support for relocatable packages partially enables relocatable sandboxes. What’s this partial part you might ask? Well, only the `.cabal-sandbox` directory is relocatable, not the sandbox config file. So basically the part that takes the longest time to build is relocatable. So, how do you go about the creating such a relocatable `.cabal-sandbox `directory? easy:

  • `cd <project_dir>`
  • `cabal sandbox init`
  • cabal install – dependencies-only – enable-relocatable

The created `.cabal-sandbox` directory is relocatable anywhere on the machine, and across machines if ghc is installed in the same directory. Even dynamically linked libraries will work on Linux and OS X because the libraries will use relative RPATHs.

All that’s left to having truly relocatable sandboxes, is to make the .config files also relocatable. And I should write some documentation for the next Cabal release.

An FPGA design house delivering "right the first time" solutions.

Design and realisation by:
Comyoo | creatieve studio


Institutenweg 25A
7521 PH Enschede
+31 (0)85 8000 380