Disclaimer: I made these categories up, and I'm excluding languages like Cryptol and Idris that only incidentally involve Haskell.
This compiles Haskell code to run directly on an embedded target. This requires:
Ajhc (https://github.com/ajhc/ajhc), a JHC-derived compiler from Kiwamu Okabe of METASEPI, is the only example of this I found - it could compile and execute on ARM Cortex-M3/M4. His subsequent switch to the ATS language may be a hint.
This uses an existing compiler for certain stages (such as the parsing and type-checking), but a custom back-end to actually produce code. This may adapt or disallow certain constructs.
GHC readily accommodates this by allowing developers to invoke GHC functionality, from Haskell, as a library. (GHCJS, a Haskell-to-JavaScript compiler, uses this.)
CλaSH (http://www.clash-lang.org/) from Christiaan Baaij uses this to compile a subset of Haskell to VHDL and SystemVerilog. CλaSH disallows certain things: recursive functions, recursive types, side effects, floating-point...
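To give a feel for what code inside such a restricted subset looks like, here is a minimal sketch in plain Haskell. It deliberately uses only `base` and none of CλaSH's own API; the names `accumulate` and `simulateAcc` are made up for illustration. The circuit is written as a non-recursive, side-effect-free transition function over a fixed-width integer, and the list traversal is only a host-side stand-in for the clock.

```haskell
module Main where

import Data.List (mapAccumL)
import Data.Word (Word8)

-- Transition function in Mealy-machine style:
-- current state and input give (next state, output).
-- No recursion, no side effects, no floating point; only an 8-bit integer.
accumulate :: Word8 -> Word8 -> (Word8, Word8)
accumulate acc x = (acc', acc')
  where
    acc' = acc + x -- Word8 arithmetic wraps modulo 256, like an 8-bit register

-- Host-side simulation only (hypothetical helper): feed one input per "cycle"
-- and collect the outputs. Real hardware would have a clock instead of a list.
simulateAcc :: [Word8] -> [Word8]
simulateAcc = snd . mapAccumL accumulate 0

main :: IO ()
main = print (simulateAcc [1, 2, 3, 250])
```

All state lives in the explicit `acc` argument, which is what lets a hardware back-end map it to a register rather than needing a heap or a stack.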
Reduceron (https://github.com/tommythorn/Reduceron) is an "FPGA Haskell machine" relying on massively-parallel graph reduction, complete with GC and lazy evaluation.
Conal Elliott worked with a Silicon Valley startup, Tabula, on massively-parallel execution of Haskell code on a new architecture (Spacetime), using an approach based on Cartesian Closed Categories (http://conal.net/blog/posts/haskell-to-hardware-via-cccs & https://github.com/conal/lambda-ccc/).
This uses an EDSL (embedded domain-specific language) inside Haskell to direct the generation of code in a lower-level representation. (Otherwise called: compiling.)
Note that in this case, Haskell code never actually runs on the embedded target. Rather, it uses specifications in the EDSL to build a representation of what will run there - in other words, a sort of metaprogramming.
The code that runs on the target is entirely decoupled from the Haskell runtime.
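As a rough illustration of this kind of metaprogramming, here is a toy EDSL that builds an expression tree on the host and prints it as C source. It is a sketch under stated assumptions, not the API of any real library: `Expr`, `emitExpr`, `emitFunction`, and `scale` are all hypothetical names, and the generated C assumes plain `int` arithmetic.

```haskell
module Main where

import Data.List (intercalate)

-- Abstract syntax of the toy EDSL. Haskell only builds this tree on the host;
-- nothing here ever runs on the embedded target.
data Expr
  = Lit Int
  | Var String
  | Add Expr Expr
  | Mul Expr Expr

-- A Num instance lets EDSL terms be written with ordinary arithmetic syntax.
instance Num Expr where
  fromInteger = Lit . fromInteger
  (+) = Add
  (*) = Mul
  negate e = Mul (Lit (-1)) e
  abs = error "abs: not supported in this sketch"
  signum = error "signum: not supported in this sketch"

-- Render an expression as a fully parenthesised C expression.
emitExpr :: Expr -> String
emitExpr (Lit n) = show n
emitExpr (Var v) = v
emitExpr (Add a b) = "(" ++ emitExpr a ++ " + " ++ emitExpr b ++ ")"
emitExpr (Mul a b) = "(" ++ emitExpr a ++ " * " ++ emitExpr b ++ ")"

-- Wrap an expression in a C function taking int parameters.
emitFunction :: String -> [String] -> Expr -> String
emitFunction name params body =
  unlines
    [ "int " ++ name ++ "(" ++ intercalate ", " (map ("int " ++) params) ++ ") {"
    , "    return " ++ emitExpr body ++ ";"
    , "}"
    ]

-- An EDSL "program": it looks like Haskell arithmetic but is really a tree.
scale :: Expr
scale = Var "x" * 3 + Var "offset"

main :: IO ()
main = putStr (emitFunction "scale" ["x", "offset"] scale)
```

Running `main` prints a small C function; only that text would be handed to the target's C compiler, which is why no Haskell runtime is needed on the device.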
The official definition: "Atom is a Haskell EDSL for designing hard realtime embedded software. Based on guarded atomic actions (similar to STM), Atom enables highly concurrent programming without the need for mutex locking. In addition, Atom performs compile-time task scheduling and generates code with deterministic execution time and constant memory use, simplifying the process of timing verification and memory consumption in hard realtime applications. Without mutex locking and run-time task scheduling, Atom eliminates the need and overhead of RTOSes for many embedded applications."
Short version: Atom is a synchronous language: one specifies rules that fire on specific clock ticks, and every rule is atomic. Feed a specification into Atom, and Atom generates fairly bulletproof, deterministic C code.
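To make "rules that apply on specific clock ticks" concrete, here is a small host-side model of the idea in plain Haskell. This is not Atom's API and it generates no C; `Rule`, `stepTick`, `simulate`, and the example state are hypothetical names chosen for illustration. Each rule has a period, a phase, a guard, and a pure action, and a rule's action is applied in one step so it never observes a half-updated state.

```haskell
module Main where

-- A rule is due on tick t when t `mod` rulePeriod == rulePhase,
-- and it fires only if its guard holds for the current state.
data Rule s = Rule
  { ruleName   :: String
  , rulePeriod :: Int
  , rulePhase  :: Int
  , ruleGuard  :: s -> Bool
  , ruleAction :: s -> s
  }

-- Apply, in list order, every rule that is due on tick t.
-- Each action is a pure function, so it takes effect all at once.
stepTick :: [Rule s] -> Int -> s -> s
stepTick rs t s0 = foldl fire s0 rs
  where
    fire s r
      | t `mod` rulePeriod r == rulePhase r && ruleGuard r s = ruleAction r s
      | otherwise = s

-- Run ticks 0 .. n-1 from an initial state.
simulate :: [Rule s] -> Int -> s -> s
simulate rs n s0 = foldl (\s t -> stepTick rs t s) s0 [0 .. n - 1]

-- Example state: a tick counter and an LED flag.
data State = State { counter :: Int, led :: Bool }
  deriving (Eq, Show)

exampleRules :: [Rule State]
exampleRules =
  [ Rule "count" 1 0 (const True) (\s -> s { counter = counter s + 1 })
  , Rule "blink" 4 0 (const True) (\s -> s { led = not (led s) })
  ]

main :: IO ()
main = print (simulate exampleRules 10 (State 0 False))
```

Because the schedule is fixed by the periods and phases, which rules run on which tick is known before anything executes; that is the property that lets a tool in this style do its scheduling at compile time.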
Nearly everything that I reference should have a link at: http://haskellembedded.github.io/pages/links.html
My Atom introduction is at: http://haskellembedded.github.io/posts/2015-02-17-atom-examples.html
In explaining the "How?" and "What?", I probably ignored much of the "Why?", and this explains some of that: http://haskellembedded.github.io/posts/2015-02-06-how-i-got-here.html
See the #haskell-embedded IRC channel on Freenode to find me (hodapp) and a bunch of other people who are way better at this than I am.