Monadfix

TL;DR We're a Haskell and Agda consultancy. You can ask as many questions as you want, for a fixed monthly price:

$256/mo for hobbyists
$1999/mo for companies

We are also happy to do pair programming, tutoring, and mini-lectures on various Haskell topics. Write us at hi@monadfix.io.

About us

We are three Haskell devs with industry/teaching experience, libraries on Hackage, popular posts, and a book in progress. Oh, right, and Roman is also the top #1 StackOverflow user for the [agda] tag.

Artyom Kazak
@neongreen

8+ years of Haskell

Vladislad Zavialov
@int-index

6+ years of Haskell

Roman Kireev
@effectfully

5+ years of Haskell
2.5 years of Agda

Some of the stuff we can do:

For more examples of questions we've already answered for other clients, see the Library below.

You might also take a glance at the libraries we wrote and maintain in the Open-source section.

Pricing

A geeky price for humans: $256/mo. A businessy price for companies: $1999/mo. If you're a one-man company, pick the category you're more comfortable with.

Of course, we also do prorating. Here's how it works:

Why hire us

Don't waste time

It's pretty easy to spend several hours trying to understand how to use a new library, googling why a particular optimization didn't kick in, or preparing a pull request that fixes a bug in a library that you're using. Just kidding, the last one can take several days.

Outsource all of this to us, and spend more time on things that are actually important.

Avoid expensive mistakes

If you have a greenfield project, it's likely that you'll be stuck with many of the decisions you will make early on. Some of those decisions will, inevitably, suck. You will overengineer things, you will pick unmaintained libraries, you will sacrifice type safety for no good reason. Hey, maybe you'll even pick the wrong language (Haskell is not the best tool for every job, after all).

We have worked on some pretty big projects in our lives, and we'll be able to stop you from making choices that will haunt you later.

Contacts

Just send an email at hi@monadfix.io, and after that we can figure out where to chat. Keybase is a good option, but Slack also works.

NB: if some different pricing model will work better for you, we can certainly consider it. It never hurts to ask.

Open-source

We maintain a number of open-source Haskell libraries that we recommend to our customers.

named – keyword arguments for Haskell

A lightweight library for named function parameters (keyword arguments) based on overloaded labels. Keyword arguments have several advantages over positional arguments:

Unlike newtype wrappers, keyword arguments don't pollute the global namespace, don't require top-level definitions, and don't need to be exported.

microlens – a dependency-free lenses library

A well-documented, lens-compatible lenses library with no dependencies. It's quite useful if you are writing a library and want to use lenses in your code (or provide a lens-y interface), but don't want users of your library to suffer from a lens dependency.

When you need more features of lens, you can use one of the extra packages in the microlens family; and if you feel that you need the full power of lens, the migration is trivial.

Library

These are some of our clients' questions along with our answers (where we've been able to get permission to publish them).

2018

Writing a simple Read instance with generics

Tim

I have a bunch of sum types, some with several dozen members, for which I need to define Read instances that are simply lowercase versions of their constructors. I haven't dug into generics yet, is there a quick way to do this?

Artyom Kazak

Here's an example with generics-eot (a library that simplifies GHC.Generics by representing everything as Eithers and tuples). Of course, you can also do it with “pure” GHC.Generics, but I forget its API after each time I use it :/

Here's how it's going to work:

> data T = Foo | Bar | Qux deriving (Show, Generic)

> greadMaybe "bar" :: Maybe T
Just Bar

First, some extensions and imports:

{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE MonoLocalBinds #-}

import Generics.Eot
import Data.Char

If you have a type data T = Foo | Bar | Qux, generics-eot will represent its constructors like this:

  Foo  <->  Left ()
  Bar  <->  Right (Left ())
  Qux  <->  Right (Right (Left ()))

We can use that to generate a list of all possible values for a sum type, assuming that none of the constructors have any fields:

enumValues :: (HasEot a, EnumType (Eot a)) => [a]
enumValues = map fromEot enumEot      -- 'fromEot' converts a
                                      -- generic representation
                                      -- into a value of 'a'

-- | A class for things with several zero-field constructors.
class EnumType eot where
  -- | Get all EOT-representations of constructors of the type.
  enumEot :: [eot]

instance EnumType Void where
  enumEot = []
instance EnumType () where
  enumEot = [()]
instance EnumType eot => EnumType (Either () eot) where
  enumEot = Left () : map Right enumEot

We can also get all constructor names by using datatype, which returns information about a type passed via Proxy. Unlike with GHC.Generics, the metadata in generics-eot is separate from the representation of the type, which often makes it easier to deal with.

enumNames :: HasEot a => Proxy a -> [String]
enumNames = map constructorName . constructors . datatype

Now we just need to construct a lookup table and we can write a generic analog of readMaybe (you can put the table into a Map for some extra performance):

greadMaybe
  :: forall a. (HasEot a, EnumType (Eot a))
  => String -> Maybe a
greadMaybe = (`lookup` table)
  where
    table :: [(String, a)]
    table = zip (map (map toLower) (enumNames (Proxy @a)))
                enumValues

By the way, generics-eot is pretty fragile and I had spent like ten minutes fighting with this error:

Couldn't match type ‘GHC.Generics.Rep a’
               with ‘GHC.Generics.M1 GHC.Generics.D c0 f0’
  arising from a use of ‘datatype’
The type variables ‘c0’, ‘f0’ are ambiguous

I recommend putting code that uses generics-eot into a separate module, enabling -XMonoLocalBinds there, and optionally moving stuff into top-level functions instead of where clauses (otherwise it often just doesn't compile for some reason).

Tim

That looks fairly succinct. But in terms of performance my options are either explicitly writing the pattern match for O(1) or using a once-generated map for O(log n)?

Artyom Kazak

GHC does not optimize string pattern matches. You only get jump tables (or binary trees) with integers and things that can be converted to integers, e.g. Chars. Everything else is going to be a linear comparison. A map (or a trie) will actualy be faster than pattern-matching, starting from certain amount of constructors.

I recommend godbolt.org for inspecting generated assembly, they somehow match it up with Haskell source code and it's cool.

Tim

Do you have any resources for how the same would be written with GHC.Generics? Or generics-sop, or whatever is the go-to library of choice when you need to smooth some roughness with generic programming.

You don't need to trouble yourself writing these, I was just wanting to see some existing resources so I could reference them when I need to solve a problem. I used the TypeShape library in F# for almost all my boilerplate needs and I imagine that something similar in Haskell exists as well.

Artyom Kazak

The generics-sop solution is going to be pretty similar to the generics-eot one, it's just that instead of Left and Right you'll have Z and S. As for existing resources:

Update 2018-03-24: Andres Löh provided a generics-sop solution on Reddit: https://gist.github.com/kosmikus/76ab73d92fce76091ff7b2401f271649.

“Can I use recursion-schemes with bound?”

Tim

I have a question regarding recursion schemes and higher-order functors.

I have an AST that, as it turns out, needs more that a couple recursive operations defined on it (substitution, evaluation, partial evaluation, etc). I thought maybe I could save myself some trouble and write them with some recursion-schemes. I've learned that my AST is type-indexed and I'll need a higher-order functor (following the post “Fixing GADTs”), but I don't immediately see a way to define the hfmap for constructors that contain functions.

Is there an alternate encoding of Let and Acc below that would allow me to define instances for HFunctor? If not, is there a concise way to deal with something like recursion-schemes for manipulating such an AST? The main requirement is that I be able to fuse multiple composite operations into single traversals.

data ExpF (c :: Symbol) :: (* -> *) -> * -> * where
  LitB :: Bool -> ExpF c r Bool
  LitI :: Int -> ExpF c r Int
  LitD :: Double -> ExpF c r Double
  LitC :: Dense c -> ExpF c r (Dense c)
  LitO :: Obs a -> ExpF c r a
  Let :: r a -> (r a -> r b) -> ExpF c r b
  UFn :: T.Text -> (a -> b) -> r a -> ExpF c r b
  BFn :: T.Text -> (a -> b -> bb) -> r a -> r b -> ExpF c r bb
  Acc :: (r a -> r b) -> Int -> r a -> ExpF c r b
  Variable :: T.Text -> a -> ExpF c r a

Vladislav Zavialov

You are venturing into a territory where I don't have answers prepared, unfortunately. Let's see. If your goal is to define type-indexed ASTs in a way that allows you to have a unified approach to defining operations, the first thing to realize is that it's an open research problem. One good paper I've seen on the subject is “Type-and-Scope Safe Programs and Their Proofs”, published in 2017.

Regarding using recursion schemes for this purpose, a higher-order functor indeed sounds like something that could work here. I'll report back with my thoughts when I read the blog post you linked.

Vladislav Zavialov

Okay, I've read the post. That's indeed a reasonable approach, matches what I had in mind when I've heard “higher-order functor” :)

I see that the reason you have functions in your AST is that you're using HOAS. I don't think it's compatible with recursion schemes, at least I haven't seen it anywhere. And you lose other desirable properties with this representation (such as cheap alpha-equivalence checking). Did you consider other strategies to represent binding? Edward Kmett covers them on School of Haskell, with pros and cons.

Tim

Oi, another problem that isn't going to be solved overnight, then. I guess I can either find a different encoding of lets and lambdas or I can retool everything to work with bound. I've asked on Reddit if anyone knows about examples of the former. If there's no good solution I suppose I'll have to go with bound.

Vladislav Zavialov

bound as a library won't play well with recursion schemes either, but you can reuse the general approach.

But I'd say the benefits that bound brings outweigh those of recursion-schemes, if you find yourself choosing.

Tim

Yes, I'm sure there are additional benefits. But I understand recursion-schemes (for the most part) already, so there's that... We'll see if anyone gives me an answer.

Can I fuse multiple operations in a single traversal with bound?

Vladislav Zavialov

That's not something the library provides, no. But I don't think that bound precludes you from writing code in the style of recursion schemes and from getting fusion. Your data type parameter will be used for binding rather than subexpressions, but this doesn't preclude you from having another parameter.

You won't get to reuse existing combinators from recursion-schemes, though.

Tim

I don't mind the combinators, they seem to be straight forward to ad-hoc once you get a *Functor on the datatype. I've read that recursion-schemes are difficult to work with in bound because of the polymorphism, maybe it's not so bad.

I also have to consider the fact that I'm going to need to be able to store some sort of state with regards to this AST. With PHOAS, the only way I can see, is to persist all the events that affect the tree separately, and then re-evaluate the AST when we need to figure out what the new current state is (I guess basically replace all lambda terms with data-base positions so that they can be restored). It would be easier to just “dump it all” to a bytestring as one could conceivably do in bound though former approach of event-sourcing with PHOAS has the ability to “replay” the system should something get erroneously computed. I'm not sure which in the end is the stronger case.

What language to learn for dependent types?

Elliot Cameron

Do you guys know any Idris? I'm thinking of reading the Idris book just to get a better grasp of dependent types, and then learn how to use those techniques in Haskell.

I expect I'll need some help at least with that second part, if not the first also.

Vladislav Zavialov

I recommend Agda if your only goal is to learn about dependent types – it's closer to Haskell, and the compiler is more mature (less bugs). And I know it better than Idris.

That said, I've heard that the Idris book is a good one. So it'll work as well. The basics are the same everywhere, once you grasp pi/sigma types it'll be an easy ride.

Elliot Cameron

Oh, wow. I didn't realize Agda was close to Haskell. Idris looks almost identical.

Vladislav Zavialov

Idris is very different because of strict evaluation, while Agda is lazy and basically is Haskell on steroids (better type system).

Agda has a neat Haskell FFI, because its main backend compiles to Haskell – a good portion of the generated code is unsafeCoerce, simply because Agda has verified the code already and asks GHC not to bother.

Elliot Cameron

Are there good learning resources?

Vladislav Zavialov

That's the tricky part. No beginner-friendly ones, unfortunately. So Idris still makes sense.

I managed to learn Agda using the first chapter of the HoTT book and Péter Diviánszky's Agda tutorial. It's terse but good.

Agda also has a well-maintained wiki with a list of tutorials.

Dependent maps with uniques or type reps as keys

Tim

How do I use a DMap with keys from prim-uniq? I'm not sure I understand what is happening well enough to know how I should accomplish this. Have you used them before?

Vladislav Zavialov

Haven't used this package in particular, but I've made a similar thing myself in the past. Let's backtrack a little so I can explain what's happening there.

When you have a heterogeneous collection, the problem you're facing is that you need to know what type of thing you're getting out of it on lookup. In a heterogeneous map (such as DMap), assume you inserted 1 :: Int at key k1 and "Hello" :: String at key k2. Then you do lookup k1 m. What type of value does it return? For this particular map, it's Int, but we know that because we've just inserted it, we keep track of it mentally. The compiler doesn't, unless we encode it in the types somehow.

The keys to a heterogeneous map should somehow determine the types of the values, so when we perform a lookup, we can know what result type to expect. For example, a key could be a Type.Reflection.TypeRep (added in GHC 8.2), then the key itself is a representation of value type, and naturally it determines the result type we get on lookup. Alternatively, we can have a singleton GADT where the constructor determines the type index. But there are cases when we want to generate keys on the fly (something like a singleton GADT with an infinite amount of constructors, if that makes any sense). Then we can use Integers as keys, where each key comes from a supply of unique values. When we generate a new unique key, we assign a type to it, so we can say “Let's generate a new key for a String that I want to store”, and if we do that in IO, we can get something like Tag (Uniq 12414123) :: Tag RealWorld String. From now on we know that whatever is stored at key 12414123 must be of type String, and we know that when we do a lookup as well.

So imagine you have a global GADT like this:

data Tags a where
  T1 :: Tags Integer
  T2 :: Tags String
  T3 :: Tags ...
  -- ...

Then getUniq adds a new constructor to this GADT, and returns it. You must not lose it, because you won't get access to it again.

Tim

Ok, then it functions as I thought. In terms of losing the keys, however, I can't see much utility in keeping opaque keys around. If I have keys x, y, z, and a DMap, which values do those keys refer to in the map? How do I keep track of it?

I also thought, but did not confirm, that if I added to a DMap with the following:

tmp :: PrimMonad m => m (DMap (Tag (PrimState m)) (ValidDatum f))
tmp = do
  x <- newTag
  y <- newTag
  pure $
    DM.fromList [ x :=> MkValidDatum (Age 24)
                , y :=> MkValidDatum (Width 33) ]

and I wanted to access Age at some point down the line, then I could initialize a different DMap with a tag that witnesses the same type and then use that to pull out the original value:

find (Rec dm _) = do
  mx <- newTag
  let n = DM.fromList [mx :=> MkValidDatum (Width 0)]
      Just mv = DM.lookup mx dm
  mv

Vladislav Zavialov

You can't get anything out of the map that you return from tmp, because you've lost the keys. Creating new keys won't let you access the values that are there, even if types match. If you want a type-indexed map, you can use TypeRep for keys.

Tim

So, I can use TypeRep as a key for a DMap?

Vladislav Zavialov

Yes, why not? But you'll have only a single value of each type.

Tim

I expect only a single value of each type as the collection is like an “open” record. As to “why not”, I just wanted to confirm that we were still taking about DMaps. I'll give that a go later today.

Tim

[...] I've looked at the Reflection module, it looks like I'm going to have to define GCompare TypeRep instances to make TypeRep a valid key. Will try to get to this tomorrow. I also came across type-map which looks to be exactly what I need.

Vladislav Zavialov

Defining GCompare for TypeRep should be easy, I will create a PR to dependent-map this evening. Here you go: PR #17.

Stack reruns passed tests even with --no-rerun-tests

Client

It looks like Stack does not properly track what tests have passed? It has option --no-rerun-tests that should remember when the tests have passed and avoid rerunning them during stack test, but for me it always reruns the tests, even with no code changes and the previous run having them pass.

Is that a bug or is there something more I need to be doing?

Artyom Kazak

I've investigated and it turns out that:

  • It is a bug in Stack.

  • --no-rerun-tests is broken completely, it can't ever work with Stack 1.6.1, on any project. It's been broken since Nov 4, 2015. The Stack team has noticed only now, apparently.

  • There is a workaround: when tests succeed, find a file called stack-test-success in your Stack directory and change the last byte from 0 to 1 :) Stack should be doing it by itself when tests pass but it doesn't.

  • I've asked Michael Sloan what would be the best way to fix the bug. (It's going to be a one-line change, the only question is “what is the best place to put that line”.) I'll ping you when my PR is accepted and the bug is fixed in HEAD.

Artyom Kazak

Update: I've fixed the bug and the next release of Stack (1.6.4 probably) will include the fix. You can also use the stable branch of Stack if you want to.

Client

Very cool! Looking forward to that release then. Thanks a bunch.

2017

What does “injective” mean in the context of type families?

Elliot Cameron

Can you explain what is meant by “injective”? Sometimes I get type errors surrounding that word. I know it has something to do with type families.

Vladislav Zavialov

Injectivity means that we can determine the input to a function from its output. For example, \b -> if b then 1 else 0 is an injective function, because we can look at its output (1 or 0) and deduce the corresponding input.

Formally, you would say that a function f is injective when f a == f b implies a == b.

Since you are seeing this in the context of type families, this is probably what you are getting from GHC:

NB: `F' is a type function, and may not be injective

In this case GHC is just trying to help you understand why it cannot deduce the necessary type equalities.

The typechecker relies on two facts about type constructors:

  • they are injective: T a ~ T b implies a ~ b
  • they are generative: T a ~ D b implies T ~ D

These properties hold for data, data family, newtype, class. They do not necessarily hold for a type synonym or a type family.

When you have a type function with a signature like this:

f :: Num a => Maybe a -> Bool

the compiler needs to know a to find the Num instance for it. When you pass an argument of type Maybe Int to it, GHC uses injectivity of Maybe to conclude that a is Int, and finds an instance for it, and all is well.

But assume the input was a type family:

type family F a where
  F Int  = Maybe Int
  F Char = Maybe Int

f :: Num a => F a -> Bool

Now you pass Maybe Int to f, but GHC cannot deduce that a is Int, because F a ~ Maybe Int for both Int and Char – the F type family is not injective.

Since GHC 8.0.1 you can annotate your type families as injective (see “Injective type families” in the GHC manual). For example:

type family Maybe' a = r | r -> a where
  Maybe' () = Bool
  Maybe' a  = Maybe a

Of course, if the type family is not actually injective, GHC will report a type error.

By the way, you can notice that the syntax of injectivity annotations is modelled after the syntax of functional dependencies. This is no coincidence, and the extension is actually called -XTypeFamilyDependencies. However, it is an immature extension: you cannot have more than one dependency for a single type family, or a dependency that involves several type variables. In addition, there are issues with type families (presented the talk “Constrained Type Families” from ICFP 2017) that prevent GHC from using the injectivity annotation in certain cases.

Elliot Cameron

Wow! That's very clear and helpful! Thank you very much.

Vladislav Zavialov

I've also looked at GHC sources to understand in what circumstances exactly GHC emits this hint about injectivity. Here's the relevant code:

tyfun_msg | Just tc1 <- mb_fun1
          , Just tc2 <- mb_fun2
          , tc1 == tc2
          , not (isInjectiveTyCon tc1 Nominal)
          = text "NB:" <+> quotes (ppr tc1)
            <+> text "is a non-injective type family"
          | otherwise = empty

and the nearby comment is:

-- (b) warning about injectivity if both sides are the same
--     type function application   F a ~ F b
--     See Note [Non-injective type functions]

So, you will be getting this hint whenever GHC tries to unify F a with F b, knowing that F is not injective. The actual error might be something else, this is more of a reminder. Again, from the source:

{-
Note [Non-injective type functions]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
It's very confusing to get a message like
     Couldn't match expected type `Depend s'
            against inferred type `Depend s1'
so mkTyFunInfoMsg adds:
       NB: `Depend' is type function, and hence may not
            be injective
-}
Machine-readable logs with distinct ADTs for log messages

Client

All my log statements are represented as ADTs. I have many log data types scattered through the code; if I had only one type to capture all log statements, I would have a dependency sink and every time I dot an ‘i’ in log I would have to recompile entire application.

I log in JSON so all those ADTs implement ToJSON and FromJSON. I also have a separate application which takes the logs and converts them back to ADTs, and I have Pretty instances for all those ADTs – to print log statements in human-readable form.

The problem is dispatch. When I read a JSON object I have a string which identifies the type, but my dispatch basically looks like a very long list of:

"LogScheduleTx" -> render (parse logEntry :: Either Err LogScheduleTx)

Is there a better way to do this dispatch?

Vladislav Zavialov

Okay, how do you envision this tool would figure out which types to try to parse from logs?

  1. Some explicit specification, like type LogTypes = '[LogX, LogY, LogZ]? This list could be at top level, where you don't need to recompile.

  2. Figure out the types from the scope, i.e. the logic could be “any type with a IsLogMessage instance”?

Client

I can have any. I am not sure how the typeclass solution would work but indeed perhaps a type list would do the trick.

Vladislav Zavialov

Okay, so here's how it could work. First, define a class:

class LogTag t where
  logTag :: Proxy t -> Text

All log types would define instances for it:

instance LogTag LogInfoX where
  logTag _ = "X"

This is needed because we want to associate a string to each type. This string will appear in the generated JSON to identify the type. Any module that defines a type for logging would depend on this class, but it's unlikely to change, so this would not cause recompilation.

For logging messages, you'd have a helper like this:

logObject :: LogTag t => t -> Aeson.Value

It would construct a JSON object with two fields:

{ tag: "x", message: ... }

and we'd know that any log message has a tag that we can read. If I understand correctly, this is pretty similar to what you have now: you already have a tag that you match on.

The last bit is to define FromJSON and Pretty in a generic way for the type-level list of the messages. You could do it with a union of types in the style of union. Ideally we'd want the following API:

data LogUnion :: [*] -> *  -- a sum type for all log types

type LogTypes = '[LogX, LogY, LogZ]

parseLogTypes :: Aeson.Value -> LogUnion LogTypes
prettyLogTypes :: LogUnion LogTypes -> Doc AnsiStyle

If this sounds right to you, I can describe how to implement parseLogTypes and prettyLogTypes.

Client

This all makes sense. Indeed I have the first part done (I did it with Typeable).

Vladislav Zavialov

A trouble with Typeable is that the type rep can change between compiler versions, while a string is stable.

Now to the solution. First we need LogUnion, which is trivial to define:

data LogUnion ts where
  LThis :: !t -> LogUnion (t ': ts)
  LThat :: !(LogUnion ts) -> LogUnion (t ': ts)

Then we'd want to implement FromJSON and Pretty for it. This would require two instances of these classes: the base case and induction.

The base case is an empty list of types. parseJSON would always throw a parsing error (because an empty union is the same as Void, and there are no values to construct). pretty does no job either, because it receives an empty union (which is similar to Void), and this cannot happen at runtime. Thus:

instance FromJSON (LogUnion '[]) where
  parseJSON _ = fail "not a valid log message"

instance Pretty '[] where
  pretty = \case {} -- see the EmptyCase extension

And the interesting work would be happening in the inductive case:

instance (LogTag t, FromJSON t, FromJSON (LogUnion ts)) =>
    FromJSON (LogUnion (t ': ts))
  where
    parseJSON j = withObject "LogObject" $ \o -> do
      tag <- o .: "tag"
      if tag == logTag (Proxy @t)
        then do
          message <- o .: "message"
          return $ This message
        else
          That <$> parseJSON j

Here we check the string tag (or type rep, if you go with the Typeable solution). In case it matches the head of the list, we try to parse the message of this type t. Otherwise we delegate the work to the instance for the rest of the list. Eventually either the tag would match, or we'd get to the base case and fail.

A similar thing for pretty:

instance (Pretty t, Pretty (LogUnion ts)) => 
    Pretty (LogUnion (t ': ts)) 
  where
    pretty (This t) = pretty t
    pretty (That t) = pretty t
“Converting GADTs to JSON without boilerplate?”

Client

We want to log the effects. Basically every time I evaluate an effect I would like to issue a log statement before and just after. It could be done almost without any effort, but our effects are represented with a GADT, and I cannot easly convert it to JSON. Basically for now I have to write things like:

data DBAction a where
  AddRequest   :: RequestId -> DBAction ()
  GetValue     :: RequestId -> DBAction (Either Value Value)
  PersistEvent :: Event -> DBAction ()
  -- more constructors omitted

data LogDBAction
  = LogDBActionAddRequest RequestId
  | LogDBActionGetValue RequestId
  | LogDBActionPersistEvent Event
  -- ...
  deriving (Generic)

instance ToJSON LogDBAction
instance FromJSON LogDBAction

toStartLog :: DBAction c -> LogDBAction
toStartLog (AddRequest r)   = LogDBActionAddRequest r
toStartLog (GetValue r)     = LogDBActionGetValue r
toStartLog (PersistEvent e) = LogDBActionPersistEvent e
-- ...

This all is like super redundant information. I have to do it 'cause I don’t see other way to convert this DBAction a to and from JSON.

Vladislav Zavialov

This is a common problem when dealing with GADTs.

You usually have several representations, which have progressively more info at type level:

  • String – you only know that you have a sequence of characters

  • Aeson.Value – you know that these characters represent a valid JSON document

  • LogDBAction – you know that this JSON represents a DB action

  • DBAction – you know that this DB action has expected input/output types

There should be – in theory – a general way to convert between all these representation providing only minimal info. For example, we can convert between LogDBAction and Aeson.Value using Generic, and between Aeson.Value and String using functions that aeson provides.

The issue is the last gap between representations, LogDBAction and DBAction. Ideally, it should be possible to derive LogDBAction from DBAction, together with conversion functions. In practice, everyone's doing this by hand.

Ghostbuster: A Tool for Simplifying and Converting GADTs (2016)

The latest research on this topic that I've seen was a prototype tool that automates this process, I recommend reading their paper for more info (it's fairly well written and understandable without too much background): “Ghostbuster: A Tool for Simplifying and Converting GADTs”. But I am not aware of a tool on Hackage that would do this transformation.

I sent an email to one of the authors of the Ghostbuster paper, asked him if the tool is still in development. When he answers, I will let you know.

Client

Cheers, that’s very informative. Thank you – I will read the paper.

Vladislav Zavialov

Meanwhile, I have received a reply. The author told me that he and another member of their research team had left Indiana University and the fate of the tool is uncertain.

Dependencies installed by Stack aren't being cached

Client

I'll start off with a fiddly build question. We've been using CircleCI, and are trying to sort out caching of the Stack build state. However despite what looks like a correct caching and restore of .stack-work, we're not seeing any actual caching.

On a build with no source changes we get:

foo-1.0.0: unregistering (missing dependencies: foo-util)
foo-lib-1.0.0: unregistering (missing dependencies: foo-test, foo-util)
foo-test-1.0.0: unregistering (missing dependencies: foo-util)
foo-util-1.0.0: unregistering (local file changes: util/Foo/Orange.hs 
    util/Foo/Pear.hs util/Foo/Apple.hs util/Foo/Banana.hs util/...)

I'm not sure if we're doing something silly and missing something to cache, or if cache save/restore doesn't actually work yet, or if stack is doing something odd.

Artyom Kazak

Can you send your circle.yml config?

Client

Sure, here it is:

version: 2
jobs:
  build_and_test:                         # NB. some parts omitted
    steps:
      - checkout
      - restore_cache: ...                # omitted
      - run:
          name: stack_build
          command: stack build --test
      - deploy: ...                       # omitted
      - save_cache:
          key: dependency-cache-{{ .Branch }}-{{arch}}-{{ .Revision }}
          paths: ".stack-work"

Artyom Kazak

Okay, so just to make sure: there's no problem with snapshot deps (from .stack), but caching/restoring of .stack-work doesn't work?

Client

It looks like it partially works. extra-deps are cached fine but it always rebuilds our code from scratch.

Artyom Kazak

Oh, I figured it out: apparently Stack creates .stack-work folders in subprojects as well, so you have to cache them too:

- save_cache:
    paths:
      - .stack-work
      - lib/.stack-work
      - util/.stack-work
      - test/.stack-work

Also, I asked authors of Stack and apparently currently it's not possible to ask Stack to consolidate .stack-works into one directory, so if you decide to look for a way to do that you should probably redecide :P

Client

Final update: got everything working! Still running tests even without compiling, but drops our noop builds from 13m to 1m.

Does GHC apply functor laws during optimization?

Elliot Cameron

[...] Totally different question, out of curiosity: does GHC apply functor laws in optimization?

Artyom Kazak

Rewrite rules defined in Control.Category

AFAIK, no. Some rules related to Arrow and Category are applied, and there are lots of rules for lists and some other stuff, but I've never seen Monad/Functor laws used in optimizations.

I'll grep through the sources of base to be sure.

Elliot Cameron

Haha, that's one way to solve it.

Artyom Kazak

The previous time I had this question, I didn't find anything :P

Yeah, apparently no Functor-related rules. Of course, GHC could have hard-wired Functor/Monad-related optimizations, but I remember asking someone knowledgeable about that and I got told there was no such thing built-in.

Elliot Cameron

I wonder how much faster Haskell code would be if it were safe to apply such optimizations.

Vladislav Zavialov

There was a Trac ticket about it. There's a good discussion in the comments.

The issue is that there are unlawful Functor instances out there and optimizations shouldn't break them.

Elliot Cameron

Ahhh, they can add the rules themselves.

That really is a good solution. That puts the onus on library authors to prove correctness before benefiting but doesn't punish code for breaking the laws.

But I wonder why the rules are not written for the functors in base – maybe the rules are in terms of the monomorphic functions instead of fmap?

Vladislav Zavialov

There are rules for monomorphic functions, such as map, but you don't get these rules applied in a polymorphic setting.

Elliot Cameron

Not even if map got inlined in place of fmap?

Vladislav Zavialov

That's not polymorphic anymore.

When does GHC inline?

Elliot Cameron

When does GHC inline? I just wrote a tiny function that is merely for convenience to convert things. It really could / should be inlined. Do I need to tell GHC this?

Vladislav Zavialov

If a function is small and non-recursive GHC will inline it.

Elliot Cameron

Even across module boundaries?

Vladislav Zavialov

Yes, it puts an optimized version of the function into interface files (.hi) and inlines across modules.

If you add an {-# INLINE #-} pragma it will put an unoptimized version of the function into the .hi file, so you'd want to avoid specifying this pragma unless you need to inline the function body verbatim and optimize at each use site (as opposed to optimizing at definition site).

Elliot Cameron

Ahh, yes that makes great sense, I can see how some cases would optimize much better at call site instead.

But even functions optimized at definition site can still be inlined at call site, correct?

Vladislav Zavialov

Yes, if they are small and non-recursive.

It also puts a toll on compile times if you use INLINE too much, because optimizing at use site means optimizing multiple times.

Speeding up Template Haskell in GHCJS

Elliot Cameron

[...] I'm using GHCJS and TH makes it go amazingly slow. [...]

Artyom Kazak

Update 2018-03-07: apparently, this has been fixed in GHCJS at some point in the past and now the trick described below doesn't result in any speedup at all (e.g. it brings no speedup on ghcjs-0.2.0.9006020_ghc-7.10.3, which is fairly old already).

Thanks to Niklas Hambüchen for pointing this out!

Vladislav Zavialov

There's a trick to speed up TH in GHCJS: make it go in one pass. Instead of:

data Foo = ...
makeLenses ''Foo

data Bar = ...
makeLenses ''Bar

try putting everything into one splice:

data Foo = ...
data Bar = ...

do
  decs1 <- makeLenses ''Foo
  decs2 <- makeLenses ''Bar
  return $ decs1 ++ decs2

Or with sequenceA:

concat <$> traverse makeLenses [ ''Foo, ''Bar ]

This is faster because GHCJS starts a Node.js process for each splice. A single splice requires less overhead.

Elliot Cameron

Wow that is an amazing trick! You may have just cut down my build times by a large factor!

Dealing with records in Haskell

Elliot Cameron

How do you guys deal with records? I really like lenses, but I don't really like Template Haskell and makeLenses. I really like generic-lens, but the syntax is so verbose:

record ^. field @"fieldname"

And then I still have the problem with record field names colliding.

Also, I'd love to just get rid of the record field accessors that get created for records. But then I can't construct a record because lenses don't let you construct new records from nothing. I've tried deriving Default... but that's not ideal and in some cases just downright bad. I can imagine a version of generic-lens's field function that allows you to de-mangle the record field name in some way.

Artyom Kazak

You can go about it like this:

declareFields [d|
  data Person = Person {
    _personName :: Text,
    _personAge  :: Int }
|]

This gives you lenses called name and age. Names can be duplicated (the lenses are polymorphic), and the actual generated type is not a record so accessors like _personName don't get generated.

If all fields in such a structure are lazy (i.e. without the bangs), you can construct a record with lenses like this (I wouldn't recommend this approach though because if you forget one field the compiler won't warn you):

bob :: Person
bob = undefined
        & name .~ "Bob"
        & age  .~ 24

To be honest, in my experience constructing records from scratch doesn't happen that often so I don't see “you have to use _personName once in a while” as a problem and just use makeFields.

Elliot Cameron

If you have a sensible default or some smart constructor then yeah.

Artyom Kazak

Well, if you have a sensible default it's even better but this doesn't seem that bad either:

return Person
  { _personName = ...
  , _personAge  = ...
  }

Elliot Cameron

It's only bad when Person is AppDbContextHandle and you have _appDbContextHandleName.

Artyom Kazak

Ah, in this case just go the “abbreviate all the things” route :) We do it at work:

data AppDbContextHandle = AppDbContextHandle
  { _adchFoo = ...
  , _adchBar = ...
  }

Though if you hate it I could write a Template Haskell helper for you that lets you write things like this:

construct [|
  let field1 = ...
      field2 = ...
  in AppDbContextHandle
|]

Elliot Cameron

I'm using beam which doesn't support makeFields so I have to use makeLenses.

The solution with the helper would indeed be nice. However I'm also using GHCJS and TH makes it go amazingly slow. On the other hand, that's not a huge deal because I rarely build the app in GHCJS. So many constraints...

Which is why I like generic-lens so much. No TH (so no slowness) and it obviously works fine with beam since it's just deriving from Generic. But I'd need to use -XOverloadedRecordFields or something.

Artyom Kazak

By the way, there's another solution that uses new -XOverloadedLabels: overloaded-records. You define a record as usual, and then call overloadedRecord:

import Data.Default (Default(def))
import Data.OverloadedRecords.TH (overloadedRecord)

newtype Bar a = Bar { _bar :: a }

overloadedRecord def ''Bar

Then you'll be able to use overloaded labels as lenses for Bar's fields without even having to import anything:

{-# LANGUAGE OverloadedLabels #-}

import Control.Lens ((+~))

add :: Int -> Bar Int -> Bar Int
add n = #bar +~ n

(I suspect there are some caveats though.)

Elliot Cameron

Possibly performance. generic-lens has no performance penalty, but I'm not sure about this library. As far as I know generic-lens has gone through a lot of effort to ensure that it does not impose runtime overhead. Take a look at generic-lens's use of inspection-testing. They “prove” that GHC generates the same lenses that you'd write by hand.

Vladislav Zavialov

This sounds cool. Although I think they couldn't prove that GHC manages to do that in every case, especially if optimizations kick in a different order (i.e. the lens is inlined first, then its part moved around, etc). inspection-testing checks the generated Core, but of course it can only check what you pass to it, and GHC might generate different Core in different circumstances.

Elliot Cameron

I think I'm in love with the overloaded labels approach to generic-lens described in this tweet. It uses generic-lens-labels. With -XDuplicateRecordFields you can allow the ambiguity which only matters in record updates anyway. But if the only time you use that is to create a new record, then there's no ambiguity.

Vladislav Zavialov

The only issue I have with overloaded labels is that different libraries have different use cases for them, and those may conflict.

For instance, haskell-gi uses them for FFI fields. Does it conflict with use of overloaded labels as lenses?

Elliot Cameron

Interesting. Yeah, I don't know. I use records so heavily that I'd be willing to sacrifice pretty much every other use though. ;)

Artyom Kazak

Update 2018-03-06: it wasn't mentioned in the original dialogue, but look at the -XDuplicateRecordFields extension which got introduced in GHC 8.0. It, too, makes working with records easier in Haskell.

“Where to start with generics?”

Elliot Cameron

Hi! I would like to do some things with generics but have no idea where to start.

Vladislav Zavialov

I think the best starting point for understanding generics is understanding the algebra of types. Check out this article series: The Algebra of Algebraic Data Types.

If you're comfortable with thinking about all types as sums and products, you already understand the bulk of generic programming.

Elliot Cameron

Awesome! I'll start there. I find records particularly annoying to work with and things like generic-lens give me hope, but that and similar tools are still a bit too inflexible so I hope to customize them a bit.

Artyom Kazak

If you're particularly curious about generics, I would also recommend reading Andres Löh's lecture notes on modern generics.

You don't really need it if you just want to use generics but if you've already heard about a bunch of concepts (GADTs, type-indexed lists, etc) without being especially familiar with them, it makes for nice bedtime reading and tricks you into understanding the generics-sop library along the way.

Briefly about monad transformers

Client

I've got a kinda generic question. How do monad transformers work?

Vladislav Zavialov

Well, the reason to use monad transformers is to get the Monad instances automatically. For example, there's no runtime difference between a -> IO b and ReaderT a IO b, but the latter has a Monad instance, whereas the former doesn't. Actually, that's it.

Of course, we could use a function instead:

state -> env -> IO (Either exc (a, state))

But we want a (>>=) that does all the plumbing, and to get this we express the same thing as a transformers stack:

StateT state (ReaderT env (ExceptT exc IO)) a

Client

Ok, that makes sense.

I’ll read that chapter of @bitemyapp’s book on that and ask you if I have any specific questions.

Vladislav Zavialov

And we couldn't define a Monad instance for a -> IO b because it's unclear whether we want it to behave as ReaderT a IO b or Reader a (IO b). So the structure of transformers layers tells the compiler how deep we want (>>=) to inspect the structure of the types.

(Well, actually, a -> IO b has a Monad instance and it's indeed equivalent to Reader a (IO b) rather than ReaderT a IO b.)

Briefly on Servant vs Yesod

Client

How does Servant compare to Yesod?

Vladislav Zavialov

The primary philosophical difference is that Servant tries to do everything using type-level tricks, while Yesod uses Template Haskell a lot. The code looks very different with these frameworks because of that. What's common between them is that they're both wai-based, meaning they can run atop any wai-compatible server (warp or warp-tls), and that's great. I recommend against frameworks that aren't wai-based (this includes Snap and Happstack) because warp is a great web server, it's performant and configurable. Now, I'm not going to give a point-by-point comparison for Servant and Yesod because they're just so different that having such a list might be misleading. For instance, they both support type-safe routing, but it's done in wildly different ways.

I'd say it's useful to know both. If you implement a web API that talks to the front-end using JSON, I recommend Servant. If you want a multi-page web-app that has user authorization and stuff like this, Yesod is the path of least resistance. And since both of them are wai-based (did I mention it's great? well, here's another reason why) you can embed a Servant website as a subsite for Yesod, and vice versa, you can run a Yesod website as part of a Servant API.

Client

I have more experience with Yesod (am a contributor, read a lot of the code) but I’m finding that I’m just ripping out most of it – so I’ll give Servant a try.

I’m a little worried it looks too complex to introduce to beginners.

Vladislav Zavialov

It definitely is very complex on the inside, if you try to look at the definitions they're monstrous – but I'd claim that using it isn't hard at all. Those definitions are so complex precisely to allow for elegant code that uses them, they try to hide a lot of complexity.

Of course, it doesn't help that Haskell doesn't have good facilities for type-level programming. In a language where functions are promoted and there's proper relevant quantification (might be getting a bit technical here), Servant could be implemented much nicer.

Slicing type-indexed tensors

Client

I’d like to write a generalized slicing function, I think it’s doable at the typelevel but I’m not sure how to write out all the constraints.

The general idea is something like:

  • You have a tensor with shape that’s a type-level list, say, [x,y].

  • You have two type level nats, one for the dimension to be sliced and another for the index into that dimension.

The return value should have dimensions that are the same as the tensor, but excluding the selected dimension, so

  • [x,y] with dimension 0, index I ⇒ a tensor of dimension [y]
  • [x,y] with dimension 1, index I ⇒ a tensor of dimension [x]
  • [x,y,z] with dimension 1, index I ⇒ a tensor of dimension [x,z]

There are some constraints, for example I cannot be less than 0 or bigger than the size in the specified dimension – e.g. for a tensor of [x,y] you can’t specify dimension 0 and index I if I > x.

There are some more complex generalizations on this, for example, ranges of indices, or exclusion of indices, but I want to see if we can come up with a solution for this simple case before moving to harder ones.

Any thoughts?

Roman Kireev

Here is a snippet that provides types you want for that exclusion function:

{-# LANGUAGE DataKinds, PolyKinds #-}
{-# LANGUAGE UndecidableInstances, TypeFamilies #-}
{-# LANGUAGE AllowAmbiguousTypes, TypeApplications #-}
{-# LANGUAGE ScopedTypeVariables, TypeOperators #-}
{-# OPTIONS_GHC -Wall #-}

module Main where

import GHC.TypeLits

type family ConsSnd (y :: b) (p :: Maybe (a,[b])) :: Maybe (a,[b]) where
  ConsSnd _  'Nothing           = 'Nothing
  ConsSnd y ('Just ('(,) x ys)) = 'Just ('(,) x (y ': ys))

type family ExcludeAt i (xs :: [a]) :: Maybe (a,[a]) where
  ExcludeAt _ '[]       = 'Nothing
  ExcludeAt 0 (x ': xs) = 'Just ('(,) x xs)
  ExcludeAt n (x ': xs) = ConsSnd x (ExcludeAt (n - 1) xs)

data Tensor (dims :: [Nat]) = Tensor

slice
  :: forall dim i n dims dims'. 
     (ExcludeAt dim dims ~ 'Just ('(,) n dims'), 0 <= i, i <= n)
  => Tensor dims -> Tensor dims'
slice = undefined

The slice function can be used like this:

test1 :: Tensor '[2, 4] -> Tensor '[4]
test1 = slice @0 @1

test2 :: Tensor '[2, 4] -> Tensor '[2]
test2 = slice @1 @3

test3 :: Tensor '[2, 3, 4] -> Tensor '[2, 4]
test3 = slice @1 @1

If you violate one of the constraints, it fails:

-- Couldn't match 'Nothing with 'Just
fail1 :: Tensor '[2, 3, 4] -> Tensor '[2, 4]
fail1 = slice @3 @1

-- Couldn't match 3 with 2
fail2 :: Tensor '[2, 3, 4] -> Tensor '[2, 4]
fail2 = slice @0 @2

-- Couldn't match 'True with 'False
fail3 :: Tensor '[2, 3, 4] -> Tensor '[2, 4]
fail3 = slice @1 @4

Client

The type family functions seem so low level, i guess there aren’t many packages for type family functions to plug in for this sort of thing. How does that @Nat syntax work?

Roman Kireev

There is the singletons package which provides the defunctionalization gadget which can be used in order to implement higher-order functions at the type level. It also contains a lot of functions promoted to type families.

That @ syntax is enabled by -XTypeApplications. Basically, instead of passing Proxys around you explicitly specify type parameters via type applications. In slice @0 @1, the 0 argument is for dim and the 1 is for i. This is nicer than proxySlice (Proxy :: Proxy 0) (Proxy :: Proxy 1).