Testing Coverage of Template Haskell functions
August 6, 2020
In my aeson-schemas
library, I implement quasiquoters that will generate Template Haskell to define type-level schemas and operations to extract data out of a JSON object. Then I would write tests that look like this:
let o :: Object [schema| { foo: Int } |]
o = either error id $ eitherDecode [aesonQQ| { "foo": 1 } |]
[get| o.foo |] `shouldBe` 1
I can run this with stack test
and it would succeed. But after this succeeds, I wanted to measure coverage with stack test --coverage
. The problem is that quasiquoters run and are spliced at compile-time, so the code that’s actually being tested looks like this:
let o :: Object ('SchemaObject '[ '("foo", 'SchemaScalar Int) ])
o = either error id $ eitherDecode ...
getKey (Proxy @"foo") o `shouldBe` 1
So we’re getting coverage on getKey
, but not getting any coverage on the get
or schema
quasiquoters.
The first problem here is that we need to actually run the quasiquoters at runtime, in addition to splicing them. Quasiquoters run in the Q
monad, so we would need to figure out a way to run the Q
monad at run-time.
The second problem is that I would like to avoid duplicating each of my test cases, e.g.
[ testCase "o.foo spliced" $
[get| o.foo |] `shouldBe` ...
, testCase "o.foo runtime" $
quoteExp get "o.foo" `shouldBe` ...
, ...
]
In this blog post, I’ll be going over how I solved these problems (with the help of a new library I wrote for this purpose), and hopefully it’ll help someone else needing to do a similar task.
Running TH code at runtime
The first problem is to run the Q
monad at runtime. Luckily, template-haskell
provides a helpful runQ
function which gets us 90% of the way there:
runQ :: Quasi m => Q a -> m a
The good news: IO
has a Quasi
instance, which means we can use this function to run the Q
monad in IO
. The bad news: the Quasi IO
instance only works for a very limited number of methods, and errors when calling functions like reify
or lookupTypeName
. More bad news: there’s no way to get the error message out of a failure. It’s an open issue that recover
doesn’t include the error message, and the MonadFail
instance of Q
throws away the fail
message, so naively using try (runQ ...) :: IO (Either SomeException ...)
will result in the unhelpful message Left "Q monad failure"
.
In order to solve this, I decided to write my own monad that implements Quasi
. That way, we can manually mock functions like reify
AND catch the actual error messages. Word of warning: this is a fairly hacky solution that depends on Template Haskell internals, but unfortunately, it’s the best we can do right now.
Catching error messages
The first thing we’ll do is write a monad that will pass everything through to the Q
monad, except with an error handling system that lets us get the actual error message. First, let’s look at how the Q
monad throws errors (source):
newtype Q a = Q { unQ :: forall m. Quasi m => m a }
instance Fail.MonadFail Q where
fail s = report True s >> Q (Fail.fail "Q monad failure")
So it uses the report True
function (True
indicates that this is an error message being reported, as opposed to a warning message) to report the error message (which ostensibly gets passed to GHC to display a pretty compile-time message) and then calls fail
in the Quasi
instance we’re defining. This shows us why we only ever get the unhelpful “Q monad failure” message if we used try
; because fail
never actually gets the error message!
So it seems like we’ll need to do an indirect pass of storing the message in our qReport
implementation (report
is basically just a proxy to qReport
) and then get back the message in our fail
implementation. Our first pass will look something like this:
newtype TestQ a = TestQ
{ unTestQ ::
-- store the message for ExceptT to throw
StateT (Maybe String)
-- gets the message from StateT and return 'Either String a'
( ExceptT String
Q
)
a
} deriving
( Functor
, Applicative
, Monad
, MonadIO
)
instance Quasi TestQ where
qReport _ msg = TestQ $ State.put $ Just msg
...
instance MonadFail TestQ where
fail msg = do
storedMessage <- TestQ State.get
TestQ $ lift $ Except.throwE $ fromMaybe msg storedMessage
tryTestQ :: Q a -> Q (Either String a)
tryTestQ =
Except.runExceptT -- ExceptT String Q a => Q (Either String a)
. (`State.evalStateT` Nothing) -- StateT ... a => ExceptT String Q a
. unTestQ -- TestQ a => StateT ... a
. runQ -- Q a => TestQ a
Notice that, now when fail
is called in the Q
monad, it will:
- Call
qReport
, which will:- Store the message in the
StateT
state
- Store the message in the
- Call
fail
, which will:- Get the stored message from
qReport
- Throw the stored message with
ExceptT
- Get the stored message from
And now with the tryTestQ
function, we can successfully catch and test the actual error messages thrown by our Template Haskell function! Catching errors is definitely important for testing coverage, but we still end up with the Q
monad. Next step is actually running the Q
monad at runtime.
Mock Q functions
If the Template Haskell function is pure enough (i.e. not needing to inspect the type environment), we could just use runQ
:
tryTestQ_IO :: Q a -> IO (Either String a)
tryTestQ_IO = runQ . tryTestQ
but this won’t work if you use features like reify
or lookupTypeName
. So let’s augment TestQ
to mock out these functions. For this blog post, I’ll only mock out lookupTypeName
, but it should be straightforward to extend for the other functions. First, we need a data type storing the mocks. Then, we’ll need to store it in TestQ
and use in qLookupName
(the proxy for lookupTypeName
and lookupValueName
).
data QState = QState
{ knownNames :: [(String, Name)]
}
newtype TestQ a = TestQ
{ unTestQ ::
ReaderT QState
( ...
)
a
} deriving (...)
instance Quasi TestQ where
qLookupName _ name = do
QState{knownNames} <- TestQ Reader.ask
return $ lookup knownNames name
...
tryTestQ :: QState -> Q a -> Q (Either String a)
tryTestQ qState =
Except.runExceptT
. (`State.evalStateT` Nothing)
. (`Reader.runReaderT` qState)
. unTestQ
. runQ
And now, running
tryTestQ (QState [("Maybe", ''Maybe)]) ...
will lookup using the names we’ve specified. At this point, we can go ahead and use tryTestQ_IO
, and it’ll now work with our mocks! We could even use unsafePerformIO
to get back pure results:
tryTestQ_IO :: QState -> Q a -> IO (Either String a)
tryTestQ_IO qState = runQ . tryTestQ qState
tryTestQ_Pure :: QState -> Q a -> Either String a
tryTestQ_Pure qState = unsafePerformIO . tryTestQ_IO qState
All in all, this was a ton of fun, but it’s a ton of work to do, all to set up what we actually want to do — write tests! So I’ve pushed all of this work into the th-test-utils
package (the implementation here isn’t pushed to Hackage yet, but it’s on master
), hopefully to make someone else’s life easier.
Avoid duplicating tests
Ok so now that we’re able to run Template Haskell functions at runtime, I would like to avoid duplicating all of my tests (one to test the spliced implementation, one to run at runtime and get coverage). So the general idea is to take the prior implementation:
[get| o.foo |]
===>
getKey @"foo" o
and write a new quasiquoter that will splice into running the quasiquoter itself (quine-style) in addition to the normal splice. We’ll use deepseq
to fully evaluate the quasiquoter before returning the usual value (you’ll need to implement NFData
for all the Template Haskell types, a separate package I’ll hopefully split off soon).
[runGet| o.foo |]
===>
quoteExp get "o.foo" `deepseq` getKey @"foo" o
Note: [get| ... |]
is just syntax sugar for quoteExp get "..."
when used as an expression. The implementation is fairly straightforward:
runGet :: QuasiQuoter
runGet = QuasiQuoter
{ quoteExp = \s ->
[|
quoteExp get s -- run quasiquoter at run-time
`deepseq`
$(quoteExp get s) -- run quasiquoter at compile-time and
-- actually return this value at run-time
|]
, ...
}
Test all the things!
And now, runGet
is a drop-in replacement for get
in our tests that will splice the get
quasiquoter as usual, as well as running the quasiquoter at runtime (ignoring the value), finally resulting in automatic coverage on the quasiquotation actually being spliced!
-- old, without coverage
testCase "Test o.foo" $
[get| o.foo |] `shouldBe` 1
-- new, with coverage
testCase "Test o.foo" $
[runGet| o.foo |] `shouldBe` 1
If you have a library that uses Template Haskell or QuasiQuoters, I hope the th-test-utils
library will help you write more/better unit tests. Getting coverage on my library actually helped me uncover bugs, and I hope it can do the same for you!