My last post on parsing in the presence of dynamic pragmas left us with this outline for calling the GHC parser.
flags <-
  parsePragmasIntoDynFlags
    (defaultDynFlags fakeSettings fakeLlvmConfig) file s
whenJust flags $ \flags ->
  case parse file flags s of
    PFailed s ->
      report flags $ snd (getMessages s flags)
    POk s m -> do
      let (wrns, errs) = getMessages s flags
      report flags wrns
      report flags errs
      when (null errs) $ analyzeModule flags m
Now, a GHC parse tree leaves out certain things, such as comments and the locations of keywords (e.g. let, in and so on). If you're writing refactoring tools (think programs like Neil Mitchell's awesome hlint, for example), access to these things is critical!
So, how does one go about getting these program "annotations"? You guessed it... there's an API for that.
If we assume the existence of a function analyzeModule :: DynFlags -> Located (HsModule GhcPs) -> ApiAnns -> IO (), then here's the gist of the code that exercises it:
POk s m -> do
  let (wrns, errs) = getMessages s flags
  report flags wrns
  report flags errs
  when (null errs) $ analyzeModule flags m (harvestAnns s)
Here harvestAnns is defined as:
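-- Collect the annotations accumulated in the final parser state.
harvestAnns :: PState -> ApiAnns
harvestAnns pst =
  ( Map.fromListWith (++) $ annotations pst
    -- Comments still queued in the lexer are filed under noSrcSpan.
  , Map.fromList ((noSrcSpan, comment_q pst) : annotations_comments pst)
  )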
The type ApiAnns is a pair of maps: the first map contains keyword and punctuation locations, the second maps locations of comments to their values.
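To make the shape of that pair concrete, here is a small self-contained sketch that mirrors harvestAnns without depending on GHC. Everything in it is a stand-in invented for illustration: Span and Keyword are plain strings in place of GHC's source spans and keyword ids, comments are plain strings, and toyHarvest, exampleAnns and the "<no location>" key (playing the role of noSrcSpan) are hypothetical names. It only aims to show how the two maps are assembled.

```haskell
import qualified Data.Map as Map

-- Hypothetical stand-ins for GHC's source spans and keyword ids.
type Span    = String
type Keyword = String

-- Toy counterpart of the first map: keyword and punctuation locations.
type KeywordAnns = Map.Map (Span, Keyword) [Span]

-- Toy counterpart of the second map: comment locations to comment text.
type CommentAnns = Map.Map Span [String]

-- Same structure as harvestAnns, over the toy types.
toyHarvest
  :: [((Span, Keyword), [Span])]  -- plays the role of `annotations pst`
  -> [String]                     -- plays the role of `comment_q pst`
  -> [(Span, [String])]           -- plays the role of `annotations_comments pst`
  -> (KeywordAnns, CommentAnns)
toyHarvest anns pending attached =
  ( Map.fromListWith (++) anns    -- entries sharing a key are concatenated
  , Map.fromList (("<no location>", pending) : attached)
  )

-- Made-up sample data: one let/in expression and a span with two commas.
exampleAnns :: (KeywordAnns, CommentAnns)
exampleAnns =
  toyHarvest
    [ (("1:1-1:20", "AnnLet"),   ["1:1-1:3"])
    , (("1:1-1:20", "AnnIn"),    ["1:11-1:12"])
    , (("2:1-2:9",  "AnnComma"), ["2:3"])
    , (("2:1-2:9",  "AnnComma"), ["2:6"])
    ]
    ["-- not yet attached"]
    [("1:1-1:20", ["-- about the let"])]

main :: IO ()
main = do
  let (keywords, comments) = exampleAnns
  -- Both comma locations end up under one key (later entries come first).
  print (Map.lookup ("2:1-2:9", "AnnComma") keywords)
  -- Comments that were still queued are found under the placeholder key.
  print (Map.lookup "<no location>" comments)
```

Running it prints Just ["2:6","2:3"] followed by Just ["-- not yet attached"]. The use of fromListWith (++) matters because the same (span, keyword) key can be recorded more than once, and a plain fromList would keep only one of those entries.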
You might think that's the end of this story, but there's one twist left: the GHC lexer won't harvest comments by default. You have to tell it to do so by means of the Opt_KeepRawTokenStream (general) flag (see the GHC wiki for details)!
Taking the above into account, to parse with comments, the outline now becomes:
flags <-
  parsePragmasIntoDynFlags
    (defaultDynFlags fakeSettings fakeLlvmConfig) file s
whenJust flags $ \flags ->
  case parse file (flags `gopt_set` Opt_KeepRawTokenStream) s of
    PFailed s ->
      report flags $ snd (getMessages s flags)
    POk s m -> do
      let (wrns, errs) = getMessages s flags
      report flags wrns
      report flags errs
      when (null errs) $ analyzeModule flags m (harvestAnns s)
For a complete program demonstrating all of this, see this example in the ghc-lib repo.