We will implement a simple static blog generator in Haskell, converting documents written in our own custom markup language to HTML.
Learn Haskell by building a blog generator - Book
In this part we'll explore a few basic building blocks in Haskell, including functions, types and modules, while building a small HTML printer library with which we will later construct HTML pages from our markup blog posts.
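As a rough idea of what such a printer library can look like, here is a minimal sketch built from ordinary functions on strings. The names el, html_, head_, title_, body_, h1_, p_ and makeHtml are illustrative assumptions and not necessarily the ones used in the book, and the sketch does no escaping of special characters.

```haskell
-- A minimal HTML printer made of ordinary functions on strings.
-- All names here are illustrative assumptions.

-- | Wrap some content in an opening and a closing tag.
el :: String -> String -> String
el tag content =
  "<" <> tag <> ">" <> content <> "</" <> tag <> ">"

html_, head_, title_, body_, h1_, p_ :: String -> String
html_ = el "html"
head_ = el "head"
title_ = el "title"
body_ = el "body"
h1_ = el "h1"
p_ = el "p"

-- | Build a whole page from a title and an already-rendered body.
makeHtml :: String -> String -> String
makeHtml title content =
  html_ (head_ (title_ title) <> body_ content)
```

Loading this file in GHCi and evaluating putStrLn (makeHtml "My page" (h1_ "Hello" <> p_ "World")) prints a complete page. Adding a main that does the same makes the file usable with runghc.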
runghc hello.hs > hello.html
Resources:
Util DSL library for HTML - Lucid
Learn more about GHCi - GHC User Guide
Package Repository - Hackage
In this chapter we will define our own simple markup language and parse documents written in this language into Haskell data structures.
Our markup language will contain the following features:
- Headings: prefix by a number of * characters
- Paragraphs: a group of lines without empty lines in between
- Unordered lists: a group of lines each prefixed with -
- Ordered lists: a group of lines each prefixed with #
- Code blocks: a group of lines each prefixed with >
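The features listed above could be modelled with a data type along the following lines. The parser below is a simplified sketch written for illustration, under the assumption that every list or code prefix is followed by a space; it is not the parser developed in the book.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd, isPrefixOf, stripPrefix)
import Data.Maybe (mapMaybe)
import Numeric.Natural (Natural)

type Document = [Structure]

-- One constructor per feature of the markup language.
data Structure
  = Heading Natural String
  | Paragraph String
  | UnorderedList [String]
  | OrderedList [String]
  | CodeBlock [String]
  deriving (Eq, Show)

-- | Group consecutive lines into structures according to their prefix.
parse :: String -> Document
parse = go . lines
  where
    go [] = []
    go (l : rest)
      | null (trim l) = go rest -- blank lines only separate blocks
      | "*" `isPrefixOf` l =
          let (stars, text) = span (== '*') l
           in Heading (fromIntegral (length stars)) (trim text) : go rest
      | "- " `isPrefixOf` l = block "- " UnorderedList
      | "# " `isPrefixOf` l = block "# " OrderedList
      | "> " `isPrefixOf` l = block "> " CodeBlock
      | otherwise =
          let (para, rest') = span isPlain (l : rest)
           in Paragraph (unwords (map trim para)) : go rest'
      where
        -- Collect the run of lines sharing the prefix and strip it.
        block prefix make =
          let (items, rest') = span (prefix `isPrefixOf`) (l : rest)
           in make (mapMaybe (stripPrefix prefix) items) : go rest'

    -- A paragraph line is non-blank and carries no special prefix.
    isPlain line =
      not (null (trim line))
        && not (any (`isPrefixOf` line) ["*", "- ", "# ", "> "])

    trim = dropWhileEnd isSpace . dropWhile isSpace
```

Keeping the result as a plain data structure means the HTML conversion can be written and tested separately from the parsing.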
We can ask GHC to notify us when we accidentally write overlapping patterns, or when we haven't listed enough patterns to match all possible values, by passing the flag -Wall to ghc or runghc.
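A small illustration of the kind of code these warnings apply to. The Color type and describe function are made up for this example.

```haskell
-- A made-up type with three possible values.
data Color = Red | Green | Blue

-- | All three constructors are handled, so GHC has nothing to report.
describe :: Color -> String
describe color =
  case color of
    Red -> "red"
    Green -> "green"
    Blue -> "blue"
    -- Deleting the Blue line leaves a value unmatched, which GHC
    -- reports as an incomplete pattern match when -Wall is on.
    -- Adding a second `Red -> ...` alternative after this one is
    -- reported as an overlapping (redundant) pattern.
```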
runghc -Wall hello.hs > hello.html
Testing parse markup
ghci> txt <- readFile "/tmp/sample.txt"
ghci> print $ parse txt
Resources:
Find out which module to import - Hoogle
In this chapter we are going to glue together the pieces that we built and build an actual blog generator. We will:
- Read markup text from a file
- Parse the text to a Document
- Convert the result to our Html EDSL
- Generate HTML code
- Write it to file
While doing so, we will learn:
- How to work with IO
- How to import external libraries to process whole directories and create a simple command-line interface
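The five steps listed above can be sketched as a pure core wrapped in a thin IO shell. The Document and Html types, parse, convert and render below are deliberately tiny stand-ins (assumptions made for this example, with no escaping), not the modules built in the earlier chapters.

```haskell
-- Simplified stand-ins for the markup and HTML modules.
newtype Document = Document [String]
newtype Html = Html String

-- | Parse: here every non-empty line becomes a paragraph.
parse :: String -> Document
parse = Document . filter (not . null) . lines

-- | Convert the document to the (stand-in) Html type.
convert :: String -> Document -> Html
convert title (Document paragraphs) =
  Html
    ( "<html><head><title>" <> title <> "</title></head><body>"
        <> concatMap (\p -> "<p>" <> p <> "</p>") paragraphs
        <> "</body></html>"
    )

-- | Generate the HTML text.
render :: Html -> String
render (Html str) = str

-- | The pure part of the pipeline: markup text in, HTML text out.
process :: String -> String -> String
process title = render . convert title . parse

-- | The IO part: read the input file, run the pure pipeline, write the result.
convertFile :: String -> FilePath -> FilePath -> IO ()
convertFile title inputPath outputPath = do
  content <- readFile inputPath
  writeFile outputPath (process title content)
```

Keeping process free of IO means it can be tried directly in GHCi without touching the file system.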
A project is described in a cabal file. We can ask cabal or stack to generate one for us using cabal init --libandexe or stack new.
The cabal.project and stack.yaml files are used by cabal and stack respectively to add additional information on how to build the package. While cabal.project isn't necessary to use cabal, stack.yaml is necessary in order to use stack, so we will cover it briefly.
Resources:
For more information about imports, see this wiki article.
Haskell's central package archive - Stackage
The most popular package managers for Haskell are cabal and stack.
You can find more licenses if you'd like at choosealicense.com.
The optparse-applicative package has pretty decent documentation.
You can find the laws for the applicative functors in this article called Typeclassopedia, which talks about various useful type classes and their laws.
We left a function unimplemented in the last chapter, and there are a few more things left for us to do before we can actually call our program a static blog generator. We still need to process multiple files in a directory and create an index landing page with links to the other pages.
Our general strategy for processing whole directories is going to be:
- Create the output directory
- Grab all file names in a directory
- Filter them according to their extension: we want to process txt files and copy other files without modification
- We want to parse each text file, build an index of the result, convert the files to HTML, and write everything to the target directory
While our parsing function can't really fail, trying to read or write a file to the file-system can fail in several ways. It would be nice if our static blog generator was robust enough that it wouldn't fail completely if one single file gave it some trouble. This is a good opportunity to learn about error handling in Haskell, both in uneffectful code and for I/O code.
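Before any error handling is added, the directory-processing strategy above can be sketched as follows. This is a simplified illustration using the directory and filepath packages: the names splitByExtension and convertDirectory are assumptions, the index page is left out, subdirectories are not handled, and any I/O failure simply aborts the run.

```haskell
import Data.List (partition)
import System.Directory (copyFile, createDirectoryIfMissing, listDirectory)
import System.FilePath (takeBaseName, takeExtension, (<.>), (</>))

-- | Split file names into txt files (to convert) and all the others (to copy).
splitByExtension :: [FilePath] -> ([FilePath], [FilePath])
splitByExtension = partition ((== ".txt") . takeExtension)

-- | Convert every txt file in the input directory with the given function
-- and copy every other file unchanged.
convertDirectory :: (String -> String) -> FilePath -> FilePath -> IO ()
convertDirectory toHtml inputDir outputDir = do
  createDirectoryIfMissing True outputDir
  names <- listDirectory inputDir
  let (txtFiles, otherFiles) = splitByExtension names
  mapM_ convertOne txtFiles
  mapM_ copyOne otherFiles
  where
    convertOne name = do
      content <- readFile (inputDir </> name)
      writeFile (outputDir </> takeBaseName name <.> "html") (toHtml content)
    copyOne name =
      copyFile (inputDir </> name) (outputDir </> name)
```

Passing the conversion function as an argument keeps this sketch independent of how the markup is parsed and rendered.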
In the next few chapters we'll survey the landscape of error handling in Haskell before figuring out the right approach for our use case.
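As a first taste of error handling in uneffectful code, here is a small sketch that uses Either to report which step failed. The MetaError and PostMeta types and the field names are invented for this example.

```haskell
import Text.Read (readMaybe)

-- Invented error type: each constructor says what went wrong.
data MetaError
  = MissingField String
  | InvalidNumber String String
  deriving (Eq, Show)

-- Invented metadata record for a blog post.
data PostMeta = PostMeta
  { metaTitle :: String
  , metaYear :: Int
  }
  deriving (Eq, Show)

-- | Look up a field, failing with MissingField if it is absent.
lookupField :: String -> [(String, String)] -> Either MetaError String
lookupField key fields =
  maybe (Left (MissingField key)) Right (lookup key fields)

-- | Read a number, failing with InvalidNumber if the text is not numeric.
readNumber :: String -> String -> Either MetaError Int
readNumber key text =
  maybe (Left (InvalidNumber key text)) Right (readMaybe text)

-- | The do block stops at the first Left and returns it.
parseMeta :: [(String, String)] -> Either MetaError PostMeta
parseMeta fields = do
  title <- lookupField "title" fields
  year <- lookupField "year" fields >>= readNumber "year"
  pure (PostMeta title year)
```

Because the error is an ordinary value, the caller decides what to do with it: print it, skip the file, or stop.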
Haskell's ability to create very concise code using abstractions is great once one is familiar with the abstractions. Knowing the monad abstraction, we are now already familiar with the core composition API of many libraries - for example:
- Concurrent and asynchronous programming
- Web programming
- Testing
- Emulating stateful computation
- Sharing environment between computations
- And many more.
The Control.Exception module provides us with the ability to throw exceptions from IO code.
As an aside, Handler uses a concept called existentially quantified types to hide inside it a function that takes an arbitrary type that implements Exception.
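A minimal sketch of throwing and catching exceptions with Control.Exception. The BlogError type, readPost and safeReadPost are made up for this example; throwIO, catches and Handler come from the module itself.

```haskell
import Control.Exception (Exception, Handler (..), IOException, catches, throwIO)

-- A made-up exception type for our own failures.
data BlogError = EmptyPost FilePath
  deriving Show

instance Exception BlogError

-- | Read a post, throwing our own exception if the file is empty.
-- readFile itself may throw an IOException (for example if the file is missing).
readPost :: FilePath -> IO String
readPost path = do
  content <- readFile path
  if null content
    then throwIO (EmptyPost path)
    else pure content

-- | Catch both kinds of exception and turn them into an Either value.
-- Each Handler wraps a function for a different exception type.
safeReadPost :: FilePath -> IO (Either String String)
safeReadPost path =
  (Right <$> readPost path)
    `catches` [ Handler (\e -> pure (Left ("I/O error: " <> show (e :: IOException))))
              , Handler (\e -> pure (Left ("Blog error: " <> show (e :: BlogError))))
              ]
```

The list passed to catches can hold handlers for different exception types precisely because Handler hides the concrete type inside it.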
Haskell is a standardized language. However, GHC provides extensions to the language - additional features that aren't covered in the 98 or 2010 standards of Haskell. These include syntactic extensions (like LambdaCase), extensions to the type checker, and more.
These extensions can be added by adding {-# language <extension-name> #-} (the language part is case insensitive) to the top of a Haskell source file, or they can be set globally for an entire project by specifying them in the default-extensions section in the .cabal file.
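For instance, a file that enables LambdaCase might look like the sketch below. The Structure type and render function are simplified stand-ins for illustration.

```haskell
{-# LANGUAGE LambdaCase #-}

-- A simplified stand-in for a markup structure.
data Structure
  = Heading Int String
  | Paragraph String

-- | With LambdaCase, `\case` matches directly on the function's argument,
-- so we don't need to name it first.
render :: Structure -> String
render = \case
  Heading n txt -> "<h" <> show n <> ">" <> txt <> "</h" <> show n <> ">"
  Paragraph txt -> "<p>" <> txt <> "</p>"
```

Without the pragma on the first line, GHC rejects the \case syntax.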
The list of language extensions can be found in the GHC manual. Feel free to browse it, but don't worry about trying to memorize all the extensions.
Monad transformers provide a way to stack monad capabilities on top of one another.
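As a small sketch of this idea, ExceptT from the transformers package stacks the ability to fail with an error on top of IO. The checkTitle and checkPost functions are invented for this example.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)

-- | ExceptT String IO adds "may fail with a String" on top of IO.
checkTitle :: String -> ExceptT String IO String
checkTitle title
  | null title = throwE "title must not be empty"
  | otherwise = do
      -- lift runs a plain IO action inside the stacked monad.
      lift (putStrLn ("Accepted title: " <> title))
      pure title

-- | Steps are sequenced as usual; the first throwE stops the rest.
checkPost :: String -> String -> ExceptT String IO (String, String)
checkPost title body = do
  checkedTitle <- checkTitle title
  if null body
    then throwE "body must not be empty"
    else pure (checkedTitle, body)
```

Running runExceptT (checkPost "" "text") gives back an IO action whose result is Left "title must not be empty", so the error can be handled like any other value.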
We'd like to add some sort of environment that keeps general information about the blog, such as the blog name and stylesheet location, available to the various processing steps.
We can represent our environment as a record data type and build it from user input.
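One possible shape for such an environment, together with a function that reads from it through the Reader type of the transformers package. The field names, the default values and pageHead are assumptions made for this sketch.

```haskell
import Control.Monad.Trans.Reader (Reader, asks, runReader)

-- | General information about the blog (field names are assumptions).
data Env = Env
  { eBlogName :: String
  , eStylesheetPath :: FilePath
  }
  deriving Show

-- | Values to fall back on when the user supplies nothing.
defaultEnv :: Env
defaultEnv = Env "My Blog" "style.css"

-- | Build the contents of a page's head element.
-- The environment is read with asks instead of being passed as an argument.
pageHead :: String -> Reader Env String
pageHead postTitle = do
  blogName <- asks eBlogName
  stylesheet <- asks eStylesheetPath
  pure
    ( "<title>" <> blogName <> " - " <> postTitle <> "</title>"
        <> "<link rel=\"stylesheet\" href=\"" <> stylesheet <> "\">"
    )
```

The environment is supplied once at the edge of the program, for example with runReader (pageHead "Hello") defaultEnv.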
Sample:
cabal run HaskellBlogGenerator -- convert-dir --input .\tmp --output .\html-dist --replace
We want to add some tests to our blog generator. At the very least a few regression tests to make sure that if we extend or change our markup parsing code, HTML generation code, or translation from markup to HTML code, and make a mistake, we'll have a safety net alerting us of issues.
We will use the hspec testing framework to write our tests. There are other testing frameworks in Haskell, for example tasty, but I like hspec's documentation, so we'll use that.
There are many ways to help others to get started with our projects and libraries. For example, we can write tutorials, provide runnable examples, describe the internals of the system, and create an API reference.
In this chapter we will focus on generating API reference pages (the kind that can be seen on Hackage) from annotated Haskell source code using Haddock.
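To give an idea of what annotated source code looks like, here is a short sketch. The Html type and its functions are simplified stand-ins; the comment forms used (-- | before a declaration, -- ^ after an argument, >>> for an example, and single quotes to link to an identifier) are standard Haddock markup.

```haskell
-- | An HTML fragment whose text has already been escaped.
newtype Html = Html String

-- | Wrap text in a paragraph element, escaping special characters.
--
-- >>> render (p_ "1 < 2")
-- "<p>1 &lt; 2</p>"
p_ :: String -> Html
p_ text = Html ("<p>" <> escape text <> "</p>")

-- | Turn an 'Html' value into a plain 'String'.
render
  :: Html   -- ^ The fragment to print
  -> String -- ^ The resulting HTML text
render (Html str) = str

-- | Replace characters that have a special meaning in HTML.
escape :: String -> String
escape = concatMap escapeChar
  where
    escapeChar c =
      case c of
        '<' -> "&lt;"
        '>' -> "&gt;"
        '&' -> "&amp;"
        '"' -> "&quot;"
        '\'' -> "&#39;"
        _ -> [c]
```

Haddock picks these comments up and places them next to the corresponding types and functions in the generated pages.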
cabal haddock
If you'd like to learn even more about Haskell and continue your Haskell journey beyond this book, check out the appendix sections Where to go next and the FAQ.
Author site: https://gilmi.me/