My Writing & Publishing Pipeline

August 29, 2022 7 minute read

In early 2021 I set out to write Tao of React with nothing but raw ambition. I had finally decided to write a book, albeit a short one. So I opened a new blank document in a regular text editor and filled 130 pages with my knowledge about the framework.

Then I had to edit the thing.

And this is when I realized that putting together a technical book full of code examples and subtle visual elements is not an easy thing to do. At least not manually. I put page breaks by hand, put every little separator in place, and formatted the fonts and spacing for the code bits.

But when I did further edits the whole thing went to hell. I had to organize the code samples again and chase separators that had flown off their designated places. I bit the bullet and did that a couple of times, but little did I know, this was hardly over.

When I published Tao of React, naturally, I started getting DMs pointing out typos or problems with the code examples. I quickly found out that going through 130 pages to remove a duplicated word would be a nightmarish way to continue my writing journey.

I had somehow introduced technical debt into my book. So when I started writing my second book, Tao of Node, I made some improvements.

Writing Everything in Markdown

Perhaps the biggest mistake I made was using Pages, a regular word processor. I always imagined writers laboring over a blinking cursor on a white page so I did the same. I wasn’t just writing a book, I was fulfilling a dream of mine.

But I wasn’t writing a novel or a short story. A technical book requires far more attention to its formatting. You needn’t worry about plot or character development but you have to be damn sure that the file you upload to Gumroad is well-organized.

So I decided to write my second book in Markdown instead. This means that I have a single source of truth when it comes to formatting and placement. I can then convert it to PDF, HTML, or EPUB and use CSS to style them. This alone is a big upgrade over manually tweaking colors and positioning elements with the mouse.

But another nice thing is that I can utilize version control better. I keep all my writing in a private git repo so I don’t have to write on a single machine. I alternate between my personal and work laptops depending on what I’m doing and where I am.

With git, I have a clear history of my work and edits. Even though I rarely go over them, it’s exciting to see how the book is developed chapter by chapter. And besides that, it also serves as a backup. If I spill a hot cup of coffee on my machine I won’t lose my work.

Generating the Book

Once I have the Markdown file ready, I use the md-to-pdf command line tool to transform it. It’s important to note that versions of the package before 5.0.0 are vulnerable to remote code execution, so be careful.

md-to-pdf tao-of-node.md --stylesheet tao-of-node.css

I use HTML elements directly in the Markdown and then apply styles to them in the stylesheet. This is handy not just for visual customization but for layout control as well. I wanted every chapter to start on a new page so I added a page break before every h2 with a CSS rule.
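As a rough illustration of that page-break rule, here is a minimal sketch that builds the CSS text in JavaScript so it can be pasted or written into the stylesheet. The `h2` selector comes from the paragraph above; the helper name `buildPageBreakRule` is hypothetical, and pairing the legacy `page-break-before` property with the newer `break-before` is my assumption about what a print renderer will honor, not necessarily the exact rule used in the book.

```javascript
// Hypothetical helper: returns the CSS text for a rule that forces a new page.
// The default selector (h2) is taken from the article; the properties are an assumption.
function buildPageBreakRule(selector = 'h2') {
  return [
    `${selector} {`,
    '  /* Legacy property, still widely understood by print renderers. */',
    '  page-break-before: always;',
    '  /* Newer equivalent from CSS Fragmentation. */',
    '  break-before: page;',
    '}',
  ].join('\n')
}

// Print the rule so it can be copied into tao-of-node.css.
console.log(buildPageBreakRule())
```

Keeping both properties costs nothing and guards against a renderer that only understands one of them.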

Initially, I only published PDFs for my books but people kept asking about versions suitable for e-readers. I tried and tried, but the code examples just didn’t look good enough on Kindle.

But when someone asked for a refund because there wasn’t an EPUB version, I decided to leave my perfectionism aside and publish something.

I didn’t manage to find a suitable tool to convert Markdown to EPUB directly, so I had to use HTML as an intermediary. Thankfully, the md-to-pdf package has the option to output HTML.

md-to-pdf tao-of-node.md --as-html

Once I have that, I do some manual processing on it to remove some unnecessary tags added by default. I’m not sure if they mess up the final EPUB but I don’t want to take any chances.

Maybe you’re wondering why I’m doing this cleanup manually, though? Why not automate it?

I did some simple napkin math and it turned out that automating the cleanup would take more time than doing it by hand every time I publish a new version. Since this happens 2-3 times a year, I’d rather do it manually than risk a bug disfiguring the EPUB.
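The napkin math can be sketched as a break-even calculation. Every number here is a hypothetical placeholder rather than a measurement from my own workflow; only the release cadence of about three per year comes from the text.

```javascript
// Hypothetical inputs: replace them with your own estimates.
function breakEvenYears({ automationHours, manualMinutesPerRelease, releasesPerYear }) {
  // Hours spent on manual cleanup in one year.
  const manualHoursPerYear = (manualMinutesPerRelease / 60) * releasesPerYear
  // Years of manual work needed before the automation effort pays off.
  return automationHours / manualHoursPerYear
}

const years = breakEvenYears({
  automationHours: 4, // assumed time to write and debug a cleanup script
  manualMinutesPerRelease: 20, // assumed time to clean the HTML by hand
  releasesPerYear: 3, // upper end of the 2-3 releases mentioned above
})

console.log(`Automation pays for itself after ${years.toFixed(1)} years`)
```

With these made-up inputs the script would need four years to pay for itself, and that is before counting the risk of a bug in the automation.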

Then, I use the epub-gen library to convert the HTML to EPUB with some light processing done on it. The script below is just a plain JS file that I run from the command line.

import fs from 'fs'
import path from 'path'
import Epub from 'epub-gen'
import cheerio from 'cheerio'
import * as url from 'url'
import { epubCSS } from './templates.js'

function capitalizeFirstLetter(string) {
  return string.charAt(0).toUpperCase() + string.slice(1)
}

// Accept the book name as a command line argument.
// I've maintained consistency across folder structure and naming to make generation identical.
const [bookToGenerate] = process.argv.slice(2)
const bookIdentifiers = ['react', 'node']
const __dirname = url.fileURLToPath(new URL('.', import.meta.url))

// Make sure I'm not trying to generate a non-existing book and publish something broken.
if (!bookIdentifiers.includes(bookToGenerate)) {
  throw new Error(
    `You need to pass the book as an argument - ${bookIdentifiers.join(
      ' or '
    )}`
  )
}

const info = {
  title: `Tao of ${capitalizeFirstLetter(bookToGenerate)}`,
  author: 'Alex Kondov',
  cover: `./tao-of-${bookToGenerate}/tao-of-${bookToGenerate}.png`,
}

const savePath = (ext) =>
  path.join(
    __dirname,
    `./tao-of-${bookToGenerate}/tao-of-${bookToGenerate}.${ext}`
  )
const html = fs.readFileSync(
  path.join(
    __dirname,
    `./tao-of-${bookToGenerate}/tao-of-${bookToGenerate}.html`
  ),
  'utf8'
)

const $ = cheerio.load(html)
const chapters = []

$('h1').each(function (i, elem) {
  if (i === 0) {
    // This is the introduction.
    chapters.push({
      data: $(elem)
        .nextUntil('h1')
        .toArray()
        .map((e) => $.html(e))
        .join(''),
      title: $(elem).text(),
    })

    // We don't want separate entries in the TOC for the intro sub-sections so we return early.
    return
  }

  // It's a regular chapter heading.
  // Add all paragraphs and elements until the next sub-chapter heading.
  chapters.push({
    data: $(elem)
      .nextUntil('h2')
      .toArray()
      .map((e) => $.html(e))
      .join(''),
    title: $(elem).text(),
  })

  $('h2').each(function (_, subElem) {
    // Filter out all sub-headings for the current chapter and add them to the EPUB separately.
    // This allows us to have a more detailed TOC with an entry for every sub-heading.
    if (subElem.attribs['id'].startsWith(i)) {
      // This is not optimal because we iterate through all h2s multiple times.
      // A better implementation would be to split them in chunks.
      chapters.push({
        data: $(subElem)
          .nextUntil('h2')
          .toArray()
          .map((e) => $.html(e))
          .join(''),
        title: $(subElem).text(),
      })
    }
  })
})

const option = {
  title: info.title,
  verbose: true,
  author: info.author,
  cover: path.join(__dirname, info.cover),
  css: epubCSS,
  content: chapters,
}

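// epub-gen exposes the generation result as a promise, so report failures instead of ignoring them.
new Epub(option, savePath('epub')).promise.catch((err) => {
  console.error('Failed to generate the EPUB:', err)
  process.exitCode = 1
})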

And voila! The book is now readable by e-readers.

This code is far from the epitome of software design. The common wisdom when writing such small scripts is not to sweat the details. Make it run and toss it in the bin.

Yes, we’ll execute this code rarely, but chances are we’d need to make small changes every time we do. And that’s when we get into trouble.

Performance and algorithmic complexity are definitely not something you should worry about in a one-file script. But it would be best if you still made it ordered, readable, and documented. That’s why I’ve started writing comments, even in my personal projects.

Because the more seldom I work with them, the less context I remember.

This script has an obvious performance flaw, but it’s easy to read from top to bottom, and the comments document all the quirks I was dealing with at the time of its writing.
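For the curious, the comment in the script about splitting the sub-headings into chunks could look something like the sketch below. It is not part of my pipeline. It works on plain objects rather than cheerio elements, the `groupByChapter` name is hypothetical, and it assumes every sub-heading id starts with its chapter number followed by a non-digit, which is an assumption about the id format.

```javascript
// Hypothetical input: sub-headings already extracted as { id, title } objects.
// Assumption: each id starts with its chapter number, e.g. "2-1-testing".
function groupByChapter(subHeadings) {
  const byChapter = new Map()

  // A single pass over the sub-headings instead of one pass per chapter.
  for (const heading of subHeadings) {
    const match = /^(\d+)/.exec(heading.id ?? '')
    if (!match) continue // Skip headings without a numeric prefix.

    const chapter = Number(match[1])
    if (!byChapter.has(chapter)) byChapter.set(chapter, [])
    byChapter.get(chapter).push(heading)
  }

  return byChapter
}

const grouped = groupByChapter([
  { id: '1-1-structure', title: 'Structure' },
  { id: '1-2-tooling', title: 'Tooling' },
  { id: '2-1-testing', title: 'Testing' },
  { id: '10-1-performance', title: 'Performance' },
])

console.log(grouped.get(1).map((h) => h.title))
```

Parsing the whole leading number also keeps chapter 1 from accidentally claiming the sub-headings of chapter 10, which a plain prefix check would do.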

My Actual Writing Process

Figuring out smart ways to transform content is exciting, but at the end of the day, the pages need to be filled with words.

I started both Tao books with an outline. A list of all the rules that I want to include, grouped in chapters. This made me think deeply about what I want to have in the book. I added and cut content numerous times until I was satisfied with the signal-to-noise ratio.

I put the list in a Markdown file and I started filling in the text for each rule.

I didn’t have a strict writing schedule. The only rule I followed was that I had to put some words in the file every day. Some days it was only 200 words or so, others it was 2000. But each day without exception I got closer to completing the first draft.

Looking at my history in GitHub helped me as well. Being able to visualize your streak can be very motivating; it almost turns writing into a game. Every day is another green box that has to be filled.

When I finished the first draft, I left it aside for a week. I didn’t look at it or edit it, even when I realized that I had to change something. I wanted my mind to work on it in the background and, most importantly, to take a break from the book.

Then I picked it up and went through the whole thing again, checking words and code, fixing typos and bugs in the examples. I’ve labored over every curly bracket in these pages, and the thought that some error would still slip by even after I was finished gave me no peace.

But they say that books are never truly finished, only abandoned.

At this point the book is pretty much done, so I focus on the landing page and the images for social media. But before I press the button, I take another weekend to go through it again - a final time. With some luck, I manage to catch a couple more problems and send the book out into the world.
