These are my notes from investigating Turborepo, a monorepo tool developed by Vercel. I had previously built a flashcard web application for my own language study that stored the card data in a static JSON file. For this investigation, I re-implemented it using TypeScript, Next.js, and GraphQL (the code is in the sample project linked below).

What is “monorepo”?

Monorepo is a pattern of storing all the code for an application or a set of microservices in a single repository. When adopting a microservice architecture, the code is usually split into one repository per service, which is generally called Polyrepo.

Monorepo is often confused with Monolithic, but they are different concepts. The counterpart of Monorepo is Polyrepo. Monolithic refers to a tightly coupled codebase, and its counterpart is Modular. In other words, a project can be Monorepo and Monolithic, or Monorepo and Modular.

Polyrepo makes it easy to develop each service independently, but it has some disadvantages. For example, it can be difficult to share code across repositories. Monorepo aims to eliminate these disadvantages.

Introduction to Turborepo

Turborepo is a monorepo tool developed by Vercel, the company that also develops Next.js. The beauty of Turborepo is its simplicity. It is best thought of as a set of build-enhancing features layered on top of npm workspaces, so there is little cumbersome configuration and few extra dependencies.

Workspaces is a feature for managing multiple packages from within a single top-level root package. npm workspaces is similar to yarn workspaces and has been available since npm v7. Turborepo supports not only npm but also yarn and pnpm.

Turborepo provides the following features. In this investigation, I focused on Caching and Remote caching.

Figure: Turborepo features (turborepo-features.png). Source: https://turborepo.org/docs#why-turborepo

Sample Project

https://github.com/tanakaworld/flashcard

Note: It is not production-ready because some features, such as authentication, are missing.

  • Web: Flashcard client app
  • Admin: Dashboard to manage card data
  • API: GraphQL server that provides API for Web and Admin

The sample project contains three applications. Compared with the original version, I moved the data into a database, fetched it dynamically using GraphQL, and added an admin screen to manage it. I then used Turborepo to manage all of the services.

How to configure Turborepo

You only need two pieces of configuration.

workspaces

The workspaces property in the root package.json belongs to npm workspaces. Once you define it, npm evaluates the package.json files under the listed directories and creates symlinks to those packages under the node_modules directory at the root of the repo. This allows one package in the monorepo to refer to another.

{
  "workspaces": [
    "apps/*",
    "packages/*"
  ]
}
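To make the patterns above concrete, here is a minimal TypeScript sketch of which directories they select. It is only an illustration and not npm's actual implementation: it handles just the "<dir>/*" and exact-path forms, and the directory listing (including docs) is a hypothetical one modeled on the sample project.

```typescript
// Minimal sketch: which directories a workspaces pattern such as "apps/*" selects.
// Only the "<dir>/*" and exact-path forms are handled; real npm globbing is richer.
function matchesWorkspace(pattern: string, dir: string): boolean {
  if (pattern.endsWith("/*")) {
    const parent = pattern.slice(0, -2); // "apps/*" -> "apps"
    const rest = dir.startsWith(parent + "/") ? dir.slice(parent.length + 1) : "";
    // A single "*" matches exactly one path segment, so nested directories are excluded.
    return rest.length > 0 && !rest.includes("/");
  }
  return pattern === dir;
}

function selectWorkspaces(patterns: string[], dirs: string[]): string[] {
  return dirs.filter((dir) => patterns.some((p) => matchesWorkspace(p, dir)));
}

// Hypothetical directory listing (an assumption for illustration).
const dirs = [
  "apps/web",
  "apps/admin",
  "apps/api",
  "packages/jest-internal",
  "packages/jest-internal/setupFiles",
  "docs",
];

const selected = selectWorkspaces(["apps/*", "packages/*"], dirs);
console.log(selected);
```

Only the direct children of apps and packages are treated as workspace packages here; the nested setupFiles directory and the unrelated docs directory are left out.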

turbo.json

Turborepo has only one config file, turbo.json. It defines the dependencies between tasks across packages. The keys under pipeline are command names such as dev and build. When you run turbo run <command>, the script with that name in each package’s package.json is executed.

For example, web#test:integration has a dependsOn property that declares a dependency on web#build. So when you execute turbo run test:integration --filter=web, web#build is guaranteed to run before test:integration.

{
  "pipeline": {
    "build": {
      "dependsOn": ["^build"],
      "outputs": ["dist/**"]
    },
    "web#build": {
      "dependsOn": [
        "^build",
        "$NEXT_PUBLIC_GQL_SERVER_URI"
      ],
      "outputs": [".next/**"]
    },
    "admin#build": {
      "dependsOn": [
        "^build",
        "$NEXT_PUBLIC_GQL_SERVER_URI"
      ],
      "outputs": [".next/**"]
    },
    "api#build": {
      "dependsOn": [
        "^build",
        "$DATABASE_URL",
        "$ORIGIN_NAME_WEB",
        "$ORIGIN_NAME_ADMIN"
      ],
      "outputs": ["build/**"]
    },
    "lint": {
      "outputs": []
    },
    "lint:fix": {
      "outputs": []
    },
    "test": {
      "outputs": []
    },
    "test:integration": {
      "outputs": []
    },
    "web#test:integration": {
      "dependsOn": ["web#build"],
      "outputs": []
    },
    "dev": {
      "cache": false
    },
    "clean": {
      "cache": false
    },
    "setup": {
      "cache": false
    }
  }
}
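The ordering that dependsOn implies can be sketched in TypeScript as a small depth-first walk. This is my own illustration under stated assumptions, not Turborepo's scheduler: planTasks is a hypothetical helper, the package graph (web depending on ui) is assumed, the pipeline is a subset of the one above, and the walk assumes there are no dependency cycles.

```typescript
type TaskConfig = { dependsOn?: string[] };
type Pipeline = Record<string, TaskConfig | undefined>;
type PackageGraph = Record<string, string[] | undefined>; // package -> workspace dependencies

// Returns task ids ("pkg#task") in an order where every dependency comes first.
// Assumes the graph is acyclic; a real tool would detect cycles.
function planTasks(
  pipeline: Pipeline,
  graph: PackageGraph,
  pkg: string,
  task: string,
  plan: string[] = []
): string[] {
  const id = `${pkg}#${task}`;
  if (plan.includes(id)) return plan; // already scheduled
  // A package-specific entry ("web#build") takes precedence over the generic one ("build").
  const config: TaskConfig = pipeline[id] ?? pipeline[task] ?? {};
  for (const dep of config.dependsOn ?? []) {
    if (dep.startsWith("$")) continue; // env var: assumed to affect the hash, not the order
    if (dep.startsWith("^")) {
      // "^build": run build in each workspace dependency of this package first.
      for (const depPkg of graph[pkg] ?? []) {
        planTasks(pipeline, graph, depPkg, dep.slice(1), plan);
      }
    } else if (dep.includes("#")) {
      const sep = dep.indexOf("#"); // "web#build": a task in a named package
      planTasks(pipeline, graph, dep.slice(0, sep), dep.slice(sep + 1), plan);
    } else {
      planTasks(pipeline, graph, pkg, dep, plan); // another task in the same package
    }
  }
  plan.push(id);
  return plan;
}

const pipeline: Pipeline = {
  build: { dependsOn: ["^build"] },
  "web#build": { dependsOn: ["^build", "$NEXT_PUBLIC_GQL_SERVER_URI"] },
  "test:integration": {},
  "web#test:integration": { dependsOn: ["web#build"] },
};
const graph: PackageGraph = { web: ["ui"], ui: [] }; // hypothetical dependency graph

const plan = planTasks(pipeline, graph, "web", "test:integration");
console.log(plan); // ["ui#build", "web#build", "web#test:integration"]
```

The sketch lists ui#build even if ui had no build script; as far as I understand, Turborepo simply skips packages that do not define the script.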

Investigation Topics

Topic 1: Caching

Turborepo generates a hash of each task’s inputs based on certain rules and saves the task’s output under that hash. It reuses the saved output to make builds more efficient. The hash determines whether a build hits the cache: if a matching hash exists, Turborepo skips executing the task and restores the cached output from the local cache or downloads it from the remote one.
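The idea can be sketched in a few lines of TypeScript. This is a toy model under my own assumptions, not Turborepo's actual hashing rules: the inputs are reduced to file contents, environment variables, and dependency hashes, a 32-bit FNV-1a function stands in for the real hash, and hashInputs and runCached are hypothetical names.

```typescript
// FNV-1a (32-bit): a toy stand-in for whatever content hash the real tool uses.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

interface TaskInputs {
  files: Record<string, string>; // path -> file content
  env: Record<string, string>; // environment variables the task depends on
  dependencyHashes: string[]; // hashes of upstream tasks
}

// Sort the keys so the same inputs always produce the same hash.
function hashInputs(inputs: TaskInputs): string {
  const files = Object.keys(inputs.files).sort().map((p) => `${p}=${inputs.files[p]}`);
  const env = Object.keys(inputs.env).sort().map((k) => `${k}=${inputs.env[k]}`);
  return fnv1a(JSON.stringify([files, env, [...inputs.dependencyHashes].sort()]));
}

const cache = new Map<string, string>(); // hash -> saved log

// Runs the task only when no entry exists for the hash of its inputs.
function runCached(name: string, inputs: TaskInputs, execute: () => string): string {
  const key = hashInputs(inputs);
  const saved = cache.get(key);
  if (saved !== undefined) return `${name}: cache hit, replaying output ${key}\n${saved}`;
  const log = execute();
  cache.set(key, log);
  return `${name}: cache miss, executing ${key}\n${log}`;
}

let runs = 0;
const build = (): string => {
  runs++;
  return "> next build";
};
const inputs: TaskInputs = {
  files: { "pages/index.tsx": "v1" },
  env: { NEXT_PUBLIC_GQL_SERVER_URI: "http://localhost:4000" }, // hypothetical value
  dependencyHashes: [],
};

const first = runCached("admin:build", inputs, build); // miss: the task runs
const second = runCached("admin:build", inputs, build); // hit: the saved log is replayed
const third = runCached("admin:build", { ...inputs, files: { "pages/index.tsx": "v2" } }, build); // miss
console.log([first, second, third].join("\n"), `\nexecutions: ${runs}`);
```

Changing a single file's content changes the hash, so only the third call executes the build again.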

Vercel provides a cloud cache called “Remote caching”, which is in beta as of May 2022. To use Remote caching, you need to sign in to Vercel in each local environment. Once one person has uploaded a cache, anyone else who hits that cache downloads it and replays it.
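A rough TypeScript sketch of that sharing behavior follows. It assumes a local cache that is checked before a shared remote one, which is my reading of the behavior rather than a documented algorithm; runWithRemote and the two developers are hypothetical, and plain Maps stand in for the real storage.

```typescript
type Cache = Map<string, string>; // hash -> saved log

// Looks in the local cache first, then the shared remote cache, and only then executes.
function runWithRemote(local: Cache, remote: Cache, hash: string, execute: () => string): string {
  if (local.has(hash)) return "local hit";
  const shared = remote.get(hash);
  if (shared !== undefined) {
    local.set(hash, shared); // "download" the artifact so the next run is a local hit
    return "remote hit";
  }
  const log = execute();
  local.set(hash, log);
  remote.set(hash, log); // "upload" so that other machines can reuse it
  return "miss";
}

const remote: Cache = new Map();
const alice: Cache = new Map(); // hypothetical developer machines
const bob: Cache = new Map();

let executions = 0;
const build = (): string => {
  executions++;
  return "> next build";
};

const results = [
  runWithRemote(alice, remote, "9effa2fc44a41e0f", build), // Alice builds and uploads
  runWithRemote(bob, remote, "9effa2fc44a41e0f", build), // Bob downloads Alice's artifact
  runWithRemote(bob, remote, "9effa2fc44a41e0f", build), // Bob now has it locally
];
console.log(results, executions);
```

The build runs once in total, even though it was requested three times on two machines.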

When you run npx turbo run build, the build script in each package runs. If you see cache hit, replaying output, Turborepo used cached artifacts without running the task. On a cache hit, Turborepo replays the stdout log saved in .turbo/.turbo/turbo-build.log.

Logs when running npx turbo run build:

√ flashcard % npx turbo run build
• Packages in scope: admin, api, config, jest-internal, tsconfig, ui, web
• Running build in 7 packages
• Remote computation caching enabled (experimental)
api:build: cache hit, replaying output 9effa2fc44a41e0f
•••
admin:build: cache hit, replaying output 183356efe25f7c63
•••
web:build: cache hit, replaying output b3b90d06be369eab
•••

 Tasks:    3 successful, 3 total
Cached:    3 cached, 3 total
  Time:    137ms >>> FULL TURBO

Contents in .turbo/.turbo/turbo-build.log:

admin:build: cache hit, replaying output ec363ea39f93b492
admin:build: 
admin:build: > admin@0.0.0 build
admin:build: > next build
admin:build: 
admin:build: info  - Loaded env from /Users/tanakaworld/ws/github.com/tanakaworld/flashcard/apps/admin/.env
admin:build: info  - Checking validity of types...
admin:build: info  - Creating an optimized production build...
admin:build: info  - Compiled successfully
admin:build: info  - Collecting page data...
admin:build: info  - Generating static pages (0/3)
admin:build: info  - Generating static pages (3/3)
admin:build: info  - Finalizing page optimization...
admin:build: 
admin:build: Page                                       Size     First Load JS
admin:build: ┌ ○ /                                      22 kB           134 kB
admin:build: ├   /_app                                  0 B             112 kB
admin:build: └ ○ /404                                   193 B           112 kB
admin:build: + First Load JS shared by all              112 kB
admin:build:   ├ chunks/framework-c4190dd27fdc6a34.js   42 kB
admin:build:   ├ chunks/main-24e9726f06e44d56.js        26.9 kB
admin:build:   ├ chunks/pages/_app-8b0b2d113007eece.js  41.7 kB
admin:build:   └ chunks/webpack-575f60f3d6dc216c.js     1.03 kB
admin:build: 
admin:build: ○  (Static)  automatically rendered as static HTML (uses no initial props)
admin:build: 

On the other hand, on a cache miss Turborepo runs the task again. This is the log after modifying apps/admin. Note the line admin:build: cache miss, executing ec363ea39f93b492.

• Packages in scope: admin, api, config, jest-internal, tsconfig, ui, web
• Running build in 7 packages
• Remote computation caching enabled (experimental)
api:build: cache hit, replaying output 9effa2fc44a41e0f
•••
web:build: cache hit, replaying output b3b90d06be369eab
•••
admin:build: cache miss, executing ec363ea39f93b492
•••

 Tasks:    3 successful, 3 total
Cached:    2 cached, 3 total
  Time:    10.251s

Topic 2: GraphQL codegen and sharing types

I tried GraphQL codegen and sharing the generated types. apps/api is the GraphQL server for apps/web and apps/admin. Running npm run codegen generates TypeScript types in apps/api/src/types/__generated__/graphql.ts. apps/api/package.json has a types property, which TypeScript uses to locate a package’s type definitions [1], so you can export the types with it.

apps/api/package.json:

{
  "types": "./src/types/__generated__/graphql.ts"
}

Then you can import the exported types anywhere in the monorepo as usual. api is the package name of apps/api.

import { Card } from "api";
const cardList: Card[] = [];

Topic 3: Deployment & Remote caching

Vercel

I deployed the Next.js apps on Vercel. It is the easiest way to deploy, and the required setup is not difficult. Because there are multiple projects in the repository, you simply tell Vercel the build command and where the project root is.

Figure: Vercel build settings (vercel-build-setting-1.png)

With the --filter option, Turborepo executes only the tasks related to the given package. Because turbo.json already knows the dependencies of web in this case, we don’t need to be aware of them ourselves.

There are two pull requests, and pull request #1 has two jobs. The build in #1-Job1 took 24 seconds because there was no cache; npm run build reported cache miss, executing 9effa2fc44a41e0f. However, the build in #1-Job2 took only 1 second and reported cache hit, replaying output b3b90d06be369eab. It just replayed the cached console output and restored the cached artifacts.

We can enable remote caching on CI as well [2].

•••
jobs:
  build:
    •••
    env:
      TURBO_TOKEN: ${{ secrets.TURBO_TOKEN }}
      TURBO_TEAM: ${{ secrets.TURBO_TEAM }}

Heroku

apps/api is deployed on Heroku because it requires MySQL to store the data.

Heroku has suspended its GitHub integration since April 2022 because of a security incident [3]. Since I could not use the GitHub repo integration, I tried deploying via the CLI from GitHub Actions instead.

Create an app on Heroku and configure MariaDB (a MySQL-compatible database that is available for free on Heroku).

$ heroku login
$ heroku create <app-name>

# https://devcenter.heroku.com/articles/nodejs-support
$ heroku buildpacks:set heroku/nodejs -a=<app-name>

# https://devcenter.heroku.com/articles/jawsdb-maria
$ heroku addons:create jawsdb-maria
$ heroku config -a=<app-name>
$ heroku config:set DATABASE_URL=mysql://••• -a=<app-name>

I wanted to build and deploy only the API-related code, so I customized the build step for Heroku. --filter is a Turborepo option [4]. The heroku-postbuild script [5] is run by Heroku during the build step.

{
  "scripts": {
    •••
    "heroku-postbuild": "turbo run build --filter=api"
  }
}

Topic 4: Sharing configuration

I tried sharing Jest-related configuration and dependencies by creating packages/jest-internal. The package name must not conflict with any package published on the npm registry, which is why I named it jest-internal.

{
  "name": "jest-internal",
  "files": [
    "jest.config.js",
    "setupFiles/jest-fetch-mock.ts",
    "setupFiles/jsdom.ts"
  ],
  "dependencies": {
    "@testing-library/jest-dom": "5.16.4",
    "@testing-library/react": "12.1.5",
    "@types/jest": "27.5.1",
    "jest": "28.1.0",
    "jest-environment-jsdom": "28.1.0",
    "ts-jest": "28.0.2"
  },
  "devDependencies": {
    "jest-fetch-mock": "3.0.3"
  }
}

The files property lists the files that the package exposes. When npm dependencies are installed, they are placed under the node_modules directory at the root of the repo, which is why Jest can be used anywhere in the repository.

You can reuse shared files from another package like this:

// apps/web/jest.config.js
const base = require("jest-internal/jest.config");

module.exports = {
  ...base,
  testEnvironment: "jsdom",
  setupFilesAfterEnv: [
    "jest-internal/setupFiles/jsdom.ts",
    "jest-internal/setupFiles/jest-fetch-mock.ts",
  ],
  moduleNameMapper: {
    "~/(.*)": "<rootDir>/src/$1",
  },
};

References


  1. Publishing types https://www.typescriptlang.org/docs/handbook/declaration-files/publishing.html

  2. https://turborepo.org/docs/ci/github-actions

  3. Plans to Re-enable the GitHub Integration https://blog.heroku.com/github-integration-update

  4. --filter option https://turborepo.org/docs/core-concepts/filtering

  5. heroku-postbuild https://devcenter.heroku.com/articles/nodejs-support#customizing-the-build-process