Bossy Lobster

A blog by Danny Hermes; musing on tech, mathematics, etc.

Edit on GitHub

What npm Can Learn from Go

npm and Go

This is cross-posted from the Hardfin engineering blog.

As programming language ecosystems evolve, they learn from each other. When this happens we all benefit from more useful features, patterns, libraries, and tools. A core part of a programming language ecosystem — packaging — can be a source of pain, e.g. from version conflicts, slow build times, security issues, etc.

At Hardfin we are happy users of both Go and TypeScript. We love the wealth of useful packages in the npm ecosystem that help us build modern web applications. There are a few common patterns in the Go packaging landscape that don't translate to the npm packaging experience, but we would love it if they did! In this post we'll introduce a tool that exemplifies language ecosystem cross pollination. This tool is a minimal implementation of an experiment to bring the benefits of Go package vendoring to the npm ecosystem.

Stuck on an airplane

Alice is scrambling to leave her apartment for a flight that takes off in less than an hour. She wants to get familiar with the application her new team maintains, so she clones the repo in a hurry before packing her laptop. In a mad dash she just manages to board the plane before the gate closes. Finally with a chance to collect herself and breathe, she opens her laptop and gets ready to start coding.

After reading the team's getting started document, she hops into the backend code and fires up the server:

go run ./cmd/feather-server/main.go

Now that the backend is running, she moves on to the frontend instructions in the getting started document. Unfortunately, this is a roadblock that she'll need inflight Wi-Fi to overcome. To build and run the frontend, she first needs to install all dependencies:

npm ci
npm run feather-app

Meanwhile on the ground, Alice's new teammates Eve and Trudy are struggling to understand why the frontend build caused the latest production deploy to fail. It's very unfortunate timing — they are trying to ship a new feature as part of a customer demo.

There were no signs of trouble in the latest merged pull request; all CI checks passed. But the build is failing to install dependencies with:

$ npm ci
npm ERR! code E404
npm ERR! 404 Not Found - GET https://registry.npmjs.org/skugga/-/skugga-9.1.1.tgz - Not found
npm ERR! 404
npm ERR! 404  '[email protected]://registry.npmjs.org/skugga/-/skugga-9.1.1.tgz' is not in this registry.
npm ERR! 404
npm ERR! 404 Note that you can also install from a
npm ERR! 404 tarball, folder, http url, or git url.

npm ERR! A complete log of this run can be found in:
npm ERR!     /root/.npm/_logs/2022-04-29T16_20_17_523Z-debug-0.log

After 30 minutes of scrambling and head scratching, they hop on Twitter and from reading posts in their timeline they realize the author of skugga has removed the package entirely from the package registry.

Why airplane mode for dependencies?

Why was Alice able to start the Go backend after only just cloning the repository? Her team uses go mod vendor and checks in the root vendor/ directory to the codebase. This means that after git clone, all application binaries can be built by the Go compiler — no dependencies to fetch, no inflight Wi-Fi necessary! In addition to enabling airplane mode, there are several other benefits1 from checking in the vendor/ directory:

  • New dependencies show up in pull requests as changes to vendor/; the larger the diff, the more "weight" that the dependency adds
  • All builds (Docker and otherwise) are hermetic and don't require fetching packages from the public internet2
  • Even if dependencies are deleted by a third party, this won't impact the current codebase (avoiding the "my code didn't change but it stopped working!" crisis that Eve and Trudy are in)
  • Fully consistent development environments for all team members (without need for consistent ~/.npm/_cacache, ~/Library/Caches/Yarn/v6, etc.)
  • Supply chain attacks cannot impact the dependencies without making it into a pull request

Switching npm into airplane mode

In order to bring the go mod vendor experience to the npm ecosystem, we've created an experimental CLI tool called npm-mod. Using the tool, Alice can download all package tarballs into the vendor/ directory and change the application's package.json and package-lock.json so that all remote package references are replaced with local file references.

$ npm-mod tidy
$ npm-mod vendor
Saved babel__helper-builder-binary-assignment-operator-visitor-7.16.7.tgz
Saved babel__generator-7.17.9.tgz
...
$
$ ls -1 vendor/
abab-2.0.6.tgz
...
$ ls -1 vendor/ | wc -l
    1135

After running this, both package.json and package-lock.json will be changed and a new .npm-mod.tidy.json file will be added:

diff --git a/package-lock.json b/package-lock.json
index 5303979..02c30e9 100644
--- a/package-lock.json
+++ b/package-lock.json
@@ -18,8 +18,8 @@
       }
     },
     "node_modules/@ampproject/remapping": {
-      "version": "2.1.2",
-      "resolved": "https://registry.npmjs.org/@ampproject/remapping/-/remapping-2.1.2.tgz",
+      "version": "file:vendor/ampproject__remapping-2.1.2.tgz",
+      "resolved": "file:vendor/ampproject__remapping-2.1.2.tgz",
       "integrity": "sha512-hoyByceqwKirw7w3Z7gnIIZC3Wx3J484Y3L/cMpXFbr7d9ZQj2mODrirNzcJa+SM3UlpWXYvKV4RlRpFXlWgXg==",
       "dependencies": {
         "@jridgewell/trace-mapping": "^0.3.0"
@@ -29,8 +29,8 @@
...
diff --git a/package.json b/package.json
index 82115d2..753457c 100644
--- a/package.json
+++ b/package.json
@@ -3,13 +3,13 @@
   "version": "0.0.1",
   "private": true,
   "dependencies": {
-    "@testing-library/jest-dom": "^5.16.4",
-    "@testing-library/react": "^13.1.1",
-    "@testing-library/user-event": "^13.5.0",
-    "react": "^18.0.0",
-    "react-dom": "^18.0.0",
-    "react-scripts": "5.0.1",
-    "web-vitals": "^2.1.4"
+    "@testing-library/jest-dom": "file:vendor/testing-library__jest-dom-5.16.4.tgz",
+    "@testing-library/react": "file:vendor/testing-library__react-13.1.1.tgz",
+    "@testing-library/user-event": "file:vendor/testing-library__user-event-13.5.0.tgz",
+    "react": "file:vendor/react-18.0.0.tgz",
+    "react-dom": "file:vendor/react-dom-18.0.0.tgz",
+    "react-scripts": "file:vendor/react-scripts-5.0.1.tgz",
+    "web-vitals": "file:vendor/web-vitals-2.1.4.tgz"
   },
   "scripts": {
     "start": "react-scripts start",

The next flight: future work

After doing this, Alice can happily install all of the npm packages for her team's application without any internet connection:

$ ping hardfin.com
ping: cannot resolve hardfin.com: Unknown host
$
$ time npm ci
...

added 1393 packages, and audited 1394 packages in 4s

...

real    0m4.456s
user    0m5.112s
sys     0m5.802s

This is still an experiment and we'd want to see some improvements before recommending npm-mod as a core part of your team's workflow:

  • This tool is intended for applications, it should not be used for a library published to a package registry.
  • The vendoring process should hook into npm audit (problems with npm audit notwithstanding).
  • The existence of npm install scripts still poses a large security concern. There is a growing movement to disable install scripts.
  • Though the vendor/ approach mitigates supply chain "denial of service" attacks like the colors incident, actually catching instances of this during code review is still a challenge if the only artifacts checked into the vendor/ directory are tarballs.
  • There is no npm equivalent of the Go build cache (go env GOCACHE); if the same Go package is built multiple times, the previous builds can be reused. The nearest equivalent is the presence of the same (already installed) package in node_modules/.

Landing the plane

We'd love to continue the discussion and improve npm-mod to a point where teams like Alice's and our own are comfortable using it in daily workflows. By adopting this workflow and gaining airplane mode, teams can avoid unexpected breakages and ensure all code required to build the repo is present after checkout.

In addition to the tooling, there is a cultural norm in the Go community of viewing new dependencies with skepticism. This is captured by a Go proverb: A little copying is better than a little dependency. Checking application dependencies into the codebase means they are front and center, both for security and for their contents. In addition to enabling airplane mode, we hope that by vendoring the weight of dependencies starts to get more consideration.

Gopher Landing

  1. The primary downside of checking in vendor/ is a larger git checkout.
  2. Mostly hermetic. This really only applies to the components of the build containing Go or TypeScript code (and packages).

Comments