Proxies are complicated: RCE vulnerability in a 3 million downloads/week NPM package

Pac-Resolver, a widely used NPM dependency, had a high-severity RCE (Remote Code Execution) vulnerability that could allow network administrators or other malicious actors on your local network to remotely run arbitrary code inside your Node.js process whenever you tried to send an HTTP request.

This is bad!

This package is used for PAC file support in Pac-Proxy-Agent, which is used in turn in Proxy-Agent, which is then used all over the place as the standard go-to package for HTTP proxy autodetection & configuration in Node.js. It's very popular: Proxy-Agent is used everywhere from AWS's CDK toolkit to the Mailgun SDK to the Firebase CLI (3 million downloads per week in total, and 285k public dependent repos on GitHub).

I found this lovely little issue a short while back, while adding proxy support to HTTP Toolkit (yes, code reviewing your dependencies is a good idea!). The vulnerability was fixed in v5.0.0 of all those packages recently, and was formally disclosed last week as CVE-2021-23406.

First things first: are you personally at risk? This vulnerability seriously affects you if:

You depend on Pac-Resolver before v5.0.0 (even transitively) in a Node.js application
And, you do one of the below:
- Explicitly use PAC files for proxy configuration.
- Read & use the operating system proxy configuration in Node.js, on systems with WPAD enabled.
- Use proxy configuration (env vars, config files, remote config endpoints, command-line arguments) from any other source that you wouldn't 100% trust to freely run code on your computer.

In any of those cases, an attacker (by configuring a malicious PAC URL, intercepting PAC file requests with a malicious file, or using WPAD) can remotely run arbitrary code on your computer any time you send an HTTP request using this proxy configuration.

If you're in this situation, you need to update (to Pac-Resolver v5 and/or Proxy-Agent v5) right now.
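As a rough way to apply the version condition above, here's a minimal sketch of the check. The function name is my own invention for illustration, and it only looks at the major version, so treat it as a starting point rather than a full semver comparison:

```javascript
// Hypothetical helper (not part of any package): returns true if a
// Pac-Resolver version string is older than v5.0.0, the fixed release.
function isVulnerablePacResolverVersion(version) {
    // Accept both '4.2.0' and 'v4.2.0'
    const major = Number.parseInt(version.replace(/^v/, '').split('.')[0], 10);
    if (Number.isNaN(major)) {
        throw new Error(`Unrecognised version: ${version}`);
    }
    return major < 5;
}

console.log(isVulnerablePacResolverVersion('4.2.0')); // true
console.log(isVulnerablePacResolverVersion('5.0.0')); // false
```

To find the versions to feed into a check like this, running npm ls pac-resolver in your project will list every installed copy, including transitive ones.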

If not, you're probably not at any immediate risk (but it's a good idea to update anyway). For now, settle in and let's talk about why this matters, how this works, and how it can be exploited.

What's a PAC file?

PAC stands for "Proxy Auto-Config". A PAC file is a script written in JavaScript that tells an HTTP client which proxy to use for a given hostname, using dynamic logic to do so.

This is a system originally designed as part of Netscape Navigator 2.0 in 1996 (!) but it's still in widespread use today.

An example PAC file might look like this:

function FindProxyForURL(url, host) {
    // Send all *.example.com requests directly with no proxy:
    if (dnsDomainIs(host, '.example.com')) {
        return 'DIRECT';
    }

    // Send every other request via this proxy:
    return 'PROXY proxy.example.com:8080';
}

It defines a function in JavaScript, which can be used to find the right proxy for a URL.
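The string that function returns is what the HTTP client acts on. PAC results can also list several options separated by semicolons, to be tried in order. Here's a small sketch of how a client might parse that result. The parsePacResult name and the returned object shape are my own assumptions for illustration, not any particular library's API:

```javascript
// Hypothetical helper: parse a PAC result string such as
// 'PROXY proxy.example.com:8080; DIRECT' into an ordered list of options.
function parsePacResult(result) {
    return result
        .split(';')
        .map((entry) => entry.trim())
        .filter((entry) => entry.length > 0)
        .map((entry) => {
            // Each entry is a type, optionally followed by host:port
            const [type, address] = entry.split(/\s+/);
            return address === undefined
                ? { type: type.toUpperCase() }
                : { type: type.toUpperCase(), address };
        });
}

console.log(parsePacResult('PROXY proxy.example.com:8080; DIRECT'));
```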

This is designed to be run in a sandbox, accessing only a few specific useful methods (like host regex matching) that are required. Still, by using those methods these scripts can become very complicated - MDN has some good docs on the features available if you're interested.

PAC files provide a way to distribute complex proxy rules, as a single file that maps a variety of URLs to different proxies. They're widely used in enterprise environments, and so often need to be supported in any software that might run in an enterprise environment.

How exactly is this distributed though, you ask? Usually from a local network server, over plain-text HTTP (it's a local address, so there's often no certs available). Distributing it from a remote server, rather than locally configuring the file, is useful & very common as it allows network administrators to change it quickly & easily, safe in the knowledge that clients will always have the most up to date version.

In fact this is so common that there's a standard for automatically discovering the PAC file URL when connecting to a network: WPAD (Web Proxy Auto-Discovery Protocol). Your local network can give you a PAC file URL via DNS or DHCP when you connect to the network, and many systems (including Windows, by default) will automatically download & use this file as the system's proxy configuration.
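To make the DNS side of that concrete: a WPAD client looks for a host called wpad in its own domain and then in each parent domain, and requests /wpad.dat from it. The sketch below generates those candidate URLs. The function name is hypothetical, and real implementations differ in how far up the domain tree they walk, so this is a simplification:

```javascript
// Hypothetical helper: list the PAC URLs that a DNS-based WPAD lookup
// might try for a machine with the given fully qualified hostname.
function wpadDnsCandidates(hostname) {
    const labels = hostname.split('.').slice(1); // drop the machine's own name
    const urls = [];
    // Stop while at least two labels remain, so we never query a bare TLD
    while (labels.length >= 2) {
        urls.push(`http://wpad.${labels.join('.')}/wpad.dat`);
        labels.shift();
    }
    return urls;
}

console.log(wpadDnsCandidates('laptop.dept.corp.example.com'));
```

Note that every one of those URLs is plain HTTP, and whoever answers first on the network gets to supply the script.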

It's a JavaScript file you have to execute to connect to the internet, which is loaded remotely, often insecurely and/or from a location that can be silently decided by your local network. 1996 was truly a simpler time. What could go wrong?

What's the vulnerability?

Pac-Proxy-Agent attempts to provide support for PAC files specifically for Node.js - a noble goal. It doesn't automatically provide WPAD support (fortunately, given this vulnerability) although only because the PR was never completed. WPAD can easily be supported manually though, or even implicitly by reading the PAC URL from an OS that autodetects it using WPAD (i.e. Windows).

Pac-Proxy-Agent is instantiated with a PAC file URL, such as pac+http://config.org.local/proxy.pac. It retrieves the PAC file from that URL, and then acts as a Node.js HTTP agent (middleware for outgoing requests) which runs that PAC file for every outgoing URL before sending the request onwards upstream according to the PAC file's result.

So far so good. This is how PAC files are designed to work, and some implementation of this is necessary to support the many enterprise environments that use them.

This then is used in Proxy-Agent, which takes arbitrary proxy URLs and maps them to the appropriate agents. This is very convenient if you need to support a variety of system configurations! Read the system config, pass it to Proxy-Agent, and use the resulting agent for all outgoing requests. If you pass a pac+... URL to Proxy-Agent, it'll instantiate a Pac-Proxy-Agent for you from that URL, immediately giving you an agent you can use to make HTTP requests via the proxy.
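Conceptually, that mapping is a switch on the proxy URL's protocol. The sketch below is not Proxy-Agent's real code (the agentKindFor function and its return values are made up for illustration), but it shows why a single config string is enough to opt you into running a PAC script:

```javascript
// Hypothetical illustration: decide which kind of agent a proxy URL needs,
// based purely on its protocol.
function agentKindFor(proxyUrl) {
    const protocol = new URL(proxyUrl).protocol.replace(/:$/, '');
    if (protocol.startsWith('pac+')) return 'pac'; // fetches & runs a PAC script
    if (protocol.startsWith('socks')) return 'socks';
    if (protocol === 'http' || protocol === 'https') return protocol;
    throw new Error(`Unsupported proxy protocol: ${protocol}`);
}

console.log(agentKindFor('pac+http://config.org.local/proxy.pac')); // 'pac'
console.log(agentKindFor('http://proxy.example.com:8080')); // 'http'
```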

Unfortunately however, Pac-Proxy-Agent doesn't sandbox PAC scripts correctly. Internally, it uses two modules (Pac-Resolver and Degenerator) from the same author to build the PAC function. Degenerator is designed to transform arbitrary code, and returns a sort-of sandboxed function, using Node.js's 'vm' module, that's then executed by Pac-Resolver.
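To give a feel for the general pattern, here's a heavily simplified sketch of building a PAC function with the 'vm' module. This is not the actual Pac-Resolver or Degenerator code: compilePac is a name I've made up, and dnsDomainIs here is a crude stand-in for the real PAC helper. It only illustrates the approach:

```javascript
const vm = require('vm');

// Simplified stand-in for the real PAC helper of the same name
function dnsDomainIs(host, domain) {
    return host.endsWith(domain);
}

// UNSAFE for untrusted scripts: vm contexts are not a security boundary.
// Hypothetical helper: evaluate a PAC script and return a lookup function.
function compilePac(pacSource) {
    const context = vm.createContext({ dnsDomainIs });
    // The final expression makes the script evaluate to the PAC function
    const findProxy = vm.runInContext(`${pacSource}\n;FindProxyForURL;`, context);
    return (url) => findProxy(url, new URL(url).hostname);
}

const pacSource = `
function FindProxyForURL(url, host) {
    if (dnsDomainIs(host, '.example.com')) {
        return 'DIRECT';
    }
    return 'PROXY proxy.example.com:8080';
}`;

const findProxyFor = compilePac(pacSource);
console.log(findProxyFor('http://www.example.com/')); // DIRECT
console.log(findProxyFor('http://other.test/')); // PROXY proxy.example.com:8080
```

With a well-behaved PAC file like this one, everything works as intended. The trouble starts when the script isn't well-behaved.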

VM's documentation starts with:

The vm module is not a security mechanism. Do not use it to run untrusted code.

Uh oh.

This is an easy mistake to make - it's small text (frankly, it should be the headline on that page and next to every method) and MongoDB did the exact same thing too in 2019, with even worse consequences.

Unfortunately though this creates a big problem. While VM does try to create an isolated environment in a separate context, there's a long list of easy ways to access the original context and break out of the sandbox entirely (we'll take a look at an example in a minute, but for now just trust me), allowing code inside the 'sandbox' to basically do anything it likes on your system.

If you accept and use an untrusted PAC file, this is Very Bad. Every time you make a request using the PAC file, it can run arbitrary code and do anything on your system. If it's malicious, you're in big trouble.

How might you end up using a malicious PAC file? Let me count the ways:

- You read your proxy configuration from a config file, API endpoint, command line argument or environment variable and somebody manages to add their malicious PAC file's URL there.
- You load a trusted PAC URL insecurely, and somebody else on your network changes its contents in transit.
- You securely use a trusted PAC URL, but somebody successfully attacks the PAC file host and changes the file.
- WPAD is enabled on your system (as it is by default on Windows), somebody on your local network abuses it to configure your system with their PAC file URL, and you use that system configuration (Node doesn't use system proxy config by default, but many implementations will do so explicitly).
- You take proxy configuration in any other happy-go-lucky way, under the reasonable misapprehension that doing so can only risk exposing insecure traffic that you explicitly send via the proxy, and that that's acceptable for your case.

In practice, this requires either an attacker on your local network, a specific vulnerable configuration, or some second vulnerability that allows an attacker to set your config values.

If you end up in any of those situations though, it's game over, and it's easier than it sounds - anybody using a Node.js CLI tool designed to support enterprise proxies in a coffee shop, hotel or airport is potentially vulnerable, for example.

How could this be exploited?

To exploit this, the attacker needs to somehow provide a malicious PAC file (see above for ways this could happen), with contents that look something like this:

// Here's the real PAC function:
function FindProxyForURL(url, host) {
    return "DIRECT";
}

// And here's some bonus arbitrary code:
const f = this.constructor.constructor(`
    // Here, we're running outside the sandbox!
    console.log('Read system env vars:', process.env);
    console.log('!!! PAC file is running arbitrary code !!!');
    process.exit(1); // Kill the HTTP client process remotely
    // ...steal data, break things, etc etc etc
`);
f();

That's it - this is all that's required to break out of the VM module sandbox. If you can make a vulnerable target use this PAC file as their proxy configuration, then you can run arbitrary code on their machine.

The example here will log env vars to the console in the client application and then shut it down, but of course it could silently send them elsewhere instead, write to files on the machine, attack other devices on the network, change application behaviour to attack clients, start mining crypto, etc etc.

This is a well-known attack against the VM module, and it works because Node doesn't fully isolate the context of the 'sandbox', since it's not really trying to provide serious isolation. In line 7 above, 'this' refers to the context passed to vm.runInContext to create the sandbox, which comes from an object parameter in the external Node.js environment. We can follow that chain (the object's constructor, and then that constructor's own constructor) to get a function constructor for the external runtime environment, and from there we can instantiate a function (f) outside the sandbox from a string. Then we just run it, and the code we provided runs without any of the sandbox's constraints.
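If you'd like to see that chain in isolation, here's a minimal and harmless reproduction using only Node's built-in 'vm' module, with no PAC file involved. The variable names are mine, and instead of doing anything destructive it just checks that the sandboxed code got hold of the host's real process object:

```javascript
const vm = require('vm');

// Inside the sandbox, `process` isn't directly visible...
const visibleInside = vm.runInNewContext('typeof process', {});
console.log(visibleInside); // 'undefined'

// ...but the sandbox object was created outside, so its constructor chain
// leads to the outer Function constructor, which builds functions that run
// in the outer context, where `process` does exist.
const escapedProcess = vm.runInNewContext(
    'this.constructor.constructor("return process")()',
    {}
);
console.log(escapedProcess === process); // true
```

Once the sandboxed code holds the real process object, everything else in the exploit above follows.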

What's the fix?

This is now fixed in Pac-Resolver v5.0.0, Pac-Proxy-Agent v5.0.0, and Proxy-Agent v5.0.0. The fix is simple: use a real sandbox instead of the VM built-in module.

In this case, this was done using the VM2 npm module, which provides a similar API while being explicitly designed to run untrusted code, and hardened to block sandbox escapes like this. Switching to VM2, the above exploit code prints:

process is not defined

It's hard to guarantee that it's impossible to escape VM2, like any sandbox, but it's widely used for this exact purpose. There were no known ways to escape VM2 at the time of writing, and using a sandbox that's designed for running untrusted code is a dramatic improvement on the flimsy (by design) isolation provided by the VM module.

Using VM2 also makes it likely that any future sandbox escapes will be far more complicated and will be quickly dealt with when they appear, requiring just an update to VM2 to remain secure, and creating a difficult & moving target for any attacker.

Wrapping up

If you depend on Pac-Resolver, and there's any way you might be using PAC files in your proxy configuration: update to Pac-Resolver v5+ now.

For everybody else, I hope this was an interesting walk into some of the dangerous eccentricities of proxy configuration! Hopefully you're safe from the worst of the risk here, but you should probably update when you get a minute anyway.

What about the future? Is this going to happen again?

Yes, unfortunately, it definitely is. Even ignoring malicious supply chain attacks, there will be plenty of insecure code we're unaware of on NPM today, and on every other community package platform.

There are tactical mitigations that can be made (I'd be fully onboard with deprecating Node's VM module entirely for example, since it's a massive footgun, and better sandboxing primitives generally everywhere would really help - Deno being a good start) but the best thing you can do is keep an eye on published vulnerabilities which could affect you and ensure you quickly handle them.

While right now in Node-land that's quite a painful process (npm audit is noisy to the point of uselessness) there is very promising progress there that could really help to find the signal in the noise and make this easier to do. Here's hoping that comes soon, and package managers elsewhere follow suit!

This is also a great example of the value of code reviewing your dependencies! In many cases, especially for large applications with complex dependencies, that's not possible for all dependencies, but at least reviewing your most sensitive dependencies (like automatic proxy configuration) will help you catch these unintentional bugs and help get them fixed for everybody.

I do think this is not an example of the classic "developers nowadays use too many dependencies" or "NPM's many pointless tiny dependencies create risks" arguments that tend to get bandied around. There are real problems there, but if you need to support enterprise environments then writing your own proxy autoconfiguration code from scratch is a bad alternative, from both a productivity and a correctness standpoint, and building support for niche features like PAC files & WPAD into Node.js itself doesn't seem good either. Dependencies need management, but they're not always bad.

Lastly, I should give a big thanks to Snyk.io & their team for their help resolving this. I disclosed the issue to them directly (they provide support for reporting community package vulnerabilities for many languages here) since I couldn't see a clear way to get in touch with the maintainer privately, and they made contact, handled all the communication, coordinated the fix, and managed the disclosure itself. As a developer, rather than a full-time security researcher, it's definitely useful having somebody familiar with best practices to ensure vulnerabilities are resolved safely & responsibly.

Have any thoughts, questions or feedback? Get in touch on Twitter or send me a message and let me know.

Want to inspect, mock & debug Node.js HTTPS for yourself, for debugging & testing, with no vulnerabilities required? Try out HTTP Toolkit.

Published 3 years ago by Tim Perry