
Research

Deploying MPC in the Wild

Antoine Urban
March 13, 2026

Lessons learned from deploying large-scale MPC systems for secure key management, digital asset custody, and blockchain infrastructure.

On March 7, I gave a keynote at Real World MPC (RWMPC) in Taipei, a workshop co-located with Real World Crypto 2026. My talk, “MPC in the Wild: Architecture and Benchmarks for Key Management Networks,” focused on a question that matters far beyond research papers: what does it actually take to deploy MPC for digital asset custody in production?

The answer is not just “better cryptography.” It’s better system design. The keynote makes a simple but important point. In digital asset infrastructure, security does not depend only on the protocol. It also depends on how the protocol is deployed, how the network is coordinated, how availability is handled, and how performance is engineered at scale. This is where theory meets reality.

The custody problem starts with the key

Digital assets are bearer assets. Control comes from control of a private key. If you hold the key, you control the funds. If the key is stolen or lost, the funds are gone. 

That creates a severe architectural constraint. A single private key, whether it is stored on disk, in a server, or inside an HSM, is still a single point of failure. One breach, one rogue insider, one operational mistake, one deletion event, and the entire system fails. This is the core tension of custody: to use the key, it must be available; to protect it, it must be hard to access.

Threshold cryptography exists to break that tradeoff. Instead of one machine holding one key, a threshold signature scheme splits signing power across multiple parties. No single machine ever holds the full secret. Signing is done through MPC, with multiple shares participating in the computation. That removes the single point of failure and creates a more robust security model for custody.
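The share-splitting idea can be illustrated with a minimal Shamir secret-sharing sketch in Python. This is a toy over a small prime field, not a production scheme: a real threshold signer shares the key over the curve's scalar field and never reconstructs it, signing through MPC instead. `reconstruct` is shown here only to demonstrate the threshold property.

```python
import secrets

# Toy prime field for illustration; real schemes work over the
# curve's scalar field (e.g. the order of secp256k1).
P = 2**127 - 1

def split(secret, t, n):
    """Split `secret` into n shares; any t of them can reconstruct it."""
    # Random degree-(t-1) polynomial with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(P) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation at x
            y = (y * x + c) % P
        shares.append((x, y))
    return shares

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the secret from t shares.

    Shown only to demonstrate the threshold property; an MPC signer
    never performs this step, because the full key must never exist.
    """
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % P
                den = (den * (xi - xj)) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret
```

With a 2-of-3 split, any two shares recover the secret, while any single share reveals nothing about it.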

But that is only the beginning. Once you decide to use threshold signing, the next question is how to deploy it.

Deployment models: shared custody vs. key management network

Two ways to deploy threshold signing

Once the cryptographic primitive of threshold signing is selected, the immediate architectural question becomes how to deploy it. During the keynote, I outlined two divergent operational models:

  1. The first is shared custody. Commonly seen in self-custodial wallet applications, this model typically splits key shares between the end user's personal device and a service provider. It can work, but it comes with serious drawbacks. Users still carry security responsibility. Recovery is harder. Delegation is harder. Availability can also become fragile, because if one required machine is offline, signing stops.
  2. The second model—and the one we strongly advocated for—is a “key management network” (KMN). A KMN is a network of servers that holds shares of many keys on behalf of many users. The goal is to separate asset ownership from security infrastructure. Users still own the assets, but the operational burden of protecting signing material shifts to systems that are built for monitoring, logging, recovery, redundancy, and controlled execution. That changes the shape of the problem. Instead of asking every user to behave like a security engineer, you design a network whose job is to protect keys at scale.

Why a key management network makes sense

At Dfns, we framed the KMN around two main design principles.

  1. The first is to trust the client as little as possible. In practice, this means reducing the amount of sensitive responsibility placed on end users and their devices.
  2. The second is to trust the custodian as little as possible. This is an equally important point. A serious custody architecture should not simply replace one concentrated trust assumption with another. Isolating key shares within secure hardware enclaves and federating the network across independent organizations ensures that no single operator or entity exerts unilateral control over the secret material.

This leads to three major advantages.

  1. The first is usability. A KMN lets users authenticate with familiar methods such as WebAuthn, biometrics, or device-based credentials, without forcing them to manage a separate key share for every wallet. That creates a much cleaner user experience and simplifies recovery.
  2. The second is security. Professional infrastructure can support continuous monitoring, anomaly detection, audits, certifications, disaster recovery, and formal incident response. More importantly, recovery in this model does not mean handing private keys back to users. It means restoring access through fresh authentication credentials while the signing system remains protected.
  3. The third is availability and efficiency. A dedicated network can be built for uptime, redundancy, and operational scale in a way that ad hoc user-managed schemes cannot.

This is one of the most practical messages from the talk: custody systems should be designed around how they will behave under pressure, not just how elegant they look in a protocol diagram.

Network topology and the coordinated star architecture

The hidden deployment question: mesh or star?

One of the main points I wanted to make in this keynote was about network topology. Academic MPC papers often assume a mesh network, where every participant talks directly to every other participant over secure channels. While natural on paper, deploying a full mesh in production introduces a lot of friction.

Why? Because every signer becomes a server. That means every signer needs to expose ports to the Internet, accept inbound traffic, track IP addresses, manage peer discovery, update firewall rules, and maintain many active connections. This expands the attack surface and turns the deployment model into an operational headache. Machines holding sensitive material should not have to look like public network endpoints.

To resolve this, Dfns takes a different approach and employs a coordinator-based star topology. Instead of every signer talking to every other signer directly, each signer opens an outbound connection to a central coordinator, which relays protocol messages. This changes the deployment model in a big way.

No signer needs inbound ports. Signers can stay behind strict NATs and firewalls. Discovery becomes simpler. The number of connections grows linearly rather than quadratically. And operational control becomes much easier. This is one of those ideas that sounds obvious once stated, but it is exactly the kind of system decision that determines whether a protocol can survive contact with the real world.
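The connection-count difference behind that last point is easy to quantify. A two-function sketch (illustrative arithmetic, not Dfns code):

```python
def mesh_links(n):
    # Full mesh: every signer maintains a channel to every other signer.
    return n * (n - 1) // 2

def star_links(n):
    # Star: each signer opens one outbound connection to the coordinator.
    return n

# At small sizes the difference is modest; at network scale it is not.
for n in (3, 10, 50):
    print(f"{n} signers: mesh={mesh_links(n)} links, star={star_links(n)} links")
```

At 50 signers, a mesh needs 1,225 bidirectional links (each requiring inbound exposure on both ends), while a star needs 50 outbound-only connections.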

Does a coordinator reintroduce trust?

At first glance, introducing a central coordinator seemingly conflicts with the decentralized premise of MPC. If the goal is to eliminate central trust, why introduce a central routing component?

The distinction is critical: the coordinator is central for liveness, not for security. The liveness dependency is easily mitigated by running multiple coordinators in parallel. For security, we operate under a severe adversarial model in which the coordinator and the underlying network may be fully controlled by an attacker. In that scenario, the attacker may drop, delay, or reorder packets. Despite all these powers, however, the attacker remains cryptographically restricted to denial of service. They cannot read messages, modify execution states, replay traffic across sessions, or break signature integrity.

That is enforced through cryptographic bindings in the transport layer itself. Messages are encrypted end-to-end using keys established between the actual protocol parties. The coordinator only sees ciphertext. Messages are signed together with monotonic counters so reordering or missing messages are immediately detected. They are also tied to a unique execution context so cross-session replay attacks fail.
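These receive-side checks can be sketched as follows. The `Channel` class, the wire layout, and the use of HMAC in place of real authenticated encryption are all illustrative assumptions; the actual transport format is not described in the talk. The point is the structure: the relay only ever sees opaque bytes, counters must advance strictly, and every message is bound to one session.

```python
import hmac
import hashlib

class Channel:
    """Sketch of receive-side transport checks for a relayed MPC message.

    Assumes the payload is already end-to-end encrypted, so the
    coordinator only relays opaque ciphertext. HMAC here stands in
    for the real authenticated-encryption construction.
    """

    def __init__(self, key: bytes, session_id: bytes):
        assert len(session_id) == 16  # fixed-size session binding (assumed)
        self.key = key
        self.session_id = session_id
        self.expected = 0  # next monotonic counter we will accept

    def seal(self, counter: int, ciphertext: bytes) -> bytes:
        # Bind session id + counter + payload under one authentication tag.
        data = self.session_id + counter.to_bytes(8, "big") + ciphertext
        return data + hmac.new(self.key, data, hashlib.sha256).digest()

    def open(self, message: bytes) -> bytes:
        data, tag = message[:-32], message[-32:]
        if not hmac.compare_digest(tag, hmac.new(self.key, data, hashlib.sha256).digest()):
            raise ValueError("forged or modified message")
        session, counter = data[:16], int.from_bytes(data[16:24], "big")
        if session != self.session_id:
            raise ValueError("cross-session replay")
        if counter != self.expected:
            raise ValueError("dropped, reordered, or replayed message")
        self.expected += 1
        return data[24:]
```

A replayed or tampered message fails these checks immediately, which is exactly why a hostile coordinator is limited to denial of service.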

This is a crucial deployment lesson. In real systems, the transport layer is not just plumbing. It is part of the security model. That idea sits at the heart of the talk: if you want MPC to work in production, you cannot treat networking as an afterthought.

Managing complexity: the coordinator as a control plane

Beyond security, there is a broader engineering case for this architecture. Distributed, multi-round cryptographic protocols are notoriously difficult to orchestrate. A coordinator provides the system with a practical control plane. It centralizes execution state tracking, facilitates agreement on participant sets, and drives message flow in a consistent manner. This centralization of logic does not make the coordinator trusted for secrets; rather, it makes it an indispensable tool for managing system complexity.

This matters because production cryptography is not only about correctness. It is also about observability, auditability, and recoverable operations. A coordinator-based system is easier to reason about, easier to monitor, and easier to maintain than a fully distributed swarm of signers each trying to reconstruct the same global state from partial views. Accepting this tradeoff—centralizing orchestration while mathematically decentralizing trust—is precisely what separates a deployable architecture from a purely academic one.

Protocol engineering and empirical benchmarks 

To achieve true operational scale, elegant network architecture must be paired with rigorous protocol engineering. The final part of my talk moved from architectural design to empirical benchmarks, translating theoretical protocol choices into measurable performance.

Performance is not optional

Focusing specifically on threshold ECDSA—a standard for digital asset systems—we confront a fundamental challenge: threshold signing is inherently interactive and expensive. Executing the full cryptographic workload synchronously for every signature request severely degrades user experience and prevents the system from scaling. 

The practical solution is precomputation, or presigning. The idea is to move the expensive work into an offline phase, before the message to be signed is even known. Then, when a live signing request arrives, the online phase becomes much lighter. This is not just a nice optimization. It is the difference between a system that feels usable and one that does not.

In the benchmarks I shared, the contrast was dramatic. A heavy offline phase might take around 1,500 milliseconds, while the online signing step can drop to microseconds once presignatures are available. That changes how MPC can be used in production. Instead of making users wait for cryptography, the system prepares for future demand.
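The offline/online split can be sketched with plain single-party ECDSA over secp256k1. In a threshold protocol both the nonce `k` and the key `x` are shared via MPC, but the phase structure is the same: all the curve arithmetic happens before the message exists, and finishing a signature costs only a couple of modular multiplications.

```python
import secrets
import hashlib

# Standard secp256k1 parameters (field prime, group order, base point).
p = 2**256 - 2**32 - 977
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(P1, P2):
    """Affine point addition on y^2 = x^3 + 7; None is the identity."""
    if P1 is None: return P2
    if P2 is None: return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P1 == P2:
        lam = (3 * x1 * x1) * pow(2 * y1, p - 2, p) % p
    else:
        lam = (y2 - y1) * pow(x2 - x1, p - 2, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def ec_mul(k, P1):
    """Double-and-add scalar multiplication."""
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P1)
        P1 = ec_add(P1, P1)
        k >>= 1
    return R

def presign():
    """Offline phase: all expensive curve work, message not yet known."""
    k = secrets.randbelow(n - 1) + 1
    r = ec_mul(k, G)[0] % n
    return (pow(k, n - 2, n), r)  # store (k^-1 mod n, r)

def finish(presig, x, msg):
    """Online phase: two modular multiplications once the message arrives."""
    k_inv, r = presig
    h = int.from_bytes(hashlib.sha256(msg).digest(), "big") % n
    return (r, k_inv * (h + r * x) % n)
```

The resulting `(r, s)` pair verifies under standard ECDSA verification, and the online `finish` step involves no curve operations at all.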

Honest majority changes provisioning

While precomputation is mandatory, the underlying protocol determines how efficiently these presignatures can be managed. In dishonest-majority protocols such as CGGMP21/24 and DKLS19/23, presignatures are tied to a specific key. This creates a severe provisioning problem: the network must predict which key will transact next and generate presignatures accordingly. In high-concurrency environments with many users and wallets, this is highly inefficient; if transaction patterns shift, unused presignatures are wasted.

In contrast, an honest-majority protocol like KU25 offers a crucial operational advantage: key-independent presignatures. In this model, presignatures are completely decoupled from specific keys. They can be pooled globally and consumed by any key when needed. In practice, this means the network does not need to guess which customer will transact next; it can maintain a large shared pool and draw from it dynamically. That makes provisioning simpler, reduces waste, and aligns much better with the bursty, uneven demand patterns seen in real systems.

This is a good example of how protocol properties translate directly into operational architecture. Key independence is not just an elegant theoretical feature. It changes how capacity planning works.
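The provisioning difference can be sketched as two pool managers. The class names and interfaces here are hypothetical, chosen only to contrast key-bound presignatures with a key-independent shared pool.

```python
from collections import deque, defaultdict

class KeyBoundPool:
    """Dishonest-majority style: each presignature works for one key only,
    so the network must predict per-key demand in advance."""

    def __init__(self):
        self.pools = defaultdict(deque)

    def provision(self, key_id, presigs):
        self.pools[key_id].extend(presigs)

    def take(self, key_id):
        pool = self.pools[key_id]
        # A miss means paying the full offline cost at signing time.
        return pool.popleft() if pool else None

class SharedPool:
    """Honest-majority / KU25 style: presignatures are key-independent,
    so one global pool serves every key."""

    def __init__(self):
        self.pool = deque()

    def provision(self, presigs):
        self.pool.extend(presigs)

    def take(self, key_id):
        # Any key draws from the same pool; no demand prediction needed.
        return self.pool.popleft() if self.pool else None
```

With a key-bound pool, presignatures provisioned for one wallet are useless to another; with a shared pool, capacity planning reduces to keeping one global buffer topped up.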

Shared state and batching create real gains

The logic can be pushed further. If a KMN has a shared state and a global view of presignature generation, it can fully leverage the batching capabilities of KU25 to reduce the amortized cost. The empirical gains are substantial: where legacy generation of a presignature using CGGMP21 takes roughly 1,500 milliseconds, an optimized KU25 flow brings single-presignature generation down to roughly 680 milliseconds. The real leap, however, comes from batching: for a batch of 10,000 presignatures, the amortized cost drops to about 1.3 milliseconds each. That is the kind of number that changes what is possible.
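Plugging the benchmark figures above into the amortization arithmetic makes the gap concrete:

```python
single_ms = 680.0         # one KU25 presignature generated on its own
batch_amortized_ms = 1.3  # per presignature within a 10,000-item batch
batch_size = 10_000

one_by_one = single_ms * batch_size / 1000       # total seconds, unbatched
batched = batch_amortized_ms * batch_size / 1000  # total seconds, batched

print(f"one-by-one: {one_by_one:.0f}s, batched: {batched:.0f}s, "
      f"speedup: {one_by_one / batched:.0f}x")
```

Generating 10,000 presignatures one at a time would take nearly two hours; the batched flow produces the same pool in about 13 seconds.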

Once again, the message is larger than the benchmark itself. Performance in MPC is not only about faster local cryptography. It is about architecture, pooling, scheduling, and exploiting shared state across the network.

The benchmark takeaway and what it means for Dfns

Ultimately, empirical testing on AWS EC2 c5.large instances validates that these architectural and protocol choices yield order-of-magnitude throughput gains, with KU25 significantly outperforming legacy models, especially when presignatures are available.

The exact numbers matter, but the bigger takeaway matters more. When you combine (1) a coordinator-based deployment model, (2) offline presigning, (3) key-independent provisioning, and (4) large-scale batching, you don’t get a marginal improvement—you get a different class of system. That is what “deploying MPC in the wild” really means. It means making design choices that let secure cryptography survive the demands of uptime, traffic spikes, failure recovery, and production latency.

This keynote was not just a generic overview of MPC. It was a clear articulation of the design philosophy behind how Dfns thinks about digital asset infrastructure. At Dfns, MPC is not treated as an isolated signing primitive. It is part of a larger system that includes transport security, coordination, authentication, observability, recovery, and operational scale. 

The broader lesson

MPC is maturing. The real frontier is no longer only about proving that threshold cryptography works. It is about proving that it works under real-world constraints.

Too much of the conversation around MPC still focuses on protocol families in isolation: who has fewer rounds, who has lighter assumptions, who is more elegant on paper. Those questions matter, but they are incomplete. In practice, the winning architecture is the one that can be deployed safely, operated predictably, and scaled without turning security into an operational liability. That means dealing with firewalls, service discovery, outages, throughput limits, user recovery, deployment boundaries, and messy operating environments. It means designing systems that assume components may fail or be hostile, while still preserving confidentiality and integrity. And it means accepting that the architecture around the protocol is often just as important as the protocol itself.

My keynote aimed to close the concrete gap between a protocol that exists on paper and a custody network that protects digital assets every day in production. Custody systems require network designs that minimize attack surfaces. They need transport layers with strict cryptographic guarantees. They need realistic provisioning strategies, alongside robust performance techniques like presigning and batching, that make MPC workable at production scale.

At RWMPC in Taipei, I gave a framework for thinking about the future of MPC in custody. The future will not belong to the systems with the cleanest diagrams alone. It will belong to the systems that combine strong cryptography with disciplined architecture and hard-earned operational realism.

That is what deploying MPC in the wild actually requires. And that is the standard Dfns is helping define.
