In parts one and two of this series, we explored how Policy as Code (PaC) can help secure the development lifecycle and protect the supply chain. To close out the series, let’s cover a critical final component: what happens when your policies need to make split-second decisions in a production environment. Many organizations struggle to get their OPA policies beyond the development sandbox. The difference isn't just writing good Rego; it's architecting for security while maintaining performance and reliability under the realities of production traffic.
The Runtime Reality Check
Runtime security policy enforcement looks easy until seemingly innocent policy code kills your application's performance or blocks legitimate client traffic. The problem is often not the policy logic itself but how it is structured. For low-latency, high-throughput use cases (think microservice API authorization), policy evaluation has to return a decision in under a millisecond. That's not much room for error, especially since OPA evaluation is only one slice of the overall latency budget. Think of authorization evaluation as a librarian checking whether someone has permission to access a specific book. If the librarian has to search through every single book in the library on every request (non-linear policies), it's going to take a while. But with an efficient card catalog and proper indexing (linear-fragment policies), the answer comes back almost instantly.
Linear fragment policies in OPA are streamlined rules designed for near-constant-time evaluation, making them ideal for high-throughput scenarios like API gateways where speed is critical. Non-linear policies are more flexible and expressive: they can handle complex logic, but evaluation slows as rule count and input complexity grow. With techniques like partial evaluation, OPA can sometimes reshape non-linear policies into faster linear fragments without losing accuracy.
The linear fragment of the language is the subset where evaluation amounts to a single pass over the policy with no search required. This lets OPA make decisions in near-constant time regardless of how many rules you have, as long as you structure them properly. In other words, write your policies to optimize iteration and search; for example, use objects instead of arrays when the elements have a unique identifier.
```rego
# Inefficient: iterate over an array of user objects
users := [
    {"id": "u123", "name": "Alice"},
    {"id": "u456", "name": "Bob"},
    {"id": "u789", "name": "Carol"},
]

some i
users[i].id == "u789"

# More efficient: constant-time lookup into an object keyed by user ID
users_by_id := {
    "u123": {"name": "Alice"},
    "u456": {"name": "Bob"},
    "u789": {"name": "Carol"},
}

users_by_id["u789"].name
```
Architectural Patterns to Consider
Your policy is ready for production, and you’re faced with several architectural choices, each with different performance implications. Here are three patterns to consider in enterprise environments.
The Sidecar Pattern: Lightning Fast but Resource Hungry
The sidecar pattern is popular in high-scale organizations, and for good reason: authorization requests stay local to the host and require no network hops, so performance is excellent for microservices. You're essentially putting a tiny policy engine right next to your application, typically behind a local proxy.
The proxy intercepts every request and checks it against your OPA policies before it reaches your application. Since everything is localhost communication, latency stays minimal. One concern is serial call latency accumulation, where even local calls to OPA stack up in serial succession and build latency. Mitigations include using partial evaluation to precompute parts of the policy, avoiding unnecessary chaining of authorization calls, and batching or parallelizing where possible.
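To make the sidecar call concrete, here is a minimal application-side sketch against OPA's REST Data API. The package path (`httpapi/authz`), the input shape, and the 5 ms timeout are illustrative assumptions for this example, not fixed conventions:

```python
import json
from urllib import request

# Hypothetical sidecar address and policy package path -- adjust to your deployment.
OPA_URL = "http://localhost:8181/v1/data/httpapi/authz/allow"

def build_query(user, method, path):
    # OPA's Data API expects the query input wrapped under an "input" key.
    return json.dumps({"input": {"user": user, "method": method, "path": path}}).encode()

def is_allowed(response_body):
    # An undefined decision comes back as {} -- treat it as an explicit deny.
    return json.loads(response_body).get("result", False) is True

def check(user, method, path):
    req = request.Request(OPA_URL, data=build_query(user, method, path),
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req, timeout=0.005) as resp:  # tight budget: this is a localhost hop
        return is_allowed(resp.read())
```

Keeping the deny-on-undefined logic (`is_allowed`) separate from the transport call makes the failure behavior easy to unit test without a running OPA.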
The Gateway Pattern: Centralized but Potentially Bottlenecked
The gateway architecture places OPA instances at the gateway of your cluster (as the name suggests). Authorization and policy decisions happen in a single, centralized location, and downstream traffic, post-gateway, is presumed authorized.
This pattern works well when you need consistent policy enforcement across multiple services and you're okay trading network latency for centralized management. It can be effective for coarse-grained access control, where you're deciding whether a user can access a service at all rather than making fine-grained decisions about specific operations.
This works well in systems where an organization needs to enforce rate limiting, basic authentication, and tenant-isolation policies. Since these decisions happen once per request rather than multiple times, the small added latency of the network hop to a centralized OPA cluster can be acceptable.
The Middleware Pattern: Framework Native Integration
For applications that aren't ready for the complexity of sidecars or proxy architectures, integrating OPA directly into the application itself can work best. Most frameworks now have OPA integration libraries that make this straightforward.
The advantage here is architectural simplicity, since policy enforcement happens with minimal infrastructure changes. The trade-off is more code for the development team to own, and any OPA failure must be handled gracefully in application code.
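As a sketch of what framework-native enforcement can look like, here is a hypothetical, framework-agnostic decorator. The `opa_check` callable stands in for whatever client call your integration library provides, and the fail-closed default is an assumption you should revisit per endpoint:

```python
import functools

def require_authz(opa_check, fail_open=False):
    """Wrap a request handler with an authorization check.

    `opa_check` is any callable that asks OPA for a decision and may raise
    on transport errors; on failure we fall back to the configured default.
    """
    def wrap(handler):
        @functools.wraps(handler)
        def inner(req):
            try:
                allowed = opa_check(req)
            except Exception:
                # OPA unreachable: apply the fail-closed (or fail-open) default.
                allowed = fail_open
            if not allowed:
                return {"status": 403, "body": "forbidden"}
            return handler(req)
        return inner
    return wrap
```

Because the failure default lives in one place, switching an endpoint from fail-secure to fail-safe is a one-argument change rather than scattered exception handling.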
Performance Optimizations
Any time you add an additional security layer to an application, questions about performance inevitably come up. Architecture choices matter here, but so does the complexity of the policies themselves.
Memory Management and Bundle Optimization
One of the biggest performance killers is loading massive data sets into OPA that don't actually need to be there. OPA loads everything into memory, so every piece of data in your bundle directly impacts memory usage. OPA's disk storage feature allows policy and data to be stored on disk, but for most use cases, keeping the right amount of well-structured data in memory is more effective than trying to optimize storage. To be fair, this is mostly a matter of simple housekeeping and data hygiene.
Rule Indexing and Query Structure
OPA has indexing algorithms that can keep policy evaluation time constant regardless of data size, but only if you structure your policies to take advantage of them. The more effective the indexing, the fewer rules need to be evaluated. For example, when OPA evaluates a query against rules that test input.user.role == "editor", it can jump straight to the rules matching that role rather than evaluating every single rule in your policy.
Monitoring and Troubleshooting in Production
Runtime policy enforcement is invisible until it breaks, which is why monitoring is absolutely critical. OPA exposes a built-in HTTP endpoint at /metrics that can be used to collect performance metrics for all API calls.
Ideally your organization will track three categories of metrics related to the operation of OPA in production.
First is decision latency: end-to-end latency as measured from the end user's perspective; OPA evaluation time, the time taken to evaluate the policy itself; and the gRPC server handler time, the total time taken to prepare the input for the policy, evaluate the policy, and prepare the result.
Second is resource utilization: Memory usage is particularly important because OPA keeps everything in memory, and memory pressure can cause garbage collection pauses that spike latency unpredictably.
Third is policy health: Whether your policies are loading successfully, whether bundle updates are working, and whether the policies are making the decisions you expect them to make.
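Since /metrics serves the standard Prometheus text exposition format, a small parser is enough to spot-check values in a debugging session. This is a deliberately simplified sketch (one sample per line, comments ignored), and the metric names in the example are illustrative rather than OPA's exact names:

```python
def parse_metrics(text):
    """Parse Prometheus text-format output into {metric_with_labels: value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comment lines
        name, _, value = line.rpartition(" ")
        try:
            samples[name] = float(value)
        except ValueError:
            continue  # ignore malformed lines
    return samples

# Example scrape output (metric names illustrative, not OPA's exact names):
sample = """
# HELP http_request_duration_seconds request latency
http_request_duration_seconds{quantile="0.99"} 0.0009
go_memstats_alloc_bytes 52428800
"""
metrics = parse_metrics(sample)
```

In practice you would point a Prometheus server at the endpoint and alert on these series; a throwaway parser like this is only for quick inspection.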
Handling Failures and Insecure Policies
A common production issue is policies that work fine in testing but perform poorly under load. OPA's profiler (for example, via opa eval --profile) can identify the portions of a policy that would benefit most from optimization. Another common issue is policy evaluation errors that surface only under specific conditions; this is where decision logging helps. It captures every policy evaluation along with its input and result, which is invaluable when debugging authorization issues.
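As a sketch of how decision logs help in practice, the snippet below filters denied decisions out of a log stream. It assumes one JSON event per line with `decision_id`, `input`, and `result` fields, in the spirit of what OPA's decision logger emits; treat the exact field set as deployment-specific:

```python
import json

def denied_decisions(log_lines):
    """Yield (decision_id, input) for every logged decision that was a deny.

    Any result other than literal true (including an undefined decision)
    is treated as a deny for triage purposes.
    """
    for line in log_lines:
        event = json.loads(line)
        if event.get("result") is not True:
            yield event.get("decision_id"), event.get("input")

# Two example events, for illustration only.
log = [
    '{"decision_id": "a1", "path": "authz/allow", "input": {"user": "alice"}, "result": true}',
    '{"decision_id": "b2", "path": "authz/allow", "input": {"user": "mallory"}, "result": false}',
]
denials = list(denied_decisions(log))
```

Having the full input alongside each denial is what turns "user reports access denied" from guesswork into a replayable test case.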
From a security perspective, attackers abusing built-in functions like http.send or net.lookup_ip_addr to exfiltrate environment variables or credentials is a real threat. Your policies should focus on decision-making, not data fetching or external communication. Additionally, some security issues appear when your Rego policies exceed their intended scope and allow unintended functionality or network access. To combat these, build tests into your CI/CD pipeline that flag insecure patterns before they reach production; OPA's capabilities configuration can also disable risky built-ins outright.
Lastly, never expose OPA's API endpoints without proper authentication and authorization; doing so gives attackers a way to probe authorization scenarios against your policies. And for good measure: enable TLS for all communication, use signed bundles to ensure policy integrity, never run OPA as root, and implement proper network segmentation so OPA can only communicate with authorized services.
Production Ready Checklist
So now you’re ready to make policy decisions using OPA. Here is a practical checklist that improves the odds of a secure (and performant) production deployment.
Start with performance benchmarking using realistic traffic patterns, not just synthetic tests. Production environments can operate vastly differently from testing environments, and what looks achievable in test may be a stretch in production when you deploy with real API requests containing complex nested objects. In other words, test with production-like data complexity, not just production-like volume.
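A minimal harness for that kind of benchmark might look like the following sketch; the nested input shape and the stubbed decision function are placeholders for a call to your real policy decision point:

```python
import time

def benchmark(check, inputs, warmup=100):
    """Measure per-decision latency of `check` over realistic inputs
    and report p50/p99 in milliseconds."""
    for doc in inputs[:warmup]:  # warm caches before measuring
        check(doc)
    latencies = []
    for doc in inputs:
        start = time.perf_counter()
        check(doc)
        latencies.append((time.perf_counter() - start) * 1000.0)
    latencies.sort()
    pick = lambda q: latencies[min(len(latencies) - 1, int(q * len(latencies)))]
    return {"p50_ms": pick(0.50), "p99_ms": pick(0.99)}

# Production-like nested inputs and a stubbed decision function, for illustration.
inputs = [{"user": {"id": f"u{i}", "roles": ["editor"]},
           "resource": {"path": ["docs", str(i)], "labels": {"tenant": "t1"}}}
          for i in range(1000)]
stats = benchmark(lambda doc: doc["user"]["roles"] == ["editor"], inputs)
```

Reporting tail percentiles rather than averages matters here: authorization sits on every request path, so the p99 is what your users actually feel.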
Implement comprehensive monitoring before you go live, not after: monitor API call latencies over time, identify performance bottlenecks, set up alerts for abnormal response times, and analyze the impact of optimization efforts. The worst time to realize you don't have visibility into your authorization system is during an outage.
Plan your failure handling strategy carefully. Circuit breaker patterns are essential because OPA failures should not cascade into application failures. Your application needs to have a reasonable default behavior when it can't reach OPA for authorization decisions. Sometimes that means failing secure (denying access), sometimes it means failing safe (allowing access), and sometimes it means falling back to a simplified authorization model.
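A minimal circuit-breaker sketch around the authorization call could look like this; the failure threshold, reset window, and fail-secure default are illustrative knobs, not recommendations:

```python
import time

class AuthzCircuitBreaker:
    """Open the circuit after `threshold` consecutive failures and serve the
    configured default decision until `reset_after` seconds have passed."""

    def __init__(self, check, default=False, threshold=5, reset_after=30.0):
        self.check = check              # callable that queries OPA, may raise
        self.default = default          # fail-secure (False) or fail-safe (True)
        self.threshold = threshold
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def allow(self, request_input):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                return self.default     # circuit open: skip OPA entirely
            self.opened_at = None       # half-open: try OPA again
            self.failures = 0
        try:
            decision = self.check(request_input)
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()
            return self.default
        self.failures = 0
        return decision
```

The key property is that once the circuit opens, a struggling OPA instance stops receiving traffic entirely, which is what prevents an authorization outage from cascading into an application outage.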
Lastly, while this may sound boring, document your runbooks and train your team on debugging procedures. Authorization issues are often the hardest to troubleshoot because they involve the intersection of user identity, resource permissions, and business logic. Having clear procedures for common scenarios saves precious time during incidents.
The Path Forward
We've covered a lot of ground in this series, from basic PaC concepts through supply chain security and now runtime enforcement. The common thread is that PaC isn't just about writing policies, it's about building systems that can enforce those policies reliably, securely, and without impacting performance.
If you've been following along with this series and experimenting with PaC in your own environment, I'd love to hear about your experiences!
Also, you can find my book on building an application security program on Amazon or Manning