Developer Exchange Blog
Lately, we’ve been working on multiple projects involving interesting and difficult authorization challenges. By authorization, we mean the granting of permissions that control users’ access to data in complex systems. This article discusses, in a general way, some of the challenges enterprises face and proposes a direction, or architecture, for a comprehensive Enterprise Authorization Strategy.
We’ve done it for years, right? Everyone has a technique. Well, let’s just say that there are a lot of ways that systems are designed with regard to security. Lately, though, there has been increased scrutiny over these methods. From Sarbanes-Oxley and HIPAA concerns to just plain old good business practices and prudent concern over control, IT organizations are concerned about good practices that ensure that their data is properly protected. As our systems get more and more complex, the problem gets harder and harder to solve. What’s scary about this is that our industry has not done a stellar job of protecting data (e.g., credit card leaks, identity theft, data loss); so, the thought of things getting worse is a bit disturbing.
Before we get started, let’s cover a little background. There are several aspects to security. It roughly breaks into these distinct parts:
- Authentication
- Session Management
- Authorization
Authentication is concerned with the confirmation of identity. Before we can move the discussion to authorization, or permissions, the fundamental question that has to be answered first is “who is attempting to access something?” Authentication mechanisms fill this gap by requiring a user to identify themselves and then present some form of verification of their identity. There can be many approaches to authentication. The most common, of course, is a simple username and password. Other approaches require more complex credentials when confirming identity, like certificates, synchronized but changing numeric keys from a key generator, biometric verification, and so forth in addition to a password.
There is a lot to be said about authentication, and you can find plenty of discussion on this if you are interested. The main thing to frame in your mind here is that authentication is only a single part of security, and it should not be thought of as being implicitly bundled with authorization. It should be designed so that before someone gains access to any data, step one is to confirm that person’s identity using a method that is acceptable to the enterprise. The enterprise may have multiple ways that users are authenticated, requiring some users to identify themselves with more rigorous methods than others. The principal concern with regard to authentication is that a protective ring be placed around the enterprise such that access is only possible by authenticated users. Inside the ring, we need to know the identity of any user attempting access, and we trust that the user has been verified with authentication. By utilizing technical mechanisms like Identity Tokens, which can be provided by a Security Token Service (STS), there can be assurance that an asserted identity is trustworthy (i.e., that it could only have been generated by a user that successfully presented confidential credentials).
The point to all this is that with these mechanisms we can not only know the identity of an accessor, we can pass that identity around from application to subsystem along a service delegation chain, while retaining the trust that only the original user could have asserted and confirmed the identity. In other words, we're not just passing userids around, we're passing confirmed identity around.
Session management is concerned with maintaining an authenticated session over a period of multiple transactions. After all, it wouldn’t be too convenient to have to re-authenticate every time you click a button or submit a form. Session management is, again, a separate concern in the area of security, and an enterprise may have various techniques that it uses to maintain sessions. When a shared session management strategy is in place with a way to perform identity transference, one can gain the benefits of single sign-on, or SSO. This can be a great benefit to enterprise users. This, too, has nothing to do with authorization; its only purpose is to make sure we know who is accessing the system on a subsequent transaction. So, Session Management is now defined and separate from what this article is about.
Now, with Authentication and Session Management out of the way and assumed as an underlying part of security infrastructure, we can move on to the principal concern of this article, which is authorization or permissions management. Another term used for this is Security Policy, and sometimes it’s simply referred to as Policy. Authorization control depends on a strong foundation of authentication and session management strategy; so, there is no intention to dismiss those subjects. The idea is to separate these concepts mentally and literally, in terms of implementation.
The focus of the remainder of the article is a simple question, which is “is the person that is attempting to access or manipulate certain data permitted to do so?” The knowns we have to work with are the following:
- we know the identity of who is accessing the system, and
- we know the data they are attempting to access, presumably through request parameters or other implicit request characteristics.
While this problem may initially appear to be trivial, it is not.
Simple Authorization Techniques for Application Data
A common starting point for many systems is to add “application authorization.” Sometimes it starts with something like a simple role based table that has the following:
User - The identity of the user, generally a username.
Application - This may be implicit if permissions are in the application database.
Role - The "rolled-up" permissions of the user. Sometimes this is as simple as an "Access-Level" role.
Permission - Permission values often start very logical, with values of "readonly", "edit", "admin", and so forth - not necessarily a simple yes or no to the permission question.
Generally, at this point the enterprise has a strong application orientation and there's likely a database that goes one-for-one with a fronting "application". Usually, in the beginning, application security may be adequately served with one user having one role. The application code protects data at Security Decision Points by asking a question about role. “Does the current user have the role and permission necessary?” At this simple level of authorization, we aren’t really talking about specific data.
The next thing that generally happens is that the one role isn’t quite good enough; so, more roles are added. This might happen initially to add some scoping of data to the roles (e.g., a role of “Customer-Data-Access-Level”, “Human-Resources-Data-Access-Level”, and so forth). And then security starts to get confusing because role permutations quickly become hard to manage. So, after that the next general step in the evolution is to move to users having multiple roles and permissions.
Things are still pretty simple because the code that answers the question, “Does the current user have the role and permission necessary?”, is only a little bit more complicated in that it has to decide among a collection of roles whether the user has the one needed. After all, we’re still just in one application, and we’re going after data identified in large subsets. The security decision points are still pretty broad and easy to deal with.
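To make the multi-role check concrete, here is a minimal sketch in Python. The names RoleGrant, PERMISSION_RANK and has_permission are hypothetical, and the ordering of permission levels (admin implies edit, edit implies readonly) is an assumption made only for illustration; a real application might treat these values quite differently.

```python
from dataclasses import dataclass

# Assumed ordering of permission levels, weakest to strongest (illustrative only).
PERMISSION_RANK = {"readonly": 1, "edit": 2, "admin": 3}


@dataclass(frozen=True)
class RoleGrant:
    user: str
    role: str
    permission: str


def has_permission(grants, user, role, needed):
    """Return True if `user` holds `role` at the `needed` level or stronger."""
    needed_rank = PERMISSION_RANK[needed]
    return any(
        g.user == user
        and g.role == role
        and PERMISSION_RANK[g.permission] >= needed_rank
        for g in grants
    )


# A user may now hold several roles, each with its own permission.
grants = [
    RoleGrant("alice", "Customer-Data-Access-Level", "edit"),
    RoleGrant("alice", "Human-Resources-Data-Access-Level", "readonly"),
    RoleGrant("bob", "Customer-Data-Access-Level", "readonly"),
]

# The Security Decision Point asks its question about role and permission.
print(has_permission(grants, "alice", "Customer-Data-Access-Level", "readonly"))  # True
print(has_permission(grants, "bob", "Customer-Data-Access-Level", "edit"))        # False
```

Note that nothing here refers to specific data yet; the decision is still made purely on role and permission.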
Then comes the next challenge. There is a lot of data that needs protection. And there is a need to be able to control which users can access which data. The role strategy alone isn’t going to be sufficient. This necessitates an associative relationship between the users and the data being protected. At this point, we need to know how to identify the data. As a simplification for this discussion, let’s just say that each collection of data we want to protect has some Data Identifier, the “DataID”. This could be a scalar identity (like a number) or something more complicated. However, the concept is simplified here as a DataID. This ultimately gives way to authorization control that is something like this:
User - The identity of the user, generally a username.
Application - This may be implicit if permissions are in the application database.
Role - The "rolled-up" permissions of the user (e.g., "Access-Level").
Permission - The permission for the user and role (e.g., "readonly", "edit", "admin").
DataID - The application Data Identifier related to this user, role and permission (e.g., CustomerAccount1).
At this point, honestly, things are still pretty simple. The maintenance of the authorization data may be a bit challenging, but the core question the security check needs to perform is still pretty simple. As the code is considering a security decision as to whether the user should be allowed to access certain data, it just checks the authorization setup to see whether the user in question has access to the data in question based on the role that protects the data. It’s still application focused and still reasonably simple.
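As a rough illustration, the check with a DataID might look like the following Python sketch. The Grant record and the can_access function are hypothetical names, and the in-memory list stands in for whatever authorization table the application actually uses.

```python
from typing import NamedTuple


class Grant(NamedTuple):
    user: str
    role: str
    permission: str
    data_id: str


def can_access(grants, user, data_id, role, acceptable_permissions):
    """True if `user` holds `role` on `data_id` with an acceptable permission."""
    return any(
        g.user == user
        and g.data_id == data_id
        and g.role == role
        and g.permission in acceptable_permissions
        for g in grants
    )


grants = [
    Grant("alice", "Access-Level", "edit", "CustomerAccount1"),
    Grant("bob", "Access-Level", "readonly", "CustomerAccount1"),
]

# May the user edit this particular customer account?
print(can_access(grants, "alice", "CustomerAccount1", "Access-Level", {"edit", "admin"}))  # True
print(can_access(grants, "bob", "CustomerAccount1", "Access-Level", {"edit", "admin"}))    # False
```

The question has only gained one more key (the DataID), which is why the logic stays simple even though maintaining the grants may not be.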
Enterprise Data makes Authorization Complex
It’s somewhere beyond the above needs that the train begins to come off the rails. One issue that begins to surface is that there is a network of data spread across the enterprise with complex relationships that needs authorization control. Generally, the fundamental issue comes down to a need for multiple scopes of control with regard to the data network. For example, with hierarchies of data, access control may need to be provided, or at least managed, at high levels in the hierarchy, while permissions must also be granted or denied at much lower levels in the hierarchy. At this point, the DataIDs are usually a problem because, more often than not, there is no master data management plan and no comprehensive strategy with regard to identifying data in the enterprise. A final challenge here can be that there is no good way to arbitrarily identify collections of data for the purpose of access control.
Recent trends in enterprise architecture make this challenge even worse as enterprise security needs go beyond application orientation. Lately, this issue has become an increasing concern because many IT organizations have begun to move toward service oriented architectures (SOA). In an SOA, a Service Inventory is created, which is much like a distributed class library. The Services in the inventory work together in Service Compositions to serve, protect and manage enterprise data, usually at the business concept level. The Services encapsulate various data and functionality and provide access points that are carefully created to hide the underlying implementations. A consumer of a given service may be a user-facing application or it may be another Service in the Service Inventory along a delegation chain in a Service Composition. The consequence of this is that data security becomes even more challenging. It’s not enough to focus authorization at the application level because the underlying Services need to provide security on the data being served and manipulated within them. If security is not provided within each Service, it’s a free-for-all, with security responsibility distributed across any applications and Services along a Service Composition chain to an underlying Service. Distributed security responsibility is not good from an enterprise view, since it becomes difficult to know definitively whether a given user can or cannot perform a certain function on specific data. To confirm the precision of security settings with access control delegated outward to each application, one would have to know exactly how each application protects the data. With no common strategy, this can be prohibitively complex.
Enterprise Authorization Foundations
This sets the stage for a look at the comprehensive authorization challenge enterprises face. Here, we will build the foundation for an Enterprise Authorization Strategy and then pull it all together.
The first challenge is to get a handle on the users. Consider this illustration.
The graphic illustrates that there are two sets of users to consider. The first set includes the people that are a part of your organization. The second set includes people that are not an integral part of your organization. From an enterprise viewpoint, all of these people can have a role to play in interacting with your enterprise data. So, there must be a strategy to not only authenticate these people, but to somehow bring order to managing access control for them.
As one major part of an Enterprise Authorization Strategy, there is a need to have a way to assemble groups of people together. It is tempting to think of this in an organizational way. For example, the various organizational departments like customer service, accounting, and so forth would likely fit exactly with the org chart. However, the grouping problem is more arbitrary than that. Sometimes people are assembled into cross-functional task forces, for example. In the end, what is needed is a way to assemble groups for the purpose of conveniently and accurately performing authorization control. To avoid assumptions, it is proposed that these groups be thought of as Teams. Each team has a leader and each team has team members. The leadership aspect can be important, since many authorization problems have special characteristics that must be reserved for a select subset of the team. Don’t assume that Team Members will come exclusively from either internal or external users. There are many cases where trusted external users have roles to play in teams that might normally be considered internal. For example, vendors like auditors, consultants, and others can blur the lines of an internal versus external user. Make sure the Team assembly tool is capable of creating arbitrary user associations. Finally, it’s most likely obvious, but users can be associated with any number of teams.
Exactly how these teams are assembled is not the focus of this article. However, an Enterprise Authorization Strategy should include a tool to manage these collections. A good solution would allow not only arbitrary Team Member assignments but also rule based team membership, driven by data from human resources. For example, allowing for a team to be formed by department, job function or other properties related to human resource data would mean that the team membership could be dynamic and automatically synchronized with HR changes. When someone is added to or removed from a department with a certain role, they would automatically get or lose Team Membership in appropriate groups. This can be essential as the organization gets large, for without it, team maintenance is a daunting task, and inappropriate associations are almost certain. Here we can’t forget external users. They are likely to be served by the enterprise alongside internal users. With an SOA, this is even more likely as customer facing, internet based applications integrate with the Service Inventory.
However it is attained, one core part of your Enterprise Authorization Strategy should be to have an Authorization Manager that provides functionality to create and manage Authorization Teams. This tool would most likely read from a user directory and write the teams back to the directory. The techniques and possible sources of data for this are many. Here’s an illustration of the simplest case.
The tool itself may be quite functional or it may be very simple. For a large enterprise, there is no doubt that the tool would need to be capable of rule based team membership with an expressive rules language for inclusion and exclusion filters based on data maintained by human resources.
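To give a feel for rule based team membership, here is a small Python sketch. Everything in it is hypothetical: the Team class, the idea of expressing inclusion and exclusion filters as plain predicate functions over an HR record, and the shape of the HR data itself. A real rules language would be far more expressive. The sketch also assumes that explicitly assigned members (such as an external auditor) and the team leader are always members, regardless of the rules.

```python
from dataclasses import dataclass, field


@dataclass
class Team:
    name: str
    leader: str
    include: list = field(default_factory=list)           # predicates over an HR record
    exclude: list = field(default_factory=list)           # predicates over an HR record
    explicit_members: set = field(default_factory=set)    # arbitrary assignments

    def members(self, hr_records):
        """Compute current membership from HR data plus explicit assignments."""
        result = set(self.explicit_members)
        result.add(self.leader)
        for user, record in hr_records.items():
            included = any(rule(record) for rule in self.include)
            excluded = any(rule(record) for rule in self.exclude)
            if included and not excluded:
                result.add(user)
        return result


hr_records = {
    "alice": {"department": "accounting", "job": "analyst"},
    "bob": {"department": "accounting", "job": "contractor"},
    "carol": {"department": "customer service", "job": "agent"},
}

accounting = Team(
    name="Accounting",
    leader="alice",
    include=[lambda r: r["department"] == "accounting"],
    exclude=[lambda r: r["job"] == "contractor"],
    explicit_members={"external-auditor"},
)

print(sorted(accounting.members(hr_records)))  # ['alice', 'external-auditor']

# An HR change is picked up the next time membership is computed.
hr_records["carol"]["department"] = "accounting"
print(sorted(accounting.members(hr_records)))  # ['alice', 'carol', 'external-auditor']
```

Because membership is computed from the HR data rather than stored by hand, a department change adds or removes the person without anyone editing the team.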
As a foundation, consider that the Authorization Manager allows authorized users (recursion here) the ability to create teams, manipulate team membership, and delete teams. The tool will require access to the directory (or directories) of users, and it will need a way to save the teams, most likely back into a directory. Note that at this point we have not discussed permission setting; we are simply laying the foundation from which authorization control will be possible. The result of the Authorization Manager is that the enterprise user directory is extended with teams. With a directory of individual users and teams, the foundation is set to allow authorization configuration in desirable user collections, but also in individual cases, where necessary.
Data Identification and Cataloging
At this point we need to turn our attention to the data for which we want to provide access control.
The point of this article is to consider a solution for enterprise authorization in the general case. And, in the general case, enterprise data is complex. It is held in multiple databases, in a variety of formats, held together by unpredictable and sometimes very loose, difficult to navigate coupling, and is generally a pretty big mess. From a comprehensive authorization viewpoint, there needs to be a strategy to simplify the situation.
Above, we discussed the notion of a Data Identifier. The idea is that every piece of data we need to protect needs to have a key, or identifier, that we can use for authorization control. While this seems simple enough, in practice it is not. Data keys can be complicated. Sometimes data keys require complex and structured compositions made up of some sort of address path. Here, we’re talking about some sort of locational and identifying path that leads to the data. An example might be: Product Inventory -> Store X -> Shoes -> SKU1234567. Here, the point is that to control access, we might need information about where the data is located, what server or services provide the data, and finally other keys to locating the data within the server or service. The core issue is that the mechanism to do so is largely undefined, and we need to bring order and methodology to it.
An additional need is to have a way to identify collections, or sets, of data. In the general case, we need to consider that data may be assembled into hierarchies. From the data identification example above, it is typically desirable to be able to perform authorization control at any node along the data identifier’s path. So, if we wanted to identify all of the Shoe inventory at Store X, the identifier from the above example would conceptually be Product Inventory -> Store X -> Shoes.
Here’s where we need an Enterprise Data Catalog. The catalog needs to include each data item that we want to protect (i.e., the catalog only needs to identify the smallest chunks of data that need protection). For example, if you only want to protect access to customer accounts, you would only need to identify the individual customer account, not everything that might exist about a customer account. However, if your authorization control goes beyond this, you may need a way to identify components of a customer account, such as the account’s contact information, billing history, account balance, and so on.
Along with the catalog of the individual data we want to protect, we may need to support an ability to define multiple paths to the data. The reasons for this can be authorization related. For example, a straight identifier hierarchy, path-wise, might not provide the collections that need to be controlled. So, it may be necessary to allow ‘data routes’ that are logical. In our simple example above, we may need to have a path to the shoe inventory by locale. Imagine that we could also get to the shoe inventory because we have data paths like: Alabama -> 35242 -> Store X or Alabama -> 35226 -> Store Y or potentially a direct path to products by Vendor. These data paths could serve authorization control at higher levels because we can now identify a collection of stores by geographic location, for example. Since we know the path to the data from the store nodes, authorization control would be possible along these logical paths, like “All Alabama stores” or “All stores in a certain zip code”.
This gives way to a data catalog that looks something like the model below from an entity relationship viewpoint.
With a simple data structure like this, data paths can be modeled, cataloged, data can be identified, and metadata can help identify or classify the data. Addresses can be visual and presented in an explorable hierarchy, and data can be further identified with rules based on metadata comparisons. Finally, there can be multiple paths to a given node of identified data, and each node along the path can have its own metadata. A root AddressNode would have a null ParentNodeID and a terminal AddressNode would have a null ChildNodeID. NodeIDs are ensured enterprise unique by the catalog service.
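For readers who prefer code to entity relationship models, here is a simplified Python sketch of such a catalog. It is not a faithful rendering of the model described above: instead of AddressNode records with ParentNodeID and ChildNodeID columns, it keeps parent and child links in dictionaries, and the DataCatalog class and its method names are hypothetical. It assumes the links never form a cycle.

```python
from collections import defaultdict


class DataCatalog:
    def __init__(self):
        self.names = {}
        self.metadata = {}
        self.parents = defaultdict(set)    # child NodeID -> parent NodeIDs
        self.children = defaultdict(set)   # parent NodeID -> child NodeIDs
        self._next_id = 1

    def add_node(self, name, **metadata):
        """Register a node; the catalog generates the unique NodeID."""
        node_id = self._next_id
        self._next_id += 1
        self.names[node_id] = name
        self.metadata[node_id] = metadata
        return node_id

    def link(self, parent_id, child_id):
        self.parents[child_id].add(parent_id)
        self.children[parent_id].add(child_id)

    def paths_to(self, node_id):
        """All root-to-node paths, as lists of node names."""
        parent_ids = self.parents.get(node_id)
        if not parent_ids:
            return [[self.names[node_id]]]
        paths = []
        for parent_id in sorted(parent_ids):
            for path in self.paths_to(parent_id):
                paths.append(path + [self.names[node_id]])
        return paths

    def descendants(self, node_id):
        """Every NodeID reachable below `node_id`."""
        found, stack = set(), [node_id]
        while stack:
            for child in self.children.get(stack.pop(), ()):
                if child not in found:
                    found.add(child)
                    stack.append(child)
        return found


catalog = DataCatalog()
inventory = catalog.add_node("Product Inventory")
alabama = catalog.add_node("Alabama", kind="state")
zip_35242 = catalog.add_node("35242", kind="zip")
store_x = catalog.add_node("Store X", kind="store")
shoes = catalog.add_node("Shoes")
sku = catalog.add_node("SKU1234567")

catalog.link(inventory, store_x)   # the identifier hierarchy
catalog.link(alabama, zip_35242)   # a logical 'data route' by locale
catalog.link(zip_35242, store_x)
catalog.link(store_x, shoes)
catalog.link(shoes, sku)

for path in catalog.paths_to(sku):
    print(" -> ".join(path))
# Product Inventory -> Store X -> Shoes -> SKU1234567
# Alabama -> 35242 -> Store X -> Shoes -> SKU1234567
```

Calling catalog.descendants(alabama) yields every node under "All Alabama stores", which is the kind of collection that authorization control at a higher level would target.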
Another important concern for an enterprise would be “how will this catalog be built and maintained?” The answer is that catalog content maintenance needs to be automatic. Enterprise services need to be capable of publishing their catalog of data to the Enterprise Data Catalog. Catalog publishing would be done either on demand, by the Services directly publishing their updates to the Enterprise Data Catalog Service, or by the Enterprise Data Catalog Service pulling the updates periodically via a synchronization mechanism. In concept, somewhere near the authoritative source of the data, the catalog concept would need to be integrated, and the sourcing Service would need to support adding its data addresses to the catalog. The NodeIDs would be generated and served by the catalog, and any needed mapping for the service would be done in the metadata. Please excuse the modeling simplification, but the various types of values are shown in the illustration just to acknowledge that the metadata attribute values may be of various types.
Security Policy Management
At this point we have two of the three major parts we need to provide an Enterprise Authorization Strategy. The final, missing component is to add Security Policy, or authorization control, as a binding between our users or teams and nodes in the Enterprise Data Catalog.
Since this is a concept article, let’s consider this an extension of the Authorization Manager. The idea here is that with a given user or team open in the Authorization Manager, an authorized security administrator would have features available to view, add, update or delete the Security Policy attached to each user or team. The Authorization Manager should likewise be capable of showing exactly which users have been granted access for which Decision Points at any given DataNode.
In the simplest case, the Security Policy would define a security “checkpoint”, which may be thought of as a Security Decision Point. The Security Decision Point corresponds to a point in a Secured Module, where a decision needs to be made regarding whether a user is to be allowed or denied some access or action. In the Security Policy, a permission would be set for a decision point, user and a data node from the data catalog. It breaks out something like this:
Simple Security Policy
Identity - The identity of the user, generally a username.
DataNode - The specific DataNode being protected.
Decision Point - The logical Policy Decision Point. This might be a high-level concept like "Access" or it might be more specific and low level like "View" or "Edit" depending on the security decision to be made.
Permission - Here a simple and clear value of either Allowed or Denied is suggested.
This data would allow a Policy Decision Point to ask the Security Service whether a given user is to be allowed to pass the decision point or not for a given piece of data, based on a NodeID for the data in question. The Enterprise Security Policy Server would examine the Security Policy data store and resolve whether the user in question is Allowed or Denied at the Decision Point by examining the AddressNode at or nearest the terminal DataNode in the Security Policy data with a concrete answer to resolve the decision. Note that the Security Policy may or may not be placed directly on the DataNode in question. It very well may be ‘upstream’ from that node. Since our design provides for multiple paths to each DataNode, a resolution scheme would need to be defined for conflicting permissions. Most likely, the safest solution would be restrictive and result in a Decision Point denial if two or more Policies were found at exactly the same distance from the DataNode, where one or more Denies the Decision Point.
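One way the resolution scheme could work is sketched below in Python. The resolve function, the dictionary shapes, and the use of node names as NodeIDs are all hypothetical. The sketch walks upstream from the DataNode one level at a time across every path, stops at the nearest level where any policy exists, and denies if any policy at that level says Denied. Two further assumptions are baked in: distance is measured along the shortest path to each upstream node, and a user with no policy anywhere upstream is Denied by default.

```python
def resolve(policies, parents, identity, node_id, decision_point):
    """Return 'Allowed' or 'Denied' using the nearest upstream policy."""
    frontier = {node_id}
    visited = set()
    while frontier:
        # Collect every answer found at this distance from the DataNode.
        answers = {
            policies[(identity, n, decision_point)]
            for n in frontier
            if (identity, n, decision_point) in policies
        }
        if answers:
            # Restrictive tie-break: any Denied at this distance wins.
            return "Denied" if "Denied" in answers else "Allowed"
        visited |= frontier
        frontier = {p for n in frontier for p in parents.get(n, ())} - visited
    return "Denied"  # assumed default when no policy applies


# child -> parents; Store X is reachable by two paths.
parents = {
    "SKU1234567": ["Shoes"],
    "Shoes": ["Store X"],
    "Store X": ["Product Inventory", "35242"],
    "35242": ["Alabama"],
}

policies = {
    ("alice", "Store X", "View"): "Allowed",
    ("bob", "Product Inventory", "View"): "Allowed",
    ("bob", "35242", "View"): "Denied",
    ("carol", "Alabama", "View"): "Allowed",
}

print(resolve(policies, parents, "alice", "SKU1234567", "View"))  # Allowed
print(resolve(policies, parents, "bob", "SKU1234567", "View"))    # Denied
print(resolve(policies, parents, "carol", "SKU1234567", "View"))  # Allowed
print(resolve(policies, parents, "dave", "SKU1234567", "View"))   # Denied
```

Bob is the interesting case: he has an Allowed policy on one path and a Denied policy on another, both three steps upstream of the SKU, so the restrictive rule denies him.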
Complex Security Policy
With an understanding of the Simple Security Policy, one can now envision the concept of a more complex, Rule Based Security Policy. A Rule Based Security Policy would not only allow for Team identities, it would also allow for metadata matching criteria with the identified DataNode. However, regardless of the complexity of a Rule Based Security Policy, it would simply resolve, in concept, to a set of distinct Simple Security Policies that represent both the User Identities targeted and the DataNodes targeted.
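The idea that a rule resolves to a set of Simple Security Policies can be sketched in a few lines of Python. The rule format here (a team name, a dictionary of metadata values to match, a decision point and a permission) is invented for illustration, as are the expand_rule function and the sample data.

```python
from itertools import product


def expand_rule(rule, teams, node_metadata):
    """Expand one rule based policy into simple (identity, node, point, permission) tuples."""
    users = teams[rule["team"]]
    nodes = [
        node_id
        for node_id, meta in node_metadata.items()
        if all(meta.get(key) == value for key, value in rule["match"].items())
    ]
    return {
        (user, node_id, rule["decision_point"], rule["permission"])
        for user, node_id in product(users, nodes)
    }


teams = {"Alabama Managers": {"alice", "bob"}}

node_metadata = {
    "Store X": {"state": "AL", "kind": "store"},
    "Store Y": {"state": "AL", "kind": "store"},
    "Store Z": {"state": "GA", "kind": "store"},
}

rule = {
    "team": "Alabama Managers",
    "match": {"state": "AL", "kind": "store"},
    "decision_point": "View",
    "permission": "Allowed",
}

for simple_policy in sorted(expand_rule(rule, teams, node_metadata)):
    print(simple_policy)
# ('alice', 'Store X', 'View', 'Allowed')
# ('alice', 'Store Y', 'View', 'Allowed')
# ('bob', 'Store X', 'View', 'Allowed')
# ('bob', 'Store Y', 'View', 'Allowed')
```

One rule covering two users and two matching nodes becomes four simple policies, and everything downstream only ever has to understand the simple form.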
Secured Module Decision Points
At this point, we’ve got quite a bit of fundamental security infrastructure in place. Making a decision regarding access to a single DataNode is not overly difficult. The decision for a given DataNode could be made by a fast, highly available Security Service, synchronously at the point the decision is needed, only adding milliseconds to the transaction.
Unfortunately, this isn’t good enough. Systems need to make broad decisions quickly. After all, it is not practical in a query, for example, to ask the Security Service, synchronously, “what are all the DataNodes that this user can access?” Common enterprise data will be very large, and from a practicality standpoint, filters and restrictions must be possible at the data layer with joins or other fundamental data relationships that make it possible to technically produce results in a reasonable amount of time. Even at 100 microseconds per decision, asking a Security Service to render a decision on just a million DataNodes would take 100 seconds, or almost two minutes; at 100 milliseconds per decision, the same request would take more than a day. Obviously, this will not work.
To deal with this there must be some way for the Security Service to resolve the DataNodes for users prior to an incoming transaction. The most likely mechanism would be for a given Secured Module with specific Security Decision Points to query the DataNodes for the Decision Points of interest in the Secured Module. The logical query would be something like this:
For each decision point in (decision point list of interest),
List all users for all DataNodes that are Allowed to pass the Decision Point.
Of course, the resultant data could be massive. Further, it’s likely that the Secured Module is only concerned with a small subset of the data catalog. With a good strategy on Decision Point naming, it may be that a query of this type would automatically reduce to the minimum set of interest by the Secured Module. However, it may be necessary in the above query for the Secured Module to be able to filter the DataNodes based on metadata matching.
With the resultant data, however, the Secured Module could use the returned data in association with the underlying enterprise data to perform security checks in-process, rather than synchronously going to the Security Service. And the decisions could be made very, very fast, since each check becomes a local lookup with no network round trip.
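Here is a rough Python sketch of what the Secured Module might do with the returned data. The function names, the tuple layout of the policies, and the row format are all hypothetical. It also assumes the Security Service has already flattened any upstream policies down to the individual DataNodes, so that the local check is a plain set lookup.

```python
from collections import defaultdict


def build_local_index(simple_policies, decision_points):
    """Index Allowed DataNodes by decision point and user for in-process checks."""
    index = defaultdict(lambda: defaultdict(set))
    for user, node_id, decision_point, permission in simple_policies:
        if decision_point in decision_points and permission == "Allowed":
            index[decision_point][user].add(node_id)
    return index


def filter_rows(rows, index, user, decision_point):
    """Keep only the rows whose DataNode the user may pass the decision point for."""
    allowed = index.get(decision_point, {}).get(user, set())
    return [row for row in rows if row["node_id"] in allowed]


# Data as it might be returned by the Security Service for this module.
simple_policies = [
    ("alice", 101, "View", "Allowed"),
    ("alice", 102, "View", "Allowed"),
    ("alice", 103, "View", "Denied"),
    ("bob", 101, "View", "Allowed"),
    ("alice", 101, "Edit", "Allowed"),
]

index = build_local_index(simple_policies, decision_points={"View"})

rows = [
    {"node_id": 101, "account": "CustomerAccount1"},
    {"node_id": 102, "account": "CustomerAccount2"},
    {"node_id": 103, "account": "CustomerAccount3"},
]

print([r["account"] for r in filter_rows(rows, index, "alice", "View")])
# ['CustomerAccount1', 'CustomerAccount2']
print([r["account"] for r in filter_rows(rows, index, "bob", "View")])
# ['CustomerAccount1']
```

In a real system the same allowed set would more likely be loaded into a table and joined against the enterprise data, but the principle is the same: the decision is made next to the data.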
The only thing this leaves is the need for a strategy to keep the replicated security policy up-to-date and synchronized with the Security Service. A side benefit to the replication of the Security Policy is the elimination of the need, transactionally, for the Security Service to be up for service to be delivered by the secured module. Of course, the negative to this is that if connectivity between the Security Service and the Secured Module is disrupted, the replicated security perspective could be out-of-date. However, a strategy could be employed to ensure that this is not allowed to be the case for an extensive period of time without a systems alert or disabling of access.
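One possible shape for such a strategy is sketched below in Python. The ReplicatedPolicy class is hypothetical, as is the choice to fail closed (deny everything) once the local copy is older than a configured maximum age; an enterprise might instead choose to raise an alert and keep serving. The clock is injectable only so the behavior can be demonstrated without waiting.

```python
import time


class ReplicatedPolicy:
    def __init__(self, max_age_seconds, clock=time.monotonic):
        self.max_age_seconds = max_age_seconds
        self._clock = clock
        self._allowed = set()
        self._synced_at = None  # never synchronized yet

    def refresh(self, allowed_triples):
        """Replace the local copy with fresh data from the Security Service."""
        self._allowed = set(allowed_triples)
        self._synced_at = self._clock()

    def is_stale(self):
        if self._synced_at is None:
            return True
        return self._clock() - self._synced_at > self.max_age_seconds

    def is_allowed(self, user, node_id, decision_point):
        if self.is_stale():
            return False  # fail closed rather than trust out-of-date policy
        return (user, node_id, decision_point) in self._allowed


# Demonstration with a fake clock.
now = [0.0]
policy = ReplicatedPolicy(max_age_seconds=300, clock=lambda: now[0])

print(policy.is_allowed("alice", 101, "View"))  # False (never synchronized)

policy.refresh({("alice", 101, "View")})
print(policy.is_allowed("alice", 101, "View"))  # True

now[0] = 301.0  # connectivity lost for longer than the allowed window
print(policy.is_allowed("alice", 101, "View"))  # False
```

The maximum age becomes the knob that trades availability of the Secured Module against how long a revoked permission could linger.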
A comprehensive Enterprise Authorization Strategy requires security infrastructure to bring order to users and complex networks of data. The essential elements include an Authorization Manager that provides features to allow a security administrator to create, manage and delete Teams of users. An Enterprise Data Catalog is needed that will bring addressability to enterprise data and allow for arbitrary collections for authorization purposes. Features are needed in the enterprise to define the Security Policy that will govern the Security Decision Points that arise at various points in the enterprise, whether at the edge of the enterprise in a user-facing application or embedded in a Service at the heart of the enterprise. Features to manage security policy can be a natural extension of the Authorization Manager.
Perhaps the biggest challenge proposed herein is the Enterprise Data Catalog. The reason this is a challenge is that it requires all applications or Services within the realm of the Enterprise Authorization Strategy to be capable of publishing their data catalog and to be capable of performing Decision Point inquiries of the Security Service in the context of a published DataNode. Assuming that systems infrastructure in a given enterprise is not complete chaos, it is likely that an Enterprise Data Catalog adapter could be created to ease this burden. In the worst case, significant work may be required to integrate with the Enterprise Data Catalog and Security Service.
With an Enterprise Authorization Strategy, a comprehensive approach to access control is possible. From a compliance viewpoint, it’s hard to imagine how any organization can be assured of appropriate data protection without one. An Enterprise Authorization Strategy like the one proposed here would allow a security administrator to definitively know whether any given user has access to any given unit of secured data.