In EPiServer CMS 5 R2 there is a change in the GetPage() functionality which is marked as a breaking change. The change concerns security, specifically how security is checked when you call DataFactory.GetPage().
Up to now, which means all previous EPiServer versions, including the 4.x range, EPiServer has done a security check for you when GetPage is called. It means, if your code is calling GetPage(), and the currently logged in user is not allowed to see the page you’re “Get’ing” an AccessDeniedException is thrown.
The breaking change is that the access check has been removed from the internals of GetPage. The AccessDeniedException will never be thrown from the GetPage method, even though the current user is not allowed to see the page. The GetPage method will return the PageData to you, even if the requesting user is not allowed to see the page.
How Does This Affect You?
What does this mean for your code, and your sites? Well - it depends. Because we’re going from a more restricted scenario to a lesser one, your code will not break.
However, there is a risk that you’ll show more information than you’d like, which is called an Information Disclosure Vulnerability. How so? Even though the chances of this happening are small, that AccessDeniedException would prevent the data from being shown.
Here is an example:
You’ve got a template that has a right region for related information. From your code, you search or look up the information that is to be shown to the right, doing a GetPage() that shows the MainBody from a related page.
The Editor then adds a new page which is tagged so it will be among the related pages, but with sensitive information. This Editor – having received the proper training of course – knows that such information must be secured, and removes read for Everyone.
Pre R2 this would have lead to an exception, possibly taking down large parts of the site if this information was to be shown on many pages. In R2, this secure information will be shown to Everyone.
I’m assuming the code searching or looking up the related information must do so without security being applied. Like doing a GetChildren(), using an external indexer / search tool or getting raw page ID’s from somewhere.
You need to think carefully about your code, are you vulnerable? As I said, the chances are small, but you can’t ignore it.
Deliberate Security Circumvention
While the previous example might not apply to you, there is a chance this will. As there has never been a way for us to actually check if a given user is allowed to see a page until the PageData object has been loaded, people have relied on the exception to filter out pages. You’ll find code like this out there:
try
{
// Load page, will throw if not allowed to see
page = DataFactory.Instance.GetPage(pageref);
pagesColl.Add(page);
}
catch (PageNotFoundException notFoundEx)
{
// Page does not exist
continue;
}
catch (AccessDeniedException accessDeniedEx)
{
// User is not allowed to see page
continue;
}
This code will – pre R2 – only add pages the user is allowed to see, to the pagesColl PageDataCollection. In R2 the AccessDeniedException will not occur, so all pages will be added. If this is a security breach depends entirely on the use of the pagesColl. You might be showing just a list of links you say, but beware that a lot can be revealed in a page name (“Apple to release new MacBooks”, “Interest Increases 0.5%”, etc. You get the idea.)
The code above was actually taken from the popular MultiPage Property on EPiCode. Which means – even if you did not write vulnerable code, you might be affected anyway. There are probably other modules affected by this too.
What Can You Do?
For starters, search your code for AccessDeniedException, if you rely on it for anything, you need to change the code (which is highly probable if you find it.) From now on, you need to perform the access check yourself:
bool canRead = page.ACL.QueryDistinctAccess(AccessLevel.Read);
Not a big deal – when you know how to do it. The code above checks if the currently logged on user is allowed to Read the page.
Background
I talked to Magnus Stråle, the chief architect, and this is what he said:
The Short Story
“We had a series of requirements and changes leading up to this decision and the two primary reasons for making the change are consistency (the “Principle of least surprise”) and Page Providers.”
The Long Story
“The DataFactory class has been virtually unchanged since the first EPiServer 4 release and a few of the design decisions that we made then does not strike me as being the best choices looking at it with the luxury of hindsight. What stands out today is that security is so tightly coupled with page manipulation (completely breaks the “single responsibility” rule) and that we did not support paging (back then the sites were usually not that big so the need for paging was considered a feature we could do without).
During EPiServer 5 development we did a lot of things to improve performance, the primary change was to let DataFactory return read-only object instances straight from the cache. In EPiServer 4 we always copied the object and returned the copy to allow callers to modify the object without compromising data integrity. The copy process was a performance killer as well as wasting memory since in most cases the returned object never needs to change.
We also wanted to support object data sources in ASP.NET 2.0 and we wanted to support binding the data sources to web controls with paging support. That meant that we had to do paging all the way back to the database, including DataFactory.GetChildren. If GetChildren were to do access rights filtering we would defeat much of the performance gains that we got with the “read only” change since we then would have to recreate cached listings based on the access rights of the calling user. By removing access rights filtering from GetChildren we can re-use the same cached listings over and over again.
The result of these changes was that we moved filtering and access checks up in the layers to the base classes for our controls.
Now, finally, this leads us to the consistency problem – why should GetChildren return something that GetPage cannot? I.e. it does not make sense to have one API preventing access to a data item and another API giving you the very same data item when the methods are present on the same class.
Another factor, although not as important as the first, is the new Page Provider concept, basically allowing you to insert any data source as being the source of PageData in EPiServer. One design decision was that implementing security in a Page provider should be optional, i.e. depending on where the data come from we might not have information available to decide if we should throw AccessDenied or not. Yes it is not a big thing to check if security is enabled for the current provider and check access based on that information, but it was another small thing (there were a few more) that nudged us to this decision.
We've considered introducing a “backward compatibility switch” that is disabled by default for new installations, but enabled for upgraded sites. Maybe something to consider for SP1. Personally I hate this kind of “legacy preservation trick” – the only clean way of preserving legacy semantics is to use the class factory pattern to keep legacy code isolated to distinct classes, preferably in a separate assembly. But we may have to do that anyway if enough people have problems with our change.”
When CMS 5 R1 was released, GetChildren did indeed ignore security settings on pages, and it struck me then that GetPage was behaving inconsistently to GetChildren. Knowing the background for the change, I understand why it was done. Adam even devoted a great blog posting about the inability to check page access without relying on exceptions.
R1 was a brand new version, and no automatic upgrade path existed, so this could be communicated to people in due time. This time however, the change is more subtle, so we need to tell you in bigger letters.
One important fact to note though, the web controls will do security filtering for you. This change is for the DataFactory class, and only apply to you if you’re doing GetPage yourself, or use 3rd party modules that does.
AccessLevel.NoAccess
A lot of code uses pages without needing the security checks, and the usual work-around to prevent the AccessDeniedException is to call GetPage with an access specifier, like this:
DataFactory.Instance.GetPage(
pageLink,
AccessControlList.NoAccess);
This overload has been depreciated, and is marked as Obsolete in R2, giving you a compile time warning.
“EPiServer.DataFactory.GetPage(EPiServer.Core.PageReference, EPiServer.Security.AccessLevel)' is obsolete: 'Use overload without access parameter. Access rights are ignored.”
Just remove the access parameter from your call, and you're good.