BC Library Licensing Authentication Landscape – Challenges & Potential Future Directions
Scott Leslie, BC Libraries Cooperative, June 20, 2019
Over the last 20 years public libraries, and the Co-op along with them, have evolved a number of different approaches to providing patrons with authenticated access to licensed digital content, especially when they are not in-branch. This paper looks at the current approaches and the issues they face, proposes interim steps to remedy some of these issues, and lays out a long-term strategy for moving to an open-standards-based approach that will provide stability and cost savings at scale, promote privacy-by-design, and ultimately serve as a platform for expanding library offerings from digital content alone to digital services.
Current Approaches and Issues
Until recently, libraries have relied primarily on two methods to provide authentication for licensed content: SIP interfaces and URL rewriting proxies such as ezProxy. Both made sense for their time, but both now have potential issues.
In addition, content vendors have in the past accepted barcode pattern matching, GeoIP validation and “referring URLs” as methods to limit access to their resources. While all of these have represented easy fixes for libraries, all three are increasingly falling out of favour with publishers because they can be circumvented and do not offer a high degree of protection for licensed resources.
The first of these is SIP. SIP is the “Standard Interchange Protocol,” first developed by 3M in 1993 to allow self-checkout machines to perform basic check-in and check-out operations. As such it was explicitly built for communication over a local area network and did not itself provide any form of encryption. Three major issues have come to light in the years since it first began to be widely used to authenticate licensed products: cost, privacy and security.
With regard to the first issue, cost: while the Co-op is able to provide SIP endpoints for Sitka libraries as part of the basic package, many other ILSs charge additional (sometimes hefty) fees for each SIP endpoint, meaning libraries have had either to pay more or to look for alternative solutions. This was indeed the genesis of the Co-op’s KAPA service, a standalone basic SIP server that imports valid patron barcodes from libraries on a daily basis and is still in service for around 6 libraries in BC.
Even when libraries were able to avail themselves of a SIP server from their vendor, these interfaces presented another challenge: privacy. Many of the transactions supported by a full SIP server reveal personal and private information that does not need to be shared with a vendor, and sharing it with US-based content providers violates FIPPA. Some ILS vendors have responded by turning off these privacy-violating fields, while others have not. In the Co-op’s case, we developed an “anonymized” SIP interface that works like our KAPA service, responding with only a privacy-enhancing “Valid/Invalid” response when presented with a patron barcode, and nothing else.
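The “Valid/Invalid” behaviour described above can be illustrated with a minimal Python sketch. This is not the Co-op’s actual KAPA or anonymized SIP code, and it does not speak the SIP wire protocol; the function name, the barcode values and the in-memory set standing in for the daily barcode import are all hypothetical.

```python
from typing import AbstractSet


def validate_barcode(barcode: str, valid_barcodes: AbstractSet[str]) -> str:
    """Answer with only "Valid" or "Invalid"; no patron details are returned."""
    normalized = barcode.strip()
    return "Valid" if normalized in valid_barcodes else "Invalid"


# Hypothetical stand-in for the list of barcodes imported daily from a library.
valid_barcodes = frozenset({"29000000000017", "29000000000025"})

print(validate_barcode("29000000000017", valid_barcodes))
print(validate_barcode("00000000000000", valid_barcodes))
```

The point of the design is that the vendor learns a single bit of information per request, which is the privacy property the anonymized interface is meant to preserve.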
However, this approach has not been taken up by members and is also now coming into question, as vendors (most recently both Lynda.com and Hoopla) are insisting that patrons authenticate with both a barcode AND a PIN. While both Sitka SIP and KAPA could be made to do this, the lack of built-in encryption (and the content vendors’ unwillingness to support encrypted tunnels) means this would involve transmitting a barcode/PIN combination “en clair,” or unencrypted, over the wider internet. This is a security issue the Co-op recommends against and is not willing to impose on our libraries’ patrons. So generally speaking, while incremental improvements have been made to SIP from a privacy perspective, in the end it was not designed to be an internet-wide authentication protocol.
URL Rewriting Services (e.g. ezProxy)
The other approach that is regularly used in libraries to provide remote access is URL rewriting services like ezProxy. ezProxy, now developed by OCLC, is a “web proxy server used by libraries to give access from outside the library’s computer network to restricted-access websites that authenticate users by IP address. This allows library patrons at home or elsewhere to log in through their library’s EZproxy server and gain access to resources to which their library subscribes, such as bibliographic databases.” The Co-op uses ezProxy to provide access to Queen’s Printer publications from the IP ranges of public libraries in BC, and other libraries have implemented it as a way for patrons to log into publisher content (e.g. Burnaby uses it to log their patrons into OverDrive).
ezProxy is actually a decent solution for the situations in which it is supported. It generally does a good job of preserving patron anonymity: patrons log into the proxy server/ILS and their traffic is sent to the content server, and the only thing the third party validates is the IP address of the proxy request. However, it has two issues: price and potential abandonment by content vendors. Regarding price, with version 6 OCLC changed its licensing model for ezProxy. While it may still be within reach for a small to mid-sized library, when the Co-op requested a license for all BC public library users to access the single set of Queen’s Printer publications, we were quoted a 5-figure annual cost. Beyond that, it is a method which publishers (especially academic ones) are pushing libraries to abandon. The public library sector deals with a different set of publishers than the academic sector, so it may be that the publishers we deal with will not require public libraries to move on from ezProxy. But they likely have a similar set of complaints (the ability to spoof IPs and the resulting potential for piracy) to those causing the shift in the post-secondary space, so it is entirely possible such a shift will happen for public libraries too. Finally, ezProxy is primarily a service to proxy access to content, not dynamic services. While this might fulfill many of the current use cases, it does not offer a great platform for future services. So while ezProxy and similar tools may not disappear from the library authentication landscape anytime soon, they also do not represent an investment in the future.
More recently, one ILS company’s API has emerged as a method that some content vendors are starting to support to facilitate patron authentication. III released their PatronAPI in June 2014 for Millennium (becoming Sierra). The PatronAPI is a modern, albeit proprietary (vendor-controlled and closed source), RESTful approach to patron authentication that can be secured via conventional TLS to encrypt the communications. Currently, there are only 5 BC libraries whose ILS can support it natively, although a few others have taken steps to implement code against their own ILS to emulate it (VPL, SPL).
As an interim step, the Co-op was already looking to develop a web authentication API for Sitka. This is useful because it creates a standardized way for us to integrate any authentication solution, be it ezProxy or any other authentication (authN) service. Work on this should be complete by the end of summer 2019.
The Co-op is being asked by lynda.com to add patron PINs to our current SIP endpoint. We purposely leave these out and, because of the lack of encryption, do not feel that adding them is a prudent step to take. The only other option lynda.com is giving us is to use PatronAPI. While Sitka does not support this (as it’s a closed, proprietary API), the advent of our own web API for authentication will make it fairly straightforward to implement code that emulates the basic authentication routine of PatronAPI, thus allowing Sitka patrons to authenticate to lynda.com in the interim. In addition, the Co-op is working with non-Sitka libraries whose ILSs do not currently implement PatronAPI to develop and share code that will also emulate this interface.
While this workaround of implementing a pseudo “PatronAPI-like” interface will give both the Co-op and some other libraries an interim way to log into 3rd party content vendors that demand libraries adopt a proprietary standard, I believe it is both foolhardy and shortsighted to end our efforts there.
In the first place, we would be emulating a closed, proprietary API owned by a single vendor. While it is true that there are occasions when such technologies become de facto “standards,” time and again our own and countless other industries have shown the long-term dangers (e.g. lock-in, lack of transparency and accountability for changes to the specification) that simply acceding to such de facto standards brings. Also, because we are not actually privy to III’s specification, this leaves the door open for a content vendor to point all blame for a failed integration back at our code even if it is not the problem, as we have no way to fully verify that our implementation matches III’s perfectly.
But more than that, III’s PatronAPI is not yet widely adopted by content vendors. The “Vendor Authentication Map” (https://docs.google.com/spreadsheets/d/1QIYzE6HsbjfN8y0RaCIbeJUbEUwxfNjn9Y_rlbjWVl4/edit#gid=0) developed by the Co-op shows that currently only 4 of the content vendors we actively deal with support it. And if that were not enough to question the wisdom of landing on this vendor-specific solution, the bigger issue is what, by dint of being *library* specific, such a solution does NOT enable: integration with services across the wider internet that aren’t just content collections.
Especially in the public library space (as contrasted with post-secondary), authentication of library patrons has been deeply tied to existing legacy systems, specifically the ILS. This makes some sense, as the need has always been perceived as authenticating and authorizing a patron in relation to a collection, something that the original ILSs were built to do. But in almost every other sector, the identity provision/authentication function has been separated out from the specific systems that consume it. This has largely been driven by the understanding that user identity and its related functions are too important across multiple systems and lines of business to be buried within a single one. Removing them from a specific line-of-business system allows them to be placed in a central service that can be consumed across multiple types of systems in an open, standard way, rather than in a way peculiar to a specific system’s business logic or siloed protocols.
There are numerous internet-wide open standards that have emerged over the past 10 years to allow disparate systems to authenticate users. The predominant ones to consider are the Security Assertion Markup Language (hereafter SAML), OpenID, and CAS (as well as potentially OAuth and OpenID Connect). SAML, OpenID and CAS all work slightly differently, but in essence they allow a 3rd party service to authenticate a user in a way that is standards-based and architected to work across the web, not just within the confines of a single LAN, and in a way that allows libraries to control what information is shared, including the possibility of sharing none. While OAuth is an authorization framework, it can be made to do authentication, and OpenID Connect is an identity layer built on top of OAuth 2.0 (filling much the same role as SAML). All of these have multiple implementations, both commercial and open source, each with strengths and weaknesses, and indeed some products support many of them on the same server (for example the slightly confusingly named Apereo CAS Server, distinct from the CAS protocol). This last fact means that the most prudent approach is likely to implement a solution that can support a number of these open protocols, offering the most choice while still requiring only a single integration on the back end.
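The idea that a library controls what is shared with each service, “including the possibility of sharing none,” can be sketched as a simple attribute release policy. The following Python is illustrative only: it is not taken from any SAML, CAS or OpenID Connect product, and the patron fields, vendor names and policy table are all hypothetical.

```python
from typing import Dict, FrozenSet

# Hypothetical patron record held by the library's authentication server.
patron = {
    "barcode": "29000000000017",
    "name": "Example Patron",
    "email": "patron@example.org",
    "home_library": "Example Branch",
}

# Hypothetical policy: which attributes each service is allowed to receive.
RELEASE_POLICY: Dict[str, FrozenSet[str]] = {
    "vendor-a": frozenset(),                  # learns only that login succeeded
    "vendor-b": frozenset({"home_library"}),  # receives one non-identifying field
}


def released_attributes(record: Dict[str, str], vendor: str) -> Dict[str, str]:
    """Return only the attributes the policy permits; unknown vendors get none."""
    allowed = RELEASE_POLICY.get(vendor, frozenset())
    return {key: value for key, value in record.items() if key in allowed}


print(released_attributes(patron, "vendor-a"))
print(released_attributes(patron, "vendor-b"))
```

Defaulting to an empty set for any service not listed keeps the privacy-preserving choice as the fallback, so a misconfigured integration shares nothing rather than everything.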
Already, there are 6 products on the licensing authentication map that support SAML. Of these, 4 are within the proposed “core suite” (EBSCOhost, OverDrive, Mango Languages and ProQuest), or 5 if Lynda, which is known to support it for post-secondary institutions, is included. So there is certainly some distance to go before it is adopted across all licensed products. In this regard, the ready availability of software libraries to act as SAML service provider clients should encourage adoption, as should the improvement this will bring to any vendor concerns about unauthorized access. In addition, SAML is widely adopted within the post-secondary library world, and indeed with the rise of the RA21 project looks to be positioning itself as the preferred approach.
In the fall of 2019 the Co-op will commence a pilot to test integration of an open standards authentication server with Sitka. Part of the pilot will be assessing the best candidate for the server. The current frontrunner is Apereo’s CAS server, as it is both open source and supports multiple protocols, although there are others that similarly fit the bill.
We believe taking this step is important to preserve both privacy and security in licensing authentication. However, we also think this step has additional benefits that make it even more compelling. In addition to facilitating access to licensed content, these standards were created to facilitate internet-wide access to services. By implementing open standards based solutions, we have the chance in the online world to fulfil the tantalizing promise of “library as platform.”
There are vast sets of digital services and software that are compatible with these open standards, both commercial and open source. Implementing this kind of open standard identity infrastructure would allow libraries to finally offer digital services, not just content, to both staff and patrons in a way that can bring badly-needed library values and perspectives to this space. An exhaustive list of services that can authenticate against a SAML server is not possible, as it is always growing. To name but a few, services such as Zoom, Dropbox, Evernote, WebEx, the Google apps suite and Slack can currently validate users against a SAML server. As part of the pilot, the Co-op plans to strike a working group to develop the list of use cases and applications libraries may be interested in integrating.
Issues and Risks
Such Servers are Complex to Implement
Implementing one of these open standard authentication servers can be non-trivial. In the case of Sitka, we would only need to implement it once to provide this facility to all Sitka library users and the Co-op is well staffed with technologists to implement this. In addition, the advent of the web authentication API already in development will ease the integrations.
Some of the large CULC libraries in the province have already looked at this technology and are likely able to implement it themselves if they choose to. But what of the rest, the non-Sitka libraries that are possibly not able to do this themselves?
The complexity may actually have a silver lining, and the Co-op may already be positioned to facilitate this at scale. If every library implements such a server themselves, this still leaves the vendors with 22 integrations to do in the province (the 21 non-Sitka libraries and Sitka). However, if an authentication server is developed as a central service, that greatly reduces the technical burden on the vendors, which is something that can be put to good advantage in licensing cost negotiations. And the Co-op already talks to 65 library ILSs in the province for the NNELS project. For those who employ SIP with NNELS, these connections could be secured once, via SSH tunnel, to a central authentication server at the Co-op, which would then provide secure and private auth services to vendors. This would save a lot of work securing these connections (work that likely needs to happen anyway for the connections to NNELS to be secure). This is entirely analogous to the ezProxy service the Co-op already successfully runs for every public library in BC to facilitate access to Queen’s Printer resources.
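The scale argument above is simple arithmetic, sketched here in Python. The figures of 21 non-Sitka libraries plus Sitka come from the text; the vendor count of 14 is an assumption for illustration only, borrowed from the number of “core suite” products and treating each product as a separate vendor.

```python
def vendor_integrations(num_vendors: int, num_auth_servers: int) -> int:
    """Each vendor must integrate once with every authentication server."""
    return num_vendors * num_auth_servers


NON_SITKA_LIBRARIES = 21  # from the text
SITKA = 1                 # Sitka counts as a single integration
EXAMPLE_VENDORS = 14      # assumption: one vendor per "core suite" product

# Every library (and Sitka) runs its own authentication server.
per_library = vendor_integrations(EXAMPLE_VENDORS, NON_SITKA_LIBRARIES + SITKA)

# One central authentication service for the whole province.
central = vendor_integrations(EXAMPLE_VENDORS, 1)

print(per_library)  # 308
print(central)      # 14
```

Under these assumptions, each vendor goes from 22 integrations to 1, and the total across the province drops by the same factor of 22.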
Vendors may not move to open standards
For a move to an open standard authentication solution to be successful, it needs both to solve current problems and to enable future solutions. As previously mentioned, 5 of 14 potential “core suite” products do support an open standard method, SAML. But that still means 9 more don’t.
Moving towards these new solutions will be a process. The good news is that there are existing methods that can be left in place while the transition happens. In addition, there are readily available open source client libraries to enable the content vendors to consume any of these services. And this move is in line with where the larger publishing industry is going, as it ultimately represents a technically better solution for both libraries AND publishers.
Vendors do not negotiate lower license prices based on an easier-to-integrate solution
It is entirely possible that even after doing this work, vendors do not significantly discount their products to recognize a reduction in effort required to do authentication integration. The good news is – such a reduction is a bonus. These solutions should not significantly add to the cost burden of libraries and indeed may significantly offset them in the case of libraries whose ILS require additional fees for external auth interfaces. And any costs need to be understood as a small price to pay to position libraries in the delivery of digital services to patrons.
Not all BC libraries participate
Participation by all BC libraries is in no way a requirement to move forward with this, though the more libraries that participate, the greater the potential economies of scale and the greater the pressure on vendors to move quickly in this direction. Indeed, the greater risk is not simply lack of adoption in BC but lack of sector-wide adoption, with libraries continuing outdated authentication practices. As part of this work, some thought should be given to cross-province communication and liaison.
Cost Overruns / Staffing + Resource Implications
Implementing such services is not without costs or risk. A detailed project plan with associated resources would still need to be developed. However, initial conversations amongst Co-op technical staff indicate that this is both feasible and aligned with our existing roles in facilitating licensing authentication in a number of ways.
Libraries Shifting ILS
Were a library to change ILS, this would mean re-implementing the back-end integration. This is inevitable. However, it can be done once, between the authentication server (whether self-hosted or via the Co-op) and the library, without needing to involve the content vendors. And the fact that the Co-op has methods to connect to every variety of ILS currently adopted in BC means we are well positioned to facilitate this centrally if desired.