Machine Learning: A Practical Guide To Managing Risk

Rihad Variawa on 2019-11-03

The ultimate aim of this article is to enable data science and compliance teams to create better, more accurate, and more compliant ML models.


A fundamental question raised by the increasing use of machine learning (ML) is quickly becoming one of the biggest challenges for data-driven organizations, data scientists, and legal personnel around the world. The challenge arises in various forms and has been described in various ways by practitioners and academics alike, but every version comes down to the same basic ability: asserting a causal connection between the inputs to a model and how that input data impacts the model's output.

According to Bain & Company, investments in automation in the US alone will approach $8 trillion in the coming years, many of them premised on recent advances in ML. But these advances have far outpaced the legal and ethical frameworks for managing this technology. There is simply no commonly agreed-upon framework for governing the risks — legal, reputational, ethical, and more — associated with ML.

This post aims to provide a template for effectively managing this risk in practice, with the goal of giving lawyers, compliance personnel, data scientists, and engineers a framework to safely create, deploy, and maintain ML models, and to enable effective communication between these distinct organizational perspectives.

Key Objectives & Three Lines of Defense

Projects that involve ML will be on the strongest footing with clear objectives from the start. To that end, all ML projects should begin with clearly documented initial objectives and underlying assumptions. These objectives should also include major desired and undesired outcomes and should be circulated amongst all key stakeholders. Data scientists, for example, might be best positioned to describe key desired outcomes, while legal personnel might describe specific undesired outcomes that could give rise to legal liability. Such outcomes, including clear boundaries for appropriate use cases, should be made obvious from the outset of any ML project. Additionally, expected consumers of the model — from individuals to systems that employ its recommendations — should be clearly specified as well.
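One lightweight way to keep these objectives documented and circulated is to capture them as a structured record that every stakeholder can read and amend. A minimal sketch in Python, with hypothetical field names and example values chosen purely for illustration:

```python
from dataclasses import dataclass

@dataclass
class ModelObjectives:
    # Captured at project kickoff and circulated to all key stakeholders.
    name: str
    desired_outcomes: list      # typically drafted by data scientists
    undesired_outcomes: list    # typically drafted by legal personnel
    approved_use_cases: list    # clear boundaries on appropriate use
    consumers: list             # people or systems consuming the output

    def is_approved_use(self, use_case: str) -> bool:
        # Makes the use-case boundary checkable in code, not just in a document.
        return use_case in self.approved_use_cases

objectives = ModelObjectives(
    name="credit-scoring-v1",
    desired_outcomes=["accurate default prediction"],
    undesired_outcomes=["disparate impact on protected classes"],
    approved_use_cases=["loan underwriting"],
    consumers=["underwriting team", "decision API"],
)
```

Storing such a record alongside the model means the "appropriate use case" boundary can be enforced programmatically rather than rediscovered in a document after the fact.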

Once the overall objectives are clear, the three “lines of defense” should be clearly set forth. Lines of defense refer to the roles and responsibilities of data scientists and others involved in the process of creating, deploying, and auditing ML. For example, “effective challenge” of a model by multiple parties throughout its lifecycle is a crucial step, and one that must remain distinct from model development itself. The ultimate goal of these measures is to develop processes that direct multiple tiers of personnel to assess models and ensure their safety and security over time. Broadly speaking, the first line focuses on the development and testing of models, the second line on model validation and legal and data review, and the third line on periodic auditing over time. Lines of defense should be composed of the following five roles:

Some organizations rely on model governance committees — which represent a range of stakeholders impacted by the deployment of a particular model — to ensure that members of each of the above groups perform their responsibilities, and that appropriate lines of defense are put in place before any model is deployed. While helpful, such review boards may also stand in the way of efficient and scalable production. As a result, executive-led model review boards should shift their focus to developing and implementing processes surrounding the roles and responsibilities of each of the above groups. These boards should formulate and review such processes before they are carried out and in periodic post-hoc audits, rather than individually reviewing each model before deployment.

Critically, these recommendations should be implemented in varying degrees, consistent with the overall risk associated with each model. Every model has unforeseen risks, but some deployments are more likely to demonstrate bias and result in adverse consequences than others. As a result, it's recommended that the depth, intensity, and frequency of review factor in characteristics including the model’s intended use and any restrictions on use (such as consumer opt-out requirements), the model’s potential impact on individual rights, the maturity of the model, the quality of the training data, the level of interpretability, and the predicted quality of testing and review.
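This risk-based calibration can be made concrete with a simple scoring rule. The sketch below maps a few of the factors listed above to a review tier; the weights, thresholds, and tier names are illustrative assumptions, not values from this article:

```python
def review_intensity(impact_on_rights, model_maturity,
                     data_quality, interpretability):
    # Each factor is scored 0-1. Impact on individual rights raises risk;
    # maturity, data quality, and interpretability lower it.
    # Weights and cutoffs below are hypothetical starting points.
    score = (0.4 * impact_on_rights
             + 0.2 * (1 - model_maturity)
             + 0.2 * (1 - data_quality)
             + 0.2 * (1 - interpretability))
    if score >= 0.6:
        return "quarterly deep review"
    if score >= 0.3:
        return "semi-annual review"
    return "annual spot check"
```

Even a crude rule like this forces the depth and frequency of review to be decided explicitly per model, rather than applied uniformly.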

Focusing On The Data Input

Once proper roles and processes have been put in place, there is no more important aspect of risk management than understanding the data being used by the model, both during training and deployment. In practice, maintaining this data infrastructure — the pipeline from the data to the model — is one of the most critical, and also the most overlooked, aspects of governing ML. Broadly speaking, effective risk management of the underlying data should build upon the following recommendations:
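One concrete check on the underlying data is to compare the distribution a feature had at training time against what the live pipeline is currently feeding the model. The sketch below computes the population stability index (PSI) using only the standard library; the 0.1/0.25 thresholds in the comment are a common industry rule of thumb, not guidance from this article:

```python
import math

def population_stability_index(expected, actual, bins=10):
    # Compare the live ("actual") distribution of a feature against
    # the training-time ("expected") distribution, binned over the
    # range observed in training.
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0

    def frac(data, i):
        left = lo + i * width
        right = left + width
        n = sum(left <= x < right or (i == bins - 1 and x == hi)
                for x in data)
        return max(n / len(data), 1e-6)  # floor avoids log(0)

    # Rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift, > 0.25 major shift.
    return sum((frac(actual, i) - frac(expected, i))
               * math.log(frac(actual, i) / frac(expected, i))
               for i in range(bins))
```

Running a check like this on every scheduled data refresh turns "understand your data" from a one-time exercise into an ongoing control.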

Use Model Output Data As A Window Into Your Model

Understanding the outputs of a model — both during training and once in deployment — is critical to monitoring its health and any associated risks. To that end, it’s recommended that data owners, data scientists, validators, and governance personnel:

As with the above recommendations on shifts in the underlying data, actionable alerts should also be a priority in monitoring the model's output. It is critical that these alerts reach the right personnel, and that they be saved for auditing purposes.
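As a sketch of what an actionable, auditable output alert might look like, the function below flags drift in the live positive-prediction rate and appends an audit record addressed to named recipients. The baseline, tolerance, and recipient values are hypothetical placeholders:

```python
from datetime import datetime, timezone

ALERT_LOG = []  # stand-in for durable storage retained for auditing

def check_output_rate(positive_rate, baseline=0.12, tolerance=0.05,
                      recipients=("model-owner@example.com",)):
    # Alert when the live positive-prediction rate drifts from the
    # training-time baseline beyond the allowed tolerance.
    drifted = abs(positive_rate - baseline) > tolerance
    if drifted:
        ALERT_LOG.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "metric": "positive_rate",
            "value": positive_rate,
            "recipients": list(recipients),  # route to the right personnel
        })
    return drifted
```

Keeping the recipient list and the full alert record together satisfies both requirements at once: the right people are notified, and the event survives for later audit.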

Conclusion

Effective ML risk management is a continuous process. While this post has been focused on the deployment of an individual model, multiple models may be deployed at once in practice, or the same team may be responsible for multiple models in production, all in various stages. As such, it is critical to have a model inventory that’s easily accessible to all relevant personnel. Changes to models or underlying data or infrastructure, which commonly occur over time, should also be easily discoverable. Some changes should generate specific alerts, as discussed above.
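A model inventory of the kind described here need not start as a heavyweight system; even a small registry that records every change makes models and their history discoverable by all relevant personnel. A minimal sketch with an illustrative, made-up API:

```python
class ModelInventory:
    # Minimal registry: every registration is also appended to a
    # changelog so that changes over time remain discoverable.
    def __init__(self):
        self._models = {}
        self._changelog = []

    def register(self, name, version, stage, owner):
        self._models[name] = {"version": version, "stage": stage,
                              "owner": owner}
        self._changelog.append((name, version, stage))

    def list_models(self):
        return dict(self._models)  # snapshot for any relevant personnel

    def history(self, name):
        return [entry for entry in self._changelog if entry[0] == name]
```

The changelog is what makes changes "easily discoverable" as recommended above; in a production setting it would also be the natural place to hook the specific alerts discussed earlier.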

There is no point in time in the process of creating, testing, deploying, and auditing production ML where a model can be “certified” as being free from risk. There are, however, a host of methods to thoroughly document and monitor ML throughout its lifecycle to keep risk manageable and to enable organizations to respond to fluctuations in the factors that affect this risk.

To be successful, organizations will need to ensure all internal stakeholders are aware and engaged throughout the entire lifecycle of the model.

Thank you for reading my post.