Data Ethics and Governance in the Age of LLMs

In a very short period, we have seen the wave of Artificial Intelligence (AI) break on the shores of wide-scale business adoption and mainstream media coverage of Large Language Models (LLMs), most famously ChatGPT.

While these technologies have immense potential to benefit organizations and society, they also bring with them an equal, if not greater, risk. Beyond this, they raise some different questions to consider from an ethics and governance perspective.

Key ethical perspectives

It is important at this juncture to view these developments not as isolated information points but through the lens of a holistic framework such as the E2IM framework that we set out in Data Ethics.

This describes three ethical perspectives that we need to consider when thinking about the ethical use of data and data-related technologies such as LLMs. These need to be considered when trying to interpret the current state of debate and discourse, both in public and within our organizations.

In the E2IM model:

  1. The Ethic of Society reflects the broadly agreed upon ethical norms and expectations of Society. These are often codified and expressed in things like Charters of Fundamental Rights or in legislation and other statutory regulatory frameworks.
  2. The Ethic of the Organization describes the core values of the organization as reflected in its culture, values, and internal governance structures.
  3. The Ethic of the Individual relates to the personal ethical maturity and perspectives of the individual worker in the organization. These personal values are often influenced by social factors in our personal background.

Organizations can influence the Ethic of Society through lobbying on legislation or other forms of regulation. The goal of this lobbying is to influence the regulatory landscape in a way that benefits the interests of organizations.

Individuals can seek to influence the Ethic of the Organization either through boycotting businesses whose values and ethical norms they disagree with or, in the case of people inside organizations, by formalizing the support structures for individuals who are trying to influence the values of the organization.

Potential risks and harms

It’s worth noting that the AI risks and harms currently being discussed as requiring regulation are medium- to long-term harms that might arise in the future. There is limited discussion of the immediate and present harms and ethical problems of AI tools. These range from discrimination in automated decision making, caused by data quality issues or bias in training data, to the processing of personal data in LLM training data.

These issues are significant challenges in the development and adoption of advanced data processing technologies such as AI. Ironically, addressing many of these issues that are challenges today might help mitigate some of the doomsday scenario risks that are the focus of the long-term agenda when discussing the regulation of these technologies in society.

However, the quest for regulation (or business-friendly regulation) needs to be considered in the context of how organizations are developing and nurturing the Ethic of the Organization when it comes to the development of these technologies. Sadly, the news here is not good and there is a glaring disconnect between the public concern and the internal controls. A key influencer of ethical outcomes in data management is the nature of the situational moderators of ethical behaviour within the organization.

The internal data governance structures of the organization, cascading principles to policies and processes, are important in ensuring that individuals take appropriate and ethical actions in respect of data. The termination of AI ethics oversight functions by Microsoft and others in recent months has removed key governance functions for establishing ethical principles and translating them into practice. More worrying is the ‘tone from the top’ this sets, which strongly suggests that AI ethics no longer matters in the era of ChatGPT.

So, we find ourselves in a situation where the organizations promoting the development of AI tools are seeking to influence external regulation and, at the same time, are removing their internal regulation and safeguards. This, in turn, should raise concerns about the effectiveness of any regulation if organizations are removing the data governance structures that would be needed to ensure compliance with the requirements of any external regulation.

Another key consideration for organizations in the design of governance frameworks for the ethical use of data is whether individuals in the organization can raise concerns about the ethics of the organization’s use of AI technologies. Unfortunately, there is a history of lone voices in these firms being dismissed or sidelined when they raised ethical concerns.

Six important lessons to be learned

So, with all this uncertainty surrounding the future of AI and the risks and harms involved, here’s what we do know:

  1. Regulation of AI will happen. The EU is already well advanced in enacting its AI Regulation, other jurisdictions are examining legislation, and OpenAI CEO Sam Altman’s call for regulation before the US Congress will result in the US taking some steps in this direction.
  2. Regulation will take different forms in different jurisdictions, so organizations will need to focus on their internal data governance controls to translate generic ethical principles into tangible data management practices and to ensure that they can demonstrate compliance with regulatory requirements across different jurisdictions.
  3. These data management practices will need to include considering data quality issues in source data that is used to train internal LLM systems as well as the data protection and other issues that might arise from automated decision making based on machine learning models. The ethical questions of transparency of the algorithms and models developed will need to be addressed.
  4. Organizations will also need to address the challenges of data skills and understanding within their organizations to avoid the risk of ‘garbage in/garbage out’ in the development and implementation of AI processes. The ‘secret sauce’ of publicly available LLMs such as ChatGPT has been the reinforcement learning from human feedback (RLHF) that has fine-tuned the models currently available. Organizations will need to consider how this reinforcement learning will be carried out in their organizations, and by whom.
  5. Issues such as data protection, intellectual property rights, and commercial confidentiality will need to be considered as part of the development of the internal data governance frameworks for AI in your organization.
  6. When designing future work in our organizations, we will need to consider what we want the future of work to be. After all, we need to ensure that the society we deliver through AI meets the expectations of those living in it and can support the dignity of all.

Now is the moment when organizations need to shift from an aspirational stance on data ethics to one that starts getting real about the practical challenges of governing data in the emerging age of AI and LLMs.

Important lessons need to be learned. The genie won’t be going back in the bottle.
