Community Spotlight: Ramak Molavi Vasse’i, Mozilla | ChatGPT and LLMs
Algorithm Governance Roundup #6

Welcome to April’s Roundup. This has been a busy month for algorithm governance, especially as regulators and legislators respond to ChatGPT and large language models (LLMs). Given this, we’ve included dedicated sections on ChatGPT and LLMs in the newsletter.

This month’s community spotlight is on Ramak Molavi Vasse’i, the lead researcher of the Meaningful AI Transparency Project at Mozilla. We talk about Mozilla's recent publication AI Transparency in Practice, which investigates explainable AI tools and the challenges AI builders face to implement transparency.

As a reminder, we take submissions: we are a small team who select content from public sources. If you would like to share content, please reply or send us an email. Our only criterion for submission is that the update relates to algorithm governance, with emphasis on the second word: governance. We would love to hear from you!

Many thanks and happy reading!

AWO team

This Month's Roundup
In Europe, the European Commission has designated 17 companies as Very Large Online Platforms (VLOPs) and two companies as Very Large Online Search Engines (VLOSEs) under the Digital Services Act. The VLOPs are Alibaba AliExpress, Amazon Store, Apple AppStore, Booking.com, Facebook, Google Play, Google Maps, Google Shopping, Instagram, LinkedIn, Pinterest, Snapchat, TikTok, Twitter, Wikipedia, YouTube and Zalando. The VLOSEs are Bing and Google Search.

In Luxembourg, the European Court of Justice has published Advocate General Pikamäe’s Opinion in Case C-634/21 SCHUFA Holding and Others (Scoring) on profiling under Article 22 of the GDPR. The Advocate General held that the automated establishment of a probability concerning the ability of a person to service a loan constitutes profiling where it plays a decisive role in the final decision. Read the Digital Constitutionalist analysis here.

In Spain, the European Centre for Algorithmic Transparency held a launch event and expert workshops on data access and audit. They also announced a partnership with France’s Pôle d’Expertise de la Régulation Numérique. (For new readers, the January issue of this newsletter featured a Community Spotlight on ECAT.)

In the UK, the newly established Department for Science, Innovation and Technology published its White Paper, A pro-innovation approach to AI regulation, which introduces a framework underpinned by five principles. The White Paper does not recommend the introduction of specific AI legislation; instead, these principles will be implemented by existing regulators.

The UK-US Summit for Democracy announced the winners of their Challenge for Privacy Enhancing Technologies (PETs). The winning approaches combined different PETs to allow AI models to learn to make better predictions without exposing sensitive data. A demonstration day will be held on 22 May.

The Information Commissioner’s Office has concluded its investigation into FaceWatch, which provides live facial recognition to supermarkets. The regulator accepted the company’s improvements and found they had a legitimate purpose for using people’s information for the detection and prevention of crime. AWO was instructed by Big Brother Watch on this complaint.

In the US, Mozilla Foundation has invested $30 million to create a trustworthy AI ecosystem. Mozilla’s Meaningful AI Transparency project has published the report AI Transparency in Practice, exploring Explainable AI (XAI) tools and the challenges AI builders face in operationalising transparency.
In this month’s community spotlight we talk to lead researcher Ramak Molavi Vasse’i about this research.

ChatGPT and Large Language Models (LLMs)

In Europe, the European Parliament has published an explainer on General-purpose artificial intelligence (GPAI). The document covers generative AI tools, harms and legal issues, and regulation under the EU’s AI Act.

An MEP staffer, Kai Zenner, described GPAI as one of the last sticking points for AI Act parliament negotiations.

Researchers at the AI Now Institute, DAIR, Mozilla and Yale have published a policy brief that argues GPAI should not be excluded from the EU’s AI Act as GPAI is an expansive category beyond chatbots/LLMs, carries inherent harms, and must be regulated throughout the product cycle. Providers should not be able to relinquish responsibility using a standard legal disclaimer.

In France, the Pôle d’Expertise de la Régulation Numérique (PEReN) has published a briefing, Shedding light on…n°6 - ChatGPT and the rise of conversational AI models. PEReN are a government data science organisation that supports digital platform regulation. Their technical briefing considers the main challenges and limits of the technology and regulation under the EU’s AI Act.

In Italy, the Italian DPA, the Garante, issued an immediate temporary limitation on OpenAI’s processing of personal data in Italy for ChatGPT. The reasons cited include a lack of information provided to users and data subjects, the lack of a legal basis, a lack of accuracy, and the ability of under-13s to access the service in the absence of age verification. The limitation will be suspended if OpenAI complies with certain measures. These include publishing an information notice; creating tools to object to the processing of personal data and to obtain rectification or, where that is not possible, erasure; changing the legal basis to consent or legitimate interest; developing age verification tools for under-13s; and running a mass-media information campaign. Read the TechCrunch analysis here.

The European Data Protection Board has created a dedicated task force to foster cooperation on enforcement actions against ChatGPT.

In the US, The Washington Post has published Inside the secret list of websites that make AI like ChatGPT sound smart. The investigation analyses the Google C4 data set which includes millions of websites and was used to train LLMs such as Google’s T5 and Facebook’s LLaMA. It found that half of the top 10 sites were news sites, the copyright symbol appeared over 200 million times and it contained adult and racist content.

In China, draft measures have been introduced to regulate generative AI services. The regulation covers data protection, non-discrimination, bias and the quality of training data, and obligations to protect the rights and interests of end-users, including detailed transparency requirements. Providers will need to complete a security assessment and submit this to the national security agency.

The Ada Lovelace Institute has published Inclusive AI Governance: Civil society participation in standards development. This report explores the role technical standards play in the implementation of the AI Act and recommends ways to improve civil society participation to protect fundamental rights. This blog post explains the role of standards in AI governance.

The AI Now Institute has published Confronting Tech Power: 2023 Landscape. This report focuses on concentrations of technology power and the data, computing power, and geopolitical advantages that accompany it. It discusses algorithmic accountability, including moving beyond audits, impact assessments and data access to address structural problems, and the use of data minimisation.

Babl has published The Current State of AI Governance. This report examines the state of organisational AI governance and identifies key trends concerning implementation, including the use of repositories and risk assessments, a lack of external stakeholder engagement, a lack of metrics, and difficulty in recruiting the right skillsets.

The European Artificial Intelligence & Society Fund and Eticas have published How public money is shaping the future direction of AI: An analysis of the EU’s investment in AI development. This report analyses how funding is allocated and who receives it. It also interrogates how EU values and ethics are guiding funds, finding that only 30% of AI funding calls mention trustworthiness, privacy or ethics. The recommendations include publicly accessible data, evaluation of the real-world impacts of funding, and mechanisms for civil society participation in funding.

Fair Trials has published an example law enforcement profiling tool. The advocacy tool is based on information which is used by law enforcement and criminal justice authorities in predictive and profiling systems. The full press release is here.

The Institute for the Future of Work has published Good Work Algorithmic Impact Assessment: An approach for worker involvement. This guidance is for employers on how to involve workers in the assessment of algorithmic systems used in the workplace that may have significant effects on access, conditions and quality of work.

The Institute of Electrical and Electronics Engineers (IEEE) has made their AI Ethics and Governance Standards freely available through the GET Program. The Standards include Age Appropriate Design, Ethical Concerns during System Design, Transparency, Data Privacy, and Assessing Impacts on Human Well-Being. IEEE welcomes feedback on standards via email.

The Knight First Amendment Institute has published Twitter showed us its algorithm. What does it tell us? This article explores Twitter’s source code release as a case study of social media transparency. It investigates what the code does and doesn’t reveal, from the perspective of information propagation and algorithmic amplification on social media.

Stanford University’s Human-Centred AI Centre has published the 2023 AI Index Annual Report. The report analyses data related to AI, including foundation models and their geopolitics and training costs, the environmental impact of AI systems, AI education, and public opinion trends in AI.

ChatGPT and Large Language Models (LLMs)

The OECD has published AI Language Models: Technological, socio-economic and policy considerations. This report provides an overview of national initiatives for the development of language models. It then analyses the technical aspects using the OECD Framework for Classification of AI systems and the policy considerations through the lens of the OECD AI principles.

The Tech Won’t Save Us podcast published ChatGPT Is Not Intelligent with Emily M. Bender to discuss the history of LLM development and scepticism of current hype, including through discussion of her paper On the Dangers of Stochastic Parrots, co-authored with Timnit Gebru, Margaret Mitchell and Angelina McMillan-Major in 2021.

Data & Society is hiring a Senior Researcher and a Researcher for its AIMLab Project. The research involves developing new models for algorithmic impact assessments. The roles are US-based.
Applications are reviewed on a rolling basis.

The Competition and Markets Authority is hiring a range of roles in the Data, Technology and Analytics (DaTA) unit, including Assistant Director, Intern, Adviser. The Assistant Director role asks for experience in investigating and auditing large-scale sociotechnical systems.
Application deadlines are 02, 09, 15 May respectively.

The European Consumer Organisation (BEUC) is hiring a Digital Policy Officer. The role involves monitoring issues and advocacy on data protection, privacy, the data economy and platform regulation. The role is Brussels-based.
Application deadline is 05 May.

The European Commission is calling for evidence for the Delegated Regulation on data access in the Digital Services Act.
Evidence deadline is 23 May.

The Digital Freedom Fund and European Digital Rights (EDRi) are requesting input on their draft programme for their Decolonising Digital Rights initiative.

Upcoming Events
Generative AI and the Creative Sector: The EU’s AI Act
Online: 27 April 14:00 - 15:00 GMT
This webinar is organised by Access Partnership. It discusses the regulation of artificially generated content under the EU’s Artificial Intelligence Act, with a focus on deep fakes and ChatGPT.

Is halting AI development the right aim for Europe’s AI policy?
Online: 27 April 17:00 - 18:00 GMT
This event is organised by SNV. SNV’s Project Lead on Artificial Intelligence will discuss general purpose AI (GPAI), the EU’s AI Act and the open letter calling for a pause on the development of GPAI with Max Tegmark, a co-initiator of the letter.

Child Rights by Design
Online: 15 May 17:00 - 18:30 GMT
This webinar is organised by 5Rights Foundation. It showcases their work through the Digital Future Commission on Child Rights by Design. It will explore how innovators, policymakers and NGOs can advance children’s best interests in a digital world, recognising international developments and challenges.

How are New Technologies impacting Human Rights?
Online: 16 May 17:00 - 18:00 GMT
This panel is organised by the Minderoo Centre for Technology and Democracy. The panel features AI and human rights experts, a forensic consultant, and an anthropologist of genocide and digital technologies to explore what rights-promoting technology looks like.

re:publica 2023
In-person: 05 - 07 June, Arena Berlin and Festsaal Kreuzberg, Berlin
A festival for digital society. Tracks include Politics and Society, Net for Good, Arts and Culture, Media, Spheres of Work, and The Human Touch.
Standard tickets are €299.

Community Spotlight: Ramak Molavi Vasse’i, Mozilla
Ramak Molavi Vasse’i (The Law Technologist) is a digital rights lawyer and senior researcher. She leads the Meaningful AI Transparency Research Project at Mozilla, which recently published the report AI Transparency in Practice.

Q: What was the impetus for Mozilla Foundation’s Meaningful AI Transparency Project?
Ramak: The Mozilla Foundation has been working on Trustworthy AI for several years.  The objective of this work is to encourage the development of human-centred AI that benefits users and society. Mozilla’s work on Trustworthy AI has three main areas: transparency, bias, and data governance. Transparency is crucial because it unlocks accountability: we need information to assess whether a system’s outcome is biased and to mitigate the negative impacts of AI on people and the environment.

Q: What is the concept of Meaningful AI Transparency?
Ramak: We have learnt a lot about transparency from existing regimes, particularly that it cannot be an end in itself. For example, the GDPR requires a lot of information to be shared and documented. However, privacy policies are often so detailed and long that the information tends to be overwhelming rather than useful, which can have an obscuring rather than empowering effect.

Therefore, we adopted the concept of Meaningful AI Transparency, which has two dimensions: thematic and functional.

The thematic dimension encompasses the elements of AI we need information about:
  • Social transparency: Information about the potential impact of the system at an individual, community and societal level. This could include technology assessments or information on working conditions.
  • Ecological transparency: Information about the environmental footprint of an AI system. For example, training large language models is very energy- and water-intensive.
  • Responsibility chain: Information about the people who built the system, who annotated the data, who cleaned the system's results, or who decided on the specific use of the AI systems.
  • Explainability: Information about how the system generated a particular outcome.
  • AI Purposes and Metrics and Data Provenance: Technical information, including model size, architecture, training methods and training datasets, needed to evaluate and assess the system and its outcomes. For example, when OpenAI released GPT-4 they stated the input data included everything available on the internet, but they did not share any of this information, which makes the system incredibly difficult to assess.

The functional dimension encapsulates the idea that information has a function and to be meaningful it must be useful, actionable, and verifiable. The function always depends on the stakeholder the information is aimed towards. For example:
  • AI builders are most interested in security through debugging and improving systems.
  • Clients are interested in trusting system outcomes and relating it to their own knowledge.
  • Impacted persons may be interested in how to provide alternative information to get an alternative outcome or challenge a system. For example, if they have not been considered for a particular job or have been denied a loan.
  • Civil society may want to take a deeper look at a system to review its impact on privacy or societal and environmental well-being, which will require API access.
  • Regulators are interested in assigning accountability and enforcing regulation, which will require system/data access and internal documentation such as assessments and mitigations.

Q: Can you tell us about the AI Transparency in Practice Report?
Ramak: The initial idea for the Meaningful AI Transparency Project was to create practical guidance on how to use explainable AI (XAI) tools. XAI tools are a variety of methods used to understand and explain the predictions of black-box AI models.

To gather useful insights, we decided to focus on the perspective of AI builders working on transparency, because they are essential for the implementation of these tools but are under-researched. AI builders are the developers and deployers who are directly involved in the design and creation of AI systems, or at a later stage in the AI lifecycle. They are data scientists, system architects, machine learning or data engineers, (highly involved) product owners and UX designers.

We surveyed 52 self-identified AI builders working in industry organisations about transparency. Our survey was anonymous and had many open answer response fields to encourage honest and accurate answers. In conducting the survey, we found a huge gap between the research on explainable AI and the use of ex-post explainability tools in the day-to-day practice of AI builders. It emerged that these tools were immature, difficult to use in practice and there was a lack of confidence in the results. Therefore, we were unable to provide practical guidance. Instead, we decided to take an action research approach and identified gaps and issues to explore in more depth. This included the challenges AI builders face when using explainable AI tools. We conducted 7 in-depth interviews for a deeper understanding of these challenges.
We found that AI builders conducting transparency work focus mainly on system accuracy and debugging. They are more concerned with internal OKRs and client expectations than helping end-users and impacted persons understand algorithmic decisions. They explained that management do not prioritise further transparency work because it is costly and slows down the development process. Respondents are least concerned with compliance with internal ethical codes or regulation. 

We have identified several areas that influence the implementation of explainable AI, leading to a non-implementation cycle in day-to-day work.
(Perceived) lack of regulation: Respondents stated they are operating in a regulatory gap. It was suggested that management would prioritise transparency work once the industry becomes regulated by the EU’s Artificial Intelligence Act. We were surprised by the apparent lack of awareness of the applicability of existing regulation, such as data protection, anti-discrimination, competition, and copyright law. It also appears that the discussions around the AI Act may have deprioritised or distracted from the requirements of existing regulation.

Ethics: Insufficient information is shared around bias and data provenance. More critically, almost no information is shared about the ecological costs or the impacts on persons and societal well-being. The respondents had almost no awareness of these issues because they view AI systems as products rather than socio-technical tools.

Methods: Explainable AI tools are immature, difficult to use in practice, and the explanations they produce are not trusted. Therefore, we were unable to recommend specific explainable AI tools for builders to use. Instead, we recommend that interpretable models are deployed in any high-stakes context. Not every task requires the use of a black-box model, and interpretable models can often do the job. Interpretable models are models used for prediction that can be directly inspected and interpreted by human experts. Whilst they are not understandable for laypersons, they are easier to reverse engineer.

Transparency delivery: Developers and data scientists found it challenging to share information in the right form for different stakeholders. Many respondents were challenged to think about this for the first time during the research process. There was a very large gap between how respondents currently share information and how they thought it would be useful to provide information. We are interested in this gap and are planning to do further research on the operationalisation of transparency.

Q: What are your next steps for the project?
Ramak: The next phase of our research is investigating how to operationalise meaningful AI transparency. We will do this by looking at the transparency requirements of the EU’s Artificial Intelligence Act. We are beginning this research early to support compliance for all actors, including SMEs who are less able to access advice, and inform the approach of enforcement and auditing bodies.

Firstly, we want to create guidance on the transparency requirements in Article 13. This Article requires high-risk systems to be “sufficiently transparent to enable users to interpret the system’s output and use it appropriately”, which is quite vague.

Secondly, we are interested in prototyping the labelling and disclosure requirements for synthetic content in Article 52. This Article requires developers to ensure that end-users understand they are interacting with synthetic content, such as a chatbot or a deep-fake. A lot of focus has previously been on empowering end-users through media literacy, but we are concerned about how much we can demand.

As technology improves, detecting synthetic content becomes a moving target. For example, Midjourney has made a huge leap in the generation of AI images in the past few months alone. Whilst it has been possible to detect synthetic images easily (e.g. by an incorrect number of fingers or an excess of blur), this will become increasingly difficult. Human senses will no longer be able to distinguish AI products from real content. It will become increasingly difficult to rely on media literacy; instead, we need to develop labelling tools and meta solutions, such as browser recognition or machine-readable disclosures.

We will map existing projects and research, develop ideas and engage stakeholders, including companies and impacted people, to identify what information and tools work. We are keen to work with industry partners to prototype ideas. If anyone is working on existing projects or has thoughts and approaches on any of this work, we would love to collaborate. Please get in touch!

Q: What recommendations do you have for regulators working on AI?
Ramak: We have an uneven playing field when it comes to people's interests being heard and taken into account in the policy-making process. Big Tech organisations are over-represented which makes it difficult for SMEs and civil society to be heard. I would encourage regulators to emphasise a better balance of interests that reflects society and to further develop their own technical knowledge to become more independent of external expertise. This would enable them to demand more information to penetrate the often-opaque wall of marketing speech. We also need to strengthen our verification systems, including AI auditing, especially second and third party auditing.

In addition, it is crucial to enforce existing regulation. As our research has shown, management and developers perceive the AI industry to be unregulated despite being affected by a number of legal obligations. New laws should be communicated in a timely and accessible way.

There is also huge potential to use technology to enforce the law against large organisations. Instead, we see a wide use of technology to enforce the law against people, for example, in surveillance, policing or IP-blocking. We are keen to help enforcement bodies bridge this gap.
Thank you for reading. If you found it useful, forward this on to a colleague or friend. If this was forwarded to you, please subscribe!

If you have an event, interesting article, or even a call for collaboration that you want included in next month’s issue, please reply or email us. We would love to hear from you!
You are receiving Algorithm Governance Roundup as you have signed up for AWO’s newsletter mailing list. Your email and personal information are processed based on your consent and in accordance with AWO’s Privacy Policy. You can withdraw your consent at any time by clicking here to unsubscribe. If you wish to unsubscribe from all AWO newsletters, please email us.

AWO
Wessex House
Teign Road
Newton Abbot
TQ12 4AA
United Kingdom
Powered by EmailOctopus