Three New Roles in the World of AI

Most organisations building with AI have a quality function for code and nothing equivalent for the data their systems learn from, the values those systems encode, or the rate at which they are deployed. That work falls between existing teams, which means it falls to no one.

The three roles in this document give that work an owner. More roles will emerge as AI matures; these three are a start at giving both new and existing skills a clear home in it.

  1. The Ethical QA Engineer tests what a system has encoded before it ships, the way a tester finds defects before release.

  2. The Data Provenance Lead works further upstream, governing what is allowed to enter the system in the first place: where the data came from, what consent covers it and whom it represents.

  3. The Verification Lead sits at the far end, owning whether the organisation can evaluate what it deploys as fast as it deploys it.

Together the three roles cover intake, build and deployment, each with a person accountable for it.

None of these is a compliance role. Each sits inside the technical work, close to the decisions, and each is built for the interdisciplinary person who currently finds no clear home in an organisation chart.

The specifications that follow set out what each role is for, what it owns and where its edges are.


 

Ethical QA Engineer

Embedded ethical quality assurance for AI and algorithmic systems

Purpose

Technical teams ship code through quality assurance, yet nothing equivalent tests the values the resulting systems carry. This role exists to close that gap. The Ethical QA Engineer sits inside the engineering team and treats embedded values the way a tester treats defects: finding them, logging them and making them open to challenge while the system is still in development, before any harm reaches production. The work is to surface and pay down bias debt: the biases a system inherits from the data it learns from, from the categories that shaped how that data was collected, and from assumptions so familiar that the team no longer notices them.

Where the role sits

The role sits inside the product or engineering team, alongside QA, design and architecture. The placement is deliberate: policy, compliance and a central ethics board all sit too far from the decisions that actually encode values into a system. It reports to engineering or product leadership, though placing it there may mean those leadership profiles have to change over time, or that the role runs on a dotted line into more than one function. It works closely with QA leads, data teams and whichever domain specialists are on the build.

What you will own

  • A bias risk register. A live log of the biases identified so far and the mitigations against each, held and visible at team level so that the risk no longer lives inside one person's judgement.

  • Trade-off documentation. A record of the choices made during development, the variables that were excluded, the data that was prioritised and the edge cases that were set aside, all kept visible and open to challenge.

  • Dataset auditing. A review of training data for provenance, consent basis and representation, carried out after the early experimentation and before deployment, while bias is still cheap to fix.

  • Interrogation of outputs. Sustained questioning of what a system produces and why it produces it, on the principle that trust in an algorithm is earned through that questioning over time and never simply assumed at the outset.

  • Shared vocabulary. Working glossaries that let engineers, designers and domain experts mean the same thing by the same terms, developed alongside the team so that the definitions belong to the people using them.
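As an illustration only, the bias risk register described above could be held as a small structured log. The sketch below is one possible shape, written in Python; the names `BiasEntry` and `BiasRiskRegister`, the field names and the two sample entries are all hypothetical and are not prescribed by this specification.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class BiasEntry:
    """One identified bias and the mitigations recorded against it (hypothetical shape)."""
    identifier: str
    description: str
    source: str  # where the bias was inherited from: "data", "categories" or "assumptions"
    raised_on: date
    mitigations: list[str] = field(default_factory=list)
    resolved: bool = False


@dataclass
class BiasRiskRegister:
    """A team-level log, so the risk does not live inside one person's judgement."""
    entries: list[BiasEntry] = field(default_factory=list)

    def log(self, entry: BiasEntry) -> None:
        self.entries.append(entry)

    def open_entries(self) -> list[BiasEntry]:
        """Entries that still count as bias debt."""
        return [e for e in self.entries if not e.resolved]

    def unmitigated(self) -> list[BiasEntry]:
        """Open entries with no mitigation recorded yet."""
        return [e for e in self.open_entries() if not e.mitigations]


# Illustrative entries only; not drawn from any real system.
register = BiasRiskRegister()
register.log(BiasEntry(
    "B-001",
    "Older applicants under-represented in training data",
    "data",
    date(2024, 1, 15),
    mitigations=["Reweight samples by age band"],
))
register.log(BiasEntry(
    "B-002",
    "Postcode used as a proxy for income",
    "assumptions",
    date(2024, 2, 3),
))

print(len(register.open_entries()))                    # 2
print([e.identifier for e in register.unmitigated()])  # ['B-002']
```

Counting open entries over time gives one simple way to show whether the bias debt on the register is falling; a team might reasonably prefer a shared spreadsheet or issue tracker to code.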

What good looks like

Embedded values are surfaced and contested well before a system ships, and the trade-offs behind them are documented and can be defended to the people the system affects. The bias debt on the register is visible and falling, and ethical reasoning has become part of how the team builds, present in the earliest decisions long before anything reaches a final checkpoint.

Who you are

You combine technical fluency with ethical reasoning, and you are comfortable reading a model card, querying a dataset and sitting in a code review without needing anyone to translate for you. You can hold a position under pressure to ship while keeping your reasoning legible to the engineers around you, and you are at ease being the person who reopens what everyone else assumed was settled.

Likely backgrounds include responsible AI, applied ethics with technical training, data science with a fairness specialism, or QA and engineering alongside formal study in ethics. The interdisciplinary graduate who currently finds no clear home in an organisation belongs here.

What this role is not

This is not a compliance sign-off at the end of the pipeline, and it is not an ethics committee working at one remove from the build. Where compliance concerns itself with whether the agreed process was followed, this role stays with the harder questions of what is being built, why, whom it serves and how it ought to work.


 

Data Provenance Lead

Ownership of data entering AI and algorithmic systems

Purpose

A model inherits whatever its training data carries, and most data arrives with its composition undocumented, its consent basis uncertain and its quality unrecorded. This role owns the provenance of that data before it ever reaches a model. Where the Ethical QA Engineer inspects the values a system has already encoded, the Data Provenance Lead governs what is allowed to enter it in the first place, so that the limits of anything built on that data are known and stated in advance, well before they would otherwise surface in production.

Where the role sits

The role sits upstream of the build, with data engineering and the teams that source and acquire data. It stays close to the acquisition decision, where provenance is actually set, since by the point of use it is already too late to shape it. It reports to data or engineering leadership, and works with legal on the scope of consent and with domain specialists on questions of representation.

What you will own

  • Data lineage. A documented origin for every dataset, traceable back to its source, with its quality established and on record before anyone builds on it.

  • Consent basis. The scope of permitted use, defined and held against the expectations of the people the data came from, so that any dataset whose consent does not cover the intended use is flagged while there is still time to act on it.

  • Representation. Datasets characterised for representation along whatever dimensions matter to the use, so that a model trained on them ships with its limits already stated and any unrepresentative data is carried forward as a known constraint on the build.

  • Datasheets. Documentation that travels with each dataset, so the teams downstream can see what they are building on without having to reconstruct it for themselves.

  • The provenance standard. A deliberate decision on the level of assurance a given task requires. Cheap undocumented data and fully assured data each have their place, and the choice between them is matched to the stakes of the work.
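The lineage, consent and representation checks above can be pictured as an intake check run against a datasheet. The following Python sketch is illustrative only: `Datasheet`, `intake_flags`, the field names and the sample dataset are hypothetical, and a real consent basis is rarely as simple as a set of named uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datasheet:
    """Documentation that travels with a dataset (hypothetical minimal shape)."""
    name: str
    source: str                                 # documented origin; empty if unknown
    permitted_uses: frozenset[str]              # scope of consent, simplified to named uses
    representation_notes: tuple[str, ...] = ()  # known limits, stated up front


def intake_flags(sheet: Datasheet, intended_use: str) -> list[str]:
    """Return the provenance problems to raise before anyone builds on the data."""
    flags = []
    if not sheet.source:
        flags.append("no documented origin")
    if intended_use not in sheet.permitted_uses:
        flags.append(f"consent does not cover '{intended_use}'")
    if not sheet.representation_notes:
        flags.append("representation not characterised")
    return flags


# Illustrative dataset only.
sheet = Datasheet(
    name="support_tickets_2023",
    source="internal helpdesk export",
    permitted_uses=frozenset({"service improvement"}),
    representation_notes=("English-language tickets only",),
)

print(intake_flags(sheet, "model training"))
# ["consent does not cover 'model training'"]
print(intake_flags(sheet, "service improvement"))
# []
```

The check returns flags and leaves the decision to a person, which matches the role's remit: a flagged dataset may still be used for low-stakes work if the provenance standard for that task allows it.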

What good looks like

Every dataset entering a model carries a documented origin, a defined consent scope and a representation profile, and its limitations are stated up front and carried forward so the teams downstream can see exactly what they inherit. No model is trained on data whose provenance would fail to withstand scrutiny from the people it affects.

Who you are

You come from data governance, research data management, or work with sensitive data, and you bring the standards of an accredited environment to whatever context you work in. You read a consent framework and a dataset with equal fluency, and you can turn down convenient data when its origin will not hold, explaining the refusal in terms the team will accept.

What this role is not

This is not a privacy sign-off, and it is not a one-off data-cleaning pass carried out before launch. It governs intake as a continuous function and owns the standard that the organisation holds all of its data to.


 

Verification Lead

Ownership of evaluation capacity across the deployment boundary

Purpose

AI capability grows faster than an organisation can build the capacity to evaluate it, and the distance between the two is where unexamined machine judgement slips into real decisions. This role owns that evaluation capacity and keeps it in step with capability, so that systems are assessed at the rate they are deployed and the organisation never comes to rely on judgement, or logic, it has no way to check.

Where the role sits

The role sits across the deployment boundary, between the teams building or adopting AI and the teams accountable for the decisions it feeds. It reports to whoever owns deployment risk and holds enough independence to gate a release that the building team is keen to ship.

What you will own

  • Evaluation capacity. The organisation's ability to assess AI outputs at the rate it produces them. Capability and evaluation are tracked against one another, and the gap between them is reported as a measured figure the organisation can see and act on.

  • An evaluation risk register. A record of where AI judgement and logic have been deployed ahead of the capacity to evaluate them, the exposure that this creates, and the plan in place to close it.

  • Keeping humans in the generative role. Defining the points at which a person must still produce and contest the logic and conclusions themselves, and marking the decisions that must not be allowed to slide quietly from human judgement into human sign-off.

  • Deployment gates tied to evaluability. A system clears the gate at the point the organisation can evaluate it, which is a later and harder threshold than the point at which it simply starts working, and the gate is held to that standard.

  • Independent evidence. Evaluation that does not depend on the system under evaluation, so that confidence in an output comes from assessing it from the outside and never rests on the system's own account of itself.
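One way to make the gap between capability and evaluation a measured figure, and to tie a deployment gate to it, is sketched below in Python. This is an assumption about how the measure might be built, not a definition from this specification: `DeployedSystem`, `evaluation_gap`, `clears_gate`, the weekly counts and the `max_gap` threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DeployedSystem:
    """A deployed or candidate system, described by how much of it gets assessed."""
    name: str
    outputs_per_week: int          # outputs or decisions the system produces
    evaluated_per_week: int        # outputs assessed from outside the system
    independently_evaluable: bool  # can it be assessed without relying on its own account?


def evaluation_gap(system: DeployedSystem) -> float:
    """Share of outputs going unassessed, from 0.0 (none) to 1.0 (all)."""
    if system.outputs_per_week == 0:
        return 0.0
    unassessed = max(system.outputs_per_week - system.evaluated_per_week, 0)
    return unassessed / system.outputs_per_week


def clears_gate(system: DeployedSystem, max_gap: float) -> bool:
    """The gate opens on evaluability, not on whether the system merely works."""
    return system.independently_evaluable and evaluation_gap(system) <= max_gap


# Illustrative figures only.
triage = DeployedSystem("triage-model", outputs_per_week=1000,
                        evaluated_per_week=250, independently_evaluable=True)

print(evaluation_gap(triage))             # 0.75
print(clears_gate(triage, max_gap=0.2))   # False
```

A raw count of assessed outputs is a crude proxy. An organisation that evaluates by sampling would want a different measure, and the acceptable gap would be set per decision according to its stakes.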

What good looks like

Evaluation capacity keeps pace with deployed capability, and any unassessed machine judgement is visible and bounded where it would otherwise stay silent and grow. Humans keep the generative and contesting role on the decisions that matter most, and no machine judgement enters a decision the organisation cannot independently assess or trace.

Who you are

You come from model evaluation, assurance, test and evaluation, or research methodology, and you are willing to press a working system on the question of whether it can be evaluated at all, which is a different and harder question than whether it performs. You can hold a deployment gate under pressure to ship and show your reasoning clearly to the people pushing against it.

What this role is not

This is not QA of model performance, and it is not a benchmarking function. It owns the harder question of whether the organisation can keep pace with what it chooses to deploy.