Ratio

Making Employment Tribunal decisions accessible to SMEs

Every year, thousands of Employment Tribunal decisions are published on GOV.UK. They contain a wealth of practical insight into how judges have ruled on unfair dismissal, discrimination, unpaid wages, and whistleblowing. But they're buried in PDFs, written in dense legal language, and searchable only by basic keyword matching.

If you are a sole trader or run a small business, you almost certainly can’t afford a solicitor to tell you where you stand. Ratio changes that.

What it does

Ratio is an AI-powered research tool that lets you ask plain English questions about Employment Tribunal case law and get clear, sourced answers in seconds.

Ask something like “What happens if I change my store shifts without employee agreement?” or “How can I dismiss my sales assistant during their probation period?” and Ratio searches hundreds of thousands of published tribunal decisions to find relevant cases, then summarises what the tribunals have actually decided, including full citations and links to the original rulings.

It’s not legal advice. It’s the information you’d need to decide whether to seek it.

Who it’s for

Ratio is built for employees and workers at UK SMEs and micro businesses, the people least likely to have access to an HR department or in-house legal team, and most likely to face workplace issues without knowing their rights. Around 5.5 million UK micro businesses employ roughly 10 million people. When something goes wrong at work, most of those people start with a Google search. Ratio gives them something better: answers grounded in what tribunals have actually ruled.

How it works

Ratio runs an automated pipeline that collects every new Employment Tribunal decision as it's published on GOV.UK. Each decision is downloaded, parsed into structured sections (issues, facts, law, conclusions), and split into semantically meaningful chunks that preserve legal context.
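The parse-and-chunk step can be sketched as follows. This is a minimal illustration, not Ratio's actual parser: the heading patterns, function names, and the 200-word chunk budget are all assumptions chosen for the example.

```python
import re

# Hypothetical heading patterns for the four section types; the real
# parser's heading detection is assumed, not documented here.
SECTION_PATTERNS = {
    "issues": r"(?im)^the issues\b",
    "facts": r"(?im)^findings of fact\b",
    "law": r"(?im)^the law\b",
    "conclusions": r"(?im)^conclusions\b",
}

def split_into_sections(text: str) -> dict:
    """Split a decision's text into labelled sections by heading position."""
    # Locate each heading, then slice the text between consecutive headings.
    hits = sorted(
        (m.start(), label)
        for label, pattern in SECTION_PATTERNS.items()
        if (m := re.search(pattern, text))
    )
    sections = {}
    for i, (start, label) in enumerate(hits):
        end = hits[i + 1][0] if i + 1 < len(hits) else len(text)
        sections[label] = text[start:end].strip()
    return sections

def chunk_section(section_text: str, max_words: int = 200) -> list:
    """Group a section's paragraphs into chunks under a word budget,
    so each chunk keeps whole paragraphs of legal context together."""
    chunks, current, count = [], [], 0
    for para in section_text.split("\n\n"):
        words = len(para.split())
        if current and count + words > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Chunking along paragraph boundaries within a labelled section, rather than at arbitrary character offsets, is what keeps a finding of fact from being split away from the reasoning that depends on it.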

These chunks are converted into dense vector embeddings using transformer models and indexed in a vector database. When a user asks a question, a multi-stage retrieval process finds the most relevant passages across the full corpus: first ranking by semantic similarity, then refining the candidates using extracted legal metadata such as the type of claim, the tribunal, and the legal concepts involved. A local large language model synthesises the retrieved evidence into a coherent answer, with source attribution back to specific decisions.
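The two-stage retrieval described above can be sketched like this. The scoring weights, field names (`claim_type`, `concepts`), and cut-offs are illustrative assumptions, and a toy cosine similarity stands in for the vector database's nearest-neighbour search.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec, index, query_meta, k_coarse=50, k_final=5):
    """Two-stage retrieval: rank by embedding similarity, then boost
    candidates whose extracted legal metadata matches the query."""
    # Stage 1: coarse ranking by semantic similarity alone.
    scored = sorted(
        ((cosine(query_vec, doc["vec"]), doc) for doc in index),
        key=lambda t: t[0],
        reverse=True,
    )[:k_coarse]

    # Stage 2: refine using legal metadata (claim type, shared concepts).
    def refined(score, doc):
        meta = doc["meta"]
        bonus = 0.0
        if meta.get("claim_type") == query_meta.get("claim_type"):
            bonus += 0.2  # assumed weight for matching claim type
        overlap = set(meta.get("concepts", [])) & set(query_meta.get("concepts", []))
        bonus += 0.05 * len(overlap)  # assumed weight per shared concept
        return score + bonus

    final = sorted(scored, key=lambda t: refined(*t), reverse=True)[:k_final]
    return [doc for _, doc in final]
```

The design point is that metadata refinement only reorders a semantically plausible shortlist; it never promotes a passage that embedding similarity did not already surface.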

The entire system runs on AWS infrastructure with daily automated updates, meaning the knowledge base stays current as new decisions are published.
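The daily refresh only needs to touch decisions it hasn't seen before. A minimal sketch of that idempotent update loop, with hypothetical function names standing in for the real listing fetch and processing steps:

```python
def daily_update(fetch_listing, processed_ids, process_decision):
    """Idempotent daily refresh: fetch the latest decision listing,
    skip anything already indexed, and process only what's new."""
    new_ids = []
    for decision in fetch_listing():
        if decision["id"] in processed_ids:
            continue  # deduplication: already parsed, chunked and indexed
        process_decision(decision)  # download, parse, chunk, embed, index
        processed_ids.add(decision["id"])
        new_ids.append(decision["id"])
    return new_ids
```

Because already-processed IDs are skipped, the job can be rerun safely after a partial failure without duplicating entries in the index.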

The compute challenge

Processing hundreds of thousands of legal decisions, parsing, chunking, and generating embeddings for each one, is computationally intensive work. Doing it properly and at scale requires serious hardware.

Ratio has been awarded access to the Dawn national AI supercomputer through the UKRI AIRR Gateway route. Dawn, hosted at the University of Cambridge, is a flagship UK research computing facility. This grant provided the GPU compute needed to embed the full corpus of tribunal decisions and to experiment with embedding models and chunking strategies that would have been impractical on commodity hardware. The embedding pipeline runs across multiple GPUs in parallel, processing tens of thousands of document chunks per minute.
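The parallel fan-out can be sketched as below. This is an illustration only: a stub replaces the transformer model, worker count is an assumption, and the real pipeline would typically pin one process to each GPU rather than use threads.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_WORKERS = 4  # assumption: one worker per available GPU

def embed_shard(gpu_id, chunks):
    """Embed one shard of chunks. A real worker would load the embedding
    model onto device `gpu_id` and batch-encode; this stub returns
    placeholder vectors so the sharding logic is runnable."""
    return [(chunk, [float(len(chunk))]) for chunk in chunks]

def embed_corpus(chunks):
    """Shard chunks round-robin across workers, embed the shards in
    parallel, and merge the (chunk, vector) pairs."""
    with ThreadPoolExecutor(max_workers=NUM_WORKERS) as pool:
        futures = [
            pool.submit(embed_shard, g, chunks[g::NUM_WORKERS])
            for g in range(NUM_WORKERS)
        ]
        return [pair for f in futures for pair in f.result()]
```

Round-robin sharding keeps the work roughly balanced across devices without needing a central queue, which is usually enough when chunks are similar in size.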

Without access to this kind of infrastructure, a project like Ratio built by a small team, not a large legal tech company, simply wouldn’t be feasible at this scale.

Quality assurance

Legal information tools carry real responsibility. A wrong or misleading answer could lead someone to make a poor decision about their employment rights. Ratio addresses this through several design choices:

  • Source attribution: Every answer links back to specific tribunal decisions. Users can verify claims against the original published rulings rather than trusting a black box.
  • Confidence scoring: The retrieval pipeline provides relevance scores at each stage, and the system indicates when evidence is thin rather than fabricating certainty.
  • No hallucination by design: The language model is constrained to synthesise only from retrieved passages. It cannot invent case law that doesn’t exist in the corpus.
  • Structured parsing: Rather than treating decisions as flat text, Ratio parses them into their legal structure — separating issues, findings of fact, legal analysis, and conclusions — which improves both retrieval accuracy and answer quality.
  • Continuous validation: The automated pipeline includes deduplication, error handling, and status tracking across every stage, from metadata collection through to vector indexing.
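Two of these safeguards, grounding the model in retrieved passages and flagging thin evidence, can be combined in one step. A minimal sketch, where the threshold value, field names, and prompt wording are all assumptions:

```python
LOW_EVIDENCE_THRESHOLD = 0.4  # assumed relevance cut-off

def build_answer_request(question, retrieved):
    """Assemble a synthesis request grounded only in retrieved passages,
    refusing to answer when the best evidence is too weak."""
    if not retrieved or max(r["score"] for r in retrieved) < LOW_EVIDENCE_THRESHOLD:
        # Indicate thin evidence rather than fabricating certainty.
        return {"status": "low_evidence",
                "message": "No sufficiently relevant decisions were found."}
    sources = [r["citation"] for r in retrieved]
    context = "\n\n".join(f"[{r['citation']}] {r['text']}" for r in retrieved)
    prompt = (
        "Answer the question using ONLY the tribunal passages below. "
        "Cite passages by their bracketed citation. If the passages do "
        "not answer the question, say so.\n\n"
        f"Passages:\n{context}\n\nQuestion: {question}"
    )
    return {"status": "ok", "prompt": prompt, "sources": sources}
```

Constraining the prompt to the retrieved passages, and returning the citations alongside it, is what lets every claim in the final answer be traced back to a published decision.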

Despite these safeguards, Ratio presents information, not advice. Users are clearly informed of this distinction.

GDPR and data protection

Employment Tribunal decisions are published by the UK Government as public records. However, they contain personal data, such as the names of claimants, respondents, and witnesses, which creates obligations under UK GDPR even when the source material is public.

Ratio’s approach:

  • Lawful basis: Processing relies on legitimate interest (Article 6(1)(f) UK GDPR). The decisions are published specifically to serve the public interest in open justice, and Ratio’s use is consistent with that purpose.
  • No additional personal data: Ratio does not collect any personal data from its users beyond what is technically necessary to deliver a response. There are no user accounts, no tracking, and no profiling.
  • Published judgments only: The corpus consists entirely of decisions that the UK Government has chosen to publish. Ratio does not access restricted or redacted materials.
  • Proportionality: Names and personal details appear only in direct citations of published decisions, exactly as they appear in the government’s own publications. Ratio does not aggregate personal data across cases or create profiles of individuals.

The tension between open justice and data protection in tribunal decisions is an active area of legal and policy discussion. Ratio navigates this by staying within the boundaries of what is already public, while being transparent about the data it processes.