Legal
AI disclosure
Last updated 19 April 2026 — draft pending legal review
TariffFlow is an AI-assisted tool. Every classification you run is produced by a large language model applying the UK commodity schedule plus a library of General Rules of Interpretation (GRI) and HMRC-specific conventions. This page explains exactly how that works, what it's good for, and what it is emphatically not.
What the AI does
A classification runs through a fixed five-stage pipeline on our servers:
- Synthesize. Your product description, any uploaded datasheets (parsed to text), and any image descriptions (from a vision model) are merged into a canonical product summary plus a structured attribute table — material, function, use case, etc.
- Retrieve. That summary is embedded into a numeric vector and matched against ~18,000 declarable UK commodity codes and ~30 interpretive rule cards. The top 15 candidate codes and top 6 rule cards are selected.
- Reason. A single structured-reasoning prompt presents the candidates + rules to the model, which picks the best code and cites the GRI rule used (usually GRI 1; GRI 3(a) for specificity ties; GRI 3(b) for composite goods, etc.).
- Commit-check. A second AI pass critiques the chosen code, surfacing any plausible competing heading the reasoning pass may have missed. The check can upgrade the risk flag but never silently overrides the pick.
- Verify. The final 10-digit code is looked up against the live HMRC Trade Tariff API for duty, VAT, and measures. The result is cached for 24 hours per case.
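To make the Retrieve stage concrete, here is a minimal sketch of embedding-based candidate selection. Everything in it is illustrative: the 3-dimensional vectors stand in for the real 1024-dimensional embeddings, the code index is a toy, and the function names are ours, not production identifiers.

```python
import math

TOP_CODES = 15  # candidate commodity codes kept after retrieval (per the stage above)

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(summary_vec, index, k):
    """Rank (id, vector) pairs by similarity to the summary embedding, keep top k."""
    ranked = sorted(index, key=lambda item: cosine(summary_vec, item[1]), reverse=True)
    return [item_id for item_id, _ in ranked[:k]]

# Toy index: three headings with made-up vectors.
code_index = [
    ("8471.30", [0.9, 0.1, 0.0]),
    ("6116.10", [0.1, 0.9, 0.0]),
    ("7318.15", [0.0, 0.1, 0.9]),
]

summary = [0.8, 0.2, 0.1]  # pretend embedding of the canonical product summary
candidates = retrieve(summary, code_index, k=TOP_CODES)
print(candidates[0])  # → 8471.30 (closest vector)
```

The same ranking is applied independently to the rule-card index (top 6 kept) before both lists are handed to the reasoning stage.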
What the AI explicitly is not
- Not a binding tariff ruling. For formal certainty, apply to HMRC for a legally binding decision (an Advance Tariff Ruling in Great Britain; Binding Tariff Information for goods moving into Northern Ireland). TariffFlow is not a substitute.
- Not legal or compliance advice. Customs mis-classification carries real duty, VAT, and penalty exposure. You remain responsible for the code you declare.
- Not a replacement for a customs broker. Complex consignments (anti-dumping duties, preference rules, controlled goods) benefit from professional review.
- Not infallible. The model occasionally surfaces the wrong leaf within the correct heading, or misreads ambiguous product text. Confidence scores and risk flags signal when this is likely.
How to read the output
- Confidence — the model's self-assessment. 85%+ is strong; 60–85% is worth a second look; below 60% is surfaced prominently, and we recommend human review before you declare.
- Key reasons — the tests the model applied (principal function, material, GRI rule). Read them as a sanity check: if a reason looks wrong, the pick probably is.
- Alternatives considered — runners-up with explicit rejection rationale. Useful when your product sits at a boundary between chapters or headings.
- Risk flags — specific issues flagged by the reasoning. High-severity flags mean the commit-check stage found a genuine competing interpretation.
- Classification walk — the section → chapter → heading → subheading → leaf path with per-level rationale and GRI citation. Part of the Case Pack audit trail.
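The reading order above amounts to a simple triage rule. A sketch of that rule follows; the thresholds come from this page, but the function and its messages are illustrative, not the product's actual logic:

```python
def triage(confidence: float, high_risk_flags: int) -> str:
    """Map a recommendation's confidence (0-100) and count of high-severity
    risk flags to a suggested review action. Illustrative only."""
    if high_risk_flags > 0:
        # Commit-check found a genuine competing interpretation.
        return "human review: competing interpretation flagged"
    if confidence >= 85:
        return "strong: sanity-check the key reasons"
    if confidence >= 60:
        return "worth a second look: read alternatives considered"
    return "human review recommended before declaring"

print(triage(92, 0))  # → strong: sanity-check the key reasons
```

High-severity risk flags dominate the confidence score on purpose: a confident pick with a live competing heading still warrants a human look.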
Which AI models we use
The AI is not a single model. Different stages use different providers so we can keep costs low on high-volume stages and quality high on reasoning-heavy ones.
- Synthesis, reasoning, commit-check: Ollama Cloud (currently gpt-oss:120b). OpenAI-compatible API.
- Text embeddings for retrieval: OpenAI text-embedding-3-small at 1024 dimensions.
- Image vision (when you attach an image): Ollama Cloud vision model (qwen3-vl:8b at time of writing).
Model choice is reviewed periodically against a 12-case golden test set covering known-difficult UK classifications (barcode scanners, knitted gloves, hex bolts, essential-character composites, etc.). If we swap a model, we disclose it here and stamp the Case record with the model version used so regressions are traceable.
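A golden test set like the one described can be run as a tiny harness. The sketch below shows the shape of such a check; the product descriptions, the commodity codes, and the stand-in `classify()` are all hypothetical, not our real cases or pipeline:

```python
# Hypothetical golden cases: (product description, expected 10-digit code).
# Codes here are placeholders, not verified UK classifications.
GOLDEN = [
    ("handheld laser barcode scanner", "8471609000"),
    ("knitted acrylic gloves", "6116930000"),
    ("steel hex bolt M8", "7318158599"),
]

def classify(description: str) -> str:
    """Stand-in for the full pipeline; answers from a lookup table here."""
    lookup = {desc: code for desc, code in GOLDEN}
    return lookup.get(description, "0000000000")

def run_golden() -> float:
    """Accuracy over the golden set; run before every prompt or model change."""
    hits = sum(1 for desc, expected in GOLDEN if classify(desc) == expected)
    return hits / len(GOLDEN)

print(f"golden accuracy: {run_golden():.0%}")  # → golden accuracy: 100%
```

The point of the harness is regression detection: any model or prompt swap that drops a previously passing case is caught before it reaches users, and the Case record's stamped model version makes the regression traceable.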
Your data
When you run a classification, the following leaves our servers:
- To Ollama Cloud: your canonical product summary, extracted attributes, candidate codes retrieved from our DB, and rule cards — assembled into the reasoning prompt.
- To OpenAI: the short text used to compute the retrieval embedding. No code lists, no rule cards.
- To HMRC: the final 10-digit code, to retrieve live duty and VAT. HMRC does not see your product description.
Neither Ollama Cloud nor OpenAI trains production models on API traffic under the enterprise terms we operate under. All traffic is encrypted in transit with TLS 1.3. The full subprocessor list and data regions live on the privacy page.
Uncertainty and corrections
The model is calibrated to show uncertainty honestly:
- Where the GRI rules don't decide cleanly between two headings, you'll see a follow-up question rather than a silent pick.
- Where the tariff schedule itself is ambiguous (residual "Other" subheadings, cross-chapter exclusions), the reasoning names the rule applied and the specific note consulted.
- Where retrieval surfaced a weak match, confidence drops and we'll recommend providing more detail — material, finishing, intended use.
If a classification is wrong, use the feedback button on every recommendation; corrections flow into our prompt and canonical-exemplar library. Specific product descriptions improve accuracy for that product class for future users.
Accuracy commitments
We publish honest accuracy numbers rather than marketing claims. The current internal golden-test accuracy (12 canonical UK cases, run before every prompt change) is available on request. We don't claim 99% accuracy — anyone who does is selling, not measuring.
Accuracy is an engineering target, not a warranty. See our Terms of Service for the formal allocation of liability.
Questions
If you're evaluating TariffFlow for regulated use and need more detail on the AI pipeline — model names, prompt versions, failure-mode tests — we're happy to walk you through it. Email support@tariffflow.app.
This disclosure is updated whenever we change the AI pipeline materially (swap a model, change a stage, add a new data source). The "Last updated" date at the top reflects the current version.