AI Data Services

Managed AI data services from Lagos — multilingual, technical, domain-grounded.

We supply frontier-model labs and AI platforms with production-grade training data: native-fluency annotation across five African languages, code and technical annotation by working software engineers, and domain-expert labeling sourced from our own fintech, government and healthcare engagements.

What we deliver

Multilingual annotation across African languages

Native-fluency annotators in Hausa, Yoruba, Igbo, Nigerian Pidgin and Nigerian English. We handle text classification, instruction-tuning data, preference ranking, dialogue evaluation and transcription. This coverage is scarce in the global supplier base, and these are the languages where frontier models are weakest.

Technical and code annotation

Code review, RLHF for coding models, dev-tool evaluation, bug-vs-feature classification and refactor preference rating, delivered by our in-house software engineering team rather than generalist labelers. Languages and frameworks we cover: PHP/Laravel, Node.js, Python, TypeScript, Flutter/Dart, Go.

Domain-expert annotation

Our annotators actually work in fintech (banking platforms, wallets, payments), government and public sector, healthcare delivery, and e-learning. They are drawn from our own client portfolio and partner network, and suit evaluations where crowd-sourced opinion is not reliable enough.

RLHF and preference data

Pairwise comparison, multi-turn dialogue rating, harmlessness evaluation, helpfulness scoring — at the quality bar frontier labs require. Inter-annotator agreement tracked on every batch.
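
One common way to track inter-annotator agreement on a pairwise-comparison batch is Cohen's kappa, which corrects raw agreement for the agreement two annotators would reach by chance. The sketch below is illustrative only: the function name, the sample labels and the choice of kappa itself are assumptions, not a description of Gsoft's internal tooling.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: share of items where both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label frequencies, summed over labels.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical batch: which response ("A" or "B") each annotator preferred.
annotator_1 = ["A", "A", "B", "B", "A", "B", "A", "A"]
annotator_2 = ["A", "B", "B", "B", "A", "B", "A", "A"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.2f}")
```

A batch whose kappa falls below an agreed threshold would typically be sent back for guideline review before delivery. The threshold is a project decision, not something this sketch fixes.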

Red-teaming and safety evaluation

Adversarial prompt design, jailbreak attempts, bias evaluation across African demographic contexts. Useful for safety evaluations that under-represent sub-Saharan use cases.

Custom dataset creation

Bespoke datasets — collected, written, annotated and QA'd to a delivery spec. From hundreds of examples to tens of thousands. Documented schema, sample audits, sign-off-driven delivery.

Why Gsoft

A software company, not a labour collective

Registered Nigerian Ltd since 2017. MSA-ready, NDA-ready, USD invoicing, GMT+1 (overlaps both EU and US East Coast working hours). See more about us.

Native African-language coverage

Five Nigerian languages with native fluency — Hausa, Yoruba, Igbo, Nigerian Pidgin, Nigerian English. Genuinely scarce on the global supplier side, and a competitive moat for any frontier model that wants to perform well in Africa's most populous market.

Engineering-grade quality control

Every delivery passes through a QA layer staffed by the same engineers who ship our production software. Inter-annotator agreement, gold-standard tasks, sampling audits — built in.
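
As a rough illustration of two of the checks named above, the sketch below scores an annotator against gold-standard tasks and draws a reproducible random sample for manual audit. Everything in it is hypothetical: the function names, the task IDs, the labels and the 40% audit rate are assumptions made for the example.

```python
import random


def gold_accuracy(submitted, gold):
    """Share of gold-standard items that were labelled correctly."""
    scored = [item_id for item_id in gold if item_id in submitted]
    if not scored:
        raise ValueError("no gold items found in the submission")
    correct = sum(submitted[item_id] == gold[item_id] for item_id in scored)
    return correct / len(scored)


def audit_sample(item_ids, rate, seed):
    """Pick a reproducible random subset of items for manual review."""
    ordered = sorted(item_ids)  # stable order so the seed fully determines the sample
    k = max(1, round(len(ordered) * rate))
    return sorted(random.Random(seed).sample(ordered, k))


# Hypothetical bug-vs-feature batch with two hidden gold items (t1 and t4).
gold = {"t1": "bug", "t4": "feature"}
submitted = {"t1": "bug", "t2": "feature", "t3": "bug", "t4": "bug", "t5": "feature"}

accuracy = gold_accuracy(submitted, gold)
to_review = audit_sample(submitted.keys(), rate=0.4, seed=7)
print(f"gold accuracy = {accuracy:.2f}, items to audit = {to_review}")
```

Fixing the seed makes the audit sample repeatable, so a reviewer and a client can both regenerate the same list of items.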

Domain-grounded expertise

We've shipped production systems for the Rivers State House of Assembly, fintech wallets, healthcare platforms and more. When a task needs an annotator who's actually used the workflow, we have one. Read the portfolio.

How we engage

Pilot (2–3 weeks)

Small bounded scope, 1–2 annotators, fixed delivery date. Ideal first engagement — proves the QA layer before either side commits to scale.

Dedicated pod (ongoing)

5–15 annotators with a team lead and embedded QA. Monthly billing, MSA-governed. The right shape when a programme needs reliable throughput.

Specialist on-demand

Engineering or domain experts surfaced for short, high-value annotation tasks. Per-task or hourly billing. Useful when only a small number of expert hours will unblock an eval set.

Languages we cover

Native fluency: Hausa · Yoruba · Igbo · Nigerian Pidgin · Nigerian English

On request via vetted partner network: Igala · Ibibio · Tiv · Edo · Fulfulde · Kanuri

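Industries with domain depth

Fintech (banking platforms, wallets, payments) · Government and public sector · Healthcare delivery · E-learning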

How we ship to a frontier-model standard

Our internal AI work, including the production AI features we ship into client products and the AI-powered search on the Rivers State HoA platform, is built on Anthropic Claude and OpenAI models. We know how frontier models behave in production, and that knowledge informs every annotation guideline we write.

Frequently asked questions

What's the smallest engagement you'd take?

A pilot of a few hundred annotations, scoped at a fixed fee. Lets us prove the QA layer before we propose dedicated capacity.

Can you sign our MSA / DPA?

Yes. Gsoft is a registered Nigerian Ltd with the legal footing to enter standard enterprise agreements, including NDAs and data processing addenda.

How do you handle confidential data?

No customer data leaves systems you control. We can work on your platform, or stand up a private secure environment. NDPR-aligned data handling, audit logs on every action.

What's the rate?

Depends on annotation type. Multilingual and technical/expert work commands higher rates than general-purpose annotation. We provide a rate card on the intro call.

Can you scale to hundreds of annotators?

We scale up to mid-size dedicated pods (10–25 annotators) on our own bench, and extend further through our vetted partner network. We're transparent about where in-house ends and partner capacity begins — useful when compliance terms require it.

Sourcing AI training data and want to talk?

Tell us the language, the task type and the volume — we'll come back with a feasibility note and a pilot proposal within 48 hours.
