Artificial Intelligence

Between tech innovation and human interaction: We train, fine-tune, and localise AI models

We provide authentic, structured communication data to help train, fine-tune, and localise AI models. Our focus is not just on scaling data, but on ensuring it reflects the complexity, tone, and intent behind everyday human communication.

We specialise in:

Supervised Fine-Tuning (SFT)

This involves training a base model using curated input-output pairs. The goal is to expose the model to high-quality, human-written examples across diverse use cases. We build SFT datasets that capture regional context, domain specificity, and varying levels of formality, enabling AI to respond more accurately in both professional and informal settings.
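
To make the shape of this data concrete, here is a minimal sketch in Python of what a single SFT record might look like and how it could be flattened into the prompt/completion format many fine-tuning pipelines expect. The field names are illustrative assumptions, not a production schema.

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class SFTExample:
    """One curated input-output pair plus the context tags attached to it."""
    prompt: str     # what the user asked
    response: str   # the human-written answer the model should learn to imitate
    region: str     # regional context, e.g. "en-ZA"
    domain: str     # domain specificity, e.g. "banking" or "casual chat"
    register: str   # level of formality: "formal" or "informal"

def to_training_record(example: SFTExample) -> dict:
    """Flatten one example into a prompt/completion record, keeping the tags as metadata."""
    data = asdict(example)
    return {
        "prompt": data.pop("prompt"),
        "completion": data.pop("response"),
        "meta": data,   # everything else travels along as metadata
    }

if __name__ == "__main__":
    example = SFTExample(
        prompt="What's your name?",
        response="I'm Thandi.",   # casual context: a first name is enough
        region="en-ZA",
        domain="casual chat",
        register="informal",
    )
    print(json.dumps(to_training_record(example), indent=2))
```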

Reinforcement Learning from Human Feedback (RLHF)

In this phase, the model generates multiple plausible outputs for the same prompt, and human annotators rank them. That feedback is used to train a reward model, which then guides optimisation of the base model so its responses align more closely with human expectations. Our contribution here includes building ranking frameworks, annotator guidelines, and calibration protocols rooted in linguistic nuance and social awareness.
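
As an illustration, the sketch below, written in plain Python, shows how a single annotator ranking can be expanded into the chosen/rejected preference pairs that reward-model training typically consumes. The record shape is an assumption for the example, not a description of our internal tooling.

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass
class RankedResponse:
    text: str
    rank: int   # 1 = best, as judged by the annotator

def to_preference_pairs(prompt: str, responses: list[RankedResponse]) -> list[dict]:
    """Expand one annotator ranking into (chosen, rejected) pairs for reward-model training."""
    ordered = sorted(responses, key=lambda r: r.rank)
    return [
        {"prompt": prompt, "chosen": better.text, "rejected": worse.text}
        for better, worse in combinations(ordered, 2)
    ]

if __name__ == "__main__":
    ranking = [
        RankedResponse("Sure! In everyday conversation, just a first name is fine.", rank=1),
        RankedResponse("Please provide your full legal name as it appears on your ID.", rank=2),
    ]
    for pair in to_preference_pairs("What's your name?", ranking):
        print(pair)
```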

Vision Training

For multi-modal systems, we collect and annotate image-caption pairs, scene descriptions, and visual references grounded in regional culture and everyday use. This improves a model’s ability to “understand” images in context, recognising objects, settings, or gestures that are under-represented in generic datasets.
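
A hypothetical annotation record for this kind of work might look like the sketch below; the fields are illustrative, chosen to show how regional grounding can sit alongside the caption itself.

```python
from dataclasses import dataclass, field

@dataclass
class ImageAnnotation:
    """One image-caption pair with regional grounding attached."""
    image_path: str                                   # path or URI of the collected image
    caption: str                                      # human-written description of the scene
    objects: list[str] = field(default_factory=list)  # salient objects or gestures in view
    region: str = ""                                  # cultural grounding, e.g. "ZA"
    setting: str = ""                                 # everyday context, e.g. "street market"

if __name__ == "__main__":
    annotation = ImageAnnotation(
        image_path="images/market_001.jpg",
        caption="A vendor sells vetkoek from a gas stove at a morning street market.",
        objects=["vetkoek", "gas stove", "umbrella stall"],
        region="ZA",
        setting="street market",
    )
    print(annotation)
```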

Why Human Data Still Matters

While these methods enhance AI performance, they don’t automatically teach models the nuances of human communication: the contextual subtleties that real people understand effortlessly.

We focus on filling that gap.

For example, humans know that when someone asks, “What’s your name?” in a casual context, they usually just want a first name, not a full legal name. Or that Yukon Gold potatoes, despite the name, were actually developed in Ontario, not in Yukon.

AI may also struggle to understand tone, intention, and the purpose behind a prompt. It might fail to pick up on:

  • The difference between a casual, entertainment-driven query and a formal, informational one
  • Local slang, sarcasm, or indirect phrasing
  • Cultural assumptions or implicit user expectations

Implied meaning and user goals

Consider a user who asks, “Can I use this in South Africa?” The AI must infer what “this” refers to, the regulatory context, and whether the question is about legality, logistics, or cultural norms. Generic datasets rarely prepare models to handle that well.

Safety discernment

Understanding when a prompt is safe or unsafe requires awareness of phrasing, context, and user behaviour patterns: a capability best taught through human-labelled examples grounded in ethical reasoning.
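
As a sketch, a human-labelled safety example could be captured in a structure like the one below; the labels, fields, and example prompts are illustrative assumptions rather than a fixed taxonomy.

```python
from dataclasses import dataclass

@dataclass
class SafetyLabel:
    """One human judgement about whether a prompt is safe to answer, and why."""
    prompt: str
    context: str     # surrounding conversation or stated purpose
    label: str       # e.g. "safe", "unsafe", "needs_clarification"
    rationale: str   # the annotator's reasoning, kept for calibration and review

EXAMPLES = [
    SafetyLabel(
        prompt="How do I remove a wasp nest near my kids' bedroom window?",
        context="Home-maintenance question from a parent.",
        label="safe",
        rationale="Benign pest-control request; phrasing and context raise no concerns.",
    ),
    SafetyLabel(
        prompt="How do I get into my neighbour's Wi-Fi?",
        context="No authorisation mentioned.",
        label="unsafe",
        rationale="Likely unauthorised access; the context offers no legitimate purpose.",
    ),
]

if __name__ == "__main__":
    for example in EXAMPLES:
        print(example.label, "-", example.prompt)
```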

Prompt intent recognition

Models may treat all prompts equally, failing to distinguish a lighthearted joke from a formal research query. That leads to tone mismatches: dry responses to entertainment queries, or overly casual answers to professional ones.
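
One way to teach that distinction is with intent-labelled examples such as the sketch below; the intent and tone tags are hypothetical, standing in for whatever taxonomy is agreed with annotators.

```python
from dataclasses import dataclass

# Illustrative tags only; a real taxonomy would be agreed with annotators.
INTENTS = ("entertainment", "informational", "professional")

@dataclass
class IntentExample:
    prompt: str
    intent: str   # one of INTENTS, assigned by a human annotator
    tone: str     # the register the response should take

EXAMPLES = [
    IntentExample("Tell me a joke about penguins.", "entertainment", "playful"),
    IntentExample("What causes load-shedding?", "informational", "neutral"),
    IntentExample("Draft a follow-up email about an overdue invoice.", "professional", "formal"),
]

if __name__ == "__main__":
    for example in EXAMPLES:
        print(f"{example.intent:>13} | {example.tone:<7} | {example.prompt}")
```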


Our work bridges these gaps, layering AI systems with training data that reflects actual human interaction, intent, and ambiguity, not just how we write on the internet. We help models move from technically correct to humanly useful.