Going in-house with AI, without going it alone

If your strategy is to grow AI delivery capability inside your own team, the harness is the part that decides whether it works. Here's the shape we use to set it up, then hand the keys over.


We see two patterns when leadership decides to invest in AI. The first is to outsource the build entirely, ship a product, and treat AI as a vendor relationship. The second, which is becoming far more common, is to build AI capability inside the existing engineering team. Cost discipline, control of the IP, speed of iteration and retention of the people doing the work all favour going in-house.

Going in-house also tends to fail in a way that takes six to nine months to become obvious.

The bit teams underestimate

The model isn't the hard part. There are good models, good SDKs, good worked examples, and any competent engineer can have a working prototype on a laptop by the end of the week. The hard part is the surrounding system that turns a prototype into a product the business can rely on. We call ours the harness, and most of it is invisible from the outside.

Evals. Programmatic checks that say whether a prompt or model change made the product better or worse. No evals, no honest answer to “is this an improvement?”
Observability. Cost, latency and quality on a live dashboard, with every interaction replayable for debugging. Without it, nobody knows what production is actually doing.
Governance and guardrails. Policies enforced, sensitive data handled properly, AI outputs checked before they go anywhere. Audit-ready from day one, not retro-fitted when legal asks the awkward question.
Identity, RBAC and audit. SSO, role-based access, an audit trail. The boring scaffolding that survives a regulatory review.
Code analysis and review. Quality, security and architecture compliance checked automatically on every change, the same bar for human-written and AI-written code.
CI/CD with eval gates. A pipeline that fails a deploy when AI quality drops below the bar the team set. Upgrading a model becomes a measured decision, not a leap of faith.
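
To make the eval gate idea concrete, here is a minimal sketch in Python, not a description of any particular harness. Every name in it (EvalCase, run_evals, gate, fake_model) and the 0.9 threshold are assumptions made up for illustration. The substring check stands in for whatever scoring a real eval suite would use, and fake_model stands in for a real model call so the sketch runs offline.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    expected: str  # hypothetical check: the answer must contain this text


def run_evals(generate: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases whose output contains the expected text."""
    if not cases:
        raise ValueError("eval suite is empty")
    passed = sum(
        1 for case in cases
        if case.expected.lower() in generate(case.prompt).lower()
    )
    return passed / len(cases)


def gate(score: float, threshold: float) -> int:
    """Exit code for a CI step: 0 lets the deploy continue, 1 fails it."""
    return 0 if score >= threshold else 1


def fake_model(prompt: str) -> str:
    # Stand-in for a real model call; one answer is deliberately wrong.
    answers = {
        "Capital of France?": "Paris",
        "2 + 2?": "4",
        "Colour of a clear daytime sky?": "green",
    }
    return answers.get(prompt, "")


CASES = [
    EvalCase("Capital of France?", "paris"),
    EvalCase("2 + 2?", "4"),
    EvalCase("Colour of a clear daytime sky?", "blue"),
]

score = run_evals(fake_model, CASES)
exit_code = gate(score, threshold=0.9)  # assumed bar; the team sets the real one
print(f"pass rate {score:.2f}, exit code {exit_code}")
```

In a pipeline, the step would pass that exit code to sys.exit so a score below the bar fails the deploy. The point of the design is that the threshold lives in version control next to the code, so changing the bar is itself a reviewed change.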

Each of those is a few weeks of work in isolation. The composition is the hard bit, and the team building the product is rarely the team best placed to build the harness around it from scratch.

“The model isn't the hard part. The system around the model is the part that decides whether you have a product or a demo.”

What “in-house, with a partner” actually looks like

The shape we've found that works is two-phase. We stand the harness up with your team, building the product alongside it so nothing is theoretical. Your engineers learn it by working in it, not by reading a deck about it. Then we step back. Your team owns delivery, and we're on call for the harder moments - the patterns you haven't seen before, the regressions that don't have an obvious cause, the model upgrades that need a careful eye.

The split usually looks something like this.

Weeks 1–2, discovery. What you're building, what the data shape is, what the harness needs to look like, who on your team will own which parts. Comes out as a buildable plan, not a slide deck.
Weeks 3–10, build alongside. We build the first product with your team embedded. The harness goes in at the same time, not at the end. Pair programming, joint code review, shared eval ownership. By the end of it your team has shipped a real thing and built the muscle to ship the next one.
Weeks 11+, handover and partner mode. We shift to a lighter cadence - code review, eval debugging, model-upgrade reviews, architecture calls when something new comes up. Your team is doing the delivery; we're the partner you call when it gets hard.

What “on speed-dial” actually means

It's not a retainer with a minimum hours commitment that becomes invoice padding. It's a relationship where your engineers know they can put a question into a shared channel and get a useful answer the same day, where we sit on the harder code reviews, and where we run a quarterly check-in on the harness itself - what's drifted, what new patterns you're hitting, what the wider AI landscape has done that affects you.

Joint code reviews on the changes your team flags as risky.
Eval debugging when a number moves and nobody can explain why.
Model-upgrade reviews before you bet a quarter on a new release.
Architecture calls when a new use case lands that needs a fresh shape.
A quarterly harness review, so quality doesn't drift quietly.
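
One reason an eval number can be hard to explain is that an aggregate score hides which cases changed. The sketch below, in Python, compares two runs case by case. The function name diff_runs, the case ids and the pass/fail dictionaries are all hypothetical; it assumes each run is recorded as a mapping from case id to pass or fail.

```python
def diff_runs(
    baseline: dict[str, bool], candidate: dict[str, bool]
) -> dict[str, list[str]]:
    """Compare two eval runs keyed by case id and list what changed."""
    shared = baseline.keys() & candidate.keys()
    return {
        # passed before, fails now
        "regressed": sorted(k for k in shared if baseline[k] and not candidate[k]),
        # failed before, passes now
        "fixed": sorted(k for k in shared if not baseline[k] and candidate[k]),
        # present in the baseline run but absent from the candidate run
        "missing": sorted(baseline.keys() - candidate.keys()),
    }


# Hypothetical results for the current model and a proposed upgrade.
baseline = {"refund-policy": True, "date-parsing": True, "tone-check": False}
candidate = {"refund-policy": True, "date-parsing": False, "tone-check": True}

changes = diff_runs(baseline, candidate)
print(changes)
```

In this made-up example both runs pass two of three cases, so the headline pass rate is unchanged, yet one case regressed and another was fixed. That is the kind of movement a model-upgrade review would want surfaced before the release goes out.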

We're happy to be the contractor in the room when a regulator, auditor or board reviewer wants someone independent to explain the architecture. That's usually the moment a partner relationship earns its keep.

When this is the right shape

You have a competent engineering team that's new to AI delivery specifically.
AI capability is a strategic asset you want owned internally, not a vendor-managed black box.
You're willing to invest in the harness up-front, because you understand the cost of not having one.
Leadership is on board with a partner relationship that tapers, rather than a hand-off cliff.

When it isn't

You need the product shipped end-to-end and don't have an in-house team to grow. That's a straight build engagement instead.
The strategy is to outsource AI entirely and let the supplier carry the risk. Fine, but pick the supplier on that basis.
The team isn't ready to learn a new operating model alongside their day jobs. The harness only sticks if the team owns it.

The honest verdict

The teams that go in-house and succeed are almost always the ones that built the harness with someone who'd done it before. The teams that struggle are the ones that treated AI delivery as a normal engineering problem and discovered, six months in, that production was a different shape from prototype. The cost of the partner is small compared to the cost of getting that wrong.

If this is the shape you're considering, the conversation we'd want to have first is about your team - what they can already do, what they want to own, what they'd rather lean on a partner for. The harness flows out of that, not the other way round.

