How long does a typical AI pilot project take?

4 to 6 weeks from contract start to pilot in production, plus a 30-day measurement period afterwards. Smaller, well-scoped pilots (one process, one function) can often be landed in 3 weeks. Larger cross-functional pilots (RAG portal, document classification across departments) typically take 6 to 8 weeks. If someone sells you a pilot project of 4 months, it is no longer a pilot. It is a full implementation project and should be priced accordingly.

What does an AI pilot project cost for an SME?

At EnterpriseIQ a pilot costs DKK 50,000 to 150,000 depending on scope. The lower end covers simple n8n flows or a prompt library for one concrete workflow. The upper end covers a custom RAG portal with customer data, or integration with existing systems like CRM or document archive. 20 percent of the price can be tied to documented savings during the 30-day measurement period, so we share the risk.

Which use case should we choose for our first pilot?

The use case that scores highest on three dimensions: high impact (time savings or quality lift you can measure), low to medium effort (can be landed in 4 to 6 weeks without major system integrations), and low risk (no highly sensitive customer data, no automated decisions). For law firms: contract review or document summarisation. For accountants: materiality assessment or draft audit notes. For IT services: ticket prioritisation or an internal knowledge base.

How do we measure ROI on an AI pilot?

Establish a baseline BEFORE the pilot goes live: how many hours are spent today on the workflow the pilot covers, what is the error rate or throughput time. Then measure for 30 days with the pilot active. ROI is the difference in time or quality. For concrete pilots, time savings typically land at 30 to 60 percent on the specific workflow being automated, which nets out to 5 to 15 percent on total work time for the employees using the pilot.
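
The step from workflow-level savings to total work time can be made explicit with a small calculation. The sketch below assumes the piloted workflow takes up a fixed share of the employee's working time. The function name and the 20 percent share are illustrative assumptions, not figures from any pilot.

```python
def total_time_saving(workflow_share: float, workflow_saving: float) -> float:
    """Return the saving on total work time as a fraction.

    workflow_share:  fraction of total work time spent on the piloted workflow
    workflow_saving: fraction of that workflow's time the pilot saves
    """
    if not 0 <= workflow_share <= 1 or not 0 <= workflow_saving <= 1:
        raise ValueError("both inputs must be fractions between 0 and 1")
    return workflow_share * workflow_saving


# Assumed example: the workflow takes 20 percent of the working week
# and the pilot saves 45 percent of the time spent on it.
saving = total_time_saving(workflow_share=0.20, workflow_saving=0.45)
print(f"Saving on total work time: {saving:.0%}")
```

With an assumed 20 percent share, the 30 to 60 percent workflow range translates to 6 to 12 percent of total work time, which sits inside the 5 to 15 percent band.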

What happens if the pilot does not work?

That is the whole point of a pilot. Better to find out after 4 weeks and DKK 80,000 than after 6 months and DKK 800,000. We design the pilot so the three critical assumptions (data quality, user willingness, output quality) are tested as early as possible. If an assumption fails, we document why and which alternative direction would be more promising. That is not failure, it is learning at a fraction of the cost of a full implementation project.

Should we use ChatGPT, Claude, or something self-hosted in the pilot?

It depends on data sensitivity. For internal workflows without client-confidential data, Claude Team or Microsoft Copilot for M365 are typically the right choice. For client-confidential data: Claude Enterprise with EU residency or self-hosted Llama 3.3 70B on Proxmox via n8n. We never recommend consumer tier (Claude Pro, ChatGPT Plus) for pilots that touch customer data, since inputs may be used for training. Details in our AI tools pillar.
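
The rule above can be written down as a small lookup so that it is applied the same way in every pilot canvas. This is a minimal sketch: the tool names are the ones mentioned in this answer, while the function names and the two-way sensitivity split are hypothetical simplifications.

```python
# Options as named in the answer above; the helpers are hypothetical.
INTERNAL_OPTIONS = ["Claude Team", "Microsoft Copilot for M365"]
CONFIDENTIAL_OPTIONS = [
    "Claude Enterprise with EU residency",
    "Self-hosted Llama 3.3 70B on Proxmox via n8n",
]
CONSUMER_TIERS = {"Claude Pro", "ChatGPT Plus"}


def recommended_options(client_confidential: bool) -> list[str]:
    """Return the candidate tools for the given data sensitivity."""
    return CONFIDENTIAL_OPTIONS if client_confidential else INTERNAL_OPTIONS


def is_acceptable(tool: str, touches_customer_data: bool) -> bool:
    """Consumer tiers are ruled out as soon as customer data is involved."""
    return not (touches_customer_data and tool in CONSUMER_TIERS)


print(recommended_options(client_confidential=True))
print(is_acceptable("ChatGPT Plus", touches_customer_data=True))
```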

Which employees should be involved in the pilot?

Three roles. 1) A sponsor at executive or partner level who owns the decision and can clear obstacles. 2) One engaged power user from the team that will use the pilot. Not necessarily the most technical one, but the one who will most actively work with the solution. 3) An IT contact who can ensure access to relevant systems (CRM, document archive) and handle GDPR and access control. Three people are enough. Larger pilot teams slow things down without raising quality.

How do we scale from pilot to full implementation?

Pilot success criteria first (ROI documented, users satisfied, no GDPR flags). Then three paths. 1) Scale the same pilot to the entire team (typically 3 to 6 weeks). 2) Add related workflows on the same stack (so-called fast-follow pilots). 3) Consolidate multiple pilots onto a shared platform with common governance. Most SMEs run 2 to 4 pilots in parallel during the first year before it makes sense to talk about platform consolidation.

How do we get started concretely this week?

Three steps. 1) Take the EnterpriseIQ Score (5 minutes, free) for a baseline on your AI maturity. 2) Hold a 60-minute use case prioritisation session with leadership plus 1 to 2 key employees. Use the pilot canvas from this pillar as the structure. 3) Pick the use case that scores highest on the impact/effort/risk matrix, and order either a Quick Scan (DKK 12,000 to 18,000, 1 day) or directly a pilot project (DKK 50,000 to 150,000, 4 to 6 weeks). Quick Scan is the recommendation if you do not yet have full visibility on your AI maturity.

What should we NOT use an AI pilot for?

Three types of use cases are wrong for a pilot. 1) Automated decisions about individuals (HR, credit scoring, case handling with legal effect). That is Article 22 GDPR and EU AI Act high-risk territory, and it requires full implementation with human oversight from day one. 2) Mission-critical processes where errors cost more than the pilot's budget. Start at a lower risk level first. 3) Use cases where you cannot articulate one concrete success metric. If you cannot measure whether the pilot succeeded, then it has not.

Pillar · Published 2026-05-27 · 14 min read

AI pilot projects: how to start

Should the pilot be documented for the EU AI Act?

Yes, even when the pilot is not a high-risk AI system. Every pilot becomes part of your AI system inventory (regardless of risk level), and every AI system has minimum documentation requirements: which model is used, what data is sent in, what is the business purpose, who owns it, what is the fallback if it fails. It takes 1 to 2 hours per pilot to document properly, and it eases the path later when you face an EU AI Act audit. See our EU AI Act pillar.

Most knowledge-intensive SMEs that want to get going with AI run aground on the same question. Where do we start? The right answer is rarely the first one proposed in the executive meeting. A structured pilot of 4 to 6 weeks with clear success criteria is the path with the highest likelihood of leading to a lasting AI practice. This pillar walks through how to prioritise use cases, how to design a pilot that can actually be measured, and how to avoid the seven typical pitfalls.

Written by Jesper Sachmann, founder of EnterpriseIQ. Pilot project experience from the Archer platform combined with hands-on work on n8n agent flows and custom RAG solutions on Proxmox since 2023.

TL;DR

→ A pilot is not a scaled-down implementation. It is an experiment with three critical assumptions: data quality, user willingness, output quality.
→ Use case prioritisation on three axes: impact, effort, risk. Score 1 to 5 on each and pick the highest combined score (impact minus effort minus risk).
→ Pilot canvas: problem, solution, data, success metrics, risks, team, timeline. One A4 page, not a 30-page document.
→ Measure the baseline BEFORE the pilot starts. ROI is documented over 30 days with the pilot active versus the baseline.
→ Price: DKK 50,000 to 150,000, 4 to 6 weeks, 20 percent risk share on documented savings.

Why pilot rather than big bet

It is tempting to think that if you have decided to invest in AI, you should go straight to a full implementation. That is understandable: the pilot phase looks like a delay to the real project. But data from Danish SMEs' AI projects over the past three years tells a different story. Roughly half of the full-scale AI investments that started without a pilot phase landed below 30 percent of the expected ROI. Roughly three quarters of those that started with a pilot delivered ROI within the expected band.

The difference is not the technology. It is the three assumptions every AI implementation rests on. Are you assuming your data is structured enough for AI to work with? Are you assuming your employees will use the solution in their daily workflows? Are you assuming AI output can reach the quality level your customers expect? Those three assumptions cannot be verified in the executive meeting. They have to be tested against reality.

That is what a pilot does. It costs DKK 50,000 to 150,000 and 4 to 6 weeks to gather concrete data on the three assumptions. If all three hold, the subsequent full implementation is significantly more likely to succeed. If one or more fall, you have saved a major investment and learned where the assumption needs to be reformulated before the next attempt.

Use case prioritisation: impact, effort, risk

The first real decision is which use case you pilot. That is also where most teams stall, because every department has 5 to 10 ideas that could be promising, and the executive team lacks a structure to choose between them.

Three-dimensional scoring works well. For each candidate use case, score on three axes with a 1 to 5 scale:

Impact (1 to 5)

How large is the saving or quality lift if the AI solution works as expected? 1 = marginal improvement on a small workflow. 5 = transforms a core process that consumes 30+ percent of a team's time.

Example: automated classification of incoming contracts (impact 4) versus improved internal search (impact 2 for a small team, impact 5 for a large team).

Effort (1 to 5, where 1 is lowest)

How hard is it to build the pilot? 1 = standard prompt library or simple n8n flow on existing stack. 5 = custom integration with 3+ systems, new data pipelines, or training on your own data.

Example: prompt library for contract review (effort 1) versus custom RAG over 50,000 historical cases (effort 4).

Risk (1 to 5, where 1 is lowest)

How significant is the consequence if the AI solution fails or produces poor output? 1 = internal workflow, errors are embarrassing but reversible. 5 = automated customer-facing decision with legal effect.

Example: draft of an internal email (risk 1) versus automated credit scoring (risk 5, actually EU AI Act high-risk).

Combined score = impact minus effort minus risk (higher is better). With each axis scored 1 to 5, the result ranges from -9 to 3. The use case with the highest score is typically your first pilot. Beware of the trap of picking the most exciting use case (typically high impact but also high effort or risk) as the first pilot. That gives the worst chance of success. Pick the one that scores highest on the matrix rather than the one that sounds most interesting at the executive meeting.
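
The scoring is simple enough to run in a spreadsheet, but a short script makes the ranking explicit. The sketch below uses the impact, effort and risk scores from the four industry examples later in this article. The class and function names are illustrative, and the tie-break on lower risk is an assumption of this sketch, not part of the matrix itself.

```python
from typing import NamedTuple


class UseCase(NamedTuple):
    name: str
    impact: int  # 1 to 5, higher is better
    effort: int  # 1 to 5, lower is better
    risk: int    # 1 to 5, lower is better

    @property
    def combined(self) -> int:
        # Combined score as defined above: impact minus effort minus risk.
        return self.impact - self.effort - self.risk


def rank(candidates: list[UseCase]) -> list[UseCase]:
    """Sort by combined score, highest first.

    Ties are broken by lower risk (an assumption of this sketch).
    """
    for uc in candidates:
        if not all(1 <= s <= 5 for s in (uc.impact, uc.effort, uc.risk)):
            raise ValueError(f"scores for {uc.name!r} must be between 1 and 5")
    return sorted(candidates, key=lambda uc: (-uc.combined, uc.risk))


candidates = [
    UseCase("Contract review assistant", impact=4, effort=2, risk=2),
    UseCase("Materiality assessment", impact=4, effort=3, risk=3),
    UseCase("Client report generation", impact=5, effort=2, risk=3),
    UseCase("Ticket prioritisation", impact=4, effort=2, risk=2),
]

for uc in rank(candidates):
    print(f"{uc.combined:>3}  {uc.name}")
```

Three of the four examples tie on a combined score of 0, which shows that the matrix narrows the field rather than replacing judgement. When candidates tie, the final pick is still a leadership decision.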

Pilot Canvas on one page

Once the use case is selected, the pilot is formulated on a one-page canvas. It is not a 30-page document. It is the clear frame that ensures everyone holds the same expectations.

Use case name: [clear, concrete name]

Problem: What concrete problem is being solved? How many hours or errors does it cost today?

Solution: Which AI approach is applied? Which model, which stack?

Data: What data is sent in? How sensitive? GDPR considerations?

Success metrics: 2 to 3 measurable goals (time saved %, error rate, user satisfaction). Baseline values.

Three critical assumptions: Which three assumptions MUST hold for the pilot to succeed?

Risks: What could go wrong? What is the mitigation strategy?

Team: Sponsor (name), power user (name), IT contact (name). Max 3 people.

Timeline: 4 to 6 weeks build + 30-day measurement period. Milestone plan.

Decision point: After 30 days: scale, reformulate, or stop?

A good pilot canvas takes 60 to 90 minutes to fill out together with leadership and the power user. If it takes 4 hours to get through, the use case is probably too loosely defined. Go back to the prioritisation matrix and pick something more bounded.
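
For teams that keep the canvas in a repository next to the prompt library, the fields above can be captured as a small data structure with a completeness check. This is a minimal sketch: the class name, the field names and the check rules are illustrative assumptions derived from the canvas above, and the example values are placeholders.

```python
from dataclasses import dataclass, fields

REQUIRED_ROLES = {"sponsor", "power user", "IT contact"}


@dataclass
class PilotCanvas:
    use_case: str
    problem: str
    solution: str
    data: str
    success_metrics: list[str]
    critical_assumptions: list[str]
    risks: list[str]
    team: dict[str, str]  # role -> name
    timeline: str
    decision_point: str

    def check(self) -> list[str]:
        """Return a list of issues; an empty list means the canvas is complete."""
        issues = [f"{f.name} is empty" for f in fields(self)
                  if not getattr(self, f.name)]
        if not 2 <= len(self.success_metrics) <= 3:
            issues.append("expected 2 to 3 success metrics")
        if len(self.critical_assumptions) != 3:
            issues.append("expected exactly three critical assumptions")
        if len(self.team) > 3:
            issues.append("team should be at most 3 people")
        missing = REQUIRED_ROLES - set(self.team)
        if missing:
            issues.append(f"missing roles: {sorted(missing)}")
        return issues


canvas = PilotCanvas(
    use_case="Contract review assistant",
    problem="Time spent on first-pass contract review (baseline to be measured)",
    solution="Prompt library plus n8n flow",
    data="Incoming contracts, client-confidential",
    success_metrics=["time saved %", "error rate"],
    critical_assumptions=["data quality", "user willingness", "output quality"],
    risks=["Missed deviation, mitigated by attorney review"],
    team={"sponsor": "TBD", "power user": "TBD", "IT contact": "TBD"},
    timeline="4 to 6 weeks build + 30-day measurement period",
    decision_point="After 30 days: scale, reformulate, or stop",
)
print(canvas.check())
```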

30-day measurement framework

The most frequent mistake on AI pilots is not the technology choice. It is that the pilot starts without a baseline, so ROI cannot be documented afterwards. So always establish the baseline BEFORE the pilot goes live.

A 30-day measurement period is the right level. Shorter gives datasets that are too small. Longer introduces bias from holidays, quarterly shifts or other outside factors. What you measure depends on the use case, but three types of metrics are typically relevant.

Time metrics

How many minutes or hours does a power user spend on the workflow the pilot covers? Measure before the pilot (baseline) on at least 5 to 10 cases. Measure after the pilot (30 days) on at least 20 cases.

Typical expectation: 30 to 60 percent time saving on the specific workflow.
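
The time saving is the relative drop in average handling time between baseline and pilot. The sketch below enforces the minimum sample sizes mentioned above (5 baseline cases, 20 pilot cases). The function name and the sample numbers are illustrative assumptions.

```python
from statistics import mean


def time_saving(baseline_minutes: list[float], pilot_minutes: list[float]) -> float:
    """Return the relative time saving (0.45 means 45 percent)."""
    if len(baseline_minutes) < 5:
        raise ValueError("measure the baseline on at least 5 cases")
    if len(pilot_minutes) < 20:
        raise ValueError("measure the pilot on at least 20 cases")
    baseline_avg = mean(baseline_minutes)
    return (baseline_avg - mean(pilot_minutes)) / baseline_avg


# Hypothetical measurements in minutes per case.
baseline = [60, 55, 70, 65, 50]   # average 60
pilot = [36] * 10 + [30] * 10     # average 33

print(f"Time saving: {time_saving(baseline, pilot):.0%}")
```

Averages hide outliers, so it is worth keeping the raw per-case numbers alongside the headline percentage.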

Quality metrics

How many errors, reworks or customer complaints come out of the workflow? Compare baseline to pilot.

Important: AI can reduce certain error types and introduce new ones. Both count. Watch for shifts in the error pattern.
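
One way to watch the error pattern is to compare error rates per type rather than a single total. The sketch below normalises by the number of cases, since baseline and pilot samples usually differ in size. The error labels and counts are hypothetical.

```python
from collections import Counter


def error_rates(errors: list[str], cases: int) -> dict[str, float]:
    """Errors per case, broken down by error type."""
    return {kind: n / cases for kind, n in Counter(errors).items()}


def error_shift(baseline_errors: list[str], baseline_cases: int,
                pilot_errors: list[str], pilot_cases: int) -> dict[str, float]:
    """Change in error rate per type; positive means more errors with the pilot."""
    before = error_rates(baseline_errors, baseline_cases)
    after = error_rates(pilot_errors, pilot_cases)
    return {kind: after.get(kind, 0.0) - before.get(kind, 0.0)
            for kind in sorted(set(before) | set(after))}


# Hypothetical logs: one label per error observed.
shift = error_shift(
    baseline_errors=["missed clause", "missed clause", "typo"],
    baseline_cases=10,
    pilot_errors=["missed clause", "invented reference", "invented reference"],
    pilot_cases=20,
)
for kind, delta in shift.items():
    print(f"{kind:<20} {delta:+.2f} per case")
```

In this made-up example the known error types fall while a new one ("invented reference") appears, which is exactly the kind of shift a single total error rate would hide.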

User metrics

How often does the power user actually use the pilot? How satisfied are they (NPS-style question)? Which friction moments do they report?

Strongest signal: if the power user stops using the pilot despite measurable time savings, something in UX or trust is not working. Investigate it.

After 30 days: make the decision. Three outcomes. 1) Pilot delivers ROI as expected, scale to the entire team. 2) Pilot shows partial ROI or one of the three critical assumptions did not fully hold. Reformulate and run an adjusted pilot for 14 days. 3) Pilot does not work, document why, pick the next use case from the prioritisation matrix.
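
Writing the decision rule down before the measurement period starts makes it harder to move the goalposts afterwards. The sketch below is one possible encoding of the three outcomes. The thresholds (any positive saving counts as partial ROI, and at least two of three assumptions must hold to justify a second attempt) are assumptions of this sketch and should be agreed in the pilot canvas.

```python
def decide(measured_saving: float, expected_saving: float,
           assumptions_held: int) -> str:
    """Map 30-day results to one of: 'scale', 'reformulate', 'stop'.

    measured_saving / expected_saving: relative time savings (0.30 = 30 percent)
    assumptions_held: how many of the three critical assumptions held (0 to 3)
    """
    if not 0 <= assumptions_held <= 3:
        raise ValueError("assumptions_held must be between 0 and 3")
    if measured_saving >= expected_saving and assumptions_held == 3:
        return "scale"
    if measured_saving > 0 and assumptions_held >= 2:
        return "reformulate"  # adjusted pilot, for example 14 days
    return "stop"             # document why, pick the next use case


print(decide(measured_saving=0.45, expected_saving=0.30, assumptions_held=3))
print(decide(measured_saving=0.20, expected_saving=0.30, assumptions_held=3))
print(decide(measured_saving=0.00, expected_saving=0.30, assumptions_held=3))
```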

Industry examples

Law firm: contract review assistant

Use case: AI reads incoming contracts and identifies deviations from the firm's standard clauses plus risk flags. Output is a 1-page report the attorney can use as a starting point for the full review.

Scores: impact 4 (saves 30 to 60 minutes per contract), effort 2 (prompt library plus n8n flow, no integration), risk 2 (attorney always reviews the final output).

Stack: Claude Team or Claude Enterprise with EU residency, n8n agent flow on Proxmox, prompt library version-controlled in Git. Typical pilot price: DKK 60,000 to 90,000.

Accounting firm: materiality assessment in audit planning

Use case: AI reads the prior year audit documentation plus the current year trial balance and proposes materiality thresholds plus indicators on work areas that should be prioritised. The auditor reviews and approves before final planning.

Scores: impact 4 (saves 4 to 8 hours per audit engagement), effort 3 (requires structured data input from the audit software), risk 3 (FSR standards apply, requires documented human review).

Stack: Claude Enterprise (client-confidential data), Python script for data extraction from audit software, audit-trail PDF per generation. Typical pilot price: DKK 100,000 to 130,000.

Financial advisory: client report generation

Use case: AI summarises the client's portfolio performance plus relevant market insights for the monthly or quarterly client report. The adviser adds the personal advice and approves.

Scores: impact 5 (report generation is often 40 to 60 percent of the adviser's administration), effort 2 (structured data from the portfolio system, prompt library), risk 3 (client-confidential financial data, GDPR Article 9 in some cases).

Stack: Claude Enterprise with DPA, integration with portfolio system via API, quality check loop with adviser review before distribution. Typical pilot price: DKK 80,000 to 120,000.

IT services firm: ticket prioritisation plus internal knowledge base

Use case: AI reads incoming support tickets and classifies them by urgency plus suggests relevant runbooks from the internal knowledge base. The support agent reviews prioritisation before it is executed.

Scores: impact 4 (shortens response time by 40 percent), effort 2 (RAG on existing knowledge base, n8n flow), risk 2 (internal workflow, no customer-facing automation).

Stack: Llama 3.3 70B self-hosted on Proxmox (data sovereignty), Qdrant vector DB, n8n routing flow to ticket system. Typical pilot price: DKK 70,000 to 110,000.

Seven typical pitfalls

The pattern recurs in pilots that did not land ROI. The pitfalls are not hard to avoid once you know them.

Pitfall 1: No baseline

The pilot starts without measuring the before state. After 30 days there are no concrete numbers to compare against, and ROI becomes a gut feeling. Fix: spend a week before pilot start measuring the baseline.

Pitfall 2: Scope too broad

A pilot covering three departments or five workflows is not a pilot. It is a full project disguised as a pilot. Fix: one clear workflow, one team, one success metric.

Pitfall 3: Wrong sponsor

If the sponsor is a middle manager without authority to clear obstacles (IT access, GDPR approval, workflow change), the pilot stalls halfway through. Fix: sponsor at executive or partner level.

Pitfall 4: Tech first, problem second

The pilot starts with "we want to use Claude" or "we want to build a RAG portal" rather than "we want to solve problem X". That leads to solutions looking for a problem. Fix: pilot canvas starts with the problem, the technology choice comes after.

Pitfall 5: No power user involved

If the pilot is built without the employee who will actually use the solution, it ends up technically correct but unusable in practice. Fix: the power user is an active part of the pilot team from day one, not a recipient of the finished solution.

Pitfall 6: GDPR concerns ignored until go-live

The pilot is built and tested on synthetic or anonymised data. The day before go-live, the team discovers that the real data is client-confidential and cannot be sent to the chosen AI provider. Fix: data sensitivity is resolved in the pilot canvas BEFORE the build phase starts.

Pitfall 7: No decision point

After 30 days no one is responsible for making the decision: scale, reformulate, or stop. The pilot lives on without a clear purpose, consumes the power user's time and cannibalises momentum for the next use case. Fix: the pilot canvas includes an explicit decision point with date and owner.

How to scale from pilot

If the pilot delivers ROI as expected, the next question follows. How do we go from one power user's pilot to the entire team or the entire organisation?

Three paths, typically in this order:

Path 1: Same use case, full team (3 to 6 weeks)

The pilot goes from one power user to the entire team that uses the same workflow. Focus is training, documentation and bug fixes based on the experience from the 30 days. It is the most natural scaling and typically where 80 percent of the pilot value is realised.

Path 2: Fast-follow pilots on the same stack

Once one pilot works on a stack, you can build 2 to 3 related pilots on the same technical foundation without repeating the stack investment. For a law firm that landed the contract review pilot: the next fast-follow could be document summarisation or a legal research assistant. Each fast-follow takes 50 to 70 percent of the time the first pilot took.

Path 3: Platform consolidation (typically year 2)

Once you have 3 to 5 pilots in production, it makes sense to consolidate them onto a shared platform with common governance, audit trail, model management and user access. That is not something you plan on day one. It is something you do when platform fragmentation has begun to cost more than the consolidation investment.

Note: do not jump straight to Path 3 from the first pilot. Many SMEs fall for the "AI platform" narrative and build an infrastructure project rather than delivering concrete value. That path leads to IT projects that close without going to production. Live with 2 to 3 separate pilots for a year before the platform question becomes real.

Three steps you can take this week

Step 1: Take the EnterpriseIQ Score

5 minutes, free. Baseline on AI maturity across 6 dimensions.
You get feedback on which dimensions are not yet strong enough for a pilot.
A score under 4 on a dimension like "data foundation" or "governance" means the pilot journey typically starts with closing that gap first.

Step 2: Hold a 60-minute use case prioritisation

The executive team plus 1 to 2 key employees gather for a structured workshop.
Brainstorm 5 to 10 candidate use cases (10 minutes).
Score each on impact, effort, risk (30 minutes).
Pick the top 2 for pilot consideration (20 minutes).

Step 3: Order either a Quick Scan or pilot directly

If you are uncertain about maturity: Quick Scan (DKK 12,000 to 18,000, 1 day). You get a 10-page report plus a concrete pilot recommendation.
If use case and prioritisation are clear: go directly to a pilot project (DKK 50,000 to 150,000, 4 to 6 weeks).
Booking via /en/contact. We respond within 24 hours on business days.

FAQ

How long does a typical pilot take?

4 to 6 weeks build plus 30-day measurement period. Smaller, well-scoped pilots can be landed in 3 weeks. Pilots over 8 weeks are no longer pilots, they are full implementations.

What does a pilot cost?

DKK 50,000 to 150,000 depending on scope. 20 percent can be tied to documented savings in the 30-day measurement period, so we share the risk.

Which use case should we pick?

The one that scores highest on the impact minus effort minus risk matrix. Typically: document summarisation, classification, internal knowledge base, or draft generation.

What if the pilot does not work?

That is the point of a pilot. We design it so the three critical assumptions are tested early. If one fails, we learn why and suggest an alternative direction. That is not failure, it is learning.

Should the pilot be documented for the EU AI Act?

Yes, even when not high-risk. Every pilot becomes part of the AI system inventory with minimum documentation. Takes 1 to 2 hours per pilot.
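
An inventory entry can be as small as one record per pilot. The sketch below covers the minimum documentation items named in this article (model, data sent in, business purpose, owner, fallback). The class and field names are illustrative assumptions, not a format prescribed by the EU AI Act.

```python
from dataclasses import dataclass, fields


@dataclass
class AISystemRecord:
    name: str
    model: str
    data_sent: str
    business_purpose: str
    owner: str
    fallback: str

    def missing_fields(self) -> list[str]:
        """Names of fields that are still empty or whitespace only."""
        return [f.name for f in fields(self)
                if not getattr(self, f.name).strip()]


record = AISystemRecord(
    name="Contract review assistant",
    model="Claude Enterprise with EU residency",
    data_sent="Incoming contracts (client-confidential)",
    business_purpose="Flag deviations from the firm's standard clauses",
    owner="",  # deliberately left empty to show the check
    fallback="Attorney performs the full manual review",
)
print(record.missing_fields())
```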

Which employees should be involved?

Three: sponsor at executive or partner level, one engaged power user from the team, IT contact. Larger pilot teams slow things down.

Next steps

Three paths depending on where you stand:

Free

Take the EnterpriseIQ Score

12 questions, 5 minutes. Baseline on AI maturity and which gaps need to be closed before a pilot.

DKK 50,000 to 150,000

AI Pilot Project

4 to 6 weeks delivery plus 30-day measurement period. 20 percent risk share on documented savings.

Free

30-minute conversation

No obligation. We walk through the use case list and find the highest-score pilot for you.

About the author

Jesper Sachmann is the founder of EnterpriseIQ. 27 years of IT leadership across Oracle, Logica and Capgemini plus 11 years of Archer experience as Alliance Director Europe and Integrated Risk Management Lead Nordics, combined with hands-on pilot work on n8n agent flows and custom RAG solutions since 2023.

AI attribution: This article was produced with AI assistance (Claude Opus 4.7) and reviewed by Jesper Sachmann. See our AI transparency policy for how we use AI in every deliverable.

Citing this article? "EnterpriseIQ: AI pilot projects for SMEs (2026-05-27)" or link to enterpriseiq.dk/en/insights/ai-pilot-projects-for-smes.