Rapid Prototyping & Iterative Validation

Turn your governed workflow into working prototypes. Build knowledge graphs, retrieval-augmented assistants, and interactive experiences using synthetic data—and validate them with AI-assisted usability testing.

Time Commitment 8–10 hours this week

3h concept study, 4h prototyping lab, 1h user test dry-run, optional 2h RAG experimentation.

Primary Tools Jupyter · LlamaIndex · Neo4j Bloom · Streamlit

Optional: LangChain, Flowise, Figma, Maze/Lookback for usability testing.

Deliverables Prototype demos · RAG architecture · Test scripts · Vendor assessment

All mapped to real stakeholder needs.

Setup & Inputs

Required artefacts

  • Future-state workflow diagram with AI touchpoints.
  • Synthetic datasets and coverage reports.
  • Governance checklist and data lineage updates.
  • Top opportunity backlog items selected for prototyping.

Environment preparation

  • Create virtual env: python -m venv venv && source venv/bin/activate.
  • Install packages: pip install llama-index langchain chromadb sentence-transformers streamlit neo4j.
  • Start a Neo4j AuraDB Free instance (optional) or use networkx for local graph visualisation.
  • Install Playwright or Selenium for automated UX test scripting later in the week.

Knowledge sources

Gather 8–12 documents relevant to your workflow (SOPs, policy excerpts, FAQs, transcripts). Store them under data/sources/. These feed the knowledge graph and RAG pipeline.

Prototype architecture (diagram): synthetic data (structured and semi-structured) feeds a knowledge graph / vector index with metadata and governance tags, which powers RAG assistants and prototype apps instrumented for telemetry and feedback.

Learning Outcomes

  • Construct knowledge graphs or vector indexes connected to governed data and policies.
  • Build working prototypes (assistants, dashboards, automations) aligned to real stakeholder journeys.
  • Plan and rehearse AI-augmented usability tests to gather rapid feedback.
  • Evaluate vendors and tools against security, integration, and governance requirements.

Concept Briefings

Knowledge Graph Foundations

Graph structures help you trace stakeholder intents, data, and regulatory obligations. Use them to inform RAG retrieval and highlight dependency chains.

```python
from llama_index import SimpleDirectoryReader, KnowledgeGraphIndex
from llama_index.storage.storage_context import StorageContext
from llama_index.graph_stores import Neo4jGraphStore

# Load curated knowledge sources from the folder prepared in Setup & Inputs
reader = SimpleDirectoryReader("data/sources")
documents = reader.load_data()

# Connect to your Neo4j instance (fill in credentials and connection URL)
graph_store = Neo4jGraphStore(
    username="neo4j",
    password="",
    url="neo4j+s://",
)
storage_context = StorageContext.from_defaults(graph_store=graph_store)

# Build the knowledge graph index, extracting triplets and embeddings per chunk
index = KnowledgeGraphIndex.from_documents(
    documents,
    storage_context=storage_context,
    max_triplets_per_chunk=5,
    include_embeddings=True,
)
```

RAG & Evaluation

Retrieval-Augmented Generation keeps AI grounded in your domain knowledge. Evaluate RAG pipelines with precision/recall, hallucination checks, and latency measurement.

```
codex prompt rag-eval "Given this RAG pipeline description, propose 10 evaluation prompts covering edge cases, compliance checks, and ambiguity scenarios. Include expected references to cite."
```
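Once you have a handful of labelled question-to-source pairs, the retrieval side of these checks can be scripted. Below is a minimal sketch, assuming a hypothetical retrieve(query, k) function that returns chunk ids and a small hand-built evaluation set; adapt both to your pipeline.

```python
import time

# Hypothetical evaluation set: question -> ids of the chunks that should be retrieved
EVAL_SET = {
    "What approvals are required before data leaves the region?": {"policy_transfer_04", "sop_transfer_02"},
    "Who signs off on a workflow change?": {"sop_change_01"},
}

def evaluate_retrieval(retrieve, k: int = 5):
    """retrieve(query, k) is assumed to return a list of chunk ids; swap in your retriever."""
    rows = []
    for query, expected in EVAL_SET.items():
        start = time.perf_counter()
        retrieved = set(retrieve(query, k))
        latency = time.perf_counter() - start
        hits = retrieved & expected
        rows.append({
            "query": query,
            "precision": len(hits) / max(len(retrieved), 1),
            "recall": len(hits) / len(expected),
            "latency_s": round(latency, 3),
        })
    return rows
```

Hallucination checks still require comparing generated answers against the cited sources, which the evaluation prompts above help you script.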

AI-Augmented Usability Testing

Use AI to generate probing questions, summarise sessions, and detect sentiment. Combine with real user walkthroughs for rapid iteration.

```
gemini prompt usability "Act as a UX researcher. I will share a prototype transcript. Generate follow-up questions that uncover trust, comprehension, and failure points for an AI assistant used by [persona]."
```
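If you want to script the summarisation and probe-generation step, a thin wrapper around whichever chat model you already use is enough. This sketch assumes an OpenAI-compatible client, a plain-text transcript file, and a persona string; none of these are prescribed by the course.

```python
from pathlib import Path

from openai import OpenAI  # assumption: any OpenAI-compatible client works here

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def summarise_session(transcript_path: str, persona: str) -> str:
    """Summarise a usability session and draft follow-up probes for the next round."""
    transcript = Path(transcript_path).read_text()
    prompt = (
        f"Act as a UX researcher. Summarise this usability session for an AI assistant used by {persona}. "
        "List trust issues, comprehension gaps, and failure points, then propose three follow-up probes.\n\n"
        + transcript
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Example (hypothetical transcript path and persona):
# print(summarise_session("data/sessions/dry_run_01.txt", "claims analyst"))
```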

Guided Exercise Timeline

Step 1 · Select User Journey (45 min)

Choose the flow to prototype

Pick the highest-value workflow slice from Module 2. Define start/end triggers, success metrics, and stakeholders. Record assumptions to test.

Step 2 · Build Knowledge Asset (120 min)

Create knowledge graph or vector index

Ingest curated documents, split into chunks with metadata (persona, step, sensitivity). Generate embeddings and store them in Chroma or Neo4j.

Governance tip: tag each node/chunk with policy obligations and data classifications to enforce safeguards downstream.
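A minimal sketch of this step using chromadb directly, with governance tags carried in each chunk's metadata; the collection name, metadata fields, and naive splitter below are illustrative assumptions to replace with your own.

```python
from pathlib import Path

import chromadb
from chromadb.utils import embedding_functions

client = chromadb.PersistentClient(path="artifacts/week3/chroma")  # assumed storage path
embed_fn = embedding_functions.SentenceTransformerEmbeddingFunction(model_name="all-MiniLM-L6-v2")
collection = client.get_or_create_collection("workflow_knowledge", embedding_function=embed_fn)

def chunk_text(text: str, size: int = 800, overlap: int = 100):
    """Naive character-based chunking; replace with your preferred splitter."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

for doc_path in Path("data/sources").glob("*.txt"):
    for i, chunk in enumerate(chunk_text(doc_path.read_text())):
        collection.add(
            ids=[f"{doc_path.stem}-{i}"],
            documents=[chunk],
            # Governance tags travel with every chunk so downstream filters can enforce them
            metadatas=[{
                "source": doc_path.name,
                "persona": "claims_analyst",        # hypothetical persona tag
                "workflow_step": "intake_review",   # hypothetical step tag
                "sensitivity": "internal",
                "policy": "data_handling_v2",       # hypothetical policy reference
            }],
        )
```
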
Step 3 · Prototype Creation (150 min)

Develop assistant + dashboard prototypes

Build at least two artefacts:

  • LLM-powered assistant (Streamlit or LangChain agent) supporting your persona.
  • Analytics artefact (Power BI mock, Streamlit dashboard) showing AI-generated insights.

Instrument prototypes with logging for question types, response confidence, and user feedback.
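For the logging piece, here is a sketch of what instrumentation could look like inside the Streamlit assistant; the stubbed answer, confidence value, and log path are placeholders for your own RAG call and repo layout.

```python
import json
import time
from datetime import datetime, timezone

import streamlit as st

LOG_PATH = "artifacts/week3/interaction_log.jsonl"  # assumed location

def log_event(event: dict) -> None:
    """Append one interaction record per line so logs are easy to analyse later."""
    with open(LOG_PATH, "a") as f:
        f.write(json.dumps(event) + "\n")

st.title("Workflow Assistant (prototype)")
question = st.text_input("Ask a question about the workflow")

if question:
    start = time.perf_counter()
    # Placeholder: call your RAG pipeline here and return (answer, sources, confidence)
    answer, sources, confidence = "Stubbed answer", ["sop_intake_01"], 0.72
    latency = time.perf_counter() - start

    st.write(answer)
    st.caption(f"Sources: {', '.join(sources)} · confidence {confidence:.2f}")

    helpful = st.radio("Was this helpful?", ["Yes", "No"])
    log_event({
        "ts": datetime.now(timezone.utc).isoformat(),
        "question": question,
        "latency_s": round(latency, 3),
        "confidence": confidence,
        "sources": sources,
        "feedback": helpful,
    })
```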

Step 4 · Usability Test Plan (90 min)

Run dry-run and prepare scripts

Use AI to draft tasks and follow-up probes. Conduct an internal dry-run with a teammate and capture learnings. Prepare a highlight-reel approach for the real stakeholder sessions.

Step 5 · Vendor & Tool Assessment (60 min)

Evaluate solution stack

Assess each tool against security, compliance, integration, and TCO. Document alternatives and exit strategies should the vendor not pass due diligence.
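If a quick sanity check on the matrix helps, the weighted scoring behind it takes only a few lines; the weights below are assumptions to agree with your governance team, not prescribed values.

```python
# Illustrative weighted scoring for the vendor matrix; weights and scores are assumptions
WEIGHTS = {"security": 0.30, "compliance": 0.25, "integration": 0.20, "cost": 0.15, "support": 0.10}

vendors = {
    "Vendor A": {"security": 4, "compliance": 5, "integration": 3, "cost": 3, "support": 4},
    "Vendor B": {"security": 3, "compliance": 3, "integration": 5, "cost": 4, "support": 3},
}

def weighted_score(scores: dict) -> float:
    return sum(WEIGHTS[criterion] * scores[criterion] for criterion in WEIGHTS)

for name, scores in sorted(vendors.items(), key=lambda kv: weighted_score(kv[1]), reverse=True):
    print(f"{name}: {weighted_score(scores):.2f} / 5")
```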

Lab 03 · Prototype & Validation Package

Deliver working prototypes, evaluation plans, and vendor analysis ready for stakeholder review.

Inputs
  • Synthetic datasets (Module 2).
  • Knowledge sources (policies, SOPs).
  • Future-state workflow and governance documents.
Outputs
  • Prototype repo with instructions (prototypes/assistant, prototypes/dashboard).
  • RAG architecture diagram and description (artifacts/week3/rag_architecture.md).
  • Usability test plan (artifacts/week3/usability_test.md).
  • Vendor assessment matrix (artifacts/week3/vendor_assessment.xlsx).
Collaboration
  • Partner dry-run session recorded or summarised.
  • Peer feedback thread in Slack #week3-demos.
  • Mentor review of RAG architecture for risk mitigation.

Execution Steps

  1. Run python scripts/build_kg.py (customise it for your data). Validate the graph in Neo4j Bloom or a networkx visualisation.
  2. Develop the Streamlit assistant: streamlit run prototypes/assistant/app.py. Ensure persona context injection and guardrails (max tokens, filtered responses); see the sketch after this list.
  3. Create dashboards or process visual prototypes using synthetic data; include scenario toggles to highlight AI impact.
  4. Draft usability plan with tasks (3 primary, 2 stretch), success criteria, and observation form. Incorporate AI-generated probes.
  5. Complete vendor assessment matrix with scoring rubric (Security, Compliance, Integration, Cost, Support). Provide rationale for chosen tools.
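For step 2, persona context injection and lightweight guardrails can be as simple as the sketch below; the persona text, blocked-topic list, and token cap are assumptions to adapt to your own persona and policies.

```python
# Hypothetical guardrail helpers for prototypes/assistant/app.py
MAX_TOKENS = 512                                # pass this as the completion limit when calling your model
BLOCKED_TOPICS = ("salary", "medical history")  # assumed filtered topics

PERSONA_CONTEXT = (
    "You assist a claims analyst. Answer only from the retrieved workflow documents, "
    "cite the source id for every statement, and escalate to a human reviewer when unsure."
)

def passes_guardrails(question: str) -> bool:
    """Reject questions on blocked topics before they reach the model."""
    return not any(topic in question.lower() for topic in BLOCKED_TOPICS)

def build_messages(question: str, retrieved_chunks: list[str]) -> list[dict]:
    """Inject persona context and retrieved evidence ahead of the user question."""
    context = "\n\n".join(retrieved_chunks)
    return [
        {"role": "system", "content": PERSONA_CONTEXT},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]
```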

Validation Checkpoints

  • Prototypes load within 3 seconds locally and respond with references to knowledge sources.
  • Logs capture user interactions (question types, time stamps, confidence scores).
  • Usability plan includes consent language, success metrics, and fallback plan.
  • Vendor matrix cites policy references and integration dependencies.

Reflection & Submission

Submission Checklist

  • Prototype repository zip or private repo link.
  • artifacts/week3/rag_architecture.md + architecture SVG.
  • artifacts/week3/usability_test.md with dry-run notes.
  • artifacts/week3/vendor_assessment.xlsx (or CSV).
  • Reflection log (artifacts/week3/reflection.md): what did users struggle with, and what surprised you?

Assessment Rubric

Prototype Fidelity (35%): Functionality, alignment to workflow, logging/guardrails.
Knowledge Management (25%): Graph/index structure, citations, evaluation plan.
User Validation (25%): Test design, insights captured, iteration plan.
Vendor Due Diligence (15%): Security/compliance analysis, alternative strategies.

Submission Process

Merge the Week 3 branch into main once peer-reviewed. Upload artefacts to the portal and schedule your stakeholder demo rehearsal. Share a 3-minute Loom recording or live demo during Friday’s cohort session.

Troubleshooting & FAQ

RAG outputs irrelevant answers.

Check chunking strategy (overlap, size), ensure metadata filters (persona, step) are used, and adjust embedding model. Validate queries by logging top retrieved nodes and reviewing their relevance.
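With the LlamaIndex setup from the concept briefing, logging the top retrieved nodes takes only a few lines; the query text and top_k below are placeholders, and index is assumed to be the index you built earlier.

```python
# Inspect what the retriever returns before changing chunking or embeddings
retriever = index.as_retriever(similarity_top_k=5)
nodes = retriever.retrieve("What approvals are required before data leaves the region?")

for n in nodes:
    print("score:", n.score, "| metadata:", n.node.metadata)
    print(n.node.get_content()[:200], "\n---")
```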

Prototype performance is slow.

Enable caching for embeddings, store retrieved contexts, and reduce model size (switch to gpt-4o-mini or Gemini 1.5 Flash for testing). Optimise Streamlit components for response batching.
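For the caching suggestion, even a process-level cache on the embedding call helps during testing; this sketch assumes a sentence-transformers model and caches by exact text. In a Streamlit app, st.cache_data or st.cache_resource serves the same purpose.

```python
from functools import lru_cache

from sentence_transformers import SentenceTransformer

_model = SentenceTransformer("all-MiniLM-L6-v2")

@lru_cache(maxsize=4096)
def embed(text: str) -> tuple:
    """Cache embeddings for repeated chunks and questions so reruns do not recompute them."""
    return tuple(_model.encode(text).tolist())
```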

Users distrust AI responses.

Surface citations, confidence scores, and human oversight messaging. Include escalation options within the prototype. Capture feedback and plan explanation enhancements.

Further Study & Next Steps

Recommended Resources

  • LangChain cookbook: evaluation and guardrail techniques.
  • Mattermost AI RAG case study for regulated industries.
  • Baymard Institute guidelines for explanation UX in automation.

Prepare for Module 4

Instrument prototypes with event logging and metrics. Gather baseline KPI data and plan for telemetry integration. Identify stakeholders responsible for change management.

Preview Module 4 →