Noodle

Network

Student Journey Co-Pilot · 40 program agents · Langfuse × Golden Evals

Updated May 28, 7:32 PM
Network Share (Flagship): 7.8% · Online MBA Flagship · avg across 4 touchpoints
7-day change: +0.1% · vs. previous 7d
Network Rank: #1 · of 40 program agents
Trace Volume: 9 · Langfuse traces, trailing 7d

Agent Engagement — 30 days

Live data

Daily engagement share across 40 program agents. Higher = larger slice of network turn volume on that touchpoint.
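As a rough illustration of how the headline figure relates to the per-touchpoint numbers, the sketch below assumes the network share is an unweighted mean of the four touchpoint shares. That aggregation is an assumption inferred from the "avg across 4 touchpoints" label, and `engagement_share` is a hypothetical helper, not the dashboard's code. The input values are the Online MBA - East Coast per-touchpoint shares listed under All Program Agents.

```python
from statistics import mean

# Per-touchpoint shares (percent) for Online MBA - East Coast, from the agent cards.
east_coast = {
    "Enrollment": 3.3,
    "Learning": 3.5,
    "Support": 2.6,
    "Faculty Assist": 3.8,
}


def engagement_share(per_touchpoint: dict[str, float]) -> float:
    """Unweighted mean of the touchpoint shares (assumed aggregation)."""
    return mean(per_touchpoint.values())


print(f"{engagement_share(east_coast):.1f}%")
```

If the real aggregation weights touchpoints by turn volume, the mean would need per-touchpoint weights, which this page does not expose.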

Program Agent Leaderboard

Engagement share across all 4 touchpoints · last 24h

7-day Δ
  1. Online MBA - Flagship (You) · 7.8% · 0.1
  2. Online MBA - East Coast · 3.3% · 0.1
  3. Master of Education · 3.3% · 0.1
  4. Online MBA - Southern · 3.1% · 0.0
  5. Online MBA - West Coast · 3.1% · 0.2
  6. Executive MBA · 2.9% · 0.1
  7. Data Science Bootcamp - Cohort 1 · 2.8% · 0.5
  8. MPH - Epidemiology · 2.8% · 0.2
  9. MS in Data Science · 2.7% · 0.4
  10. Master of Public Health · 2.7% · 0.7
  11. RN to BSN · 2.7% · 0.1
  12. MEd - Educational Leadership · 2.6% · 0.2
  13. Online JD · 2.6% · 0.2
  14. MEd - Curriculum & Instruction · 2.6% · 0.1
  15. MS in Management · 2.5% · 0.2
  16. MSW - Clinical Practice · 2.5% · 0.2
  17. MS in Analytics · 2.5% · 0.3
  18. MS in Finance · 2.4% · 0.2
  19. MS in Business Analytics · 2.4% · 0.1
  20. MPH - Global Health · 2.4% · 0.1
  21. MS in Instructional Design · 2.3% · 0.2
  22. Master of Social Work · 2.3% · 0.8
  23. MS in Strategic Marketing · 2.2% · 0.2
  24. MS in Software Engineering · 2.2% · 0.1
  25. MSN - Nurse Leadership · 2.2% · 0.1
  26. MSN - Family Nurse Practitioner · 2.1% · 0.2
  27. MS in Higher Education Admin · 2.0% · 0.0
  28. MS in Electrical Engineering · 2.0% · 0.1
  29. Online LL.M. · 2.0% · 0.0
  30. MS in Information Systems · 2.0% · 0.1
  31. MSW - Macro Practice · 2.0% · 0.0
  32. MS in Health Services Administration · 2.0% · 0.1
  33. MEd - Special Education · 1.9% · 0.1
  34. MS in HR Analytics · 1.9% · 0.0
  35. MS in Cybersecurity · 1.9% · 0.0
  36. MS in Healthcare Analytics · 1.9% · 0.2
  37. MS in ML Engineering · 1.8% · 0.2
  38. MSN - Nursing Informatics · 1.8% · 0.0
  39. MS in Civil Engineering · 1.8% · 0.3
  40. MSW - Advanced Standing · 1.7% · 0.0

Active Alerts

Auto-detected from Langfuse traces, golden-set evals, red-team sessions, and SLA monitors

2 high priority

MPH Flagship - hallucination spike on Learning agent (rolled back)

Learning

After the May 24 model rotation, the MPH Learning agent's hallucination rate jumped from 1.2% to 4.8% on case-study coursework prompts (Langfuse golden-set eval, n=120). Pattern: invented citations to non-existent CDC reports. Pinned the agent back to the prior prompt+model version. Root cause review queued; suspect a tool-call format regression on the new model.

5d ago

MSW Flagship Support - three-way chat queue depth growing

Support

MSW Support routes to a campus advisor for high-sensitivity cases (mental health, accommodation, Title IX). Advisor SLA on the three-way chat slipped from 12 min to 47 min over the last 10 days. Best guess: staff turnover on the campus side. The agent is handing off correctly, but the advisor is no longer responding in time. Escalated to the program lead.

4d ago

3 programs have prompt-version regressions awaiting review

Pre-deploy eval suite flagged regressions on MS in Cybersecurity (tone shift in adversarial probes), MEd Special Education (fallback handling dropped on out-of-scope queries), and MS in Finance (longer responses than the rubric allows). Blocking deploy until reviewed.

6d ago

Online MBA Flagship crossed 50K trace milestone in Langfuse

Enrollment

Flagship MBA agent surpassed 50,000 traces in Langfuse this week with a sustained 96.2% acceptance rate on the golden eval set. The longest-running deployment is now the most stable - prompt maturity compounds. Time for a case-study writeup for the partner success deck.

6d ago

Red team flagged fallback weaknesses on 2 programs

Internal red-team session surfaced reproducible fallback failures on MSN-Nurse Leadership (agent answered when it should have escalated to faculty) and Online JD (provided generalized legal commentary outside the program's scoped knowledge base). Fixes in progress; new fallback rubric being added to the eval suite.

7d ago

Data Science Bootcamp Cohort 1 - first-cohort volatility expected

The DS Bootcamp agent (deployed 38 days ago) is showing day-over-day swings of ±18% across all touchpoints. The pattern matches the first-90-day shakedown seen on every new program. Suppressing volatility alerts on this agent until day 90. Quality metrics are within range.

7d ago
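The suppression rule in the Data Science Bootcamp alert can be sketched as a small predicate. This is a minimal sketch: the 90-day shakedown window comes from the alert text, while the function name `should_alert` and the 10% default threshold are hypothetical values chosen for illustration.

```python
def should_alert(
    days_since_deploy: int,
    day_over_day_change: float,
    threshold: float = 0.10,   # hypothetical volatility threshold (fraction)
    shakedown_days: int = 90,  # window named in the alert text
) -> bool:
    """Return True when a volatility alert should fire for an agent."""
    # New agents are expected to swing; stay quiet until the window closes.
    if days_since_deploy < shakedown_days:
        return False
    return abs(day_over_day_change) > threshold


# Day 38 with an 18% swing: suppressed. The same swing on day 120: alert.
print(should_alert(38, 0.18), should_alert(120, -0.18))
```

Quality alerts (hallucination rate, fallback failures) would stay on a separate path so that suppressing volatility does not hide a real regression.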

Recent Inquiries

Student inbound · program routing highlighted · 40 inquiries tracked total

Module 4 case study - I don't understand the discounted cash flow setup
Enrollment · 7d ago

Routed to the MS in Civil Engineering Learning agent. Pulled the module 4 syllabus plus the relevant case-study appendix from the knowledge base, walked through the DCF setup, and queued a TA office-hour link if the learner needs more.

I missed last week's synchronous session - is there a recording?
Learning · 7d ago

Online MBA - West Coast Support agent looked up the relevant policy and offered to open a ticket with the registrar. No financial-aid triggers fired (this drop wouldn't affect their aid window). Escalation path queued in case of follow-up.

Can I apply if my undergrad GPA was 2.7?
Support · 6d ago

MS in Management enrollment agent surfaced the holistic-review policy, recommended bringing GRE or recent quantitative coursework as a strengthening signal, and offered to set up a 15-minute fit call with an enrollment counselor. Agent flagged for human review (admissions-policy sensitive).

My professor mentioned a paper on Bayesian inference - which one?
Faculty Assist · 6d ago

Faculty Assist (three-way chat) returned the paper title plus DOI plus a one-paragraph summary, then asked the professor to confirm the citation before sending to the learner. Professor approved.

All Program Agents

40 agents · engagement across 4 touchpoints · 30-day trends

1. Online MBA - Flagship (You) · Engagement share 7.8% (0.1 7d) · Enrollment #1 9.7% · Learning #1 7.8% · Support #2 6.1% · Faculty Assist #1 7.8% · Color #6366f1
2. Online MBA - East Coast · Engagement share 3.3% (0.1 7d) · Enrollment #4 3.3% · Learning #3 3.5% · Support #14 2.6% · Faculty Assist #2 3.8% · Color #fb923c
3. Master of Education · Engagement share 3.3% (0.1 7d) · Enrollment #3 3.4% · Learning #6 3.3% · Support #4 3.3% · Faculty Assist #5 3.3% · Color #f59e0b
4. Online MBA - Southern · Engagement share 3.1% (0.0 7d) · Enrollment #10 2.7% · Learning #4 3.5% · Support #5 3.0% · Faculty Assist #4 3.4% · Color #a78bfa
5. Online MBA - West Coast · Engagement share 3.1% (0.2 7d) · Enrollment #5 3.2% · Learning #2 3.7% · Support #15 2.5% · Faculty Assist #9 3.0% · Color #10b981
6. Executive MBA · Engagement share 2.9% (0.1 7d) · Enrollment #9 2.8% · Learning #7 3.2% · Support #16 2.5% · Faculty Assist #6 3.2% · Color #38bdf8
7. Data Science Bootcamp - Cohort 1 · Engagement share 2.8% (0.5 7d) · Enrollment #17 2.4% · Learning #29 2.0% · Support #1 6.4% · Faculty Assist #40 0.6% · Color #eab308
8. MPH - Epidemiology · Engagement share 2.8% (0.2 7d) · Enrollment #18 2.4% · Learning #8 3.1% · Support #6 2.9% · Faculty Assist #10 2.9% · Color #fbbf24
9. MS in Data Science · Engagement share 2.7% (0.4 7d) · Enrollment #7 2.8% · Learning #9 3.0% · Support #24 2.2% · Faculty Assist #11 2.9% · Color #c084fc
10. Master of Public Health · Engagement share 2.7% (0.7 7d) · Enrollment #2 3.4% · Learning #40 0.2% · Support #3 3.5% · Faculty Assist #3 3.8% · Color #60a5fa
11. RN to BSN · Engagement share 2.7% (0.1 7d) · Enrollment #6 3.0% · Learning #17 2.5% · Support #13 2.6% · Faculty Assist #14 2.6% · Color #f472b6
12. MEd - Educational Leadership · Engagement share 2.6% (0.2 7d) · Enrollment #15 2.4% · Learning #13 2.7% · Support #10 2.7% · Faculty Assist #13 2.7% · Color #ec4899
13. Online JD · Engagement share 2.6% (0.2 7d) · Enrollment #8 2.8% · Learning #10 2.9% · Support #19 2.5% · Faculty Assist #21 2.2% · Color #ec4899
14. MEd - Curriculum & Instruction · Engagement share 2.6% (0.1 7d) · Enrollment #21 2.3% · Learning #14 2.7% · Support #23 2.2% · Faculty Assist #7 3.1% · Color #818cf8
15. MS in Management · Engagement share 2.5% (0.2 7d) · Enrollment #14 2.4% · Learning #15 2.6% · Support #18 2.5% · Faculty Assist #17 2.5% · Color #f472b6
16. MSW - Clinical Practice · Engagement share 2.5% (0.2 7d) · Enrollment #12 2.5% · Learning #11 2.9% · Support #27 2.1% · Faculty Assist #16 2.5% · Color #10b981
17. MS in Analytics · Engagement share 2.5% (0.3 7d) · Enrollment #20 2.3% · Learning #18 2.5% · Support #12 2.7% · Faculty Assist #18 2.4% · Color #60a5fa
18. MS in Finance · Engagement share 2.4% (0.2 7d) · Enrollment #19 2.3% · Learning #12 2.7% · Support #21 2.4% · Faculty Assist #20 2.3% · Color #fb7185
19. MS in Business Analytics · Engagement share 2.4% (0.1 7d) · Enrollment #16 2.4% · Learning #19 2.4% · Support #32 1.9% · Faculty Assist #12 2.8% · Color #facc15
20. MPH - Global Health · Engagement share 2.4% (0.1 7d) · Enrollment #22 2.2% · Learning #22 2.3% · Support #8 2.8% · Faculty Assist #22 2.2% · Color #f87171
21. MS in Instructional Design · Engagement share 2.3% (0.2 7d) · Enrollment #13 2.5% · Learning #20 2.4% · Support #28 2.1% · Faculty Assist #23 2.2% · Color #14b8a6
22. Master of Social Work · Engagement share 2.3% (0.8 7d) · Enrollment #11 2.7% · Learning #5 3.4% · Support #40 0.0% · Faculty Assist #8 3.1% · Color #fb923c
23. MS in Strategic Marketing · Engagement share 2.2% (0.2 7d) · Enrollment #31 2.0% · Learning #24 2.3% · Support #26 2.2% · Faculty Assist #15 2.6% · Color #34d399
24. MS in Software Engineering · Engagement share 2.2% (0.1 7d) · Enrollment #26 2.1% · Learning #21 2.4% · Support #29 2.1% · Faculty Assist #19 2.4% · Color #a3e635
25. MSN - Nurse Leadership · Engagement share 2.2% (0.1 7d) · Enrollment #27 2.1% · Learning #16 2.6% · Support #31 2.0% · Faculty Assist #28 2.0% · Color #facc15
26. MSN - Family Nurse Practitioner · Engagement share 2.1% (0.2 7d) · Enrollment #28 2.1% · Learning #28 2.0% · Support #17 2.5% · Faculty Assist #34 1.8% · Color #fb7185
27. MS in Higher Education Admin · Engagement share 2.0% (0.0 7d) · Enrollment #34 1.8% · Learning #31 1.9% · Support #22 2.4% · Faculty Assist #25 2.1% · Color #6366f1
28. MS in Electrical Engineering · Engagement share 2.0% (0.1 7d) · Enrollment #33 1.8% · Learning #36 1.7% · Support #7 2.8% · Faculty Assist #35 1.8% · Color #818cf8
29. Online LL.M. · Engagement share 2.0% (0.0 7d) · Enrollment #23 2.2% · Learning #33 1.8% · Support #30 2.0% · Faculty Assist #27 2.0% · Color #14b8a6
30. MS in Information Systems · Engagement share 2.0% (0.1 7d) · Enrollment #32 2.0% · Learning #26 2.1% · Support #34 1.9% · Faculty Assist #26 2.1% · Color #fbbf24
31. MSW - Macro Practice · Engagement share 2.0% (0.0 7d) · Enrollment #35 1.8% · Learning #23 2.3% · Support #39 1.7% · Faculty Assist #24 2.2% · Color #a78bfa
32. MS in Health Services Administration · Engagement share 2.0% (0.1 7d) · Enrollment #29 2.0% · Learning #25 2.2% · Support #38 1.7% · Faculty Assist #29 2.0% · Color #22d3ee
33. MEd - Special Education · Engagement share 1.9% (0.1 7d) · Enrollment #24 2.1% · Learning #32 1.9% · Support #35 1.8% · Faculty Assist #32 1.9% · Color #eab308
34. MS in HR Analytics · Engagement share 1.9% (0.0 7d) · Enrollment #25 2.1% · Learning #27 2.0% · Support #37 1.7% · Faculty Assist #33 1.8% · Color #c084fc
35. MS in Cybersecurity · Engagement share 1.9% (0.0 7d) · Enrollment #30 2.0% · Learning #38 1.3% · Support #20 2.4% · Faculty Assist #30 2.0% · Color #f87171
36. MS in Healthcare Analytics · Engagement share 1.9% (0.2 7d) · Enrollment #36 1.7% · Learning #35 1.7% · Support #25 2.2% · Faculty Assist #31 2.0% · Color #a3e635
37. MS in ML Engineering · Engagement share 1.8% (0.2 7d) · Enrollment #37 1.7% · Learning #39 1.3% · Support #9 2.8% · Faculty Assist #38 1.5% · Color #22d3ee
38. MSN - Nursing Informatics · Engagement share 1.8% (0.0 7d) · Enrollment #39 1.7% · Learning #37 1.5% · Support #11 2.7% · Faculty Assist #39 1.4% · Color #34d399
39. MS in Civil Engineering · Engagement share 1.8% (0.3 7d) · Enrollment #38 1.7% · Learning #34 1.8% · Support #33 1.9% · Faculty Assist #36 1.7% · Color #f59e0b
40. MSW - Advanced Standing · Engagement share 1.7% (0.0 7d) · Enrollment #40 1.6% · Learning #30 2.0% · Support #36 1.8% · Faculty Assist #37 1.6% · Color #38bdf8

Inquiry Templates

10 canonical inquiry types · sampled across 4 touchpoints · 40 captured agent replies

Live from Langfuse
admissions-fit · Enrollment · Learning · Support · Faculty Assist
"Can I apply if my undergrad GPA was 2.7?"
25% routed to flagship agent · last 6d ago
enrollment-FAQ · Enrollment · Learning · Support · Faculty Assist
"Compare this program to the on-campus version for a working professional"
25% routed to flagship agent · last 5d ago
student-services · Enrollment · Learning · Support · Faculty Assist
"How do I drop a course without affecting my financial aid?"
50% routed to flagship agent · last 3d ago
student-services · Enrollment · Learning · Support · Faculty Assist
"I missed last week's synchronous session - is there a recording?"
50% routed to flagship agent · last 7d ago
accessibility · Enrollment · Learning · Support · Faculty Assist
"I need an accommodation for my final - who do I talk to?"
0% routed to flagship agent · last 3d ago
coursework-help · Enrollment · Learning · Support · Faculty Assist
"I'm stuck on the linear regression assignment - is there a TA office hour?"
25% routed to flagship agent · last 6d ago
coursework-help · Enrollment · Learning · Support · Faculty Assist
"Module 4 case study - I don't understand the discounted cash flow setup"
0% routed to flagship agent · last 7d ago
faculty-followup · Enrollment · Learning · Support · Faculty Assist
"My professor mentioned a paper on Bayesian inference - which one?"
50% routed to flagship agent · last 7d ago
enrollment-FAQ · Enrollment · Learning · Support · Faculty Assist
"What is the application deadline for the spring cohort?"
25% routed to flagship agent · last 6d ago
outcomes · Enrollment · Learning · Support · Faculty Assist
"What kind of jobs do alumni get? Specific examples?"
25% routed to flagship agent · last 6d ago

What I'd Want to Dig Into in 30 / 60 / 90

7 pillars · 70 starting hypotheses · anchored against the public JD. Click any pillar to expand.

Open to being wrong about most of it
Built from the public Noodle Prompt Systems Engineer JD only. I don't have your internal context — what's already in flight, what's already shipped, what the team has decided is the wrong direction — so treat everything below as questions and starting points, not prescriptions. It's here to show how I'd scope the role, not to tell you what to do.
Phase 1 · First 90 Days · Days 1-90
Phase 2 · Production Quality · Days 91-180
Phase 3 · Scale + New Surfaces · Days 181-365
📝

Prompt Portfolio Build-Out

Phase 1 · Days 1-45

Every program agent has a versioned, tested prompt for each touchpoint (enrollment, learning, support, faculty assist). Personas, scope constraints, fallback handling, and tone are explicit and documented per program.

Anchored against JD

Write, iterate, and maintain system prompts and instruction sets for Noodle's AI agents across the student journey.

Day 1 · Day 180
  1. Audit the existing prompt library across all program agents and tag each one by maturity (production, beta, experimental, deprecated).
  2. Define a standard prompt structure - persona, scope, tone, fallback, escalation paths - and migrate the top 10 prompts to the standard first.
  3. Build a per-program persona guide co-authored with program leads (voice, formality, jargon density, when to escalate).
  4. Map the canonical inquiry types per touchpoint - the 10-20 questions that cover most volume - and ensure every program has a tested answer.
  5. Multi-turn conversation logic: explicit branching for ambiguity, clarification, and graceful exit conditions.
  6. RAG-augmented prompt patterns that draw from program-specific knowledge bases (syllabi, policies, alumni stories).
  7. Few-shot example library curated per program category so new programs get a head-start, not a blank slate.
  8. Output-format constraints (length, tone, citation requirements) encoded in prompts and verified by evals.
  9. Sensitive-topic guardrails (mental health, accommodation, Title IX, financial aid) with explicit human-routing logic.
  10. Quarterly prompt-portfolio review: which prompts are still in production, which got retired, which need a rebuild.
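The standard prompt structure named in the list above (persona, scope, tone, fallback, escalation paths) could be captured as a typed record so that a missing section is caught before a prompt ships. This is a sketch under assumptions: `PromptSpec`, its methods, and the bracketed section headers are hypothetical and not an existing Noodle format.

```python
from dataclasses import dataclass, fields


@dataclass
class PromptSpec:
    """One system prompt, split into the five standard sections."""
    persona: str
    scope: str
    tone: str
    fallback: str
    escalation: str

    def missing_sections(self) -> list[str]:
        # A section counts as missing when it is empty or whitespace only.
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def render(self) -> str:
        # Emit sections in declaration order under bracketed headers.
        return "\n\n".join(
            f"[{f.name.upper()}]\n{getattr(self, f.name).strip()}" for f in fields(self)
        )


draft = PromptSpec(
    persona="Friendly enrollment advisor for the Online MBA.",
    scope="Answer admissions and program questions only.",
    tone="Plain, warm, concise.",
    fallback="",  # left blank on purpose to show the check
    escalation="Route accommodation and Title IX topics to staff.",
)
print(draft.missing_sections())
```

A check like `missing_sections` is also a natural first rule for a lint-style prompt checker.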
🎯

Eval Framework + Golden Sets

Phase 1 · Days 1-60

Every prompt change runs against a versioned golden set before deploy. Accuracy, tone, hallucination rate, task completion, and rubric alignment are measured and surfaced as a blocking quality gate.

Anchored against JD

Build and maintain evaluation frameworks to measure agent accuracy, tone, hallucination rate, task completion, and alignment with rubric-based learning objectives.

Day 1 · Day 180
  1. Define the eval rubric per touchpoint - what does 'good' mean for an Enrollment agent vs. a Learning agent vs. Faculty Assist.
  2. Build a golden test set per program (20-100 inputs each) with reference outputs and tolerance bands, co-authored with learning designers.
  3. Implement a model-graded eval for tone + persona adherence, with a human-graded random-sample audit weekly.
  4. Hallucination detection: structured citation requirements + automated source-validation in the eval pipeline.
  5. Pre-deploy gate: any prompt change blocks if it regresses on >5% of the golden set; auto-PR review with diff summary.
  6. Per-program rubric calibration with the learning team - the same answer might be A+ for an MBA agent and a C for a Public Health one.
  7. Continuous eval: nightly runs against a sampled production trace set, regressions paged to Slack within an hour.
  8. Eval portability: rubrics expressed as code so they can be cloned, versioned, and audited like any other artifact.
  9. Adversarial eval set maintained alongside the happy-path set: jailbreak attempts, ambiguous queries, hostile users.
  10. Quarterly eval-framework review: what's catching regressions, what's missing, what's too noisy to act on.
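The pre-deploy gate in the list above (block when a change regresses on more than 5% of the golden set) can be sketched as follows. This is a minimal sketch: it assumes each golden-set case reduces to a pass/fail flag keyed by case ID, and the names `regression_rate` and `gate_allows_deploy` are hypothetical.

```python
def regression_rate(baseline: dict[str, bool], candidate: dict[str, bool]) -> float:
    """Fraction of golden-set cases that passed on baseline but fail on candidate."""
    if not baseline:
        raise ValueError("golden set is empty")
    # A case missing from the candidate run is treated as a failure.
    regressed = [
        case_id
        for case_id, passed in baseline.items()
        if passed and not candidate.get(case_id, False)
    ]
    return len(regressed) / len(baseline)


def gate_allows_deploy(
    baseline: dict[str, bool],
    candidate: dict[str, bool],
    max_regression: float = 0.05,  # the >5% blocking threshold from the list
) -> bool:
    return regression_rate(baseline, candidate) <= max_regression


# 100 cases that all pass today; the candidate prompt breaks the first six.
baseline = {f"case-{i}": True for i in range(100)}
candidate = {case_id: i >= 6 for i, case_id in enumerate(baseline)}
print(regression_rate(baseline, candidate), gate_allows_deploy(baseline, candidate))
```

Counting only pass-to-fail transitions keeps cases that were already failing on the baseline from blocking an unrelated change.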
📡

Langfuse Observability + Alerting

Phase 1 · Days 30-90

Every agent turn is traced. Quality regressions, hallucination spikes, latency drift, and unusual user patterns surface as actionable alerts before users complain.

Anchored against JD

Use Langfuse to monitor prompt performance in production, identify regressions, and prioritize prompt improvements.

Day 1 · Day 180
  1. Standardize Langfuse trace metadata across all program agents (program_id, touchpoint, prompt_version, model, tool calls, latency).
  2. Per-program dashboards: acceptance rate, escalation rate, top failure modes, p50/p95 latency, cost per resolved turn.
  3. Anomaly detection: hallucination-rate spikes, fallback-rate spikes, sentiment dips, queue depth growth - all paged with reasonable thresholds.
  4. Sample-based human review queue for high-stakes touchpoints (accessibility requests, mental health flags, admissions policy).
  5. Trace-to-eval linkage: every trace tied to the prompt version + golden-set score that was in production at that moment.
  6. Cost tracking per program per touchpoint, with budget alerts before overruns become headline-grabby.
  7. Latency budget per touchpoint encoded as an SLO with error-budget burn alerts.
  8. Weekly observability digest auto-drafted from the dashboard for partner success teams.
  9. On-call rotation for prompt-systems incidents with a documented runbook per common failure mode.
  10. Quarterly observability retro: what alerts fire too often, what's silent that shouldn't be, what's our mean-time-to-detect.
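The standardized trace metadata from the first item above could be enforced with a small validated record built before each trace is emitted. This is a sketch under assumptions: the field names follow the list, but the `TraceMetadata` class, the touchpoint spellings, the example values, and the validation rules are hypothetical. No Langfuse SDK call is shown because how the dictionary is attached to a trace depends on the SDK version in use.

```python
from dataclasses import asdict, dataclass

# Assumed canonical spellings for the four touchpoints.
TOUCHPOINTS = {"enrollment", "learning", "support", "faculty_assist"}


@dataclass(frozen=True)
class TraceMetadata:
    program_id: str
    touchpoint: str
    prompt_version: str
    model: str
    latency_ms: float
    tool_calls: tuple[str, ...] = ()

    def __post_init__(self) -> None:
        # Reject values that would fragment dashboards downstream.
        if self.touchpoint not in TOUCHPOINTS:
            raise ValueError(f"unknown touchpoint: {self.touchpoint!r}")
        if self.latency_ms < 0:
            raise ValueError("latency_ms must be non-negative")


meta = TraceMetadata(
    program_id="online-mba-flagship",   # hypothetical ID format
    touchpoint="support",
    prompt_version="2.1.0",
    model="example-model",              # placeholder model name
    latency_ms=840.0,
    tool_calls=("policy_lookup",),
)
payload = asdict(meta)  # plain dict, ready to attach as trace metadata
print(payload["program_id"], payload["touchpoint"])
```

Validating at construction time means a misspelled touchpoint fails loudly in the agent instead of silently creating a new dashboard bucket.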
🛡️

Red-Team + Adversarial Testing

Phase 2 · Days 60-150

Failure modes are surfaced before students find them. Jailbreaks, scope drift, sensitive-topic mishandling, and out-of-scope confident answers are caught in red-team sessions and fixed at the prompt layer.

Anchored against JD

Design red-teaming and adversarial testing protocols to surface edge cases and failure modes before agents reach students.

Day 1 · Day 180
  1. Define the red-team rubric: what categories of failure we test for (jailbreak, scope drift, sensitive-topic, hallucinated authority, persona drift, etc.).
  2. Quarterly red-team session per program agent with a rotating internal team (engineers + learning designers + partner stakeholders).
  3. Automated adversarial prompt-generation seeded by recent production failures and external LLM-jailbreak research.
  4. Sensitive-topic test suite: every program's Support agent must correctly route mental health, accommodation, and Title IX inquiries.
  5. External red-team partnership annually with a vendor that doesn't have our priors - fresh eyes find what we've stopped seeing.
  6. Public bounty channel for university partners to surface failure modes their staff have caught in the wild.
  7. Red-team findings versioned as eval cases - every reproducible failure becomes a permanent test in the golden set.
  8. Pre-deploy red-team gate on high-stakes prompt changes: a small panel signs off on persona + scope changes before production.
  9. Post-incident writeups blamelessly documenting any production failure that bypassed the eval suite, indexed and searchable.
  10. Annual red-team report: what we learned, what got fixed, what failure classes are still unsolved.
👥

Three-Way Chat + Multi-Agent Workflows

Phase 2 · Days 90-180

The Faculty Assist surface - learner + campus/Noodle staff + that staff member's AI assistant in the same thread - works smoothly and scales beyond pilot programs. Multi-step chained agent workflows handle the cases a single prompt can't.

Anchored against JD

Create the learner experiences defined by 'three-way' chat between the learner, a campus or Noodle support staff member, and that staff member's AI assistant. Design prompt architectures for multi-step and chained agent workflows.

Day 1 · Day 180
  1. Define the three-way chat protocol: who speaks when, what context the AI sees, what context it doesn't, escalation triggers.
  2. AI-assist persona for staff: drafts replies, surfaces relevant policies, flags sensitive topics - but the staff member ships the message.
  3. Latency budget for the assist surface: staff are watching live, so the AI has milliseconds before it becomes noticeable.
  4. Audit trail: every staff-AI co-authored message is logged with the AI's draft + the staff edit diff for quality review.
  5. Chained workflows: complex inquiries that require multi-step retrieval (policy lookup + transcript review + financial-aid check) handled by orchestrated sub-agents.
  6. Hand-off design: when a single agent should escalate to the three-way chat vs. directly to a human-only ticket.
  7. Multi-turn memory boundaries: what state persists, what resets, what gets pinned by the staff member.
  8. Cross-program transfer: a learner asking about transferring credits across two programs needs an agent that holds both contexts.
  9. Failure-mode planning: what happens when the staff member is offline mid-thread, when the AI is uncertain, when the learner disengages.
  10. Quarterly three-way chat review with the campus staff who use it daily - usability + trust score + what they'd kill.
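The hand-off question in the list above (stay with the single agent, escalate to the three-way chat, or open a human-only ticket) can be made explicit as a small routing function. Everything in this sketch is a hypothetical policy for illustration: the `Route` values, the topic labels, and the rule that an offline staff member sends the case to a ticket are assumptions, not a described Noodle behavior.

```python
from enum import Enum


class Route(Enum):
    AGENT = "agent"                    # single agent keeps the conversation
    THREE_WAY_CHAT = "three_way_chat"  # learner + staff + staff's AI assistant
    HUMAN_TICKET = "human_ticket"      # asynchronous human-only follow-up


# Sensitive topics named in the plan; label spellings are assumed.
SENSITIVE_TOPICS = {"mental_health", "accommodation", "title_ix"}


def choose_route(topic: str, agent_uncertain: bool, staff_online: bool) -> Route:
    """Pick a hand-off path for one learner inquiry (illustrative policy)."""
    needs_human = topic in SENSITIVE_TOPICS or agent_uncertain
    if not needs_human:
        return Route.AGENT
    # Live staff: bring them into the thread. Otherwise fall back to a ticket
    # so the learner is not left waiting in an empty three-way chat.
    return Route.THREE_WAY_CHAT if staff_online else Route.HUMAN_TICKET


print(choose_route("accommodation", agent_uncertain=False, staff_online=False))
```

Encoding the policy as code makes it testable in the sensitive-topic suite instead of leaving it implicit in prompt wording.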
🧱

Reusable Prompt Components + Internal Standards

Phase 2 · Days 90-180

Prompt patterns, scoring rubrics, fallback templates, and persona scaffolds live in a shared library. New programs onboard in days, not weeks. Internal teams configure their own agents against the same standards.

Anchored against JD

Establish prompt versioning practices and maintain a library of tested, reusable prompt components. Contribute prompt engineering guidelines and best-practices documentation for internal teams who configure their own agents on Noodle's AI orchestration platform.

Day 1 · Day 180
  1. Build the prompt-component library: persona blocks, fallback handlers, escalation routers, citation formatters, tone guards.
  2. Versioning convention: semver for prompts (breaking persona shifts vs. tone refinements vs. typo fixes) and changelogs co-located with the component.
  3. Internal documentation site: prompt patterns, anti-patterns, the eval rubric, the red-team rubric, runbooks.
  4. Self-serve agent configuration playbook: 'I'm a Noodle PM, how do I spin up a new program agent against our standards?'
  5. Lint-style prompt checker: surfaces missing fallback paths, missing scope guards, missing escalation routes before deploy.
  6. Per-component eval suite: every reusable block has its own micro golden set so library upgrades are safe.
  7. Internal office hours for teams configuring their own agents on Noodle's platform - lower the lift for non-engineers.
  8. Library deprecation lifecycle: components don't live forever; sunset path is documented.
  9. Reusable evaluator library too: tone classifiers, hallucination detectors, sentiment graders shared across programs.
  10. Quarterly library review: what got used, what got cloned (signal it should be standardized), what got ignored.
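The semver convention in the list above maps three kinds of prompt change onto the three version parts. A minimal sketch of that mapping, assuming persona shifts are major, tone refinements are minor, and typo fixes are patch; the `Change` enum and `bump` helper are hypothetical names.

```python
from enum import Enum


class Change(Enum):
    PERSONA_SHIFT = "major"    # breaking: the agent's voice or scope changes
    TONE_REFINEMENT = "minor"  # compatible behaviour tweak
    TYPO_FIX = "patch"         # no intended behaviour change


def bump(version: str, change: Change) -> str:
    """Return the next MAJOR.MINOR.PATCH version for a prompt change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change is Change.PERSONA_SHIFT:
        return f"{major + 1}.0.0"
    if change is Change.TONE_REFINEMENT:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


print(bump("1.4.2", Change.PERSONA_SHIFT))    # 2.0.0
print(bump("1.4.2", Change.TONE_REFINEMENT))  # 1.5.0
print(bump("1.4.2", Change.TYPO_FIX))         # 1.4.3
```

Tying the change type to the version part lets a deploy pipeline decide from the version alone how much review a change needs, for example a full eval run plus a red-team sign-off on any major bump.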
🎙️

Voice + Multimodal Extension

Phase 3 · Days 180-365

The same agent stack drives voice interfaces (phone enrollment, study-mode tutoring) and multimodal inputs (a learner pastes a screenshot of an assignment question). The prompt + eval + observability stack carries over.

Anchored against JD

Experience with voice agents or multimodal prompting (preferred).

Day 1 · Day 180
  1. Voice-first prompt patterns: shorter sentences, conversational repair, explicit confirmations on action-taking turns.
  2. Phone-enrollment companion pilot with one program: prospective student calls in, voice agent handles intake + scheduling.
  3. Multimodal inquiry handling: a learner pastes a screenshot of a homework question and the Learning agent uses vision + text together.
  4. Transcript-quality monitoring: the new failure mode is ASR mistakes, not LLM mistakes - separate eval surface for that.
  5. Voice-specific eval rubric: latency under 800ms perceived, no awkward silences, graceful fallback when the user goes off-script.
  6. Per-program persona-as-voice: the persona is also a voice - tone, pace, warmth - chosen with the program lead.
  7. Multimodal red-team: jailbreak via image, prompt-injection via screenshot text, all the new attack surfaces.
  8. Latency engineering: streaming partial responses for voice, smart preloading of likely follow-ups.
  9. Cross-modal hand-offs: voice agent escalates to a chat thread with context preserved.
  10. Annual review of the voice + multimodal surfaces - is the same agent + eval stack still the right architecture, or do they fork.
The dashboard you're looking at is a working stab at the first pillar (Prompt Portfolio Build-Out), built to make this concrete instead of abstract. The other 6 pillars are draft thinking; I'd expect them to be reshuffled by week 2 once we actually talk.
Student Journey Co-Pilot for Noodle · built by Sterling Mull · Claude Sonnet 4.6 · Vercel AI SDK