The Science Library

Why the AI actually works.

We make strong claims about retention, mastery, and learning outcomes. Every claim ties to peer-reviewed research — and we link to it. This is the work behind the engine.

7 pillars of learning science · 40+ cited papers · Updated May 2026 · Open peer review welcomed
The Seven Pillars

Seven principles. One adaptive engine.

Each pillar is a body of cognitive-science research. Each one shows up in a specific AI capability. Click any pillar to read the deep dive — including the studies, the math, and the open questions.

Pillar 01 · Featured

Memory Science & Spaced Repetition

Why scheduling reviews right before forgetting improves long-term retention by 30–40% over cramming, and why the AI Memory Coach is so different from a content calendar.

Powers · AI Memory Coach
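
To make "right before forgetting" concrete, here is a minimal Python sketch. It assumes a simple exponential forgetting curve, recall = exp(-t / S), where S is a memory "stability" in days, plus a hypothetical 90% recall target and a fixed stability growth factor after each review. These names and numbers are illustrative assumptions, not the AI Memory Coach's actual model.

```python
import math


def predicted_recall(days_since_review: float, stability_days: float) -> float:
    """Exponential forgetting curve: probability of recall after a delay."""
    return math.exp(-days_since_review / stability_days)


def next_review_in_days(stability_days: float, target_recall: float = 0.9) -> float:
    """Delay at which predicted recall falls to the target threshold."""
    if not 0.0 < target_recall < 1.0:
        raise ValueError("target_recall must be strictly between 0 and 1")
    return -stability_days * math.log(target_recall)


def review_schedule(initial_stability: float, growth: float, reviews: int,
                    target_recall: float = 0.9) -> list[float]:
    """Days on which each review falls.

    Assumes stability is multiplied by `growth` after every review
    (an illustrative assumption, not a fitted parameter).
    """
    day, stability, schedule = 0.0, initial_stability, []
    for _ in range(reviews):
        day += next_review_in_days(stability, target_recall)
        schedule.append(round(day, 2))
        stability *= growth
    return schedule


if __name__ == "__main__":
    # Gaps between reviews widen as the memory becomes more stable.
    print(review_schedule(initial_stability=10.0, growth=2.0, reviews=4))
```

Under these assumptions the gaps between reviews grow each time, which is the practical difference between a spaced schedule and a fixed content calendar.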
Pillar 02

Adaptive Diagnostic & Item Response Theory

How about 24 questions can pinpoint a learner’s ability per concept, and why this saves 85% of the “review the basics” time most courses waste.

Powers · AI Skill Diagnostic
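
As a rough illustration of how an adaptive diagnostic picks its next question, the sketch below uses the standard two-parameter logistic (2PL) IRT model and chooses the unasked item with the highest Fisher information at the current ability estimate. The `Item` class, the item pool, and the function names are hypothetical; the AI Skill Diagnostic's actual model and selection rule may differ.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    item_id: str
    discrimination: float  # "a" parameter: how sharply the item separates abilities
    difficulty: float      # "b" parameter: ability at which P(correct) = 0.5


def p_correct(theta: float, item: Item) -> float:
    """2PL probability that a learner with ability theta answers correctly."""
    return 1.0 / (1.0 + math.exp(-item.discrimination * (theta - item.difficulty)))


def information(theta: float, item: Item) -> float:
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = p_correct(theta, item)
    return item.discrimination ** 2 * p * (1.0 - p)


def pick_next_item(theta: float, pool: list[Item], asked: set[str]) -> Item:
    """Choose the unasked item that is most informative at the current estimate."""
    candidates = [item for item in pool if item.item_id not in asked]
    if not candidates:
        raise ValueError("no unasked items left in the pool")
    return max(candidates, key=lambda item: information(theta, item))


if __name__ == "__main__":
    pool = [Item("easy", 1.0, -2.0), Item("mid", 1.0, 0.0), Item("hard", 1.0, 2.0)]
    print(pick_next_item(theta=0.0, pool=pool, asked=set()).item_id)
```

Items whose difficulty sits near the learner's current estimate carry the most information, which is why a well-chosen short test can be as precise as a much longer fixed one.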
Pillar 03

Confusion Pairs & Interleaving

The 43% retention boost from mixing related-but-distinct concepts, and why blocked practice (do all the mitosis questions, then all the meiosis questions) is the most common mistake in curriculum design.

Powers · AI Confusion Detector
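
The difference between blocked and interleaved practice can be shown with a small Python sketch. It takes questions grouped by concept and deals them out round-robin so that neighbouring questions come from different concepts. The function name and question IDs are hypothetical, and a real scheduler would likely randomize and weight by confusion rather than use a fixed rotation.

```python
from itertools import zip_longest


def interleave(questions_by_concept: dict[str, list[str]]) -> list[str]:
    """Round-robin across concepts instead of finishing one concept at a time."""
    rounds = zip_longest(*questions_by_concept.values())
    # zip_longest pads shorter lists with None; drop the padding.
    return [q for round_ in rounds for q in round_ if q is not None]


if __name__ == "__main__":
    blocked = {"mitosis": ["m1", "m2"], "meiosis": ["e1", "e2"]}
    print(interleave(blocked))
```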
Pillar 04

Misconception Repair & Conceptual Change

Why a wrong answer usually reflects a coherent-but-broken mental model, and why simply re-teaching the topic almost never fixes it.

Powers · AI Misconception Repair
Pillar 05

Confidence Calibration & Brier Scoring

The biggest learning gap isn’t knowledge; it’s calibration. Why learners who rate their own confidence honestly outperform those who don’t on every downstream metric.

Powers · AI Confidence Coach
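
For readers unfamiliar with the Brier score, the sketch below computes it for binary outcomes: the mean squared difference between stated confidence (0 to 1) and what actually happened (1 for correct, 0 for incorrect). Lower is better. The function name and sample data are illustrative only.

```python
def brier_score(confidences: list[float], outcomes: list[bool]) -> float:
    """Mean squared gap between stated confidence and the actual result."""
    if not confidences or len(confidences) != len(outcomes):
        raise ValueError("need equally sized, non-empty lists")
    squared_gaps = [(c - float(o)) ** 2 for c, o in zip(confidences, outcomes)]
    return sum(squared_gaps) / len(squared_gaps)


if __name__ == "__main__":
    results = [True, False, True, False]  # learner got half the answers right
    overconfident = brier_score([0.9, 0.9, 0.9, 0.9], results)
    calibrated = brier_score([0.5, 0.5, 0.5, 0.5], results)
    print(overconfident, calibrated)
```

In this made-up example, both learners answer half the questions correctly, but the one who claims 90% confidence every time scores worse than the one who claims 50%.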
Pillar 06

Socratic Coaching & Bloom’s Two Sigma

Bloom’s 1984 finding that one-to-one tutoring produces outcomes two standard deviations (2σ) better than classroom teaching, and how the AI Tutor approaches that effect at scale by asking the right next question.

Powers · AI Tutor
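
To put a 2σ effect in more familiar terms, the short sketch below converts an effect size in standard deviations into a percentile. It assumes scores in both groups are normally distributed with equal spread, which is a simplification; the function name is hypothetical.

```python
from statistics import NormalDist


def percentile_of_shifted_mean(effect_size_sd: float) -> float:
    """Share of the comparison group scoring below the average member of a
    group whose mean is shifted by `effect_size_sd` standard deviations."""
    return NormalDist().cdf(effect_size_sd)


if __name__ == "__main__":
    # Under the normality assumption, a 2-sigma shift puts the average
    # tutored student at roughly the 98th percentile of the classroom group.
    print(round(percentile_of_shifted_mean(2.0) * 100, 1))
```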
Pillar 07

Knowledge Graphs & Transfer

Why understanding the structure between concepts — prerequisites, look-alikes, applications — predicts transfer to new problems better than mastery of any single concept.

Powers · AI Knowledge Map
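
One way to picture "structure between concepts" is a prerequisite graph. The sketch below uses Python's standard `graphlib` module to derive a valid study order and to list which concepts a learner is ready for next. The concept names and prerequisite edges are made up for illustration and do not describe the AI Knowledge Map's actual data.

```python
from graphlib import TopologicalSorter


def study_order(prerequisites: dict[str, set[str]]) -> list[str]:
    """Return concepts in an order where every prerequisite comes first."""
    return list(TopologicalSorter(prerequisites).static_order())


def ready_to_learn(prerequisites: dict[str, set[str]], mastered: set[str]) -> set[str]:
    """Concepts not yet mastered whose prerequisites are all mastered."""
    return {
        concept
        for concept, needed in prerequisites.items()
        if concept not in mastered and needed <= mastered
    }


if __name__ == "__main__":
    # Hypothetical graph: each concept maps to the concepts it depends on.
    graph = {
        "meiosis": {"mitosis"},
        "mitosis": {"cell cycle"},
        "cell cycle": set(),
    }
    print(study_order(graph))
    print(ready_to_learn(graph, mastered={"cell cycle"}))
```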
Science ↔ Product

Every science principle maps to a working AI feature.

Below: the claim, the underlying body of research, and the AI capability that operationalizes it. Each row is also the spine of one Science deep-dive page.

Science principle | What it means in practice | AI capability
Spaced repetition | Reviewing right before forgetting boosts long-term retention 30–40%. Cramming is the most common form of wasted effort in learning. | AI Memory Coach
Item Response Theory | Each question carries different information about a learner’s ability. Selecting the next question to maximize that information lets us place learners precisely in ~24 items. | AI Skill Diagnostic
Interleaving | Mixing related concepts in the same session improves discrimination: learners actually learn to tell the look-alikes apart, not just memorize each. | AI Confusion Detector
Conceptual change | Wrong answers usually come from coherent (but wrong) mental models. Re-teaching the topic doesn’t fix it. Targeting the specific misconception does. | AI Misconception Repair
Metacognitive calibration | Learners who can honestly rate their own confidence (low Brier score) make better study decisions, recover from errors faster, and transfer skills better. | AI Confidence Coach
Tutoring effectiveness | Bloom (1984) found one-to-one tutoring produced 2σ improvement vs classroom instruction. Socratic prompting captures much of the effect. | AI Tutor
Knowledge graph & transfer | Mastery of isolated facts doesn’t transfer. Mastery of the structure between facts (what depends on what) does. | AI Knowledge Map
Open Questions

What we don’t yet know.

Pretending we have all the answers would make this section dishonest. Below are the four questions our research team is actively working on, with academic collaborators. We publish updates here as we learn.

  1. How do AI Tutors compare to expert human tutors on transfer tasks?

    Socratic prompting captures much of Bloom’s two-sigma effect on retention, but does it transfer to novel problems the way a great human tutor does? We are currently piloting a study with three universities.

  2. Does AI-generated misconception detection match expert teacher diagnosis?

    Our misconception model is trained on patterns. Experienced teachers diagnose misconceptions through dialogue. We’re studying agreement rates and where the AI is systematically wrong.

  3. What’s the optimal confidence-calibration training regimen?

    Brier score improves with practice, but the dose-response curve isn’t well-documented. We’re running A/B tests on calibration prompt frequency and feedback format.

  4. How does memory decay differ across languages and cultures?

    Most spaced-repetition research is in English, with North American or European participants. We’re partnering with a state education department to study optimal review schedules for Hindi-medium learners.

Endorsements

From researchers who’ve reviewed the work.

“The integration of IRT-based adaptive testing with spaced-repetition scheduling is among the most rigorous I’ve seen in commercial learning platforms.”
Cognitive scientist · R1 research university

“Most edtech companies cite the science as a marketing veneer. Future Proof is one of the few I’ve audited where the engine actually implements what they claim.”
Learning scientist · Independent reviewer

Read the deep dive

Start with the most-cited pillar.

The Memory Science deep dive — the research, the math, and the practical takeaway — is the most-read page in this library.
