[GH-ISSUE #750] Tutorial proposal: wiki-first chat memory (LLM Wiki pattern, with Beever Atlas as reference) #3181

GiteaMirror · 2026-05-02T01:52:42-05:00

Originally created by @jhkchan on GitHub (Apr 27, 2026).
Original GitHub issue: https://github.com/Shubhamsaboo/awesome-llm-apps/issues/750

Hi Shubham — fan of the cookbook approach. PR #749 (Karpathy LLM Wiki tutorial) was the closest precedent I found to what I want to ask about; happy to take guidance on whether a different shape would land better.

## The pattern

Karpathy's LLM Wiki gist defines the *what*: three layers (Sources → Wiki → Schema) and three operations (Ingest, Query, Lint). What's missing from the public template space is a runnable implementation of the *what* applied to **conversational corpora** (Slack, Discord, Teams), where chunk-first RAG demonstrably fails.
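To make the layers and operations concrete, here is a minimal in-memory sketch of one possible reading of the pattern. It is not taken from the gist or from Beever Atlas: `MiniWiki`, `Page`, and the `MIN_CLAIMS` schema rule are hypothetical names, and the caller supplies the topic and claim that an LLM would normally produce.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """One wiki page: a topic plus claims that each cite a source id."""
    topic: str
    claims: list[tuple[str, str]] = field(default_factory=list)  # (claim, source id)


class MiniWiki:
    # Hypothetical schema rule: every page must hold at least this many claims.
    MIN_CLAIMS = 1

    def __init__(self) -> None:
        self.sources: dict[str, str] = {}  # Sources layer: raw text, never edited
        self.pages: dict[str, Page] = {}   # Wiki layer: pages derived from sources

    def ingest(self, source_id: str, text: str, topic: str, claim: str) -> None:
        """Store the raw source, then file one claim under a topic page.

        Assumption: a real system would ask an LLM for the topic and claim;
        here the caller passes both so the sketch stays deterministic.
        """
        self.sources[source_id] = text
        page = self.pages.setdefault(topic, Page(topic))
        page.claims.append((claim, source_id))

    def query(self, topic: str) -> list[str]:
        """Answer from the wiki layer, with a citation on every claim."""
        page = self.pages.get(topic)
        if page is None:
            return []
        return [f"{claim} [{source_id}]" for claim, source_id in page.claims]

    def lint(self) -> list[str]:
        """Report pages that break the schema rule or cite unknown sources."""
        problems = []
        for page in self.pages.values():
            if len(page.claims) < self.MIN_CLAIMS:
                problems.append(f"{page.topic}: too few claims")
            for _, source_id in page.claims:
                if source_id not in self.sources:
                    problems.append(f"{page.topic}: unknown source {source_id}")
        return problems


wiki = MiniWiki()
wiki.ingest("msg-1", "alice: we ship v2 on Friday", "release", "v2 ships on Friday")
print(wiki.query("release"))  # ['v2 ships on Friday [msg-1]']
print(wiki.lint())            # []
```

The point of the sketch is the separation: queries read only the wiki layer, while lint checks that every claim still traces back to a stored source.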

Over the last six months we built and open-sourced (Apache 2.0) [Beever Atlas](https://github.com/Beever-AI/beever-atlas) as a production implementation of that pattern, with:

- A six-stage extraction pipeline (preprocessor → fact extractor → entity extractor → cross-batch validator → relationship graph → persister), written in Google ADK
- Dual memory: Weaviate for semantic, Neo4j for graph, with an LLM-classifier query router picking semantic / graph / both per question
- A 16-tool MCP server that exposes the same memory to Claude Code and Cursor
- `make demo` → docker-compose stack pre-loaded with a public Wikipedia corpus → grounded answers with citations in <5 minutes
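As a rough illustration of the query-router idea in the list above, the sketch below picks semantic / graph / both per question. It is an assumption-laden stand-in, not Beever Atlas code: `keyword_classifier` replaces the LLM classifier with a word-list heuristic, and `Route`, `route_question`, and both word sets are hypothetical names.

```python
import re
from enum import Enum
from typing import Callable


class Route(Enum):
    SEMANTIC = "semantic"
    GRAPH = "graph"
    BOTH = "both"


# Hypothetical cue words; a real router would ask an LLM instead.
RELATIONAL = {"who", "owns", "related", "depends", "between"}
TOPICAL = {"what", "why", "how", "summarize"}


def keyword_classifier(question: str) -> str:
    """Stand-in for the LLM classifier: returns a route label as text."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    relational = bool(words & RELATIONAL)
    topical = bool(words & TOPICAL)
    if relational and topical:
        return "both"
    return "graph" if relational else "semantic"


def route_question(
    question: str, classify: Callable[[str], str] = keyword_classifier
) -> Route:
    """Map the classifier's label to a Route; unknown labels fall back to BOTH."""
    label = classify(question).strip().lower()
    try:
        return Route(label)
    except ValueError:
        return Route.BOTH


print(route_question("Who owns the billing service?"))      # Route.GRAPH
print(route_question("What did we decide about caching?"))  # Route.SEMANTIC
print(route_question("Who decided what about caching?"))    # Route.BOTH
```

Falling back to `BOTH` on an unrecognized label is a design choice for the sketch: querying both stores costs more, but a malformed classifier reply never silently drops a store.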

## What I'm proposing

A self-contained tutorial under `advanced_llm_apps/llm_apps_with_memory_tutorials/` (or wherever fits the structure) that distills the pattern down to a runnable template:

- Single file or small directory, not the full Beever Atlas codebase
- Chat-corpus → fact extraction → dual memory → cited Q&A
- Provider-agnostic (Claude / Gemini / OpenAI swap, matching your stack norm)
- ~300-line implementation, runnable in 3 commands
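To give a feel for the shape, here is a rough sketch of the chat-corpus → fact extraction → cited Q&A flow with a swappable provider. Everything in it is illustrative: `LLM`, `EchoLLM`, `Message`, `Fact`, `extract_facts`, and `cited_answer` are hypothetical names, the dual memory is collapsed into one in-memory list, and retrieval is plain word overlap rather than vector or graph search.

```python
from dataclasses import dataclass
from typing import Protocol


class LLM(Protocol):
    """Provider seam: any object with complete(prompt) -> str fits.

    Assumption: a Claude, Gemini, or OpenAI adapter would wrap its SDK here.
    """

    def complete(self, prompt: str) -> str: ...


@dataclass(frozen=True)
class Message:
    msg_id: str
    author: str
    text: str


@dataclass(frozen=True)
class Fact:
    text: str
    source_id: str  # id of the message the fact came from, used as the citation


class EchoLLM:
    """Offline stand-in so the sketch runs without API keys.

    It returns the last line of the prompt instead of calling a model.
    """

    def complete(self, prompt: str) -> str:
        return prompt.splitlines()[-1]


def extract_facts(llm: LLM, messages: list[Message]) -> list[Fact]:
    """Ask the provider for one fact per message and keep the source id."""
    facts = []
    for m in messages:
        prompt = f"State the single fact in this chat message.\n{m.author}: {m.text}"
        facts.append(Fact(llm.complete(prompt), m.msg_id))
    return facts


def cited_answer(question: str, facts: list[Fact]) -> str:
    """Naive retrieval: keep facts that share a word with the question."""
    words = set(question.lower().split())
    hits = [f for f in facts if words & set(f.text.lower().split())]
    if not hits:
        return "No grounded answer found."
    return " ".join(f"{f.text} [{f.source_id}]" for f in hits)


messages = [
    Message("m1", "alice", "deploys moved to Tuesday"),
    Message("m2", "bob", "lunch is pizza"),
]
facts = extract_facts(EchoLLM(), messages)
print(cited_answer("when are deploys", facts))  # alice: deploys moved to Tuesday [m1]
```

Swapping providers then means passing a different object to `extract_facts`; nothing downstream of the `LLM` protocol changes.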

The Beever Atlas codebase would be a reference; the tutorial itself would be original code, written to your style (one-file-runnable, end-to-end tested, matching the existing `karpathy_llm_wiki/` and `rag_tutorials/` shapes).

Happy to be told this isn't the right fit, or to adjust the scope. I'd rather contribute something that fits your editorial standard than push a self-serving project link.

— Jacky (maintainer disclosure: I'm a Beever Atlas maintainer; Beever AI Limited, Toronto)

Reference: github-starred/awesome-llm-apps#3181