NotebookLM capabilities — a deep-dive inventory
Snapshot Briefing
This page maps the complete capability set: retrieval-augmented reasoning, summarisation, audio generation, citation mechanics, multi-source synthesis, multilingual support, sharing, and export. Each capability is described technically and contextually.
Describing the tool as "an AI research assistant" covers the marketing but obscures the engineering. What the assistant actually does is run a three-phase pipeline — index, retrieve, generate — inside a boundary defined by the sources you uploaded. Every capability listed below is a variation on how that pipeline is configured and what it produces. Understanding the pipeline makes it easier to predict when the tool will perform well and when it will need human oversight.
The boundary is the key concept. Unlike an open-domain chatbot, this research notebook refuses to answer from general knowledge when grounding is active. If the answer is not in your sources, the tool says so. That refusal is not a weakness — it is the mechanism that makes the output auditable. A lawyer or a scientist cannot use "a chatbot told me" as a source; they can use "page 14 of this paper, as cited by the notebook." The grounding boundary is what enables the latter.
Retrieval-augmented reasoning
When you submit a query, the tool converts it into an embedding and searches the indexed source corpus for passages with high semantic similarity. It retrieves a ranked set of passages — typically the top ten to twenty, weighted by relevance — and passes them to the Gemini generation model alongside the original query. The model composes an answer using only those passages as evidence, citing each one. This retrieve-then-generate pattern is known as retrieval-augmented generation (RAG) and it is the foundation of every capability in the tool.
Reasoning quality depends on what the retrieved passages contain. For factual queries with a clear answer in the sources, accuracy is high. For complex inferential questions — "what does the combination of sources A and B imply about the mechanism described in source C?" — the model synthesises across the retrieved set, which works well when the relevant passages are semantically similar enough to land in the same retrieval batch.
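NotebookLM's internals are not public, so the retrieve-then-generate pattern described above can only be sketched. The following minimal Python sketch uses invented three-dimensional vectors in place of real learned embeddings, and a similarity threshold to model grounded abstention; the passage IDs, texts, and threshold value are all illustrative assumptions, not the tool's actual behaviour.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = lambda v: math.sqrt(sum(x * x for x in v))
    return dot / (norm(a) * norm(b))

# Toy corpus: each passage carries a precomputed embedding. A real system
# would use a learned embedding model; these 3-d vectors are illustrative.
corpus = [
    {"id": "A:p1", "text": "Inflation targeting anchors expectations.", "emb": [0.9, 0.1, 0.0]},
    {"id": "B:p4", "text": "Central banks publish an explicit target.", "emb": [0.8, 0.2, 0.1]},
    {"id": "C:p2", "text": "Photosynthesis converts light to energy.",  "emb": [0.0, 0.1, 0.9]},
]

def retrieve(query_emb, k=2, threshold=0.5):
    """Rank passages by similarity; return an empty list (abstain) if
    nothing in the corpus clears the grounding threshold."""
    ranked = sorted(corpus, key=lambda p: cosine(query_emb, p["emb"]), reverse=True)
    return [p for p in ranked[:k] if cosine(query_emb, p["emb"]) >= threshold]

# Embedding of a query like "how does inflation targeting work?" (invented).
passages = retrieve([0.85, 0.15, 0.05])
print([p["id"] for p in passages])
# A generation model would now compose a cited answer from these passages only.
```

An off-topic query whose embedding sits far from every passage returns an empty list, which is the sketch's stand-in for the "I could not find this in your sources" response.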
Summarisation
The tool summarises at several levels of granularity: a single source, a selected subset of sources, or the full notebook corpus. Summaries are generated via the notes studio and are grounded in the same retrieval pipeline as chat answers. Each paragraph in a generated summary links back to the passage that motivated it. The tool also performs implicit summarisation in every chat answer — composing a response that synthesises multiple retrieved passages into a single coherent paragraph.
Audio generation
The audio overview capability transforms the notebook into a scripted dialogue between two AI hosts. This is not a text-to-speech pass over a written summary — the tool generates a separate dialogue script optimised for audio comprehension, with question-and-answer structure, repetition for clarity, and pacing that suits listening rather than reading. Four modes (Brief, Standard, Deep Dive, Critique) target different audience needs. The audio overviews page covers this in full detail.
Citation mechanics
Every generated output — chat answer, briefing doc, study guide, audio overview script — includes citations. In chat and notes the citations are inline: a superscript or footnote-style marker that, when clicked, opens the source pane at the highlighted passage. The citation points to a specific paragraph or sentence, not to the document as a whole. This granularity matters: for a 300-page PDF, "see this document" is not useful; "see paragraph 4 on page 47" is actionable.
Citation accuracy is generally high for text-layer PDFs and Google Docs. It degrades somewhat for scanned PDFs (OCR artefacts can shift paragraph boundaries) and for audio-transcribed sources, where timestamps approximate text positions rather than mark them exactly. The tool will also occasionally produce a citation that points to an adjacent passage rather than the most precise one, a known limitation of retrieval systems that chunk documents into fixed-size windows.
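The fixed-size-window limitation can be made concrete with a small sketch. The chunker below splits by character count with overlap; this is a simplification (real systems typically chunk by tokens and may align to sentence boundaries), and the window and overlap sizes are invented for illustration. Because a sentence can straddle two windows, either chunk can be the one retrieved, which is how an adjacent-passage citation arises.

```python
def chunk(text, window=80, overlap=20):
    """Split text into fixed-size character windows with overlap
    (simplified; real systems chunk by tokens, not characters)."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append((start, text[start:start + window]))
        start += window - overlap
    return chunks

doc = ("Paragraph one discusses inflation targeting in detail. "
       "Paragraph two pivots to exchange-rate policy and reserves. "
       "Paragraph three returns to inflation expectations.")

for offset, piece in chunk(doc):
    print(offset, repr(piece))
# Note how window boundaries ignore paragraph boundaries: text near a
# boundary appears in two chunks, so a citation may highlight the chunk
# next to the most precise one.
```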
Multi-source synthesis
One of the genuinely distinctive capabilities is cross-document synthesis. A query like "how do sources A, B, and C differ on the question of inflation targeting?" will retrieve the most relevant passages from all three documents and compose a comparative answer with per-source citations. The model is also designed to surface disagreements rather than paper over them — if two sources contradict each other, the answer will note the contradiction rather than silently choosing one view.
Multi-source synthesis is also what powers the audio overview feature at the notebook level: the overview is a synthesised narrative across the entire corpus, not a per-document summary stitched together. That is why overviews from notebooks with ten diverse sources tend to be more interesting than overviews from notebooks with ten near-identical sources.
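One way to picture the comparative-answer step is as a grouping pass over retrieval results before generation. The sketch below is a guess at the shape of that step, not NotebookLM's actual pipeline: the source IDs, passages, and citation labels are all invented, and real contradiction surfacing happens inside the generation model rather than in a grouping function.

```python
from collections import defaultdict

# Hypothetical retrieval results for a query like "how do the sources
# differ on inflation targeting?" All values are invented.
retrieved = [
    {"source": "A", "passage": "Targeting stabilises expectations.",   "cite": "A p.3"},
    {"source": "B", "passage": "Targets are too rigid in crises.",     "cite": "B p.12"},
    {"source": "A", "passage": "Credibility requires a public target.", "cite": "A p.7"},
    {"source": "C", "passage": "Evidence on targeting is mixed.",      "cite": "C p.2"},
]

def group_by_source(passages):
    """Bucket retrieved evidence per source so a generator can compose a
    side-by-side comparison with per-source citations."""
    grouped = defaultdict(list)
    for p in passages:
        grouped[p["source"]].append((p["passage"], p["cite"]))
    return dict(grouped)

evidence = group_by_source(retrieved)
for source in sorted(evidence):
    print(source, "->", [cite for _, cite in evidence[source]])
```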
Language support
Source ingestion is language-agnostic. Chat and note output follow the prompt language. Cross-language operation — French sources, English questions — works well because the retrieval step operates on multilingual embeddings that map semantically equivalent phrases across languages to similar vector positions. Audio overviews added nine language-specific host pairs in 2025; the output language is chosen independently of the source language at generation time.
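The multilingual-embedding claim can be illustrated with toy numbers. The vectors below are invented stand-ins for a real multilingual embedding model; the point they demonstrate is only the geometry, namely that semantically equivalent phrases in different languages sit close together while unrelated concepts sit far apart, so an English query ranks a French passage highly.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = lambda v: math.sqrt(sum(x * x for x in v))
    return dot / (norm(a) * norm(b))

# Invented 3-d vectors standing in for multilingual embeddings.
emb = {
    "inflation targeting":    [0.90, 0.10, 0.10],  # English
    "ciblage de l'inflation": [0.88, 0.12, 0.10],  # French equivalent
    "photosynthesis":         [0.05, 0.10, 0.95],  # unrelated concept
}

en_fr = cosine(emb["inflation targeting"], emb["ciblage de l'inflation"])
en_other = cosine(emb["inflation targeting"], emb["photosynthesis"])
print(f"EN<->FR similarity: {en_fr:.2f}, EN<->unrelated: {en_other:.2f}")
# The English query retrieves the French passage because they are near
# neighbours in embedding space, regardless of surface language.
```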
Sharing and collaboration
Sharing is a capability in the same sense that retrieval is: it extends what the tool can do to additional people. A shared notebook makes the indexed corpus and all saved notes available to collaborators according to their assigned roles. Viewers can run queries and listen to overviews without editing. Editors can add sources and generate notes. The Plus tier adds analytics and retention policies. Read-only public links allow broader distribution: a researcher can share a notebook publicly so anyone can query it without signing in.
Export
The tool exports notes to Google Docs and copies them as Markdown. Audio overviews download as MP3. These export paths are the tool's interface to the outside world — the mechanism by which outputs leave the notebook and enter other workflows. Export does not include the source index itself; you cannot export the full notebook as a portable file. What exports is the generated content.
Capability matrix
| Capability | How it works | Notes |
|---|---|---|
| Grounded Q&A | RAG: retrieve relevant passages, generate cited answer | Will abstain if answer not in sources |
| Summarisation | Notes studio generates grounded prose summaries | Supports single source or full corpus |
| Audio overview | Generates dialogue script, renders with voice models | 4 modes; 9 languages; MP3 download |
| Citation | Inline passage-level links from output back to source | Accuracy varies for scanned PDFs |
| Multi-source synthesis | Retrieval across all indexed sources in one query | Surfaces contradictions between sources |
| Language support | Multilingual embeddings; prompt-language output | Cross-language queries supported |
| Sharing | Role-based notebook access; public read-only links | Analytics on Plus tier |
| Export | Google Docs, Markdown, MP3 | Index not exportable |
| Mind Map | Visual concept graph of source relationships | Plus tier only |
| Abstention | Declines to answer when evidence absent | Core grounding guarantee |
Ramona C. Ashcroft-Mbeki, Producer at Gulden Research Collective in Cape Town, put the value succinctly: "The multi-source synthesis is the capability that changed our workflow most. We can now load a full conference's papers into a notebook and ask a single question that draws on twenty of them simultaneously. Six months ago that would have taken two researchers a full day."
For a broader policy-level analysis of how retrieval-augmented tools fit responsible-AI frameworks, the OECD AI policy observatory publishes updated guidance that covers grounding, citation, and auditability requirements for AI-assisted knowledge work.
Capabilities questions
Technical and practical questions about what the tool can and cannot do.
What reasoning capabilities does the tool have?
The tool performs retrieval-augmented reasoning entirely within the boundary of the uploaded sources. It synthesises claims across multiple documents, identifies contradictions between sources, and draws inferences from combined evidence. It does not reason from general world knowledge — if the evidence is not in the corpus, it says so.
Can it summarise across multiple sources at once?
Yes. Multi-source synthesis is a central capability. A single chat query or note generation request retrieves passages from all indexed sources simultaneously and composes a unified, cited output. The model is designed to surface where sources disagree rather than silently averaging their positions.
How accurate are the citations?
Citation accuracy is high for text-layer PDFs and Google Docs, somewhat lower for scanned PDFs and audio-transcribed sources. The tool links to the specific paragraph or sentence it drew from, not the document as a whole. Occasional off-by-one passage citations are the most common error type — the cited passage is adjacent to the relevant one rather than exactly on it.
What does the tool do when it cannot find an answer?
It abstains. The grounding guarantee means the tool will not confabulate an answer from general knowledge when a query falls outside the source corpus. You will get an explicit "I could not find this in your sources" response rather than a confident but ungrounded answer. This is intentional and is the primary reason the tool is trusted in professional research contexts.
See the capabilities in action
Load a mixed set of sources and try a cross-document query. The synthesis and citation behaviour described on this page becomes immediately tangible when you watch it produce a cited answer from three documents at once.
Follow the first-notebook walkthrough
Capabilities in the wider site context
Each capability described on this page has a dedicated treatment elsewhere on the site. The chat mode page goes deep on citation mechanics and the retrieval pipeline. The audio overviews page covers the dialogue-generation process, mode options, and language support. The notes studio page explains the five note types and both export paths. The sources guide covers the input formats and indexing behaviour that all downstream capabilities depend on.
For the model architecture behind these capabilities, the Gemini and NotebookLM page explains the long-context Gemini models and the RAG pipeline in more technical terms. The features overview provides the top-level map. Teams evaluating the tool for sensitive research should also read the data and privacy page, which covers what the indexing step does and does not retain, and the pricing page for the tier-specific caps on source volume and audio generation.