How to Batch-Generate 100 Vocabulary Decks with the 2Slides API (Content Factory Playbook for 2026)
2Slides Team
15 min read


Once you've validated the manual workflow (generate one vocabulary deck, narrate it, export the assets), the next bottleneck is volume. A language school with 12 levels and 30 weekly themes needs 360 decks a year. A faceless TikTok channel posting daily needs 365 decks plus the variant aspect ratios. A content team at an EdTech company needs hundreds of decks segmented by L1/L2 pairs.

You don't build 360 decks by hand. You build a content factory.

This guide is the practical 2026 playbook for batch-generating vocabulary decks (and any other slide content) with the 2Slides API. The single most important architectural decision, and the one most often gotten wrong, is picking the right generation endpoint.

Pick the right endpoint first (this is where most factories break)

2Slides exposes two distinct generation flows over the API. Only one of them produces decks that can subsequently be narrated.

| Endpoint | What it produces | Narration possible? | Credits |
| --- | --- | --- | --- |
| POST /api/v1/slides/generate | Fast PPT: template-driven PPTX. Requires a themeId from the templates library. | ❌ No. The narration endpoint explicitly rejects jobs created here. | 10 / page |
| POST /api/v1/slides/create-pdf-slides | Nano Banana: image-generated slides from a text prompt. Same engine as Workspace. | ✅ Yes | 10 (planning) + 100 / slide (1K/2K) or 200 / slide (4K) |
| POST /api/v1/slides/create-like-this | Nano Banana: image-generated slides matching a reference image. | ✅ Yes | Same as above |

For a vocabulary-card content factory with narration and exportable audio, use create-pdf-slides (or create-like-this if you have a reference layout). Do not use /api/v1/slides/generate: that's the Fast PPT endpoint, and you cannot add narration to it.

If your factory only needs silent PPTX (no audio, no video), Fast PPT via /api/v1/slides/generate is the cheapest path. The rest of this playbook assumes the narrated workflow.

The architecture in one diagram

[Source data]        [Orchestrator]              [2Slides API]                              [Outputs]
      │                    │                           │                                       │
 vocabulary ──prompt──▶ job queue ────POST────▶ /api/v1/slides/create-pdf-slides ────────▶ jobId (UUID)
 spreadsheet           (cron/script)
                           │
                           ├── poll every 20-30s ──GET──▶ /api/v1/jobs/{jobId} ... status: success
                           │                                        │
                           │                                        ▼
                           │                 [pages: slide PNGs on R2 · downloadUrl: PDF]
                           │
                           ├────POST────▶ /api/v1/slides/generate-narration (jobId, voice, mode, ...; async only)
                           ├── poll ──GET──▶ /api/v1/jobs/{jobId}
                           │        message: "Voice narration generation in progress" → success
                           │
                           └────POST────▶ /api/v1/slides/download-slides-pages-voices (free; returns ZIP)
                                                                    │
                                                                    ▼
                                          pages/*.png + voices/*.{wav,mp3} + transcript.txt
                                                                    │
                                                                    ▼
                               (optional) compose MP4 client-side with FFmpeg, or use the Workspace UI
                                                                    │
                                                                    ▼
                                          [LMS / TikTok / newsletter / S3]

Source data → orchestrator → API → ZIP of pages + voices → distribution. MP4 composition is optional and is not a public API endpoint as of 2026: it's a Workspace UI feature using FFmpeg.wasm in the browser. The API equivalent is the pages-and-voices ZIP, which you can compose with ffmpeg server-side if you need MP4 in the factory.

Step 1 β€” Design the source schema first

The single highest-leverage move is defining the source data schema before any API call. Decks built from a clean schema are reproducible; decks built from ad-hoc prompts are not.

A vocabulary-deck source row that scales:

deck_id: vocab-b1-travel-2026-w14
source_l1: en            # learner's native language
target_l2: es            # language being learned
cefr_level: B1
theme: travel
words:
  - { word: "boarding pass", ipa: "/ˈbɔːrdɪŋ pæs/", pos: noun, l1: "tarjeta de embarque" }
  - { word: "layover", ipa: "/ˈleɪoʊvər/", pos: noun, l1: "escala" }
  - { word: "to delay", ipa: "/dɪˈleɪ/", pos: verb, l1: "retrasar" }
  # ... 27 more
generation:
  endpoint: create-pdf-slides
  aspect_ratio: "9:16"   # vertical for short-form review
  resolution: "2K"
  page_count: 30
  content_detail: "concise"
narration:
  enabled: true
  voice: "Puck"          # see /tts_sample_voices for the catalog
  mode: "single"
distribution:
  social: [tiktok, reels, shorts]
  newsletter: monday-2026-w14

That object is the unit of work. Everything downstream consumes it.

Build the source schema in whatever you already have: a Google Sheet for non-technical teams, a Postgres table for engineering teams, a CMS with structured fields for content teams. Avoid building it in plain Markdown files: Markdown is fine for human writing but bad for batch automation.
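Wherever the rows live, validate them before they reach the API. A minimal sketch of a pre-flight check, assuming the field names from the schema above (the validator itself is factory-side code, not a 2Slides feature):

```python
# Sketch: reject malformed source rows before any credits are spent.
# Field names come from the deck schema above.
REQUIRED_FIELDS = {"deck_id", "source_l1", "target_l2", "cefr_level", "theme", "words"}

def missing_fields(row: dict) -> list[str]:
    """Return the schema fields a source row is missing (empty list = valid)."""
    return sorted(REQUIRED_FIELDS - row.keys())
```

A row missing its theme or word list fails here, in your orchestrator, instead of producing a half-formed prompt downstream.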

Step 2 β€” Authenticate

Get an API key from the API management page. The format is:

sk-2slides-{64-character-hex-string}

All requests use bearer auth:

Authorization: Bearer sk-2slides-...

Per-endpoint rate limits are documented at 2slides.com/api.md. For batch production:

  • create-pdf-slides and create-like-this: design your queue around their concurrency limits, with exponential backoff on 429
  • jobs/{id} (poll): respect the polling cadence below (20–30s, not aggressive)
  • download-slides-pages-voices: free and faster, but still rate-limited
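The backoff rule can be sketched with the stdlib alone. The retry count and the POST helper below are assumptions, not documented API behavior; only the 1s-doubling-to-60s schedule is from this playbook:

```python
import time
import urllib.error
import urllib.request

def backoff_delays(base=1.0, cap=60.0, retries=6):
    """Exponential backoff schedule: 1s, 2s, 4s, ..., capped at `cap` seconds."""
    delay = base
    for _ in range(retries):
        yield min(delay, cap)
        delay = delay * 2

def post_with_backoff(url: str, body: bytes, api_key: str) -> bytes:
    """POST JSON once per backoff slot, sleeping whenever the API answers 429."""
    headers = {"Authorization": f"Bearer {api_key}",
               "Content-Type": "application/json"}
    for delay in backoff_delays():
        try:
            req = urllib.request.Request(url, data=body, headers=headers)
            with urllib.request.urlopen(req) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            if err.code != 429:
                raise          # real errors propagate; only 429 is retried
            time.sleep(delay)
    raise RuntimeError("still rate-limited after repeated backoff")
```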

Step 3 β€” Submit a Nano Banana generation job

Vocabulary cards work best in async mode (the per-slide image generation takes 1–3 minutes for a 30-card deck).

curl -X POST "https://2slides.com/api/v1/slides/create-pdf-slides" \
  -H "Authorization: Bearer sk-2slides-..." \
  -H "Content-Type: application/json" \
  -d '{
    "userInput": "<your deck-shaped prompt; see Step 4>",
    "responseLanguage": "en",
    "aspectRatio": "9:16",
    "resolution": "2K",
    "page": 30,
    "contentDetail": "concise",
    "mode": "async"
  }'

The response contains the jobId (a UUID). Poll for completion:

curl -X GET "https://2slides.com/api/v1/jobs/{jobId}" \
  -H "Authorization: Bearer sk-2slides-..."

Polling cadence: every 20–30 seconds. Do not poll faster: the API documentation explicitly calls this out, and aggressive polling is the most common cause of 429s. Most decks complete in 1–3 minutes.

When status: "success", the job has slide images stored on R2 and a downloadUrl for a PDF compilation. The slide images themselves are what you'll later combine with audio.
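The submit-then-poll loop can be sketched as follows. The terminal status names beyond "success" are assumptions here, so check them against the jobs endpoint's actual responses before relying on them:

```python
import json
import time
import urllib.request

POLL_INTERVAL_SECONDS = 25   # stays inside the documented 20-30s cadence

def is_terminal(status: str) -> bool:
    """States that end the polling loop. "failed" is an assumed name."""
    return status in ("success", "failed")

def poll_job(job_id: str, api_key: str, timeout: float = 600.0) -> dict:
    """GET /api/v1/jobs/{jobId} every 25s until the job reaches a terminal state."""
    url = f"https://2slides.com/api/v1/jobs/{job_id}"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        req = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {api_key}"})
        with urllib.request.urlopen(req) as resp:
            job = json.load(resp)
        if is_terminal(job.get("status", "")):
            return job
        time.sleep(POLL_INTERVAL_SECONDS)
    raise TimeoutError(f"job {job_id} not finished after {timeout}s")
```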

Step 4 β€” Build prompt templates that hold at scale

The single biggest difference between a flaky factory and a reliable one is prompt templates. Don't write prompts at runtime per deck. Define a template per deck type and substitute values.

Vocabulary deck template (userInput):

Generate a {{cefr_level}}-level vocabulary deck for {{source_l1}}-speaking learners of {{target_l2}}.
Theme: {{theme}}. Number of cards: {{word_count}}.

For each card, output exactly:
- Target word (in {{target_l2}})
- Part of speech
- IPA transcription
- Translation in {{source_l1}}
- Two example sentences in natural {{theme}} context, B1 syntax, 8–14 words each

Words to include:
{{word_list_yaml}}

End with a 3-card recap of the most useful 3 words from the deck.

Visual style is controlled by the designStyle parameter (custom prompt) or left to the default ("clean infographic, no photographs, balanced typography"). Keep prompts versioned in git. When a prompt changes, log the version with each generated deck so you can correlate quality regressions with prompt changes.
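Rendering the template is plain string substitution. A sketch with Python's string.Template, where $-style placeholders stand in for the {{...}} above and the abbreviated template text is illustrative only:

```python
from string import Template

# Abbreviated version of the vocabulary template above, $-style placeholders.
VOCAB_TEMPLATE = Template(
    "Generate a ${cefr_level}-level vocabulary deck for ${source_l1}-speaking "
    "learners of ${target_l2}. Theme: ${theme}. Number of cards: ${word_count}."
)

def render_prompt(deck_row: dict) -> str:
    """Fill the template from one source row. substitute() (not
    safe_substitute) raises KeyError on a missing field, so a half-filled
    prompt can never reach the API."""
    return VOCAB_TEMPLATE.substitute(deck_row)
```

Keeping VOCAB_TEMPLATE as a versioned constant in git is what makes the "log the prompt version with each deck" rule enforceable.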

Step 5 β€” Add narration

Once the generation job is status: "success", kick off narration. Narration is async-only and operates on the same jobId:

curl -X POST "https://2slides.com/api/v1/slides/generate-narration" \
  -H "Authorization: Bearer sk-2slides-..." \
  -H "Content-Type: application/json" \
  -d '{
    "jobId": "550e8400-e29b-41d4-a716-446655440000",
    "mode": "single",
    "voice": "Puck",
    "speakerName": "Vocabulary Coach",
    "contentMode": "concise",
    "includeIntro": true
  }'

Then poll the same /api/v1/jobs/{jobId} until the message transitions from "Voice narration generation in progress" to a success state.

Two voice patterns work well for vocabulary cards:

  • mode: "single" with one voice: straightforward word + IPA + sentence reading
  • mode: "multi" with two voices: example sentences split between speakers, ideal for verbs and idioms

The voice catalog is published at /tts_sample_voices/. Common picks include Puck, Aoede, Charon, Kore. Confirm support with the latest API docs before pinning to a specific voice in production.

Important: this single endpoint generates both the voice text and the voice audio. Do not call separate "voice text" and "voice audio" endpoints; there is no public API for those steps independently. Configure the narration request once and the API does both.

Step 6 β€” Export pages and voices (free)

Once narration completes, retrieve all assets in a single ZIP:

curl -X POST "https://2slides.com/api/v1/slides/download-slides-pages-voices" \
  -H "Authorization: Bearer sk-2slides-..." \
  -H "Content-Type: application/json" \
  -d '{ "jobId": "550e8400-e29b-41d4-a716-446655440000" }'

The response includes a downloadUrl (valid for 1 hour) for a ZIP that contains:

pages/
  page_01.png
  page_02.png
  ...
voices/
  page_01.wav
  page_02.wav
  ...
transcript.txt

This export is free (no credits consumed). Download the ZIP and store the assets in your own object store; the presigned URL expires after 1 hour.
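Fetching and sanity-partitioning the ZIP can be sketched like this; the member-name layout mirrors the listing above, and the helpers are our own factory code:

```python
import io
import urllib.request
import zipfile

def fetch_zip(download_url: str) -> zipfile.ZipFile:
    """Pull the presigned ZIP into memory (the URL dies after ~1 hour)."""
    with urllib.request.urlopen(download_url) as resp:
        return zipfile.ZipFile(io.BytesIO(resp.read()))

def split_members(names: list[str]):
    """Partition ZIP member names into page images, voice files, transcript."""
    pages = sorted(n for n in names if n.startswith("pages/") and n.endswith(".png"))
    voices = sorted(n for n in names if n.startswith("voices/"))
    transcripts = [n for n in names if n == "transcript.txt"]
    return pages, voices, transcripts
```

Sorting by name keeps page_01 paired with its page_01 audio when you later zip the two lists together for composition.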

Step 7 β€” (Optional) Compose MP4 server-side

The 2Slides API does not currently expose an MP4 composition endpoint; MP4 generation lives in the Workspace UI via FFmpeg.wasm in the browser. For a content factory, compose MP4 server-side with ffmpeg:

# For each page, build a clip of (image still) + (voice audio).
ffmpeg -loop 1 -i pages/page_01.png -i voices/page_01.wav \
  -c:v libx264 -tune stillimage -c:a aac -b:a 192k \
  -pix_fmt yuv420p -shortest clips/page_01.mp4

# Concatenate all per-page clips into the final MP4.
ffmpeg -f concat -safe 0 -i clip_list.txt -c copy final.mp4

The audio cadence per page is whatever the narration generator produced, typically 5–12 seconds per slide for vocabulary cards. The result is the same MP4 a user would download from the Workspace UI, but produced headlessly in your factory pipeline.
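In a factory, the two ffmpeg commands become a loop. A sketch that builds the per-clip argv and the concat list (directory names mirror the ZIP layout; nothing here is a 2Slides API call):

```python
import subprocess

def clip_command(page_png: str, voice_wav: str, out_mp4: str) -> list[str]:
    """ffmpeg argv for one still-image + narration clip (same flags as above)."""
    return ["ffmpeg", "-y", "-loop", "1", "-i", page_png, "-i", voice_wav,
            "-c:v", "libx264", "-tune", "stillimage",
            "-c:a", "aac", "-b:a", "192k",
            "-pix_fmt", "yuv420p", "-shortest", out_mp4]

def concat_list(clips: list[str]) -> str:
    """Body of clip_list.txt in ffmpeg's concat-demuxer format."""
    return "".join(f"file '{c}'\n" for c in clips)

def compose_deck(pages: list[str], voices: list[str]) -> None:
    """Render every page clip, then concatenate into final.mp4."""
    clips = []
    for i, (png, wav) in enumerate(zip(pages, voices), start=1):
        out = f"clips/page_{i:02d}.mp4"
        subprocess.run(clip_command(png, wav, out), check=True)
        clips.append(out)
    with open("clip_list.txt", "w") as f:
        f.write(concat_list(clips))
    subprocess.run(["ffmpeg", "-f", "concat", "-safe", "0",
                    "-i", "clip_list.txt", "-c", "copy", "final.mp4"], check=True)
```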

If you want vertical (9:16) and horizontal (16:9) variants of the same deck, the cleanest path is to generate the deck twice at different aspect ratios at the slide-generation stage (aspectRatio: "9:16" vs "16:9"). FFmpeg cropping after the fact often produces ugly results because the slides were laid out for a specific aspect ratio.

Step 8 β€” Build the orchestrator

A minimal orchestrator handles five loops:

# Pseudocode
while there_is_work():
    deck = pull_one_pending_deck_from_source()
    if not deck:
        sleep(60); continue

    # 1. Generate slides via the Nano Banana endpoint
    job = post("/api/v1/slides/create-pdf-slides", body=build_payload(deck))
    deck_artifact = poll_until_complete(job.data.jobId)

    # 2. Narrate (async only)
    if deck.narration.enabled:
        post("/api/v1/slides/generate-narration", body={
            "jobId": deck_artifact.id,
            "voice": deck.narration.voice,
            "mode": deck.narration.mode,
        })
        poll_until_narration_complete(deck_artifact.id)

    # 3. Export pages + voices ZIP (free)
    zip_url = post("/api/v1/slides/download-slides-pages-voices",
                   body={"jobId": deck_artifact.id})

    # 4. Download and store assets in your object store
    download_to_s3(zip_url, deck.id)

    # 5. (Optional) compose MP4 with ffmpeg, then distribute
    if deck.distribution.social:
        compose_mp4(deck.id)
    distribute(deck)

Run this on a worker box with a queue. For 100 decks per day, one worker is plenty. For 1,000+, fan out to a small worker pool, but ensure the pool respects each API endpoint's rate limits, not just worker count.

Step 9 β€” Distribution patterns

The distribution layer turns artifacts into business value:

  • LMS: upload composed MP4 to Canvas / Moodle / Blackboard / Google Classroom via their respective APIs
  • TikTok / Reels / Shorts: queue 9:16 MP4 to a posting tool (Buffer, Later, native scheduler), one per day
  • Newsletter: embed the PDF compilation (from the original generation job's downloadUrl) as a download link in the weekly issue
  • Sales / lead magnet: upload the PDF to a Stan Store / Gumroad page; the carousel teaser drives traffic

Don't try to invent distribution. Use the platform-native APIs and let your orchestrator drop a row in your scheduler.

Cost math (the part to plan for first)

For Nano Banana decks with narration, the credits add up faster than the Fast PPT pricing some readers may have seen before. The math per 30-card deck (1K/2K resolution, with narration):

  • Planning: 10 credits
  • Slide generation: 30 × 100 = 3,000 credits
  • Narration (text + audio): 30 × 210 = 6,300 credits
  • Pages + voices export: 0 credits (free)
  • Total: ~9,310 credits per narrated 30-card deck

Without narration, the same deck is ~3,010 credits. At 4K resolution, double the slide-generation portion: 30 × 200 = 6,000, for ~12,310 credits with narration.

For a 100-deck/month factory: 100 × 9,310 = ~931,000 credits/month. Compare against the pricing page to choose a tier, and budget for 4K only when the output is going to a context that benefits from it (large screens, premium video). For TikTok / Reels review videos, 1K or 2K is plenty.
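The arithmetic above folds into one function, which is also useful for the cost-telemetry pattern below; the per-credit figures come straight from the endpoint table and the list above:

```python
def deck_credits(pages: int, resolution: str = "2K", narrated: bool = True) -> int:
    """Nano Banana deck cost in credits: 10 planning + 100/slide at 1K/2K
    (200/slide at 4K), plus 210/page narration. The pages+voices export is free."""
    per_slide = 200 if resolution == "4K" else 100
    credits = 10 + pages * per_slide
    if narrated:
        credits += pages * 210   # 10 text + 200 audio per page
    return credits
```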

Operational patterns that prevent fires

Idempotency

Every deck submission should be idempotent on deck_id. If your worker crashes mid-batch, restarting the queue must not produce duplicate decks. The cleanest pattern: store (deck_id, status) in a database row and transition states (pending → generating → narrating → exporting → composed → distributed).
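The state transitions can be enforced with a tiny table; this is factory-side bookkeeping, not an API feature, and the failure edges into needs_human follow the failure-handling pattern below:

```python
# Legal transitions for the deck lifecycle described above.
TRANSITIONS = {
    "pending":    {"generating"},
    "generating": {"narrating", "needs_human"},
    "narrating":  {"exporting", "needs_human"},
    "exporting":  {"composed"},
    "composed":   {"distributed"},
}

def advance(current: str, target: str) -> str:
    """Move a deck forward; an illegal jump raises instead of silently
    re-running work, which is what makes queue restarts idempotent."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {target}")
    return target
```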

Quality gates

Don't auto-distribute. Before pushing to TikTok or Canvas, run a machine-readable quality check on the artifact:

  • Page count matches the requested count
  • The ZIP contains the expected number of pages/page_NN.png and voices/page_NN.wav files
  • Audio duration per page is between 3 and 15 seconds (a 30-second card almost always means a hallucinated long script)
  • transcript.txt is non-empty and contains the target words

For the first 50 batches, also do a manual spot check of 1 in 10 decks. The first 50 batches are where systemic prompt issues surface.
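The four checks translate directly into code. A sketch of the gate; audio durations would come from your own probe (ffprobe, for instance), so they are passed in rather than measured here:

```python
import re

def quality_gate(members: list[str], durations: list[float],
                 transcript: str, target_words: list[str],
                 expected_pages: int) -> list[str]:
    """Run the machine-readable checks above; empty list means the deck may ship."""
    problems = []
    pages = [m for m in members if re.fullmatch(r"pages/page_\d+\.png", m)]
    voices = [m for m in members if re.fullmatch(r"voices/page_\d+\.(wav|mp3)", m)]
    if len(pages) != expected_pages:
        problems.append(f"expected {expected_pages} pages, got {len(pages)}")
    if len(voices) != expected_pages:
        problems.append(f"expected {expected_pages} voice files, got {len(voices)}")
    for i, secs in enumerate(durations, start=1):
        if not 3.0 <= secs <= 15.0:
            problems.append(f"page {i} audio is {secs:.1f}s (outside 3-15s)")
    if not transcript.strip():
        problems.append("transcript.txt is empty")
    else:
        missing = [w for w in target_words if w.lower() not in transcript.lower()]
        if missing:
            problems.append(f"transcript missing target words: {missing}")
    return problems
```

Wire the returned list into the state machine: a non-empty result routes the deck to review instead of distribution.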

Versioning

Every artifact stores: prompt template version, image model version (gemini-3-pro-image-preview vs gemini-3.1-flash-image-preview), narration voice, generation timestamp. When the model improves or a prompt changes, you can re-run only the affected decks.

Cost telemetry

Each deck has a known credit cost (see math above). Track credits consumed per deck. When credit usage per deck doubles unexpectedly, something changed (page count drift, retries, switched to 4K). Find it before the credit bill catches you off guard.

Failure handling

A failed job is normal: network blip, model load, rare 5xx. Retry once after a backoff. After two failures, push the deck to a needs_human queue. Don't loop infinitely.

Build vs buy: when to use the API at all

The API is the right answer when:

  • You produce >10 decks/week
  • You have structured source data
  • You need narrated MP4s that you'll compose server-side and distribute
  • You integrate with an LMS, scheduler, or CMS
  • You want reproducibility under prompt versioning

The API is overkill when:

  • You produce 1 deck a week and tune visually each time
  • You're a learner building decks for personal study (the UI is faster, and the Workspace UI also does the MP4 composition for you)
  • You're a teacher building one deck per lesson (use Create Slides from File or Create Slides Like This and skip the orchestration)

Frequently Asked Questions

Where do I get an API key?

2slides.com/api. Keys live in the API management tab.

Why can't I add narration to a /api/v1/slides/generate job?

The generate endpoint is Fast PPT: template-driven PPTX. Its output is a finalized .pptx file, not a slide-image-plus-text job that the narration generator can read. The narration generator explicitly only accepts jobs from create-pdf-slides or create-like-this, which produce Nano Banana slide jobs with structured per-page content.

Can I export MP4 directly from the API?

No, not as of 2026. MP4 export is a Workspace UI feature implemented client-side with FFmpeg.wasm. The API equivalent is download-slides-pages-voices, which returns a ZIP of slide images, audio files, and a transcript; you compose the MP4 yourself with ffmpeg if you need it in a content factory pipeline. See Step 7.

What languages does the API support for generation?

22+ languages, including Spanish, French, German, Arabic, Japanese, Korean, Hindi, Vietnamese, Russian, Polish, Italian, Portuguese, Indonesian, Thai, Turkish, and Chinese (Simplified/Traditional). Pass it via responseLanguage.

What's the credit cost?

For Nano Banana decks: 10 (planning) + 100/slide at 1K/2K (or 200/slide at 4K) for slide generation, plus 210/page (10 text + 200 audio) for narration. Pages + voices export is free. A 30-card narrated deck at 2K is ~9,310 credits. See the pricing page and the cost-math section above.

How do I handle 429 rate limits?

Exponential backoff. Start at 1s and double up to 60s. After three consecutive 429s, halve your concurrent worker count. Do not poll /api/v1/jobs/{id} faster than every 20 seconds; that's the most common cause of 429s.

Can I integrate with Zapier / Make / n8n?

Yes: any tool that can make authenticated HTTP requests can drive the 2Slides API. n8n in particular is popular for content factories because it handles the polling and queue patterns natively.

How do I prevent generated decks from being indexed publicly?

Decks generated via the API are private to your account by default. Public sharing is a separate explicit action.

How do I generate vertical (9:16) and horizontal (16:9) versions of the same deck?

Generate the deck twice: once with aspectRatio: "9:16" and once with aspectRatio: "16:9". Slides are laid out per aspect ratio at generation time, so post-hoc cropping rarely looks good. Yes, this means doubling the credit cost; that's a deliberate tradeoff for clean visuals.

The takeaway

A content factory is structured source data + a stable orchestrator + the right API endpoints. The 2Slides API is the third piece; you're responsible for the first two. The single most common factory failure is using /api/v1/slides/generate (Fast PPT) and then trying to narrate it; that path is closed. Use create-pdf-slides or create-like-this instead, narrate with generate-narration, export with download-slides-pages-voices, and compose MP4 server-side with ffmpeg.

For the manual side of the same workflow, see the vocabulary cards guide and the creator workflow guide. The UI patterns there are the same patterns you're automating with the API; understanding the manual flow first makes the API integration much faster.

About 2Slides

Create stunning AI-powered presentations in seconds. Transform your ideas into professional slides with 2slides AI Agent.
