From ec8d5a612469ef7b5d02191136deaa522c8715e8 Mon Sep 17 00:00:00 2001 From: Benjamin Hardy Date: Mon, 9 Mar 2026 22:45:24 +0100 Subject: [PATCH] chore: added documentation --- docs/authentication.md | 78 +++++++++++++++++++++++++++ docs/database.md | 104 ++++++++++++++++++++++++++++++++++++ docs/docker-deployment.md | 109 ++++++++++++++++++++++++++++++++++++++ docs/file-management.md | 94 ++++++++++++++++++++++++++++++++ docs/frontend.md | 103 +++++++++++++++++++++++++++++++++++ docs/job-management.md | 92 ++++++++++++++++++++++++++++++++ docs/monochrome.md | 106 ++++++++++++++++++++++++++++++++++++ docs/onboarding.md | 108 +++++++++++++++++++++++++++++++++++++ docs/unified.md | 66 +++++++++++++++++++++++ docs/votify.md | 103 +++++++++++++++++++++++++++++++++++ firebase-debug.log | 0 11 files changed, 963 insertions(+) create mode 100644 docs/authentication.md create mode 100644 docs/database.md create mode 100644 docs/docker-deployment.md create mode 100644 docs/file-management.md create mode 100644 docs/frontend.md create mode 100644 docs/job-management.md create mode 100644 docs/monochrome.md create mode 100644 docs/onboarding.md create mode 100644 docs/unified.md create mode 100644 docs/votify.md delete mode 100644 firebase-debug.log diff --git a/docs/authentication.md b/docs/authentication.md new file mode 100644 index 0000000..fa63adb --- /dev/null +++ b/docs/authentication.md @@ -0,0 +1,78 @@ +# Authentication & Authorization + +## Overview + +Trackpull uses session-based authentication backed by a SQLite database. There are two roles: **admin** and **user**. Passwords are hashed using werkzeug's PBKDF2-based scheme — no plaintext is ever stored. + +--- + +## Login Flow + +1. User submits credentials via `POST /login`. +2. `get_user_by_username()` looks up the record in the `users` table. +3. `check_password_hash()` verifies the submitted password against the stored hash. +4. On success, Flask session is populated with `user_id`, `username`, and `role`. +5. 
User is redirected to the main app. On failure, the login page re-renders with an error. + +Logout is a simple `GET /logout` that clears the session and redirects to `/login`. + +--- + +## Authorization Enforcement + +A `@app.before_request` hook runs before every request. If the session lacks a `user_id`, the request is redirected to `/login`. + +Public (unauthenticated) routes are whitelisted: +- `/login` +- `/logout` +- `/static/*` +- `/offline` +- `/sw.js` + +Admin-only routes check `session["role"] == "admin"` via a `require_admin()` helper. Unauthorized admin access returns `403`. + +--- + +## Role Permissions + +| Action | User | Admin | +|--------|------|-------| +| Download (Votify/Monochrome/Unified) | Yes | Yes | +| View own jobs & files | Yes | Yes | +| Cancel/delete own jobs | Yes | Yes | +| Change own password | Yes | Yes | +| View any user's jobs/files | No | Yes | +| Manage users (create/delete/reset) | No | Yes | +| Upload cookies.txt / device.wvd | No | Yes | +| Change global settings | No | Yes | + +--- + +## Admin Seeding + +On first run, if no users exist in the database, an admin account is created automatically from the `ADMIN_USERNAME` and `ADMIN_PASSWORD` environment variables (see [docker-deployment.md](docker-deployment.md)). + +--- + +## Session Security + +- Sessions are signed (not encrypted) with Flask's `SECRET_KEY` env var, which should be a 32-byte random hex string. Session contents are tamper-proof but readable by the client, so nothing secret should be stored in them. +- Sessions survive application restarts because the key is stable. +- There is no token-based auth or "remember me"; sessions expire when the browser closes by default. + +--- + +## Password Management + +- `update_user_password()` in `db.py` re-hashes and saves a new password. +- Users change their own password at `POST /api/account/password`. +- Admins reset any user's password at `POST /api/admin/users//password`. 
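The hash-then-verify idea behind password storage can be illustrated with the standard library alone. This is a minimal sketch, not werkzeug's implementation and not the code in `db.py`: the names `hash_password` and `verify_password_hash`, the iteration count, and the stored-string layout are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed work factor, for illustration only


def hash_password(password: str) -> str:
    """Return 'pbkdf2:sha256:<iterations>$<salt>$<hex digest>' (hypothetical layout)."""
    salt = secrets.token_hex(8)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt.encode("utf-8"), ITERATIONS
    ).hex()
    return f"pbkdf2:sha256:{ITERATIONS}${salt}${digest}"


def verify_password_hash(stored: str, candidate: str) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    method, salt, digest = stored.split("$", 2)
    iterations = int(method.rsplit(":", 1)[1])
    recomputed = hashlib.pbkdf2_hmac(
        "sha256", candidate.encode("utf-8"), salt.encode("utf-8"), iterations
    ).hex()
    return hmac.compare_digest(recomputed, digest)
```

Because the salt is random, hashing the same password twice gives two different stored strings, which is why a password reset always writes a new hash rather than comparing against the old one.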
+ +--- + +## Key Files + +| File | Relevance | +|------|-----------| +| [app.py](../app.py) | Route definitions, `before_request` hook, `require_admin()` | +| [db.py](../db.py) | `create_user`, `get_user_by_username`, `verify_password`, `update_user_password` | diff --git a/docs/database.md b/docs/database.md new file mode 100644 index 0000000..6fba245 --- /dev/null +++ b/docs/database.md @@ -0,0 +1,104 @@ +# Database + +## Overview + +Trackpull uses a single SQLite database at `/config/trackpull.db`. All access goes through `db.py`, which uses thread-local connections to stay safe under multi-threaded Gunicorn workers. + +--- + +## Schema + +### `users` + +| Column | Type | Notes | +|--------|------|-------| +| `id` | TEXT PK | UUID | +| `username` | TEXT UNIQUE | Login name | +| `password_hash` | TEXT | werkzeug PBKDF2 hash | +| `role` | TEXT | `admin` or `user` | +| `created_at` | TEXT | ISO datetime | +| `last_login` | TEXT | ISO datetime, nullable | + +### `jobs` + +| Column | Type | Notes | +|--------|------|-------| +| `id` | TEXT PK | UUID | +| `user_id` | TEXT FK → users | Cascading delete | +| `urls` | TEXT | JSON array of download URLs | +| `options` | TEXT | JSON object of download parameters | +| `status` | TEXT | `queued`, `running`, `completed`, `failed`, `cancelled` | +| `output` | TEXT | JSON array of log lines (max 500) | +| `command` | TEXT | Full CLI command string (Votify jobs) | +| `return_code` | INTEGER | Process exit code | +| `created_at` | TEXT | ISO datetime | +| `updated_at` | TEXT | ISO datetime | + +### `app_settings` + +| Column | Type | Notes | +|--------|------|-------| +| `key` | TEXT PK | Setting name | +| `value` | TEXT | Setting value (always a string) | + +Current settings keys: `fallback_quality`, `job_expiry_days`. + +--- + +## Threading Model + +`db.py` creates a new connection per thread using `threading.local()`. Each call opens a connection, runs the query, and closes it. 
This avoids SQLite's "objects created in a thread can only be used in that same thread" limitation. + +Foreign keys are enabled on every connection via `PRAGMA foreign_keys = ON`. + +--- + +## Key Functions + +### Users + +| Function | Purpose | +|----------|---------| +| `create_user(username, password, role)` | Hashes password and inserts row | +| `get_user_by_username(username)` | Lookup for login | +| `get_user_by_id(user_id)` | Lookup for session validation | +| `list_users()` | Admin user list | +| `delete_user(user_id)` | Cascades to jobs | +| `verify_password(username, password)` | Returns user row or None | +| `update_user_password(user_id, new_password)` | Re-hashes and saves | +| `update_last_login(user_id)` | Stamps `last_login` | + +### Jobs + +| Function | Purpose | +|----------|---------| +| `upsert_job(job_dict)` | Insert or replace job record | +| `get_job(job_id)` | Fetch single job | +| `list_jobs_for_user(user_id)` | All jobs for a user | +| `delete_job(job_id)` | Remove single job | +| `delete_jobs_older_than(days)` | Expiry cleanup | + +### Settings + +| Function | Purpose | +|----------|---------| +| `get_setting(key, default)` | Fetch value with fallback | +| `set_setting(key, value)` | Upsert a setting | +| `get_all_settings()` | Return all as dict | + +--- + +## Initialization + +`db.py` calls `init_db()` on import, which: +1. Creates all tables if they don't exist. +2. Seeds the first admin user from `ADMIN_USERNAME` / `ADMIN_PASSWORD` env vars if the `users` table is empty. + +--- + +## Key Files + +| File | Relevance | +|------|-----------| +| [db.py](../db.py) | Entire database layer | +| [app.py](../app.py) | Calls db functions for all CRUD operations | diff --git a/docs/docker-deployment.md b/docs/docker-deployment.md new file mode 100644 index 0000000..d55b12e --- /dev/null +++ b/docs/docker-deployment.md @@ -0,0 +1,109 @@ +# Docker & Deployment + +## Overview + +Trackpull is fully containerized with Docker. 
The `docker-compose.yml` handles all volume mounts and environment configuration. A single `docker compose up -d --build` is enough to get a running instance. + +--- + +## Quick Start + +```bash +cp .env.example .env +# Edit .env: set ADMIN_USERNAME, ADMIN_PASSWORD, SECRET_KEY, and PORT +docker compose up -d --build +``` + +The app will be available at `http://localhost:{PORT}` (default: 5000). + +--- + +## Environment Variables + +All configuration goes in `.env`. Copy `.env.example` to get started. + +| Variable | Default | Description | +|----------|---------|-------------| +| `ADMIN_USERNAME` | — | Username for the seeded admin account | +| `ADMIN_PASSWORD` | — | Password for the seeded admin account | +| `SECRET_KEY` | — | Flask session key; use a 32-byte random hex string | +| `PORT` | `5000` | Host port the app is exposed on | +| `HOST_DOWNLOADS_DIR` | `./downloads` | Host path for downloaded files | +| `HOST_CONFIG_DIR` | `./config` | Host path for DB, cookies, device cert | +| `DOWNLOADS_DIR` | `/downloads` | Container-internal downloads path (rarely changed) | +| `COOKIES_PATH` | `/config/cookies.txt` | Path to Spotify cookies file inside the container | +| `CONFIG_DIR` | `/config` | Config directory inside the container | +| `WVD_PATH` | `/config/device.wvd` | Path to Widevine device certificate inside the container | + +> **Important**: `SECRET_KEY` must be a stable secret. Changing it invalidates all active sessions. + +--- + +## Volumes + +| Host path (from `.env`) | Container path | Contents | +|-------------------------|----------------|---------| +| `HOST_DOWNLOADS_DIR` | `/downloads` | Per-user download directories | +| `HOST_CONFIG_DIR` | `/config` | SQLite DB, cookies.txt, device.wvd | + +Both directories are created automatically by Docker if they don't exist. 
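The container-side paths in the environment variable table can be read as a lookup with fallbacks. The following is a minimal sketch, assuming a hypothetical `load_paths()` helper; how `app.py` actually resolves these values is not shown in this document. Only the variable names and defaults come from the table.

```python
import os
from typing import Mapping

# Defaults copied from the environment variable table.
DEFAULTS = {
    "DOWNLOADS_DIR": "/downloads",
    "CONFIG_DIR": "/config",
    "COOKIES_PATH": "/config/cookies.txt",
    "WVD_PATH": "/config/device.wvd",
}


def load_paths(env: Mapping[str, str] = os.environ) -> dict:
    """Return container-internal paths, falling back to the documented defaults.

    An unset or empty variable is treated as "use the default" (an assumption).
    """
    return {name: env.get(name) or default for name, default in DEFAULTS.items()}
```

Passing the environment as a parameter keeps the helper easy to exercise without touching the real process environment, for example `load_paths({"WVD_PATH": "/secrets/device.wvd"})`.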
+ +--- + +## Dockerfile Summary + +**Base image**: `python:3.12-slim` + +**System packages installed**: +- `ffmpeg` — audio conversion +- `aria2` — download manager +- `git`, `curl`, `unzip` — tooling + +**Binaries installed**: +- **Bento4 `mp4decrypt`** — MP4 DRM decryption (version 1.6.0-641, downloaded from bok.net) + +**Python packages**: +- From `requirements.txt`: Flask, gunicorn, mutagen, werkzeug, etc. +- `websocket-client` — WebSocket support +- `votify-fix` — Spotify downloader (installed from GitHub: GladistonXD/votify-fix) + +**Runtime command**: +``` +gunicorn --bind 0.0.0.0:5000 --workers 1 --threads 4 app:app +``` + +One worker with four threads keeps SQLite contention low while still handling concurrent requests. + +--- + +## Persistent Data + +| File | Created by | Purpose | +|------|-----------|---------| +| `/config/trackpull.db` | App on first run | All users, jobs, settings | +| `/config/cookies.txt` | Admin upload | Spotify auth for Votify | +| `/config/device.wvd` | Admin upload | Widevine cert for Votify | + +Back up `/config/` to preserve all user data between rebuilds. + +--- + +## Updating + +```bash +docker compose down +docker compose up -d --build +``` + +The database and config files persist on the host, so user data survives rebuilds. + +--- + +## Key Files + +| File | Relevance | +|------|-----------| +| [Dockerfile](../Dockerfile) | Container image definition | +| [docker-compose.yml](../docker-compose.yml) | Service orchestration | +| [.env.example](../.env.example) | Environment template | +| [requirements.txt](../requirements.txt) | Python dependencies | diff --git a/docs/file-management.md b/docs/file-management.md new file mode 100644 index 0000000..bf50870 --- /dev/null +++ b/docs/file-management.md @@ -0,0 +1,94 @@ +# File Management + +## Overview + +Each user has an isolated directory under `/downloads/{user_id}/`. 
The frontend provides a browser-style file tree with support for downloading individual files, downloading folders as ZIP archives, and deleting files or folders. + +--- + +## Directory Structure + +``` +/downloads/ +└── {user_id}/ + └── {collection or album name}/ + ├── Track 1.flac + ├── Track 2.flac + └── cover.jpg +``` + +Single-track downloads are automatically wrapped in a folder named after the track. + +--- + +## Security + +All file access goes through a path traversal check: + +1. The requested relative path is joined with the user's base directory. +2. `.resolve()` canonicalizes the result (expands `..`, symlinks, etc.). +3. The resolved path is checked to ensure it starts with the resolved user directory. +4. Any path that escapes the user directory returns `400 Bad Request`. + +Admins can browse any user's directory through dedicated admin endpoints that accept a `user_id` parameter. + +--- + +## API Endpoints + +### User endpoints + +| Endpoint | Method | Description | +|----------|--------|-------------| +| `/api/files` | GET | List directory contents at `?path=` (relative to user root) | +| `/api/files/download` | GET | Download a single file at `?path=` | +| `/api/files/download-folder` | GET | Download a directory as a ZIP at `?path=` | +| `/api/files/delete` | DELETE | Delete a file or directory at `?path=` | + +### Admin endpoints + +| Endpoint | Method | Description | +|----------|--------|-------------| +| `/api/admin/files` | GET | List a specific user's directory; requires `?user_id=` and optional `?path=` | +| `/api/admin/files/download` | GET | Download a file from any user's directory | +| `/api/admin/files/download-folder` | GET | Download a folder as ZIP from any user's directory | +| `/api/admin/files/delete` | DELETE | Delete from any user's directory | + +--- + +## Directory Listing Response + +```json +[ + { "name": "Album Name", "path": "Album Name", "is_dir": true }, + { "name": "Track.flac", "path": "Album Name/Track.flac", 
"is_dir": false, "size": 24601234 } +] +``` + +Directories always appear before files. Paths are relative to the user's root. + +--- + +## ZIP Downloads + +Folder downloads are streamed directly as a ZIP file using Python's `zipfile` module. Files are added with paths relative to the requested folder, so the ZIP extracts cleanly into a single directory. + +--- + +## Post-Processing (after download) + +After a Votify or Monochrome download completes, several cleanup steps run automatically: + +1. **Flatten nested directories** — removes redundant intermediate folders. +2. **Rename from metadata** — reads embedded ID3/FLAC tags and renames files to `Title - Artist.ext` (see `rename_from_metadata()` in `utils.py`). +3. **Wrap single tracks** — if a download produces only one file with no folder, it is moved into a named subfolder. +4. **Cleanup empty dirs** — removes any directories left empty after the above steps. + +--- + +## Key Files + +| File | Relevance | +|------|-----------| +| [app.py](../app.py) | All file route handlers | +| [utils.py](../utils.py) | `sanitize_filename`, `rename_from_metadata`, `cleanup_empty_dirs` | diff --git a/docs/frontend.md b/docs/frontend.md new file mode 100644 index 0000000..2e19075 --- /dev/null +++ b/docs/frontend.md @@ -0,0 +1,103 @@ +# Frontend + +## Overview + +The frontend is a vanilla JavaScript single-page-style app served by Flask from `templates/index.html`. It uses no frontend framework. The UI has a dark theme with Spotify green (#1db954) accents and is fully responsive. + +--- + +## Pages / Tabs + +### Unified Download (default) +- Textarea for one or more Spotify URLs. +- Single submit button that calls `POST /api/unified/download`. +- Buttons to switch to the Votify or Monochrome-specific tabs. + +### Votify Download +- Multi-URL textarea. +- Audio quality selector (AAC / Vorbis options). +- Collapsible "Advanced" section with format, cover size, video, and other toggles. +- Calls `POST /api/download`. 
+ +### Monochrome Download +- Single URL input (track, album, or playlist). +- Quality selector (Tidal quality levels). +- Calls `POST /api/monochrome/download`. + +### Jobs +- Lists all jobs for the current user (live + historical). +- Status badges: `queued`, `running`, `completed`, `failed`, `cancelled`. +- Collapsible output log per job. +- Cancel button (running jobs), delete button (terminal jobs). +- Auto-refreshes every 3 seconds while any job is running. + +### Files +- Browser-style file tree rooted at the user's download directory. +- Breadcrumb navigation. +- Download individual files or entire folders (as ZIP). +- Delete files or folders with confirmation. + +### Settings +- **All users**: Change own account password. +- **Admins only**: + - Upload `cookies.txt` (Spotify authentication). + - Upload `device.wvd` (Widevine certificate). + - Set fallback quality (used by Unified system). + - Set job expiry (days before old jobs are auto-deleted). + +### Admin: Users (admin only) +- List all users with creation date and last login. +- Create new users (username, password, role). +- View any user's jobs or files. +- Delete users (with confirmation prompt). +- Reset any user's password. + +--- + +## Responsive Design + +- **Desktop**: Horizontal tab bar at the top. +- **Mobile**: Bottom navigation bar. +- Minimum tap target size: 36×36px. +- Font sizes scale down on smaller screens. + +--- + +## Progressive Web App (PWA) + +The app can be installed as a standalone PWA. + +### Manifest +`/static/manifest.json` defines the app name, theme color, icons (192×192 and 512×512), and `standalone` display mode. 
+ +### Service Worker +`/static/sw.js` provides: + +| Resource type | Strategy | +|---------------|----------| +| API calls (`/api/*`) | Network only (no caching) | +| Page navigation | Network first, fallback to offline page | +| Static assets | Cache first, network fallback | + +The service worker caches the app shell (HTML, CSS, JS, icons) on install and clears old caches on activation. If the network is unavailable during navigation, `/static/offline.html` is served. + +--- + +## Templates + +| File | Purpose | +|------|---------| +| [templates/index.html](../templates/index.html) | Main app (all tabs, JS logic, CSS) | +| [templates/login.html](../templates/login.html) | Login form | + +--- + +## Key Files + +| File | Relevance | +|------|-----------| +| [templates/index.html](../templates/index.html) | Entire UI: HTML, CSS, JavaScript | +| [templates/login.html](../templates/login.html) | Login page | +| [static/manifest.json](../static/manifest.json) | PWA manifest | +| [static/sw.js](../static/sw.js) | Service worker | +| [static/icons/](../static/icons/) | App icons for PWA | diff --git a/docs/job-management.md b/docs/job-management.md new file mode 100644 index 0000000..b95a0ee --- /dev/null +++ b/docs/job-management.md @@ -0,0 +1,92 @@ +# Job Management + +## Overview + +Downloads run as background jobs. Each job is tracked in-memory during execution and persisted to the database on completion. The frontend polls for status updates every ~1–3 seconds. + +--- + +## Job Lifecycle + +``` +User Request → Create Job (queued) → Spawn Thread → status: running + → [completed | failed | cancelled] → Upsert to DB +``` + +1. A job record is created in the in-memory `jobs` dict with `status: queued`. +2. A Python thread is spawned to run the download function. +3. The thread updates `status`, `output`, and `return_code` in-memory as it runs. +4. On finish (success, failure, or cancellation), the job is upserted into SQLite. 
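The lifecycle steps above can be sketched in a few lines. This is an illustration under assumptions, not the code in `app.py`: `start_job`, the `saved` dict, and the `work` callable are hypothetical, and `upsert_job` here is a stand-in that stores a copy in memory instead of writing to SQLite.

```python
import threading
import time
import uuid

jobs = {}   # in-memory job registry, keyed by job id
saved = {}  # stand-in for the database (hypothetical)


def upsert_job(job: dict) -> None:
    """Placeholder for the real database upsert; keeps a copy in memory."""
    saved[job["id"]] = dict(job)


def start_job(user_id: str, urls: list, work) -> tuple:
    """Create a queued job, run `work(job)` in a thread, persist on finish."""
    job = {"id": str(uuid.uuid4()), "user_id": user_id, "urls": urls,
           "status": "queued", "output": [], "return_code": None,
           "created_at": time.time()}
    jobs[job["id"]] = job

    def runner():
        job["status"] = "running"
        try:
            job["return_code"] = work(job)
            if job["status"] != "cancelled":  # a cancel request wins over the exit code
                job["status"] = "completed" if job["return_code"] == 0 else "failed"
        except Exception as exc:
            job["output"].append(f"error: {exc}")
            job["status"] = "failed"
        finally:
            upsert_job(job)  # only terminal states reach the database

    thread = threading.Thread(target=runner, daemon=True)
    thread.start()
    return job, thread
```

Writing to the database in a `finally` block means a crashed download is still recorded as `failed` rather than disappearing with the thread.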
+ +--- + +## In-Memory Job Structure + +```python +{ + "id": str, # UUID + "user_id": str, # Owner + "urls": list[str], # Spotify URLs + "options": dict, # Download parameters + "status": str, # queued | running | completed | failed | cancelled + "output": list[str], # Log lines (capped at 500) + "command": str, # CLI command string (Votify jobs only) + "return_code": int, # Process exit code + "process": Popen, # Subprocess handle (for cancellation) + "created_at": float, # Unix timestamp +} +``` + +`process` is only present while the job is running; it is not persisted to the database. + +--- + +## Output Streaming + +- The subprocess stdout is read line-by-line in the runner thread. +- Each line is appended to `job["output"]`. +- The list is capped at 500 entries (oldest lines are dropped first). +- The frontend reads output via `GET /api/jobs/` and displays the log incrementally. + +--- + +## Cancellation + +1. Frontend calls `POST /api/jobs//cancel`. +2. Job `status` is set to `cancelled` in-memory. +3. `process.terminate()` is called on the Popen handle. +4. The runner thread detects the cancellation flag and logs a cancellation message. +5. The job is then upserted to the database with `status: cancelled`. + +--- + +## Job Expiry + +A background daemon thread runs hourly and calls `delete_jobs_older_than(days)` where `days` comes from the `job_expiry_days` setting (default: 30). This removes old job records from the database but does **not** delete downloaded files. 
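The expiry step amounts to a date comparison in SQL. A minimal sketch follows, assuming `created_at` holds UTC ISO 8601 strings and that the function receives an open connection; the real `delete_jobs_older_than(days)` in `db.py` takes only `days`, and its query is not shown in this document.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def delete_jobs_older_than(conn: sqlite3.Connection, days: int) -> int:
    """Delete job rows whose created_at is older than `days`; return rows removed."""
    cutoff = (datetime.now(timezone.utc) - timedelta(days=days)).isoformat(
        timespec="seconds"
    )
    # ISO 8601 strings in one consistent format and timezone sort
    # chronologically, so a plain text comparison is enough here.
    cur = conn.execute("DELETE FROM jobs WHERE created_at < ?", (cutoff,))
    conn.commit()
    return cur.rowcount
```

Only rows in the `jobs` table are touched, which matches the note above that downloaded files are left in place.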
+ +--- + +## API Endpoints + +| Endpoint | Method | Purpose | +|----------|--------|---------| +| `/api/jobs` | GET | List all jobs for the current user | +| `/api/jobs/` | GET | Get a single job (status + output) | +| `/api/jobs//cancel` | POST | Cancel a running job | +| `/api/jobs/` | DELETE | Delete a completed/failed/cancelled job record | +| `/api/admin/users//jobs` | GET | Admin view of any user's jobs | + +--- + +## Database Persistence + +Jobs are only written to SQLite when they reach a terminal state (`completed`, `failed`, `cancelled`). In-memory jobs from before a restart are lost, but completed jobs survive restarts via the database. The `GET /api/jobs` endpoint merges both sources: in-memory (for live jobs) and database (for historical). + +--- + +## Key Files + +| File | Relevance | +|------|-----------| +| [app.py](../app.py) | Job dict, route handlers, runner threads, expiry daemon | +| [db.py](../db.py) | `upsert_job`, `get_job`, `list_jobs_for_user`, `delete_jobs_older_than` | diff --git a/docs/monochrome.md b/docs/monochrome.md new file mode 100644 index 0000000..e8b087e --- /dev/null +++ b/docs/monochrome.md @@ -0,0 +1,106 @@ +# Monochrome API Architecture + +## How It Works +Monochrome is a web frontend that proxies audio from Tidal/Qobuz through distributed API instances. It does NOT host audio itself. + +## Instance Discovery +1. **Uptime monitor**: `https://tidal-uptime.jiffy-puffs-1j.workers.dev/` — returns list of live instances +2. **Hardcoded fallbacks** (some may go down over time): + - `https://monochrome.tf` + - `https://triton.squid.wtf` + - `https://qqdl.site` + - `https://monochrome.samidy.com` + - `https://api.monochrome.tf` +3. More instances listed at: https://github.com/monochrome-music/monochrome/blob/main/INSTANCES.md + +## API Endpoints (on any instance) + +### Stream/download: `GET /track/?id={trackId}&quality={quality}` +- **Response envelope**: `{"version": "2.x", "data": { ... 
}}` +- **Inside `data`**: `manifest` (base64), `audioQuality`, `trackId`, replay gain fields +- **`manifest` decodes to**: JSON `{"mimeType":"audio/flac","codecs":"flac","urls":["https://..."]}` +- Alternative: `OriginalTrackUrl` field (direct URL, skip manifest) + +### Metadata: `GET /info/?id={trackId}` +- Same envelope wrapping +- Returns: title, duration, artist, artists[], album (with cover UUID), trackNumber, volumeNumber, copyright, isrc, streamStartDate, bpm, key, explicit, etc. + +### Search: `GET /search/?s={query}` (tracks), `?a=` (artists), `?al=` (albums), `?p=` (playlists) + +### Album: `GET /album/?id={albumId}&offset={n}&limit=500` + +### Qobuz alternative: `https://qobuz.squid.wtf/api` +- Search: `/get-music?q={query}` +- Stream: `/download-music?track_id={id}&quality={qobuzQuality}` +- Quality mapping: 27=MP3_320, 7=FLAC, 6=HiRes96/24, 5=HiRes192/24 +- Track IDs prefixed with `q:` in the frontend (e.g. `q:12345`) + +## Quality Levels (Tidal instances) +| Quality param | What you get | Works? | +|-------------------|-----------------------|--------| +| HI_RES_LOSSLESS | Best available (FLAC) | Yes | +| HI_RES | FLAC | Yes | +| LOSSLESS | 16-bit/44.1kHz FLAC | Yes | +| HIGH | AAC 320kbps | Yes | +| LOW | AAC 96kbps | Yes | +| MP3_320 | N/A | **404** — not a valid API quality | + +## Manifest Decoding (3 types) +1. **JSON** (most common): `{"mimeType":"audio/flac","urls":["https://lgf.audio.tidal.com/..."]}` — use `urls[0]` +2. **DASH XML**: Contains `` — extract `` if present, otherwise needs dash.js (unsupported in CLI) +3. 
**Raw URL**: Just a URL string in the decoded base64 + +## Cover Art +- Source: Tidal CDN +- URL pattern: `https://resources.tidal.com/images/{cover_uuid_with_slashes}/{size}x{size}.jpg` +- The album `cover` field is a UUID like `d8170d28-d09b-400a-ae83-6c9dea002b4d` +- Replace `-` with `/` to form the path: `d8170d28/d09b/400a/ae83/6c9dea002b4d` +- Common sizes: 80, 160, 320, 640, 1280 + +## Search Response Structure +- **Endpoint**: `GET /search/?s={query}` +- **Response envelope**: `{"version": "2.x", "data": {"limit": 25, "offset": 0, "totalNumberOfItems": N, "items": [...]}}` +- **Each item** in `items[]`: + - `id` (Tidal track ID), `title`, `duration`, `trackNumber`, `volumeNumber` + - `artist`: `{"id": N, "name": "...", "picture": "uuid"}` + - `artists`: array of artist objects (same shape) + - `album`: `{"id": N, "title": "...", "cover": "uuid"}` + - `isrc`, `copyright`, `bpm`, `key`, `explicit`, `audioQuality`, `popularity` +- **Important**: results are inside `data.items[]`, not `data` directly + +## Frontend Retry Logic +- Randomize instance order +- Try each instance up to `instances.length * 2` times +- 429 (rate limit): 500ms delay, next instance +- 401/5xx: next instance +- Network error: 200ms delay, next instance + +## Spotify URL Converter (spotify_to_ids.py) + +### How It Works +1. **Parse Spotify URL** — regex extracts type (`track`/`album`/`playlist`) and ID +2. **Scrape metadata** — fetches `https://open.spotify.com/embed/{type}/{id}`, extracts `__NEXT_DATA__` JSON from HTML +3. **Extract tracks** — navigates `props.pageProps.state.data.entity` for title/artist + - Track: `name` + `artists[0].name` (or `subtitle`) + - Album/Playlist: `trackList[]` array with `title` + `subtitle` per track +4. **Fallback** — if embed scraping fails, uses oEmbed API (`/oembed?url=...`) for single tracks (title only, no artist separation) +5. **Search Monochrome** — `GET {instance}/search/?s={artist}+{title}`, unwrap envelope, get `data.items[]` +6. 
**Fuzzy match** — normalize strings (strip feat/remaster/punctuation), Jaccard token overlap, weighted 60% title + 40% artist + +### Code Structure (spotify_to_ids.py) +- `parse_spotify_url(url)` — regex URL parsing → `(type, id)` +- `fetch_spotify_embed(sp_type, sp_id)` — scrape embed page `__NEXT_DATA__` JSON +- `fetch_spotify_oembed(sp_type, sp_id)` — oEmbed API fallback +- `extract_tracks(embed_data, sp_type, sp_id)` — navigate JSON → list of `{title, artist}` +- `normalize(text)` — lowercase, strip feat/remaster/punctuation +- `similarity(a, b)` — Jaccard token overlap ratio +- `find_best_match(results, title, artist, threshold)` — weighted scoring of search results +- `search_monochrome(instances, query)` — search API with envelope unwrapping +- Shared utilities copied from download.py: `fetch()`, `fetch_json()`, `discover_instances()` + +### Key Gotchas +1. Spotify embed page structure (`__NEXT_DATA__`) is fragile — may break if Spotify redesigns +2. oEmbed fallback only works for single tracks, not albums/playlists +3. Remixes and live versions often fail to match (different titles on Spotify vs Tidal) +4. 0.5s delay between searches to avoid Monochrome 429 rate limits +5. All progress/errors go to stderr; only track IDs go to stdout (for piping) diff --git a/docs/onboarding.md b/docs/onboarding.md new file mode 100644 index 0000000..7bd058b --- /dev/null +++ b/docs/onboarding.md @@ -0,0 +1,108 @@ +# Trackpull — Onboarding Guide + +## What Is Trackpull? + +Trackpull is a self-hosted, multi-user web application for downloading music from Spotify URLs. It supports two download backends that can be used independently or together via a smart fallback system: + +- **Monochrome** — proxies audio from Tidal/Qobuz through distributed API instances. Produces high-quality lossless or MP3 files without requiring Spotify credentials. +- **Votify** — downloads directly from Spotify using cookies-based authentication. 
Requires a valid `cookies.txt` and optionally a Widevine device certificate. +- **Unified** — tries Monochrome first, falls back to Votify for any tracks that fail. This is the default and recommended mode. + +The app is containerized with Docker and stores all state in a mounted volume. A SQLite database tracks users, jobs, and settings. + +--- + +## Systems at a Glance + +| System | Document | +|--------|----------| +| Docker & deployment | [docker-deployment.md](docker-deployment.md) | +| Authentication & roles | [authentication.md](authentication.md) | +| Database schema & layer | [database.md](database.md) | +| Job management | [job-management.md](job-management.md) | +| File management | [file-management.md](file-management.md) | +| Votify download system | [votify.md](votify.md) | +| Monochrome download system | [monochrome.md](monochrome.md) | +| Unified download system | [unified.md](unified.md) | +| Frontend (UI & PWA) | [frontend.md](frontend.md) | + +--- + +## Getting Started + +### 1. Deploy + +```bash +cp .env.example .env +# Set ADMIN_USERNAME, ADMIN_PASSWORD, SECRET_KEY, PORT +docker compose up -d --build +``` + +See [docker-deployment.md](docker-deployment.md) for all environment variables. + +### 2. Upload credentials (admin) + +Log in with your admin account, go to **Settings**, and upload: +- `cookies.txt` — Netscape-format Spotify cookies (required for Votify). +- `device.wvd` — Widevine device certificate (required for some Spotify content). + +Monochrome does not require any credentials. + +### 3. Download music + +Paste one or more Spotify track, album, or playlist URLs into the Unified tab and click Download. The job will appear in the Jobs tab with live output. 
+ +--- + +## Architecture Overview + +``` +Browser (index.html) + │ + ├── POST /api/unified/download + ├── POST /api/download (Votify) + └── POST /api/monochrome/download + │ + ▼ + app.py (Flask / Gunicorn) + │ + ├── db.py (SQLite via thread-local connections) + │ + ├── Votify path: + │ subprocess: votify CLI → ffmpeg (optional MP3 conversion) + │ output dir: /downloads/{user_id}/ + │ + └── Monochrome path: + monochrome/api.py + ├── monochrome/__init__.py (instance discovery) + ├── monochrome/spotify_to_ids.py (Spotify scraping + Tidal search) + └── monochrome/download.py (stream, metadata embed, MP3 conversion) + output dir: /downloads/{user_id}/ +``` + +--- + +## Key Concepts + +**Jobs** are asynchronous. After submitting a download, the job runs in a background thread. Poll the Jobs tab for progress. Jobs survive page refreshes but in-memory state (active downloads) is lost on container restart. + +**Users** are isolated. Each user's files live in `/downloads/{user_id}/` and are not visible to other users. Admins can browse all users' files. + +**Settings** are global (per-application, not per-user). Admins configure fallback quality, job expiry, and upload credentials. + +**Monochrome instances** are third-party servers. If downloads fail, it may be because instances are down. The app automatically tries multiple instances and falls back to Votify. + +--- + +## Maintaining This Documentation + +> **Keep these docs up to date.** When you add, change, or remove a feature, update the relevant document in `docs/`. If you add a new system, create a new document and add it to the table above in this onboarding guide. + +Guidelines: +- One document per system. +- Document the *why* and *how*, not just the *what*. +- Update API endpoint tables when routes change. +- Update option tables when new download parameters are added. +- If a gotcha is discovered (e.g., a third-party API quirk), add it to the relevant doc. 
+ +Stale documentation is worse than no documentation — readers will trust it and waste time on outdated information. diff --git a/docs/unified.md b/docs/unified.md new file mode 100644 index 0000000..1cfa18a --- /dev/null +++ b/docs/unified.md @@ -0,0 +1,66 @@ +# Unified Download System + +## Overview + +The Unified system is the recommended entry point for downloads. It attempts a Monochrome download first, then automatically falls back to Votify for any tracks that failed. This gives users high-quality lossless audio where available, with Spotify as the safety net. + +--- + +## Strategy + +``` +User submits Spotify URL + ↓ +Attempt Monochrome (MP3_320 quality) + ↓ + ├─ All tracks succeeded → done + └─ Some tracks failed + ↓ + Spawn Votify fallback job for failed URLs + (uses fallback_quality setting, default: aac-medium) +``` + +--- + +## Entry Point + +`POST /api/unified/download` → `run_unified_download()` in `app.py`. + +The main job tracks the overall status. If a Votify fallback is spawned, it appears as a separate job in the jobs list, linked by the log output of the main job. + +--- + +## Quality Settings + +| Stage | Quality | Source | +|-------|---------|--------| +| Monochrome attempt | `MP3_320` | Tidal/Qobuz via Monochrome | +| Votify fallback | `fallback_quality` setting | Spotify direct | + +The `fallback_quality` setting is configurable by admins in the Settings page. Default is `aac-medium`. + +--- + +## Output Layout + +Successfully downloaded tracks land in the Monochrome output directory (`/downloads/{user_id}/`). Failed tracks that go through the Votify fallback land in the same user directory under a separate subfolder created by that fallback job. 
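The try-then-fall-back strategy can be sketched independently of either backend. This is an illustration, not the body of `run_unified_download()`: the `run_unified` name and the two callables (`try_monochrome`, `spawn_votify_job`) are hypothetical placeholders for whatever `app.py` actually calls.

```python
from typing import Callable, Iterable, Optional


def run_unified(urls: Iterable[str],
                try_monochrome: Callable[[str], bool],
                spawn_votify_job: Callable[[list, str], str],
                fallback_quality: str = "aac-medium") -> dict:
    """Try every URL with Monochrome; hand the failures to one Votify fallback job."""
    failed = []
    for url in urls:
        try:
            ok = try_monochrome(url)
        except Exception:
            ok = False  # treat any backend error as a failed track
        if not ok:
            failed.append(url)

    # Spawn the fallback only when something actually failed.
    fallback_job_id: Optional[str] = None
    if failed:
        fallback_job_id = spawn_votify_job(failed, fallback_quality)
    return {"failed": failed, "fallback_job_id": fallback_job_id}
```

Collecting failures first and spawning a single fallback job keeps the jobs list short: one main job plus at most one Votify job per submission.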
+
+---
+
+## When to Use Each System Directly
+
+| Scenario | Recommendation |
+|----------|---------------|
+| Best quality, no fallback needed | Use Monochrome directly |
+| Spotify-only, full quality control | Use Votify directly |
+| Most tracks + automatic recovery | Use Unified (default) |
+
+---
+
+## Key Files
+
+| File | Relevance |
+|------|-----------|
+| [app.py](../app.py) | `run_unified_download()`, route `/api/unified/download` |
+| [monochrome.md](monochrome.md) | Monochrome system details |
+| [votify.md](votify.md) | Votify fallback details |
diff --git a/docs/votify.md b/docs/votify.md
new file mode 100644
index 0000000..40cd9b6
--- /dev/null
+++ b/docs/votify.md
@@ -0,0 +1,103 @@
+# Votify Download System
+
+## Overview
+
+Votify is the primary Spotify download backend. It invokes the `votify-fix` CLI tool (a third-party Python package) as a subprocess, streams its output to the job log, and post-processes the resulting files.
+
+---
+
+## How It Works
+
+1. User submits Spotify URLs and options via `POST /api/download`.
+2. A job is created and a background thread runs `run_download()`.
+3. `run_download()` builds a `votify` CLI command and launches it via `subprocess.Popen`.
+4. stdout is streamed line-by-line into the job's output log.
+5. On completion, post-processing runs:
+   - Flatten nested directories
+   - Rename files from embedded metadata
+   - Wrap single-track downloads in a folder
+   - Convert to MP3 if requested
+
+---
+
+## Authentication
+
+Votify authenticates with Spotify using a `cookies.txt` file in Netscape format. This file must be uploaded by an admin via the Settings page before any downloads will succeed.
+
+Path: `/config/cookies.txt` (configurable via `COOKIES_PATH` env var)
+
+A Widevine device certificate (`device.wvd`) may also be required depending on the content. It is uploaded separately via Settings.
+
+Path: `/config/device.wvd` (configurable via `WVD_PATH` env var)
+
+---
+
+## Download Options
+
+| Option | Values | Description |
+|--------|--------|-------------|
+| `audio_quality` | `aac-medium`, `aac-high`, `vorbis-low`, `vorbis-medium`, `vorbis-high` | Audio quality |
+| `output_format` | `original`, `mp3` | Keep original format or convert to MP3 |
+| `download_mode` | `ytdlp`, `aria2c` | Download backend |
+| `save_cover` | bool | Save cover art as a separate image file |
+| `save_playlist` | bool | Save playlist metadata file |
+| `overwrite` | bool | Re-download if file already exists |
+| `download_music_videos` | bool | Include music video downloads |
+| `no_lrc` | bool | Skip LRC (lyrics) file generation |
+| `video_format` | `mp4`, `webm` | Format for music videos |
+| `cover_size` | `small`, `medium`, `large`, `extra-large` | Cover art resolution |
+| `truncate` | int (optional) | Limit number of tracks to download |
+
+---
+
+## MP3 Conversion
+
+If `output_format` is set to `mp3`, files are converted after download using `ffmpeg` at 320 kbps. The conversion preserves embedded metadata. Original files are deleted after successful conversion.
+
+---
+
+## Cancellation
+
+The Popen process handle is stored on the job dict. `POST /api/jobs//cancel` calls `process.terminate()`, which sends SIGTERM to the votify subprocess.
+
+---
+
+## Post-Processing Detail
+
+After the subprocess exits, `post_process_votify_files()` runs:
+
+1. **Snapshot before**: records which audio files existed before the download started (to identify new files).
+2. **Flatten**: collapses single-subdirectory chains into the parent folder.
+3. **Rename**: calls `rename_from_metadata()` to produce `Title - Artist.ext` filenames.
+4. **Wrap singles**: if exactly one file downloaded with no enclosing folder, wraps it in a folder named after the file.
+5. **Cleanup**: removes leftover empty directories.
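
The snapshot and flatten steps listed above might look something like the following. This is a minimal sketch, not the code of `post_process_votify_files()`: the helper names `snapshot_audio_files` and `flatten_single_subdir_chain`, the `AUDIO_EXTS` set, and the staging-directory trick are all assumptions made for illustration.

```python
import shutil
import uuid
from pathlib import Path

# Assumed set of audio extensions; the real list may differ.
AUDIO_EXTS = {".m4a", ".ogg", ".mp3"}


def snapshot_audio_files(root: Path) -> set:
    """Return every audio file currently under `root` (recursive)."""
    return {
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in AUDIO_EXTS
    }


def flatten_single_subdir_chain(folder: Path) -> None:
    """While `folder` holds exactly one entry and it is a directory, lift its contents up."""
    while True:
        entries = list(folder.iterdir())
        if len(entries) != 1 or not entries[0].is_dir():
            return
        # Rename the lone child first so an inner item sharing its name cannot collide.
        staging = folder / f".flatten-{uuid.uuid4().hex}"
        entries[0].rename(staging)
        for item in list(staging.iterdir()):
            shutil.move(str(item), str(folder / item.name))
        staging.rmdir()
```

New files can then be identified as the difference of two snapshots, for example `snapshot_audio_files(root) - before`, where `before` was taken ahead of the download.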
+
+---
+
+## External Dependencies
+
+| Dependency | Purpose |
+|------------|---------|
+| `votify-fix` (GitHub: GladistonXD/votify-fix) | Spotify download CLI |
+| `ffmpeg` | MP3 conversion |
+| `aria2c` | Optional download manager |
+| `yt-dlp` | Default download manager |
+| `mp4decrypt` (Bento4) | MP4 DRM decryption |
+
+---
+
+## Limitations
+
+- Requires valid Spotify cookies (must be refreshed periodically when they expire).
+- DRM-protected content requires a Widevine device certificate.
+- Quality options are limited to what Votify and the Spotify API expose.
+
+---
+
+## Key Files
+
+| File | Relevance |
+|------|-----------|
+| [app.py](../app.py) | `run_download()`, `post_process_votify_files()`, route `/api/download` |
+| [utils.py](../utils.py) | `rename_from_metadata()`, `cleanup_empty_dirs()` |
+| [Dockerfile](../Dockerfile) | Installation of votify-fix, ffmpeg, aria2, Bento4 |
diff --git a/firebase-debug.log b/firebase-debug.log
deleted file mode 100644
index e69de29..0000000