# Karaokepedia - Static Website Archive

## Project Overview

This is a **static HTML archive** of Karaokepedia (karaoke.karaniwan.org), a semi-crowdsourced karaoke database focused on Japanese/anime songs available in Philippine karaoke machines. The site was deprecated and migrated to awitotaku.com, but this archive preserves the original content for reference.

**Status**: HTTrack mirror from December 29, 2025. Dockerized with nginx:alpine, automated builds via Gitea Actions.

## Architecture

### Static Content

- **Pure Static HTML**: All files in `karaoke.karaniwan.org/` are pre-rendered HTML pages (~17.6 MB, 2,309 files)
- **No backend**: HTTrack mirror of a Ruby on Rails app (visible in HTML comments), but this is static HTML only
- **Assets included**: 2 MB of CSS/JS/fonts/images stay in-repo (self-contained archive)

### Container Stack

- **Base image**: nginx:alpine (~25 MB)
- **Web server**: Custom nginx.conf with font MIME types, gzip, caching
- **Security**: Runs as non-root user, includes healthcheck
- **Final size**: ~35-40 MB total
- **CI/CD**: Gitea Actions builds on push to main

## Structure

```
.
├── karaoke.karaniwan.org/          # Static HTML content (served by nginx)
│   ├── songs/                      # Individual song pages
│   ├── songs*.html                 # Paginated listings (hex-named)
│   ├── artists/                    # Artist profiles
│   ├── karaoke_machines/           # Machine-specific listings
│   ├── tags/                       # Tag-based grouping
│   ├── assets/                     # CSS/JS/images/fonts (~2 MB)
│   └── index.html                  # Main entry point
├── Dockerfile                      # nginx:alpine with custom config
├── nginx.conf                      # Custom nginx configuration
├── .dockerignore                   # Excludes HTTrack artifacts
├── .gitea/workflows/build.yml      # CI/CD pipeline
├── hts-cache/                      # HTTrack metadata (not in image)
├── hts-log.txt                     # HTTrack log (not in image)
└── index.html                      # HTTrack root nav (not in image)
```

## Data Model (Static)

Each song page includes:

- **Song title** and artist (linked)
- **Machine keys**: KY (Kumyoung), TJ (TJ Media), P (Platinum) with numeric codes
- **Tags**: Language (Japanese/English/Korean/OPM), genre (Pop/Rock/Metal), type (Anime OST/Drama OST)
- **Release dates** and alternative names
- Bootstrap 3 responsive layout

**File naming**: Hex-based (e.g., `songs9285.html`, `songs02d1.html`) for pagination/organization.

## Development & Deployment

### Local Testing

```bash
# Quick test with Python
python3 -m http.server 8000
# Visit http://localhost:8000/karaoke.karaniwan.org/

# Docker build and run
docker build -t karaokepedia:test .
docker run -p 8080:80 karaokepedia:test
# Visit http://localhost:8080/

# Check healthcheck (replace <container-id> with the ID or name shown by `docker ps`)
docker inspect --format='{{.State.Health.Status}}' <container-id>
```

### Building for Production

```bash
# Build with tags
docker build -t karaokepedia:latest .
docker tag karaokepedia:latest your-registry/karaokepedia:latest

# Push to registry
docker push your-registry/karaokepedia:latest
```

### CI/CD Pipelines (Gitea Actions)

#### Build & Deploy (`.gitea/workflows/build.yml`)

- **Trigger**: Push to `main` branch or manual dispatch
- **Steps**: Checkout → Setup Buildx → Login to registry → Build & push → Output digest
- **Tags**: `:latest` and `:main-`
- **Registry**: Configure via secrets (DOCKER_USERNAME/DOCKER_PASSWORD for Docker Hub, or adapt for Gitea registry)

#### PR Validation (`.gitea/workflows/pr-validation.yml`)

- **Trigger**: Pull request opened, edited, synchronized, or reopened
- **Jobs**:
  - `validate-pr-title`: Enforces [Conventional Commits](https://www.conventionalcommits.org/) format
  - `validate-docker`: Checks Dockerfile, nginx.conf, and .dockerignore syntax
  - `build-test`: Builds the image and tests that the container starts, pages load, and assets are accessible
  - `check-files`: Verifies that required files exist and HTTrack artifacts are excluded

**Conventional Commit Types**: `feat`, `fix`, `docs`, `style`, `refactor`, `perf`, `test`, `chore`, `ci`, `build`, `revert`

**Example PR titles**:

- ✅ `feat: add user authentication`
- ✅ `fix(docker): correct nginx config path`
- ✅ `docs: update README with deployment steps`
- ❌ `Added new feature` (missing type)
- ❌ `Update files` (not descriptive)

### Registry Configuration

Edit `.gitea/workflows/build.yml` and uncomment the appropriate registry:

- **Docker Hub** (default): Uses `DOCKER_USERNAME` and `DOCKER_PASSWORD` secrets
- **GitHub Container Registry**: Uncomment GHCR section, uses `GITHUB_TOKEN`
- **Gitea Container Registry**: Uncomment Gitea section, configure domain and credentials

## Modifying Content

Since this is static HTML:

1. **Edit HTML directly** - no templating system
2. **Manual updates** across paginated files (songs*.html) if needed
3. **No regeneration tool** - changes must be applied per-file

## Navigation Conventions

- **Internal links** are relative (e.g., `../songs/`, `../../artists/`)
- **Asset references** use `assets/application-*.css` and `assets/application-*.js` (fingerprinted)
- **External services**: Disqus comments (disabled), Piwik analytics (historic)

## Key Patterns

- **Machine abbreviations**: KY=Kumyoung (red label), TJ=TJ Media (orange label), P=Platinum (no color)
- **Alphabetical pagination**: `songs.html?initial=A`, `songs6c50.html?initial=A`
- **Deprecated notices**: All pages show "This site has been deprecated. Proceed to Awit Otaku"

## HTTrack Artifacts

- `hts-cache/`: Mirror metadata (new.lst lists all mirrored URLs)
- `hts-log.txt`: Download log (2,312 links, 2,309 files, two 404 errors)
- `cookies.txt`: Session cookies from mirroring
- Root `index.html`: HTTrack navigation page

## What NOT to Do

- Don't look for package.json, Gemfile, or build configs - they don't exist
- Don't try to "npm install" or "bundle install" - this isn't a dev project
- Don't modify HTTrack files (hts-cache/, hts-log.txt) - they're archival metadata
- Don't expect dynamic search/filtering - all functionality is static HTML links
- Don't include HTTrack artifacts in the Docker image - they're excluded via .dockerignore
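
To check a PR title locally before opening a pull request, the Conventional Commits rule described under PR Validation can be approximated with a small bash function. This is only a sketch: the function name `is_conventional_title` and the exact regular expression are assumptions for illustration and are not taken from `pr-validation.yml`, which may apply stricter or looser rules.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not copied from pr-validation.yml): returns success when
# a PR title starts with one of the Conventional Commit types listed in this
# document, followed by an optional scope, an optional "!", and ": <description>".

is_conventional_title() {
  local title="$1"
  local types='feat|fix|docs|style|refactor|perf|test|chore|ci|build|revert'
  # Assumed pattern: type, optional "(scope)", optional "!", colon, space, text.
  local pattern='^('"$types"')(\([a-z0-9._-]+\))?!?: .+'
  [[ "$title" =~ $pattern ]]
}

# Try it on two of the example titles from this document.
for title in "fix(docker): correct nginx config path" "Added new feature"; do
  if is_conventional_title "$title"; then
    echo "ok:   $title"
  else
    echo "FAIL: $title"
  fi
done
```

The check only looks at the shape of the title. It cannot judge whether a description is meaningful, so a title like `chore: update files` would still pass.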