Karaokepedia - Static Website Archive
Project Overview
This is a static HTML archive of Karaokepedia (karaoke.karaniwan.org), a semi-crowdsourced karaoke database focused on Japanese/anime songs available in Philippine karaoke machines. The site was deprecated and migrated to awitotaku.com, but this archive preserves the original content for reference.
Status: HTTrack mirror from December 29, 2025. Dockerized with nginx:alpine, automated builds via Gitea Actions.
Architecture
Static Content
- Pure Static HTML: All files in karaoke.karaniwan.org/ are pre-rendered HTML pages (~17.6 MB, 2,309 files)
- No backend: HTTrack mirror of Ruby on Rails app (visible in HTML comments), but this is static HTML only
- Assets included: 2 MB of CSS/JS/fonts/images stay in-repo (self-contained archive)
Container Stack
- Base image: nginx:alpine (~25 MB)
- Web server: Custom nginx.conf with font MIME types, gzip, caching
- Security: Runs as non-root user, includes healthcheck
- Final size: ~35-40 MB total
- CI/CD: Gitea Actions builds on push to main
Structure
.
├── karaoke.karaniwan.org/ # Static HTML content (served by nginx)
│ ├── songs/ # Individual song pages
│ ├── songs*.html # Paginated listings (hex-named)
│ ├── artists/ # Artist profiles
│ ├── karaoke_machines/ # Machine-specific listings
│ ├── tags/ # Tag-based grouping
│ ├── assets/ # CSS/JS/images/fonts (~2 MB)
│ └── index.html # Main entry point
├── Dockerfile # nginx:alpine with custom config
├── nginx.conf # Custom nginx configuration
├── .dockerignore # Excludes HTTrack artifacts
├── .gitea/workflows/build.yml # CI/CD pipeline
├── hts-cache/ # HTTrack metadata (not in image)
├── hts-log.txt # HTTrack log (not in image)
└── index.html # HTTrack root nav (not in image)
Data Model (Static)
Each song page includes:
- Song title and artist (linked)
- Machine keys: KY (Kumyoung), TJ (TJ Media), P (Platinum) with numeric codes
- Tags: Language (Japanese/English/Korean/OPM), genre (Pop/Rock/Metal), type (Anime OST/Drama OST)
- Release dates and alternative names
- Bootstrap 3 responsive layout
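The fields listed above can be pictured as a small record type. This is only a sketch for reasoning about the page contents: the archive has no such code, and the class name `Song` and all field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Machine abbreviations as listed in this document.
MACHINES = {"KY": "Kumyoung", "TJ": "TJ Media", "P": "Platinum"}


@dataclass
class Song:
    """Hypothetical record for one song page; field names are illustrative."""

    title: str
    artist: str
    # Machine key -> numeric code kept as text, e.g. {"KY": "12345"}.
    machine_codes: Dict[str, str] = field(default_factory=dict)
    tags: List[str] = field(default_factory=list)
    release_date: Optional[str] = None
    alternative_names: List[str] = field(default_factory=list)

    def unknown_machines(self) -> List[str]:
        """Return machine keys that are not one of KY, TJ, or P."""
        return sorted(k for k in self.machine_codes if k not in MACHINES)
```

Codes are stored as strings here on the assumption that leading zeros may matter; the source only says the codes are numeric.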
File naming: Hex-based (e.g., songs9285.html, songs02d1.html) for pagination/organization.
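A quick way to recognise listing pages by name is a regular expression. The sketch below assumes the suffix is always four lowercase hex characters, which is inferred only from the two example names above; the helper name `is_listing_page` is hypothetical.

```python
import re

# Assumption: a listing page is "songs" plus an optional four-character
# lowercase hex suffix, as in songs9285.html and songs02d1.html.
LISTING_NAME = re.compile(r"songs(?P<suffix>[0-9a-f]{4})?\.html")


def is_listing_page(filename: str) -> bool:
    """Return True when filename looks like a paginated song listing."""
    return LISTING_NAME.fullmatch(filename) is not None
```

If the archive contains suffixes of another length, the `{4}` quantifier would need loosening.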
Development & Deployment
Local Testing
# Quick test with Python
python3 -m http.server 8000
# Visit http://localhost:8000/karaoke.karaniwan.org/
# Docker build and run
docker build -t karaokepedia:test .
docker run -p 8080:80 karaokepedia:test
# Visit http://localhost:8080/
# Check healthcheck
docker inspect --format='{{.State.Health.Status}}' <container-id>
Building for Production
# Build with tags
docker build -t karaokepedia:latest .
docker tag karaokepedia:latest your-registry/karaokepedia:latest
# Push to registry
docker push your-registry/karaokepedia:latest
CI/CD Pipelines (Gitea Actions)
Build & Deploy (.gitea/workflows/build.yml)
- Trigger: Push to main branch or manual dispatch
- Steps: Checkout → Setup Buildx → Login to registry → Build & push → Output digest
- Tags: :latest and :main-<commit-sha>
- Registry: Configure via secrets (DOCKER_USERNAME/DOCKER_PASSWORD for Docker Hub, or adapt for Gitea registry)
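The tag scheme in the Tags entry above amounts to two names per build. A minimal sketch, assuming the workflow joins the branch name and commit SHA with a hyphen; the function name `image_tags` is hypothetical, and whether build.yml uses the full or a shortened SHA is not stated here.

```python
from typing import List


def image_tags(repository: str, branch: str, commit_sha: str) -> List[str]:
    """Build the two tags described above: ':latest' and ':<branch>-<sha>'."""
    return [
        f"{repository}:latest",
        f"{repository}:{branch}-{commit_sha}",
    ]
```

For example, `image_tags("your-registry/karaokepedia", "main", "abc1234")` yields the `:latest` tag plus `your-registry/karaokepedia:main-abc1234`.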
PR Validation (.gitea/workflows/pr-validation.yml)
- Trigger: Pull request opened, edited, synchronized, or reopened
- Jobs:
  - validate-pr-title: Enforces Conventional Commits format
  - validate-docker: Checks Dockerfile, nginx.conf, and .dockerignore syntax
  - build-test: Builds image and tests container starts, pages load, assets accessible
  - check-files: Verifies required files exist, HTTrack artifacts excluded
Conventional Commit Types: feat, fix, docs, style, refactor, perf, test, chore, ci, build, revert
Example PR titles:
- ✅ feat: add user authentication
- ✅ fix(docker): correct nginx config path
- ✅ docs: update README with deployment steps
- ❌ Added new feature (missing type)
- ❌ Update files (not descriptive)
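The title rule can be approximated with a regular expression. This is a sketch of the Conventional Commits shape using the types listed above, not the check that pr-validation.yml actually runs; the names `TITLE_RE` and `is_valid_pr_title` are hypothetical.

```python
import re

# Types listed in this document.
TYPES = ("feat", "fix", "docs", "style", "refactor", "perf",
         "test", "chore", "ci", "build", "revert")

# type, optional "(scope)", optional "!", then ": " and a non-empty description.
TITLE_RE = re.compile(
    r"(?P<type>" + "|".join(TYPES) + r")"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)"
)


def is_valid_pr_title(title: str) -> bool:
    """Return True when the whole title matches the pattern above."""
    return TITLE_RE.fullmatch(title) is not None
```

A pattern like this can only check the shape of a title. It accepts `chore: update files`, so judging whether a description is meaningful is still left to review.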
Registry Configuration
Edit .gitea/workflows/build.yml and uncomment the appropriate registry:
- Docker Hub (default): Uses DOCKER_USERNAME and DOCKER_PASSWORD secrets
- GitHub Container Registry: Uncomment GHCR section, uses GITHUB_TOKEN
- Gitea Container Registry: Uncomment Gitea section, configure domain and credentials
Modifying Content
Since this is static HTML:
- Edit HTML directly - no templating system
- Manual updates across paginated files (songs*.html) if needed
- No regeneration tool - changes must be applied per-file
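Because the same change often has to land in every paginated listing, a small script can apply it in one pass. The sketch below is not part of the repository; the function name `replace_in_listings` is hypothetical, and it assumes the text to replace can be encoded as UTF-8 to match the bytes in the pages.

```python
from pathlib import Path
from typing import List


def replace_in_listings(root: Path, old: str, new: str) -> List[Path]:
    """Replace old with new in every songs*.html directly under root.

    Files are handled as bytes so that line endings and any unrelated
    byte sequences stay exactly as mirrored. Returns the changed paths.
    """
    old_bytes = old.encode("utf-8")
    new_bytes = new.encode("utf-8")
    changed = []
    for page in sorted(root.glob("songs*.html")):
        data = page.read_bytes()
        if old_bytes in data:
            page.write_bytes(data.replace(old_bytes, new_bytes))
            changed.append(page)
    return changed
```

Usage would look like `replace_in_listings(Path("karaoke.karaniwan.org"), "old text", "new text")`. Reviewing the resulting diff before committing is advisable, since the replacement is purely textual.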
Navigation Conventions
- Internal links are relative (e.g., ../songs/, ../../artists/)
- Asset references use assets/application-*.css and assets/application-*.js (fingerprinted)
- External services: Disqus comments (disabled), Piwik analytics (historic)
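When editing a page by hand, it can help to list any links that are not relative, since those stand out from the convention above. A minimal sketch using only the standard library; the names `LinkCollector` and `non_relative_links` are hypothetical, and the check only looks at href and src attributes.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect href and src attribute values from an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def non_relative_links(html: str) -> List[str]:
    """Return links that carry a scheme, a host, or a leading slash."""
    parser = LinkCollector()
    parser.feed(html)
    flagged = []
    for link in parser.links:
        parts = urlsplit(link)
        if parts.scheme or parts.netloc or link.startswith("/"):
            flagged.append(link)
    return flagged
```

Links to external sites, such as the deprecation notice pointing at Awit Otaku, will be flagged too and are expected; the list is a prompt for review, not a list of errors.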
Key Patterns
- Machine abbreviations: KY=Kumyoung (red label), TJ=TJ Media (orange label), P=Platinum (no color)
- Alphabetical pagination: songs.html?initial=A, songs6c50.html?initial=A
- Deprecated notices: All pages show "This site has been deprecated. Proceed to Awit Otaku"
HTTrack Artifacts
- hts-cache/: Mirror metadata (new.lst lists all mirrored URLs)
- hts-log.txt: Download log (2312 links, 2309 files, two 404 errors)
- cookies.txt: Session cookies from mirroring
- Root index.html: HTTrack navigation page
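A path check along these lines could back a "no artifacts in the image" test. It is a sketch under assumptions: the function name `is_httrack_artifact` is hypothetical, paths are taken relative to the repository root with forward slashes, and cookies.txt is assumed to sit at the root alongside hts-log.txt.

```python
from pathlib import PurePosixPath

# Artifact names listed above. The root index.html is handled separately
# because karaoke.karaniwan.org/index.html is legitimate site content.
ARTIFACT_NAMES = {"hts-cache", "hts-log.txt", "cookies.txt"}


def is_httrack_artifact(relative_path: str) -> bool:
    """Return True when a repo-relative path is an HTTrack artifact."""
    parts = PurePosixPath(relative_path).parts
    if not parts:
        return False
    if parts[0] in ARTIFACT_NAMES:
        return True
    # Only the top-level index.html is the HTTrack navigation page.
    return parts == ("index.html",)
```

The special case for index.html matters: matching on the file name alone would also flag the site's real entry point.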
What NOT to Do
- Don't look for package.json, Gemfile, or build configs - they don't exist
- Don't try to "npm install" or "bundle install" - this isn't a dev project
- Don't modify HTTrack files (hts-cache/, hts-log.txt) - they're archival metadata
- Don't expect dynamic search/filtering - all functionality is static HTML links
- Don't include HTTrack artifacts in Docker image - they're excluded via .dockerignore