Nick Sweeting
967c5d53e0
make plugin config more consistent
2025-12-29 13:21:46 -08:00
Claude
a5654e877f
rename media plugin to ytdlp with backwards-compatible aliases
- Rename archivebox/plugins/media/ → archivebox/plugins/ytdlp/
- Rename hook script on_Snapshot__63_media.bg.py → on_Snapshot__63_ytdlp.bg.py
- Update config.json: YTDLP_* as primary keys, MEDIA_* as x-aliases
- Update templates CSS classes: media-* → ytdlp-*
- Fix gallerydl bug: remove incorrect dependency on media plugin output
- Update all codebase references to use YTDLP_* and SAVE_YTDLP
- Add backwards compatibility test for MEDIA_ENABLED alias
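
A minimal sketch of how a primary key with `x-aliases` could be resolved, so that a legacy `MEDIA_ENABLED` setting still feeds `YTDLP_ENABLED`. The schema shape, the `resolve_config` helper, and the precedence rule (primary key wins over aliases) are assumptions for illustration, not the plugin's actual loader.

```python
from typing import Any, Dict, Mapping

# Hypothetical schema fragment; only the YTDLP_*/MEDIA_* naming and the
# "x-aliases" key come from the commit message, the rest is assumed.
SCHEMA: Dict[str, Dict[str, Any]] = {
    "YTDLP_ENABLED": {"default": True, "x-aliases": ["MEDIA_ENABLED"]},
}


def resolve_config(schema: Mapping[str, Mapping[str, Any]],
                   env: Mapping[str, Any]) -> Dict[str, Any]:
    """Resolve each primary key from env, falling back to aliases, then default."""
    resolved: Dict[str, Any] = {}
    for key, spec in schema.items():
        # Assumed precedence: primary key first, then aliases in listed order.
        for name in [key, *spec.get("x-aliases", [])]:
            if name in env:
                resolved[key] = env[name]
                break
        else:
            # Neither the primary key nor any alias was set.
            resolved[key] = spec.get("default")
    return resolved
```

With this precedence, a deployment that still sets only `MEDIA_ENABLED` keeps working, while setting both lets the new `YTDLP_ENABLED` name take priority.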
2025-12-29 19:09:05 +00:00
Nick Sweeting
30c60eef76
much better tests and add page ui
2025-12-29 04:02:11 -08:00
Nick Sweeting
1e4d3ffd11
improve plugin tests and config
2025-12-29 00:45:23 -08:00
Nick Sweeting
f0aa19fa7d
wip
2025-12-28 17:51:54 -08:00
Claude
1b5a816022
Implement hook step-based concurrency system
This implements the hook concurrency plan from TODO_hook_concurrency.md:
## Schema Changes
- Add Snapshot.current_step (IntegerField 0-9, default=0)
- Create migration 0034_snapshot_current_step.py
- Fix uuid_compat imports in migrations 0032 and 0003
## Core Logic
- Add extract_step(hook_name) utility - extracts step from __XX_ pattern
- Add is_background_hook(hook_name) utility - checks for .bg. suffix
- Update Snapshot.create_pending_archiveresults() to create one AR per hook
- Update ArchiveResult.run() to handle hook_name field
- Add Snapshot.advance_step_if_ready() method for step advancement
- Integrate with SnapshotMachine.is_finished() to call advance_step_if_ready()
## Worker Coordination
- Update ArchiveResultWorker.get_queue() for step-based filtering
- ARs are only claimable when their step <= snapshot.current_step
## Hook Renumbering
- Step 5 (DOM extraction): singlefile→50, screenshot→51, pdf→52, dom→53,
  title→54, readability→55, headers→55, mercury→56, htmltotext→57
- Step 6 (post-DOM): wget→61, git→62, media→63.bg, gallerydl→64.bg,
  forumdl→65.bg, papersdl→66.bg
- Step 7 (URL extraction): parse_* hooks moved to 70-75
Background hooks (.bg suffix) don't block step advancement, enabling
long-running downloads to continue while other hooks proceed.
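
The step rules described above can be sketched roughly as follows. The function names `extract_step` and `is_background_hook` come from this commit message. Their bodies, the regex, the fallback step of 9, and the `is_claimable` / `can_advance` helpers are assumptions for illustration, as are the example hook filenames.

```python
import re
from typing import Iterable

# Assumed naming scheme: "..._​_XX_name", where the first digit of XX is the step.
_STEP_RE = re.compile(r"__(\d)\d_")


def extract_step(hook_name: str, default: int = 9) -> int:
    """Return the step (0-9) encoded in a hook name's __XX_ pattern."""
    match = _STEP_RE.search(hook_name)
    # Assumption: hooks without a numeric prefix fall into the last step.
    return int(match.group(1)) if match else default


def is_background_hook(hook_name: str) -> bool:
    """Background hooks are marked with a .bg. suffix before the extension."""
    return ".bg." in hook_name


def is_claimable(hook_name: str, current_step: int) -> bool:
    """A result is claimable only once the snapshot has reached its step."""
    return extract_step(hook_name) <= current_step


def can_advance(pending_hooks: Iterable[str], current_step: int) -> bool:
    """Hypothetical check: only unfinished foreground hooks block the step."""
    return not any(
        extract_step(name) == current_step and not is_background_hook(name)
        for name in pending_hooks
    )
```

For example, a pending `on_Snapshot__63_ytdlp.bg.py` would not hold a snapshot at step 6, while a pending foreground hook numbered 61 would.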
2025-12-28 13:47:25 +00:00
Nick Sweeting
4ccb0863bb
continue renaming extractor to plugin, add plan for hook concurrency, add chrome kill helper script
2025-12-28 05:29:24 -08:00
Nick Sweeting
50e527ec65
way better plugin hooks system wip
2025-12-28 03:39:59 -08:00
Claude
e3ba599812
Update install hooks to respect XYZ_BINARY env vars
- All install hooks now respect their respective XYZ_BINARY env vars
  (e.g., WGET_BINARY, CHROME_BINARY, YTDLP_BINARY, etc.)
- Support both absolute paths (/usr/bin/wget2) and binary names (wget2)
- Dynamic bin_name used in Dependency JSONL output
- Updated 11 install hooks to follow the new pattern
- Mark checklist items as complete in TODO_hook_architecture.md
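
A minimal sketch of the pattern described above, assuming an install hook reads its XYZ_BINARY variable and accepts either an absolute path or a bare binary name. The `resolve_binary` helper and its return shape are hypothetical and are not taken from the hooks themselves.

```python
import os
import shutil
from typing import Mapping, Optional, Tuple


def resolve_binary(env_var: str, default_name: str,
                   env: Optional[Mapping[str, str]] = None) -> Tuple[str, Optional[str]]:
    """Return (bin_name, abspath or None) for a binary configured via env_var."""
    env = os.environ if env is None else env
    configured = (env.get(env_var) or "").strip() or default_name

    # The dynamic bin_name is the final path component, e.g. "wget2".
    bin_name = os.path.basename(configured)

    if os.path.dirname(configured):
        # A path was given (e.g. /usr/bin/wget2): use it only if executable.
        abspath = configured if os.access(configured, os.X_OK) else None
    else:
        # A bare name was given (e.g. wget2): look it up on PATH.
        abspath = shutil.which(configured)
    return bin_name, abspath
```

Here `resolve_binary("WGET_BINARY", "wget")` falls back to `wget` when the variable is unset, and reports `wget2` as the name for either `WGET_BINARY=wget2` or `WGET_BINARY=/usr/bin/wget2`.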
2025-12-27 10:12:45 +00:00
Claude
8c846b7d1c
Rename validate hooks to install hooks
- Rename 13 on_Crawl__00_validate_* hooks to on_Crawl__00_install_*
- This better reflects what these hooks actually do (check/install binaries)
- Update TODO_hook_architecture.md to reflect renamed hooks
2025-12-27 10:06:34 +00:00
Nick Sweeting
2f81c0cc76
add overrides options to binproviders
2025-12-26 20:39:56 -08:00
Nick Sweeting
e2cbcd17f6
more tests and migrations fixes
2025-12-26 18:22:48 -08:00
Nick Sweeting
0fbcbd2616
gallerydl template
2025-12-26 11:55:19 -08:00
Nick Sweeting
4fd7fcdbcf
new gallerydl plugin and more
2025-12-26 11:55:03 -08:00