Commit Graph

196 Commits

Author SHA1 Message Date
Claude
a5654e877f rename media plugin to ytdlp with backwards-compatible aliases
- Rename archivebox/plugins/media/ → archivebox/plugins/ytdlp/
- Rename hook script on_Snapshot__63_media.bg.py → on_Snapshot__63_ytdlp.bg.py
- Update config.json: YTDLP_* as primary keys, MEDIA_* as x-aliases
- Update templates CSS classes: media-* → ytdlp-*
- Fix gallerydl bug: remove incorrect dependency on media plugin output
- Update all codebase references to use YTDLP_* and SAVE_YTDLP
- Add backwards compatibility test for MEDIA_ENABLED alias
2025-12-29 19:09:05 +00:00
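
The alias behaviour described in the commit above (YTDLP_* as primary keys, MEDIA_* as x-aliases) could be resolved roughly as follows. This is a minimal sketch, not the project's loader: the `SCHEMA` layout, the `YTDLP_TIMEOUT`/`MEDIA_TIMEOUT` pair, the default values, and the `resolve_config` helper are all assumptions made for illustration. Only the `x-aliases` key name and the `MEDIA_ENABLED` alias come from the commit message.

```python
from typing import Any, Dict, Mapping

# Hypothetical schema fragment; the real config.json layout is not shown in this log.
SCHEMA: Dict[str, Dict[str, Any]] = {
    "YTDLP_ENABLED": {"default": True, "x-aliases": ["MEDIA_ENABLED"]},
    "YTDLP_TIMEOUT": {"default": 60, "x-aliases": ["MEDIA_TIMEOUT"]},  # illustrative key
}


def resolve_config(schema: Mapping[str, Mapping[str, Any]],
                   user_config: Mapping[str, Any]) -> Dict[str, Any]:
    """Return values keyed by primary name, honouring legacy aliases."""
    resolved: Dict[str, Any] = {}
    for key, spec in schema.items():
        value = spec.get("default")
        # Apply legacy aliases first so an explicitly set primary key always wins.
        for alias in spec.get("x-aliases", []):
            if alias in user_config:
                value = user_config[alias]
        if key in user_config:
            value = user_config[key]
        resolved[key] = value
    return resolved


legacy = resolve_config(SCHEMA, {"MEDIA_ENABLED": False})
both = resolve_config(SCHEMA, {"MEDIA_ENABLED": False, "YTDLP_ENABLED": True})
print(legacy)
print(both)
```

Letting the primary key override its alias keeps old configs working while giving new settings precedence when both are present.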
Nick Sweeting
30c60eef76 much better tests and add page ui 2025-12-29 04:02:11 -08:00
Nick Sweeting
f4e7820533 use full dotted paths for all archivebox imports, add migrations and more fixes 2025-12-29 00:47:08 -08:00
Nick Sweeting
f0aa19fa7d wip 2025-12-28 17:51:54 -08:00
Claude
057b49ad85 Update status command to use DB as source of truth
Remove imports of deleted folder utility functions and rewrite
status command to query Snapshot model directly. This aligns with
the fs_version refactor where the DB is the single source of truth.

- Use Snapshot.objects queries for indexed/archived/unarchived counts
- Scan filesystem directly for present/orphaned directory counts
- Simplify output to focus on essential status information
2025-12-28 19:19:03 +00:00
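
The split described in the commit above (DB rows for indexed counts, a filesystem scan for present/orphaned directories) might look something like the sketch below. It is an assumption-laden illustration: `summarize_dirs` is a hypothetical helper, the set of indexed names stands in for whatever a `Snapshot.objects` query would return, and "present" is taken to mean every directory on disk while "orphaned" means a directory with no matching DB row.

```python
import tempfile
from pathlib import Path
from typing import Dict, Set


def summarize_dirs(archive_dir: Path, indexed_names: Set[str]) -> Dict[str, int]:
    """Compare snapshot directories on disk against names known to the DB."""
    on_disk = {p.name for p in archive_dir.iterdir() if p.is_dir()}
    return {
        "indexed": len(indexed_names),             # rows in the DB (source of truth)
        "present": len(on_disk),                   # directories found on disk
        "orphaned": len(on_disk - indexed_names),  # on disk, but no DB row
    }


# Demo with a throwaway directory; the names stand in for snapshot folder names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("1700000001", "1700000002", "1700000003"):
        (root / name).mkdir()
    counts = summarize_dirs(root, indexed_names={"1700000001", "1700000002"})

print(counts)
```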
Nick Sweeting
bd265c0083 rename extractor to plugin everywhere 2025-12-28 04:43:15 -08:00
Nick Sweeting
50e527ec65 way better plugin hooks system wip 2025-12-28 03:39:59 -08:00
Claude
b632894bc9 Update views, API, and exports for new ArchiveResult output fields
Replace old `output` field with new fields across the codebase:
- output_str: Human-readable output summary
- output_json: Structured metadata (optional)
- output_files: Dict of output files with metadata
- output_size: Total size in bytes
- output_mimetypes: CSV of file mimetypes

Files updated:
- api/v1_core.py: Update MinimalArchiveResultSchema to expose new fields
- api/v1_core.py: Update ArchiveResultFilterSchema to search output_str
- cli/archivebox_extract.py: Use output_str in CLI output
- core/admin_archiveresults.py: Update admin fields, search, and fieldsets
- core/admin_archiveresults.py: Fix output_html variable name bug in output_summary
- misc/jsonl.py: Update archiveresult_to_jsonl() to include new fields
- plugins/extractor_utils.py: Update ExtractorResult helper class

The embed_path() method already uses output_files and output_str,
so snapshot detail page and template tags work correctly.
2025-12-27 20:28:22 +00:00
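
To illustrate how the new output fields listed in the commit above relate to one another, here is a small stand-alone sketch. It is not the ArchiveResult model: `OutputSummary` is a hypothetical class, the per-file metadata keys `size` and `mimetype` are assumptions, and deriving `output_size` and `output_mimetypes` from `output_files` is just one plausible way to keep them consistent.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class OutputSummary:
    """Hypothetical stand-in mirroring the field names from the commit message."""
    output_str: str = ""                           # human-readable summary
    output_json: Optional[Dict[str, Any]] = None   # optional structured metadata
    output_files: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    @property
    def output_size(self) -> int:
        # Total size in bytes, summed over the assumed per-file "size" key.
        return sum(int(meta.get("size", 0)) for meta in self.output_files.values())

    @property
    def output_mimetypes(self) -> str:
        # CSV of distinct mimetypes, in first-seen order.
        seen: List[str] = []
        for meta in self.output_files.values():
            mimetype = meta.get("mimetype")
            if mimetype and mimetype not in seen:
                seen.append(mimetype)
        return ",".join(seen)


summary = OutputSummary(
    output_str="3 files saved",
    output_files={
        "index.html": {"size": 1200, "mimetype": "text/html"},
        "shot.png": {"size": 300, "mimetype": "image/png"},
        "extra.html": {"size": 100, "mimetype": "text/html"},
    },
)
print(summary.output_size, summary.output_mimetypes)
```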
Claude
c3acadd528 Remove extractor field from Crawl model and fix tests
- Remove extractor field from Crawl model (moved to config dict)
- Update migration 0002_drop_seed_model to not add extractor
- Update archivebox_add.py to use config['PARSER'] instead
- Update admin.py recrawl to not pass extractor
- Update jsonl.py serialization to not include extractor
- Update test schema SCHEMA_0_8 to not include extractor
- Set default timeout to 60s for test commands
2025-12-27 01:49:09 +00:00
Nick Sweeting
9838d7ba02 tons of ui fixes and plugin fixes 2025-12-25 03:59:51 -08:00
Nick Sweeting
bb53228ebf remove Seed model in favor of Crawl as template 2025-12-25 01:52:41 -08:00
Nick Sweeting
28e6c5bb65 add mcp server support 2025-12-25 01:51:42 -08:00
Nick Sweeting
866f993f26 logging and admin ui improvements 2025-12-25 01:10:41 -08:00
Nick Sweeting
d95f0dc186 remove huey 2025-12-24 23:40:18 -08:00
Nick Sweeting
6c769d831c wip 2 2025-12-24 21:46:14 -08:00
Nick Sweeting
1915333b81 wip major changes 2025-12-24 20:10:38 -08:00
Nick Sweeting
c1335fed37 Remove ABID system and KVTag model - use UUIDv7 IDs exclusively
This commit completes the simplification of the ID system by:

- Removing the ABID (ArchiveBox ID) system entirely
- Removing the base_models/abid.py file
- Removing KVTag model in favor of the existing Tag model in core/models.py
- Simplifying all models to use standard UUIDv7 primary keys
- Removing ABID-related admin functionality
- Cleaning up commented-out ABID code from views and statemachines
- Deleting migration files for ABID field removal (no longer needed)

All models now use simple UUIDv7 ids via `id = models.UUIDField(primary_key=True, default=uuid7)`

Note: Old migrations containing ABID references are preserved for database
migration history compatibility.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-24 06:13:49 -08:00
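
For readers unfamiliar with UUIDv7, the sketch below shows what a `uuid7` callable of the kind passed as `default=uuid7` in the commit above could do. The log does not show where the project imports `uuid7` from, so this stand-alone version is an assumption. It follows the RFC 9562 layout: a 48-bit Unix millisecond timestamp, a 4-bit version, 12 random bits, a 2-bit variant, and 62 random bits.

```python
import os
import time
import uuid


def uuid7() -> uuid.UUID:
    """Build a UUIDv7: a millisecond timestamp followed by random bits."""
    ts_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = rand >> 68                           # top 12 bits
    rand_b = rand & ((1 << 62) - 1)               # low 62 bits

    value = (ts_ms & ((1 << 48) - 1)) << 80       # bits 127..80: timestamp
    value |= 0x7 << 76                            # bits 79..76: version 7
    value |= rand_a << 64                         # bits 75..64: rand_a
    value |= 0b10 << 62                           # bits 63..62: RFC variant
    value |= rand_b                               # bits 61..0: rand_b
    return uuid.UUID(int=value)


new_id = uuid7()
print(new_id, new_id.version)
```

Because the timestamp occupies the most significant bits, ids generated in different milliseconds sort by creation time, which is the usual reason to prefer UUIDv7 for primary keys. Ids created within the same millisecond are ordered randomly in this sketch.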
Nick Sweeting
930b9bf386 add archivebox worker cli cmd to list of all cmds 2024-12-12 21:44:44 -08:00
Nick Sweeting
5cf7725f0e add new archivebox worker implementation based on better distributed systems principles 2024-12-12 21:41:45 -08:00
Nick Sweeting
dcd7e2555e add new archivebox_extract cli command 2024-12-03 02:14:56 -08:00
Nick Sweeting
b948e49013 add urls log to Crawl model 2024-11-19 06:32:33 -08:00
Nick Sweeting
4dd53dc12a Merge branch 'newchanges' into dev 2024-11-19 05:28:20 -08:00
Nick Sweeting
b852951c58 fix cli loading edge case where setup_django wasnt running when it should 2024-11-19 05:27:35 -08:00
Nick Sweeting
f8e2f7c753 restore missing archivebox_update work 2024-11-19 05:09:19 -08:00
Nick Sweeting
52446b86ba restore missing archivebox_status work 2024-11-19 05:08:41 -08:00
Nick Sweeting
0f536ff18b restore missing archivebox_schedule work 2024-11-19 05:07:55 -08:00
Nick Sweeting
fe3320eff0 restore missing archivebox_remove work 2024-11-19 05:07:12 -08:00
Nick Sweeting
230bf34e14 restore missing archivebox_config work 2024-11-19 05:05:06 -08:00
Nick Sweeting
6740202d78 fix cli loading edge case where setup_django wasnt running when it should 2024-11-19 04:20:00 -08:00
Nick Sweeting
f21b86aba8 better cli colors 2024-11-19 04:10:07 -08:00
Nick Sweeting
0f860d40f1 working archivebox_status CLI cmd 2024-11-19 04:05:05 -08:00
Nick Sweeting
292730ebad working archivebox_schedule cmd 2024-11-19 03:54:47 -08:00
Nick Sweeting
3a64ced697 fix archivebox delete errors 2024-11-19 03:45:44 -08:00
Nick Sweeting
0347b911aa archivebox add and remove CLI cmds 2024-11-19 03:40:01 -08:00
Nick Sweeting
2595139180 improve statemachine logging and archivebox update CLI cmd 2024-11-19 03:31:05 -08:00
Nick Sweeting
c9a05c9d94 working archivebox update CLI cmd 2024-11-19 02:32:05 -08:00
Nick Sweeting
a0edf218e8 fix archivebox init and archivebox install CLI commands 2024-11-19 01:05:49 -08:00
Nick Sweeting
5f01fc8307 fix archivebox shell and manage CLI commands 2024-11-19 00:48:39 -08:00
Nick Sweeting
328eb98a38 move main funcs into cli files and switch to using click for CLI 2024-11-19 00:18:51 -08:00
Nick Sweeting
569081a9eb rename abid_utils to base_models 2024-11-18 19:40:05 -08:00
Nick Sweeting
65afd405b1 merge seeds and crawls apps 2024-11-18 19:23:14 -08:00
Nick Sweeting
4a5d607296 move logging_util into archivebox.misc subfolder 2024-11-18 19:08:49 -08:00
Nick Sweeting
e469c5a344 merge queues and actors apps into new workers app 2024-11-18 18:52:48 -08:00
Nick Sweeting
0acd388c02 fix imports and deps 2024-11-18 18:07:34 -08:00
Nick Sweeting
6b83b4c995 leave archivebox running when in archivebox update 2024-11-18 04:27:38 -08:00
Nick Sweeting
eeb2671e4d API improvements 2024-11-18 04:27:38 -08:00
Nick Sweeting
1e3ce67834 fix API and CLU calls 2024-11-18 04:27:38 -08:00
Nick Sweeting
c8e186f21b fix plugin loading order, admin, abx-pkg 2024-11-16 06:44:12 -08:00
Nick Sweeting
b4a5da3ffd update archivebox add CLI command to use new actor system 2024-11-16 02:45:37 -08:00
Nick Sweeting
312e40b95b finally get rid of config/legacy in favor of configfile.py and django.py 2024-10-21 03:06:19 -07:00