191 Commits

Author SHA1 Message Date
Nick Sweeting
bd265c0083 rename extractor to plugin everywhere 2025-12-28 04:43:15 -08:00
Nick Sweeting
50e527ec65 way better plugin hooks system wip 2025-12-28 03:39:59 -08:00
Claude
b632894bc9 Update views, API, and exports for new ArchiveResult output fields
Replace old `output` field with new fields across the codebase:
- output_str: Human-readable output summary
- output_json: Structured metadata (optional)
- output_files: Dict of output files with metadata
- output_size: Total size in bytes
- output_mimetypes: CSV of file mimetypes

Files updated:
- api/v1_core.py: Update MinimalArchiveResultSchema to expose new fields
- api/v1_core.py: Update ArchiveResultFilterSchema to search output_str
- cli/archivebox_extract.py: Use output_str in CLI output
- core/admin_archiveresults.py: Update admin fields, search, and fieldsets
- core/admin_archiveresults.py: Fix output_html variable name bug in output_summary
- misc/jsonl.py: Update archiveresult_to_jsonl() to include new fields
- plugins/extractor_utils.py: Update ExtractorResult helper class

The embed_path() method already uses output_files and output_str,
so snapshot detail page and template tags work correctly.
2025-12-27 20:28:22 +00:00
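
The commit above lists five output fields without showing how they relate. A minimal sketch, assuming `output_size` and `output_mimetypes` can be derived from `output_files`. The `ArchiveResultOutput` class, the `from_files` helper, and the `size`/`mimetype` metadata keys are hypothetical and not taken from the ArchiveBox codebase:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ArchiveResultOutput:
    """Hypothetical container mirroring the five field names in the commit message."""
    output_str: str = ""                                   # human-readable summary
    output_json: Optional[Dict[str, Any]] = None           # optional structured metadata
    output_files: Dict[str, Dict[str, Any]] = field(default_factory=dict)
    output_size: int = 0                                   # total size in bytes
    output_mimetypes: str = ""                             # CSV of file mimetypes

    @classmethod
    def from_files(cls, output_str, output_files, output_json=None):
        # Assumption: each file's metadata dict carries "size" and "mimetype" keys.
        total = sum(int(meta.get("size", 0)) for meta in output_files.values())
        mimetypes = []
        for meta in output_files.values():
            mimetype = meta.get("mimetype")
            if mimetype and mimetype not in mimetypes:     # dedupe, keep first-seen order
                mimetypes.append(mimetype)
        return cls(
            output_str=output_str,
            output_json=output_json,
            output_files=output_files,
            output_size=total,
            output_mimetypes=",".join(mimetypes),
        )


result = ArchiveResultOutput.from_files(
    "saved 3 files",
    {
        "index.html": {"size": 1200, "mimetype": "text/html"},
        "screenshot.png": {"size": 800, "mimetype": "image/png"},
        "frame.html": {"size": 10, "mimetype": "text/html"},
    },
)
```

Here `result.output_size` is the sum of the per-file sizes and `result.output_mimetypes` is a de-duplicated comma-separated string; the real models may compute or store these differently.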
Claude
c3acadd528 Remove extractor field from Crawl model and fix tests
- Remove extractor field from Crawl model (moved to config dict)
- Update migration 0002_drop_seed_model to not add extractor
- Update archivebox_add.py to use config['PARSER'] instead
- Update admin.py recrawl to not pass extractor
- Update jsonl.py serialization to not include extractor
- Update test schema SCHEMA_0_8 to not include extractor
- Set default timeout to 60s for test commands
2025-12-27 01:49:09 +00:00
Nick Sweeting
9838d7ba02 tons of ui fixes and plugin fixes 2025-12-25 03:59:51 -08:00
Nick Sweeting
bb53228ebf remove Seed model in favor of Crawl as template 2025-12-25 01:52:41 -08:00
Nick Sweeting
28e6c5bb65 add mcp server support 2025-12-25 01:51:42 -08:00
Nick Sweeting
866f993f26 logging and admin ui improvements 2025-12-25 01:10:41 -08:00
Nick Sweeting
d95f0dc186 remove huey 2025-12-24 23:40:18 -08:00
Nick Sweeting
6c769d831c wip 2 2025-12-24 21:46:14 -08:00
Nick Sweeting
1915333b81 wip major changes 2025-12-24 20:10:38 -08:00
Nick Sweeting
c1335fed37 Remove ABID system and KVTag model - use UUIDv7 IDs exclusively
This commit completes the simplification of the ID system by:

- Removing the ABID (ArchiveBox ID) system entirely
- Removing the base_models/abid.py file
- Removing KVTag model in favor of the existing Tag model in core/models.py
- Simplifying all models to use standard UUIDv7 primary keys
- Removing ABID-related admin functionality
- Cleaning up commented-out ABID code from views and statemachines
- Deleting migration files for ABID field removal (no longer needed)

All models now use simple UUIDv7 ids via `id = models.UUIDField(primary_key=True, default=uuid7)`

Note: Old migrations containing ABID references are preserved for database
migration history compatibility.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-24 06:13:49 -08:00
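
The commit above names `uuid7` as the default for primary keys but does not show where it comes from. As an illustration only, here is a rough sketch of how a UUIDv7 value is laid out (48-bit millisecond timestamp, version bits, variant bits, random fill). The function name `uuid7_sketch` and its `now_ms` parameter are hypothetical; the project presumably imports `uuid7` from a library rather than defining it like this:

```python
import os
import time
import uuid


def uuid7_sketch(now_ms=None):
    """Build a UUIDv7-shaped value: timestamp in the top 48 bits, random bits below."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000           # Unix time in milliseconds
    rand = int.from_bytes(os.urandom(10), "big")       # 80 random bits
    rand_a = (rand >> 62) & 0xFFF                      # 12 bits placed after the version
    rand_b = rand & ((1 << 62) - 1)                    # 62 bits placed after the variant

    value = (now_ms & ((1 << 48) - 1)) << 80           # bits 127..80: timestamp
    value |= 0x7 << 76                                 # bits 79..76: version 7
    value |= rand_a << 64                              # bits 75..64: random
    value |= 0b10 << 62                                # bits 63..62: RFC variant
    value |= rand_b                                    # bits 61..0: random
    return uuid.UUID(int=value)


earlier = uuid7_sketch(now_ms=1_000)
later = uuid7_sketch(now_ms=2_000)
```

Because the timestamp occupies the most significant bits, IDs generated in later milliseconds sort after earlier ones, which is the usual reason to prefer UUIDv7 over random UUIDv4 primary keys.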
Nick Sweeting
930b9bf386 add archivebox worker cli cmd to list of all cmds 2024-12-12 21:44:44 -08:00
Nick Sweeting
5cf7725f0e add new archivebox worker implementation based on better distributed systems principles 2024-12-12 21:41:45 -08:00
Nick Sweeting
dcd7e2555e add new archivebox_extract cli command 2024-12-03 02:14:56 -08:00
Nick Sweeting
b948e49013 add urls log to Crawl model 2024-11-19 06:32:33 -08:00
Nick Sweeting
4dd53dc12a Merge branch 'newchanges' into dev 2024-11-19 05:28:20 -08:00
Nick Sweeting
b852951c58 fix cli loading edge case where setup_django wasn't running when it should 2024-11-19 05:27:35 -08:00
Nick Sweeting
f8e2f7c753 restore missing archivebox_update work 2024-11-19 05:09:19 -08:00
Nick Sweeting
52446b86ba restore missing archivebox_status work 2024-11-19 05:08:41 -08:00
Nick Sweeting
0f536ff18b restore missing archivebox_schedule work 2024-11-19 05:07:55 -08:00
Nick Sweeting
fe3320eff0 restore missing archivebox_remove work 2024-11-19 05:07:12 -08:00
Nick Sweeting
230bf34e14 restore missing archivebox_config work 2024-11-19 05:05:06 -08:00
Nick Sweeting
6740202d78 fix cli loading edge case where setup_django wasn't running when it should 2024-11-19 04:20:00 -08:00
Nick Sweeting
f21b86aba8 better cli colors 2024-11-19 04:10:07 -08:00
Nick Sweeting
0f860d40f1 working archivebox_status CLI cmd 2024-11-19 04:05:05 -08:00
Nick Sweeting
292730ebad working archivebox_schedule cmd 2024-11-19 03:54:47 -08:00
Nick Sweeting
3a64ced697 fix archivebox delete errors 2024-11-19 03:45:44 -08:00
Nick Sweeting
0347b911aa archivebox add and remove CLI cmds 2024-11-19 03:40:01 -08:00
Nick Sweeting
2595139180 improve statemachine logging and archivebox update CLI cmd 2024-11-19 03:31:05 -08:00
Nick Sweeting
c9a05c9d94 working archivebox update CLI cmd 2024-11-19 02:32:05 -08:00
Nick Sweeting
a0edf218e8 fix archivebox init and archivebox install CLI commands 2024-11-19 01:05:49 -08:00
Nick Sweeting
5f01fc8307 fix archivebox shell and manage CLI commands 2024-11-19 00:48:39 -08:00
Nick Sweeting
328eb98a38 move main funcs into cli files and switch to using click for CLI 2024-11-19 00:18:51 -08:00
Nick Sweeting
569081a9eb rename abid_utils to base_models 2024-11-18 19:40:05 -08:00
Nick Sweeting
65afd405b1 merge seeds and crawls apps 2024-11-18 19:23:14 -08:00
Nick Sweeting
4a5d607296 move logging_util into archivebox.misc subfolder 2024-11-18 19:08:49 -08:00
Nick Sweeting
e469c5a344 merge queues and actors apps into new workers app 2024-11-18 18:52:48 -08:00
Nick Sweeting
0acd388c02 fix imports and deps 2024-11-18 18:07:34 -08:00
Nick Sweeting
6b83b4c995 leave archivebox running when in archivebox update 2024-11-18 04:27:38 -08:00
Nick Sweeting
eeb2671e4d API improvements 2024-11-18 04:27:38 -08:00
Nick Sweeting
1e3ce67834 fix API and CLI calls 2024-11-18 04:27:38 -08:00
Nick Sweeting
c8e186f21b fix plugin loading order, admin, abx-pkg 2024-11-16 06:44:12 -08:00
Nick Sweeting
b4a5da3ffd update archivebox add CLI command to use new actor system 2024-11-16 02:45:37 -08:00
Nick Sweeting
312e40b95b finally get rid of config/legacy in favor of configfile.py and django.py 2024-10-21 03:06:19 -07:00
Nick Sweeting
b3107ab830 move final legacy config to plugins and fix archivebox config cmd and add search opt 2024-10-21 02:56:00 -07:00
Nick Sweeting
a211461ffc fix LIB_DIR and TMP_DIR loading when primary option isn't available 2024-10-21 00:35:56 -07:00
Nick Sweeting
6e7071bd19 add new binproviders and binaries args to install and version, bump pydantic-pkgr version 2024-10-11 00:45:59 -07:00
Nick Sweeting
cf1ea8f80f improve config loading of TMP_DIR, LIB_DIR, move to separate files 2024-10-07 23:45:11 -07:00
Nick Sweeting
5323953f94 handle Ctrl+C more gracefully 2024-10-04 21:33:46 -07:00