Nick Sweeting
f0aa19fa7d
wip
2025-12-28 17:51:54 -08:00
Nick Sweeting
4ccb0863bb
continue renaming extractor to plugin, add plan for hook concurrency, add chrome kill helper script
2025-12-28 05:29:24 -08:00
Nick Sweeting
bd265c0083
rename extractor to plugin everywhere
2025-12-28 04:43:15 -08:00
Nick Sweeting
50e527ec65
way better plugin hooks system wip
2025-12-28 03:39:59 -08:00
Claude
b632894bc9
Update views, API, and exports for new ArchiveResult output fields
Replace old `output` field with new fields across the codebase:
- output_str: Human-readable output summary
- output_json: Structured metadata (optional)
- output_files: Dict of output files with metadata
- output_size: Total size in bytes
- output_mimetypes: CSV of file mimetypes
Files updated:
- api/v1_core.py: Update MinimalArchiveResultSchema to expose new fields
- api/v1_core.py: Update ArchiveResultFilterSchema to search output_str
- cli/archivebox_extract.py: Use output_str in CLI output
- core/admin_archiveresults.py: Update admin fields, search, and fieldsets
- core/admin_archiveresults.py: Fix output_html variable name bug in output_summary
- misc/jsonl.py: Update archiveresult_to_jsonl() to include new fields
- plugins/extractor_utils.py: Update ExtractorResult helper class
The embed_path() method already uses output_files and output_str,
so snapshot detail page and template tags work correctly.
2025-12-27 20:28:22 +00:00
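As a rough illustration of the five output fields listed in the commit above, here is a minimal Python sketch. The class name `OutputSummary` and the per-file metadata keys `size` and `mimetype` are assumptions made for this example; the actual ArchiveResult model may store and compute these values differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class OutputSummary:
    """Hypothetical container mirroring the fields named in the commit message."""

    # Human-readable output summary
    output_str: str = ""
    # Structured metadata (optional)
    output_json: Optional[Dict[str, Any]] = None
    # Assumed shape: relative path -> {"size": int, "mimetype": str}
    output_files: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    @property
    def output_size(self) -> int:
        """Total size in bytes, summed over all listed files."""
        return sum(int(meta.get("size", 0)) for meta in self.output_files.values())

    @property
    def output_mimetypes(self) -> str:
        """Comma-separated mimetypes, de-duplicated in first-seen order."""
        seen = []
        for meta in self.output_files.values():
            mimetype = meta.get("mimetype")
            if mimetype and mimetype not in seen:
                seen.append(mimetype)
        return ",".join(seen)
```

Deriving the size and mimetype summary from `output_files` keeps the three values consistent in this sketch; a real model could just as well store them as separate columns.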
Claude
c3acadd528
Remove extractor field from Crawl model and fix tests
- Remove extractor field from Crawl model (moved to config dict)
- Update migration 0002_drop_seed_model to not add extractor
- Update archivebox_add.py to use config['PARSER'] instead
- Update admin.py recrawl to not pass extractor
- Update jsonl.py serialization to not include extractor
- Update test schema SCHEMA_0_8 to not include extractor
- Set default timeout to 60s for test commands
2025-12-27 01:49:09 +00:00
Nick Sweeting
4fd7fcdbcf
new gallerydl plugin and more
2025-12-26 11:55:03 -08:00
Nick Sweeting
9838d7ba02
tons of ui fixes and plugin fixes
2025-12-25 03:59:51 -08:00
Nick Sweeting
bb53228ebf
remove Seed model in favor of Crawl as template
2025-12-25 01:52:41 -08:00
Nick Sweeting
866f993f26
logging and admin ui improvements
2025-12-25 01:10:41 -08:00
Nick Sweeting
d95f0dc186
remove huey
2025-12-24 23:40:18 -08:00
Nick Sweeting
6c769d831c
wip 2
2025-12-24 21:46:14 -08:00
Nick Sweeting
1915333b81
wip major changes
2025-12-24 20:10:38 -08:00
Ben Muthalaly
71c02ca4eb
Update archivebox/misc/logging_util.py
Co-authored-by: Nick Sweeting <git@sweeting.me>
2025-02-05 17:55:45 -06:00
Ben Muthalaly
9f4cf0a8e1
Kill the timer process if it doesn't properly terminate.
2025-02-03 02:47:33 -06:00
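The terminate-then-kill pattern that the commit above describes can be sketched roughly as follows. This version uses `subprocess.Popen` for simplicity; the log does not say how the timer process is actually spawned, and the function name `stop_process` and the `grace_seconds` parameter are hypothetical.

```python
import subprocess


def stop_process(proc: subprocess.Popen, grace_seconds: float = 1.0) -> int:
    """Ask a child process to exit, then force-kill it if it does not comply."""
    # Polite request first (SIGTERM on POSIX).
    proc.terminate()
    try:
        # Give the process a short window to shut down on its own.
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        # It ignored the request, so kill it outright (SIGKILL on POSIX).
        proc.kill()
        # Reap the process so it does not linger as a zombie.
        return proc.wait()
```

Waiting again after `kill()` matters: without it the child can remain in the process table until the parent exits.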
Nick Sweeting
c5fc4068f4
fix unneeded import
2024-12-18 18:09:21 -08:00
Nick Sweeting
7975b47c85
remove dependencies on unneeded libraries
2024-12-18 18:07:35 -08:00
Nick Sweeting
d192eb5c48
add filestore content addressable store draft
2024-12-04 02:15:04 -08:00
Nick Sweeting
a3fe78afaa
add basename to hashing get_dir_info
2024-12-04 02:15:04 -08:00
Nick Sweeting
eae7ed8447
add hashing misc library for merkle tree generation
2024-12-03 02:12:20 -08:00
Nick Sweeting
2595139180
improve statemachine logging and archivebox update CLI cmd
2024-11-19 03:31:05 -08:00
Nick Sweeting
c9a05c9d94
working archivebox update CLI cmd
2024-11-19 02:32:05 -08:00
Nick Sweeting
328eb98a38
move main funcs into cli files and switch to using click for CLI
2024-11-19 00:18:51 -08:00
Nick Sweeting
4c25e90378
move monkey_patches.py into archivebox.misc subfolder
2024-11-18 19:10:42 -08:00
Nick Sweeting
4a5d607296
move logging_util into archivebox.misc subfolder
2024-11-18 19:08:49 -08:00
Nick Sweeting
b3c1cb716e
move abx plugins inside vendor dir
2024-10-28 04:07:35 -07:00
Nick Sweeting
4b6f08b0fe
swap more direct settings.CONFIG access to abx getters
2024-10-24 15:42:19 -07:00
Nick Sweeting
60f0458c77
rename configfile to collection
2024-10-24 15:40:24 -07:00
Nick Sweeting
657eec479b
fix CONSTANTS.LIB_DIR old style access
2024-10-21 03:20:20 -07:00
Nick Sweeting
b3107ab830
move final legacy config to plugins and fix archivebox config cmd and add search opt
2024-10-21 02:56:00 -07:00
Nick Sweeting
a211461ffc
fix LIB_DIR and TMP_DIR loading when primary option isn't available
2024-10-21 00:35:56 -07:00
Nick Sweeting
bb9c3fda14
fix makemigrations being blocked by check_migrations func
2024-10-14 17:40:06 -07:00
Nick Sweeting
9a04ed7c76
move serve_static and shell_welcome_message into misc
2024-10-14 17:35:28 -07:00
Nick Sweeting
f75ae805f8
comment out Crawl api methods temporarily
2024-10-14 15:41:58 -07:00
Nick Sweeting
2f68a1d476
fix ldap lib loading after apt install
2024-10-09 04:03:02 -07:00
Nick Sweeting
afc24e802a
tweak version log output
2024-10-09 03:18:22 -07:00
Nick Sweeting
613caec8eb
improve install flow with sudo, check package managers, and fix docker build
2024-10-09 00:41:16 -07:00
Nick Sweeting
9f274cf9f4
remove platformdirs dependency
2024-10-08 19:17:18 -07:00
Nick Sweeting
3e4a846488
fix more installer bugs
2024-10-08 18:06:57 -07:00
Nick Sweeting
4b34b729ab
fuck it go back to nested lib and tmp dirs with supervisord sock workaround
2024-10-08 17:48:59 -07:00
Nick Sweeting
35c7019772
handle failure on tmp_dir and lib_dir detection better
2024-10-08 16:56:25 -07:00
Nick Sweeting
de2ab43f7f
switch .is_dir and .exists for os.access to avoid PermissionError on startup
2024-10-08 03:02:34 -07:00
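A minimal sketch of the kind of check the commit above points at, assuming the goal is to test a directory without raising. `Path.exists()` and `Path.is_dir()` can raise `PermissionError` on some Python versions when a parent directory is not searchable, whereas `os.access()` and `os.path.isdir()` return `False` in that case. The helper name `dir_is_usable` is hypothetical and not taken from the codebase.

```python
import os


def dir_is_usable(path: str) -> bool:
    """Return True if path is a directory we can read, write, and enter.

    Both calls report failure as False rather than raising PermissionError,
    so this is safe to run during startup checks.
    """
    return os.path.isdir(path) and os.access(path, os.R_OK | os.W_OK | os.X_OK)
```

Note that `os.access()` checks permissions for the real user ID and is only a point-in-time answer, so code that goes on to use the directory should still handle `OSError`.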
Nick Sweeting
611a2b7c1b
fix a few small nits
2024-10-08 02:10:08 -07:00
Nick Sweeting
46c0463539
safer import handling
2024-10-08 00:51:58 -07:00
Nick Sweeting
cf1ea8f80f
improve config loading of TMP_DIR, LIB_DIR, move to separate files
2024-10-07 23:45:11 -07:00
Nick Sweeting
0c7d7a2225
fix archivebox init colors and dir status checking
2024-10-04 21:34:19 -07:00
Nick Sweeting
da274fd8e8
remove dead code
2024-10-04 14:48:20 -07:00
Nick Sweeting
12f32c4690
fix tmp data dir resolution when running help or version outside data dir
2024-10-04 01:40:41 -07:00
Nick Sweeting
035a14b6ea
better help text output
2024-10-02 19:46:31 -07:00
Nick Sweeting
18474f452b
move config out of legacy files and improve version output
2024-09-30 23:52:00 -07:00