Claude
4c77949197
Clean up on_Crawl hooks: remove duplicates and standardize naming
...
Deleted dead/duplicate hooks:
- wget/on_Crawl__10_install_wget.py (duplicate of __10_wget_validate_config.py)
- chrome/on_Crawl__00_chrome_install.py (simpler version, kept full one)
- chrome/on_Crawl__20_chrome_launch.bg.js (legacy, kept __30 version)
- singlefile/on_Crawl__20_install_singlefile_extension.js (disabled/dead)
- istilldontcareaboutcookies/on_Crawl__20_install_*.js (legacy)
- ublock/on_Crawl__03_ublock.js (legacy, kept __20 version)
- Entire captcha2/ plugin (legacy version of twocaptcha/)
Renamed hooks to follow consistent pattern: on_Crawl__XX_<plugin>_<action>.<ext>
Priority bands:
00-09: Binary/extension installation
10-19: Config validation
20-29: Browser launch and post-launch config
Final hooks:
00 ripgrep_install.py, 01 chrome_install.py
02 istilldontcareaboutcookies_install.js
03 ublock_install.js, 04 singlefile_install.js
05 twocaptcha_install.js
10 chrome_validate.py, 11 wget_validate.py
20 chrome_launch.bg.js, 25 twocaptcha_config.js
2025-12-31 22:47:36 +00:00
Claude
0cb5f0712d
Add comprehensive tests for machine/process models, orchestrator, and search backends
...
This adds new test coverage for previously untested areas:
Machine module (archivebox/machine/tests/):
- Machine, NetworkInterface, Binary, Process model tests
- BinaryMachine and ProcessMachine state machine tests
- JSONL serialization/deserialization tests
- Manager method tests
Workers module (archivebox/workers/tests/):
- PID file utility tests (write, read, cleanup)
- Orchestrator lifecycle and queue management tests
- Worker spawning logic tests
- Idle detection and exit condition tests
Search backends:
- SQLite FTS5 search tests with real indexed content
- Phrase search, stemming, and unicode support
- Ripgrep search tests with archive directory structure
- Environment variable configuration tests
Binary provider plugins:
- pip provider hook tests
- npm provider hook tests with PATH updates
- apt provider hook tests
2025-12-31 11:33:27 +00:00
Nick Sweeting
967c5d53e0
make plugin config more consistent
2025-12-29 13:21:46 -08:00
Nick Sweeting
30c60eef76
much better tests and add page ui
2025-12-29 04:02:11 -08:00
Nick Sweeting
f4e7820533
use full dotted paths for all archivebox imports, add migrations and more fixes
2025-12-29 00:47:08 -08:00
Nick Sweeting
1e4d3ffd11
improve plugin tests and config
2025-12-29 00:45:23 -08:00
Nick Sweeting
f0aa19fa7d
wip
2025-12-28 17:51:54 -08:00
Nick Sweeting
50e527ec65
way better plugin hooks system wip
2025-12-28 03:39:59 -08:00
Claude
e3ba599812
Update install hooks to respect XYZ_BINARY env vars
...
- All install hooks now respect their respective XYZ_BINARY env vars
(e.g., WGET_BINARY, CHROME_BINARY, YTDLP_BINARY, etc.)
- Support both absolute paths (/usr/bin/wget2) and binary names (wget2)
- Dynamic bin_name used in Dependency JSONL output
- Updated 11 install hooks to follow the new pattern
- Mark checklist items as complete in TODO_hook_architecture.md
2025-12-27 10:12:45 +00:00
Claude
8c846b7d1c
Rename validate hooks to install hooks
...
- Rename 13 on_Crawl__00_validate_* hooks to on_Crawl__00_install_*
- This better reflects what these hooks actually do (check/install binaries)
- Update TODO_hook_architecture.md to reflect renamed hooks
2025-12-27 10:06:34 +00:00
Nick Sweeting
2f81c0cc76
add overrides options to binproviders
2025-12-26 20:39:56 -08:00
Nick Sweeting
866f993f26
logging and admin ui improvements
2025-12-25 01:10:41 -08:00
Nick Sweeting
1915333b81
wip major changes
2025-12-24 20:10:38 -08:00