Fixes#1445
This PR resolves the issue where SingleFile was not respecting Chrome
user data directory and other Chrome launch options that work for other
Chrome-based extractors (PDF, Screenshot, etc.).
## Changes
- Added `SINGLEFILE_CHROME_ARGS` config option with fallback to
`CHROME_ARGS`
- Updated SingleFile extractor to pass Chrome arguments via
`--browser-args`
- Updated documentation
This ensures SingleFile respects the same Chrome configuration as other
Chrome-based extractors.
Generated with [Claude Code](https://claude.ai/code)
Resolved conflicts by keeping Process model changes and accepting dev changes for unrelated files. Ensured pid_utils.py remains deleted as intended by this PR.
Co-authored-by: Nick Sweeting <pirate@users.noreply.github.com>
- Add getMachineType, getLibDir, getNodeModulesDir, getTestEnv CLI commands to chrome_utils.js
These are now the single source of truth for path calculations
- Update chrome_test_helpers.py with call_chrome_utils() dispatcher
- Add get_test_env_from_js(), get_machine_type_from_js(), kill_chrome_via_js() helpers
- Update cleanup_chrome and kill_chromium_session to use JS killChrome
- Remove unused Chrome binary search lists from singlefile hook (~25 lines)
- Update readability, mercury, favicon, title tests to use shared helpers
- Import shared Chrome test helpers
- Add test_singlefile_with_chrome_session() to verify CDP connection
- Add test_singlefile_disabled_skips() for config testing
- Update existing test to use get_test_env()
This fix resolves issue #1445 where SingleFile was not respecting Chrome
user data directory and other Chrome launch options that work for other
Chrome-based extractors (PDF, Screenshot, etc.).
Changes:
- Added SINGLEFILE_CHROME_ARGS config option in config.json with x-fallback
to CHROME_ARGS
- Updated SingleFile extractor to read and pass Chrome arguments via
--browser-args parameter
- Updated docstring to document the new environment variable
This ensures SingleFile respects the same Chrome configuration (user data
directory, cookies, etc.) as other Chrome-based extractors.
Fixes#1445
Co-authored-by: Nick Sweeting <pirate@users.noreply.github.com>
- All install hooks now respect their respective XYZ_BINARY env vars
(e.g., WGET_BINARY, CHROME_BINARY, YTDLP_BINARY, etc.)
- Support both absolute paths (/usr/bin/wget2) and binary names (wget2)
- Dynamic bin_name used in Dependency JSONL output
- Updated 11 install hooks to follow the new pattern
- Mark checklist items as complete in TODO_hook_architecture.md
- Rename 13 on_Crawl__00_validate_* hooks to on_Crawl__00_install_*
- This better reflects what these hooks actually do (check/install binaries)
- Update TODO_hook_architecture.md to reflect renamed hooks
- Remove mock-based tests from plugin tests (headers, singlefile, ublock, captcha2)
- Replace fake cache tests with real double-install tests that verify cache behavior
- Add SCHEMA_0_8 and seed_0_8_data() for testing 0.8.x data directory migrations
- Add TestMigrationFrom08x class with comprehensive migration tests:
- Snapshot count preservation
- Crawl record preservation
- Snapshot-to-crawl relationship preservation
- Tag preservation
- ArchiveResult status preservation
- CLI command verification after migration
- Add more CLI tests for add command (tags, multiple URLs, file input)
- All tests now use real functionality without mocking