- Create Persona class in personas/models.py for managing browser
profiles/identities used for archiving sessions
- Each Persona has:
  - chrome_user_data_dir: Chrome profile directory
  - chrome_extensions_dir: Installed extensions
  - cookies_file: Cookies for wget/curl
  - config_file: Persona-specific config overrides
- Add Persona methods:
  - cleanup_chrome(): Remove stale SingletonLock/SingletonSocket files
  - get_config(): Load persona config from config.json
  - save_config(): Save persona config to config.json
  - ensure_dirs(): Create persona directory structure
  - all(): Iterator over all personas
  - get_active(): Get persona based on ACTIVE_PERSONA config
  - cleanup_chrome_all(): Clean up all personas
- Update chrome_cleanup() in misc/util.py to use Persona.cleanup_chrome_all()
instead of manual directory iteration
- Add convenience functions:
  - cleanup_chrome_for_persona(name)
  - cleanup_chrome_all_personas()
- Add _derive_persona_paths() in configset.py to automatically derive
CHROME_USER_DATA_DIR and CHROME_EXTENSIONS_DIR from ACTIVE_PERSONA
when not explicitly set. This allows plugins to use these paths
without knowing about the persona system.
- Update chrome_utils.js launchChromium() to accept a userDataDir option
and pass --user-data-dir to Chrome. Also clean up any stale SingletonLock
before launch.
- Update killZombieChrome() to clean up SingletonLock files from all
persona chrome_user_data directories after killing zombies.
- Update chrome_cleanup() in misc/util.py to handle persona-based
user data directories when cleaning up stale Chrome state.
- Simplify on_Crawl__20_chrome_launch.bg.js to use CHROME_USER_DATA_DIR
and CHROME_EXTENSIONS_DIR from env (derived by get_config()).
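
A minimal sketch of the Persona layout and cleanup described above, not the code in personas/models.py. It assumes a plain dataclass; the `personas_dir` field and the `path` property are hypothetical names introduced for illustration. The directory names chrome_user_data and chrome_extensions and the file name config.json come from this message.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from pathlib import Path

# Lock files named in this commit message.
STALE_LOCK_NAMES = ("SingletonLock", "SingletonSocket")


@dataclass
class Persona:
    name: str
    personas_dir: Path  # hypothetical: root directory that holds all personas

    @property
    def path(self) -> Path:
        return self.personas_dir / self.name

    @property
    def chrome_user_data_dir(self) -> Path:
        return self.path / "chrome_user_data"

    @property
    def chrome_extensions_dir(self) -> Path:
        return self.path / "chrome_extensions"

    @property
    def config_file(self) -> Path:
        return self.path / "config.json"

    def ensure_dirs(self) -> None:
        """Create the persona directory structure if it is missing."""
        self.chrome_user_data_dir.mkdir(parents=True, exist_ok=True)
        self.chrome_extensions_dir.mkdir(parents=True, exist_ok=True)

    def cleanup_chrome(self) -> list[Path]:
        """Remove stale lock files and return the paths that were deleted."""
        removed = []
        for lock_name in STALE_LOCK_NAMES:
            lock = self.chrome_user_data_dir / lock_name
            # The lock may be a dangling symlink, which exists() alone
            # reports as missing, so check is_symlink() as well.
            if lock.is_symlink() or lock.exists():
                lock.unlink()
                removed.append(lock)
        return removed

    def get_config(self) -> dict:
        """Load persona config overrides, or {} when none are saved."""
        if not self.config_file.exists():
            return {}
        return json.loads(self.config_file.read_text())

    def save_config(self, config: dict) -> None:
        """Write persona config overrides to config.json."""
        self.ensure_dirs()
        self.config_file.write_text(json.dumps(config, indent=2))
```

Returning the removed paths from cleanup_chrome() is a choice made for this sketch so callers can log what was cleaned; the real method may return nothing.
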
Config priority flow:
  ACTIVE_PERSONA=WorkAccount (set on crawl/snapshot)
    -> get_config() derives:
         CHROME_USER_DATA_DIR = PERSONAS_DIR/WorkAccount/chrome_user_data
         CHROME_EXTENSIONS_DIR = PERSONAS_DIR/WorkAccount/chrome_extensions
    -> hooks receive these as env vars without needing persona logic
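
The derivation in the flow above can be sketched roughly as follows. This illustrates the rule only and is not the _derive_persona_paths() in configset.py. The function name, the dict-based config, and the /data/personas location are assumptions for the example.

```python
from pathlib import Path


def derive_persona_paths(config: dict) -> dict:
    """Return a copy of config with persona-derived Chrome paths filled in."""
    derived = dict(config)
    persona = derived.get("ACTIVE_PERSONA")
    personas_dir = derived.get("PERSONAS_DIR")
    if not persona or not personas_dir:
        # Nothing to derive from; leave the config untouched.
        return derived

    base = Path(personas_dir) / persona
    # Explicitly set values win; only missing or empty keys are derived.
    if not derived.get("CHROME_USER_DATA_DIR"):
        derived["CHROME_USER_DATA_DIR"] = str(base / "chrome_user_data")
    if not derived.get("CHROME_EXTENSIONS_DIR"):
        derived["CHROME_EXTENSIONS_DIR"] = str(base / "chrome_extensions")
    return derived


config = derive_persona_paths({
    "PERSONAS_DIR": "/data/personas",  # hypothetical location
    "ACTIVE_PERSONA": "WorkAccount",
})
print(config["CHROME_USER_DATA_DIR"])
print(config["CHROME_EXTENSIONS_DIR"])
```

Because the derived values are ordinary config keys, a hook only has to read CHROME_USER_DATA_DIR and CHROME_EXTENSIONS_DIR from its environment, and an explicit setting of either key still overrides the persona default.
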