diff --git a/TODO_hook_concurrency.md b/TODO_hook_concurrency.md new file mode 100644 index 00000000..82190e7f --- /dev/null +++ b/TODO_hook_concurrency.md @@ -0,0 +1,486 @@ +# ArchiveBox Hook Script Concurrency & Execution Plan + +## Overview + +Snapshot.run() should enforce that snapshot hooks are run in **10 discrete, sequential "steps"**: `0*`, `1*`, `2*`, `3*`, `4*`, `5*`, `6*`, `7*`, `8*`, `9*`. + +For every discovered hook script, ArchiveBox should create an ArchiveResult in `queued` state, then manage running them using `retry_at` and inline logic to enforce this ordering. + +## Hook Numbering Convention + +Hooks scripts are numbered `00` to `99` to control: +- **First digit (0-9)**: Which step they are part of +- **Second digit (0-9)**: Order within that step + +Hook scripts are launched **strictly sequentially** based on their filename alphabetical order, and run in sets of several per step before moving on to the next step. + +**Naming Format:** +``` +on_{ModelName}__{run_order}_{human_readable_description}[.bg].{ext} +``` + +**Examples:** +``` +on_Snapshot__00_this_would_run_first.sh +on_Snapshot__05_start_ytdlp_download.bg.sh +on_Snapshot__10_chrome_tab_opened.js +on_Snapshot__50_screenshot.js +on_Snapshot__53_media.bg.py +``` + +## Background (.bg) vs Foreground Scripts + +### Foreground Scripts (no .bg suffix) +- Run sequentially within their step +- Block step progression until they exit +- Should exit naturally when work is complete +- Get killed with SIGTERM if they exceed their `PLUGINNAME_TIMEOUT` + +### Background Scripts (.bg suffix) +- Spawned and allowed to continue running +- Do NOT block step progression +- Run until **their own `PLUGINNAME_TIMEOUT` is reached** (not until step 99) +- Get polite SIGTERM when timeout expires, then SIGKILL 60s later if not exited +- Must implement their own concurrency control using filesystem (semaphore files, locks, etc.) +- Should exit naturally when work is complete (best case) + +**Important:** If a .bg script starts at step 05 with `MEDIA_TIMEOUT=3600s`, it gets the full 3600s regardless of when step 99 completes. It runs on its own timeline. + +## Execution Step Guidelines + +These are **naming conventions and guidelines**, not enforced checkpoints. They provide semantic organization for plugin ordering: + +### Step 0: Pre-Setup +``` +00-09: Initial setup, validation, feature detection +``` + +### Step 1: Chrome Launch & Tab Creation +``` +10-19: Browser/tab lifecycle setup +- Chrome browser launch +- Tab creation and CDP connection +``` + +### Step 2: Navigation & Settlement +``` +20-29: Page loading and settling +- Navigate to URL +- Wait for page load +- Initial response capture (responses, ssl, consolelog as .bg listeners) +``` + +### Step 3: Page Adjustment +``` +30-39: DOM manipulation before archiving +- Hide popups/banners +- Solve captchas +- Expand comments/details sections +- Inject custom CSS/JS +- Accessibility modifications +``` + +### Step 4: Ready for Archiving +``` +40-49: Final pre-archiving checks +- Verify page is fully adjusted +- Wait for any pending modifications +``` + +### Step 5: DOM Extraction (Sequential, Non-BG) +``` +50-59: Extractors that need exclusive DOM access +- singlefile (MUST NOT be .bg) +- screenshot (MUST NOT be .bg) +- pdf (MUST NOT be .bg) +- dom (MUST NOT be .bg) +- title +- headers +- readability +- mercury + +These MUST run sequentially as they temporarily modify the DOM +during extraction, then revert it. Running in parallel would corrupt results. 
+``` + +### Step 6: Post-DOM Extraction +``` +60-69: Extractors that don't need DOM or run on downloaded files +- wget +- git +- media (.bg - can run for hours) +- gallerydl (.bg) +- forumdl (.bg) +- papersdl (.bg) +``` + +### Step 7: Chrome Cleanup +``` +70-79: Browser/tab teardown +- Close tabs +- Cleanup Chrome resources +``` + +### Step 8: Post-Processing +``` +80-89: Reprocess outputs from earlier extractors +- OCR of images +- Audio/video transcription +- URL parsing from downloaded content (rss, html, json, txt, csv, md) +- LLM analysis/summarization of outputs +``` + +### Step 9: Indexing & Finalization +``` +90-99: Save to indexes and finalize +- Index text content to Sonic/SQLite FTS +- Create symlinks +- Generate merkle trees +- Final status updates +``` + +## Hook Script Interface + +### Input: CLI Arguments (NOT stdin) +Hooks receive configuration as CLI flags (CSV or JSON-encoded): + +```bash +--url="https://example.com" +--snapshot-id="1234-5678-uuid" +--config='{"some_key": "some_value"}' +--plugins=git,media,favicon,title +--timeout=50 +--enable-something +``` + +### Input: Environment Variables +All configuration comes from env vars, defined in `plugin_dir/config.json` JSONSchema: + +```bash +WGET_BINARY=/usr/bin/wget +WGET_TIMEOUT=60 +WGET_USER_AGENT="Mozilla/5.0..." +WGET_EXTRA_ARGS="--no-check-certificate" +SAVE_WGET=True +``` + +**Required:** Every plugin must support `PLUGINNAME_TIMEOUT` for self-termination. + +### Output: Filesystem (CWD) +Hooks read/write files to: +- `$CWD`: Their own output subdirectory (e.g., `archive/snapshots/{id}/wget/`) +- `$CWD/..`: Parent directory (to read outputs from other hooks) + +This allows hooks to: +- Access files created by other hooks +- Keep their outputs separate by default +- Use semaphore files for coordination (if needed) + +### Output: JSONL to stdout +Hooks emit one JSONL line per database record they want to create or update: + +```jsonl +{"type": "Tag", "name": "sci-fi"} +{"type": "ArchiveResult", "id": "1234-uuid", "status": "succeeded", "output_str": "wget/index.html"} +{"type": "Snapshot", "id": "5678-uuid", "title": "Example Page"} +``` + +See `archivebox/misc/jsonl.py` and model `from_json()` / `from_jsonl()` methods for full list of supported types and fields. + +### Output: stderr for Human Logs +Hooks should emit human-readable output or debug info to **stderr**. There are no guarantees this will be persisted long-term. Use stdout JSONL or filesystem for outputs that matter. + +### Cleanup: Delete Cruft +If hooks emit no meaningful long-term outputs, they should delete any temporary files themselves to avoid wasting space. However, the ArchiveResult DB row should be kept so we know: +- It doesn't need to be retried +- It isn't missing +- What happened (status, error message) + +### Signal Handling: SIGINT/SIGTERM +Hooks are expected to listen for polite `SIGINT`/`SIGTERM` and finish hastily, then exit cleanly. Beyond that, they may be `SIGKILL'd` at ArchiveBox's discretion. + +**If hooks double-fork or spawn long-running processes:** They must output a `.pid` file in their directory so zombies can be swept safely. + +## Hook Failure Modes & Retry Logic + +Hooks can fail in several ways. ArchiveBox handles each differently: + +### 1. Soft Failure (Record & Don't Retry) +**Exit:** `0` (success) +**JSONL:** `{"type": "ArchiveResult", "status": "failed", "output_str": "404 Not Found"}` + +This means: "I ran successfully, but the resource wasn't available." Don't retry this. 
+ +**Use cases:** +- 404 errors +- Content not available +- Feature not applicable to this URL + +### 2. Hard Failure / Temporary Error (Retry Later) +**Exit:** Non-zero (1, 2, etc.) +**JSONL:** None (or incomplete) + +This means: "Something went wrong, I couldn't complete." Treat this ArchiveResult as "missing" and set `retry_at` for later. + +**Use cases:** +- 500 server errors +- Network timeouts +- Binary not found / crashed +- Transient errors + +**Behavior:** +- ArchiveBox sets `retry_at` on the ArchiveResult +- Hook will be retried during next `archivebox update` + +### 3. Partial Success (Update & Continue) +**Exit:** Non-zero +**JSONL:** Partial records emitted before crash + +**Behavior:** +- Update ArchiveResult with whatever was emitted +- Mark remaining work as "missing" with `retry_at` + +### 4. Success (Record & Continue) +**Exit:** `0` +**JSONL:** `{"type": "ArchiveResult", "status": "succeeded", "output_str": "output/file.html"}` + +This is the happy path. + +### Error Handling Rules + +- **DO NOT skip hooks** based on failures +- **Continue to next hook** regardless of foreground or background failures +- **Update ArchiveResults** with whatever information is available +- **Set retry_at** for "missing" or temporarily-failed hooks +- **Let background scripts continue** even if foreground scripts fail + +## File Structure + +``` +archivebox/plugins/{plugin_name}/ +├── config.json # JSONSchema: env var config options +├── binaries.jsonl # Runtime dependencies: apt|brew|pip|npm|env +├── on_Snapshot__XX_name.py # Hook script (foreground) +├── on_Snapshot__XX_name.bg.py # Hook script (background) +└── tests/ + └── test_name.py +``` + +## Implementation Checklist + +### Phase 1: Renumber Existing Hooks ✅ +- [ ] Renumber DOM extractors to 50-59 range +- [ ] Ensure pdf/screenshot are NOT .bg (need sequential access) +- [ ] Ensure media (ytdlp) IS .bg (can run for hours) +- [ ] Add step comments to each plugin for clarity + +### Phase 2: Timeout Consistency ✅ +- [x] All plugins support `PLUGINNAME_TIMEOUT` env var +- [x] All plugins fall back to generic `TIMEOUT` env var +- [x] Background scripts handle SIGTERM gracefully (or exit naturally) + +### Phase 3: Refactor Snapshot.run() +- [ ] Parse hook filenames to extract step number (first digit) +- [ ] Group hooks by step (0-9) +- [ ] Run each step sequentially +- [ ] Within each step: + - [ ] Launch foreground hooks sequentially + - [ ] Launch .bg hooks and track PIDs + - [ ] Wait for foreground hooks to complete before next step +- [ ] Track .bg script timeouts independently +- [ ] Send SIGTERM to .bg scripts when their timeout expires +- [ ] Send SIGKILL 60s after SIGTERM if not exited + +### Phase 4: ArchiveResult Management +- [ ] Create one ArchiveResult per hook (not per plugin) +- [ ] Set initial state to `queued` +- [ ] Update state based on JSONL output and exit code +- [ ] Set `retry_at` for hooks that exit non-zero with no JSONL +- [ ] Don't retry hooks that emit `{"status": "failed"}` + +### Phase 5: JSONL Streaming +- [ ] Parse stdout JSONL line-by-line during hook execution +- [ ] Create/update DB rows as JSONL is emitted (streaming mode) +- [ ] Handle partial JSONL on hook crash + +### Phase 6: Zombie Process Management +- [ ] Read `.pid` files from hook output directories +- [ ] Sweep zombies on cleanup +- [ ] Handle double-forked processes correctly + +## Migration Path + +### Backward Compatibility +During migration, support both old and new numbering: +1. Run hooks numbered 00-99 in step order +2. 
Run unnumbered hooks last (step 9) for compatibility +3. Log warnings for unnumbered hooks +4. Eventually require all hooks to be numbered + +### Renumbering Map + +**Current → New:** +``` +git/on_Snapshot__12_git.py → git/on_Snapshot__62_git.py +media/on_Snapshot__51_media.py → media/on_Snapshot__63_media.bg.py +gallerydl/on_Snapshot__52_gallerydl.py → gallerydl/on_Snapshot__64_gallerydl.bg.py +forumdl/on_Snapshot__53_forumdl.py → forumdl/on_Snapshot__65_forumdl.bg.py +papersdl/on_Snapshot__54_papersdl.py → papersdl/on_Snapshot__66_papersdl.bg.py + +readability/on_Snapshot__52_readability.py → readability/on_Snapshot__55_readability.py +mercury/on_Snapshot__53_mercury.py → mercury/on_Snapshot__56_mercury.py + +singlefile/on_Snapshot__37_singlefile.py → singlefile/on_Snapshot__50_singlefile.py +screenshot/on_Snapshot__34_screenshot.js → screenshot/on_Snapshot__51_screenshot.js +pdf/on_Snapshot__35_pdf.js → pdf/on_Snapshot__52_pdf.js +dom/on_Snapshot__36_dom.js → dom/on_Snapshot__53_dom.js +title/on_Snapshot__32_title.js → title/on_Snapshot__54_title.js +headers/on_Snapshot__33_headers.js → headers/on_Snapshot__55_headers.js + +wget/on_Snapshot__50_wget.py → wget/on_Snapshot__61_wget.py +``` + +## Testing Strategy + +### Unit Tests +- Test hook ordering (00-99) +- Test step grouping (first digit) +- Test .bg vs foreground execution +- Test timeout enforcement +- Test JSONL parsing +- Test failure modes & retry_at logic + +### Integration Tests +- Test full Snapshot.run() with mixed hooks +- Test .bg scripts running beyond step 99 +- Test zombie process cleanup +- Test graceful SIGTERM handling +- Test concurrent .bg script coordination + +### Performance Tests +- Measure overhead of per-hook ArchiveResults +- Test with 50+ concurrent .bg scripts +- Test filesystem contention with many hooks + +## Open Questions + +### Q: Should we provide semaphore utilities? +**A:** No. Keep plugins decoupled. Let them use simple filesystem coordination if needed. + +### Q: What happens if ArchiveResult table gets huge? +**A:** We can delete old successful ArchiveResults periodically, or archive them to cold storage. The important data is in the filesystem outputs. + +### Q: Should naturally-exiting .bg scripts still be .bg? +**A:** Yes. The .bg suffix means "don't block step progression," not "run until step 99." Natural exit is the best case. 
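+
+### Q: What does the step-ordering loop in Snapshot.run() roughly look like?
+**A:** A minimal sketch of the Phase 3 grouping/ordering logic is shown below. It is illustrative only: the helper names, the bare `subprocess.Popen` call, and the omission of per-hook timeouts, env/CLI args, and ArchiveResult bookkeeping are simplifications, not the actual `Snapshot.run()` implementation.
+
+```python
+import re
+import subprocess
+from pathlib import Path
+
+HOOK_RE = re.compile(r'^on_Snapshot__(\d)(\d)_.+?(\.bg)?\.(sh|py|js)$')
+
+def group_hooks_by_step(plugin_dirs):
+    """Group discovered hook scripts into steps 0-9 using the first digit of their run_order."""
+    steps = {step: [] for step in range(10)}
+    for plugin_dir in plugin_dirs:
+        for hook in sorted(plugin_dir.glob('on_Snapshot__*')):
+            if HOOK_RE.match(hook.name):
+                steps[int(HOOK_RE.match(hook.name).group(1))].append(hook)
+    return steps
+
+def run_steps(steps, cwd, env):
+    """Run steps 0-9 sequentially: block on foreground hooks, let .bg hooks keep running."""
+    background = []
+    for step in range(10):
+        for hook in sorted(steps[step], key=lambda p: p.name):
+            proc = subprocess.Popen([str(hook)], cwd=cwd, env=env)
+            if '.bg.' in hook.name:
+                background.append(proc)   # tracked; SIGTERM'd later when its own timeout expires
+            else:
+                proc.wait()               # foreground: blocks step progression until exit
+    return background                     # .bg processes may outlive step 9 until PLUGINNAME_TIMEOUT
+```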
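+
+### Q: How do exit codes and JSONL combine into the ArchiveResult status?
+**A:** Roughly per the Hook Failure Modes section above. The sketch below is a simplification; the `'queued'` retry status and the 30-minute delay are placeholders, not real ArchiveBox settings:
+
+```python
+from datetime import datetime, timedelta, timezone
+
+RETRY_DELAY = timedelta(minutes=30)  # placeholder interval, not an actual config option
+
+def resolve_hook_result(exit_code, jsonl_status):
+    """Map a hook's exit code + last emitted ArchiveResult JSONL status to DB fields."""
+    if jsonl_status in ('succeeded', 'failed'):
+        # Hook reported its own outcome (including exit-0 soft failures): record it, never retry
+        return {'status': jsonl_status, 'retry_at': None}
+    if exit_code == 0:
+        return {'status': 'succeeded', 'retry_at': None}
+    # Non-zero exit with no (or partial) JSONL: treat as "missing" and schedule a retry
+    return {'status': 'queued', 'retry_at': datetime.now(timezone.utc) + RETRY_DELAY}
+```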
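+
+### Q: How is hook stdout JSONL consumed while the hook is still running?
+**A:** Phase 5 calls for streaming, line-by-line parsing so partial output survives a crash. A minimal sketch (the record-application step via the model `from_jsonl()` helpers is only referenced, not implemented here):
+
+```python
+import json
+
+def iter_jsonl_records(stdout_stream):
+    """Yield decoded JSONL records from a hook's stdout as they are emitted."""
+    for line in stdout_stream:
+        line = line.strip()
+        if not line:
+            continue
+        try:
+            record = json.loads(line)
+        except json.JSONDecodeError:
+            continue  # tolerate a truncated trailing line if the hook crashed mid-write
+        if isinstance(record, dict) and record.get('type'):
+            yield record  # e.g. {"type": "ArchiveResult", "status": "succeeded", ...}
+```
+
+Each yielded record would be applied to the database as it arrives, so whatever a hook emitted before crashing is still recorded (failure mode 3, Partial Success).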
+ +## Examples + +### Foreground Hook (Sequential DOM Access) +```python +#!/usr/bin/env python3 +# archivebox/plugins/screenshot/on_Snapshot__51_screenshot.js + +# Runs at step 5, blocks step progression until complete +# Gets killed if it exceeds SCREENSHOT_TIMEOUT + +timeout = get_env_int('SCREENSHOT_TIMEOUT') or get_env_int('TIMEOUT', 60) + +try: + result = subprocess.run(cmd, capture_output=True, timeout=timeout) + if result.returncode == 0: + print(json.dumps({ + "type": "ArchiveResult", + "status": "succeeded", + "output_str": "screenshot.png" + })) + sys.exit(0) + else: + # Temporary failure - will be retried + sys.exit(1) +except subprocess.TimeoutExpired: + # Timeout - will be retried + sys.exit(1) +``` + +### Background Hook (Long-Running Download) +```python +#!/usr/bin/env python3 +# archivebox/plugins/media/on_Snapshot__63_media.bg.py + +# Runs at step 6, doesn't block step progression +# Gets full MEDIA_TIMEOUT (e.g., 3600s) regardless of when step 99 completes + +timeout = get_env_int('YTDLP_TIMEOUT') or get_env_int('MEDIA_TIMEOUT') or get_env_int('TIMEOUT', 3600) + +try: + result = subprocess.run(['yt-dlp', url], capture_output=True, timeout=timeout) + if result.returncode == 0: + print(json.dumps({ + "type": "ArchiveResult", + "status": "succeeded", + "output_str": "media/" + })) + sys.exit(0) + else: + # Hard failure - don't retry + print(json.dumps({ + "type": "ArchiveResult", + "status": "failed", + "output_str": "Video unavailable" + })) + sys.exit(0) # Exit 0 to record the failure +except subprocess.TimeoutExpired: + # Timeout - will be retried + sys.exit(1) +``` + +### Background Hook with Natural Exit +```javascript +#!/usr/bin/env node +// archivebox/plugins/ssl/on_Snapshot__23_ssl.bg.js + +// Sets up listener, captures SSL info, then exits naturally +// No SIGTERM handler needed - already exits when done + +async function main() { + const page = await connectToChrome(); + + // Set up listener + page.on('response', async (response) => { + const securityDetails = response.securityDetails(); + if (securityDetails) { + fs.writeFileSync('ssl.json', JSON.stringify(securityDetails)); + } + }); + + // Wait for navigation (done by other hook) + await waitForNavigation(); + + // Emit result + console.log(JSON.stringify({ + type: 'ArchiveResult', + status: 'succeeded', + output_str: 'ssl.json' + })); + + process.exit(0); // Natural exit - no await indefinitely +} + +main().catch(e => { + console.error(`ERROR: ${e.message}`); + process.exit(1); // Will be retried +}); +``` + +## Summary + +This plan provides: +- ✅ Clear execution ordering (10 steps, 00-99 numbering) +- ✅ Async support (.bg suffix) +- ✅ Independent timeout control per plugin +- ✅ Flexible failure handling & retry logic +- ✅ Streaming JSONL output for DB updates +- ✅ Simple filesystem-based coordination +- ✅ Backward compatibility during migration + +The main implementation work is refactoring `Snapshot.run()` to enforce step ordering and manage .bg script lifecycles. Plugin renumbering is straightforward mechanical work. 
diff --git a/archivebox/api/v1_cli.py b/archivebox/api/v1_cli.py index 9282acce..3359ca54 100644 --- a/archivebox/api/v1_cli.py +++ b/archivebox/api/v1_cli.py @@ -54,7 +54,7 @@ class AddCommandSchema(Schema): tag: str = "" depth: int = 0 parser: str = "auto" - extract: str = "" + plugins: str = "" update: bool = not ARCHIVING_CONFIG.ONLY_NEW # Default to the opposite of ARCHIVING_CONFIG.ONLY_NEW overwrite: bool = False index_only: bool = False @@ -69,7 +69,7 @@ class UpdateCommandSchema(Schema): status: Optional[StatusChoices] = StatusChoices.unarchived filter_type: Optional[str] = FilterTypeChoices.substring filter_patterns: Optional[List[str]] = ['https://example.com'] - extractors: Optional[str] = "" + plugins: Optional[str] = "" class ScheduleCommandSchema(Schema): import_path: Optional[str] = None @@ -115,7 +115,7 @@ def cli_add(request, args: AddCommandSchema): update=args.update, index_only=args.index_only, overwrite=args.overwrite, - plugins=args.extract, # extract in API maps to plugins param + plugins=args.plugins, parser=args.parser, bg=True, # Always run in background for API calls ) @@ -143,7 +143,7 @@ def cli_update(request, args: UpdateCommandSchema): status=args.status, filter_type=args.filter_type, filter_patterns=args.filter_patterns, - extractors=args.extractors, + plugins=args.plugins, ) return { "success": True, diff --git a/archivebox/api/v1_core.py b/archivebox/api/v1_core.py index 7f4f4f37..3d83d710 100644 --- a/archivebox/api/v1_core.py +++ b/archivebox/api/v1_core.py @@ -65,7 +65,8 @@ class MinimalArchiveResultSchema(Schema): created_by_username: str status: str retry_at: datetime | None - extractor: str + plugin: str + hook_name: str cmd_version: str | None cmd: list[str] | None pwd: str | None @@ -113,13 +114,14 @@ class ArchiveResultSchema(MinimalArchiveResultSchema): class ArchiveResultFilterSchema(FilterSchema): id: Optional[str] = Field(None, q=['id__startswith', 'snapshot__id__startswith', 'snapshot__timestamp__startswith']) - search: Optional[str] = Field(None, q=['snapshot__url__icontains', 'snapshot__title__icontains', 'snapshot__tags__name__icontains', 'extractor', 'output_str__icontains', 'id__startswith', 'snapshot__id__startswith', 'snapshot__timestamp__startswith']) + search: Optional[str] = Field(None, q=['snapshot__url__icontains', 'snapshot__title__icontains', 'snapshot__tags__name__icontains', 'plugin', 'output_str__icontains', 'id__startswith', 'snapshot__id__startswith', 'snapshot__timestamp__startswith']) snapshot_id: Optional[str] = Field(None, q=['snapshot__id__startswith', 'snapshot__timestamp__startswith']) snapshot_url: Optional[str] = Field(None, q='snapshot__url__icontains') snapshot_tag: Optional[str] = Field(None, q='snapshot__tags__name__icontains') status: Optional[str] = Field(None, q='status') output_str: Optional[str] = Field(None, q='output_str__icontains') - extractor: Optional[str] = Field(None, q='extractor__icontains') + plugin: Optional[str] = Field(None, q='plugin__icontains') + hook_name: Optional[str] = Field(None, q='hook_name__icontains') cmd: Optional[str] = Field(None, q='cmd__0__icontains') pwd: Optional[str] = Field(None, q='pwd__icontains') cmd_version: Optional[str] = Field(None, q='cmd_version') diff --git a/archivebox/cli/archivebox_add.py b/archivebox/cli/archivebox_add.py index 4a848d13..f868787d 100644 --- a/archivebox/cli/archivebox_add.py +++ b/archivebox/cli/archivebox_add.py @@ -86,7 +86,7 @@ def add(urls: str | list[str], 'ONLY_NEW': not update, 'INDEX_ONLY': index_only, 'OVERWRITE': overwrite, - 
'EXTRACTORS': plugins, + 'PLUGINS': plugins, 'DEFAULT_PERSONA': persona or 'Default', 'PARSER': parser, } diff --git a/archivebox/cli/archivebox_extract.py b/archivebox/cli/archivebox_extract.py index 7ebdc385..45eeb331 100644 --- a/archivebox/cli/archivebox_extract.py +++ b/archivebox/cli/archivebox_extract.py @@ -40,7 +40,7 @@ def process_archiveresult_by_id(archiveresult_id: str) -> int: """ Run extraction for a single ArchiveResult by ID (used by workers). - Triggers the ArchiveResult's state machine tick() to run the extractor. + Triggers the ArchiveResult's state machine tick() to run the extractor plugin. """ from rich import print as rprint from core.models import ArchiveResult @@ -51,7 +51,7 @@ def process_archiveresult_by_id(archiveresult_id: str) -> int: rprint(f'[red]ArchiveResult {archiveresult_id} not found[/red]', file=sys.stderr) return 1 - rprint(f'[blue]Extracting {archiveresult.extractor} for {archiveresult.snapshot.url}[/blue]', file=sys.stderr) + rprint(f'[blue]Extracting {archiveresult.plugin} for {archiveresult.snapshot.url}[/blue]', file=sys.stderr) try: # Trigger state machine tick - this runs the actual extraction @@ -151,7 +151,7 @@ def run_plugins( # Only create for specific plugin result, created = ArchiveResult.objects.get_or_create( snapshot=snapshot, - extractor=plugin, + plugin=plugin, defaults={ 'status': ArchiveResult.StatusChoices.QUEUED, 'retry_at': timezone.now(), @@ -193,7 +193,7 @@ def run_plugins( snapshot = Snapshot.objects.get(id=snapshot_id) results = snapshot.archiveresult_set.all() if plugin: - results = results.filter(extractor=plugin) + results = results.filter(plugin=plugin) for result in results: if is_tty: @@ -202,7 +202,7 @@ def run_plugins( 'failed': 'red', 'skipped': 'yellow', }.get(result.status, 'dim') - rprint(f' [{status_color}]{result.status}[/{status_color}] {result.extractor} → {result.output_str or ""}', file=sys.stderr) + rprint(f' [{status_color}]{result.status}[/{status_color}] {result.plugin} → {result.output_str or ""}', file=sys.stderr) else: write_record(archiveresult_to_jsonl(result)) except Snapshot.DoesNotExist: diff --git a/archivebox/core/admin_archiveresults.py b/archivebox/core/admin_archiveresults.py index 749170ab..1acaf27a 100644 --- a/archivebox/core/admin_archiveresults.py +++ b/archivebox/core/admin_archiveresults.py @@ -13,16 +13,16 @@ from archivebox.config import DATA_DIR from archivebox.config.common import SERVER_CONFIG from archivebox.misc.paginators import AccelleratedPaginator from archivebox.base_models.admin import BaseModelAdmin -from archivebox.hooks import get_extractor_icon +from archivebox.hooks import get_plugin_icon from core.models import ArchiveResult, Snapshot def render_archiveresults_list(archiveresults_qs, limit=50): - """Render a nice inline list view of archive results with status, extractor, output, and actions.""" + """Render a nice inline list view of archive results with status, plugin, output, and actions.""" - results = list(archiveresults_qs.order_by('extractor').select_related('snapshot')[:limit]) + results = list(archiveresults_qs.order_by('plugin').select_related('snapshot')[:limit]) if not results: return mark_safe('
No Archive Results yet...
') @@ -40,8 +40,8 @@ def render_archiveresults_list(archiveresults_qs, limit=50): status = result.status or 'queued' color, bg = status_colors.get(status, ('#6b7280', '#f3f4f6')) - # Get extractor icon - icon = get_extractor_icon(result.extractor) + # Get plugin icon + icon = get_plugin_icon(result.plugin) # Format timestamp end_time = result.end_ts.strftime('%Y-%m-%d %H:%M:%S') if result.end_ts else '-' @@ -79,7 +79,7 @@ def render_archiveresults_list(archiveresults_qs, limit=50): font-size: 11px; font-weight: 600; text-transform: uppercase; color: {color}; background: {bg};">{status} - + {icon} @@ -88,7 +88,7 @@ def render_archiveresults_list(archiveresults_qs, limit=50): title="View output fullscreen" onmouseover="this.style.color='#2563eb'; this.style.textDecoration='underline';" onmouseout="this.style.color='#334155'; this.style.textDecoration='none';"> - {result.extractor} + {result.plugin} @@ -162,7 +162,7 @@ def render_archiveresults_list(archiveresults_qs, limit=50): ID Status - Extractor + Plugin Output Completed Version @@ -185,9 +185,9 @@ class ArchiveResultInline(admin.TabularInline): parent_model = Snapshot # fk_name = 'snapshot' extra = 0 - sort_fields = ('end_ts', 'extractor', 'output_str', 'status', 'cmd_version') + sort_fields = ('end_ts', 'plugin', 'output_str', 'status', 'cmd_version') readonly_fields = ('id', 'result_id', 'completed', 'command', 'version') - fields = ('start_ts', 'end_ts', *readonly_fields, 'extractor', 'cmd', 'cmd_version', 'pwd', 'created_by', 'status', 'retry_at', 'output_str') + fields = ('start_ts', 'end_ts', *readonly_fields, 'plugin', 'cmd', 'cmd_version', 'pwd', 'created_by', 'status', 'retry_at', 'output_str') # exclude = ('id',) ordering = ('end_ts',) show_change_link = True @@ -253,9 +253,9 @@ class ArchiveResultInline(admin.TabularInline): class ArchiveResultAdmin(BaseModelAdmin): list_display = ('id', 'created_by', 'created_at', 'snapshot_info', 'tags_str', 'status', 'extractor_with_icon', 'cmd_str', 'output_str') - sort_fields = ('id', 'created_by', 'created_at', 'extractor', 'status') + sort_fields = ('id', 'created_by', 'created_at', 'plugin', 'status') readonly_fields = ('cmd_str', 'snapshot_info', 'tags_str', 'created_at', 'modified_at', 'output_summary', 'extractor_with_icon', 'iface') - search_fields = ('id', 'snapshot__url', 'extractor', 'output_str', 'cmd_version', 'cmd', 'snapshot__timestamp') + search_fields = ('id', 'snapshot__url', 'plugin', 'output_str', 'cmd_version', 'cmd', 'snapshot__timestamp') autocomplete_fields = ['snapshot'] fieldsets = ( @@ -263,8 +263,8 @@ class ArchiveResultAdmin(BaseModelAdmin): 'fields': ('snapshot', 'snapshot_info', 'tags_str'), 'classes': ('card', 'wide'), }), - ('Extractor', { - 'fields': ('extractor', 'extractor_with_icon', 'status', 'retry_at', 'iface'), + ('Plugin', { + 'fields': ('plugin', 'plugin_with_icon', 'status', 'retry_at', 'iface'), 'classes': ('card',), }), ('Timing', { @@ -285,7 +285,7 @@ class ArchiveResultAdmin(BaseModelAdmin): }), ) - list_filter = ('status', 'extractor', 'start_ts', 'cmd_version') + list_filter = ('status', 'plugin', 'start_ts', 'cmd_version') ordering = ['-start_ts'] list_per_page = SERVER_CONFIG.SNAPSHOTS_PER_PAGE @@ -321,14 +321,14 @@ class ArchiveResultAdmin(BaseModelAdmin): def tags_str(self, result): return result.snapshot.tags_str() - @admin.display(description='Extractor', ordering='extractor') - def extractor_with_icon(self, result): - icon = get_extractor_icon(result.extractor) + @admin.display(description='Plugin', ordering='plugin') + def 
plugin_with_icon(self, result): + icon = get_plugin_icon(result.plugin) return format_html( '{} {}', - result.extractor, + result.plugin, icon, - result.extractor, + result.plugin, ) def cmd_str(self, result): diff --git a/archivebox/core/forms.py b/archivebox/core/forms.py index a4390d96..4aa2fb9e 100644 --- a/archivebox/core/forms.py +++ b/archivebox/core/forms.py @@ -10,19 +10,19 @@ DEPTH_CHOICES = ( ('1', 'depth = 1 (archive these URLs and all URLs one hop away)'), ) -from archivebox.hooks import get_extractors +from archivebox.hooks import get_plugins -def get_archive_methods(): - """Get available archive methods from discovered hooks.""" - return [(name, name) for name in get_extractors()] +def get_plugin_choices(): + """Get available extractor plugins from discovered hooks.""" + return [(name, name) for name in get_plugins()] class AddLinkForm(forms.Form): url = forms.RegexField(label="URLs (one per line)", regex=URL_REGEX, min_length='6', strip=True, widget=forms.Textarea, required=True) tag = forms.CharField(label="Tags (comma separated tag1,tag2,tag3)", strip=True, required=False) depth = forms.ChoiceField(label="Archive depth", choices=DEPTH_CHOICES, initial='0', widget=forms.RadioSelect(attrs={"class": "depth-selection"})) - archive_methods = forms.MultipleChoiceField( - label="Archive methods (select at least 1, otherwise all will be used by default)", + plugins = forms.MultipleChoiceField( + label="Plugins (select at least 1, otherwise all will be used by default)", required=False, widget=forms.SelectMultiple, choices=[], # populated dynamically in __init__ @@ -30,7 +30,7 @@ class AddLinkForm(forms.Form): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) - self.fields['archive_methods'].choices = get_archive_methods() + self.fields['plugins'].choices = get_plugin_choices() # TODO: hook these up to the view and put them # in a collapsible UI section labeled "Advanced" # diff --git a/archivebox/core/models.py b/archivebox/core/models.py index 928abf80..673c85a9 100755 --- a/archivebox/core/models.py +++ b/archivebox/core/models.py @@ -867,6 +867,7 @@ class Snapshot(ModelWithOutputDir, ModelWithConfig, ModelWithNotes, ModelWithHea ArchiveResult.objects.create( snapshot=self, plugin=plugin, + hook_name=result_data.get('hook_name', ''), status=result_data.get('status', 'failed'), output_str=result_data.get('output', ''), cmd=result_data.get('cmd', []), @@ -1162,7 +1163,7 @@ class Snapshot(ModelWithOutputDir, ModelWithConfig, ModelWithNotes, ModelWithHea continue pid_file = plugin_dir / 'hook.pid' if pid_file.exists(): - kill_process(pid_file) + kill_process(pid_file, validate=True) # Use validation # Update the ArchiveResult from filesystem plugin_name = plugin_dir.name diff --git a/archivebox/core/statemachines.py b/archivebox/core/statemachines.py index 81c453aa..89dda0c8 100644 --- a/archivebox/core/statemachines.py +++ b/archivebox/core/statemachines.py @@ -192,7 +192,7 @@ class ArchiveResultMachine(StateMachine, strict_states=True): def is_backoff(self) -> bool: """Check if we should backoff and retry later.""" - # Backoff if status is still started (extractor didn't complete) and output_str is empty + # Backoff if status is still started (plugin didn't complete) and output_str is empty return ( self.archiveresult.status == ArchiveResult.StatusChoices.STARTED and not self.archiveresult.output_str @@ -222,19 +222,19 @@ class ArchiveResultMachine(StateMachine, strict_states=True): # Suppressed: state transition logs # Lock the object and mark start time 
self.archiveresult.update_for_workers( - retry_at=timezone.now() + timedelta(seconds=120), # 2 min timeout for extractor + retry_at=timezone.now() + timedelta(seconds=120), # 2 min timeout for plugin status=ArchiveResult.StatusChoices.STARTED, start_ts=timezone.now(), iface=NetworkInterface.current(), ) - # Run the extractor - this updates status, output, timestamps, etc. + # Run the plugin - this updates status, output, timestamps, etc. self.archiveresult.run() # Save the updated result self.archiveresult.save() - # Suppressed: extractor result logs (already logged by worker) + # Suppressed: plugin result logs (already logged by worker) @backoff.enter def enter_backoff(self): diff --git a/archivebox/core/templatetags/core_tags.py b/archivebox/core/templatetags/core_tags.py index 33a620c0..685665a4 100644 --- a/archivebox/core/templatetags/core_tags.py +++ b/archivebox/core/templatetags/core_tags.py @@ -5,7 +5,7 @@ from django.utils.safestring import mark_safe from typing import Union from archivebox.hooks import ( - get_extractor_icon, get_extractor_template, get_extractor_name, + get_plugin_icon, get_plugin_template, get_plugin_name, ) @@ -51,30 +51,30 @@ def url_replace(context, **kwargs): @register.simple_tag -def extractor_icon(extractor: str) -> str: +def plugin_icon(plugin: str) -> str: """ - Render the icon for an extractor. + Render the icon for a plugin. - Usage: {% extractor_icon "screenshot" %} + Usage: {% plugin_icon "screenshot" %} """ - return mark_safe(get_extractor_icon(extractor)) + return mark_safe(get_plugin_icon(plugin)) @register.simple_tag(takes_context=True) -def extractor_thumbnail(context, result) -> str: +def plugin_thumbnail(context, result) -> str: """ Render the thumbnail template for an archive result. - Usage: {% extractor_thumbnail result %} + Usage: {% plugin_thumbnail result %} Context variables passed to template: - result: ArchiveResult object - snapshot: Parent Snapshot object - output_path: Path to output relative to snapshot dir (from embed_path()) - - extractor: Extractor base name + - plugin: Plugin base name """ - extractor = get_extractor_name(result.extractor) - template_str = get_extractor_template(extractor, 'thumbnail') + plugin = get_plugin_name(result.plugin) + template_str = get_plugin_template(plugin, 'thumbnail') if not template_str: return '' @@ -89,7 +89,7 @@ def extractor_thumbnail(context, result) -> str: 'result': result, 'snapshot': result.snapshot, 'output_path': output_path, - 'extractor': extractor, + 'plugin': plugin, }) return mark_safe(tpl.render(ctx)) except Exception: @@ -97,14 +97,14 @@ def extractor_thumbnail(context, result) -> str: @register.simple_tag(takes_context=True) -def extractor_embed(context, result) -> str: +def plugin_embed(context, result) -> str: """ Render the embed iframe template for an archive result. 
- Usage: {% extractor_embed result %} + Usage: {% plugin_embed result %} """ - extractor = get_extractor_name(result.extractor) - template_str = get_extractor_template(extractor, 'embed') + plugin = get_plugin_name(result.plugin) + template_str = get_plugin_template(plugin, 'embed') if not template_str: return '' @@ -117,7 +117,7 @@ def extractor_embed(context, result) -> str: 'result': result, 'snapshot': result.snapshot, 'output_path': output_path, - 'extractor': extractor, + 'plugin': plugin, }) return mark_safe(tpl.render(ctx)) except Exception: @@ -125,14 +125,14 @@ def extractor_embed(context, result) -> str: @register.simple_tag(takes_context=True) -def extractor_fullscreen(context, result) -> str: +def plugin_fullscreen(context, result) -> str: """ Render the fullscreen template for an archive result. - Usage: {% extractor_fullscreen result %} + Usage: {% plugin_fullscreen result %} """ - extractor = get_extractor_name(result.extractor) - template_str = get_extractor_template(extractor, 'fullscreen') + plugin = get_plugin_name(result.plugin) + template_str = get_plugin_template(plugin, 'fullscreen') if not template_str: return '' @@ -145,7 +145,7 @@ def extractor_fullscreen(context, result) -> str: 'result': result, 'snapshot': result.snapshot, 'output_path': output_path, - 'extractor': extractor, + 'plugin': plugin, }) return mark_safe(tpl.render(ctx)) except Exception: @@ -153,10 +153,10 @@ def extractor_fullscreen(context, result) -> str: @register.filter -def extractor_name(value: str) -> str: +def plugin_name(value: str) -> str: """ - Get the base name of an extractor (strips numeric prefix). + Get the base name of a plugin (strips numeric prefix). - Usage: {{ result.extractor|extractor_name }} + Usage: {{ result.plugin|plugin_name }} """ - return get_extractor_name(value) + return get_plugin_name(value) diff --git a/archivebox/core/views.py b/archivebox/core/views.py index 1ffb20b8..df17924a 100644 --- a/archivebox/core/views.py +++ b/archivebox/core/views.py @@ -56,9 +56,9 @@ class SnapshotView(View): def render_live_index(request, snapshot): TITLE_LOADING_MSG = 'Not yet archived...' 
- # Dict of extractor -> ArchiveResult object + # Dict of plugin -> ArchiveResult object archiveresult_objects = {} - # Dict of extractor -> result info dict (for template compatibility) + # Dict of plugin -> result info dict (for template compatibility) archiveresults = {} results = snapshot.archiveresult_set.all() @@ -75,16 +75,16 @@ class SnapshotView(View): continue # Store the full ArchiveResult object for template tags - archiveresult_objects[result.extractor] = result + archiveresult_objects[result.plugin] = result result_info = { - 'name': result.extractor, + 'name': result.plugin, 'path': embed_path, 'ts': ts_to_date_str(result.end_ts), 'size': abs_path.stat().st_size or '?', 'result': result, # Include the full object for template tags } - archiveresults[result.extractor] = result_info + archiveresults[result.plugin] = result_info # Use canonical_outputs for intelligent discovery # This method now scans ArchiveResults and uses smart heuristics @@ -119,8 +119,8 @@ class SnapshotView(View): # Get available extractors from hooks (sorted by numeric prefix for ordering) # Convert to base names for display ordering - all_extractors = [get_extractor_name(e) for e in get_extractors()] - preferred_types = tuple(all_extractors) + all_plugins = [get_extractor_name(e) for e in get_extractors()] + preferred_types = tuple(all_plugins) all_types = preferred_types + tuple(result_type for result_type in archiveresults.keys() if result_type not in preferred_types) best_result = {'path': 'None', 'result': None} @@ -463,7 +463,6 @@ class AddView(UserPassesTestMixin, FormView): urls_content = sources_file.read_text() crawl = Crawl.objects.create( urls=urls_content, - extractor=parser, max_depth=depth, tags_str=tag, label=f'{self.request.user.username}@{HOSTNAME}{self.request.path} {timestamp}', @@ -598,12 +597,12 @@ def live_progress_view(request): ArchiveResult.StatusChoices.SUCCEEDED: 2, ArchiveResult.StatusChoices.FAILED: 3, } - return (status_order.get(ar.status, 4), ar.extractor) + return (status_order.get(ar.status, 4), ar.plugin) - all_extractors = [ + all_plugins = [ { 'id': str(ar.id), - 'extractor': ar.extractor, + 'plugin': ar.plugin, 'status': ar.status, } for ar in sorted(snapshot_results, key=extractor_sort_key) @@ -619,7 +618,7 @@ def live_progress_view(request): 'completed_extractors': completed_extractors, 'failed_extractors': failed_extractors, 'pending_extractors': pending_extractors, - 'all_extractors': all_extractors, + 'all_plugins': all_plugins, }) # Check if crawl can start (for debugging stuck crawls) diff --git a/archivebox/crawls/models.py b/archivebox/crawls/models.py index 3ce21d99..b143f13f 100755 --- a/archivebox/crawls/models.py +++ b/archivebox/crawls/models.py @@ -353,10 +353,18 @@ class Crawl(ModelWithOutputDir, ModelWithConfig, ModelWithHealthStats, ModelWith import signal from pathlib import Path from archivebox.hooks import run_hook, discover_hooks + from archivebox.misc.process_utils import validate_pid_file # Kill any background processes by scanning for all .pid files if self.OUTPUT_DIR.exists(): for pid_file in self.OUTPUT_DIR.glob('**/*.pid'): + # Validate PID before killing to avoid killing unrelated processes + cmd_file = pid_file.parent / 'cmd.sh' + if not validate_pid_file(pid_file, cmd_file): + # PID reused by different process or process dead + pid_file.unlink(missing_ok=True) + continue + try: pid = int(pid_file.read_text().strip()) try: diff --git a/archivebox/hooks.py b/archivebox/hooks.py index e308dc51..aff3ea22 100644 --- a/archivebox/hooks.py 
+++ b/archivebox/hooks.py @@ -292,8 +292,13 @@ def run_hook( stdout_file = output_dir / 'stdout.log' stderr_file = output_dir / 'stderr.log' pid_file = output_dir / 'hook.pid' + cmd_file = output_dir / 'cmd.sh' try: + # Write command script for validation + from archivebox.misc.process_utils import write_cmd_file + write_cmd_file(cmd_file, cmd) + # Open log files for writing with open(stdout_file, 'w') as out, open(stderr_file, 'w') as err: process = subprocess.Popen( @@ -304,8 +309,10 @@ def run_hook( env=env, ) - # Write PID for all hooks (useful for debugging/cleanup) - pid_file.write_text(str(process.pid)) + # Write PID with mtime set to process start time for validation + from archivebox.misc.process_utils import write_pid_file_with_mtime + process_start_time = time.time() + write_pid_file_with_mtime(pid_file, process.pid, process_start_time) if is_background: # Background hook - return None immediately, don't wait @@ -1327,21 +1334,29 @@ def process_is_alive(pid_file: Path) -> bool: return False -def kill_process(pid_file: Path, sig: int = signal.SIGTERM): +def kill_process(pid_file: Path, sig: int = signal.SIGTERM, validate: bool = True): """ - Kill process in PID file. + Kill process in PID file with optional validation. Args: pid_file: Path to hook.pid file sig: Signal to send (default SIGTERM) + validate: If True, validate process identity before killing (default: True) """ - if not pid_file.exists(): - return - - try: - pid = int(pid_file.read_text().strip()) - os.kill(pid, sig) - except (OSError, ValueError): - pass + from archivebox.misc.process_utils import safe_kill_process + + if validate: + # Use safe kill with validation + cmd_file = pid_file.parent / 'cmd.sh' + safe_kill_process(pid_file, cmd_file, signal_num=sig, validate=True) + else: + # Legacy behavior - kill without validation + if not pid_file.exists(): + return + try: + pid = int(pid_file.read_text().strip()) + os.kill(pid, sig) + except (OSError, ValueError): + pass diff --git a/archivebox/misc/jsonl.py b/archivebox/misc/jsonl.py index 50cbd3e5..3e9f6e97 100644 --- a/archivebox/misc/jsonl.py +++ b/archivebox/misc/jsonl.py @@ -178,7 +178,8 @@ def archiveresult_to_jsonl(result) -> Dict[str, Any]: 'type': TYPE_ARCHIVERESULT, 'id': str(result.id), 'snapshot_id': str(result.snapshot_id), - 'extractor': result.extractor, + 'plugin': result.plugin, + 'hook_name': result.hook_name, 'status': result.status, 'output_str': result.output_str, 'start_ts': result.start_ts.isoformat() if result.start_ts else None, diff --git a/archivebox/plugins/chrome/on_Crawl__20_chrome_launch.bg.js b/archivebox/plugins/chrome/on_Crawl__20_chrome_launch.bg.js index 7ee41eda..3ae9a039 100644 --- a/archivebox/plugins/chrome/on_Crawl__20_chrome_launch.bg.js +++ b/archivebox/plugins/chrome/on_Crawl__20_chrome_launch.bg.js @@ -29,6 +29,98 @@ const http = require('http'); const EXTRACTOR_NAME = 'chrome_launch'; const OUTPUT_DIR = 'chrome'; +// Helper: Write PID file with mtime set to process start time +function writePidWithMtime(filePath, pid, startTimeSeconds) { + fs.writeFileSync(filePath, String(pid)); + // Set both atime and mtime to process start time for validation + const startTimeMs = startTimeSeconds * 1000; + fs.utimesSync(filePath, new Date(startTimeMs), new Date(startTimeMs)); +} + +// Helper: Write command script for validation +function writeCmdScript(filePath, binary, args) { + // Shell escape arguments containing spaces or special characters + const escapedArgs = args.map(arg => { + if (arg.includes(' ') || arg.includes('"') || 
arg.includes('$')) { + return `"${arg.replace(/"/g, '\\"')}"`; + } + return arg; + }); + const script = `#!/bin/bash\n${binary} ${escapedArgs.join(' ')}\n`; + fs.writeFileSync(filePath, script); + fs.chmodSync(filePath, 0o755); +} + +// Helper: Get process start time (cross-platform) +function getProcessStartTime(pid) { + try { + const { execSync } = require('child_process'); + if (process.platform === 'darwin') { + // macOS: ps -p PID -o lstart= gives start time + const output = execSync(`ps -p ${pid} -o lstart=`, { encoding: 'utf8', timeout: 1000 }); + return Date.parse(output.trim()) / 1000; // Convert to epoch seconds + } else { + // Linux: read /proc/PID/stat field 22 (starttime in clock ticks) + const stat = fs.readFileSync(`/proc/${pid}/stat`, 'utf8'); + const match = stat.match(/\) \w+ (\d+)/); + if (match) { + const startTicks = parseInt(match[1], 10); + // Convert clock ticks to seconds (assuming 100 ticks/sec) + const uptimeSeconds = parseFloat(fs.readFileSync('/proc/uptime', 'utf8').split(' ')[0]); + const bootTime = Date.now() / 1000 - uptimeSeconds; + return bootTime + (startTicks / 100); + } + } + } catch (e) { + // Can't get start time + return null; + } + return null; +} + +// Helper: Validate PID using mtime and command +function validatePid(pid, pidFile, cmdFile) { + try { + // Check process exists + try { + process.kill(pid, 0); // Signal 0 = check existence + } catch (e) { + return false; // Process doesn't exist + } + + // Check mtime matches process start time (within 5 sec tolerance) + const fileStat = fs.statSync(pidFile); + const fileMtime = fileStat.mtimeMs / 1000; // Convert to seconds + const procStartTime = getProcessStartTime(pid); + + if (procStartTime === null) { + // Can't validate - fall back to basic existence check + return true; + } + + if (Math.abs(fileMtime - procStartTime) > 5) { + // PID was reused by different process + return false; + } + + // Validate command if available + if (fs.existsSync(cmdFile)) { + const cmd = fs.readFileSync(cmdFile, 'utf8'); + // Check for Chrome/Chromium and debug port + if (!cmd.includes('chrome') && !cmd.includes('chromium')) { + return false; + } + if (!cmd.includes('--remote-debugging-port')) { + return false; + } + } + + return true; + } catch (e) { + return false; + } +} + // Global state for cleanup let chromePid = null; @@ -240,17 +332,17 @@ function killZombieChrome() { const pid = parseInt(fs.readFileSync(pidFile, 'utf8').trim(), 10); if (isNaN(pid) || pid <= 0) continue; - // Check if process exists - try { - process.kill(pid, 0); - } catch (e) { - // Process dead, remove stale PID file + // Validate PID before killing + const cmdFile = path.join(chromeDir, 'cmd.sh'); + if (!validatePid(pid, pidFile, cmdFile)) { + // PID reused or validation failed + console.error(`[!] PID ${pid} failed validation (reused or wrong process) - cleaning up`); try { fs.unlinkSync(pidFile); } catch (e) {} continue; } - // Process alive but crawl is stale - zombie! - console.error(`[!] Found zombie (PID ${pid}) from stale crawl ${crawl.name}`); + // Process alive, validated, and crawl is stale - zombie! + console.error(`[!] 
Found validated zombie (PID ${pid}) from stale crawl ${crawl.name}`); try { // Kill process group first @@ -386,15 +478,20 @@ async function launchChrome(binary) { chromeProcess.unref(); // Don't keep Node.js process running chromePid = chromeProcess.pid; + const chromeStartTime = Date.now() / 1000; // Unix epoch seconds console.error(`[*] Launched Chrome (PID: ${chromePid}), waiting for debug port...`); - // Write Chrome PID for backup cleanup (named .pid so Crawl.cleanup() finds it) - fs.writeFileSync(path.join(OUTPUT_DIR, 'chrome.pid'), String(chromePid)); + // Write Chrome PID with mtime set to start time for validation + writePidWithMtime(path.join(OUTPUT_DIR, 'chrome.pid'), chromePid, chromeStartTime); + + // Write command script for validation + writeCmdScript(path.join(OUTPUT_DIR, 'cmd.sh'), binary, chromeArgs); + fs.writeFileSync(path.join(OUTPUT_DIR, 'port.txt'), String(debugPort)); - // Write hook's own PID so Crawl.cleanup() can kill this hook process - // (which will trigger our SIGTERM handler to kill Chrome) - fs.writeFileSync(path.join(OUTPUT_DIR, 'hook.pid'), String(process.pid)); + // Write hook's own PID with mtime for validation + const hookStartTime = Date.now() / 1000; + writePidWithMtime(path.join(OUTPUT_DIR, 'hook.pid'), process.pid, hookStartTime); try { // Wait for Chrome to be ready diff --git a/archivebox/templates/admin/progress_monitor.html b/archivebox/templates/admin/progress_monitor.html index 10286104..a2be9eda 100644 --- a/archivebox/templates/admin/progress_monitor.html +++ b/archivebox/templates/admin/progress_monitor.html @@ -488,7 +488,7 @@ ${icon} - ${extractor.extractor} + ${extractor.plugin} `; @@ -499,10 +499,10 @@ const adminUrl = `/admin/core/snapshot/${snapshot.id}/change/`; let extractorHtml = ''; - if (snapshot.all_extractors && snapshot.all_extractors.length > 0) { - // Sort extractors alphabetically by name to prevent reordering on updates - const sortedExtractors = [...snapshot.all_extractors].sort((a, b) => - a.extractor.localeCompare(b.extractor) + if (snapshot.all_plugins && snapshot.all_plugins.length > 0) { + // Sort plugins alphabetically by name to prevent reordering on updates + const sortedExtractors = [...snapshot.all_plugins].sort((a, b) => + a.plugin.localeCompare(b.plugin) ); extractorHtml = `
diff --git a/archivebox/tests/test_migrations_helpers.py b/archivebox/tests/test_migrations_helpers.py index eddaa4e8..b634583b 100644 --- a/archivebox/tests/test_migrations_helpers.py +++ b/archivebox/tests/test_migrations_helpers.py @@ -158,7 +158,7 @@ CREATE TABLE IF NOT EXISTS core_snapshot_tags ( CREATE TABLE IF NOT EXISTS core_archiveresult ( id INTEGER PRIMARY KEY AUTOINCREMENT, snapshot_id CHAR(32) NOT NULL REFERENCES core_snapshot(id), - extractor VARCHAR(32) NOT NULL, + plugin VARCHAR(32) NOT NULL, cmd TEXT, pwd VARCHAR(256), cmd_version VARCHAR(128), @@ -379,7 +379,7 @@ CREATE TABLE IF NOT EXISTS crawls_seed ( created_by_id INTEGER NOT NULL REFERENCES auth_user(id), modified_at DATETIME, uri VARCHAR(2048) NOT NULL, - extractor VARCHAR(32) NOT NULL DEFAULT 'auto', + plugin VARCHAR(32) NOT NULL DEFAULT 'auto', tags_str VARCHAR(255) NOT NULL DEFAULT '', label VARCHAR(255) NOT NULL DEFAULT '', config TEXT DEFAULT '{}', @@ -465,7 +465,7 @@ CREATE TABLE IF NOT EXISTS core_archiveresult ( created_at DATETIME NOT NULL, modified_at DATETIME, snapshot_id CHAR(36) NOT NULL REFERENCES core_snapshot(id), - extractor VARCHAR(32) NOT NULL, + plugin VARCHAR(32) NOT NULL, pwd VARCHAR(256), cmd TEXT, cmd_version VARCHAR(128), diff --git a/archivebox/workers/worker.py b/archivebox/workers/worker.py index b97eb435..ca67cccc 100644 --- a/archivebox/workers/worker.py +++ b/archivebox/workers/worker.py @@ -227,10 +227,10 @@ class Worker: urls = obj.get_urls_list() url = urls[0] if urls else None - extractor = None - if hasattr(obj, 'extractor'): + plugin = None + if hasattr(obj, 'plugin'): # ArchiveResultWorker, Crawl - extractor = obj.extractor + plugin = obj.plugin log_worker_event( worker_type=worker_type_name, @@ -239,7 +239,7 @@ class Worker: pid=self.pid, worker_id=str(self.worker_id), url=url, - extractor=extractor, + plugin=plugin, metadata=start_metadata if start_metadata else None, ) @@ -262,7 +262,7 @@ class Worker: pid=self.pid, worker_id=str(self.worker_id), url=url, - extractor=extractor, + plugin=plugin, metadata=complete_metadata, ) else: @@ -345,9 +345,9 @@ class ArchiveResultWorker(Worker): name: ClassVar[str] = 'archiveresult' MAX_TICK_TIME: ClassVar[int] = 120 - def __init__(self, extractor: str | None = None, **kwargs: Any): + def __init__(self, plugin: str | None = None, **kwargs: Any): super().__init__(**kwargs) - self.extractor = extractor + self.plugin = plugin def get_model(self): from core.models import ArchiveResult @@ -359,16 +359,16 @@ class ArchiveResultWorker(Worker): qs = super().get_queue() - if self.extractor: - qs = qs.filter(extractor=self.extractor) + if self.plugin: + qs = qs.filter(plugin=self.plugin) # Note: Removed blocking logic since plugins have separate output directories - # and don't interfere with each other. Each plugin (extractor) runs independently. + # and don't interfere with each other. Each plugin runs independently. 
return qs def process_item(self, obj) -> bool: - """Process an ArchiveResult by running its extractor.""" + """Process an ArchiveResult by running its plugin.""" try: obj.sm.tick() return True @@ -378,8 +378,8 @@ class ArchiveResultWorker(Worker): return False @classmethod - def start(cls, worker_id: int | None = None, daemon: bool = False, extractor: str | None = None, **kwargs: Any) -> int: - """Fork a new worker as subprocess with optional extractor filter.""" + def start(cls, worker_id: int | None = None, daemon: bool = False, plugin: str | None = None, **kwargs: Any) -> int: + """Fork a new worker as subprocess with optional plugin filter.""" if worker_id is None: worker_id = get_next_worker_id(cls.name) @@ -387,7 +387,7 @@ class ArchiveResultWorker(Worker): proc = Process( target=_run_worker, args=(cls.name, worker_id, daemon), - kwargs={'extractor': extractor, **kwargs}, + kwargs={'plugin': plugin, **kwargs}, name=f'{cls.name}_worker_{worker_id}', ) proc.start() diff --git a/tests/test_recursive_crawl.py b/tests/test_recursive_crawl.py index ef5e223f..9ed52e16 100644 --- a/tests/test_recursive_crawl.py +++ b/tests/test_recursive_crawl.py @@ -74,17 +74,17 @@ def test_background_hooks_dont_block_parser_extractors(tmp_path, process): # Check that background hooks are running # Background hooks: consolelog, ssl, responses, redirects, staticfile bg_hooks = c.execute( - "SELECT extractor, status FROM core_archiveresult WHERE extractor IN ('consolelog', 'ssl', 'responses', 'redirects', 'staticfile') ORDER BY extractor" + "SELECT plugin, status FROM core_archiveresult WHERE plugin IN ('consolelog', 'ssl', 'responses', 'redirects', 'staticfile') ORDER BY plugin" ).fetchall() # Check that parser extractors have run (not stuck in queued) parser_extractors = c.execute( - "SELECT extractor, status FROM core_archiveresult WHERE extractor LIKE 'parse_%_urls' ORDER BY extractor" + "SELECT plugin, status FROM core_archiveresult WHERE plugin LIKE 'parse_%_urls' ORDER BY plugin" ).fetchall() # Check all extractors to see what's happening all_extractors = c.execute( - "SELECT extractor, status FROM core_archiveresult ORDER BY extractor" + "SELECT plugin, status FROM core_archiveresult ORDER BY plugin" ).fetchall() conn.close() @@ -160,7 +160,7 @@ def test_parser_extractors_emit_snapshot_jsonl(tmp_path, process): # Check that parse_html_urls ran parse_html = c.execute( - "SELECT id, status, output_str FROM core_archiveresult WHERE extractor = '60_parse_html_urls'" + "SELECT id, status, output_str FROM core_archiveresult WHERE plugin = '60_parse_html_urls'" ).fetchone() conn.close() @@ -171,7 +171,7 @@ def test_parser_extractors_emit_snapshot_jsonl(tmp_path, process): # Parser should have run assert status in ['started', 'succeeded', 'failed'], \ - f"parse_html_urls should have run, got status: {status}" + f"60_parse_html_urls should have run, got status: {status}" # If it succeeded and found links, output should contain JSON if status == 'succeeded' and output: @@ -185,39 +185,37 @@ def test_recursive_crawl_creates_child_snapshots(tmp_path, process): """Test that recursive crawling creates child snapshots with proper depth and parent_snapshot_id.""" os.chdir(tmp_path) - # Disable most extractors to speed up test, but keep wget for HTML content + # Create a test HTML file with links + test_html = tmp_path / 'test.html' + test_html.write_text(''' + + +

Test Page

+ About + Blog + Contact + + + ''') + + # Minimal env for fast testing env = os.environ.copy() env.update({ - "USE_WGET": "true", # Need wget to fetch HTML for parsers - "USE_SINGLEFILE": "false", - "USE_READABILITY": "false", - "USE_MERCURY": "false", - "SAVE_HTMLTOTEXT": "false", - "SAVE_PDF": "false", - "SAVE_SCREENSHOT": "false", - "SAVE_DOM": "false", - "SAVE_HEADERS": "false", - "USE_GIT": "false", - "SAVE_MEDIA": "false", - "SAVE_ARCHIVE_DOT_ORG": "false", - "SAVE_TITLE": "false", - "SAVE_FAVICON": "false", - "USE_CHROME": "false", "URL_ALLOWLIST": r"monadical\.com/.*", # Only crawl same domain }) # Start a crawl with depth=1 (just one hop to test recursive crawling) + # Use file:// URL so it's instant, no network fetch needed proc = subprocess.Popen( - ['archivebox', 'add', '--depth=1', 'https://monadical.com'], + ['archivebox', 'add', '--depth=1', f'file://{test_html}'], stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True, env=env, ) - # Give orchestrator time to process - parser extractors should emit child snapshots within 60s - # Even if root snapshot is still processing, child snapshots can start in parallel - time.sleep(60) + # Give orchestrator time to process - file:// is fast, should complete in 20s + time.sleep(20) # Kill the process proc.kill() @@ -231,8 +229,7 @@ def test_recursive_crawl_creates_child_snapshots(tmp_path, process): # Check root snapshot (depth=0) root_snapshot = c.execute( - "SELECT id, url, depth, parent_snapshot_id FROM core_snapshot WHERE url = ? AND depth = 0", - ('https://monadical.com',) + "SELECT id, url, depth, parent_snapshot_id FROM core_snapshot WHERE depth = 0 ORDER BY created_at LIMIT 1" ).fetchone() # Check if any child snapshots were created (depth=1) @@ -247,13 +244,13 @@ def test_recursive_crawl_creates_child_snapshots(tmp_path, process): # Check parser extractor status parser_status = c.execute( - "SELECT extractor, status FROM core_archiveresult WHERE snapshot_id = ? AND extractor LIKE 'parse_%_urls'", + "SELECT plugin, status FROM core_archiveresult WHERE snapshot_id = ? AND plugin LIKE 'parse_%_urls'", (root_snapshot[0] if root_snapshot else '',) ).fetchall() # Check for started extractors that might be blocking started_extractors = c.execute( - "SELECT extractor, status FROM core_archiveresult WHERE snapshot_id = ? AND status = 'started'", + "SELECT plugin, status FROM core_archiveresult WHERE snapshot_id = ? AND status = 'started'", (root_snapshot[0] if root_snapshot else '',) ).fetchall() @@ -417,12 +414,12 @@ def test_archiveresult_worker_queue_filters_by_foreground_extractors(tmp_path, p # Get background hooks that are started bg_started = c.execute( - "SELECT extractor FROM core_archiveresult WHERE extractor IN ('consolelog', 'ssl', 'responses', 'redirects', 'staticfile') AND status = 'started'" + "SELECT plugin FROM core_archiveresult WHERE plugin IN ('consolelog', 'ssl', 'responses', 'redirects', 'staticfile') AND status = 'started'" ).fetchall() # Get parser extractors that should be queued or better parser_status = c.execute( - "SELECT extractor, status FROM core_archiveresult WHERE extractor LIKE 'parse_%_urls'" + "SELECT plugin, status FROM core_archiveresult WHERE plugin LIKE 'parse_%_urls'" ).fetchall() conn.close()