Update views, API, and exports for new ArchiveResult output fields

Replace old `output` field with new fields across the codebase:
- output_str: Human-readable output summary
- output_json: Structured metadata (optional)
- output_files: Dict of output files with metadata
- output_size: Total size in bytes
- output_mimetypes: CSV of file mimetypes

Files updated:
- api/v1_core.py: Update MinimalArchiveResultSchema to expose new fields
- api/v1_core.py: Update ArchiveResultFilterSchema to search output_str
- cli/archivebox_extract.py: Use output_str in CLI output
- core/admin_archiveresults.py: Update admin fields, search, and fieldsets
- core/admin_archiveresults.py: Fix output_html variable name bug in output_summary
- misc/jsonl.py: Update archiveresult_to_jsonl() to include new fields
- plugins/extractor_utils.py: Update ExtractorResult helper class

The embed_path() method already uses output_files and output_str,
so snapshot detail page and template tags work correctly.
This commit is contained in:
Claude
2025-12-27 20:28:22 +00:00
parent d65eb587d9
commit b632894bc9
5 changed files with 46 additions and 33 deletions

View File

@@ -59,10 +59,10 @@ def process_archiveresult_by_id(archiveresult_id: str) -> int:
archiveresult.refresh_from_db()
if archiveresult.status == ArchiveResult.StatusChoices.SUCCEEDED:
print(f'[green]Extraction succeeded: {archiveresult.output}[/green]')
print(f'[green]Extraction succeeded: {archiveresult.output_str}[/green]')
return 0
elif archiveresult.status == ArchiveResult.StatusChoices.FAILED:
print(f'[red]Extraction failed: {archiveresult.output}[/red]', file=sys.stderr)
print(f'[red]Extraction failed: {archiveresult.output_str}[/red]', file=sys.stderr)
return 1
else:
# Still in progress or backoff - not a failure
@@ -202,7 +202,7 @@ def run_plugins(
'failed': 'red',
'skipped': 'yellow',
}.get(result.status, 'dim')
rprint(f' [{status_color}]{result.status}[/{status_color}] {result.extractor}{result.output or ""}', file=sys.stderr)
rprint(f' [{status_color}]{result.status}[/{status_color}] {result.extractor}{result.output_str or ""}', file=sys.stderr)
else:
write_record(archiveresult_to_jsonl(result))
except Snapshot.DoesNotExist: