Remove extractor field from Crawl model and fix tests

- Remove extractor field from Crawl model (moved to config dict)
- Update migration 0002_drop_seed_model to not add extractor
- Update archivebox_add.py to use config['PARSER'] instead
- Update admin.py recrawl to not pass extractor
- Update jsonl.py serialization to not include extractor
- Update test schema SCHEMA_0_8 to not include extractor
- Set default timeout to 60s for test commands
This commit is contained in:
Claude
2025-12-27 01:49:09 +00:00
parent ae2ab5b273
commit c3acadd528
6 changed files with 63 additions and 119 deletions

View File

@@ -206,7 +206,6 @@ def crawl_to_jsonl(crawl) -> Dict[str, Any]:
'type': TYPE_CRAWL,
'id': str(crawl.id),
'urls': crawl.urls,
'extractor': crawl.extractor,
'status': crawl.status,
'max_depth': crawl.max_depth,
'created_at': crawl.created_at.isoformat() if crawl.created_at else None,