Nick Sweeting
b749b26c5d
wip
2026-03-23 03:58:32 -07:00
Nick Sweeting
f400a2cd67
WIP: checkpoint working tree before rebasing onto dev
2026-03-22 20:25:18 -07:00
Nick Sweeting
c87079aa0a
Refactor ArchiveBox onto abx-dl bus runner
2026-03-21 11:47:57 -07:00
Nick Sweeting
5381f7584c
Tighten API typing and add return values
2026-03-15 19:24:54 -07:00
Nick Sweeting
311e4340ec
Fix add CLI input handling and lint regressions
2026-03-15 19:04:13 -07:00
Nick Sweeting
934e02695b
fix lint
2026-03-15 18:45:29 -07:00
Nick Sweeting
7d42c6c8b5
bump versions and fix docs
2026-03-15 17:43:07 -07:00
Nick Sweeting
c4d30a853f
Restore index-only snapshot output links
2026-03-15 04:58:46 -07:00
Nick Sweeting
cc3e72b92f
Preserve tags for index-only adds
2026-03-15 04:54:55 -07:00
Nick Sweeting
c7b2217cd6
tons of fixes with codex
2026-01-19 01:00:53 -08:00
Nick Sweeting
839ae744cf
simplify entrypoints for orchestrator and workers
2026-01-04 13:17:07 -08:00
Nick Sweeting
dd77511026
unified Process source of truth and better screenshot tests
2026-01-02 04:20:34 -08:00
Nick Sweeting
c2afb40350
fix lib bin dir and archivebox add hanging
2026-01-01 16:58:47 -08:00
Nick Sweeting
2e350d317d
fix initial migrations
2025-12-29 21:27:31 -08:00
Nick Sweeting
f4e7820533
use full dotted paths for all archivebox imports, add migrations and more fixes
2025-12-29 00:47:08 -08:00
Nick Sweeting
f0aa19fa7d
wip
2025-12-28 17:51:54 -08:00
Nick Sweeting
bd265c0083
rename extractor to plugin everywhere
2025-12-28 04:43:15 -08:00
Claude
c3acadd528
Remove extractor field from Crawl model and fix tests
- Remove extractor field from Crawl model (moved to config dict)
- Update migration 0002_drop_seed_model to not add extractor
- Update archivebox_add.py to use config['PARSER'] instead
- Update admin.py recrawl to not pass extractor
- Update jsonl.py serialization to not include extractor
- Update test schema SCHEMA_0_8 to not include extractor
- Set default timeout to 60s for test commands
2025-12-27 01:49:09 +00:00
Nick Sweeting
bb53228ebf
remove Seed model in favor of Crawl as template
2025-12-25 01:52:41 -08:00
Nick Sweeting
866f993f26
logging and admin ui improvements
2025-12-25 01:10:41 -08:00
Nick Sweeting
d95f0dc186
remove huey
2025-12-24 23:40:18 -08:00
Nick Sweeting
1915333b81
wip major changes
2025-12-24 20:10:38 -08:00
Nick Sweeting
b948e49013
add urls log to Crawl model
2024-11-19 06:32:33 -08:00
Nick Sweeting
6740202d78
fix cli loading edge case where setup_django wasn't running when it should
2024-11-19 04:20:00 -08:00
Nick Sweeting
0347b911aa
archivebox add and remove CLI cmds
2024-11-19 03:40:01 -08:00
Nick Sweeting
328eb98a38
move main funcs into cli files and switch to using click for CLI
2024-11-19 00:18:51 -08:00
Nick Sweeting
569081a9eb
rename abid_utils to base_models
2024-11-18 19:40:05 -08:00
Nick Sweeting
65afd405b1
merge seeds and crawls apps
2024-11-18 19:23:14 -08:00
Nick Sweeting
4a5d607296
move logging_util into archivebox.misc subfolder
2024-11-18 19:08:49 -08:00
Nick Sweeting
e469c5a344
merge queues and actors apps into new workers app
2024-11-18 18:52:48 -08:00
Nick Sweeting
0acd388c02
fix imports and deps
2024-11-18 18:07:34 -08:00
Nick Sweeting
eeb2671e4d
API improvements
2024-11-18 04:27:38 -08:00
Nick Sweeting
1e3ce67834
fix API and CLI calls
2024-11-18 04:27:38 -08:00
Nick Sweeting
b4a5da3ffd
update archivebox add CLI command to use new actor system
2024-11-16 02:45:37 -08:00
Nick Sweeting
cf1ea8f80f
improve config loading of TMP_DIR, LIB_DIR, move to separate files
2024-10-07 23:45:11 -07:00
Nick Sweeting
b913e6f426
rename OUTPUT_DIR to DATA_DIR
2024-09-30 17:44:18 -07:00
Nick Sweeting
363a499289
move util.py into misc folder
2024-09-30 17:25:15 -07:00
Nick Sweeting
3e5b6ddeae
move config into dedicated global app
2024-09-30 15:59:05 -07:00
Nick Sweeting
8cfe6f4afb
clean up update flag handling and show better logging to clarify when it's working
2022-05-09 20:15:55 -07:00
Nick Sweeting
36f0646501
Merge pull request #669 from FliegendeWurst/fix-issue-235
add command: --parser option (fixes #235)
2021-03-31 00:53:47 -04:00
Nick Sweeting
2656e59215
change list style
2021-03-31 00:47:42 -04:00
FliegendeWurst
60bd9a902e
add command: --parser option
2021-03-28 10:09:11 +02:00
Nick Sweeting
fea0b89dbe
add tag cli option
2021-03-27 03:57:05 -04:00
Nick Sweeting
49939f3eaa
only accept stdin if args are not passed, fix stdin hang in docker
2021-02-16 01:20:47 -05:00
Nick Sweeting
9fa70b3452
add extractors arg to oneshot command and bump version to v0.5.1
2020-12-11 15:48:46 +02:00
Nick Sweeting
257d3f2a98
Update archivebox/cli/archivebox_add.py
2020-11-13 14:52:21 -05:00
Cristian
54df0a035b
fix: Move csv split to the add function to avoid optional nullable argument
2020-11-13 13:10:17 -05:00
Cristian
1ec8276514
fix: Use a comma separated input instead of nargs for the extract flag
2020-11-13 13:01:11 -05:00
Cristian
44eede96e5
feat: Add extract flag to add command
2020-11-13 09:24:34 -05:00
Nick Sweeting
718d39e242
add common code extensions to default blacklist
2020-08-18 08:12:10 -04:00