Nick Sweeting
dd77511026
unified Process source of truth and better screenshot tests
2026-01-02 04:20:34 -08:00
Nick Sweeting
c2afb40350
fix lib bin dir and archivebox add hanging
2026-01-01 16:58:47 -08:00
Nick Sweeting
2e350d317d
fix initial migrations
2025-12-29 21:27:31 -08:00
Nick Sweeting
f4e7820533
use full dotted paths for all archivebox imports, add migrations and more fixes
2025-12-29 00:47:08 -08:00
Nick Sweeting
f0aa19fa7d
wip
2025-12-28 17:51:54 -08:00
Nick Sweeting
bd265c0083
rename extractor to plugin everywhere
2025-12-28 04:43:15 -08:00
Claude
c3acadd528
Remove extractor field from Crawl model and fix tests
- Remove extractor field from Crawl model (moved to config dict)
- Update migration 0002_drop_seed_model to not add extractor
- Update archivebox_add.py to use config['PARSER'] instead
- Update admin.py recrawl to not pass extractor
- Update jsonl.py serialization to not include extractor
- Update test schema SCHEMA_0_8 to not include extractor
- Set default timeout to 60s for test commands
2025-12-27 01:49:09 +00:00
Nick Sweeting
bb53228ebf
remove Seed model in favor of Crawl as template
2025-12-25 01:52:41 -08:00
Nick Sweeting
866f993f26
logging and admin ui improvements
2025-12-25 01:10:41 -08:00
Nick Sweeting
d95f0dc186
remove huey
2025-12-24 23:40:18 -08:00
Nick Sweeting
1915333b81
wip major changes
2025-12-24 20:10:38 -08:00
Nick Sweeting
b948e49013
add urls log to Crawl model
2024-11-19 06:32:33 -08:00
Nick Sweeting
6740202d78
fix cli loading edge case where setup_django wasn't running when it should
2024-11-19 04:20:00 -08:00
Nick Sweeting
0347b911aa
archivebox add and remove CLI cmds
2024-11-19 03:40:01 -08:00
Nick Sweeting
328eb98a38
move main funcs into cli files and switch to using click for CLI
2024-11-19 00:18:51 -08:00
Nick Sweeting
569081a9eb
rename abid_utils to base_models
2024-11-18 19:40:05 -08:00
Nick Sweeting
65afd405b1
merge seeds and crawls apps
2024-11-18 19:23:14 -08:00
Nick Sweeting
4a5d607296
move logging_util into archivebox.misc subfolder
2024-11-18 19:08:49 -08:00
Nick Sweeting
e469c5a344
merge queues and actors apps into new workers app
2024-11-18 18:52:48 -08:00
Nick Sweeting
0acd388c02
fix imports and deps
2024-11-18 18:07:34 -08:00
Nick Sweeting
eeb2671e4d
API improvements
2024-11-18 04:27:38 -08:00
Nick Sweeting
1e3ce67834
fix API and CLI calls
2024-11-18 04:27:38 -08:00
Nick Sweeting
b4a5da3ffd
update archivebox add CLI command to use new actor system
2024-11-16 02:45:37 -08:00
Nick Sweeting
cf1ea8f80f
improve config loading of TMP_DIR, LIB_DIR, move to separate files
2024-10-07 23:45:11 -07:00
Nick Sweeting
b913e6f426
rename OUTPUT_DIR to DATA_DIR
2024-09-30 17:44:18 -07:00
Nick Sweeting
363a499289
move util.py into misc folder
2024-09-30 17:25:15 -07:00
Nick Sweeting
3e5b6ddeae
move config into dedicated global app
2024-09-30 15:59:05 -07:00
Nick Sweeting
8cfe6f4afb
cleanup update flag handling and show better logging to clarify when it's working
2022-05-09 20:15:55 -07:00
Nick Sweeting
36f0646501
Merge pull request #669 from FliegendeWurst/fix-issue-235
add command: --parser option (fixes #235)
2021-03-31 00:53:47 -04:00
Nick Sweeting
2656e59215
change list style
2021-03-31 00:47:42 -04:00
FliegendeWurst
60bd9a902e
add command: --parser option
2021-03-28 10:09:11 +02:00
Nick Sweeting
fea0b89dbe
add tag cli option
2021-03-27 03:57:05 -04:00
Nick Sweeting
49939f3eaa
only accept stdin if args are not passed, fix stdin hang in docker
2021-02-16 01:20:47 -05:00
Nick Sweeting
9fa70b3452
add extractors arg to oneshot command and bump version to v0.5.1
2020-12-11 15:48:46 +02:00
Nick Sweeting
257d3f2a98
Update archivebox/cli/archivebox_add.py
2020-11-13 14:52:21 -05:00
Cristian
54df0a035b
fix: Move csv split to the add function to avoid optional nullable argument
2020-11-13 13:10:17 -05:00
Cristian
1ec8276514
fix: Use a comma separated input instead of nargs for the extract flag
2020-11-13 13:01:11 -05:00
Cristian
44eede96e5
feat: Add extract flag to add command
2020-11-13 09:24:34 -05:00
Nick Sweeting
718d39e242
add common code extensions to default blacklist
2020-08-18 08:12:10 -04:00
Nick Sweeting
b681a477ae
add overwrite flag to add command to force re-archiving
2020-08-18 04:37:54 -04:00
Cristian
6006b4f93b
refactor: Organize code to remove flake8 issues
2020-07-24 12:25:25 -05:00
Cristian
a5550b2105
fix: Rename logging folder to avoid naming conflicts (and circular import issues)
2020-07-22 11:02:13 -05:00
Cristian
f4d1b5121e
refactor: Move logging.py to main module to avoid circular import issues
2020-07-17 18:00:04 -05:00
Nick Sweeting
d3bfa98a91
fix depth flag and tweak logging
2020-07-13 11:26:34 -04:00
Cristian
4ebf929606
refactor: Change wording on CLI help
2020-07-08 08:30:07 -05:00
Cristian
f12bfeb322
refactor: Change add() to receive url and depth instead of import_str and import_path
2020-07-08 08:17:47 -05:00
Cristian
c1d8a74e4f
feat: Make input sent via stdin behave the same as using args
2020-07-07 15:49:40 -05:00
Cristian
b68c13918f
feat: Disable stdin from archivebox add
2020-07-07 12:39:36 -05:00
Cristian
a6940092bb
feat: Make sure that depth can only be either 1 or 0
2020-07-07 10:25:02 -05:00
Cristian
32e790979e
feat: Enable depth=1 functionality
2020-07-07 10:07:44 -05:00