Nick Sweeting
65ee09ceab
move tests into subfolder, add missing install hooks
2026-01-02 00:22:07 -08:00
Nick Sweeting
4cd2fceb8a
even more migration fixes
2025-12-29 22:30:37 -08:00
Nick Sweeting
95beddc5fc
more migration fixes
2025-12-29 22:12:57 -08:00
Nick Sweeting
2e350d317d
fix initial migrations
2025-12-29 21:27:31 -08:00
Nick Sweeting
80f75126c6
more fixes
2025-12-29 21:03:05 -08:00
Claude
a5654e877f
rename media plugin to ytdlp with backwards-compatible aliases
- Rename archivebox/plugins/media/ → archivebox/plugins/ytdlp/
- Rename hook script on_Snapshot__63_media.bg.py → on_Snapshot__63_ytdlp.bg.py
- Update config.json: YTDLP_* as primary keys, MEDIA_* as x-aliases
- Update templates CSS classes: media-* → ytdlp-*
- Fix gallerydl bug: remove incorrect dependency on media plugin output
- Update all codebase references to use YTDLP_* and SAVE_YTDLP
- Add backwards compatibility test for MEDIA_ENABLED alias
2025-12-29 19:09:05 +00:00
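A minimal sketch of how the backwards-compatible aliases from commit a5654e877f could be resolved, assuming each primary key carries a list of legacy names under an `x-aliases` field. The schema layout, the `resolve` helper, and the `YTDLP_TIMEOUT`/`MEDIA_TIMEOUT` pair are hypothetical illustrations, not the actual contents of config.json.

```python
# Hypothetical schema fragment: primary YTDLP_* keys with legacy MEDIA_* aliases.
SCHEMA = {
    "YTDLP_ENABLED": {"default": True, "x-aliases": ["MEDIA_ENABLED"]},
    "YTDLP_TIMEOUT": {"default": 3600, "x-aliases": ["MEDIA_TIMEOUT"]},
}


def resolve(key, user_config, schema=SCHEMA):
    """Return the value for a primary key, falling back to its aliases.

    The primary name is checked first, so it wins when both the new and
    the legacy name are set. (Assumed precedence, not confirmed upstream.)
    """
    spec = schema[key]
    for name in [key, *spec.get("x-aliases", [])]:
        if name in user_config:
            return user_config[name]
    return spec["default"]


# A data directory that still uses the old key keeps working:
legacy_config = {"MEDIA_ENABLED": False}
ytdlp_enabled = resolve("YTDLP_ENABLED", legacy_config)
```

Checking the primary key before its aliases means a user who sets both names gets the new one, which is one reasonable reading of "primary keys" in the commit message.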
Nick Sweeting
30c60eef76
much better tests and add page ui
2025-12-29 04:02:11 -08:00
Nick Sweeting
f4e7820533
use full dotted paths for all archivebox imports, add migrations and more fixes
2025-12-29 00:47:08 -08:00
Nick Sweeting
f0aa19fa7d
wip
2025-12-28 17:51:54 -08:00
Nick Sweeting
4ccb0863bb
continue renaming extractor to plugin, add plan for hook concurrency, add chrome kill helper script
2025-12-28 05:29:24 -08:00
Nick Sweeting
bd265c0083
rename extractor to plugin everywhere
2025-12-28 04:43:15 -08:00
Nick Sweeting
50e527ec65
way better plugin hooks system wip
2025-12-28 03:39:59 -08:00
Claude
0941aca4a3
Improve test suite: remove mocks and add 0.8.x migration tests
- Remove mock-based tests from plugin tests (headers, singlefile, ublock, captcha2)
- Replace fake cache tests with real double-install tests that verify cache behavior
- Add SCHEMA_0_8 and seed_0_8_data() for testing 0.8.x data directory migrations
- Add TestMigrationFrom08x class with comprehensive migration tests:
  - Snapshot count preservation
  - Crawl record preservation
  - Snapshot-to-crawl relationship preservation
  - Tag preservation
  - ArchiveResult status preservation
  - CLI command verification after migration
- Add more CLI tests for add command (tags, multiple URLs, file input)
- All tests now use real functionality without mocking
2025-12-26 23:01:49 +00:00
Nick Sweeting
6c769d831c
wip 2
2025-12-24 21:46:14 -08:00
Nick Sweeting
1915333b81
wip major changes
2025-12-24 20:10:38 -08:00
Nick Sweeting
cf1ea8f80f
improve config loading of TMP_DIR, LIB_DIR, move to separate files
2024-10-07 23:45:11 -07:00
jim winstead
5478d13d52
Add generic_jsonl parser
Resolves #1369
2024-03-14 15:42:29 -07:00
Nick Sweeting
099f7d00fe
Use feedparser for RSS parsing (#1362)
Fixes #1171
Fixes #870 (probably, would need to test against a Wallabag Atom file to
Fixes #135
Fixes #123
Fixes #106
2024-03-14 01:51:45 -07:00
jim winstead
741ff5f1a8
Make it a little easier to run specific tests
Changes ./bin/test.sh to pass command-line options through to pytest and to
run only the tests in the tests/ directory by default, rather than running
everything except a few excluded directories, which is more error-prone.
Also keeps the mock_server used in testing quiet so access log entries don't
appear on stdout.
2024-03-01 12:43:53 -08:00
jim winstead
0f402df42f
Merge with latest dev
2024-03-01 12:05:43 -08:00
jim winstead
e7119adb0b
Add tests for generic_rss and pinboard_rss parsers
2024-03-01 11:27:59 -08:00
jim winstead
1f828d9441
Add tests for generic_rss and pinboard_rss parsers
2024-03-01 11:22:28 -08:00
jim winstead
ccabda4c7d
Handle list of tags in JSON, and be more clever about comma vs. space
2024-02-28 17:38:49 -08:00
jim winstead
178e676e0f
Fix JSON parser by not always mangling the input
Rather than assuming the JSON file we are parsing has junk at the beginning
(which maybe only used to happen?), try parsing it as-is first, and then fall
back to trying again after skipping the first line.
Fixes #1347
2024-02-27 14:48:19 -08:00
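The try-then-fall-back approach described in commit 178e676e0f can be sketched as follows. This is an illustration under assumptions, not the parser's actual code: the function name `parse_json_links` and the sample inputs are hypothetical.

```python
import json


def parse_json_links(text):
    """Parse JSON as-is first; on failure, retry after skipping the first line."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fallback: assume the first line is junk and parse the remainder.
        # If the remainder is not valid JSON either, the error propagates.
        _first_line, _, rest = text.partition("\n")
        return json.loads(rest)


clean_input = '[{"url": "https://example.com"}]'
junk_input = 'some leading junk line\n[{"url": "https://example.com"}]'

links_from_clean = parse_json_links(clean_input)
links_from_junk = parse_json_links(junk_input)
```

Trying the unmodified input first means well-formed files are never mangled, and the first-line skip only applies once a plain parse has already failed.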
Nick Sweeting
a680724367
Merge branch 'dev' into search_index_extract_html_text
2023-10-27 23:09:28 -07:00
Ross Williams
310b4d1242
Add htmltotext extractor
Saves HTML text nodes and selected element attributes in
`htmltotext.txt` for each Snapshot. Primarily intended to be used
for search indexing.
2023-10-23 21:42:32 -04:00
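A rough sketch of the idea behind commit 310b4d1242, collecting text nodes plus a few chosen attribute values for search indexing, using only the standard library's `html.parser`. The class name, the set of kept attributes, and the skipped tags are assumptions made for illustration; the real extractor may select different elements and attributes.

```python
from html.parser import HTMLParser


class TextAndAttrsParser(HTMLParser):
    # Hypothetical selection: which attributes to keep, per tag.
    KEPT_ATTRS = {"a": {"title"}, "img": {"alt", "title"}}
    # Content of these tags is not human-readable text, so it is skipped.
    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1
            return
        for name, value in attrs:
            if value and name in self.KEPT_ATTRS.get(tag, ()):
                self.chunks.append(value.strip())

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text nodes that are outside skipped tags.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    parser = TextAndAttrsParser()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return " ".join(parser.chunks)


sample = '<p>Hello <img src="x.png" alt="a cat"> world</p><script>var x = 1;</script>'
indexed_text = html_to_text(sample)
```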
Ross Williams
b44f7e68b1
Add URL-specific method allow/deny lists
Allows enabling only allow-listed extractors or disabling specific
deny-listed extractors for a regular expression matched against an added
site's URL.
2023-08-02 09:36:40 -04:00
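The allow/deny behaviour described in commit b44f7e68b1 might look roughly like the sketch below. The setting names `URL_ALLOWLIST` and `URL_DENYLIST`, their dict shape, the example patterns, and the `select_methods` helper are all hypothetical; only the idea of matching a regular expression against the added URL comes from the commit message.

```python
import re

# Hypothetical settings: URL regex -> extractor method names.
URL_ALLOWLIST = {r"^https?://(www\.)?youtube\.com/": ["media", "title"]}
URL_DENYLIST = {r"\.pdf$": ["singlefile", "dom"]}


def select_methods(url, all_methods):
    """Filter the extractor methods to run for a given URL."""
    methods = list(all_methods)
    # Allow-list: when the URL matches, keep only the listed methods.
    for pattern, allowed in URL_ALLOWLIST.items():
        if re.search(pattern, url):
            methods = [m for m in methods if m in allowed]
    # Deny-list: when the URL matches, drop the listed methods.
    for pattern, denied in URL_DENYLIST.items():
        if re.search(pattern, url):
            methods = [m for m in methods if m not in denied]
    return methods


video_methods = select_methods(
    "https://www.youtube.com/watch?v=x", ["title", "media", "wget"]
)
pdf_methods = select_methods(
    "https://example.com/paper.pdf", ["title", "singlefile", "dom", "wget"]
)
```

URLs that match neither list keep the full set of methods, so the lists only narrow behaviour for the sites they name.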
Sascha Ißbrücker
40c122515a
fix: make oneshot command return successful exit code
2023-05-29 10:01:27 +02:00
Nick Sweeting
9f1470cf03
fix output permissions tests
2021-05-31 20:57:46 -04:00
Nick Sweeting
eef9adbfcb
fix select invalid test
2021-04-03 15:50:48 -04:00
Nick Sweeting
354b4627ed
fix tests
2021-03-30 23:39:15 -04:00
Nick Sweeting
bd6d9c165b
enforce utf8 on literally all file operations because windows sucks
2021-03-27 01:16:29 -04:00
Nick Sweeting
33df9c1ebe
fix after and before in remove tests
2021-02-18 06:21:44 -05:00
Nick Sweeting
4f5bb3776c
fix sql err
2021-02-18 05:51:53 -05:00
Nick Sweeting
46a4197514
fix tests
2021-02-18 04:26:56 -05:00
Cristian
e82161a768
refactor: Remove setup_django from search
2020-12-11 16:43:48 -05:00
Nick Sweeting
e03d17c208
test extract flag on oneshot
2020-12-11 16:49:18 +02:00
Cristian
f6c73f9aeb
fix: Issue with oneshot command
2020-12-08 18:42:25 -05:00
Nick Sweeting
1b22f8eeef
Merge pull request #515 from cdvv7788/POC-setup-django-on-init
2020-11-27 23:56:37 -05:00
Nick Sweeting
efe3027797
Merge branch 'master' into archive-result
2020-11-27 23:18:11 -05:00
Nick Sweeting
0e2ccbc10d
update urls to new repo path
2020-11-23 02:06:46 -05:00
Nick Sweeting
fdd4effc92
Merge pull request #535 from cdvv7788/extractors-flag
2020-11-13 14:53:17 -05:00
JDC
b1dbfcb73f
Add test remove tag filter
2020-11-13 14:17:12 -05:00
Cristian
44eede96e5
feat: Add extract flag to add command
2020-11-13 09:24:34 -05:00
Cristian
33182fd53c
fix: Add missing assignment
2020-11-04 15:07:45 -05:00
Cristian
d064a3eeff
fix: Handle case when update tries to re-add a link that is not in the sql index
2020-11-04 15:02:54 -05:00
Cristian
e7e33ea7a5
tests: Add tests for several different ways to extract the title
2020-10-30 08:04:26 -05:00
Cristian
f6ce1de882
fix: archivebox version was being called as root
2020-10-27 09:15:14 -05:00
Cristian
a6bee5f111
feat: Move setup_django to an inner module
2020-10-26 08:02:04 -05:00
Cristian
e1d0b8bce7
feat: Initialize django at the beginning
2020-10-26 07:45:21 -05:00