ArchiveBox

mirror of https://github.com/ArchiveBox/ArchiveBox.git synced 2026-04-06 07:47:53 +10:00

Author	SHA1	Message	Date
Nick Sweeting	3e7b83ac91	bump versions	2026-04-04 23:10:12 -07:00
Nick Sweeting	f3622d8cd3	update working changes	2026-03-25 05:36:07 -07:00
Nick Sweeting	80243accfd	Fix archivebox CI regressions	2026-03-24 15:36:23 -07:00
Nick Sweeting	68d9e30c5f	Fix pytest basetemp handling in test harness	2026-03-24 14:46:05 -07:00
Nick Sweeting	ed1ddbc95e	Fix CI workflows and migration tests	2026-03-24 13:37:02 -07:00
Nick Sweeting	50286d3c38	Reuse cached binaries in archivebox runtime	2026-03-24 11:03:43 -07:00
Nick Sweeting	e1eb5693c9	split CrawlSetup into Install phase with new Binary + BinaryRequest events	2026-03-23 13:16:47 -07:00
Nick Sweeting	25f935b9d1	split CrawlSetup into Install phase with new Binary + BinaryRequest events	2026-03-23 13:15:41 -07:00
Nick Sweeting	8a25704aac	add harness tests	2026-03-23 04:12:46 -07:00
Nick Sweeting	1d94645abd	test fixes	2026-03-23 04:12:31 -07:00
Nick Sweeting	b749b26c5d	wip	2026-03-23 03:58:32 -07:00
Nick Sweeting	f400a2cd67	WIP: checkpoint working tree before rebasing onto dev	2026-03-22 20:25:18 -07:00
Nick Sweeting	a6548df8d0	Add configurable server security modes (#1773) Fixes https://github.com/ArchiveBox/ArchiveBox/issues/239 ## Summary - add `SERVER_SECURITY_MODE` presets for safe subdomain replay, safe one-domain no-JS replay, unsafe one-domain no-admin, and dangerous one-domain full replay - make host routing, replay URLs, static serving, and control-plane access mode-aware - add strict routing/header coverage plus a browser-backed Chrome/Puppeteer test that verifies real same-origin behavior in all four modes ## Testing - `uv run pytest archivebox/tests/test_urls.py -v` - `uv run pytest archivebox/tests/test_admin_views.py -v` - `uv run pytest archivebox/tests/test_server_security_browser.py -v` ## Summary by cubic Adds configurable server security modes to isolate admin/API from archived content, with a safe subdomain default and single-domain fallbacks. Routing, replay endpoints, headers, and middleware are mode-aware, with browser tests validating same-origin behavior. - New Features - Introduced SERVER_SECURITY_MODE with presets: safe-subdomains-fullreplay (default), safe-onedomain-nojsreplay, unsafe-onedomain-noadmin, danger-onedomain-fullreplay. - Mode-aware routing and base URLs; one-domain modes use path-based replay: /snapshot/<id>/... and /original/<domain>/.... - Control plane gate: block admin/API and non-GET methods in unsafe-onedomain-noadmin; allow full access in danger-onedomain-fullreplay. - Safer replay: detect risky HTML/SVG and apply CSP sandbox (no scripts) in safe-onedomain-nojsreplay; add X-ArchiveBox-Security-Mode and X-Content-Type-Options: nosniff on replay responses. - Middleware and serving: added ServerSecurityModeMiddleware, improved HostRouting, and static server byte-range/CSP handling. - Tests: added Chrome/Puppeteer browser tests and stricter URL routing tests covering all modes. - Migration - Default requires wildcard subdomains for full isolation (admin., web., api., and snapshot-id.<base>). - To run on one domain, set SERVER_SECURITY_MODE to a one-domain preset; URLs switch to /snapshot/<id>/ and /original/<domain>/ paths. - For production, prefer safe-subdomains-fullreplay; lower-security modes print a startup warning.	2026-03-22 20:17:21 -07:00
Nick Sweeting	c87079aa0a	Refactor ArchiveBox onto abx-dl bus runner	2026-03-21 11:47:57 -07:00
Nick Sweeting	ad41b15581	Add configurable server security modes	2026-03-15 23:34:40 -07:00
Nick Sweeting	57e11879ec	cleanup archivebox tests	2026-03-15 22:09:56 -07:00
Nick Sweeting	bc21d4bfdb	type and test fixes	2026-03-15 20:12:27 -07:00
Nick Sweeting	44cabac8d0	fix typing	2026-03-15 19:47:36 -07:00
Nick Sweeting	f932054915	add stricter locking around stage machine models	2026-03-15 19:21:41 -07:00
Nick Sweeting	311e4340ec	Fix add CLI input handling and lint regressions	2026-03-15 19:04:13 -07:00
Nick Sweeting	5f0cfe5251	add new persona tests	2026-03-15 18:46:45 -07:00
Nick Sweeting	934e02695b	fix lint	2026-03-15 18:45:29 -07:00
Nick Sweeting	70c9358cf9	Improve scheduling, runtime paths, and API behavior	2026-03-15 18:31:56 -07:00
Nick Sweeting	7d42c6c8b5	bump versions and fix docs	2026-03-15 17:43:07 -07:00
Nick Sweeting	e598614b05	Avoid filesystem lookups in snapshot admin list	2026-03-15 17:18:53 -07:00
Nick Sweeting	1d16038ceb	Relax archive output readiness check	2026-03-15 13:31:05 -07:00
Nick Sweeting	957387fd88	Fix plugin hook env and extractor retries	2026-03-15 12:39:27 -07:00
Nick Sweeting	f92ca93ae9	Skip puppeteer browser download during package install	2026-03-15 11:39:43 -07:00
Nick Sweeting	7c55259ed0	Update title HTML test for search export	2026-03-15 11:17:58 -07:00
Nick Sweeting	86fdc3be1e	Refresh worker config from resolved plugin installs	2026-03-15 11:07:55 -07:00
Nick Sweeting	47f540c094	Resolve crawl provider dependencies lazily	2026-03-15 10:18:49 -07:00
Nick Sweeting	d4be507a6b	Keep provider plugins enabled under whitelists	2026-03-15 09:49:45 -07:00
Nick Sweeting	82bfd7e655	Filter binary hooks by allowed providers	2026-03-15 09:32:32 -07:00
Nick Sweeting	941135d6d0	Bound URL fixture archive wait	2026-03-15 09:07:25 -07:00
Nick Sweeting	50901e5367	Align worker config propagation expectations	2026-03-15 08:47:00 -07:00
Nick Sweeting	31e883ec53	Stabilize plugin and crawl integration tests	2026-03-15 08:16:52 -07:00
Nick Sweeting	bfc1e76ff5	Update extractor tests for plugin output dirs	2026-03-15 07:32:11 -07:00
Nick Sweeting	b62064f63e	Avoid recursive crawl timeout regressions	2026-03-15 07:09:15 -07:00
Nick Sweeting	5fb3709281	Run recursive crawl tests to completion	2026-03-15 06:55:35 -07:00
Nick Sweeting	68b9f75dab	Stabilize recursive crawl CI coverage	2026-03-15 06:49:40 -07:00
Nick Sweeting	760cf9d6b2	Stabilize CI against expanded plugin surface	2026-03-15 06:31:41 -07:00
Nick Sweeting	1f792d7199	Restore CLI compat and plugin dependency handling	2026-03-15 06:06:18 -07:00
Nick Sweeting	6b482c62df	Restore top-level list command compatibility	2026-03-15 05:04:31 -07:00
Nick Sweeting	58f801c220	Fix update orphan import and host-aware tests	2026-03-15 04:51:06 -07:00
Nick Sweeting	4fa701fafe	Update abx dependencies and plugin test harness	2026-03-15 04:37:32 -07:00
Nick Sweeting	ecb1764590	switch to external plugins	2026-03-15 03:46:23 -07:00
Nick Sweeting	ec4b27056e	wip	2026-01-21 03:19:56 -08:00
Nick Sweeting	c7b2217cd6	tons of fixes with codex	2026-01-19 01:00:53 -08:00
claude[bot]	c2bb4b25cb	Implement native LDAP authentication support - Create archivebox/config/ldap.py with LDAPConfig class - Create archivebox/ldap/ Django app with custom auth backend - Update core/settings.py to conditionally load LDAP when enabled - Add LDAP_CREATE_SUPERUSER support to auto-grant superuser privileges - Add comprehensive tests in test_auth_ldap.py (no mocks, no skips) - LDAP only activates if django-auth-ldap is installed and LDAP_ENABLED=True - Helpful error messages when LDAP libraries are missing or config is incomplete Fixes #1664 Co-authored-by: Nick Sweeting <pirate@users.noreply.github.com>	2026-01-05 21:30:26 +00:00
Nick Sweeting	28b980a84a	higher timeout	2026-01-05 09:07:59 -08:00
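
The security-mode behavior described in commit a6548df8d0 can be illustrated with a minimal sketch. Only the four mode names, the `X-ArchiveBox-Security-Mode` and `X-Content-Type-Options: nosniff` headers, and the stated gating rules come from the commit message. The function names `request_allowed` and `replay_headers`, their parameters, and the exact CSP value are hypothetical and are not taken from the ArchiveBox codebase. The sketch also assumes that modes the commit message does not mention as gated leave the control plane open.

```python
# Hypothetical sketch of the mode rules summarized in commit a6548df8d0.
# Mode names come from the commit message; everything else is illustrative.

MODES = (
    "safe-subdomains-fullreplay",   # default per the commit message
    "safe-onedomain-nojsreplay",
    "unsafe-onedomain-noadmin",
    "danger-onedomain-fullreplay",
)


def _check_mode(mode: str) -> None:
    """Reject mode strings that are not one of the four documented presets."""
    if mode not in MODES:
        raise ValueError(f"unknown security mode: {mode!r}")


def request_allowed(mode: str, method: str, is_control_plane: bool) -> bool:
    """Control-plane gate: the no-admin mode blocks admin/API and non-GET requests.

    Assumption: all other modes allow the request at this layer.
    """
    _check_mode(mode)
    if mode == "unsafe-onedomain-noadmin":
        return (not is_control_plane) and method.upper() == "GET"
    return True


def replay_headers(mode: str, risky_content: bool) -> dict[str, str]:
    """Headers added to a replay response for the given mode."""
    _check_mode(mode)
    headers = {
        "X-ArchiveBox-Security-Mode": mode,
        "X-Content-Type-Options": "nosniff",
    }
    # In the no-JS mode, risky HTML/SVG gets a CSP sandbox; a bare `sandbox`
    # directive (without allow-scripts) disables script execution.
    if mode == "safe-onedomain-nojsreplay" and risky_content:
        headers["Content-Security-Policy"] = "sandbox"
    return headers
```

For example, `request_allowed("unsafe-onedomain-noadmin", "POST", False)` is rejected under these assumptions, while the same request in `danger-onedomain-fullreplay` passes.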

84 Commits