- Add missing backslash on line 383 that caused Docker build parse failure
(the linter removed the \ continuation character, breaking the RUN instruction)
- Use gosu to run archivebox version as the archivebox user since
ArchiveBox refuses to run as root
https://claude.ai/code/session_01X2H7XLawCzLGnrxMArXtVZ
- Restore LISTEN_HOST=archivebox.localhost:8000 and
CSRF_TRUSTED_ORIGINS=http://admin.archivebox.localhost:8000 in
docker-compose.yml (subdomain routing is core to ArchiveBox architecture)
- Restore HEALTHCHECK URL to admin.archivebox.localhost in Dockerfile
- Restore SAVE_WGET=False SAVE_DOM=False in README security section
(old SAVE_* env vars still work via x-aliases in config.json)
- Revert dev setup docs to use ./bin/lock_pkgs.sh instead of bare uv sync
- Fix docker-compose.yml open URL to web.archivebox.localhost:8000
- Add Homebrew formula (brew_dist/archivebox.rb) using virtualenv pattern
with auto-generation via homebrew-pypi-poet in bin/build_brew.sh
- Add Debian packaging via nFPM (pkg/debian/) with thin .deb that pip-installs
archivebox into /opt/archivebox/venv on postinstall
- Add build/release scripts: bin/{build,release}_{brew,deb}.sh
- Update CI workflows to build packages on release and test them
- Update README apt/brew install instructions with working commands
- Update bin/setup.sh to use .deb download instead of old Launchpad PPA
https://claude.ai/code/session_01Vx1EsNrNySgsc8Y67dGzCn
Fixes #1139
## Summary
This PR addresses the feature request "Add AI-assisted summarization,
tagging, search, and more using LLMs / RAG".
## Changes
```
archivebox/core/models.py | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
```
## Testing
Please review the changes carefully. The fix was verified against the
existing test suite.
---
*This PR was created with the assistance of Claude Sonnet 4.6 by
Anthropic | effort: low. Happy to make any adjustments!*
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Returns tags as a JSON array in Snapshot.to_dict() and accepts both list
and comma-separated tags in from_json(), making search exports and
RAG/LLM integrations easier. Fixes #1139.
- **New Features**
- Tags export is now a sorted JSON list for deterministic output.
- Imports accept list or string formats; trims whitespace and
deduplicates tags for compatibility.
<sup>Written for commit 08b0dfaf12.
Summary will update on new commits.</sup>
<!-- End of auto-generated description by cubic. -->
Previously, `archivebox search --json` exported tags as a comma-separated
string (e.g. "tag1,tag2"), which required manual parsing by consumers like
LlamaIndex, LangChain, and other RAG frameworks.
Now `to_dict()` returns tags as a proper JSON array (e.g. ["tag1", "tag2"]),
making the export directly usable as structured metadata in LLM/RAG pipelines
without additional preprocessing.
`from_json()` is updated to accept both list and string formats for backward
compatibility with existing JSON imports.
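
The tag handling described above can be sketched as follows. This is a minimal illustration, not the code from `archivebox/core/models.py`: the helper name `normalize_tags` is hypothetical, and it only assumes the behavior stated in this PR (accept a list or a comma-separated string, trim whitespace, deduplicate, and sort for deterministic output).

```python
from typing import Iterable, List, Optional, Union


def normalize_tags(raw: Optional[Union[str, Iterable[str]]]) -> List[str]:
    """Hypothetical helper: turn either tag format into a sorted, unique list."""
    if raw is None:
        return []
    # Old exports use "tag1,tag2"; new exports use ["tag1", "tag2"].
    parts = raw.split(",") if isinstance(raw, str) else list(raw)
    cleaned = {str(part).strip() for part in parts}
    cleaned.discard("")  # drop empty entries such as a trailing comma
    return sorted(cleaned)
```

With a helper like this, `normalize_tags("tag2, tag1,tag1")` and `normalize_tags(["tag1", "tag2"])` produce the same list, so both import formats converge on one representation.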
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
<!-- IMPORTANT: Do not submit PRs with only formatting / PEP8 / line
length changes. -->
# Summary
Add the maintainer info for the ArchiveBox AUR package for
accountability. Much of the packaging has changed since its initial
contribution, and as the current maintainer I will make sure these
changes work smoothly going forward. I will also keep the AUR package
up to date once the 0.9.x branch is released.
# Related issues
<!-- e.g. #123 or Roadmap goal #
https://github.com/pirate/ArchiveBox/wiki/Roadmap -->
# Changes these areas
- [ ] Bugfixes
- [ ] Feature behavior
- [ ] Command line interface
- [ ] Configuration options
- [ ] Internal architecture
- [ ] Snapshot data layout on disk
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Update README to tag the current maintainer of the Arch AUR package.
Adds “maintained by @jasongodev” next to the original contributor to
improve accountability and clarify support.
<sup>Written for commit 0d05fd8c53.
Summary will update on new commits.</sup>
<!-- End of auto-generated description by cubic. -->
# Summary
This PR fixes the Docker image build. It also fixes the `uuid7` not-found
error on the first run of `archivebox init`.
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Fixes the Docker image build and the uuid7 error on first init. We now
use uv-managed Python 3.13 and patch uuid.uuid7 before Django
migrations.
- **Bug Fixes**
- Docker: switch to uv-managed Python, create venv with uv --python,
skip version check at build, and start with --init.
- UUID7: add uuid_compat, import it early, and monkey-patch uuid.uuid7
on <3.14 to keep migrations working.
- **Dependencies**
- Bump Python to 3.13.
- Require uuid_extensions on Python <3.14.
<sup>Written for commit 9aa4f0de58.
Summary will update on new commits.</sup>
<!-- End of auto-generated description by cubic. -->
## Summary
Implements native LDAP authentication support for ArchiveBox.
## Changes
- Create `archivebox/config/ldap.py` with LDAPConfig class
- Create `archivebox/ldap/` Django app with custom auth backend
- Update `core/settings.py` to conditionally load LDAP when enabled
- Add LDAP_CREATE_SUPERUSER support to auto-grant superuser privileges
- Add comprehensive tests in test_auth_ldap.py (no mocks, no skips)
- LDAP only activates if django-auth-ldap is installed and
LDAP_ENABLED=True
- Helpful error messages when LDAP libraries are missing or config is
incomplete
## Implementation Approach
- ✅ Native integration (not a plugin)
- ✅ Conditional loading based on libraries + config
- ✅ Separate Django app for LDAP logic
- ✅ Clean if statements in settings.py
- ✅ No mixing LDAP code with rest of codebase
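
As an illustration of the conditional loading described above, here is a minimal sketch. It is not the code in `core/settings.py`: the function names are hypothetical, the error message is invented for the example, and it uses the stock `django_auth_ldap.backend.LDAPBackend` path rather than the custom backend this PR adds.

```python
import importlib.util
from typing import List, Optional


def ldap_lib_installed() -> bool:
    """True if the django-auth-ldap package can be imported."""
    return importlib.util.find_spec("django_auth_ldap") is not None


def build_auth_backends(ldap_enabled: bool, installed: Optional[bool] = None) -> List[str]:
    """Return AUTHENTICATION_BACKENDS, adding LDAP only when enabled and available."""
    backends = ["django.contrib.auth.backends.ModelBackend"]
    if not ldap_enabled:
        return backends
    if installed is None:
        installed = ldap_lib_installed()
    if not installed:
        # Assumed behavior: fail early with a helpful message instead of a raw ImportError.
        raise RuntimeError(
            "LDAP_ENABLED=True but django-auth-ldap is not installed; "
            "install it or set LDAP_ENABLED=False."
        )
    # Stock backend used for illustration; the PR ships its own custom backend.
    return ["django_auth_ldap.backend.LDAPBackend"] + backends
```

Keeping the check in one function means the rest of the settings never import LDAP code unless both conditions hold.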
Fixes #1664
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-authored-by: Nick Sweeting <pirate@users.noreply.github.com>
Fixes #1445
This PR resolves the issue where SingleFile was not respecting Chrome
user data directory and other Chrome launch options that work for other
Chrome-based extractors (PDF, Screenshot, etc.).
## Changes
- Added `SINGLEFILE_CHROME_ARGS` config option with fallback to
`CHROME_ARGS`
- Updated SingleFile extractor to pass Chrome arguments via
`--browser-args`
- Updated documentation
This ensures SingleFile respects the same Chrome configuration as other
Chrome-based extractors.
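
The fallback described above can be sketched like this. It is a hedged illustration, not the extractor's actual code: the function names and the dict-based config are hypothetical, and it assumes `--browser-args` takes a JSON array string and that an empty `SINGLEFILE_CHROME_ARGS` falls back to `CHROME_ARGS`.

```python
import json
from typing import List, Mapping, Sequence


def resolve_singlefile_chrome_args(config: Mapping[str, Sequence[str]]) -> List[str]:
    """Prefer SINGLEFILE_CHROME_ARGS; fall back to CHROME_ARGS when unset or empty."""
    own = config.get("SINGLEFILE_CHROME_ARGS")
    if own:
        return list(own)
    return list(config.get("CHROME_ARGS") or [])


def singlefile_cmd(binary: str, url: str, output: str,
                   config: Mapping[str, Sequence[str]]) -> List[str]:
    """Build the SingleFile command line, forwarding Chrome args if any are set."""
    cmd = [binary]
    chrome_args = resolve_singlefile_chrome_args(config)
    if chrome_args:
        # Assumption: the CLI expects the browser arguments as one JSON array.
        cmd.append("--browser-args=" + json.dumps(chrome_args))
    cmd += [url, output]
    return cmd
```

For example, a config with only `CHROME_ARGS = ["--user-data-dir=/data/chrome"]` would forward that user data directory to SingleFile as well.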
Generated with [Claude Code](https://claude.ai/code)
Show small thumbnails of recently completed ArchiveResult content in the
progress header. The thumbnail strip appears below the stats bar and
shows the last 20 successfully archived items with embeddable content
(screenshots, favicons, DOM snapshots, etc.).
Features:
- API returns recent_thumbnails with embed paths for succeeded results
- Thumbnails display with plugin-specific icons as fallback
- New thumbnails animate in with a pop effect
- Clicking a thumbnail navigates to the snapshot admin page
- Horizontal scrollable strip with custom scrollbar styling
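
The selection logic behind `recent_thumbnails` might look like the sketch below. Only the `recent_thumbnails` name, the limit of 20, and the "succeeded with embeddable content" rule come from this description; the dict fields (`status`, `embed_path`, `end_ts`, `plugin`, `snapshot_id`) are hypothetical stand-ins for the real ArchiveResult model.

```python
from typing import Any, Dict, List


def recent_thumbnails(results: List[Dict[str, Any]], limit: int = 20) -> List[Dict[str, Any]]:
    """Pick the most recently finished succeeded results that have an embed path."""
    embeddable = [
        r for r in results
        if r.get("status") == "succeeded" and r.get("embed_path")
    ]
    # Newest first, so the strip leads with the latest archived item.
    embeddable.sort(key=lambda r: r["end_ts"], reverse=True)
    return [
        {
            "snapshot_id": r["snapshot_id"],
            "plugin": r["plugin"],  # lets the frontend pick a fallback icon
            "embed_path": r["embed_path"],
        }
        for r in embeddable[:limit]
    ]
```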
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Adds a thumbnail strip to the live progress header. It shows previews of
the last 20 successful archived items for quick visual feedback and
one-click navigation.
- **New Features**
- API returns recent_thumbnails with embed paths for succeeded results.
- Horizontal, scrollable thumbnail strip under the header.
- Uses preview images when available; plugin icons as fallback.
- New thumbnails animate in with a pop effect.
- Clicking a thumbnail opens the snapshot admin page.
<sup>Written for commit 17029ba8b8.
Summary will update on new commits.</sup>
<!-- End of auto-generated description by cubic. -->
…nstall
- Delete chrome/on_Crawl__10_chrome_validate.py (duplicates
chrome_install)
- Rename wget/on_Crawl__11_wget_validate.py →
on_Crawl__06_wget_install.py
All hooks now follow consistent naming: install, launch, or config
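
To make the naming convention concrete, here is a small sketch that parses hook filenames of the form `on_<Event>__<priority>_<name>.<ext>` and orders them by priority. The pattern is inferred from the filenames mentioned in this change; the parser itself (`HOOK_RE`, `Hook`, `parse_hook`) is hypothetical and not ArchiveBox's loader.

```python
import re
from typing import List, NamedTuple, Optional

# Inferred pattern: on_<Event>__<two-digit priority>_<name>.<ext>
HOOK_RE = re.compile(r"^on_(?P<event>[A-Za-z]+)__(?P<priority>\d+)_(?P<name>\w+)\.(?P<ext>\w+)$")


class Hook(NamedTuple):
    event: str
    priority: int
    name: str


def parse_hook(filename: str) -> Optional[Hook]:
    """Return the parsed hook, or None if the filename does not follow the pattern."""
    match = HOOK_RE.match(filename)
    if match is None:
        return None
    return Hook(match["event"], int(match["priority"]), match["name"])


def ordered_hooks(filenames: List[str]) -> List[Hook]:
    """Parse all matching filenames and sort them by (priority, name)."""
    hooks = [h for h in map(parse_hook, filenames) if h is not None]
    return sorted(hooks, key=lambda h: (h.priority, h.name))
```

Encoding the priority in the filename keeps execution order visible from a directory listing, which is why priority conflicts show up as duplicate numbers.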
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Removed the redundant Chrome validate hook, renamed the Wget validate
hook to wget_install, and standardized hook names and priorities to
match the install/launch/config lifecycle. This removes duplicate logic
and fixes priority conflicts across Crawl, Binary, and Snapshot hooks.
- **Refactors**
- Deleted chrome/on_Crawl__10_chrome_validate.py (dup of chrome_install)
- Renamed wget validate to on_Crawl__06_wget_install.py
- Standardized on_Binary hook priorities: npm 10, pip 11, brew 12, apt
13, custom 14, env 15
- Fixed on_Snapshot order: staticfile 32, readability 56, mercury 57,
htmltotext 58
<sup>Written for commit 09a1ca3134.
Summary will update on new commits.</sup>
<!-- End of auto-generated description by cubic. -->