alex/ArchiveBox

mirror of https://github.com/ArchiveBox/ArchiveBox.git synced 2026-03-27 10:22:21 +10:00


Nick Sweeting 6b48d881fa Create CNAME

2018-12-31 21:18:34 -05:00

fix coc email

2018-12-21 18:22:40 -05:00

rename pip dir archive to archivebox

2018-12-31 20:53:01 -05:00

fix archivebox links

2018-12-31 20:54:32 -05:00

docs @ fa24236b0c

add docs

2018-12-31 20:59:36 -05:00

fix nginx example config to use new name

2018-12-31 20:55:38 -05:00

_config.yml

Set theme jekyll-theme-merlot

2018-12-21 18:40:34 -05:00

.dockerignore

Update .dockerignore

2018-12-31 20:56:27 -05:00

.gitignore

Update .gitignore

2018-12-31 20:57:12 -05:00

.gitmodules

add docs

2018-12-31 20:59:36 -05:00

archive

fix the setup/archive symlinks

2018-12-21 22:10:25 +00:00

CNAME

Create CNAME

2018-12-31 21:18:34 -05:00

Dockerfile

rename pip dir archive to archivebox

2018-12-31 20:53:01 -05:00

LICENSE

Initial commit

2017-05-05 04:50:15 -04:00

README.md

Update README.md

2018-12-31 21:04:41 -05:00

setup

clean up binaries in PATH

2018-12-21 18:21:03 -05:00

README.md

ArchiveBox: Open source local web archiving

(Recently renamed from `Bookmark Archiver`)

"Your own personal Way-Back Machine"

💻 Demo | Source | Changelog | Roadmap

▶️ Quickstart | Details | Configuration | Troubleshooting

Save an archived copy of the websites you visit (the actual content of each site, not just the list of links). ArchiveBox can archive your entire browsing history, or just the links matching a filter or a bookmarks list.

ArchiveBox can import links from:

Browser history or bookmarks (Chrome, Firefox, Safari, IE, Opera)
Pocket
Pinboard
RSS or plain text lists
Shaarli, Delicious, Instapaper, Reddit Saved Posts, Wallabag, Unmark.it, and more!
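
As a rough illustration of the "plain text lists" import source above, here is a minimal sketch that pulls unique URLs out of arbitrary text. The `extract_urls` helper and its regex are hypothetical and are not ArchiveBox's actual parser, which may handle more formats and edge cases.

```python
import re
from typing import List

# Hypothetical pattern for illustration only: matches http(s) URLs up to
# the next whitespace, quote, or angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_urls(text: str) -> List[str]:
    """Return the unique http(s) URLs found in text, in first-seen order."""
    seen = set()
    urls = []
    for match in URL_PATTERN.finditer(text):
        # Drop punctuation that commonly trails a URL in running prose.
        url = match.group(0).rstrip(".,;:)")
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

Calling `extract_urls(open("links.txt").read())` on a hypothetical `links.txt` would give a de-duplicated list that could then be handed to an archiver.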

For each site, it outputs (configurable):

Browsable static HTML archive (wget)
PDF (Chrome headless)
Screenshot (Chrome headless)
HTML after 2s of JS running (Chrome headless)
Favicon
Submission of the URL to archive.org
Index summary pages: index.html & index.json
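
The outputs listed above map onto a handful of external tools. The sketch below only builds the command lines, without running anything, to show how a configurable set of archive methods might be wired up. The `build_commands` function, the `chromium-browser` binary name, and the output file names are assumptions made for illustration; the flags are standard wget and headless Chrome options, but the invocations ArchiveBox actually uses may differ.

```python
from typing import Dict, Iterable, List, Optional


def build_commands(
    url: str,
    out_dir: str,
    enabled: Optional[Iterable[str]] = None,
    chrome: str = "chromium-browser",  # assumed binary name; varies by system
) -> Dict[str, List[str]]:
    """Return argv lists for each enabled archive method (nothing is executed)."""
    commands = {
        # Static, browsable copy of the page and its assets.
        "wget": [
            "wget", "--page-requisites", "--convert-links",
            "--adjust-extension", f"--directory-prefix={out_dir}", url,
        ],
        # PDF rendering through headless Chrome.
        "pdf": [chrome, "--headless", f"--print-to-pdf={out_dir}/output.pdf", url],
        # PNG screenshot through headless Chrome.
        "screenshot": [chrome, "--headless", f"--screenshot={out_dir}/screenshot.png", url],
        # Serialized DOM after scripts have run; Chrome writes it to stdout.
        "dom": [chrome, "--headless", "--dump-dom", url],
    }
    if enabled is None:
        return commands
    wanted = set(enabled)
    return {name: argv for name, argv in commands.items() if name in wanted}


def wayback_save_url(url: str) -> str:
    """URL that asks archive.org's Wayback Machine to capture the page."""
    return "https://web.archive.org/save/" + url
```

Each returned list can be passed to `subprocess.run` as-is, and restricting `enabled` to a subset such as `["wget", "pdf"]` mirrors the "configurable" note above.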

Archiving is additive, so you can schedule ./archive to run regularly and pull new links into the index. All saved content is static and indexed with JSON files, so it stays usable long-term, is easy to parse, and requires no always-running backend.
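
To make "additive" concrete, here is a minimal sketch of merging newly imported links into an existing JSON index without duplicating entries. It assumes a hypothetical schema in which `index.json` is a flat list of objects, each with a `url` key. The real index.json layout is not described here and may be different, so treat `merge_into_index` as an illustration of the idea rather than the project's code.

```python
import json
from pathlib import Path
from typing import Dict, List


def merge_into_index(index_path: Path, new_links: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Append links whose URL is not already indexed, then rewrite the file.

    Assumed (hypothetical) schema: a JSON list of {"url": ..., ...} objects.
    """
    existing = json.loads(index_path.read_text()) if index_path.exists() else []
    known = {link["url"] for link in existing}
    for link in new_links:
        if link["url"] not in known:
            # Existing entries are never modified or removed, only added to.
            existing.append(link)
            known.add(link["url"])
    index_path.write_text(json.dumps(existing, indent=2))
    return existing
```

Because existing entries are left untouched, running the merge repeatedly (for example from a scheduled job) only ever grows the index, and the resulting file can be read back with nothing more than `json.loads`.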

DEMO: archive.sweeting.me

Documentation

We use the GitHub wiki system for documentation.

You can also access the docs locally by looking in the ArchiveBox/docs/ folder.

Getting Started

Reference

More Info

Screenshots

Description

🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...

archivebox backups bookmark-archiver browser-bookmarks chromium digipres firefox headless-browser internet-archiving pinboard pocket python rss self-hosted singlefile warc wayback-machine web-archiving wget youtube-dl

Readme MIT 40 MiB

Languages

Python 74.6%

HTML 11.5%

TypeScript 8.7%

Shell 2.8%

CSS 1.2%

Other 1.1%