"Your own personal internet archive" (网站存档 / 爬虫)
Browser history or bookmarks (Chrome, Firefox, Safari, IE, Opera, and more)
- *Shaarli, Delicious, Reddit Saved Posts, Wallabag, Unmark.it, and any other text with links in it!*
#### Can save these things for each site:
- **Index:** `index.html` & `index.json` HTML and JSON index files containing metadata and details
- **Title:** `title` title of the site
- **Favicon:** `favicon.ico` favicon of the site
- **WGET Clone:** `example.com/page-name.html` wget clone of the site, with .html appended if not present
- **WARC:** `warc/
The aim of ArchiveBox is to go beyond what the Wayback Machine and other public archiving services can do, by adding a headless browser to replay sessions accurately, and by automatically extracting all the content in multiple redundant formats that will survive being passed down to historians and archivists through many generations.
ArchiveBox differentiates itself from [similar projects](https://github.com/pirate/ArchiveBox/wiki/Web-Archiving-Community#Web-Archiving-Projects) by trying to be a simple, robust, way for the average tech-savvy user to save sizable portions of the content they view and care about locally. Unlike crawler software that starts from a seed URL and works outwards, or public tools like Archive.org designed for users to manually submit links from the public internet, ArchiveBox tries to be a set-and-forget archiver suitable for archiving your browsing history, RSS feeds, or bookmarks, including private/authenticated content that you wouldn't want to share with a centralized service.
## Read more
Whether you want learn which organizations are the big players in the web archiving space, want to find a specific open source tool for your web archiving need, or just want to see where archivists hang out online, our Community Wiki page serves as an index of the broader web archiving community. Check it out to learn about some of the coolest web archiving projects and communities on the web!
- [Community Wiki](https://github.com/pirate/ArchiveBox/wiki/Web-Archiving-Community)
+ [The Master Lists](https://github.com/pirate/ArchiveBox/wiki/Web-Archiving-Community#The-Master-Lists)
*Community-maintained indexes of archiving tools and institutions.*
+ [Web Archiving Software](https://github.com/pirate/ArchiveBox/wiki/Web-Archiving-Community#Web-Archiving-Projects)
*Open source tools and projects in the internet archiving space.*
+ [Reading List](https://github.com/pirate/ArchiveBox/wiki/Web-Archiving-Community#Reading-List)
*Articles, posts, and blogs relevant to ArchiveBox and web archiving in general.*
+ [Communities](https://github.com/pirate/ArchiveBox/wiki/Web-Archiving-Community#Communities)
*A collection of the most active internet archiving communities and initiatives.*
- Check out the ArchiveBox [Roadmap](https://github.com/pirate/ArchiveBox/wiki/Roadmap) and [Changelog](https://github.com/pirate/ArchiveBox/wiki/Changelog)
- Learn why archiving the internet is important by reading the "[On the Importance of Web Archiving](https://parameters.ssrc.org/2018/09/on-the-importance-of-web-archiving/)" blog post.
- Or reach out to me for questions and comments via [@theSquashSH](https://twitter.com/thesquashSH) on Twitter.
---
# Documentation
We use the [Github wiki system](https://github.com/pirate/ArchiveBox/wiki) for documentation.
You can also access the docs locally by looking in the [`ArchiveBox/docs/`](https://github.com/pirate/ArchiveBox/wiki/Home) folder.
## Getting Started
- [Home](https://github.com/pirate/ArchiveBox/wiki/Home)
- [Quickstart](https://github.com/pirate/ArchiveBox/wiki/Quickstart)
- [Install](https://github.com/pirate/ArchiveBox/wiki/Install)
- [Docker](https://github.com/pirate/ArchiveBox/wiki/Docker)
## Reference
- [Usage](https://github.com/pirate/ArchiveBox/wiki/Usage)
- [Configuration](https://github.com/pirate/ArchiveBox/wiki/Configuration)
- [Supported Sources](https://github.com/pirate/ArchiveBox/wiki/Quickstart#2-get-your-list-of-urls-to-archive)
- [Supported Outputs](https://github.com/pirate/ArchiveBox/wiki#can-save-these-things-for-each-site)
- [Scheduled Archiving](https://github.com/pirate/ArchiveBox/wiki/Scheduled-Archiving)
- [Publishing Your Archive](https://github.com/pirate/ArchiveBox/wiki/Publishing-Your-Archive)
- [Chromium Install](https://github.com/pirate/ArchiveBox/wiki/Install-Chromium)
- [Troubleshooting](https://github.com/pirate/ArchiveBox/wiki/Troubleshooting)
## More Info
- [Roadmap](https://github.com/pirate/ArchiveBox/wiki/Roadmap)
- [Changelog](https://github.com/pirate/ArchiveBox/wiki/Changelog)
- [Donations](https://github.com/pirate/ArchiveBox/wiki/Donations)
- [Web Archiving Community](https://github.com/pirate/ArchiveBox/wiki/Web-Archiving-Community)
---
# Screenshots