From 3d2c4c70d267e5bfb09f6ffb333e83a70a62587b Mon Sep 17 00:00:00 2001
From: Nick Sweeting
Date: Tue, 9 Jan 2024 20:38:38 -0800
Subject: [PATCH 01/36] Update README.md

---
 README.md | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index efc5744a..e78c8598 100644
--- a/README.md
+++ b/README.md
@@ -630,8 +630,7 @@ Data folders can be created anywhere (`~/archivebox` or `$PWD/data` as seen in o
-Expand to learn more about the layout of Archivebox's data on-disk...
-
+Expand to learn more about the layout of Archivebox's data on-disk...
 All `archivebox` CLI commands are designed to be run from inside an ArchiveBox data folder, starting with `archivebox init` to initialize a new collection inside an empty directory.
@@ -664,7 +663,7 @@ The on-disk layout is optimized to be easy to browse by hand and durable long-te
 Each snapshot subfolder `./archive//` includes a static `index.json` and `index.html` describing its contents, and the snapshot extractor outputs are plain files within the folder.
-#### Learn More
+Learn More
 - https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#Disk-Layout
 - https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#large-archives
@@ -683,8 +682,7 @@ You can export the main index to browse it statically as plain HTML files in a f
-Expand to learn how to export your ArchiveBox collection...
-
+Expand to learn how to export your ArchiveBox collection...
 > *NOTE: These exports are not paginated, exporting many URLs or the entire archive at once may be slow.*

From 23a9c538c2d4317996e4efe72196d7c2bb2fde82 Mon Sep 17 00:00:00 2001
From: Nick Sweeting
Date: Tue, 9 Jan 2024 20:46:22 -0800
Subject: [PATCH 02/36] Update README.md

---
 README.md | 31 ++++++++++++++-----------------
 1 file changed, 14 insertions(+), 17 deletions(-)

diff --git a/README.md b/README.md
index e78c8598..0ad793e9 100644
--- a/README.md
+++ b/README.md
@@ -633,20 +633,17 @@ Data folders can be created anywhere (`~/archivebox` or `$PWD/data` as seen in o
 Expand to learn more about the layout of Archivebox's data on-disk...
-All `archivebox` CLI commands are designed to be run from inside an ArchiveBox data folder, starting with `archivebox init` to initialize a new collection inside an empty directory.
+All archivebox CLI commands are designed to be run from inside an ArchiveBox data folder, starting with archivebox init to initialize a new collection inside an empty directory.
-```bash
-mkdir ~/archivebox && cd ~/archivebox # just an example, can be anywhere
-archivebox init
-```
+mkdir ~/archivebox && cd ~/archivebox   # just an example, can be anywhere
+archivebox init
-The on-disk layout is optimized to be easy to browse by hand and durable long-term. The main index is a standard `index.sqlite3` database in the root of the data folder (it can also be [exported as static JSON/HTML](https://github.com/ArchiveBox/ArchiveBox/wiki/Publishing-Your-Archive#2-export-and-host-it-as-static-html)), and the archive snapshots are organized by date-added timestamp in the `./archive/` subfolder.
+The on-disk layout is optimized to be easy to browse by hand and durable long-term. The main index is a standard index.sqlite3 database in the root of the data folder (it can also be exported as static JSON/HTML), and the archive snapshots are organized by date-added timestamp in the ./archive/ subfolder.
-```bash
-/data/
+/data/
     index.sqlite3
     ArchiveBox.conf
     archive/
@@ -659,18 +656,18 @@ The on-disk layout is optimized to be easy to browse by hand and durable long-te
             warc/1617687755.warc.gz
             git/somerepo.git
             ...
-```
+
-Each snapshot subfolder `./archive//` includes a static `index.json` and `index.html` describing its contents, and the snapshot extractor outputs are plain files within the folder.
+Each snapshot subfolder ./archive// includes a static index.json and index.html describing its contents, and the snapshot extractor outputs are plain files within the folder.

Learn More

-
-- https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#Disk-Layout
-- https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#large-archives
-- https://github.com/ArchiveBox/ArchiveBox/wiki/Security-Overview#output-folder
-- https://github.com/ArchiveBox/ArchiveBox/wiki/Publishing-Your-Archive
-- https://github.com/ArchiveBox/ArchiveBox/wiki/Upgrading-or-Merging-Archives
-
+
+  • https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#Disk-Layout
+  • https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#large-archives
+  • https://github.com/ArchiveBox/ArchiveBox/wiki/Security-Overview#output-folder
+  • https://github.com/ArchiveBox/ArchiveBox/wiki/Publishing-Your-Archive
+  • https://github.com/ArchiveBox/ArchiveBox/wiki/Upgrading-or-Merging-Archives
+

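Reviewer note on the layout described in the patch above: since each snapshot lives in its own `./archive/` subfolder with a static `index.json`, the collection can in principle be inspected without the `archivebox` CLI. The following is a minimal sketch of that idea, not part of ArchiveBox itself. The helper name `list_snapshots` is hypothetical, and the sketch assumes only what the README text states (an `archive/` directory containing one folder per snapshot, each optionally holding an `index.json`); it makes no assumption about the keys inside that file.

```python
import json
from pathlib import Path


def list_snapshots(data_dir):
    """Return a sorted list of (folder_name, parsed_index_or_None) tuples.

    Hypothetical helper: it relies only on the layout described in the
    README (data_dir/archive/<snapshot folder>/index.json).
    """
    archive_dir = Path(data_dir) / "archive"
    if not archive_dir.is_dir():
        return []  # not an initialized data folder, or no snapshots yet

    snapshots = []
    for folder in sorted(archive_dir.iterdir()):
        if not folder.is_dir():
            continue  # skip stray files next to the snapshot folders
        index_file = folder / "index.json"
        details = None
        if index_file.is_file():
            try:
                details = json.loads(index_file.read_text(encoding="utf-8"))
            except json.JSONDecodeError:
                details = None  # unreadable index: report the folder anyway
        snapshots.append((folder.name, details))
    return snapshots
```

Calling `list_snapshots(".")` from inside a data folder would return one entry per snapshot folder; what to read out of each parsed `index.json` depends on its actual schema, which this diff does not show.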
From 4adb214812113665c2b7d96c4b43b289e35da8d5 Mon Sep 17 00:00:00 2001
From: Nick Sweeting
Date: Tue, 9 Jan 2024 21:12:17 -0800
Subject: [PATCH 03/36] Update README.md

---
 README.md | 97 ++++++++++++++++++++++++++-----------------------------
 1 file changed, 45 insertions(+), 52 deletions(-)

diff --git a/README.md b/README.md
index 0ad793e9..1a401375 100644
--- a/README.md
+++ b/README.md
@@ -1,26 +1,16 @@
-
+

ArchiveBox
Open-source self-hosted web archiving.


-▶️ Quickstart |
-Demo |
-GitHub |
-Documentation |
-Info & Motivation |
-Community
+▶️ Quickstart | Demo | GitHub | Documentation | Info & Motivation | Community