{% extends "core/base.html" %} {% load static %} {% load i18n %} {% block breadcrumbs %} {% endblock %} {% block extra_head %} {% endblock %} {% block body %}


{% if stdout %}

Add new URLs to your archive: results

                {{ stdout | safe }}
                


  Add more URLs ➕
{% else %}
{% csrf_token %}

Create a new Crawl

A Crawl is a job that processes URLs and creates Snapshots (archived copies) for each URL discovered. The settings below apply to the entire crawl and all snapshots it creates.


Crawl Settings

{{ form.url.label_tag }} {{ form.url }}
0 URLs detected
{% if form.url.errors %}
{{ form.url.errors }}
{% endif %}
Enter URLs to archive, one per line. Examples:
https://example.com
https://news.ycombinator.com
https://github.com/ArchiveBox/ArchiveBox
{{ form.tag.label_tag }} {{ form.tag }}
{# datalist reconstruction: the original option markup was stripped; the id is assumed #}
<datalist id="tag-options">
  {% for tag_name in available_tags %}
  <option value="{{ tag_name }}"></option>
  {% endfor %}
</datalist>
{% if form.tag.errors %}
{{ form.tag.errors }}
{% endif %}
Tags will be applied to all snapshots created by this crawl. Start typing to see existing tags.
{{ form.depth.label_tag }} {{ form.depth }} {% if form.depth.errors %}
{{ form.depth.errors }}
{% endif %}
Controls how many links deep the crawl will follow from the starting URLs.
{{ form.notes.label_tag }} {{ form.notes }} {% if form.notes.errors %}
{{ form.notes.errors }}
{% endif %}
Optional description for this crawl (visible in the admin interface).

Crawl Plugins

Select which archiving methods to run for all snapshots in this crawl. If none are selected, all available plugins will be used. View plugin details →

Quick Select:
{{ form.chrome_plugins }}
{{ form.archiving_plugins }}
{{ form.parsing_plugins }}
{{ form.search_plugins }}
{{ form.binary_plugins }}
{{ form.extension_plugins }}

Advanced Crawl Options

Additional settings that control how this crawl processes URLs and creates snapshots.

{{ form.schedule.label_tag }} {{ form.schedule }} {% if form.schedule.errors %}
{{ form.schedule.errors }}
{% endif %}
Optional: schedule this crawl to repeat automatically. Examples:
daily - Run once per day
weekly - Run once per week
0 */6 * * * - Every 6 hours (cron format)
0 0 * * 0 - Every Sunday at midnight (cron format)
{{ form.persona.label_tag }} {{ form.persona }} {% if form.persona.errors %}
{{ form.persona.errors }}
{% endif %}
Authentication profile to use for all snapshots in this crawl. Create new persona →
{{ form.overwrite }} {{ form.overwrite.label_tag }} {% if form.overwrite.errors %}
{{ form.overwrite.errors }}
{% endif %}
Re-archive URLs even if they already exist
{{ form.update }} {{ form.update.label_tag }} {% if form.update.errors %}
{{ form.update.errors }}
{% endif %}
Retry archiving URLs that previously failed
{{ form.index_only }} {{ form.index_only.label_tag }} {% if form.index_only.errors %}
{{ form.index_only.errors }}
{% endif %}
Create snapshots but don't run archiving plugins yet (queue for later)
{{ form.config.label_tag }} {{ form.config }} {% if form.config.errors %}
{{ form.config.errors }}
{% endif %}
Override any config option for this crawl (e.g., TIMEOUT, USER_AGENT, CHROME_BINARY, etc.)



{% if absolute_add_path %} {% endif %} {% endif %}
{% endblock %} {% block footer %}{% endblock %} {% block sidebar %}{% endblock %}