Changelog
Nick Sweeting edited this page 2023-11-13 21:50:40 -08:00
▶️ If you're having an issue with a breaking change, or migrating your data between versions, open an issue to get help.
ArchiveBox was previously named Pocket Archive Stream and then Bookmark Archiver.
THIS PAGE HAS BEEN MOVED:
See the releases page for versioned source downloads and full changelog.
🍰 Many thanks to our 100+ contributors and everyone in the web archiving community! 🏛
Expand old release notes...
- v0.4.9 released
- pip install archivebox (https://pypi.org/project/archivebox/)
- docker run archivebox/archivebox (https://hub.docker.com/r/archivebox/archivebox)
- https://archivebox.readthedocs.io/en/latest/
- https://github.com/ArchiveBox/ArchiveBox/releases
- easy migration from previous versions:
cd path/to/your/archive/folder
archivebox init
archivebox add 'https://example.com'
archivebox add 'https://getpocket.com/users/USERNAME/feed/all' --depth=1
- full transition to a Django SQLite DB with migrations (making upgrades between versions much safer now)
- maintains an intuitive and helpful CLI that's backwards-compatible with all previous archivebox data versions
- uses argparse instead of hand-written CLI system: see archivebox/cli/archivebox.py
- new subcommands-based CLI for archivebox (see below)
- new Web UI with pagination, better search, filtering, permissions, and more
- 30+ assorted bugfixes, new features, and tickets closed
- for more info, see: https://github.com/ArchiveBox/ArchiveBox/releases/tag/v0.4.9
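The move to argparse with a subcommands-based CLI in v0.4.9 can be illustrated with a minimal sketch. This is not the code in archivebox/cli/archivebox.py; the program name `archivebox-sketch`, the helper names, and the `--depth` default of 0 are assumptions made for illustration. Only the `init` and `add` subcommands and the `--depth` flag come from the migration commands above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one sub-parser per subcommand (illustrative only)."""
    parser = argparse.ArgumentParser(prog="archivebox-sketch")  # hypothetical name
    subparsers = parser.add_subparsers(dest="subcommand", required=True)

    # `init` takes no arguments in this sketch.
    subparsers.add_parser("init", help="set up or migrate a collection folder")

    # `add` takes one URL plus an optional crawl depth.
    add = subparsers.add_parser("add", help="add a URL to the collection")
    add.add_argument("url")
    add.add_argument("--depth", type=int, default=0)  # default is an assumption
    return parser


def main(argv=None) -> str:
    """Parse argv and return a description of what would be run."""
    args = build_parser().parse_args(argv)
    if args.subcommand == "init":
        return "init"
    return f"add {args.url} depth={args.depth}"


if __name__ == "__main__":
    print(main())
```

Each subcommand gets its own argument set and help text, which is the main advantage over a hand-written dispatcher: unknown flags and missing arguments are rejected by argparse itself.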
- v0.2.4 released
- better archive corruption guards (check structure invariants on every parse & save)
- remove title prefetching in favor of new FETCH_TITLE archive method
- slightly improved CLI output for parsing and remote url downloading
- re-save index after archiving completes to update titles and urls
- remove redundant derivable data from link json schema
- markdown link parsing support
- faster link parsing and better symbol handling using a new compiled URL_REGEX
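As a rough illustration of the compiled URL_REGEX approach mentioned for v0.2.4, the sketch below compiles a pattern once and reuses it to pull links out of plain text, markdown, or HTML. The pattern here is an assumption written for this example, not the expression ArchiveBox ships, and `find_urls` is a hypothetical helper.

```python
import re

# Assumed pattern (not ArchiveBox's actual URL_REGEX): match http(s) URLs and
# stop at whitespace, angle brackets, quotes, or closing ) and ] so that
# markdown and HTML delimiters are not swallowed into the URL.
URL_REGEX = re.compile(r"https?://[^\s<>\"')\]]+", re.IGNORECASE)


def find_urls(text: str) -> list:
    """Return every URL-looking substring in text, in order of appearance."""
    return URL_REGEX.findall(text)


if __name__ == "__main__":
    sample = 'See [docs](https://example.com/a?b=1) and <a href="http://example.org/x">x</a>'
    print(find_urls(sample))
```

Compiling the pattern at import time avoids recompiling it for every line of a large export. Note that this simple pattern keeps trailing punctuation such as a sentence-ending period, which a production parser would need to handle.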
- v0.2.3 released
- fixed issues with parsing titles including trailing tags
- fixed issues with titles defaulting to URLs instead of attempting to fetch
- fixed issue where bookmark timestamps from RSS would be ignored and current ts used instead
- fixed issue where ONLY_NEW would overwrite existing links in archive with only new ones
- fixed lots of issues with URL parsing by using urllib.parse instead of hand-written lambdas
- ignore robots.txt when using wget (ssshhh don't tell anyone 😁)
- fix RSS parser bailing out when there's whitespace around XML tags
- fix issue with browser history export trying to run ls on wrong directory
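The v0.2.3 switch from hand-written lambdas to urllib.parse can be sketched as follows. The helper name `url_parts` and the returned keys are assumptions for illustration; only the use of the standard-library `urllib.parse` module comes from the release note.

```python
from urllib.parse import urlparse


def url_parts(url: str) -> dict:
    """Split a URL into named components using the standard library."""
    parsed = urlparse(url)
    return {
        "scheme": parsed.scheme,      # e.g. "https"
        "domain": parsed.netloc,      # host, plus port if present
        "path": parsed.path,
        "query": parsed.query,        # without the leading "?"
        "fragment": parsed.fragment,  # without the leading "#"
    }


if __name__ == "__main__":
    print(url_parts("https://example.com/a/b.html?x=1#top"))
```

A string-splitting lambda such as `url.split('/')[2]` raises IndexError on input with no slashes, whereas `urlparse` returns empty components for the parts it cannot find.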
- v0.2.2 released
- Shaarli RSS export support
- Fix issues with plain text link parsing including quotes, whitespace, and closing tags in URLs
- add USER_AGENT to archive.org submissions so they can track archivebox usage
- remove all icons similar to archive.org branding from archive UI
- hide some of the noisier youtubedl and wget errors
- set permissions on youtubedl media folder
- fix chrome data dir incorrect path and quoting
- better chrome binary finding
- show which parser is used when importing links, show progress when fetching titles
- v0.2.1 released with new logo
- ability to import plain lists of links and almost all other raw filetypes
- WARC saving support via wget
- Git repository downloading with git clone
- Media downloading with youtube-dl (video, audio, subtitles, description, playlist, etc)
- v0.2.0 released with new name
- renamed from Bookmark Archiver -> ArchiveBox
- v0.1.0 released
- support for browser history exporting added with ./bin/archivebox-export-browser-history
- support for chrome --dump-dom to output full page HTML after JS executes
- v0.0.3 released
- support for chrome --user-data-dir to archive sites that need logins
- fancy individual html & json indexes for each link
- smartly append new links to existing index instead of overwriting
- v0.0.2 released
- proper HTML templating instead of format strings (thanks to https://github.com/bardisty!)
- refactored into separate files, wip audio & video archiving
- v0.0.1 released
- Index links now work without nginx url rewrites, archive can now be hosted on github pages
- added setup.sh script & docstrings & help commands
- made Chromium the default instead of Google Chrome (yay free software)
- added env-variable configuration (thanks to https://github.com/hannah98!)
- renamed from Pocket Archive Stream -> Bookmark Archiver
- added Netscape-format export support (thanks to https://github.com/ilvar!)
- added Pinboard-format export support (thanks to https://github.com/sconeyard!)
- front-page of HN, oops! apparently I have users to support now 😁?
- added Pocket-format export support
- v0.0.0 released: created Pocket Archive Stream 2017/05/05