
# ArchiveBox

ArchiveBox is a powerful, self-hosted internet archiving solution to collect, save, and view websites offline.

---

![](https://github.com/ArchiveBox/ArchiveBox/assets/511499/90f1ce3c-75bb-401d-88ed-6297694b76ae?raw=true)

---

Without active preservation effort, everything on the internet eventually disappears or degrades. Archive.org does a great job as a centralized service, but the URLs it saves have to be public, and it can't save every type of content.

ArchiveBox is an open-source tool that lets organizations & individuals archive both public & private web content while retaining control over their data. It can be used to save copies of bookmarks, preserve evidence for legal cases, back up photos from FB/Insta/Flickr or media from YT/SoundCloud/etc., save research papers, and more.

📥 **You can feed ArchiveBox URLs one at a time, or schedule regular imports** from your bookmarks or history, social media feeds or RSS, link-saving services like Pocket/Pinboard, our [Browser Extension](https://chromewebstore.google.com/detail/archivebox-exporter/habonpimjphpdnmcfkaockjnffodikoj), and more.  

**It saves snapshots of the URLs you feed it in several redundant formats.**  
It also detects any content featured *inside* pages & extracts it into a folder:

- 🌐 **HTML**/**Any websites** ➡️ `original HTML+CSS+JS`, `singlefile HTML`, `screenshot PNG`, `PDF`, `WARC`, `title`, `article text`, `favicon`, `headers`, ...
- 🎥 **Social Media**/**News** ➡️ `post content TXT`, `comments`, `title`, `author`, `images`, ...
- 🎬 **YouTube**/**SoundCloud**/etc. ➡️ `MP3/MP4`s, `subtitles`, `metadata`, `thumbnail`, ...
- 💾 **GitHub**/**GitLab**/etc. links ➡️ `clone of Git source code`, `README`, `images`, ...
- ✨ *and more ...*