app-store/apps/archivebox/metadata/description.md
Neo c407343e46
[App] ArchiveBox (#2393)
* [App] ArchiveBox

* Update apps/archivebox/metadata/description.md

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* feat(docker=compose): add more env vars for app and their default value

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: JigSawFr <JigSawFr@users.noreply.github.com>
2024-02-04 13:40:05 +01:00

ArchiveBox

ArchiveBox is a powerful, self-hosted internet archiving solution to collect, save, and view websites offline.



Without active preservation effort, everything on the internet eventually disappears or degrades. Archive.org does a great job as a centralized service, but saved URLs have to be public, and it can't save every type of content.

ArchiveBox is an open source tool that lets organizations & individuals archive both public & private web content while retaining control over their data. It can be used to save copies of bookmarks, preserve evidence for legal cases, back up photos from FB/Insta/Flickr or media from YT/Soundcloud/etc., save research papers, and more...

📥 You can feed ArchiveBox URLs one at a time, or schedule regular imports from your bookmarks or history, social media feeds or RSS, link-saving services like Pocket/Pinboard, our Browser Extension, and more.
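
As a rough illustration of feeding a single URL, the sketch below builds the argument list for the `archivebox add` command without running it. It assumes the `archivebox` CLI is installed and on your `PATH`; the helper name `build_add_command` is hypothetical and not part of ArchiveBox.

```python
import shlex
from typing import List


def build_add_command(url: str, depth: int = 0) -> List[str]:
    """Build the argv for `archivebox add` (hypothetical helper, not an ArchiveBox API).

    depth=0 archives only the given URL; depth=1 also archives the pages it links to.
    """
    if depth not in (0, 1):
        raise ValueError("depth must be 0 or 1")
    return ["archivebox", "add", f"--depth={depth}", url]


if __name__ == "__main__":
    # Print the command instead of executing it, so this sketch runs
    # even on a machine without ArchiveBox installed.
    print(shlex.join(build_add_command("https://example.com")))
```

To actually run the command, the returned list could be passed to `subprocess.run(...)` from inside an initialized ArchiveBox data directory.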

It saves snapshots of the URLs you feed it in several redundant formats.
It also detects any content featured inside pages & extracts it into a folder:

  • 🌐 HTML/Any websites ➡️ original HTML+CSS+JS, singlefile HTML, screenshot PNG, PDF, WARC, title, article text, favicon, headers, ...
  • 🎥 Social Media/News ➡️ post content TXT, comments, title, author, images, ...
  • 🎬 YouTube/SoundCloud/etc. ➡️ MP3/MP4s, subtitles, metadata, thumbnail, ...
  • 💾 GitHub/GitLab/etc. links ➡️ clone of Git source code, README, images, ...
  • and more ...