# ArchiveBox

ArchiveBox is a powerful, self-hosted internet archiving solution to collect, save, and view websites offline.

---

Without active preservation effort, everything on the internet eventually disappears or degrades. Archive.org does a great job as a centralized service, but saved URLs have to be public, and it can't save every type of content.

ArchiveBox is an open source tool that lets organizations & individuals archive both public & private web content while retaining control over their data. It can be used to save copies of bookmarks, preserve evidence for legal cases, back up photos from FB/Insta/Flickr or media from YT/Soundcloud/etc., save research papers, and more...

📥 **You can feed ArchiveBox URLs one at a time, or schedule regular imports** from your bookmarks or history, social media feeds or RSS, link-saving services like Pocket/Pinboard, our [Browser Extension](https://chromewebstore.google.com/detail/archivebox-exporter/habonpimjphpdnmcfkaockjnffodikoj), and more.  

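
The two ways of feeding URLs described above (one at a time, or on a schedule) can be sketched as a small Python wrapper around the `archivebox` command-line tool. This is a minimal illustration, not an official API: the helper names `add_command`, `schedule_command`, `render`, and `run` are hypothetical, the example URLs are placeholders, and it assumes `archivebox` is installed and run from inside an initialized data directory. The `add` and `schedule` subcommands and the `--every` / `--depth` flags are used here as commonly documented; check `archivebox schedule --help` for your version.

```python
import shlex
import subprocess
from typing import List


def add_command(url: str) -> List[str]:
    """Build the argv for archiving a single URL once."""
    return ["archivebox", "add", url]


def schedule_command(feed_url: str, every: str = "day", depth: int = 1) -> List[str]:
    """Build the argv for a recurring import from a feed or bookmarks URL.

    The flag values are assumptions for illustration.
    """
    return ["archivebox", "schedule", f"--every={every}", f"--depth={depth}", feed_url]


def render(cmd: List[str]) -> str:
    """Return the command as a shell-quoted string, useful for logging."""
    return shlex.join(cmd)


def run(cmd: List[str], dry_run: bool = True) -> None:
    """Print the command; execute it only when dry_run is False."""
    print(render(cmd))
    if not dry_run:
        # Requires the archivebox CLI to be available on PATH.
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: nothing is archived, the commands are only printed.
    run(add_command("https://example.com"))
    run(schedule_command("https://example.com/rss.xml"))
```

Keeping `dry_run=True` as the default lets you inspect the exact commands before anything is written to the archive; pass `dry_run=False` to actually invoke the CLI.
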
**It saves snapshots of the URLs you feed it in several redundant formats.**  
It also detects any content featured *inside* pages & extracts it into a folder:

- 🌐 **HTML**/**Any websites** ➡️ `original HTML+CSS+JS`, `singlefile HTML`, `screenshot PNG`, `PDF`, `WARC`, `title`, `article text`, `favicon`, `headers`, ...
- 🎥 **Social Media**/**News** ➡️ `post content TXT`, `comments`, `title`, `author`, `images`, ...
- 🎬 **YouTube**/**SoundCloud**/etc. ➡️ `MP3/MP4`s, `subtitles`, `metadata`, `thumbnail`, ...
- 💾 **GitHub**/**GitLab**/etc. links ➡️ `clone of Git source code`, `README`, `images`, ...
- ✨ *and more ...*