Useful archiving efforts and other projects for people who are new to archiving and want to help out:
HIGH priority (If you don't help archive these automatically, the data will probably be lost forever):
1. http://warrior.archiveteam.org/
Help automatically archive things that are being shut down right now by running the ArchiveTeam Warrior program (or specific project containers) in the background.
Requirements: A few GB of disk space, some bandwidth, and a small amount of CPU power. More info: https://wiki.archiveteam.org/index.php/ArchiveTeam_Warrior
If you learn that a site or any online data is in danger of disappearing, read through this page and, if needed, contact ArchiveTeam on their IRC to have it archived: https://wiki.archiveteam.org/index.php/Projects
2. Automatically forward the URLs you browse that are not yet archived on https://archive.org to them for archival, using a browser extension:
https://github.com/internetarchive/wayback-machine-webextension
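For anyone who wants to check a list of URLs by hand, here is a minimal sketch of the same idea using the Wayback Machine's public availability endpoint and the "Save Page Now" URL prefix. It is not how the extension itself works internally; the function names are made up for illustration, and the response layout is assumed to follow the documented `archived_snapshots` / `closest` shape.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

# Public endpoints; everything else in this sketch is illustrative naming.
AVAILABILITY_API = "https://archive.org/wayback/available"
SAVE_PREFIX = "https://web.archive.org/save/"


def availability_url(page_url: str) -> str:
    """Build the availability query URL for one page."""
    return AVAILABILITY_API + "?" + urllib.parse.urlencode({"url": page_url})


def closest_snapshot(response: dict) -> Optional[str]:
    """Return the closest snapshot URL from a parsed response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def save_url(page_url: str) -> str:
    """URL that asks the Wayback Machine to capture the page when opened."""
    return SAVE_PREFIX + page_url


def check(page_url: str, timeout: float = 30.0) -> Optional[str]:
    """Query the live endpoint (needs network access)."""
    with urllib.request.urlopen(availability_url(page_url), timeout=timeout) as resp:
        return closest_snapshot(json.load(resp))
```

If `check(url)` returns `None`, opening `save_url(url)` in a browser requests a capture. Be gentle with request rates when looping over many URLs.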
MEDIUM priority (Important overall)
3. Seed torrents for as long as possible, and rare data forever. Look up a guide for your router to PORT FORWARD your torrent client's port, which substantially increases your upload (and download) speed. In low-population torrent swarms, if no one has a forwarded port, peers might not be able to connect to each other at all, so no data gets exchanged even though they have it.
Requirements: As much or as little bandwidth as you want (you can set limits if you need to)
https://github.com/qbittorrent/qBittorrent (Recommended client, especially to replace uTorrent)
4. Archive web pages you want to have a local copy of with a "Web Extension for saving a faithful copy of a complete web page in a single HTML file with a single click"
https://github.com/gildas-lormeau/SingleFile
5. Archive videos with "GUI front-end for youtube-dl, yt-dlp and other compatible video downloaders"
https://github.com/axcore/tartube
6. "Capture or record any area of your screen and share it with a single press of a key"
https://github.com/ShareX/ShareX
7. Archive entire websites you want to have a local copy of
https://www.httrack.com/
8. Publish the data you have archived that isn't easily available online, or isn't available at all. You can create torrents yourself in your torrent client and then share the magnet link anywhere online for anyone to access. As long as DHT (Distributed Hash Table, a decentralized way to share torrents without the need for any specific tracker) is enabled in the settings (it is on by default), your files will be discoverable by DHT crawlers, local or online (for example https://btdig.com/, where you can also search for FILE NAMES within all DHT torrents).
(archive.org also creates torrents for all uploads automatically, but those torrents shouldn't be relied on: the implementation is error-prone, and they can break when more files are uploaded or when the item's metadata changes, which includes even a new comment on the item)
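To make the "share the magnet link" step concrete, here is a rough sketch of how a magnet link is put together from a torrent's info-hash. Your torrent client normally generates this for you; the helper name `make_magnet` and the tracker URL in the usage note are placeholders, not anything from a real torrent.

```python
import urllib.parse


def make_magnet(info_hash: str, name: str = "", trackers=()) -> str:
    """Build a magnet URI from a 40-character hex (SHA-1) info-hash.

    name     -> optional display name (the "dn" parameter)
    trackers -> optional tracker announce URLs (one "tr" parameter each)
    """
    info_hash = info_hash.lower()
    if len(info_hash) != 40 or any(c not in "0123456789abcdef" for c in info_hash):
        raise ValueError("expected a 40-character hex SHA-1 info-hash")

    parts = ["xt=urn:btih:" + info_hash]
    if name:
        parts.append("dn=" + urllib.parse.quote(name, safe=""))
    for tracker in trackers:
        parts.append("tr=" + urllib.parse.quote(tracker, safe=""))
    return "magnet:?" + "&".join(parts)
```

For example, `make_magnet("0123456789abcdef0123456789abcdef01234567", "my archive", ["udp://tracker.example.org:1337/announce"])` produces a link with `xt`, `dn` and `tr` parameters. A link with only the `xt` part relies entirely on DHT to find peers, which is why DHT needs to stay enabled.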
OTHER useful things:
- In your torrent client settings, add the best trackers so they are automatically attached to all of your newly added torrents (this makes it easier to connect to peers, especially in obscure torrents):
https://github.com/ngosang/trackerslist
- Look into running a node for I2P (anonymous private network within the global internet):
Requirements: Mostly bandwidth, more info: https://geti2p.net/en/faq
https://geti2p.net/
- Look into running Tor/Hyphanet(Freenet)/IPFS nodes.
- "A self-hosted BitTorrent indexer, DHT crawler, content classifier and torrent search engine with web UI"
https://github.com/bitmagnet-io/bitmagnet
- "ArchiveBox is a powerful, self-hosted internet archiving solution to collect, save, and view websites offline"
https://github.com/ArchiveBox/ArchiveBox
- Look into donating your PC resources to be used more intensively in projects:
BOINC (Berkeley Open Infrastructure for Network Computing): https://boinc.berkeley.edu/projects.php
GIMPS (Great Internet Mersenne Prime Search): https://www.mersenne.org/
- Additional archiving tools: https://github.com/iipc/awesome-web-archiving
- Additional links to archiving and similar communities:
https://wiki.archiveteam.org/index.php/Archiveteam:IRC
https://www.reddit.com/r/Archiveteam
https://www.reddit.com/r/DataHoarder
https://www.reddit.com/r/DataHoarder/wiki/index/ - Hardware and software for data hoarding FAQ
https://www.reddit.com/r/lostmedia
https://www.reddit.com/r/GamePreservationists
https://www.reddit.com/r/torrents
https://www.reddit.com/r/qBittorrent
https://annas-archive.se/torrents
>>>/t/
What are you archiving, or what do you want to archive?
Do you have, or know anyone who has, rare or interesting data or media that isn't available online?
>website got listed on the archive team wiki
>years passed
>forums got deleted
>no one archived it
>>105667004
https://wiki.archiveteam.org/index.php/Kongregate
>>105667356
https://archive.org/details/archiveteam?tab=collection&query=kongregate&sort=date
>Complete archive of all topic pages in the Kongregate forums shortly before non-gaming forums were removed on 2020-07-22. See also kongregate.com_forums_20230903 for a repeat run of this prior to the complete removal of the forums in 2023.
>>105667389
oh. someone should update the wiki then
>>105667427
edit the wiki yourself so you help others like you in future
>>105668095
i am not going to make an account and wait for a janny to let me do it
>>105666568 (OP)
>1. http://warrior.archiveteam.org/
i had no idea i could volunteer my computer for archive.org like that. i love that kind of shit
>>105669013
ArchiveTeam cooperates closely with the Internet Archive, but it is not part of it; it is a collection of online volunteers. But yes, what you archive automatically by running ArchiveTeam's software does end up on the Internet Archive and its Wayback Machine.
>tfw only 100tb of spinning rust and its pretty much full
I wish I knew about archiving when i was a wee lad. So much shit i enjoyed is now lost.
Anyways, is there a semantic standard for naming files?
>>105666568 (OP)
have they fixed torrents yet? (meta comments are included in torrents so every time someone comments on an archive entry the torrent is invalid)
the only reason i like torrents is because the IA downloader tool errors out silently. well, effectively silently because of all the garbage it outputs which can't be turned off.
>>105671317
I don't think they did, and it probably won't be fixed anytime soon.
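For anyone wondering why a changed metadata file would invalidate a torrent at all: in BitTorrent v1 a torrent is identified by the SHA-1 hash of its bencoded "info" dictionary, which lists every file and its length. The sketch below is a toy illustration of that mechanism, assuming (as the posts above say) that a metadata file is part of the torrent. The item name, file names, and sizes are invented, and the `pieces` value is a placeholder rather than real piece hashes.

```python
import hashlib


def bencode(value) -> bytes:
    """Minimal bencoder for ints, bytes/str, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be emitted in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def info_hash(info: dict) -> str:
    """SHA-1 of the bencoded info dictionary, as a hex string."""
    return hashlib.sha1(bencode(info)).hexdigest()


def toy_info(meta_length: int) -> dict:
    """Hypothetical two-file torrent: one payload file plus a metadata file."""
    return {
        "name": "example_item",
        "piece length": 524288,
        "pieces": b"\x00" * 20,  # placeholder, not real piece hashes
        "files": [
            {"length": 1000000, "path": ["video.mp4"]},
            {"length": meta_length, "path": ["example_item_meta.xml"]},
        ],
    }


before = info_hash(toy_info(500))
after = info_hash(toy_info(501))  # metadata file grew by a single byte
```

`before` and `after` differ, so anyone seeding the old torrent is now in a different swarm from people who fetch the regenerated one. In a real torrent the piece hashes would change as well, which only makes the mismatch worse.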
>>105666568 (OP)
You've now made this thread 251 times. After I pointed out yesterday that you had made this thread 250 times, the thread was immediately pruned. What is your problem? Why do you constantly spam this?
>>105666568 (OP)
You've now made this thread 215 times. After I pointed out yesterday that you had made this thread 214 times, the thread was immediately pruned for spam. What is your problem? Why do you constantly spam this?
>>105671634
>the thread was immediately pruned for spam
It wasn't.
>What is your problem?
Data loss.
>You've now made this thread 215 times.
Great, that's probably at least a few thousand people who are now more knowledgeable about the tools they have to archive things they care about.
Seems like a reasonable thread to ask without making a new one.
Any reason I should get a NAS over a "DAS" like the ORICO 9728C3-EU-BK-BP?
It's a third of the cost of a QNAP (not getting synology) and holds two HDDs.
I should be able to put two 8TB drives in there and have redundancy, no?
The only point of a NAS that I can tell is to have online shit attached, but I don't want my cute and funny collection exposed to the internet.
Any opinions on this?
I don't think I'll need more than 8TB of storage as that is already around 270 full seasons of anime in Blu-ray quality.
I also have a Blu-ray burner for important shit.
I might get a Pi or something for Syncthing later, but I don't want this on anything but LAN.
>>105671292
Should i name them "name_month_day_year_hour:minute"?
I don't know if windows can read the character ":" though
>>105672030
>I don't know if windows can read the character ":" though
No.
>Should i name them "name_month_day_year_hour:minute"?
You can name them however makes the most sense given the data you are archiving, whichever naming scheme would help the person going through the archive the most.
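As one possible scheme (a common convention, not an official standard): put the date in year-month-day order so that plain alphabetical sorting is also chronological, and avoid the characters Windows rejects in file names. A small sketch, with a made-up helper name:

```python
import re
from datetime import datetime

# Characters Windows does not allow in file names, plus whitespace,
# collapsed into a single underscore.
_UNSAFE = re.compile(r'[<>:"/\\|?*\s]+')


def archive_filename(name: str, when: datetime, ext: str = "") -> str:
    """Return e.g. 'forum_dump_2025-06-22_14-05.zip'.

    Year-month-day order keeps alphabetical and chronological order the same
    for files that share a name; '-' replaces ':' in the time part.
    """
    safe = _UNSAFE.sub("_", name).strip("_")
    stamp = when.strftime("%Y-%m-%d_%H-%M")
    return f"{safe}_{stamp}{ext}"
```

Calling `archive_filename("forum: dump", datetime(2025, 6, 22, 14, 5), ".zip")` gives `forum_dump_2025-06-22_14-05.zip`. With month_day_year order, files from different years would interleave when sorted by name.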
How do I archive a bunch of books at once? I'm trying to download controversial authors before their works disappear. I've been using WeLib so far, getting PDF and epub docs.
>>105672092
Depends on the site. JDownloader2 can be good if you don't want to manually write scripts to download things with wget or something like that.
>>105672059
I'll just change the : for _ then
I kinda wanted it to look professional, i never worked with data before but when i see them it's all named in a cool order with versioning and shit
>>105672290
You can always change it easily later with something like the Bulk Rename Utility.
>>105672127
>JDownloader2
Any idea which version doesn't have the spyware/adware thing in it?
I recall this used to be a thing.
>>105672427
It was a forum post version. I'm pretty sure this is the one with the adware.
>>105672353
I use mint but thanks for helping
And i probably won't change later because of laziness