Help:Archiving a source

This help page is a how-to guide.

It explains concepts or processes used by the Wikipedia community. It is not one of Wikipedia's policies or guidelines, and may reflect varying levels of consensus.

Shortcut

H:AAS

On Wikipedia you can archive sources to prevent link rot. A comprehensive list of web archive sites can be found at WP:WEBARCHIVES (technical documentation) and List of Web archiving initiatives. Below are some of the most common. See also WP:Link rot for more detailed information.

Finding an archived source

You can use time travel to find an archive.

Note that as of early 2024, Google has removed links to its cached pages from Google search results (known informally as 'Google cache'). The article from Ars Technica in the link above describes an alternative method to access cached pages on Google that may still work.

Internet Archive Wayback Machine

The Wayback Machine is a service which can be used to cite archived copies of web pages used by articles. This is useful if a web page has changed, moved, or disappeared; links to the original content can be retained. This process can be performed automatically, using the web interface for User:InternetArchiveBot.

Editors are encouraged to add an archive link as a part of each citation, or at least submit the referenced URL for archiving, at the same time that each citation is created or updated. New URLs added to Wikipedia articles (but not other pages) are usually automatically archived by a bot.

Visit the webform at https://web.archive.org, enter the original URL of the web page of interest in the "Wayback Machine" search box and then hit return/enter. The next screen may:

show a calendar listing the snapshot dates for all archived copies of that page, or
show a box near the bottom of the page with a link inviting the user to Save this url in the Wayback Machine,

This is the code that needs to be added to an existing {{cite web}} or similar template:

|archive-url=https://web.archive.org/web/<YYYYMMDDhhmmss>/http://www.originalurl.example.com |archive-date=<YYYY-MM-DD> |url-status=dead

Archive.today

archive.today is an on-demand web archiving service at https://archive.today. A web archiving service allows Wikipedia editors to reduce link rot by preserving a copy of an online source that can be accessed if the original page is moved, changes, or disappears. Not all web pages can be archived using archive.today.

archive.today can archive HTML web pages, style sheets, JavaScript, and digital images. (Full article...)

Perma.cc

Megalodon.jp

When archiving on Megalodon.jp, insert the URL in the "魚拓をURLで検索・取得" field and click "検索と確認". Complete the captcha under "このまま魚拓をとる", and then click "取得" to start the archival process.

Conifer

Conifer was previously called Webrecorder. When adding https://conifer.rhizome.org/ links, it's recommended to add a {{cbignore}} to keep IABot from removing it.

Tracked in Phabricator
Task T321146

There is some info at Wikipedia:Using Webrecorder.

Ghostarchive.org

Ghostarchive.org uses the Webrecorder technology and is thus able to save dynamic content such as YouTube videos, JavaScript etc. it is a general purpose archive service on demand with no login or usage cap requirements.

Usage instructions:

For the moment (as of September 2021), it's recommended to add a {{cbignore}} following the citation template using ghostarchive.org to keep IABot from removing it, which it might do automatically until IABot has been programmed to recognize it.
- Tracked at Phabricator: T291588; marked resolved, but unclear if still problematic: comment: phab:T291588#7980573

The site has default URL shortening e.g. https://ghostarchive.org/archive/fwAS7 .. this is disallowed on Wikipedia per RfCs and policy to prevent abuse. Use the long form URL instead. The long URL can be found by checking another URL first:

For regular web pages:

https://ghostarchive.org/longurl/o9UcZ -> https://ghostarchive.org/archive/20210728022510/https://rms-support-letter.github.io/

For video pages eg.YouTube:

http://ghostarchive.org/vlongurl/UhCiGY75wVw -> https://ghostarchive.org/varchive/youtube/20100711020000/UhCiGY75wVw

For Instagram pages:

https://ghostarchive.org/iarchive/instagram/iyogitabihani/2801706102334087488 -> https://ghostarchive.org/iarchive/ilongurl/instagram/iyogitabihani/2801706102334087488 -> https://ghostarchive.org/archive/2021/https://instagram.com/p/CbhqNsiquFA

Thus it is a two step process: determine what the long URL is; add it to Wikipedia.

To find the earliest and latest archive available use a timestamp of "1990" or "3000" e.g.
- https://ghostarchive.org/archive/1990/https://rms-support-letter.github.io/will find the earliest archived copy of that webpage, while
- https://ghostarchive.org/archive/3000/https://rms-support-letter.github.io/will find the latest.

Note: Do not use 1990 or 3000 in Wikipedia it should be the actual timestamp.