Talk:Community Tech/Migrate dead external links to archives
WikiCache
Martix made a similar proposal at WikiCache. Perhaps the ideas they drafted there could be used to guide the development of this project. Blue Rasberry (talk) 15:45, 16 May 2016 (UTC)
- Oh, thanks for pointing that out. I'll go and reply to him. -- DannyH (WMF) (talk) 20:42, 17 May 2016 (UTC)
Adding query against alternative archive if not present in Wayback Machine due to robots.txt or other reasons
Websites with robots.txt restrictions will not be captured by the Internet Archive's global Wayback crawls, and even content captured in the past from a given host will not be displayed if/when robots.txt restrictions are added. For this project, how often do the dead links not have corresponding versions in the Wayback Machine? If this happens a non-trivial amount of the time, it could be good to subsequently check against Memento (http://timetravel.mementoweb.org/) and/or (more narrowly) Archive-It (wayback.archive-it.org); these archives may contain captures irrespective of robots.txt restrictions.
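For illustration, here is a minimal sketch of how such a fallback lookup might work: ask the Wayback Machine availability API first, then fall back to the Memento Time Travel aggregator. The endpoint paths and JSON field names are assumptions drawn from the two services' public documentation, not from this project's code, so treat this as a rough outline rather than a working implementation.

# Rough sketch (not this project's actual code): check whether a dead URL has
# an archived copy, trying the Wayback Machine first and then the Memento
# Time Travel aggregator, which draws on many archives (including Archive-It)
# that may hold captures the Wayback Machine will not serve due to robots.txt.
import requests

def find_archived_copy(url, timestamp="20160101"):
    # 1. Wayback Machine availability API (assumed endpoint and fields).
    resp = requests.get(
        "https://archive.org/wayback/available",
        params={"url": url, "timestamp": timestamp},
        timeout=10,
    )
    closest = resp.json().get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]

    # 2. Fallback: Memento Time Travel JSON API (assumed response structure).
    resp = requests.get(
        "http://timetravel.mementoweb.org/api/json/%s/%s" % (timestamp, url),
        timeout=10,
    )
    if resp.status_code == 200:
        memento = resp.json().get("mementos", {}).get("closest", {})
        uris = memento.get("uri") or []
        if uris:
            return uris[0]

    return None  # no archived copy found in either service

print(find_archived_copy("http://example.com/dead-page"))

A bot doing this at scale would of course need to respect each service's rate limits and cache the responses, but the two-step check itself is straightforward.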