As the manager of a large website, one of my biggest problems (and pet peeves) is Media libraries turning into document dumps, full of files that are no longer linked to by any web page. These unused files take up valuable disk space on the servers and often contain out-of-date information.
No matter how hard I drill it into my editors to delete files when they are obsolete, our servers quickly accumulate orphan files. We need a way to find them.
I'd like to propose two new functions:
- A report that can be run from the Media perspective to show a list of files which aren't linked to by any page in the website. This would be sorted and displayed by folder in a similar way to the unpublished pages report.
- A contextual menu item for "what pages link to this?", shown when right-clicking any individual page or file in the Content or Media perspectives.
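To make the idea concrete, here is a rough Python sketch of how the two functions could sit on top of a simple link index. Everything in it is hypothetical: the names LinkExtractor, LinkIndex, on_publish, pages_linking_to and orphaned_files are mine and are not part of any existing CMS API, and it assumes links are stored as plain site-relative paths that match the Media file paths exactly.

```python
import posixpath
from collections import defaultdict
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href and src targets found in a page's HTML."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.add(value)


class LinkIndex:
    """Hypothetical store of per-page link metadata (an assumption, not a real API)."""

    def __init__(self):
        self.outgoing = {}  # page path -> set of link targets found on that page

    def on_publish(self, page_path, html):
        # Re-scan the page each time it is published, replacing its old links.
        parser = LinkExtractor()
        parser.feed(html)
        parser.close()
        self.outgoing[page_path] = parser.links

    def pages_linking_to(self, target):
        # Backs the "what pages link to this?" menu item.
        return sorted(p for p, links in self.outgoing.items() if target in links)

    def orphaned_files(self, media_files):
        # Backs the report: files no page links to, grouped by folder.
        linked = set().union(*self.outgoing.values())
        by_folder = defaultdict(list)
        for path in sorted(media_files):
            if path not in linked:
                by_folder[posixpath.dirname(path)].append(path)
        return dict(by_folder)
```

With two published pages that both link to /media/docs/report.pdf, pages_linking_to("/media/docs/report.pdf") would return both page paths, and orphaned_files(...) would list every other Media file under its folder. A real implementation would also need to normalise relative and absolute URLs before comparing them, which this sketch skips.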
Both functions will require a mechanism that scans a page upon publish, and stores hyperlink information as searchable metadata attached to the page. This metadata would also allow the creation of a web-page link checker (possibly scanning external hyperlinks