LinkChecker

Salesforce Knowledge Article LinkChecker

Contents

Background

I write technical documentation for MapAnything, a tech startup based on the Salesforce platform. Therefore I write within the Salesforce ecosystem. I use Salesforce Knowledge to develop our information architecture.

One of the limitations of working in Salesforce Knowledge is that there is not currently a tool to check for dead links. Links change all the time, and it's important to keep them updated to maintain a good user experience. Moreover, I noticed our Resource Center uses dynamically generated JavaScript to create the page. It creates a nice enough site, but there was no simple way to rip the HTML from the page and check it for dead links. At the end of the day, one would have to use a web crawler to sort it out, and, frankly, I don't know enough JavaScript to implement such a fix efficiently.

The LinkChecker relies on you exporting your Knowledge Articles via the Heroku Knowledge Exporter. If you can't export your articles via the Heroku application, I apologize. This repository will not work for you just yet. Eventually, I'll get this application working via the Salesforce API. For now, however, this scanner simply matches the article name to the CSV file exported via Heroku. It then pulls the links from the HTML and reports them to the user.

Although this project was not designed to work with HTML files from outside of Salesforce, it should. The LinkChecker will simply report that the filename is null.

Screenshots

Usage

Note: This project is still quite early in development.

Environment

OS: Mac OSX
IDE: Eclipse Neon Release (4.6.2)

Instructions

Export your articles via the Heroku Knowledge Exporter. Unzip them.
If you have multiple article types, simply add them all to one folder. I create a folder called "html" and drop everything in it. Make sure you keep the CSV file that Heroku generates. LinkChecker uses this to map the filename to the article name.
Download the LinkChecker jar file from this repository.
Double-click the jar file.
Point to the folder where you stored your articles. LinkChecker performs a recursive search (files within files), so just point it to the top level directory.
LinkChecker should run and give you all the output you need to fix your links. LinkChecker reports the HTTP status of your article and the links within it.

Output LinkChecker reports the HTTP status code. Generally speaking, you want to look for status codes that are not 200. Redirects (3xx) won't affect your end user, but you may want to fix them regardless.

HTTP Status	Meaning
2xx	OK
3xx	Redirect
4xx	Not Found

Development Backlog

Short Term

Test with Linux.
Change Links from thread class to SingleThreadExecutor to prevent multiple threads running.
Add Enable/Disable recursive search option.
Resolve command line stack trace for bad file path.
Add increased support for non-Salesforce articles.

Long Term

Add support for Salesforce API. Allow users to log in directly and scan their articles from one interface.

Release Notes/Issues

If LinkChecker is running, a user can browse to another file and run the LinkChecker again. This spans another thread and is not desirable since both threads output to the same text area. This issue has been added to the development backlog.

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
LinkChecker		LinkChecker
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

LinkChecker

Background

Screenshots

Usage

Development Backlog

Release Notes/Issues

About

Uh oh!

Releases

Packages

Languages

marcf08/LinkChecker

Folders and files

Latest commit

History

Repository files navigation

LinkChecker

Background

Screenshots

Usage

Development Backlog

Release Notes/Issues

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages