
C# URL Checker

Link Checker

With the .NET Framework, checking for valid and broken links in C# is straightforward. The main namespace required is System.Net.

7/5/11 Update: use HttpWebResponse instead of WebResponse

Access the Web

In the article describing how to download a file in C# we described how to connect to the internet to retrieve a file. Using the same technique we can connect to a URL and download a webpage (which is also a file) to test whether it is available or not. In fact, HTTP status codes let us know a lot more than that.

Setting up the Connection

Setting up the web connection is simple:

Uri urlCheck = new Uri(url);
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlCheck);
request.Timeout = 15000;

The HttpWebRequest class handles the details of requesting data from the given URL. You can also specify how long the request should wait for a response before giving up. In this case we set a timeout of 15 seconds (the Timeout property is expressed in milliseconds).

HttpWebResponse response;
try
{
     response = (HttpWebResponse)request.GetResponse();
}
catch (Exception)
{
     return false; //could not connect to the internet (maybe)
}

The GetResponse method actually goes out and tries to access the website. An exception is thrown if the request can't reach the link (often because the computer is not connected to the internet). This is a good place to make the distinction between not being able to connect and the webpage not existing. However, there are a few wrinkles. For example, an error status such as 403 (forbidden) or 404 (not found) makes GetResponse throw a WebException instead of simply setting a response code. In that case the server's response, including its status code, is still available through the exception's Response property.
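One way to separate "could not connect" from "the server answered with an error" is to catch WebException specifically and look at its Status and Response properties. The following is a minimal sketch, not part of the downloadable sample; the class name LinkProbe and the methods Describe and Check are hypothetical names chosen for illustration.

```csharp
using System;
using System.Net;

// Hypothetical helper class, for illustration only.
public static class LinkProbe
{
    // Turns a WebException into a short human-readable description.
    public static string Describe(WebException ex)
    {
        // ProtocolError means the server answered, but with an error status (403, 404, 500, ...).
        HttpWebResponse errorResponse = ex.Response as HttpWebResponse;
        if (ex.Status == WebExceptionStatus.ProtocolError && errorResponse != null)
        {
            return "HTTP error " + (int)errorResponse.StatusCode;
        }

        // Anything else (DNS failure, timeout, refused connection) produced no usable response.
        return "No response: " + ex.Status;
    }

    // Requests the URL and reports either the status code or the failure reason.
    public static string Check(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(new Uri(url));
        request.Timeout = 15000; // 15 seconds, in milliseconds

        try
        {
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                return "HTTP " + (int)response.StatusCode;
            }
        }
        catch (WebException ex)
        {
            string description = Describe(ex);
            if (ex.Response != null)
            {
                ex.Response.Close(); // release the connection held by the error response
            }
            return description;
        }
    }
}
```

Keeping the classification in Describe, separate from the network call, means the decision logic can be exercised with a hand-built WebException and no internet connection.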

Checking the Status Code

If the connection went through okay, the HttpWebResponse class gives us access to the status code of the response. This status code tells us the state of the URL. Note that we had to explicitly cast WebResponse to HttpWebResponse to gain access to the status code.

There are many status codes and each has its own meaning. The most common one is 200 (OK), which means the request succeeded. 404 (Not Found) means the page was not found, 302 (Found) means the page is redirected somewhere else, etc. You can check out the complete status code definitions. Luckily for us, the HttpStatusCode enum encapsulates the most common status codes and their meaning.
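The underlying values of the HttpStatusCode enum are the numeric codes themselves, so a cast converts between the name and the number. A small sketch of how that can be used to treat every 2xx code as success; the class StatusCodeHelper and the method IsSuccess are hypothetical names, not part of System.Net.

```csharp
using System;
using System.Net;

// Hypothetical helper class, for illustration only.
public static class StatusCodeHelper
{
    // Returns true for any status code in the 2xx (success) range.
    public static bool IsSuccess(HttpStatusCode code)
    {
        int numeric = (int)code; // e.g. HttpStatusCode.OK becomes 200
        return numeric >= 200 && numeric <= 299;
    }
}
```

With this helper, IsSuccess(HttpStatusCode.OK) is true, while IsSuccess(HttpStatusCode.Found) and IsSuccess(HttpStatusCode.NotFound) are false, because (int)HttpStatusCode.Found is 302 and (int)HttpStatusCode.NotFound is 404.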

So for our example, we might just want to check if the status code is 200 (the page was found) and return false otherwise.

// HttpStatusCode.OK is 200; HttpStatusCode.Found is 302 (a redirect)
return response.StatusCode == HttpStatusCode.OK;

Sample Program

Go ahead and download the C# source code. The CheckURL function takes in a webpage address as a parameter and returns a simple boolean value indicating whether the link is valid or broken.
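For readers who only want the shape of such a function, here is a sketch assembled from the snippets in this article. It is an approximation under stated assumptions, and the downloadable source may differ: the class name LinkChecker and the up-front URL validation are additions made for illustration, and only WebException is caught.

```csharp
using System;
using System.Net;

// Hypothetical wrapper class; the original sample's class name is not shown in the article.
public static class LinkChecker
{
    // Returns true only if the URL answers with status 200 (OK).
    public static bool CheckURL(string url)
    {
        Uri urlCheck;

        // Assumption: reject anything that is not an absolute http/https URL
        // before touching the network.
        if (!Uri.TryCreate(url, UriKind.Absolute, out urlCheck) ||
            (urlCheck.Scheme != Uri.UriSchemeHttp && urlCheck.Scheme != Uri.UriSchemeHttps))
        {
            return false;
        }

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlCheck);
        request.Timeout = 15000; // 15 seconds, in milliseconds

        try
        {
            // The using block closes the response and frees the connection.
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                return response.StatusCode == HttpStatusCode.OK;
            }
        }
        catch (WebException)
        {
            // Covers both connection failures and error statuses such as 403 or 404.
            return false;
        }
    }
}
```

A call such as LinkChecker.CheckURL("not a url") returns false without making a request, because the string does not parse as an absolute URL.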
