Apr 22

Download Source Code: WebRequestSpy.zip - 15.03KB

Redirected Requests

Snapshot of the Web Request Spy, after an interactive HTTP transfer

After you perform the three steps for the same google.com web page, the response stream should show the same content as in the previous two cases. This time, however, you also get more information on the URL and on the request and response objects. On my computer, the Response property grid shows an interesting value: the ResponseUri property is "http://www.google.ca", so it looks like I got a different page from the one I requested.

If you live in Canada, or in another country outside the US, you may already know that Google transparently redirects you to its localized pages, based on the country your request comes from.

With our interactive Web Request Spy, there is an easy way to catch this. After a Create Request, and right before a Get Response, set the Request's AllowAutoRedirect property to false. After you complete the transfer, if you live, like me, outside the US, you will likely get back a different response stream. On my computer, it's the content of an HTML page that tells me to access google.ca.

If you inspect the StatusCode property of the Response, you'll see the value Found, which is equivalent to Redirect and has the same numeric value, 302. A successfully completed HTTP request normally returns status OK, or 200. A redirect status such as 302 tells us that we need to perform at least one more request.

For redirect responses from a web server, you'll find the redirect location in the Response Headers, under the name Location. You can see the Location header name in your PropertyGrid control, but not its value. Check the Debug window; it should be there. On my screen, I found "Location: http://www.google.ca/". If I set my URL to this address and perform, step by step, a second request, I get the response stream I got in the first two use cases.
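The same check can be made in code. The following is a minimal sketch, not the Web Request Spy's own implementation: the class name RedirectProbe and the method name Probe are made up for illustration, and what you see depends on where your request originates.

```csharp
using System;
using System.Net;

static class RedirectProbe   // hypothetical helper, not part of WebRequestSpy
{
    // Issues a single GET without following redirects, then reports
    // the status code and the Location header, if the server sent one.
    public static void Probe(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.AllowAutoRedirect = false;   // stop at the first response

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            Console.WriteLine("Status: {0} ({1})",
                (int)response.StatusCode, response.StatusDescription);

            // Headers["Location"] is null when there is no such header
            string location = response.Headers["Location"];
            if (location != null)
                Console.WriteLine("Location: " + location);
        }
    }

    static void Main()
    {
        Probe("http://www.google.com/");
    }
}
```

If the server answers with a redirect, passing the printed Location value back to Probe repeats, in code, the second step-by-step request described above.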

So the HttpWebRequest/Response classes offer a higher level of detail. Most HTTP commands are encapsulated either as individual properties or in the Headers collections.

Socket-Level Requests

A low-level HTTP transfer can also be performed at the TCP/IP socket level, using the Socket class from the System.Net.Sockets namespace.

At this level, you have to manually build, in code, your HTTP request headers, and separate the status code, the response headers and the response stream, from what you get back from the web server. WebUtilities.WebRequestSocket encapsulates the code for such a transparent HTTP request:

/// <summary>
/// Low-level transparent HTTP transfer, using socket classes
/// </summary>
/// <param name="url">web address</param>
/// <returns>response stream</returns>
public static string WebRequestSocket(string url)
{
    string result = "";

    // create end point (IP Address + Port)
    Uri uri = new Uri(url);
    IPHostEntry hostEntry = Dns.GetHostEntry(uri.Host);
    IPAddress address = hostEntry.AddressList[0];
    Debug.WriteLine(">>> IP Address: " + address.ToString());
    IPEndPoint endPoint = new IPEndPoint(address, uri.Port);
    Debug.Assert(endPoint != null);

    // create and connect TCP socket
    using (Socket socket = new Socket(
        endPoint.AddressFamily, SocketType.Stream, ProtocolType.Tcp))
    {
        socket.Connect(endPoint);
        Debug.Assert(socket.Connected);

        // prepare and send request string, with headers
        // (PathAndQuery keeps the query string and is already URL-escaped)
        string request = "GET "
            + uri.PathAndQuery
            + " HTTP/1.1\r\n"
            + "Host: " + uri.Host + "\r\n"
            + "Connection: close\r\n"
            + "Accept: */*\r\n"
            + "\r\n";
        Debug.WriteLine(">>> Request:");
        Debug.WriteLine(request.Trim('\r', '\n'));
        byte[] buffer = Encoding.ASCII.GetBytes(request);
        socket.Send(buffer, buffer.Length, SocketFlags.None);

        // get response stream: headers followed by content stream
        int bytes = 0;
        buffer = new byte[1000];
        using (NetworkStream netStream = new NetworkStream(socket))
        using (BinaryReader reader = new BinaryReader(netStream))
            while (true)
            {
                bytes = reader.Read(buffer, 0, buffer.Length);
                if (bytes == 0)
                    break;
                result += Encoding.ASCII.GetString(buffer, 0, bytes);
            }

        // shutdown socket
        socket.Shutdown(SocketShutdown.Both);
    }

    // separate returned HTTP status code
    Debug.Assert(result.Length > 0 && result.StartsWith("HTTP/1."));
    string status = result.Substring(0, result.IndexOf("\r\n") + 2);
    Debug.WriteLine(">>> Status Line: " + status.Trim('\r', '\n'));
    _statusCode = (HttpStatusCode)
        Convert.ToInt32(status.Split(' ')[1]);

    // separate response headers
    string headers = result.Substring(status.Length,
        result.IndexOf("\r\n\r\n") + 4 - status.Length);
    LoadHeaders(headers);
    Debug.WriteLine(">>> Response Headers:");
    Dump(_headers);

    // separate and return response content string
    result = result.Substring(status.Length + headers.Length);
    return result;
}

From the Web Request Spy, call the Transparent Requests - Using Sockets menu command for the default google.com URL, and you will get the page with the redirect status code.

To handle auto-redirects, we provide a second overload of WebUtilities.WebRequestSocket, which loops through calls until no further redirect is required. From the Web Request Spy, call the Transparent Requests - Using Sockets, with Redirects menu command:

/// <summary>
/// Low-level transparent HTTP transfer, using socket classes,
/// with or without automatic redirects
/// </summary>
/// <param name="url">web address</param>
/// <param name="autoRedirect">if true, chains autoredirects</param>
/// <returns>response stream</returns>
public static string WebRequestSocket(string url, bool autoRedirect)
{
    if (!autoRedirect)
        return WebRequestSocket(url);

    // auto-redirect handling
    string result;
    while (true)
    {
        result = WebRequestSocket(url);
        if ((int)_statusCode != (int)HttpStatusCode.Redirect)
            break;
        url = _headers["Location"];
    }
    return result;
}
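The loop above follows only status 302 and uses the Location value as is. A server may also answer with other redirect codes, or with a Location that is relative to the requested URL. Below is a hedged sketch of two helpers that such a loop could rely on. RedirectHelpers, IsRedirect and ResolveLocation are hypothetical names, not part of the downloadable source, and the set of codes treated as redirects is an assumption.

```csharp
using System;
using System.Net;

static class RedirectHelpers   // hypothetical helper, not part of WebRequestSpy
{
    // True for the common redirect codes that carry a Location header.
    public static bool IsRedirect(HttpStatusCode code)
    {
        switch ((int)code)
        {
            case 301:
            case 302:
            case 303:
            case 307:
                return true;
            default:
                return false;
        }
    }

    // Resolves a Location value against the URL that produced it.
    // An absolute Location is returned unchanged (in normalized form).
    public static string ResolveLocation(string currentUrl, string location)
    {
        return new Uri(new Uri(currentUrl), location).AbsoluteUri;
    }

    static void Main()
    {
        Console.WriteLine(IsRedirect(HttpStatusCode.Found));   // True
        Console.WriteLine(IsRedirect(HttpStatusCode.OK));      // False

        // absolute Location: prints http://www.google.ca/
        Console.WriteLine(ResolveLocation(
            "http://www.google.com/", "http://www.google.ca/"));

        // relative Location: prints http://www.google.com/intl/en/
        Console.WriteLine(ResolveLocation(
            "http://www.google.com/search", "/intl/en/"));
    }
}
```

A loop built on these helpers would also do well to count its iterations and give up after a fixed number of hops, so that two pages redirecting to each other cannot keep it running forever.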

Explicit Requests

At the socket level, you can see that HTTP requests and responses start with plain-text header blocks. A first command line is followed by "name: value" header lines. The header block, for both requests and responses, ends with an empty line.

The request command line - like "GET / HTTP/1.1" - contains the HTTP method name (GET here, but it can be POST, HEAD or another supported HTTP method), followed by the relative path of the page you're looking for (/ for the root), and the protocol name and version (HTTP/1.1).

Below is the full request string sent by WebUtilities.WebRequestSocket for the google.com page. Besides the first command line, it contains request headers with the domain or host name, an indication that the connection should be closed once the response has been sent, and a declaration that any kind of data is accepted as the response stream:
   GET / HTTP/1.1
   Host: www.google.com
   Connection: close
   Accept: */*

The first line of a response - like "HTTP/1.1 302 Found" - contains the protocol name and version, followed by the HTTP status code, as an integer and as a human-readable reason phrase.
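To make the three parts of the status line concrete, here is a small sketch that splits one into its protocol version, numeric code and reason phrase. The StatusLine class and its Parse method are hypothetical names used only for illustration. The split is limited to three parts because a reason phrase such as "Not Found" may itself contain spaces.

```csharp
using System;

static class StatusLine   // hypothetical helper, for illustration only
{
    // Splits a status line such as "HTTP/1.1 302 Found" into its parts.
    public static void Parse(string line,
        out string version, out int code, out string reason)
    {
        // at most 3 parts, so spaces inside the reason phrase survive
        string[] parts = line.Trim().Split(new char[] { ' ' }, 3);
        if (parts.Length < 2)
            throw new FormatException("Not an HTTP status line: " + line);

        version = parts[0];
        code = int.Parse(parts[1]);
        reason = parts.Length > 2 ? parts[2] : "";
    }

    static void Main()
    {
        string version, reason;
        int code;
        Parse("HTTP/1.1 302 Found\r\n", out version, out code, out reason);

        // prints: HTTP/1.1 | 302 | Found
        Console.WriteLine(version + " | " + code + " | " + reason);
    }
}
```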

Here are possible response headers for the google.com page, requiring a further redirect to the localized page:
   Location: http://www.google.ca/
   Cache-Control: private
   Set-Cookie: PREF=ID=900b73c2c2b289d8:TM=1177262052:LM=1177262052:S=Bs2ryQH49JGlSiz8; expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com
   Content-Type: text/html
   Server: GWS/2.1
   Content-Length: 218
   Date: Sun, 22 Apr 2007 17:14:12 GMT
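A header block like this one is easy to turn into a lookup table. The sketch below is one possible way to do it and is not the LoadHeaders method from the downloadable source, which is not shown here. HeaderParser is a hypothetical name. Each line is cut at the first colon only, since values such as the Set-Cookie expiry time contain colons of their own.

```csharp
using System;
using System.Collections.Generic;

static class HeaderParser   // hypothetical helper, for illustration only
{
    // Parses "name: value" lines separated by CRLF into a dictionary
    // with case-insensitive names.
    public static Dictionary<string, string> Parse(string block)
    {
        Dictionary<string, string> headers =
            new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        string[] lines = block.Split(
            new string[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string line in lines)
        {
            int colon = line.IndexOf(':');   // first colon only
            if (colon <= 0)
                continue;                    // skip malformed lines

            string name = line.Substring(0, colon).Trim();
            string value = line.Substring(colon + 1).Trim();
            headers[name] = value;           // simplification: last value wins
        }
        return headers;
    }

    static void Main()
    {
        string block =
            "Location: http://www.google.ca/\r\n" +
            "Content-Type: text/html\r\n" +
            "Content-Length: 218\r\n" +
            "\r\n";

        Dictionary<string, string> headers = Parse(block);
        Console.WriteLine(headers["location"]);         // http://www.google.ca/
        Console.WriteLine(headers["Content-Length"]);   // 218
    }
}
```

Keeping only the last value for a repeated header name is a shortcut: a server can send several Set-Cookie lines, for example, and a fuller parser would keep them all.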

ASP.NET Requests

On the server side, in ASP.NET, there are two similar classes, HttpRequest and HttpResponse, implemented in the System.Web namespace (not System.Net!). The HTTP request is usually handled by an ASP.NET page (also called a Web Forms page or ASPX page), derived from System.Web.UI.Page.

HttpWebRequest/Response classes, client-side and server-side, in ASP.NET

Here is how you can use the Request/Response properties, directly when the page loads, and manually build and format the result. In the case below, we will send back some XML:

protected void Page_Load(object sender, EventArgs e)
{
    if (Request.QueryString["xml"] == "true")
    {
        Response.ContentType = "text/xml";
        Response.Write("<?xml version=\"1.0\"?>"
            + "<answer>Yes, it's good!</answer>");
        Response.End();
    }
}
