WWW and the Technology Behind


Seminar Paper, 2002

89 Pages, Grade: B (ECTS)


Excerpt


Table of Contents

1. The Hypertext Transfer Protocol (HTTP)
1.1 HTTP – a short history
1.2 Functionality
1.3 Messages in HTTP
1.4 Fetching/sending information in HTTP
1.5 Error handling
1.6 HTTP proxies
1.7 Content Negotiation
1.8 Authentication
1.9 Domain Name System - DNS

2. Internet E-mail
2.1 Simple Mail Transfer Protocol (SMTP)
2.2 Post Office Protocol (POP)
2.3 Internet Message Access Protocol - IMAP
2.4 MIME

3. Presentation – CSS, DHTML, XHTML
3.1 Cascading Style Sheets - CSS
3.2 Dynamic HTML - DHTML
3.3 XHTML
3.4 The Relationship between HTML and its complements

4. Extensible Markup Language – XML
4.1 Functionality
4.2 The goals of XML
4.3 XML applications
4.4 Document Type Definition
4.5 XML style sheets
4.6 XML parser

5. PHP as an Example of a Scripting Language
5.1 PHP – a short history
5.2 What possibilities does PHP offer? Why choose PHP?
5.3 Functionality
5.4 Environments
5.5 Alternatives to PHP

6. Security
6.1 Secure Socket Layer – SSL
6.2 HTTP via SSL – HTTPS
6.3 Secure HTTP – S-HTTP
6.4 A Possible Alternative - IPsec

Appendix
Appendix 1 PHP tasks

References

Introduction

In the mid-1990s an army of acronyms started to conquer the world. Any combination of three or four letters seemed to have a high-tech meaning: HTTP, FTP, ASP, PHP, SMTP, POP and many more. One was, and still is, found virtually everywhere: WWW. After some time, most people realized that they all had something to do with a network of computers spanning the globe, the Internet. But what are they really about, and how does the Internet work anyway?

It is the purpose of this paper to shed some light on the meaning of some of these acronyms. It is an introduction to the technologies that can be seen as the technological basis of the Internet and its most prominent application, the World Wide Web, a.k.a. the web or WWW. Priority will be given to technologies that are widely used and are considered to be of importance for the future development of the web. Starting with the protocols that govern information exchange over the Internet, namely the Hypertext Transfer Protocol, the Simple Mail Transfer Protocol, and the Post Office Protocol, this paper will continue by giving a short introduction to current additions to HTML, “the web’s language”. A description of basic HTML is left out deliberately, as thorough and complete literature on this subject is already widely available. A section on ways to create web pages dynamically is followed, finally, by an introduction to Internet security issues and available technologies for data protection, namely SSL, HTTPS, S-HTTP, and IPsec.

1. The Hypertext Transfer Protocol - HTTP

Everyone who has ever surfed the web has had more or less direct contact with HTTP, “404 not found” certainly being the most common message users receive directly. HTTP is one of the three basic components of the architecture of the World Wide Web (WWW, the web). When it was created, two major design goals had to be achieved (Wilde, 1999):

It had to be lightweight, in the sense that it could easily be implemented in servers and clients.

It had to be fast. The web’s model of data distribution results in a large number of documents being spread over a large number of servers. A fast protocol would ensure quick retrieval of information over the web.

These goals have remained the same, even though HTTP has undergone some major developments since its conception. (Wilde, 1999)

1.1 HTTP – a short history

HTTP/0.9, the first protocol version, only supported the GET method, used to retrieve information. A client, a program establishing connections and sending requests (typically a web browser), would open a connection to a server, a program accepting connections and responding to requests. The client would then send a line consisting of the keyword GET followed by a document name. The server would respond by transmitting the requested text-only document and closing the connection after transmission. It was thus a simple protocol that supported no media other than text and offered no way to send information from the client to the server; responses carried neither error codes nor information about the document. (Wilde, 1999)

A final version of the more powerful HTTP/1.0 was not released until 1996, and only as the informational RFC 1945 (Berners-Lee et al., 1996, in Wilde, 1999), which simply documented what the major client and server implementers had added to HTTP/0.9. Version 1.0 brought many improvements. Most importantly, it introduced the concept of media types by adopting the Multipurpose Internet Mail Extensions (MIME), which already described a framework for exchanging different types of media. The new version also defined a versatile message format, consisting of an initial line plus a number of header fields that could be used to pass information between client and server in both directions. Another big improvement was the introduction of the POST method, allowing the transmission of information to a server. The new structured response format also allowed servers to include status codes that provide helpful information if a request fails. User authentication was also included. (Wilde, 1999)

The prevalent model of request/response interactions, however, needed improvement. The latest version, HTTP/1.1, released in January 1997, follows a model of persistent connections (Persistent HTTP): the connection is kept open in case another request is sent to the same server. This greatly reduces the number of times a TCP connection has to be opened and closed. In addition, HTTP/1.1 supports the Host header field, used by a client to specify to which host a request is being sent, and a server reports an error if the Host header field is missing. Absolute URIs are also accepted in requests; under version 1.0 it was only legal to use absolute URIs if the request was sent to a proxy. Absolute URIs can also be used to identify the host name on a server using virtual hosts. New request methods were introduced, among them DELETE, OPTIONS, PUT and TRACE.

HTTP/1.1 also supports the transfer of partial entities by explicitly specifying byte ranges of a resource. This is extremely useful if the transfer of a resource has been interrupted and the client wants to request the remaining part of it. Content negotiation allows a selection to be made between different representations of a resource. These representations can be characterized by language, quality, encoding, or other parameters that do not affect the content of a resource. With persistent connections, closing the connection no longer signals the end of a document. For resources whose length is not known in advance (e.g. dynamic content), chunked encoding can be used. HTTP/1.1 also introduced a more sophisticated caching model, allowing servers and proxies detailed control over how caching should be performed for particular resources. The authentication scheme was also made more secure by defining a digest access authentication method that eliminates clear-text transmission of the password. (Wilde, 1999)
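To illustrate how chunked encoding marks the end of a body without closing the connection, the following minimal Python sketch decodes a chunked body. It assumes the framing defined for HTTP/1.1 (a hexadecimal chunk size, CRLF, the chunk data, CRLF, and a final chunk of size zero); the function name decode_chunked and the sample bytes are illustrative only, and chunk extensions and trailers are ignored.

```python
def decode_chunked(body: bytes) -> bytes:
    """Decode a chunked-encoded body; chunk extensions and trailers are ignored."""
    decoded = bytearray()
    pos = 0
    while True:
        # Each chunk starts with its size in hexadecimal, terminated by CRLF.
        line_end = body.index(b"\r\n", pos)
        size_field = body[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            # A chunk of size zero marks the end of the body.
            break
        decoded += body[pos:pos + size]
        pos += size + 2  # skip the chunk data and the CRLF that follows it
    return bytes(decoded)


# Hypothetical sample: two chunks of 4 and 5 bytes, then the terminating chunk.
sample = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
decoded_sample = decode_chunked(sample)
```

Because every chunk announces its own length, the recipient knows where the body ends even though the total length was never sent.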

1.2 Functionality

HTTP is a simple request/response protocol based on a connection-oriented transport service. A basic HTTP operation consists of a client sending a request to a server, which then sends an appropriate response back to the client. There may be intermediaries between the client and the origin server, where the requested resource resides. Proxies, gateways and tunnels are the common forms of intermediaries defined in HTTP. A proxy acts as both client and server: it receives requests and then acts as a client, making requests on behalf of other clients. A proxy may also use its cache to service a request without contacting the origin server. A gateway is quite similar to a proxy. The difference is that a client communicating with a gateway does not know that it is talking to an intermediary rather than the origin server, whereas a client communicating with a proxy does so explicitly, directing its request to the proxy. In contrast to a proxy or a gateway, a tunnel acts as a blind intermediary, merely passing on messages without interpreting or modifying them. (Wilde, 1999)

1.3 Messages in HTTP

The interaction scheme between client and server over an HTTP connection is very simple. It consists of a request, sent from client to server, and a response, sent from server to client. Both types of message use the generic format laid down in RFC 822 (Crocker, 1982, in Wilde, 1999). The standard message consists of a start-line, zero or more message-header fields (headers), an empty line, and an optional message-body, which contains the so-called entity of the message. The start-line is either a request-line or a status-line, depending on whether the message is a request or a response. A standard message basically looks like this (Wilde, 1999):

illustration not visible in this excerpt
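As a rough sketch of the structure just described (start-line, headers, empty line, optional body), the following Python function splits a raw message into its parts. The function name parse_message and the sample response are assumptions made for illustration; header names are lower-cased for lookup, and folded (multi-line) header values are not handled.

```python
def parse_message(raw: bytes):
    """Split a raw HTTP message into (start_line, headers, body)."""
    # The first empty line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Each header field has the form "Name: value".
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# Hypothetical response message used only to exercise the parser.
sample_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
start_line, headers, body = parse_message(sample_response)
```

The same function works for requests, since only the start-line differs between the two message types.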

1.3.1 Message Headers

Headers can be grouped into four different types (Wilde, 1999):

General headers apply to requests as well as responses, and do not apply to the entity being transferred.

Entity headers describe the entity transferred by a request or a response by meta-information. If the message does not contain any body and therefore no entity, the entity headers will describe the resource identified by the request.

Request headers give information about the request and the client itself to the server. They do not contain information about the entity.

Response headers are used by the server to pass any information not given in the status line. Response headers do not contain any information about the entity of the message. (Wilde, 1999)

1.3.1.1 General Headers

General headers can be applied to both requests and responses. They only apply to the message being transmitted and not to the entity. (Wilde, 1999)

Cache-Control

This header field is used to specify caching directives that instruct all caching systems what to do with a given message.

Connection

This header allows a sender, client or server, to specify options that apply to a particular connection only, which means that they must not be passed on by proxies over further connections.

Date

This field indicates when a message was generated. For clients it is optional to send a Date header; servers must include it in their responses.

MIME-Version

Strictly speaking, HTTP is not MIME (Multipurpose Internet Mail Extensions) compliant. The creator of a message can, however, indicate which version of MIME was used to create the message, indicating full compliance with that MIME version.

Pragma

The Pragma field is used to specify implementation-specific directives for any recipient along the request/response chain.

Trailer

The Trailer header tells the recipient what header fields to expect in the trailer of a message encoded with chunked transfer coding.

Transfer-Encoding

The values of this field include chunked, for chunked encoding, and gzip, for encoding according to the gzip file format as specified in the informational RFC 1952 (Deutsch, 1996, in Wilde, 1999).

Upgrade

This field can be used by a client to ask the server to change protocols. For example, a client could use this field in an HTTP/1.1 message to indicate to the server that it is willing to switch to HTTP/2.0 if possible.

Via

This field allows messages to be tracked, as proxies and gateways must use it to indicate the intermediate protocols and recipients of a message. Each intermediary adds a Via field, so the recipient can easily reconstruct the path a message has taken through the system.

1.3.1.2 Entity Headers

Entity header fields contain information about the body of a message, the entity. If there is no body present, information about the requested resource is given. A common example would be the response to a HEAD request. Only information about the entity, but not the entity itself would be submitted in response. (Wilde, 1999)

Allow

This field is used to transmit a list of methods that are supported for a particular resource. A server must use this field in a 405 (method not allowed) response.

Content-Base

The recipient of an entity needs a way of resolving relative URIs in an entity. For this purpose the Content-Base field specifies the base for resolving relative URIs. If no Content-Base field is present, the Content-Location field or the URI originating the request is used to resolve relative URIs. The base URI may also be defined in a <BASE> tag in the entity's HTML code.

Content-Encoding

This field tells a client which type of encoding has been applied to the entity of a message, so it can apply the corresponding decoding mechanism to obtain the media type specified in the Content-Type field.

Content-Language

The Content-Language header indicates the natural language used in the body of the message. Languages are specified by language tags; see the Accept-Language request header in section 1.3.2.2 for more details.

Content-Length

The length of a message body is given as a decimal number of octets in the Content-Length header field. On persistent connections, where the end of a message is not indicated by closing the connection, both clients and servers have to indicate the length of any entity sent.

Content-Location

If the entity of a response can be accessed from a location other than the requested resource's URI, the server should identify this location in the Content-Location header. This header may also specify the base URI for the entity if the Content-Base field is missing.

Content-MD5

This field makes it possible to check whether an entity has been altered on its way through the system: the MD5 digest of the entity carried in this field allows an end-to-end integrity check. However, this cannot be seen as a security feature; it is only suited to detecting accidental alterations.
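The integrity check can be sketched in a few lines of Python. The sketch assumes that the header value is the base64 encoding of the 128-bit MD5 digest of the entity body, as the HTTP/1.1 specification defines it; the function names content_md5 and is_intact are hypothetical.

```python
import base64
import hashlib


def content_md5(entity: bytes) -> str:
    """Return the Content-MD5 value for an entity: base64 of its MD5 digest."""
    digest = hashlib.md5(entity).digest()
    return base64.b64encode(digest).decode("ascii")


def is_intact(entity: bytes, header_value: str) -> bool:
    """Compare the digest of the received entity with the transmitted value."""
    return content_md5(entity) == header_value.strip()


# The sender computes the value; the recipient recomputes it and compares.
sent_value = content_md5(b"hello world")
```

Anyone able to modify the entity in transit could also replace the digest, which is why the check only catches accidental changes.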

Content-Range

This field indicates where a partial entity has to be inserted into the full entity body; it specifies the position and the length of the partial entity being transferred. If the partial entity body consists of multiple ranges, the MIME type multipart/byteranges has to be used. Each byte range in this format has its own Content-Range field.
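A minimal sketch of how a recipient might use this field, assuming the common form "bytes first-last/total" (with "*" for an unknown total). The names parse_content_range and insert_part are hypothetical helpers, not part of any library.

```python
import re

# Matches values such as "bytes 500-999/1234" or "bytes 500-999/*".
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def parse_content_range(value: str):
    """Return (first_byte, length, total) for a Content-Range value."""
    match = _CONTENT_RANGE.match(value.strip())
    if match is None:
        raise ValueError("unsupported Content-Range: %r" % value)
    first, last = int(match.group(1)), int(match.group(2))
    total = None if match.group(3) == "*" else int(match.group(3))
    if first > last or (total is not None and last >= total):
        raise ValueError("inconsistent Content-Range: %r" % value)
    return first, last - first + 1, total


def insert_part(buffer: bytearray, value: str, part: bytes) -> None:
    """Write a partial entity into the buffer at the position the header names."""
    first, length, _ = parse_content_range(value)
    if len(part) != length:
        raise ValueError("part length does not match Content-Range")
    buffer[first:first + length] = part


# Hypothetical ten-byte entity in which bytes 2 to 4 are replaced.
entity = bytearray(b"0123456789")
insert_part(entity, "bytes 2-4/10", b"abc")
```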

Content-Type

The media type of the entity is specified in the Content-Type header. It is, however, possible that the actual encoding of the body may be different from the indication in the Content-Type header.

ETag

The ETag field contains the entity tag, which is used as a validator. It may also be used to compare entities from the same resource.

Expires

The date and time in this field indicate when an entity is to be considered stale. The cache should not return a copy of the entity without validation after this time and date.

Last-Modified

This field indicates when the entity was last modified. This may be a modification date from a file system, a time stamp from a database or, in the case of dynamic content, the current date and time.

Extension-header

This mechanism allows additional entity header fields to be defined without changing the protocol. However, there is no guarantee that these headers will be recognized by the recipient.

1.3.2 Request

The first message to be sent in an HTTP interaction is always a request. It is sent from client to server directly after the successful establishment of a connection and specifies the client’s request. The format of a request is rather simple (Wilde, 1999):

illustration not visible in this excerpt

The request-line contains the most important information of the request. It consists of three fields separated by space characters: the method, the request-URI and the HTTP-version. The method field specifies the method to be performed by the server on the resource specified by the request-URI. The request-URI must be used in absolute form (containing the host name) if the request is sent to a proxy. If the request is sent to the origin server, however, a path form can be used, which uniquely identifies the resource on the server. The HTTP-version field indicates the protocol version used. Any request must include a string indicating an HTTP-version. (Wilde, 1999)
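The two forms of the request-URI can be sketched as follows. The helper names build_request_line and split_request_line, as well as the host www.example.com, are assumptions chosen for illustration.

```python
def build_request_line(method: str, host: str, path: str,
                       via_proxy: bool = False,
                       version: str = "HTTP/1.1") -> str:
    """Build a request-line: method, request-URI and HTTP-version."""
    # A proxy needs the absolute form; an origin server accepts the path form.
    request_uri = "http://" + host + path if via_proxy else path
    return " ".join((method, request_uri, version))


def split_request_line(line: str):
    """Split a request-line into its three space-separated fields."""
    method, request_uri, version = line.split(" ")
    return method, request_uri, version


direct = build_request_line("GET", "www.example.com", "/index.html")
proxied = build_request_line("GET", "www.example.com", "/index.html",
                             via_proxy=True)
```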

1.3.2.1 Request Methods

The action to be performed by a server upon a request is defined by a request method. These methods can be characterized according to two properties. Safe methods never take an action other than retrieval. For idempotent methods it can be said that the side-effects of more than one identical request are the same as for a single request. Every safe method is also idempotent. (Wilde, 1999)
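One practical use of these two properties is deciding whether a client may repeat a request automatically after a dropped connection. The sketch below assumes the classification given in the HTTP/1.1 specification (RFC 2616, section 9.1) and covers only methods defined there; the helper may_retry_automatically is hypothetical.

```python
# Assumption: classification as given in RFC 2616, section 9.1.
SAFE_METHODS = frozenset({"GET", "HEAD"})
IDEMPOTENT_METHODS = SAFE_METHODS | frozenset({"PUT", "DELETE", "OPTIONS", "TRACE"})


def may_retry_automatically(method: str) -> bool:
    """Repeating an idempotent request has the same effect as sending it once."""
    return method.upper() in IDEMPOTENT_METHODS
```

POST appears in neither set, so a client following this sketch would ask the user before sending a POST request a second time.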

The different methods defined in HTTP/1.1 and other recently proposed methods are (Wilde, 1999; W3.org, 2002):

GET

Is used to retrieve any kind of information from the server. A representation of the object is transferred to the client.

GET is used when the user follows a link or enters a URL directly. As a result, the HTTP server sends back the document corresponding to the selected URL, or invokes a CGI module that generates the information to be returned.

Some URIs refer to specific variants of an object, and some refer to objects with many variants. In the latter case, the representations, encodings, and languages acceptable may be specified in the header request fields, and may affect the particular value which is returned.

HEAD

The HEAD method is identical to GET except that the server must not return a message-body in the response. The meta information contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request.

This method can be used for obtaining meta information about the entity implied by the request without transferring the entity-body itself. This method is often used for testing hypertext links for validity, accessibility, and recent modification.

CHECKOUT

Similar to GET but locks the object against update by other people. The lock may be broken by a higher authority or on timeout: in this case a future CHECKIN will fail.

SHOWMETHOD

When an object can support more operations than are defined in this specification, SHOWMETHOD allows a client to understand the interface to that operation sufficiently to allow the user to perform it interactively.

Returns a description (perhaps a form) for a given method when applied to the given object. The method name is specified in a For-Method: field. (TBS).

PUT

Updates the information held about an object on the server. It is similar to POST, but in this case the information sent to the server must be stored under the URL that accompanies the command. In this way the content of a document can be updated.

DELETE

Requests that the server delete the information corresponding to the given URL. After a successful DELETE method, the URL becomes invalid for any future methods.

POST

Is used to send information to the server, for example the data contained in a form. The server passes the information to a process in charge of handling it (generally a CGI application). The operation performed with the supplied information depends on the URL used.

This method creates a new object linked to the specified object. The message-id field of the new object may be set by the client or else will be given by the server. A URL will be allocated by the server and returned to the client. The new document is the data part of the request. It is considered to be subordinate to the specified object, in the way that a file is subordinate to a directory containing it.

LINK

Links an existing object to the specified object.

The link method of HTTP adds meta information (Object header information) to an object, without touching the object's content. For example, it requests the creation of a link from the specified object to another object.

The request is followed by a set of object headers which are to be added.

UNLINK

This method deletes metainformation about an object. The request contains object headers which are to be removed. Only headers exactly matching the headers given are removed.

Obviously the operation may be used for unlinking objects. It may also be used for removing other meta-information such as object title, expiry date, etc.

CHECKIN

Similar to PUT, but releases the lock set on the object. Fails if no lock has been set by CHECKOUT. Suggestion: phase out this (rcs-like) model in favor of the PUT (cvs-like, non-locking) model of code management.

TEXTSEARCH

This is a simple form of search. The text is assumed to derive from the requesting user, and is in no special format.

The exact algorithm to be applied is not defined in this specification, but techniques such as vocabulary proximity matching between the request data portion and the contents or titles of documents, keyword matching, stemming, and the use of a thesaurus are quite appropriate.

SPACEJUMP

This method is similar to the TEXTSEARCH method, but instead of the search criterion being a text string, it is a set of coordinates defining a point within the image.

SEARCH

Proposed only. The index (etc) identified by the URL is to be searched for something matching in some sense the enclosed message.

CONNECT

The CONNECT method is only reserved in HTTP/1.1 and is used mainly for SSL proxying. It initiates a transparent path between a client and an origin server; the proxy simply passes data back and forth.

1.3.2.2 Request Headers

Request header fields pass additional information about the request and the client to the server; they act as request modifiers. Some headers change the semantics of the request and must be properly interpreted by the server (for example conditional or partial requests for resources); others can be ignored (for example information about the software used on the client side). (Wilde, 1999)

Accept

The Accept header field specifies the media types acceptable for a response. Clients can be very specific about the type of media they can interpret or which types of media they prefer to get, if available.

Accept-Charset

Using this header, clients can specify which character sets they are willing to accept. The ISO 8859-1 character set can be assumed to be acceptable to all clients, as it covers the Latin alphabet. Clients able to understand other character sets as well can signal this to the server. If a server is not able to satisfy a requested character set, it may answer with a 406 (not acceptable) error or with an (unacceptable) response using the ISO 8859-1 character set.

Accept-Encoding

This header specifies acceptable content encodings of a response. Its absence signals to the server that any kind of content encoding is acceptable. The Internet Assigned Numbers Authority (IANA) acts as a registry for possible values for the Accept-Encoding header field. A server incapable of satisfying a requested content encoding should respond with a 406 (not acceptable) error.

Accept-Language

The Accept-Language header may be ignored by the server in some cases. It specifies the preferred language of the content (English, German,…). In cases where the response is not language specific (e.g. images) it can be ignored, where language is an important property of the content (text) it must be observed by the server. In the proposed standard RFC 1766 (Alvestrand, 1995, in Wilde, 1999) a system of language specifying tags and subtags is proposed.

Authorization

Clients can authenticate themselves to a server by including credentials either for basic or digest access authentication in the Authorization header.

Expect

A client that expects a certain behaviour of a server can indicate this using the Expect header. If the server cannot satisfy this expectation, it will answer with a 417 (expectation failed) error.

From

This header is used to pass the e-mail address of the user who is using the client to the server. No client should, however, pass on the user’s address without the user's approval. Automated clients, like search engines, should always include a contact address, in case an automated client is causing problems for a server.

Host

This is the only required header field specified in HTTP/1.1. It has to be sent in every request to a server. If it is missing, the server has to respond with a 400 (bad request) error. The Host header field specifies the Internet host and port number of the resource being requested, as obtained from the original source. In this way, servers can differentiate between requests received on a single IP address that is associated with multiple host names.
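How a server might use the Host header to serve several host names from one IP address can be sketched as follows. The site table, the directory paths and the function select_site are hypothetical; answering an unknown host name with 404 is an assumption of this sketch, not something the protocol prescribes. Header names are assumed to be lower-cased already.

```python
# Hypothetical virtual hosts sharing one IP address.
SITES = {
    "www.example.com": "/srv/www/example",
    "shop.example.com": "/srv/www/shop",
}


def select_site(headers: dict, sites: dict = SITES):
    """Return (status, document_root) chosen from the Host header."""
    host = headers.get("host", "").strip()
    if not host:
        # HTTP/1.1 requires a 400 (bad request) when Host is missing.
        return 400, None
    # Drop an optional ":port" suffix; host names are case-insensitive.
    name = host.partition(":")[0].lower()
    root = sites.get(name)
    if root is None:
        return 404, None  # assumption of this sketch: unknown host names get 404
    return 200, root
```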

If-Modified-Since

This header is used to make a GET request conditional. If the resource has not been modified since the given date, the server sends a 304 (not modified) response. If it has been modified, the response is just like that for a normal, unconditional GET method. If the request is invalid or results in a response other than 200 (ok), the error sent back is the same as in the case of a normal, unconditional GET method. This header allows more efficient caching. If a cached entity is still valid, the client learns this very quickly and only a minimum amount of data has to be transferred. In case of a modification, the response contains the updated version.
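The server-side decision can be sketched in Python with the standard library's date parser. The function name conditional_get_status is hypothetical, the resource's modification time is assumed to be a timezone-aware datetime, and an unparsable date is simply ignored, so the request is treated as unconditional.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def conditional_get_status(last_modified: datetime,
                           if_modified_since: Optional[str]) -> int:
    """Return 304 if the resource is unchanged since the given date, else 200."""
    if if_modified_since is None:
        return 200  # unconditional GET
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # invalid date: treat the request as unconditional
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have a resolution of one second.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200


# Hypothetical modification time of the requested resource.
modified = datetime(1994, 10, 29, 19, 43, 31, tzinfo=timezone.utc)
```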

If-Match

This header makes a request conditional by specifying one or more entity tags. If the entity tag specified matches the entity tag associated with the resource, the server performs the requested method. If there is no match, the server should send a 412 (precondition failed) response.

If-None-Match

This field serves a purpose similar to that of the If-Match field, with the condition inverted. With an If-None-Match header, the requested operation is only performed if none of the entity tags given in the field matches the current entity of the resource. If any of the entity tags match, the server should respond with a 304 (not modified) in case of a GET or HEAD request. In all other cases it should respond with a 412 (precondition failed). This can, for example, be used to prevent races between PUT operations.
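The two entity-tag conditions can be sketched together. This is a simplified illustration: the function check_preconditions is hypothetical, the resource is assumed to exist, weak entity tags are not treated specially, and the header value is split naively on commas.

```python
from typing import Optional


def parse_entity_tags(value: str) -> set:
    """Split a comma-separated list of entity tags (naive split, see above)."""
    return {tag.strip() for tag in value.split(",")}


def check_preconditions(method: str, current_etag: str,
                        if_match: Optional[str] = None,
                        if_none_match: Optional[str] = None) -> Optional[int]:
    """Return a status code if a precondition stops the request, else None."""
    if if_match is not None:
        tags = parse_entity_tags(if_match)
        if "*" not in tags and current_etag not in tags:
            return 412  # precondition failed
    if if_none_match is not None:
        tags = parse_entity_tags(if_none_match)
        if "*" in tags or current_etag in tags:
            return 304 if method in ("GET", "HEAD") else 412
    return None  # no precondition failed; perform the requested method
```

A client that sends PUT with If-Match and the entity tag it last saw is refused with 412 if another client has changed the resource in the meantime.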

If-Range

A client that has a partial copy of an entity in its cache and needs to receive the rest of it could use a conditional GET with a Range header field. But if that conditional GET failed, it would have to make a second request for the complete entity. The If-Range header makes this second request unnecessary. It can carry either a date, as in If-Modified-Since, or an entity tag. An If-Range header is always used in conjunction with a Range request header. If the entity is still the same, the server sends back a 206 (partial content) response with the requested range; if it has changed, the server sends a 200 (ok) response with the entire entity.

If-Unmodified-Since

This header makes the server respond with a 412 (precondition failed) if the resource has been modified since the given date. If it has not been modified, the server will perform the requested task.

Max-Forwards

This header can only be used together with the TRACE and OPTIONS methods. It limits the number of proxies or gateways that can forward a request. This is to enable a client to trace requests that seem to be looping or stuck midway.

Proxy-Authorization

This field is used by the client to authenticate itself with the nearest proxy.

Range

This header is used by clients to specify range requests, especially if a part of the entity is already cached. The byte range specifies the starting point for the retrieval of the rest of the entity. If such a request is successful, the server replies with 206 (partial content). Otherwise it sends back a 416 (requested range not satisfiable) response. A proxy receiving a complete entity will cache the entire entity and only pass on the requested ranges to its client.
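A server's handling of a single byte range can be sketched as follows. The function resolve_range is a hypothetical helper; it handles only the forms "bytes=first-last", "bytes=first-" and "bytes=-suffix", and it falls back to a full 200 response for anything it does not understand (including multiple ranges), which is a simplification made for this sketch.

```python
def resolve_range(value: str, size: int):
    """Return (status, first_byte, last_byte) for a Range header value."""
    full = (200, 0, size - 1)  # fallback: ignore the header, send everything
    unit, _, spec = value.partition("=")
    if unit.strip().lower() != "bytes" or "," in spec:
        return full
    first_s, dash, last_s = spec.strip().partition("-")
    if not dash:
        return full
    try:
        if first_s == "":
            # Suffix form: the last N bytes of the entity.
            suffix = int(last_s)
            if suffix == 0 or size == 0:
                return (416, None, None)
            return (206, max(size - suffix, 0), size - 1)
        first = int(first_s)
        last = int(last_s) if last_s else size - 1
    except ValueError:
        return full
    if last_s and first > last:
        return full  # syntactically invalid range: ignore the header
    if first >= size:
        return (416, None, None)  # requested range not satisfiable
    return (206, first, min(last, size - 1))
```

A client resuming an interrupted download of a 1000-byte resource after 500 bytes would send "bytes=500-" and receive bytes 500 to 999.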

Referer

This field can be used by a client to inform a server about the URI of the resource from which the URI of the request was taken. The server uses this information for interest, logging, optimised caching and other purposes, like setting the HTTP_REFERER variable in a Common Gateway Interface application.

TE

This field is similar to the Accept-Encoding request header, but it restricts the transfer encodings that are acceptable in the response. It only applies to the immediate connection, therefore its token must be supplied in the Connection general header field whenever it is present in a message. If a server cannot send a response that is acceptable to the TE field, it should reply with a 406 (not acceptable) response.

User-Agent

The User-Agent header contains information about the client; typically this will be the type and version of the browser. Its purpose is to allow statistical information to be collected and to identify user agents that need a tailored response because of limitations or special abilities.

1.3.3 Response

The response a server sends back to the client contains the status line, various header fields (general, entity and response headers) and the body/entity. The status line specifies the HTTP version, the status code and the reason-phrase. The status code is a three-digit integer that tells the client whether the request was handled successfully, whether the resource is available, or which error occurred. The reason-phrase is a textual description of the status code. (Wilde, 1999)

In general, these status codes and messages are divided into five categories (Wilde, 1999):

1xx: Informational: Request received, processing continues.

2xx: Success: Request was successfully received, understood and accepted.

3xx: Redirection: Further action must be taken so that the request can be completely processed.

4xx: Client error: The request contains invalid syntax or cannot be processed.

5xx: Server error: The server cannot process a valid request.
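Because the category is encoded in the first digit, a client can classify any status code without knowing it individually. A minimal sketch, with status_class as a hypothetical helper name:

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Map a three-digit status code to its category via the first digit."""
    if not 100 <= code <= 599:
        raise ValueError("not a valid HTTP status code: %r" % code)
    return STATUS_CLASSES[code // 100]
```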

HTTP is used mostly by web browsers and servers, as well as by proxies and search engines. Server responses have a three-part structure and supply data in header format (technically, in the style of MIME 1.0). If the server transmits data in another format, e.g. pictures, the format is indicated in the headers, and the client then receives the data in the announced format.

The three-part structure looks as follows (Wilde, 1999):

Status code

In the first line the HTTP version and the status code are located. A more detailed description of 4xx and 5xx error codes can be found in section 1.5.

Headers

Here general, response and entity headers are to be found.

Body

After the blank line that follows the last header, the server transmits the actual information in the body, e.g. the HTML code or the bytes of a picture.

A response basically looks like this (Wilde, 1999):

illustration not visible in this excerpt

1.3.3.1 Response Headers

Response headers pass additional information from the server to the client that cannot be fitted in the status line. (Wilde, 1999)

Accept-Ranges

This field can reduce request-response interactions between a client and a server; it indicates the server's acceptance of range requests for a resource.

Age

This field indicates the age of the supplied document in seconds.

Location

The Location header is used to redirect clients to the new location of a resource that was moved or to tell the client where to find a resource created upon request by a client.

Proxy-Authenticate

This field has to be included in a 407 (proxy authentication required) response; it may also be used in a 401 (unauthorized) response if that response has been generated by a proxy asking for authentication. The client uses the information given in this field to create a request with authentication information using the Proxy-Authorization field.

Retry-After

This field indicates how long a service is expected to be unavailable after a 503 (service unavailable) status code response. It can also be used to time a redirection.

Server

The Server header contains information about the server sending the response. It will usually be the type and version of the server software. However, this feature makes a server quite vulnerable to unauthorized access.

Vary

This header names the dimensions, i.e. the request header fields, along which the content of a response may vary as a result of a content negotiation process.

Warning

The Warning header contains additional information that would not fit into the status code field. 1xx warnings describe the revalidation status of the response and are deleted after successful revalidation. 2xx warnings describe things that are not rectified by revalidation.

WWW-Authenticate

This field must be included in a response carrying a 401 (unauthorized) error code. The client will use the information in this header field to create a valid request using the Authorization request header.
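For the Basic scheme, the exchange can be sketched in Python as follows. This is a simplified illustration, not a complete implementation: the function names are hypothetical, only a single challenge with a quoted realm is handled, and the Digest scheme is left out. The user name and password are the example values from the HTTP authentication specification.

```python
import base64
import re


def parse_challenge(www_authenticate: str):
    """Extract (scheme, realm) from a simple WWW-Authenticate value."""
    scheme, _, params = www_authenticate.partition(" ")
    match = re.search(r'realm="([^"]*)"', params)
    return scheme, match.group(1) if match else None


def basic_authorization(user: str, password: str) -> str:
    """Build the value of the Authorization request header (Basic scheme)."""
    # Basic authentication sends base64("user:password"); this is an
    # encoding, not encryption.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


scheme, realm = parse_challenge('Basic realm="WallyWorld"')
header_value = basic_authorization("Aladdin", "open sesame")
```

The client would repeat its original request with this value in the Authorization header.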

1.4 Fetching/sending information in HTTP

When a client makes a request to a server, the following steps are executed: A user goes to a URL, either by selecting a link in an HTML document or by entering the URL directly in the location field of the client. The client parses the URL and separates its different parts. It thus identifies the access protocol, the DNS name or IP address of the server, the optional port (the default value is 80) and the requested object on the server. A TCP/IP connection to the server is opened on the corresponding TCP port. Then the request is made. For this, the client sends the necessary command (GET, POST, HEAD...), the path of the requested object (the part of the URL that follows the address of the server), the version of HTTP being used (HTTP/1.0 in the example below) and a set of headers as described earlier. (Wilde, 1999)
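The URL parsing and request construction described above can be sketched in Python. The function name build_request and the User-Agent string are assumptions made for illustration; the sketch only builds the bytes of the request and does not open a connection.

```python
from urllib.parse import urlsplit


def build_request(url: str, method: str = "GET"):
    """Split a URL into its parts and build a minimal HTTP/1.0 request."""
    parts = urlsplit(url)
    host = parts.hostname       # DNS name or IP address of the server
    port = parts.port or 80     # optional port, 80 by default
    path = parts.path or "/"    # the requested object on the server
    if parts.query:
        path += "?" + parts.query

    request = (
        f"{method} {path} HTTP/1.0\r\n"
        "Accept: text/html\r\n"
        "User-Agent: example-client/0.1\r\n"   # hypothetical client name
        "\r\n"                                 # blank line ends the request
    )
    return host, port, request.encode("ascii")


host, port, request = build_request("http://www.unican.es/invest/default.html")
```

A client would then open a TCP connection to (host, port) and send the request bytes.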

The server responds to the client. The response consists of a status code and headers that state, among other things, the MIME type of the returned data, followed by the entity, the requested information itself. Then the TCP connection is closed. (Wilde, 1999)

This process is repeated for each access to the HTTP server. For example, if an HTML document contains four images, the process is carried out five times, once for the HTML document and four times for the images. (Wilde, 1999)
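The count in this example can be reproduced with a small Python sketch that looks for IMG tags in a document. The class and function names are hypothetical. The sketch assumes that every distinct image is fetched with its own request and that an image referenced twice is fetched only once; other embedded objects are ignored.

```python
from html.parser import HTMLParser


class ImageCounter(HTMLParser):
    """Collect the SRC attribute of every IMG tag in a document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def requests_needed(html: str) -> int:
    """One request for the document plus one per distinct image."""
    counter = ImageCounter()
    counter.feed(html)
    return 1 + len(set(counter.sources))


# Hypothetical document with four images.
page = (
    "<HTML><BODY>"
    '<IMG SRC="a.gif"><IMG SRC="b.gif"><IMG SRC="c.gif"><IMG SRC="d.gif">'
    "</BODY></HTML>"
)
total = requests_needed(page)
```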

Example:

A client requests the URL http://www.unican.es/invest/default.html

A TCP/IP connection is opened to port 80 of www.unican.es.

The client makes the request, sending something similar to this (explanations in parentheses are not part of the message):

GET /invest/default.html HTTP/1.0
Accept: text/plain                  (list of MIME types that the client
Accept: text/html                    accepts or understands)
Accept: audio/*
Accept: video/mpeg
..
Accept: */*                         (indicates that other MIME types are accepted as well)
User-Agent: Mozilla/3.0 (WinNT; I)  (information on the type of client)
                                    (blank line, indicates the end of the request)

The server responds with the following information:

HTTP/1.0 200 OK                     (status of the operation; in this case, successful)
Date: Monday, 7-Oct-96 18:00:00     (date of the operation)
Server: NCSA/1.4                    (type and version of the server)
MIME-version: 1.0                   (version of MIME in use)
Content-type: text/html             (MIME type of the data returned)
Content-length: 254                 (length of the data that follow)
Last-modified: 6-Oct-96 12:30:00    (date of the last modification of the data)
                                    (blank line, marks the end of the headers)
<HTML>                              (beginning of the data)
<HEAD>
<TITLE>Resources of investigation in UNICAN</TITLE>
</HEAD>
<BODY>
..
..
</HTML>

The connection is closed after the transmission.

1.5 Error handling

Client Error 4xx

If the HTTP request cannot be processed because the client has made an error in its request (such as syntactical errors or sending unauthorized requests), the server responds with a status code of this class. (Wilde, 1999)

400 (bad request)

401 (unauthorized)

402 (payment required)

403 (forbidden)

404 (not found)

405 (method not allowed)

406 (not acceptable)

407 (proxy authentication required)

408 (request time-out)

409 (conflict)

410 (gone)

411 (length required)

412 (precondition failed)

413 (request entity too large)

414 (request-URI too large)

415 (unsupported media type)

416 (requested range not satisfiable)

417 (expectation failed)

418 (reauthentication required)

419 (proxy reauthentication required)

(Wilde, 1999)

The 4xx class of status code is intended for cases in which the client seems to have erred. Except when responding to a HEAD request, the server should include an entity containing an explanation of the error situation, and whether it is a temporary or permanent condition. These status codes are applicable to any request method. User agents should display any included entity to the user. (Wilde, 1999)
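Because the first digit of a status code determines its class, a client can react sensibly even to a code it does not know. The Python sketch below illustrates this with the standard library's table of registered codes; the function name describe_status is an assumption, and the standard library table need not contain every code listed in this section.

```python
from http import HTTPStatus

# The class of a status code is given by its first digit.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe_status(code: int) -> str:
    """Return a short description such as '404 Not Found (client error)'."""
    kind = CLASSES.get(code // 100, "unknown class")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Code is not in the standard library table; fall back to the class.
        phrase = "unregistered code"
    return f"{code} {phrase} ({kind})"


not_found = describe_status(404)
unknown = describe_status(419)
```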

If the client is sending data, a server implementation using TCP should be careful to ensure that the client acknowledges receipt of the packet(s) containing the response, before the server closes the input connection. If the client continues sending data to the server after the close, the server's TCP stack will send a reset packet to the client, which may erase the client's unacknowledged input buffers before they can be read and interpreted by the HTTP application. (Wilde, 1999)

400 Bad Request

The request could not be understood by the server due to malformed syntax. The client should not repeat the request without modifications. (Wilde, 1999)

401 Unauthorized

The request requires user authentication. The response must include a WWW-Authenticate header field (section 14.47 of the HTTP/1.1 specification, RFC 2616) containing a challenge applicable to the requested resource. The client may repeat the request with a suitable Authorization header field. If the request already included Authorization credentials, then the 401 response indicates that authorization has been refused for those credentials. If the 401 response contains the same challenge as the prior response, and the user agent has already attempted authentication at least once, then the user should be presented with the entity that was given in the response, since that entity might include relevant diagnostic information. HTTP access authentication is explained in "HTTP Authentication: Basic and Digest Access Authentication" (RFC 2617). (Wilde, 1999)

[...]

Excerpt out of 89 pages

Details

Title
WWW and the Technology Behind
College
Jönköping International Business School  (DataEkonomer)
Course
Web Technology
Grade
B (ECTS)
Author
Year
2002
Pages
89
Catalog Number
V4644
ISBN (eBook)
9783638128506
ISBN (Book)
9783656081401
File size
873 KB
Language
English
Keywords
WWW, HTTP, PHP, Perl, Javascript, HTML, dHTML, XML, XHTML, Internet, POP, SSL, IPsec, E-Mail, Netzwerke, FTP
Quote paper
Christian Wolf (Author), 2002, WWW and the Technology Behind, Munich, GRIN Verlag, https://www.grin.com/document/4644
