Monday, June 22, 2009

In computer networks, a proxy server is a server (a computer system or an application program) that acts as a go-between for requests from clients seeking resources from other servers. A client connects to the proxy server, requesting some service, such as a file, connection, web page, or other resource, available from a different server. The proxy server evaluates the request according to its filtering rules. For example, it may filter traffic by IP address or protocol. If the request is validated by the filter, the proxy provides the resource by connecting to the relevant server and requesting the service on behalf of the client. A proxy server may optionally alter the client's request or the server's response, and sometimes it may serve the request without contacting the specified server. In this case, it 'caches' responses from the remote server, and returns subsequent requests for the same content directly.
A proxy server has two purposes:
To keep machines behind it anonymous (mainly for security).
To speed up access to a resource (via caching). It is commonly used to cache web pages from a web server.
A proxy server that passes requests and replies unmodified is usually called a gateway or sometimes tunneling proxy.
A proxy server can be placed in the user's local computer or at various points between the user and the destination servers or the Internet. A reverse proxy is a proxy used as a front-end to accelerate and cache in-demand resources (such as a web page).

Types and functions
Proxy servers implement one or more of the following functions:

Caching proxy server
A caching proxy server accelerates service requests by retrieving content saved from a previous request made by the same client or even other clients. Caching proxies keep local copies of frequently requested resources, allowing large organizations to significantly reduce their upstream bandwidth usage and cost, while significantly increasing performance. Most ISPs and large businesses have a caching proxy. These machines are built to deliver superb file system performance (often with RAID and journaling) and also contain hot-rodded versions of TCP. Caching proxies were the first kind of proxy server.
Some poorly-implemented caching proxies have had downsides (e.g., an inability to use user authentication). Some problems are described in RFC 3143 (Known HTTP Proxy/Caching Problems).
Another important use of the proxy server is to reduce the hardware cost. An organization may have many systems on the same network or under control of a single server, prohibiting the possibility of an individual connection to the Internet for each system. In such a case, the individual systems can be connected to one proxy server, and the proxy server connected to the main server.

Web proxy
A proxy that focuses on WWW traffic is called a "web proxy". The most common use of a web proxy is to serve as a web cache. Most proxy programs (e.g. Squid) provide a means to deny access to certain URLs in a blacklist, thus providing content filtering. This is often used in a corporate, educational or library environment, and anywhere else where content filtering is desired. Some web proxies reformat web pages for a specific purpose or audience (e.g., cell phones and PDAs).
AOL dialup customers used to have their requests routed through an extensible proxy that 'thinned' or reduced the detail in JPEG pictures. This sped up performance but caused problems, either when more resolution was needed or when the thinning program produced incorrect results. This is why in the early days of the web many web pages would contain a link saying "AOL Users Click Here" to bypass the web proxy and to avoid the bugs in the thinning software.[citation needed]

Content-filtering web proxy
Further information: Content-control software
A content-filtering web proxy server provides administrative control over the content that may be relayed through the proxy. It is commonly used in commercial and non-commercial organizations (especially schools) to ensure that Internet usage conforms to acceptable use policy. However in some cases users who disagree with the policy can figure out ways to revolt by bypassing a proxy (particularly a software-based one, by downloading and using their own proxy).
Some common methods used for content filtering include: URL or DNS blacklists, URL regex filtering, MIME filtering, or content keyword filtering. Some products have been known to employ content analysis techniques to look for traits commonly used by certain types of content providers.
A content filtering proxy will often support user authentication, to control web access. It also usually produces logs, either to give detailed information about the URLs accessed by specific users, or to monitor bandwidth usage statistics. It may also communicate to daemon based and/or ICAP based antivirus software to provide security against virus and other malware by scanning incoming content in real time before it enters the network.

Anonymizing proxy server
An anonymous proxy server (sometimes called a web proxy) generally attempts to anonymize web surfing. There are different varieties of anonymizers. One of the more common variations is the open proxy. Because they are typically difficult to track, open proxies are especially useful to those seeking online anonymity, from political dissidents to computer criminals. Some users are merely interested in anonymity on principle, to facilitate constitutional human rights of freedom of speech, for instance. The server receives requests from the anonymizing proxy server, and thus does not receive information about the end user's address. However, the requests are not anonymous to the anonymizing proxy server, and so a degree of trust is present between that server and the user. Many of them are funded through a continued advertising link to the user.
Access control: Some proxy servers implement a logon requirement. In large organizations, authorized users must log on to gain access to the web. The organization can thereby track usage to individuals.
Some anonymizing proxy servers may forward data packets with header lines such as HTTP_VIA, HTTP_X_FORWARDED_FOR, or HTTP_FORWARDED, which may reveal the IP address of the client. Other anonymizing proxy servers, known as elite or high anonymity proxies, only include the REMOTE_ADDR header with the IP address of the proxy server, making it appear that the proxy server is the client. A website could still suspect a proxy is being used if the client sends packets which include a cookie from a previous visit that did not use the high anonymity proxy server. Clearing cookies, and possibly the cache, would solve this problem.

Hostile proxy
Proxies can also be installed in order to eavesdrop upon the dataflow between client machines and the web. All accessed pages, as well as all forms submitted, can be captured and analyzed by the proxy operator. For this reason, passwords to online services (such as webmail and banking) should always be exchanged over a cryptographically secured connection, such as SSL.

Intercepting proxy server
An intercepting proxy (also known as a "transparent proxy") combines a proxy server with a gateway. Connections made by client browsers through the gateway are redirected through the proxy without client-side configuration (or often knowledge).
Intercepting proxies are commonly used in businesses to prevent avoidance of acceptable use policy, and to ease administrative burden, since no client browser configuration is required.
It is often possible to detect the use of an intercepting proxy server by comparing the external IP address to the address seen by an external web server, or by examining the HTTP headers on the server side.

Transparent and non-transparent proxy server
The term "transparent proxy" is most often used incorrectly to mean "intercepting proxy" (because the client does not need to configure a proxy and cannot directly detect that its requests are being proxied). Transparent proxies can be implemented using Cisco's WCCP (Web Cache Control Protocol). This proprietary protocol resides on the router and is configured from the cache, allowing the cache to determine what ports and traffic is sent to it via transparent redirection from the router. This redirection can occur in one of two ways: GRE Tunneling (OSI Layer 3) or MAC rewrites (OSI Layer 2).
However, RFC 2616 (Hypertext Transfer Protocol—HTTP/1.1) offers different definitions:
"A 'transparent proxy' is a proxy that does not modify the request or response beyond what is required for proxy authentication and identification".
"A 'non-transparent proxy' is a proxy that modifies the request or response in order to provide some added service to the user agent, such as group annotation services, media type transformation, protocol reduction, or anonymity filtering".

Forced proxy
The term "forced proxy" is ambiguous. It means both "intercepting proxy" (because it filters all traffic on the only available gateway to the Internet) and its exact opposite, "non-intercepting proxy" (because the user is forced to configure a proxy in order to access the Internet).
Forced proxy operation is sometimes necessary due to issues with the interception of TCP connections and HTTP. For instance, interception of HTTP requests can affect the usability of a proxy cache, and can greatly affect certain authentication mechanisms. This is primarily because the client thinks it is talking to a server, and so request headers required by a proxy are unable to be distinguished from headers that may be required by an upstream server (esp authorization headers). Also the HTTP specification prohibits caching of responses where the request contained an authorization header.

Suffix proxy
A suffix proxy server allows a user to access web content by appending the name of the proxy server to the URL of the requested content (e.g. "en.wikipedia.org.6a.nl").
Suffix proxy servers are easier to use than regular proxy servers. The concept appeared in 2003 in form of the IPv6Gate and in 2004 in form of the Coral Content Distribution Network, but the term suffix proxy was only coined in October 2008 by "6a.nl"[citation needed].

Open proxy server
Main article: Open proxy
Because proxies might be used to abuse, system administrators have developed a number of ways to refuse service to open proxies. Many IRC networks automatically test client systems for known types of open proxy. Likewise, an email server may be configured to automatically test e-mail senders for open proxies.
Groups of IRC and electronic mail operators run DNSBLs publishing lists of the IP addresses of known open proxies, such as AHBL, CBL, NJABL, and SORBS.
The ethics of automatically testing clients for open proxies are controversial. Some experts, such as Vernon Schryver, consider such testing to be equivalent to an attacker portscanning the client host. Others consider the client to have solicited the scan by connecting to a server whose terms of service include testing.

Reverse proxy server
Main article: Reverse proxy
A reverse proxy is a proxy server that is installed in the neighborhood of one or more web servers. All traffic coming from the Internet and with a destination of one of the web servers goes through the proxy server. There are several reasons for installing reverse proxy servers:
Encryption / SSL acceleration: when secure web sites are created, the SSL encryption is often not done by the web server itself, but by a reverse proxy that is equipped with SSL acceleration hardware. See Secure Sockets Layer. Furthermore, a host can provide a single "SSL proxy" to provide SSL encryption for an arbitrary number of hosts; removing the need for a separate SSL Server Certificate for each host, with the downside that all hosts behind the SSL proxy have to share a common DNS name or IP address for SSL connections.
Load balancing: the reverse proxy can distribute the load to several web servers, each web server serving its own application area. In such a case, the reverse proxy may need to rewrite the URLs in each web page (translation from externally known URLs to the internal locations).
Serve/cache static content: A reverse proxy can offload the web servers by caching static content like pictures and other static graphical content.
Compression: the proxy server can optimize and compress the content to speed up the load time.
Spoon feeding: reduces resource usage caused by slow clients on the web servers by caching the content the web server sent and slowly "spoon feeding" it to the client. This especially benefits dynamically generated pages.
Security: the proxy server is an additional layer of defense and can protect against some OS and WebServer specific attacks. However, it does not provide any protection to attacks against the web application or service itself, which is generally considered the larger threat.
Extranet Publishing: a reverse proxy server facing the Internet can be used to communicate to a firewalled server internal to an organization, providing extranet access to some functions while keeping the servers behind the firewalls. If used in this way, security measures should be considered to protect the rest of your infrastructure in case this server is compromised, as its web application is exposed to attack from the Internet.

No comments: