To communicate with web servers and server-side applications through the Hypertext Transfer Protocol (HTTP)
The Internet Protocol
Internet: A worldwide collection of networks, routing equipment, and computers.
Uses a common set of protocols to define how the parties will interact with each other.
Data transmission consists of sending/receiving streams of zeroes and ones along the network connection.
The Internet Protocol
Two types of information:
Application data: information one computer wants to send to another.
Network protocol data: describes how to:
Reach the intended computer.
Check for errors & data loss in the transmission.
Network protocol: rules for protocol data.
Most common: Internet Protocol (IP).
Developed to enable different local area networks to communicate with each other.
Has become the basis for connecting computers around the world together over the Internet.
Two Computers Communicating Across the Internet
Figure 1 Two Computers Communicating Across the Internet
A is your home computer.
Connected to an Internet service provider (ISP).
ISP is connected to an Internet access point.
B is on a local area network at XYZ Computers.
XYZ has its own Internet access point.
Internet access points are connected by a complex collection of pathways (the Internet).
Message sent from one access point can eventually reach any access point over these pathways.
Destination Address
Data must be marked with a destination address.
In IP, addresses are denoted by a sequence of four numbers.
Each is one byte (a number between 0 and 255).
E.g., 130.65.86.66.
To accommodate more devices, the newer IPv6 standard extends IP addresses to sixteen bytes.
To send data to B, A needs to know B’s Internet address.
A includes that address in the network protocol portion when sending the data.
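The byte structure of an address can be inspected with the standard java.net.InetAddress class. The following is a minimal sketch; the class name AddressBytes and the method toUnsignedBytes are made up for illustration, and 2001:db8::1 is a documentation-only sample of a sixteen-byte address.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;

public class AddressBytes {
    /** Returns the bytes of a numeric address as unsigned values (0 to 255). */
    public static int[] toUnsignedBytes(String numericAddress) throws UnknownHostException {
        // For a literal numeric address, getByName only parses the text.
        InetAddress address = InetAddress.getByName(numericAddress);
        byte[] raw = address.getAddress();
        int[] result = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            result[i] = raw[i] & 0xFF; // Java bytes are signed, so mask to get 0..255
        }
        return result;
    }

    public static void main(String[] args) throws UnknownHostException {
        int[] four = toUnsignedBytes("130.65.86.66");
        System.out.println(four.length + " bytes: " + Arrays.toString(four));
        int[] sixteen = toUnsignedBytes("2001:db8::1");
        System.out.println(sixteen.length + " bytes: " + Arrays.toString(sixteen));
    }
}
```

The mask with 0xFF is needed because Java's byte type ranges from -128 to 127, while each part of an address is written as a number between 0 and 255.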
Domain Name System
In addition to an IP address, computers can have an easy-to-remember domain name.
E.g., horstmann.com.
Domain Name System (DNS): translates from domain name to IP address.
When A wants to request data from a domain name:
It asks the DNS for the numeric Internet Address.
It includes the numeric address with the request for data.
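In Java, the lookup step can be sketched with InetAddress.getByName, which asks the system's name resolver for the numeric address. The class name DnsLookup and the method lookup are illustrative only, and the address printed for a real domain depends on the DNS records at the time you run it.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class DnsLookup {
    /** Returns the numeric address for a host name, as text. */
    public static String lookup(String hostName) throws UnknownHostException {
        InetAddress address = InetAddress.getByName(hostName);
        return address.getHostAddress();
    }

    public static void main(String[] args) {
        // Use the first command-line argument if given, else the sample domain
        String host = args.length > 0 ? args[0] : "horstmann.com";
        try {
            System.out.println(host + " -> " + lookup(host));
        } catch (UnknownHostException e) {
            System.out.println("No address found for " + host);
        }
    }
}
```

If the argument is already a numeric address, getByName returns it without consulting the DNS.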
Packets
IP breaks large chunks of data up into more manageable packets.
Each packet is delivered separately.
Each packet in a larger transmission may be sent by a different route.
Packets are numbered.
The recipient reassembles them in the right order.
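The idea of numbering and reassembly can be illustrated with a toy sketch. This is not how a real protocol stack is implemented; the Packet class, the packet size, and the method names split and reassemble are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class PacketSketch {
    /** A hypothetical packet: a sequence number plus a piece of the data. */
    public static class Packet {
        final int number;
        final String payload;

        Packet(int number, String payload) {
            this.number = number;
            this.payload = payload;
        }
    }

    /** Breaks the data into numbered packets of at most size characters. */
    public static List<Packet> split(String data, int size) {
        List<Packet> packets = new ArrayList<>();
        int number = 0;
        for (int start = 0; start < data.length(); start += size) {
            int end = Math.min(start + size, data.length());
            packets.add(new Packet(number, data.substring(start, end)));
            number++;
        }
        return packets;
    }

    /** Puts the packets back in order by number and joins their payloads. */
    public static String reassemble(List<Packet> received) {
        List<Packet> sorted = new ArrayList<>(received);
        sorted.sort(Comparator.comparingInt(p -> p.number));
        StringBuilder result = new StringBuilder();
        for (Packet p : sorted) {
            result.append(p.payload);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        List<Packet> packets = split("Hello, Internet!", 5);
        Collections.reverse(packets); // simulate arrival in a different order
        System.out.println(reassemble(packets)); // prints "Hello, Internet!"
    }
}
```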
Transmission Control Protocol
Internet Protocol (IP) does not notify the sender if data is lost or garbled.
This is the job of a higher-level protocol, the Transmission Control Protocol (TCP).
Used by most popular Internet services – WWW & e-mail.
Bypassed by “streaming media” services for highest possible throughput.
Most commonly used combination is TCP with IP (TCP/IP).
Port Numbers
One computer can offer multiple services over the Internet.
E.g., both a web server program and a mail server program.
When data are sent to that computer, they need to indicate which program is to receive the data.
TCP uses port numbers for this:
Integer between 0 and 65,535.
Sending program must know port number of receiving program.
Port number is included in the transmitted network protocol data.
Contents of TCP Packets
Must include:
Internet address of recipient.
Port number of recipient.
Internet address of sender.
Port number of sender.
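These four items can be read from a connected java.net.Socket. The sketch below connects to a server socket on the same machine so that it runs without Internet access; the class name EndpointSketch and the method describe are hypothetical, and the port numbers printed will differ from run to run.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class EndpointSketch {
    /** Describes the sender and recipient endpoints of a connected socket. */
    public static String describe(Socket s) {
        return "sender " + s.getLocalAddress().getHostAddress() + ":" + s.getLocalPort()
            + " -> recipient " + s.getInetAddress().getHostAddress() + ":" + s.getPort();
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system to pick any free port
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort())) {
            System.out.println(describe(client));
        }
    }
}
```

The sender's port is chosen by the operating system, whereas the recipient's port must be known in advance by the sending program.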
Self Check 23.1
What is the difference between an IP address and a domain name?
Answer:
An IP address is a numerical address, consisting of four or sixteen bytes. A domain name is an alphanumeric string that is associated with an IP address.
Self Check 23.2
Why do some streaming media services not use TCP?
Answer:
TCP is reliable but somewhat slow. When sending sounds or images in real time, it is acceptable if a small amount of the data is lost. But there is no point in transmitting data that is late.
Application Level Protocols
TCP/IP mechanism establishes an Internet connection between two ports on two computers.
Each Internet application has its own application protocol.
Describes how data for that application are transmitted.
Hypertext Transfer Protocol (HTTP)
Application protocol used by the World Wide Web.
A web address is called a Uniform Resource Locator (URL).
You type a URL into the address window of your browser.
E.g., http://horstmann.com/index.html.
Browser Steps in Loading URL
Examines the part of the URL between the double slash and the first single slash (horstmann.com).
This identifies the computer to which you want to connect.
Because it contains letters, this part of the URL is a domain name, not an IP address.
Browser sends request to a DNS server to obtain IP address for horstmann.com.
From http: prefix, deduces protocol is HTTP.
Uses port 80 by default.
Establishes a TCP/IP connection to port 80 at the IP address obtained from the DNS server.
Browser Steps in Loading URL
Deduces from the /index.html that you want to see the file /index.html and sends this request formatted as an HTTP command through the established connection.
GET /index.html HTTP/1.0
Host: horstmann.com
blank line
Web server running on computer whose IP Address was obtained above receives the request.
It decodes the request.
It fetches the file /index.html.
It sends the file back to the browser on your computer.
Browser Steps in Loading URL
Browser displays contents of the file:
Since this file is an HTML file, it translates the HTML codes into fonts, bullets, etc.
If the file contains images, it makes more GET requests through the same connection.
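The first steps a browser takes can be sketched in Java with the standard java.net.URI class: pick out the host, choose the port, and format the GET request. The class name RequestBuilder and its methods are illustrative, and the sketch assumes an http URL, so it falls back to port 80 when the URL names no port.

```java
import java.net.URI;

public class RequestBuilder {
    /** The part between the double slash and the first single slash. */
    public static String host(String url) {
        return URI.create(url).getHost();
    }

    /** The port named in the URL, or 80 (the HTTP default) if there is none. */
    public static int port(String url) {
        int port = URI.create(url).getPort(); // -1 when the URL names no port
        return port == -1 ? 80 : port;
    }

    /** The HTTP command for the resource, ending with a blank line. */
    public static String request(String url) {
        URI uri = URI.create(url);
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/"; // no file named: ask for the root
        }
        return "GET " + path + " HTTP/1.0\r\n"
            + "Host: " + uri.getHost() + "\r\n"
            + "\r\n";
    }

    public static void main(String[] args) {
        String url = "http://horstmann.com/index.html";
        System.out.println("Connect to " + host(url) + " on port " + port(url));
        System.out.print(request(url));
    }
}
```

HTTP ends each line of a request with a carriage return and a line feed, which is why the sketch writes \r\n instead of relying on println.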
HTTP Commands
Telnet
Telnet program allows you to:
Type characters to send to a remote computer.
View the characters that the remote computer sends back.
It is a useful tool to establish test connections with servers.
You can imitate the browser connection by using a dialog box or typing at the command line:
telnet horstmann.com 80
Telnet
After Telnet starts, type the following without using backspace:
GET / HTTP/1.0
Host: horstmann.com
Then press the Enter key twice.
The server responds to the request with the file.
Telnet is not a browser.
It does not understand HTML tags so it just displays everything it was sent.
Telnet
Figure 2 Using Telnet to Connect to a Web Server
HTTP and HTML
Do not confuse HTTP with HTML.
HTML is a document format that describes the structure of a document.
HTTP is a protocol that describes the command set for web server requests.
Web browsers:
Know how to display HTML documents.
And how to issue HTTP commands.
Web servers:
Know nothing about HTML.
Merely understand HTTP and know how to fetch the requested items.
Application Level Protocols
HTTP is one of many application protocols in use on the Internet.
Another commonly used protocol is the Post Office Protocol (POP).
POP is used to download received messages from e-mail servers.
To send messages, you use another protocol: Simple Mail Transfer Protocol (SMTP).
Sample POP Session
Figure 3 A Sample POP Session
Self Check 23.3
Why don’t you need to know about HTTP when you use a web browser?
Answer:
The browser software translates your requests (typed URLs and mouse clicks on links) into HTTP commands that it sends to the appropriate web servers.
Self Check 23.4
Why is it important that you don’t make typing errors when you type HTTP commands in Telnet?
Answer:
Some Telnet implementations send all keystrokes that you type to the server, including the backspace key. The server does not recognize a character sequence such as G W Backspace E T as a valid command.
A Client Program
Task: write a Java program that:
Establishes a TCP connection to a server.
Sends a request to the server.
Prints the response.
Sockets
A socket is an object that encapsulates a TCP/IP connection.
There is a socket on both ends of a connection.
Create a socket in a Java program:
Socket s = new Socket(hostname, portnumber);
Connect to the HTTP port of server, horstmann.com:
final int HTTP_PORT = 80;
Socket s = new Socket("horstmann.com", HTTP_PORT);
If it can’t find the host, the Socket constructor throws an UnknownHostException.
Input and Output Streams
Use the input and output streams attached to the socket to communicate with the other endpoint.
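One possible shape for such a client is sketched below: it wraps the socket's streams in a Scanner and a PrintWriter, sends a GET request, and collects the response. The class name WebGetSketch and the methods exchange and get are assumptions for this example, not necessarily the program the chapter develops.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PrintWriter;
import java.net.Socket;
import java.net.UnknownHostException;
import java.util.Scanner;

public class WebGetSketch {
    /** Sends a GET request on outstream and returns everything read from instream. */
    public static String exchange(InputStream instream, OutputStream outstream,
            String host, String resource) {
        Scanner in = new Scanner(instream, "UTF-8");
        PrintWriter out = new PrintWriter(outstream);

        out.print("GET " + resource + " HTTP/1.0\r\n");
        out.print("Host: " + host + "\r\n");
        out.print("\r\n");
        out.flush(); // without flushing, the request may sit in a buffer

        StringBuilder response = new StringBuilder();
        while (in.hasNextLine()) {
            response.append(in.nextLine()).append("\n");
        }
        return response.toString();
    }

    /** Opens a socket, performs the exchange, and closes the socket. */
    public static String get(String host, int port, String resource) throws IOException {
        try (Socket s = new Socket(host, port)) {
            return exchange(s.getInputStream(), s.getOutputStream(), host, resource);
        }
    }

    public static void main(String[] args) {
        final int HTTP_PORT = 80;
        try {
            System.out.print(get("horstmann.com", HTTP_PORT, "/"));
        } catch (UnknownHostException e) {
            System.out.println("Host not found: " + e.getMessage());
        } catch (IOException e) {
            System.out.println("Connection problem: " + e.getMessage());
        }
    }
}
```

Keeping the stream handling in its own method means it can be tried out with in-memory streams, without a network connection.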
Answer:
Port 80 is the standard port for HTTP. If a web server is already running on the same computer, you cannot open another server socket on that port, because it is already in use.
Self Check 23.8
Can you read data from a server socket?
Answer:
No, a server socket just waits for a connection and yields a regular Socket object when a client has connected. You use that socket object to read the data that the client sends.
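A minimal sketch of this pattern follows. The class name OneShotServer, the method serveOnce, the port 8888, and the reply text are all made up for the example; a real server would loop and handle many clients.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Scanner;

public class OneShotServer {
    /** Waits for one client, reads one line from it, replies, and returns the line. */
    public static String serveOnce(ServerSocket server) throws IOException {
        // accept blocks until a client connects, then yields a regular Socket
        try (Socket s = server.accept()) {
            Scanner in = new Scanner(s.getInputStream(), "UTF-8");
            PrintWriter out = new PrintWriter(s.getOutputStream());
            String line = in.hasNextLine() ? in.nextLine() : "";
            out.println("You said: " + line);
            out.flush();
            return line;
        }
    }

    public static void main(String[] args) throws IOException {
        final int PORT = 8888; // hypothetical port, chosen to avoid the HTTP port 80
        try (ServerSocket server = new ServerSocket(PORT)) {
            System.out.println("Waiting for a client on port " + PORT);
            System.out.println("Received: " + serveOnce(server));
        }
    }
}
```

To try it, run the program and then connect with telnet localhost 8888 and type a line.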
URL Connections
Java library class URLConnection provides convenient support for the HTTP protocol.
First, construct a URL object from a URL starting with the http or ftp prefix:
URL u = new URL("http://horstmann.com/index.html");
Then use the URL's openConnection method to get the URLConnection:
URLConnection connection = u.openConnection();
Call the getInputStream method to obtain an input stream:
InputStream instream = connection.getInputStream();
Using http://horstmann.com/
200 OK
<html>
<head><title>Cay Horstmann's Home Page</title></head>
<body>
<h1>Welcome to Cay Horstmann's Home Page</h1>
...
</body>
</html>
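A program producing output of this kind might look like the following sketch. The class name UrlFetchSketch and the method fetch are assumptions, not the chapter's own listing. The status line is only available when the connection is an HttpURLConnection, so the sketch checks for that before asking for the response code.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;
import java.util.Scanner;

public class UrlFetchSketch {
    /** Returns the status line (for HTTP connections) followed by the content. */
    public static String fetch(String urlString) throws IOException {
        URL u = new URL(urlString);
        URLConnection connection = u.openConnection();
        StringBuilder result = new StringBuilder();

        // Only HTTP connections have a response code and message
        if (connection instanceof HttpURLConnection) {
            HttpURLConnection http = (HttpURLConnection) connection;
            result.append(http.getResponseCode()).append(" ")
                .append(http.getResponseMessage()).append("\n");
        }

        try (Scanner in = new Scanner(connection.getInputStream(), "UTF-8")) {
            while (in.hasNextLine()) {
                result.append(in.nextLine()).append("\n");
            }
        }
        return result.toString();
    }

    public static void main(String[] args) throws IOException {
        String urlString = args.length > 0 ? args[0] : "http://horstmann.com/";
        System.out.println("Using " + urlString);
        System.out.print(fetch(urlString));
    }
}
```

Unlike the socket approach, no request text is assembled by hand; the URLConnection sends the HTTP command and strips the response headers before the content is read.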
Self Check 23.9
Why is it better to use a URLConnection instead of a socket when reading data from a web server?
Answer:
The URLConnection class understands the HTTP protocol, freeing you from assembling requests and analyzing response headers.
Self Check 23.10
What happens if you use the URLGet program to request an image (such as http://horstmann.com/cay-tiny.gif)?
Answer:
The bytes that encode the images are displayed on the console, but they will appear to be random gibberish.