1. THE DIFFERENT KINDS OF CONNECTION

Early personal computers could connect to the outside world in only two ways -- serial ports and parallel ports. Typically, a computer has two serial ports and one parallel port. The communication possibilities have greatly increased, but these two standbys are standard and are found on practically every personal computer.

These days, your computer can have many connections to the outside world, and many computing jobs deal with getting computers to communicate. Naturally, connecting via the Internet is a big deal, but you may also have to deal with connecting to resources, such as databases, on a local network.

Serial Ports

As you'll recall from Lesson 6, modern computers use the byte composed of 8 bits as a basic unit of memory. Computer peripherals such as printers, modems, and scanners also work with an 8-bit byte. A serial port works by sending the bits contained in a byte one at a time, represented by changing voltage levels. Serial ports are used wherever you want to keep the total number of wires to a minimum, and the connectors and cables small and inexpensive. Serial cables have nine or fewer wires; two wires are used for transmitting and receiving data, and the others are used for control functions.
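
To make the "one bit at a time" idea concrete, here is a small Java sketch that splits a byte into the sequence of bits a serial line would carry. It assumes the least significant bit goes first, which is the usual order on PC serial ports. The class and method names are made up for this illustration, and a real port also adds framing bits that this sketch leaves out.

```java
class SerialBits {
    // Returns the 8 data bits of a byte in the order a serial line
    // would send them, assuming least-significant-bit-first order.
    static int[] toBits(int value) {
        int[] bits = new int[8];
        for (int i = 0; i < 8; i++) {
            bits[i] = (value >> i) & 1; // pick out bit number i
        }
        return bits;
    }

    public static void main(String[] args) {
        // The letter 'A' is stored as the byte 65 (binary 01000001).
        StringBuilder line = new StringBuilder();
        for (int bit : toBits('A')) {
            line.append(bit);
        }
        System.out.println(line); // prints 10000010
    }
}
```

Notice that the printed order is the reverse of how you would normally write the binary number, because the lowest bit leaves first.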

Before it became standard for personal computers to have built-in modems, a typical use for a serial port was to connect to a modem. Other devices typically connected to serial ports include scanners, PDA (personal digital assistants) synchronization cradles, and printers.

By convention, serial ports are named COM1, COM2, and so on. These ports are characterized by the rate at which data is transmitted in terms of bits per second, plus some other parameters not covered in this lesson. A top speed of over 100,000 bits per second is common. This translates to over 10,000 bytes per second; however, many devices can't communicate that fast.
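
If you're wondering why 100,000 bits per second works out to roughly 10,000 bytes per second rather than 12,500, the following sketch shows one plausible way to do the arithmetic. It assumes each byte travels with one start bit and one stop bit added to its 8 data bits, which is a common serial setting but only one of several; the class name is hypothetical.

```java
class SerialThroughput {
    // Bytes per second for a given bit rate, assuming each byte
    // travels as 1 start bit + 8 data bits + 1 stop bit (no parity).
    static int bytesPerSecond(int bitsPerSecond) {
        int bitsPerByteOnWire = 1 + 8 + 1;
        return bitsPerSecond / bitsPerByteOnWire;
    }

    public static void main(String[] args) {
        System.out.println(bytesPerSecond(115200)); // prints 11520
        System.out.println(bytesPerSecond(9600));   // prints 960
    }
}
```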

Most programming languages provide for sending and receiving data on COM ports. Typically, a language enables you to set the transmission speed, and send or receive one 8-bit byte at a time. Currently, the only programmers dealing with serial ports directly are those experimenting with digital circuitry.

Parallel Ports

Parallel ports achieve a high rate of data transmission by sending all 8 bits of a byte at the same time, one bit per wire. With the eight data wires plus the wires needed for control signals, you end up with a 25-pin plug as well as a thick, expensive cable. Because the most common use for parallel ports has been for communicating with printers, the conventional port name is LPT1 (line printer). Modern GUI operating systems such as Windows take over printing functions, so only more advanced languages enable you to talk directly to LPT1.

If you want to see the serial and parallel ports your computer supports, you can do this from the Control Panel on a Windows machine. Choose Start > Settings > Control Panel, double-click the System icon, select the Hardware tab, and then choose the Device Manager. Look for a Ports listing. (The steps may be a little different depending on your version of Windows.)

Newer Serial Devices

The hardware limitations of the original IBM PC COM ports are too restrictive now that users have dozens of potential accessories that can be attached to a personal computer. High-speed serial interfaces that use only a few wires have replaced COM port connections in modern machines. This is all part of the trend to cut the cost and bulk of connectors and wires as computers have gotten more compact.

If your computer was purchased in the last few years, you probably have a USB (Universal Serial Bus) or FireWire connector on your system. With these standards, a single connector can communicate with many devices. Unfortunately, the protocols for communicating on these newer serial ports are so complex that programming languages don't provide easy access.

A protocol is a set of rules describing how data is transmitted. Through a determined effort by professional societies and manufacturers, standardization on a few protocols makes standard cable connectors and modern networking possible.

2. NETWORKING COMPUTERS

When people first started connecting computers together in networks, a wide variety of proprietary protocols and hardware proliferated. Just because your computer had a network port, there was no guarantee you could connect to anything. Fortunately that era is gone. Instead, standardization for connection hardware and protocols has enabled the era of freely networked computers, which is just getting started.

Networks are classified into three groups according to the following definitions:

  • LAN (Local Area Network): Computers in a geographically small area such as a building, linked with high-speed connections.
  • WAN (Wide Area Network): A network spread over a larger area, such as a factory or campus. Typically, a WAN provides connections between LANs using technology more suited to longer distance transmission.
  • Internet: A network of networks, composed of a hierarchy of smaller networks including specialized backbone networks that move data between smaller networks.

The topic of networking is beyond the scope of this course. This lesson concentrates on interesting tasks you might do with a program to connect to the Internet.

3. CONNECT TO THE INTERNET

No matter what hardware you're using to connect to the Internet -- dial-up modem, ISDN (Integrated Services Digital Network), cable, or optical fiber -- you only have to deal with TCP/IP and URLs at the programming level.

TCP/IP (Transmission Control Protocol/Internet Protocol) can be explained this way: IP is responsible for getting chunks of your data to the computer you are addressing, and TCP is responsible for making the overall communication process reliable.

A URL (Uniform Resource Locator) is a standardized way of addressing resources on the Internet, and is one kind of URI (Uniform Resource Identifier). The syntax is defined in Internet standards published by the IETF. A URL (or URI, if you prefer) is composed of at least two parts -- a protocol plus some form of address, optionally followed by additional data. Table 7-1 shows some common Internet protocols and how they're used.

Symbol    Protocol                      Used for
http:     Hypertext Transfer Protocol   Web resources including hyperlinks
ftp:      File Transfer Protocol        Standard utility for file transfers
telnet:   Terminal Emulator             Character by character terminal emulation for remote access

Table 7-1: Common Internet protocols.

A Simple Internet Example

Java is an ideal language for experimenting with grabbing information off the Internet because it was designed from the beginning to be network aware. The following example demonstrates connecting to a well-known site run by the United States government to provide accurate time. It uses the following URL:

http://tycho.usno.navy.mil/cgi-bin/timer.pl

The parts of this URL are:

  • http: This declares the protocol used.
  • tycho.usno.navy.mil: This is the Internet address of a particular computer. You know this is a military network site because of the .mil.
  • cgi-bin/timer.pl: This says that you want a response from timer.pl, which happens to be a script written in the Perl scripting language.
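
If you'd like to see a program split a URL into these same parts, the standard java.net.URI class can do it. The following is only a small sketch; the class name UrlParts and the method name parts are invented for this illustration.

```java
import java.net.URI;

class UrlParts {
    // Returns the protocol, computer address, and resource path of a URL.
    static String[] parts(String url) {
        URI uri = URI.create(url);
        return new String[] { uri.getScheme(), uri.getHost(), uri.getPath() };
    }

    public static void main(String[] args) {
        String[] p = parts("http://tycho.usno.navy.mil/cgi-bin/timer.pl");
        System.out.println(p[0]); // prints http
        System.out.println(p[1]); // prints tycho.usno.navy.mil
        System.out.println(p[2]); // prints /cgi-bin/timer.pl
    }
}
```

Nothing here touches the network; the URL is only taken apart as text.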

We're going to write a Java program to connect to that computer, access that timer.pl script, and capture the output. To keep the example compact, the following program accomplishes everything in a single method -- not a particularly good practice for anything larger than this example. This program is similar to sequential file reading, as discussed in Lesson 6, except that you use a URL instead of a local file name. The line numbers are for reference only.

1.  package com.pb;
2.  import java.net.*;  // URL and URLConnection
3.  import java.io.*;   // InputStream
4.  public class UrlExample {
5.    public static void main( String[] args ) {
6.      if( args.length < 1 ){  // no URL was supplied
7.        System.out.println(
8.          "Program expects a URL on the command line");
9.        System.exit(1);
10.     }
11.     try {  // statements that might fail go here
12.       URL theUrl = new URL( args[0] );  // parse the address
13.       URLConnection conn = theUrl.openConnection();
14.       conn.connect();
15.       InputStream in = conn.getInputStream();  // data from the server
16.       int ch = in.read();
17.       while( ch != -1 ) {  // -1 means end of stream
18.         System.out.print( (char)ch );
19.         ch = in.read();
20.       }
21.       in.close();
22.       System.out.println("Done");
23.     } catch( Exception e ) {  // any failure ends up here
24.       System.out.println("Problem " + e.toString() );
25.     }
26.   }
27. }

  • Line 6: This is the way a Java program gets information from the command that started it. The program expects to see an array of strings that have at least one item, namely the URL, to connect to. If there's nothing there, you get the error message on Line 8.
  • Line 11: This block of code starts with try and ends at the catch. It's part of Java's error notification system.
  • Line 12: This statement creates an object that knows how to create a connection to another computer using a URL. The connection is represented by a URLConnection object that gets created on Line 13.
  • Line 15: The InputStream object represents the way Java reads streams of bytes. This object behaves the same way when reading from the connection as it would when reading from a local file.
  • Line 17: Read bytes from the stream one at a time and write each one out as a character until you reach the end, where the end is indicated by getting a -1 from the InputStream.
  • Line 23: This code is executed if any of the statements between the try and here cause an error, such as not being able to contact the URL.

When executing a program such as this, with the URL on the command line:

java com.pb.UrlExample 
  http://tycho.usno.navy.mil/cgi-bin/timer.pl

you get the following response:

<title>What time is it?</title>
<h2> US Naval Observatory Master Clock Time</h2> <h3>
<br><br>June 16, 01:43:28 UTC
<br><br>June 15, 09:43:28 PM EDT
<br><br>June 15, 08:43:28 PM CDT
<br><br>June 15, 07:43:28 PM MDT
<br><br>June 15, 06:43:28 PM PDT
Done

Isn't that cool? That's all it takes to grab information off a computer on the Internet.
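
The read loop on Lines 15 through 21 of UrlExample isn't tied to the network; it works on any InputStream. The following sketch pulls that loop into its own method so you can try it without an Internet connection. It's a variation, not the original program: it assumes the incoming bytes are UTF-8 text, it uses Java's try-with-resources form so the stream is closed even if an error occurs, and the names StreamDump and readAll are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class StreamDump {
    // Reads every character from the stream and returns the text.
    // Assumes the bytes in the stream are UTF-8 encoded text.
    static String readAll(InputStream in) throws IOException {
        StringBuilder text = new StringBuilder();
        // try-with-resources closes the reader (and the stream) for us
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            int ch = reader.read();
            while (ch != -1) {           // -1 signals the end of the stream
                text.append((char) ch);
                ch = reader.read();
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // A byte array stands in for the network connection here.
        byte[] fake = "<title>What time is it?</title>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(readAll(new ByteArrayInputStream(fake)));
    }
}
```

To read from a real site, you could pass conn.getInputStream() from the earlier program to readAll instead of the byte array.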

Web Services

Naturally, things can get a lot more complicated than that simple example. A hot topic these days is the idea of a Web service. A Web service is a program that's accessible from a network such as the Internet, and has a well-defined set of functions it can perform on request. XML, the markup language discussed in Lesson 5, is commonly used in creating Web services.

4. DATABASE SYSTEMS

In the early days of personal computing, databases only lived on giant mainframes run by people in white lab coats. Database software was very expensive and the programming language libraries available did not provide much capability for dealing with database type operations. Furthermore, each database vendor had a proprietary language.

A lot has changed since then. Now there are free open-source database packages with capabilities rivaling the most expensive systems, and many Web sites use databases extensively. There has also been a great improvement in standardization of the way users interact with databases.

Designers of modern programming languages have not tried to incorporate database functionality directly into the language. Instead, database programs have remained separate and the emphasis has been on creating standardized ways of communication. This makes sense because the requirements for a database are quite different from the requirements of a user application. Database programs have to stay running all the time, store huge amounts of data, and respond to multiple queries.

Modern relational database systems organize data into what are called tables. Think of a table as a ledger or as a spreadsheet: You have rows and columns, and each column stores a particular type of information. To insert information into a database table, you can't just pick one cell where one row and one column meet and then write in the value. To make an entry, you must complete the entire row, providing data for each column.

Relational Databases

Relational databases are special because information from one table can be used to look up information from another table. For example, imagine that you're employed by a company that keeps an address table and a flyer mailing table in its database, and the company wants you to send flyers announcing its new products to everyone it knows who hasn't received mail from it in the last two weeks. You can look up the names of people who last received flyers more than two weeks ago in one table, and find the addresses for those people in the address table. It's even possible to set up the system so that you need an entry in the address table for a person before you can create an entry for them in the flyer mailing table. By doing this, you don't have to worry about missing someone's address.
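
To see the idea of using one table to look up data in another, here is a rough Java sketch in which two in-memory maps stand in for the address table and the flyer mailing table. A real database does this work for you; the names, addresses, and dates below are invented sample data, and the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FlyerLookup {
    // Returns the addresses of everyone whose last flyer
    // went out before the cutoff date.
    static List<String> addressesToMail(Map<String, LocalDate> lastMailed,
                                        Map<String, String> addresses,
                                        LocalDate cutoff) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, LocalDate> row : lastMailed.entrySet()) {
            if (row.getValue().isBefore(cutoff)) {
                // The name links a row in one table to a row in the other.
                result.add(addresses.get(row.getKey()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> addresses = new LinkedHashMap<>();
        addresses.put("Judy", "12 Elm St");
        addresses.put("Raj", "9 Oak Ave");

        Map<String, LocalDate> lastMailed = new LinkedHashMap<>();
        lastMailed.put("Judy", LocalDate.of(2024, 1, 2));
        lastMailed.put("Raj", LocalDate.of(2024, 1, 20));

        LocalDate today = LocalDate.of(2024, 1, 25);
        System.out.println(
            addressesToMail(lastMailed, addresses, today.minusWeeks(2)));
        // prints [12 Elm St]
    }
}
```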

5. RANDOM ACCESS FILES

In the sequential reading example, after reading nine lines, the next read operation would start at the beginning of the 10th line. The file read point was positioned at the 10th line. That position could be described with an integer as the nth byte in the file, where the first byte in the file is at position zero, just as the first element of an array is at position zero.

When reading and writing with random access, you must know exactly where the data starts in the file. One way to do this is to use the same record size every time. In the following pseudocode for a subroutine to read a record, the variable recordBuffer is an array of characters that has a fixed size so you can calculate where to read.

FUNCTION readCheck( chkNum, recordBuffer )
OPEN "checks.dat" FOR RANDOM AS #ranf
SEEK #ranf TO chkNum * checkRecordSize
READ #ranf recordBuffer
CLOSE #ranf
RETURN

In another part of the program, there would have to be routines that pick the characters out of the recordBuffer and then interpret them in terms of your check records.
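
For comparison, here is how the readCheck pseudocode might look in Java, using the standard RandomAccessFile class. Treat it as a sketch under assumptions: the 16-byte record size, the sample check text, and the use of a temporary file instead of checks.dat are all invented for the demonstration.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

class CheckFile {
    static final int CHECK_RECORD_SIZE = 16; // assumed fixed size in bytes

    // Reads record number chkNum (counting from zero) from the file.
    static byte[] readCheck(File file, int chkNum) throws IOException {
        byte[] recordBuffer = new byte[CHECK_RECORD_SIZE];
        try (RandomAccessFile ranf = new RandomAccessFile(file, "r")) {
            ranf.seek((long) chkNum * CHECK_RECORD_SIZE); // jump to the record
            ranf.readFully(recordBuffer);                 // fill the buffer
        }
        return recordBuffer;
    }

    // Pads text with spaces so every record has the same size.
    static byte[] toRecord(String text) {
        String padded = String.format("%-" + CHECK_RECORD_SIZE + "s", text);
        return padded.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        // Build a small sample file with three fixed-size records.
        File file = File.createTempFile("checks", ".dat");
        file.deleteOnExit();
        try (RandomAccessFile out = new RandomAccessFile(file, "rw")) {
            out.write(toRecord("check 0: $10.00"));
            out.write(toRecord("check 1: $25.50"));
            out.write(toRecord("check 2: $99.99"));
        }
        String record = new String(readCheck(file, 2), StandardCharsets.US_ASCII);
        System.out.println(record.trim()); // prints check 2: $99.99
    }
}
```

The padding step is what keeps every record the same size, which is the property the seek calculation depends on.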

Using random access becomes increasingly helpful as files get larger. Just think how much harder it'd be to read the last 100 bytes in a 10 megabyte file by sequential reading instead of seeking a file position equal to the size of the file minus 100.
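
Here's a sketch of that "seek to the size of the file minus 100" idea in Java. The class and method names are hypothetical, and the demonstration uses a small temporary file rather than a real 10 megabyte one.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class TailReader {
    // Reads the last 'count' bytes of a file without reading what comes
    // before them. If the file is shorter than 'count', reads all of it.
    static byte[] readTail(File file, int count) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            long start = Math.max(0, raf.length() - count);
            byte[] tail = new byte[(int) (raf.length() - start)];
            raf.seek(start);     // position = file size minus count
            raf.readFully(tail);
            return tail;
        }
    }

    public static void main(String[] args) throws IOException {
        // Create a 1,000-byte sample file.
        File file = File.createTempFile("sample", ".dat");
        file.deleteOnExit();
        try (RandomAccessFile out = new RandomAccessFile(file, "rw")) {
            out.write(new byte[1000]);
        }
        System.out.println(readTail(file, 100).length); // prints 100
    }
}
```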

Database technology absolutely depends on random access because the files are so large, but is limited by the fact that after you've decided on a record size for a file, you can't change it. If your record design changes, you have to write a program that can read all of the old records and write a new file with the new record sizes.

Moving On

This lesson covered the important programming concepts of data storage, file manipulation, and memory management. You should now be able to more easily solve real-world programming problems. You also learned some important data storage technology terms.

The next lesson looks at an aspect of modern computing that is related to the data storage and retrieval techniques you saw in this lesson. Instead of reading and writing data on a local computer, you talk to other computers on local networks or the Internet. Before you move on, be sure to complete the assignment and quiz for this lesson. Don't forget to drop by the Message Board to see what your fellow students have to say.

6. DATABASE QUERY LANGUAGE

Database systems are useful, but what makes them powerful is the ability to retrieve information out of them in an organized manner. Most modern database systems (products such as Oracle, MySQL, or Microsoft SQL Server) can process a special type of language, called a query language. You use these languages to query the database, or ask it to return information.

Using query languages, you write instructions called queries to return the exact information you need. Exactly how these queries are created and used in your program depends on the facilities in the language you are using. Rather than get into programming-language-specific issues, I am going to emphasize the query language itself.

The most common query language today is called SQL (Structured Query Language). It's usually pronounced sequel, although an earlier query language was actually named SEQUEL, and a few people claim that the pronunciation causes confusion.

Most modern database systems process a version of SQL. Unfortunately, it's not as well standardized as are programming languages. Manufacturers are always trying to distinguish their product so they keep adding functions. Nevertheless, the basics of the language that are discussed here are pretty standard.

Using SQL, you can retrieve specific information from a table, such as the ZIP code for everyone in the address table whose first name is Judy. You can also perform queries that are more complicated. For example, you can have your query return the name, street address, city, state, and ZIP code from the address table for every name in the flyer mailing table whose last mailing date is more than two weeks ago. In this case, your query would match the names between the two tables to find the address for the names it identifies in the flyer mailing table.

The ability to match up data from different tables in this manner is what distinguishes a relational database from a flat database. Pieces of data such as the name in the previous example can be used to identify other data based on its relationship to it, in different tables. SQL accommodates flexible logic in matching information such as this. The following section looks at the four basic operations SQL can perform.

7. SQL COMMAND EXAMPLES

A common SQL convention is for keywords and instructions to be capitalized. There are four basic tasks that SQL can perform on a database table. It can INSERT new rows of data, SELECT data from the table, UPDATE data, or DELETE a row from the table. Look at each of these four tasks.

INSERT Statement

An INSERT statement does exactly what it sounds like: You use it to insert a row of data into a table. To perform an INSERT operation, you need to specify what table you're adding data to and what data you're adding. The full statement looks similar to this:

INSERT INTO table_name VALUES (value1, value2)

If you're not filling every column in the table's own column order, you also have to specify which column each value is destined for, as in INSERT INTO table_name (column1, column2) VALUES (value1, value2).

SELECT Statement

Use a SELECT statement to retrieve values from a database. You can specify which values you want returned; you don't have to get back a whole row of data. You can also add other specifications to this statement, as discussed earlier in the flyer mailing example.

A typical SELECT statement looks similar to the following:

SELECT column_name FROM table_name WHERE column_name = someValue

The WHERE clause is how you specify what data you want returned. Note that the column_name doesn't have to be the same as the one you're returning. You can add multiple conditions to the WHERE clause by using AND or OR and then adding more conditions. There are many ways this statement can be extended to do different tasks and perform more complicated comparisons; for the purpose of this lesson, however, just know this is possible.
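
SQL statements need a running database, so as a stand-in here is a rough Java sketch of what a SELECT with a WHERE clause does to a table: test each row, and from the rows that pass, keep only the requested column. This mimics the behavior only; it is not how a database is implemented. The table contents, column names, and the SelectSketch class are all invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class SelectSketch {
    // Mimics: SELECT returnColumn FROM table WHERE testColumn = someValue
    static List<String> select(List<Map<String, String>> table,
                               String returnColumn,
                               String testColumn,
                               String someValue) {
        List<String> results = new ArrayList<>();
        for (Map<String, String> row : table) {
            if (someValue.equals(row.get(testColumn))) { // the WHERE test
                results.add(row.get(returnColumn));      // only one column
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Each map is one row; the keys are the column names.
        List<Map<String, String>> address = List.of(
            Map.of("first_name", "Judy", "zip", "78701"),
            Map.of("first_name", "Raj", "zip", "10001"),
            Map.of("first_name", "Judy", "zip", "94105"));

        System.out.println(select(address, "zip", "first_name", "Judy"));
        // prints [78701, 94105]
    }
}
```

The column being tested (first_name) differs from the column being returned (zip), which is the point made above about the WHERE clause.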

UPDATE Statement

As the name implies, an UPDATE statement is used to update data already in the database. To do this, specify which table you're modifying and which columns of data you want to update. You can use this statement to modify the values of one or more columns. You can also use a WHERE clause to specify which rows are updated. (If you say WHERE some_column = 2, for example, only rows in which the column value is 2 will be updated.)

The full statement looks something similar to this:

UPDATE table_name
SET column_name = new_value
WHERE conditions ...

DELETE Statement

A DELETE statement is used to remove rows from a database. Because databases require information to be stored in rows, you have to remove an entire row at a time. Specify the table from which you want to remove rows and the conditions that describe the rows you want to delete (using WHERE).

A simple delete statement looks similar to this:

DELETE
FROM table_name
WHERE conditions ...
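
To see how the WHERE clause decides which rows UPDATE and DELETE touch, here is a Java sketch that applies the same logic to an in-memory list of rows. As before, this only imitates what the statements do; the column names, sample rows, and the ModifySketch class are made up for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ModifySketch {
    // Mimics: UPDATE table SET setColumn = newValue WHERE testColumn = testValue
    // Returns the number of rows that were changed.
    static int update(List<Map<String, String>> table,
                      String setColumn, String newValue,
                      String testColumn, String testValue) {
        int changed = 0;
        for (Map<String, String> row : table) {
            if (testValue.equals(row.get(testColumn))) {
                row.put(setColumn, newValue); // other columns stay as they were
                changed++;
            }
        }
        return changed;
    }

    // Mimics: DELETE FROM table WHERE testColumn = testValue
    static void delete(List<Map<String, String>> table,
                       String testColumn, String testValue) {
        // Whole rows are removed, never single cells.
        table.removeIf(row -> testValue.equals(row.get(testColumn)));
    }

    // Builds one row with a name column and a city column.
    static Map<String, String> row(String name, String city) {
        Map<String, String> r = new HashMap<>();
        r.put("name", name);
        r.put("city", city);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, String>> table = new ArrayList<>();
        table.add(row("Judy", "Austin"));
        table.add(row("Raj", "Boston"));

        System.out.println(update(table, "city", "Denver", "name", "Judy")); // prints 1
        delete(table, "name", "Raj");
        System.out.println(table.size());             // prints 1
        System.out.println(table.get(0).get("city")); // prints Denver
    }
}
```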

SQL also includes other ways to modify these queries. For example, you can have it sort a list of people and their addresses alphabetically by name. There's much more that can be done, but most SQL is built around those four commands, which manipulate database data. Manipulating database data is the point of a query language.

Moving On

This lesson covered how programs deal with the new programming environment in which everything is connected. It also gave you a short overview of how programs talk to databases using query languages. In the final lesson, you get some practical advice on programming from a seasoned programmer. Lesson 8 discusses what programmers are working on these days and the kinds of tools they use.

Before you move on, be sure to complete the assignment and quiz for this lesson. Don't forget to drop by the Message Board to see what your fellow students have to say.