Computers in Physics
How to Transfer Files on the Internet
Glenn Ricart
Citation: Computers in Physics 8, 20 (1994); doi: 10.1063/1.4823251
View online: http://dx.doi.org/10.1063/1.4823251
Published by the AIP Publishing
Reuse of AIP Publishing content is subject to the terms at: https://publishing.aip.org/authors/rights-and-permissions. Download to IP: 63.249.156.141 On: Sat, 29 Oct 2016 19:30:10
INTERNET CORNER
HOW TO TRANSFER FILES ON THE INTERNET
Glenn Ricart
Department Editor: Glenn Ricart
[email protected]
Question: What grows at 11% per month for years at a time?
Answer: The usage of the Internet, the world's largest computer network and an increasingly indispensable research and education tool.

The Internet connects more than one million computers in some 70 countries. Usage has been growing at a minimum of 11% per month for more than five years. The core of that usage has been from higher education and the research portion of industry. If you read Computers in Physics, you almost certainly have an Internet electronic-mail account and have mastered at least its fundamentals. If you read this column and the ones that will follow in this space, you will learn how to use sophisticated information tools to explore the exponentially growing Internet.
Question: If usage of the Internet is expanding faster than the "Big Bang" expansion of the universe, how long will it be before Alpha Centauri has an Internet link?
Answer: Some of the e-mail I receive is so bizarre it may already be originating from Alpha Centauri.
In a time when many papers published in this journal and elsewhere are first written digitally on a computer and then formatted and produced on a computer, it makes sense to distribute them digitally as well. With digital distribution, you can:
1. Have access to an article nearly instantly after it passes final editorial review
2. Search its full text
3. Print a personal copy at your convenience.

In the future, the cost of editorial and distribution work may well be partly or fully covered by revenues from customized advertising; that means the cost to you could turn out to be low or even zero. (You do promise to read the ads in which you indicated an interest, right?) The editors of Computers in Physics have expressed a willingness to explore timely distribution of this journal in electronic format. This column will be both the first to be distributed electronically and also your guide to using electronic information on the Internet.
Where to start?
This month we will look at retrieving the sometimes bewildering profusion of file formats in which you may find things on the Internet. The Internet itself, of course, does not care a whit about file formats. It simply moves bytes of information around. The program that does the moving is called the "file transfer program" or, since you have to type it quite often, just "ftp."

When ftp starts, it requests the name of the distant computer to be accessed. The name will have lots of dots in it, just like the part of the e-mail address that comes after the "at" sign (@). Then, you will need to identify yourself to the distant machine; in all cases this month, you will simply be an "anonymous" user. When the special username of "anonymous" is specified, the machine being accessed accepts any password but appreciates the courtesy of being given your e-mail address as the password. At the completion of these formalities, the ftp program is ready for information on the directory in which the desired file is stored. In a future column I will discuss discovery techniques, but for now, the directory information will be specified. Then comes the fun step: request the file transfer. If you have a high-speed Internet connection, this step goes marvelously fast. Finally, ftp is told to disconnect from the distant computer.

You can retrieve copies of this column. You will need to know that the article is stored:
on machine: umd5.umd.edu
in directory: pub/AIP/CiP
and the filename is: ic-1.txt
This is real information. Try retrieving it on your own computer.

The ftp process looks different on different computers. On a Macintosh, you will do most of the work through menus and pop-up dialog boxes. On a Unix system, it will look something like this (the parts you type, shown in bold in print, follow the %, Name, Password:, and ftp> prompts):
    % ftp umd5.umd.edu
    Connected to umd5.umd.edu.
    220 umd5.umd.edu FTP server (ULTRIX Version 4.1)
    Name (umd5.umd.edu:glenn): anonymous
    331 Guest login ok, send ident as password.
    Password: [email protected]
    230 Guest login ok, access restrictions apply.
    ftp> cd pub/AIP/CiP
    250 CWD command successful.
    ftp> get ic-1.txt
    200 PORT command successful.
    150 Opening data connection for ic-1.txt (128.8.11.212,2645) (24015 bytes).
    226 Transfer complete.
    local: ic-1.txt remote: ic-1.txt
    24015 bytes received in 3.28 seconds (12.78 Kbytes/s)
    ftp> bye
    221 Goodbye.

Hints: Upper and lower case count. You probably will not get to see your e-mail address as you type it; the system thinks it is a password and will not echo it. The messages returned from your computer will vary, but all computers will want the same information:
• Which computer to access (umd5.umd.edu)
• Who you are (an anonymous user with an electronic mail address)
• Where the file is (pub/AIP/CiP)
• The name of the file you want (ic-1.txt)
• That you are done (bye).

If you tried the example, you now have a copy of this column on your own computer. You may use your favorite text editor to examine it or print it. But it does not have bold print or mathematical symbols. For those things, you will need to retrieve the article in a more sophisticated format. Let us examine the formats in which you might find a scholarly paper, beginning with the simple text format we just used.
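The five pieces of information listed above map directly onto a scripted transfer. What follows is only a minimal sketch, assuming Python and its standard ftplib module; the function name fetch_text_file and the ftp_factory parameter are hypothetical, introduced here so the steps can be tried without a network connection, and the file name is taken to be ic-1.txt.

```python
import ftplib


def fetch_text_file(host, directory, filename, email, ftp_factory=ftplib.FTP):
    """Retrieve one text file by anonymous ftp and return its lines.

    ftp_factory is a hypothetical hook for trying the function without
    a network; by default it is the real ftplib.FTP class.
    """
    lines = []
    ftp = ftp_factory(host)                # which computer to access
    try:
        ftp.login("anonymous", email)      # who you are; e-mail address as password
        ftp.cwd(directory)                 # where the file is
        # retrlines transfers in text mode and hands over each line
        # with its end-of-line characters already stripped.
        ftp.retrlines("RETR " + filename, lines.append)
    finally:
        ftp.quit()                         # that you are done ("bye")
    return lines
```

A call such as fetch_text_file("umd5.umd.edu", "pub/AIP/CiP", "ic-1.txt", "you@example.edu") would mirror the session shown earlier; the e-mail address is a placeholder, and nothing here assumes that the machine named in the column still answers.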
Introducing Glenn Ricart
Glenn Ricart is one of those people who think 11% per month is pretty slow. In 1985, he began the modern NSFnet portion of Internet by leading a project to connect the campus-wide networks at 16 southeastern universities. His project, SURAnet, used the TCP/IP technology that ARPA had pioneered to interconnect computer-science departments in the earlier part of the decade but extended it to full-campus networks. Dr. Ricart is director of the Computer Science Center at the University of Maryland, College Park, and assistant vice chancellor for Academic Information Technology for the University of Maryland System. His course in Computer Architecture is carried nationwide on the National Technological University satellite system. His most recent passion is the creation of the Monticello Electronic Library. Just as Thomas Jefferson's library at Monticello was the first substantial library in America and would later form the heart of the national Library of Congress, the new Monticello Electronic Library may become the first substantial electronic library and the model for future national electronic libraries.

Text
Text files are the files sent by e-mail and moved around by default using ftp. Each computer byte represents one English character (upper- or lower-case), a digit, a punctuation mark, or one of 32 control characters. One of these control characters is the carriage return; another is the end-of-line indicator. Since different computers have different ideas about which one of these (or both) should be put between lines, computers exchanging text information on the Internet perform automatic conversions. This is one reason why a text file with a size of 1200 bytes on one computer may occupy 1350 bytes when sent over the Internet to another computer. Happily, all of this happens automatically and like magic.

The only limitations of text files are the characters they can contain. The code table that converts bit values of each byte to an appropriate character is called ASCII, the American Standard Code for Information Interchange. While it has ampersands, parentheses, brackets, and braces, it does not have an umlaut, a cent sign, or a bullet. If your word processor does have these things, you will find that if you export the file as "text," the special characters have probably mutated into something very special indeed or even disappeared. There are several ways to remember what can be stored in a plain-text file. The first clue is the keys on your keyboard; if a symbol you want is embossed on the keycap, it is probably in ASCII. Another clue is available for those who know Fortran. If Fortran will accept it, it is in ASCII. Although most files on the Internet that do not provide any clues about themselves are text files, you always know it is text if the file name ends in ".txt."

This paragraph is going to get a little complicated. Please feel free to skip to the next paragraph. Besides the carriage return and end-of-line characters, there are other control characters that can get into text and make life interesting. The "tab" character seems perfectly harmless, but there is no such thing as a standard tab stop. The most common case is to assume tab stops every four or every eight characters. Finally, keyboard "delete" keys can be wired to generate either "backspace" or "delete." If your "delete" key generates a "backspace," but your computer wants to see a real "delete" before it corrects a typing error, all of your backspaces will go right into your text. While it looked good on your screen when you typed it, the recipient may see screens full of garbage. See your local computer wizard to get your "delete" key to send a "delete," or conversely tell your computer to interpret a "backspace" as the signal to delete the previous character.

You have probably seen several of the so-called emoticons in text. If someone wants to make sure that you realize that something is meant to be humorous, they may follow it with a smiling face :-). (Look sideways.) People have invented dozens of these things; if you see too much punctuation in a row, just look at it sideways with a little imagination.

Text is versatile. Everyone on the Internet can send and receive it. So if you want to send something more complicated, the obvious trick is to encode it into text somehow and send the text. That's exactly the principle behind TeX and PostScript.
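Three points from the Text section (which characters ASCII holds, why a file changes size in transit, and why tabs are unreliable) can be checked in a few lines. This is a sketch in Python under stated assumptions: the function names is_plain_ascii and to_network_text are hypothetical, the sending machine is assumed to end lines with a single line feed as Unix does, and the text in transit is assumed to use the two-byte carriage return plus line feed pair that ftp's text mode uses.

```python
def is_plain_ascii(text):
    """True if every character has an ASCII code (0 through 127)."""
    return all(ord(ch) < 128 for ch in text)


def to_network_text(local_bytes, local_eol=b"\n"):
    """Rewrite each local line end as carriage return + line feed.

    Assumes the sender marks line ends with local_eol (a lone line
    feed here) and that the receiver keeps the two-byte pair.
    """
    return local_bytes.replace(local_eol, b"\r\n")


# Ampersands, brackets, and braces are in ASCII...
print(is_plain_ascii("K2 = OMEGA**2 / C**2 & {[()]}"))   # True
# ...but an umlaut, a cent sign, and a bullet are not.
print(is_plain_ascii("\u00fc \u00a2 \u2022"))             # False

# 150 lines of seven characters plus a one-byte line end: 1200 bytes.
unix_file = (b"abcdefg" + b"\n") * 150
print(len(unix_file), len(to_network_text(unix_file)))    # 1200 1350

# The same tabbed line puts "y" in a different column at 4 and at 8.
print("x\ty".expandtabs(4).index("y"), "x\ty".expandtabs(8).index("y"))  # 4 8
```

The 1200 and 1350 figures match the example in the text only because the toy file was built with exactly 150 lines; a real file grows by one byte per line.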
TeX
You probably already know a lot about encoding mathematics into text. If you write a little Fortran, you could compute:
    K2 = OMEGA**2 / C**2 * EPSILON * MU
which is a formula encoded into text in a way that Fortran can understand. Donald Knuth wrote a language called TeX that allows you to encode mathematical notation into text. For example, you might write:
    $$k^2={{\omega^2}\over{c^2}}\epsilon\mu$$
in text to get the equation:
    k² = (ω²/c²) ε μ
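The two encodings above, one for computing and one for typesetting, can be set side by side. The following is only an illustrative sketch in Python; the function names k_squared and k_squared_tex are hypothetical, and the numbers passed in are arbitrary test values rather than physical constants.

```python
def k_squared(omega, c, epsilon, mu):
    """Evaluate k^2 = (omega^2 / c^2) * epsilon * mu, as the Fortran line does."""
    return omega**2 / c**2 * epsilon * mu


def k_squared_tex():
    """Return TeX source for the same equation as an ordinary text string."""
    return r"$$k^2={{\omega^2}\over{c^2}}\epsilon\mu$$"


print(k_squared(omega=2.0, c=1.0, epsilon=3.0, mu=0.5))   # 6.0
# The TeX source contains nothing outside ASCII, so it travels as text.
print(k_squared_tex().isascii())                          # True
```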
Since physics involves considerable mathematics, and TeX is one of the important time-saving tools for writing papers that involve mathematics, many authors write in TeX or one of its labor-saving macro languages such as LaTeX. The benefit for Internet communication is that TeX (or LaTeX or REVTeX) is written in text, and hence can be easily understood across the network. For this reason, many physics papers are stored on the Internet in TeX. To read one, you retrieve the text version, run it through your own TeX interpreter, and view it or print it. The non-proprietary nature of TeX means that there are several competing software implementations on nearly every possible computer that might be connected to the Internet. Of course, if the recipients have no TeX at all, they will be unable to interpret the TeX file. TeX file names usually end in .tex to remind you that they need to be processed by TeX.

I have made a TeX version of this column. If you retrieve it, you will be able to process it through TeX and get equations, bold print, and other kinds of beautiful stuff. Follow exactly the same steps above, but ask for "ic-1.tex" instead of "ic-1.txt." What you'll retrieve is a TeX input file. Ask a local expert how to process it through TeX at your site and print it. In this issue, Paul Dubois's Scientific Programming column entitled "Making Applications Programmable" is also available as a TeX file: retrieve "sp-1.tex."

Retrieving GhostScript
GhostScript is a product of the Free Software Foundation, which believes in public use of software supported by voluntary contributions. After making an optional contribution to the Free Software Foundation, retrieve GhostScript with this information, if you have a Unix machine:
on machine: prep.ai.mit.edu
in directory: pub/gnu
and the filename is: GhostScript-2.6.1.tar
Because the information is stored in a compressed "tar" file, you will need to retrieve it in the "binary" mode. Just before the "get" command, insert the command "binary." It will look like this:
    230 Guest login ok, access restrictions apply.
    ftp> cd pub/gnu
    250 CWD command successful.
    ftp> binary
    200 Type set to I.
    ftp> get GhostScript-2.6.1.tar
After retrieving GhostScript, you will need to extract the software (tar xvf GhostScript-2.6.1.tar) and compile it (make). If you are unfamiliar with these steps, please consult a local expert.
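The binary-mode retrieval and extraction described in the "Retrieving GhostScript" sidebar can be sketched as a script. This is a minimal sketch assuming Python's standard ftplib and tarfile modules; the names fetch_binary_file, unpack_tar, and ftp_factory are hypothetical, and the compile step (make) is left out.

```python
import ftplib
import tarfile


def fetch_binary_file(host, directory, filename, email, ftp_factory=ftplib.FTP):
    """Retrieve one file by anonymous ftp in binary mode; return its local name."""
    ftp = ftp_factory(host)
    try:
        ftp.login("anonymous", email)
        ftp.cwd(directory)
        with open(filename, "wb") as out:
            # retrbinary sets the transfer type to "I" (what the ftp
            # client's "binary" command does), so no byte is altered.
            ftp.retrbinary("RETR " + filename, out.write)
    finally:
        ftp.quit()
    return filename


def unpack_tar(archive_name, destination="."):
    """Extract a tar archive, much as "tar xvf" would; return the member names.

    Only meant for archives from a source you trust.
    """
    with tarfile.open(archive_name) as archive:
        names = archive.getnames()
        archive.extractall(destination)
    return names
```

With the sidebar's values, the calls would be fetch_binary_file("prep.ai.mit.edu", "pub/gnu", "GhostScript-2.6.1.tar", "you@example.edu") followed by unpack_tar("GhostScript-2.6.1.tar"); the e-mail address is a placeholder.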
PostScript
An even-more-general encoding of a page of information is provided by the PostScript language. PostScript is generated by a word processor or document language and was originally intended to tell a printer how to print your document. Today there are PostScript interpreters that can display the information on your screen or print it on your printer. The PostScript instructions are all in text format. But they can specify very complicated things, including graphics and color images. Anything you see in the pages of Computers in Physics may have been a PostScript image.

One advantage of PostScript is that text, graphics, and pictures and their locations are all encoded into a single file. That means a single text file can be used to represent an arbitrarily complex paper. For example, a paper written in LaTeX could be run through TeX and the TeX output converted to PostScript. This PostScript could then be sent over the Internet and viewed by anyone with a PostScript interpreter even if he or she does not have a TeX compiler.

There are a few catches with PostScript. Interpreters for the language are often owned by Adobe Systems Inc., and therefore PostScript printers tend to cost more. However, the Free Software Foundation does have a freely available interpreter called "GhostScript" that may be retrieved from the Internet (see "Retrieving GhostScript," this page). PostScript files on the Internet tend to have names that end in ".ps." You may retrieve a PostScript version of this article by following the instructions above but asking for "ic-1.ps" instead of "ic-1.txt."
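To make concrete the point that PostScript instructions are themselves plain text, here is a small sketch that writes a one-page PostScript program as a string. It assumes Python; the function name make_postscript_page, the font choice, and the coordinates are illustrative assumptions, not anything taken from the column's own PostScript file.

```python
def make_postscript_page(message):
    """Build a one-page PostScript program as an ordinary text string."""
    # Parentheses delimit PostScript strings and the backslash escapes
    # them, so all three characters are escaped inside the message.
    safe = message.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    lines = [
        "%!PS",                                         # marks the file as PostScript
        "/Helvetica findfont 14 scalefont setfont",     # choose a font and size
        "72 720 moveto",                                # 72 points = 1 inch from the left edge
        "(" + safe + ") show",                          # draw the text
        "newpath 72 700 moveto 300 700 lineto stroke",  # draw a rule beneath it
        "showpage",                                     # emit the page
    ]
    return "\n".join(lines) + "\n"


page = make_postscript_page("Internet Corner (test page)")
print(page)
print(page.isascii())   # True: text plus a drawn line, all carried as ASCII
```

Saving the string to a file whose name ends in ".ps" and opening it with a PostScript interpreter should show one line of text with a rule under it.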
Viewing systems
You have no doubt noted that it takes a bit of dedication to move things around on the Internet and still read them. The problems inherent in moving full documents around have given rise to a new breed of software called "viewing systems." There are several viewing systems: Acrobat, Envoy, and Replica are just three. None is public domain; you must buy them. These systems intercept a simulated "print" of a document and create a file instead. These files can then be moved in binary mode over the Internet and viewed on another system that has matching viewing software. The use of these systems is just beginning, and at present you will find very little on the Internet stored in these formats. Nevertheless, the author believes that these viewing systems are the key to easy and painless document access on the Internet. This document in Adobe Acrobat PDF format is called "ic-1.pdf."
SGML
In the future, another good way to exchange text information will be with the Standard Generalized Markup Language (SGML). In SGML, one marks the text with its purpose: This piece of text is a chapter title; that piece, an author name; this other bit, a reference. Since each piece of text is identified with its meaning, you can do contextual searches. You might reasonably search in every document authored by Vice President Gore instead of having to settle for every document that mentions Gore's name somewhere. Style manuals can then convert the SGML into appropriate TeX or other document languages. SGML is text encoded and hence easily sent over the Internet.
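The difference between a contextual search and a plain one can be shown on a toy scale. The sketch below assumes Python; the tag names (title, author, body), the two sample documents, and the functions field and search are all hypothetical, and a regular expression stands in for a real SGML parser.

```python
import re

# Two toy documents marked up with hypothetical SGML-style tags.
DOCUMENTS = {
    "speech": "<title>Information Highways</title><author>Al Gore</author>"
              "<body>A sample body that names no one.</body>",
    "report": "<title>Network Notes</title><author>J. Smith</author>"
              "<body>This sample body mentions Gore once.</body>",
}


def field(document, tag):
    """Return the text inside every <tag>...</tag> pair in a document."""
    pattern = r"<{0}>(.*?)</{0}>".format(re.escape(tag))
    return re.findall(pattern, document, re.S)


def search(term, tag=None):
    """Names of documents containing term, optionally within one tag only."""
    hits = []
    for name, text in DOCUMENTS.items():
        pieces = field(text, tag) if tag else [text]
        if any(term in piece for piece in pieces):
            hits.append(name)
    return hits


print(search("Gore"))                 # ['speech', 'report']
print(search("Gore", tag="author"))   # ['speech']
```

The plain search returns both documents; restricting the search to the author field returns only the one that was authored by, rather than merely mentioning, the name.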
Binary files If two people on the Internet have the same word proc-
essor, they can interchange the binary files generated by that word processor. One tells the ftp to leave such files alone by invoking the "binary" mode of transmission; that makes the file-transfer program stop looking for end-of-line characters and such and simply transfer the bytes "as is." Ifyou are both using WordPerfect version 5.1, or you are both using Chi Writer, this will work fine. However if one person has WordPerfect 6.0 and sends a 6.0-format binary file to someone with WordPerfect 5.1 over the Internet, the file will be unreadable. Ifyou cannot do it with disk interchange, you probably cannot do it with binary files on the Internet. One of the common binary file formats is the Rich Text Format (RTF). This article has been saved in that format under "ic-Lrtf." If you have a word processor that handles rtf files, please give it a try, remembering to use the binary command.
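Why a word-processor file must be left alone in transit can be seen by applying the text-mode end-of-line conversion to bytes that are not text. This is a simulation only, written in Python under stated assumptions: the sender is taken to be a Unix machine (lines end in a line feed), the receiver a machine that ends lines with carriage return plus line feed, and the seven sample bytes are made up rather than taken from any real file format.

```python
def text_mode_send(local_bytes, local_eol):
    """Sender side of a text transfer: local line end becomes CR LF."""
    return local_bytes.replace(local_eol, b"\r\n")


def text_mode_receive(network_bytes, local_eol):
    """Receiver side of a text transfer: CR LF becomes the local line end."""
    return network_bytes.replace(b"\r\n", local_eol)


# Made-up "word processor" bytes that happen to include 0x0A and 0x0D.
document = bytes([0x57, 0x50, 0x0A, 0x00, 0x0D, 0x0A, 0xFF])

# Text mode: Unix sender (LF) to a receiver whose line end is CR LF.
as_text = text_mode_receive(text_mode_send(document, b"\n"), b"\r\n")

# Binary mode: the bytes are transferred "as is."
as_binary = document

print(as_text == document, as_binary == document)   # False True
print(len(document), len(as_text))                  # 7 9
```

Every byte that merely looked like a line end picked up a companion byte, which is harmless in a text file and fatal in a file whose layout depends on exact byte positions.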
Wrap-up
A scholarly paper on the Internet can be stored in many different forms. The most universal form, text, has the least richness. There are several formats that include detailed text formatting, graphics, and images, and they require special software for viewing or printing. TeX and PostScript are among those with the most widely available viewing and printing programs, and so they make good choices for storing papers.
CIP ONLINE
HOW TO GET CIP SOURCE CODE
Computers in Physics makes source code available on the Internet for some articles. In this issue, these articles include "Catching the Right Bus, Part 2: Using a Parallel Printer Adapter as an Inexpensive Interface," on page 45 and "Controlling Chaos," on page 62. The AIP holds copyright on this software. To obtain Computers in Physics source code, use an anonymous file-transfer-protocol (ftp) procedure, as described below. Please note that ftp is an Internet protocol and not available on Bitnet and other electronic mail services.
Procedure
1. Ftp to pinet.aip.org. If you have questions about this procedure, please contact your system administrator.
2. When prompted for name, type "anonymous," and when prompted for password, type your e-mail address.
3. Type "cd cip_sourcecode" to go to the cip_sourcecode subdirectory. Type "dir" to list the files in this subdirectory. The CONTENTS file contains an up-to-date directory of source-code listings with cross-references to the relevant Computers in Physics articles. To view the CONTENTS file on your screen, use "get CONTENTS -". One key sequence pauses the scrolling of the text file on the screen; another continues it.
4. After deciding which file to copy, use the "get" command to transfer it from the cip_sourcecode subdirectory. For example, if the name of the file is "foo," use the command "get foo". This command will move foo into your local directory. If you are unable to retrieve the file in this way, you may have to rename it with the command "get source-file destination-file". For example, type "get foo schmoo".
5. To view a file before copying it, use the procedure described in step 3 above.
6. Type "bye" at the prompt to exit ftp.

The following is a near duplication of what should appear on your screen:
    ftp pinet.aip.org
    (various messages)
    Name (pinet.aip.org:...): anonymous
    Password: (your e-mail address)
    ftp> cd cip_sourcecode
    ftp> dir
    ftp> get CONTENTS
    ftp> get foo
    ftp> bye

In addition to making source code available, Computers in Physics posts previews of forthcoming issues, including advance abstracts of peer-reviewed articles, on AIP's information service PINET. PINET subscribers may access the previews by entering the command "go cip".
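Steps 2 through 6 of the source-code procedure can also be scripted. The following is a sketch only, assuming Python's standard ftplib module; the function name browse_and_fetch and its ftp_factory and out parameters are hypothetical, and "foo" and "schmoo" are the column's own placeholder file names.

```python
import ftplib
import sys


def browse_and_fetch(host, email, source_file, destination_file,
                     ftp_factory=ftplib.FTP, out=sys.stdout):
    """List the source-code directory, show CONTENTS, and copy one file."""
    ftp = ftp_factory(host)                    # step 1
    try:
        ftp.login("anonymous", email)          # step 2
        ftp.cwd("cip_sourcecode")              # step 3
        names = ftp.nlst()                     # like "dir", but file names only
        # Like "get CONTENTS -": send the text to the screen, not to a file.
        ftp.retrlines("RETR CONTENTS", lambda line: print(line, file=out))
        # Like "get foo schmoo": store the file under a different local name.
        with open(destination_file, "wb") as local:
            ftp.retrbinary("RETR " + source_file, local.write)   # step 4
    finally:
        ftp.quit()                             # step 6
    return names
```

A call such as browse_and_fetch("pinet.aip.org", "you@example.edu", "foo", "schmoo") would follow the printed procedure; the e-mail address is a placeholder.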
COMPUTERS IN PHYSICS, VOL. 8, NO. 1, JAN/FEB 1994