This chapter is devoted to the hours of the sixth week. Now that we have some information and experience about the Internet utilities, it is time to turn to the World Wide Web utilities and Web design. We put the emphasis on the HyperText Markup Language (HTML) for Web design, although other, more recent facilities can also be found on the Internet.
The HTML part of this chapter is mostly based on the Turkish document written by
Fahrettin Önal
Written by
Metin Demiralp
Istanbul Technical University, Informatics Institute,
Maslak (80626, Istanbul, Türkiye)
Version 0.60
World Wide Web (WWW) Utilities
Some Terms and a Short History of WWW
Before giving a clear definition of the World Wide Web (WWW), it is better to explain what hypertext is. A hypertext, or hyperdocument as it is sometimes called, is an advanced type of text whose pages need not be accessed sequentially: links make shortcuts between the pages. This is unlike a book, where the knowledge or data is given in an ordered fashion, so that getting a complete idea of the content requires proceeding sequentially through the pages. However, when one tries to build a single information resource by referring to several different books, the reference books need not be accessed in any fixed order. The order may change depending on how the search progresses. The important thing in this case is the possibility of branched access routes to each individual resource. Hypertext resembles this structure. It is an integral structure composed of individual pages which may be linked to each other in many different ways. This gives hypertext the advantage of very flexible composition, which facilitates access to different resources. Encyclopedic information is well suited to a hypertext structure and is perhaps the most natural example.
A hypertext is not only a text-based document. Besides text, static pictures and similar objects, it may contain motion pictures, sound, and combinations of these.
Hypertext structures are very convenient for searching data through computer networks. Since a hypertext may contain images, animated or not, accessing its pages from a remote point in the network requires the transfer of datagrams carrying not only plain text but also the binary data of graphic and sound objects. The host computer which holds hypertext pages and serves them to remote clients is called a web server. The name comes from the fact that the collection of hypertexts on this computer, together with their links, looks like a web. Here the word web does not mean only a spider's web; it has the more general meaning of something woven, something similar to a tissue. In our case the fabric is woven from text, pictures, color, moving objects on the display, and sound. A web server must have one or more hypertext home pages. It is accessed through standard protocols like TCP/IP and is based on the client/server architecture. Because of the hypertext structure it is possible to establish links between documents on the same server and, furthermore, between pages on different servers.
The entire collection of web servers all over the Internet is called the World Wide Web, abbreviated as WWW.
The WWW was first proposed by Tim Berners-Lee, a scientist at the European Laboratory for Particle Physics, CERN. The acronym comes from the French name of the institution, Conseil Européen pour la Recherche Nucléaire; the laboratory is near Geneva in Switzerland, where most people speak French. At the very beginning Berners-Lee's purpose was to develop a method for making scientific papers and graphical images available to other scientists through the Internet. He also developed the related concepts and tools: HTML (HyperText Markup Language), HTTP (HyperText Transfer Protocol), and URL (Uniform Resource Locator), and he was the first to call the new structure the World Wide Web. His project was approved in September 1990. He selected the NeXT workstation as a platform because of some of its suitable properties. Berners-Lee wrote the first WWW server and the first text and graphical client, although graphics were not embedded in the text but shown in a separate window. His software also included an HTML editor. The browser became available in late 1990, and the whole software package was made available through the Internet in the summer of 1991.
The software used to access a web server and make use of the documents there is called a browser; it is also frequently called a client. The first widely used browser was written at the University of Illinois by a group that included Marc Andreessen, then an undergraduate student, and Eric Bina. The software was called Mosaic because it deals with a mosaic of text, graphics, and images. Mosaic was written in late 1992 and early 1993. It was designed for Silicon Graphics workstations. The first public version of Mosaic was released in March 1993. In April 1994, Marc Andreessen and Jim Clark, the founder of Silicon Graphics, started Netscape Communications Corporation. Soon after the establishment of this corporation Netscape Navigator was born. At the time of writing, the most widely used browsers are Netscape Navigator and Internet Explorer.
Web Servers
Web servers use HTTP as their protocol. A secure version of HTTP, called S-HTTP, is also partially available. Under Linux or Unix systems HTTP is served by a daemon, that is, a program which runs in the background and becomes active to serve clients when a request comes. It is called the HyperText Transfer Protocol Daemon and has the name httpd. Although various web servers are available, perhaps the most popular one is the Apache server. It is freely available software and is easy to install and maintain. There is a lot of documentation on this and other servers on the Internet and at booksellers. We do not intend to give more details about web servers because they are outside the scope of this book.
Web Design
Hyper Text Markup Language (HTML)
Everybody who has Internet access may have a web site. To construct pages like those you encounter when you browse the World Wide Web, you must first of all have a web server, or access to a web server, where you can place the pages you are going to create. The World Wide Web is based on client/server technology, at least at the moment we write this document. A web server is a platform which provides the necessary support to the clients, that is, to the users who request access to the web server and want to see the web pages it presents through a browser such as the best-known ones, Netscape and Internet Explorer. Almost all Internet Service Providers offer their customers space for personal web pages and help with the construction of sites. This means that if you have access to the Internet then you are ready to construct your own personal web pages and therefore your own web site.
Web pages must be placed in specific directories. You should learn from your Internet Service Provider where to place these pages. In addition, you should learn the URL (Uniform Resource Locator) which clients must use to access your site. You may send the documents for the pages you prepared to your Internet Service Provider via e-mail so that they can install them in the necessary location. On the other hand, there are some web sites whose maintainers permit you to install your web files without requesting any payment. Some sites of this type are given below via their URLs.
What is HTML?
HTML stands for HyperText Markup Language. It is a collection of rules which permit us to create files that web browsers can interpret. A web browser takes action according to the content of the file it reads from the data stream coming from the server to the client. With HTML it is possible to use different fonts and colors and to create specifically formatted documents as well as plain text. Pictures, drawings and animated objects can be displayed via HTML. Tables, forms and frames can also be created, and their properties can be controlled to build precisely laid-out displays. It is also possible to create data input and output connections between the server and the client, at the level of permission granted by the web server.
An HTML file is composed of tags, which may be considered as commands for the HTML operations, together with text, image, and sound materials. The file name generally carries the four or five character suffix .htm or .html (counting the dot); the latter is used mostly on Unix-based systems, while the former is used mostly on the operating systems developed by Microsoft.
We are going to learn more about these tags as we proceed below.
Editors
The software used to edit an HTML file can be classified into three groups:
1) The first group is composed of software capable of editing only text. Since an HTML file is a text file, it is possible to open it and edit the code directly with this type of editor. This does not require a special operating system or a special editor. You can simply use the text editors available under the operating system you use. Examples are wordpad, notepad, edit, joe, vi, emacs, and pico.
2) The editors of the second group create the HTML code for the user. These editors are assumed to work under a window system. The user selects the desired elements by mouse clicks, and the editor writes the corresponding HTML code into the target file. HotdogPro, Bluefish, and Homesite can be given as examples of this type of editor.
3) The third group is composed of advanced editors. You can place objects with the editor by the drag-and-drop method. These editors are based on the What You See Is What You Get (WYSIWYG, pronounced like wizzy-wig) philosophy. Hence you can directly see what you are doing, and you can also see the code created correspondingly and revise it if you need to. Frontpage and Staroffice can be given as examples of this group of editors.
You may ask the question "Okay, I can design web pages with these editors. So why do I have to know about HTML?". You are right for very simple pages, where you may not really need HTML. However, it is better to know the codes in detail when the page under design becomes complicated, since this knowledge gives you full control over the design of the web page.
Some word processors, like Microsoft Word, are equipped with the capability of saving certain kinds of files as HTML code. However, it is better not to use these facilities because the output HTML file may be swollen by unnecessary blocks and repetitions, that is, the output HTML file becomes larger than it should normally be. This also makes the resulting code unusually complicated.
What are the Expected Fundamental Properties for a Web Site?
The fundamental properties we expect from a web site can be itemized as follows.
The access to the web site should be as rapid as possible.
The text material must be readable in size and in content.
The color selections must be comfortable for the eye.
The content of the document must be easy to perceive and digest, and must be satisfactory.
The visitor of the site must be able to arrive easily at the target location through the shortest possible path.
There must be harmony in the design of the web site. Extraordinarily plain structures and abnormally crowded designs must be avoided.
What is Tag?
HTML is a language, hence it has commands or command-like structures. The command-like structures of HTML are called tags. Each tag starts with a less than symbol < and ends with a greater than symbol >. Tags can be classified into two groups:
1) The tags of this group exist with their conjugates, in which the opening less than symbol < is replaced by the symbols </. The tag itself starts a well-defined action which continues until the tag's conjugate is encountered. Everything between the tag and its conjugate is affected by the action defined by the tag-conjugate pair. The use of the conjugate may be optional for some tags.
2) The tags of this group stand alone. Their action is taken at once, in a single step. The tag itself is used for the corresponding action.
Tags are case insensitive, and tag-conjugate pairs can be nested in such a way that the action of an inner tag-conjugate pair overrides the same type of action of an outer tag-conjugate pair.
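As a small illustration of the two groups, the fragment below is only a sketch with made-up text. It assumes the boldface, italic, line break and font tags which are described later in this chapter: <B>, <I> and <FONT> are used with their conjugates, <BR> stands alone, and the inner <FONT> pair overrides the color set by the outer one.

```html
<!-- Paired tag: everything up to </B> is printed in boldface -->
<B>This sentence is bold.</B>
<!-- Single tag: breaks the line at once and has no conjugate -->
<BR>
<!-- Nesting: the inner FONT pair overrides the color of the outer pair -->
<FONT COLOR="blue">
  This text is blue,
  <FONT COLOR="red">this part is red,</FONT>
  and this is blue again.
</FONT>
<BR>
<!-- Tags are case insensitive: <i> behaves like <I> -->
<i>This sentence is italic.</i>
```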
Some Tags of Vital Importance
The content and the necessary tags of an HTML file are located between the <HTML> and </HTML> tags. Some browsers, especially under Linux, tolerate the omission of the <HTML> tag and its conjugate </HTML> without any problem. However, the existence of browsers which strictly check these tags urges us not to skip them. Therefore it is better to write the <HTML> tag at the beginning of the HTML file and to insert its conjugate </HTML> at the end of the same file.
There must be a <BODY> </BODY> tag-conjugate pair between the <HTML> and </HTML> tags. The main content, and the tags needed for acting on this content, are given between the <BODY> and </BODY> tags. As with the <HTML> and </HTML> tags, these tags are mandatory for all HTML structures except the frame structure, although some advanced browsers may not require them.
The <HEAD> and </HEAD> tags are used between the <HTML> and </HTML> tags and are given before the <BODY> tag. Information about the document, such as its character set, and script file information (if any) or script codes may be given between the <HEAD> and </HEAD> tags.
To specify the title string which is displayed in the title bar of the browser window, one should insert this string between the <TITLE> and </TITLE> tags, which are located somewhere between the <HEAD> and </HEAD> tags.
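Put together, the tags described in this section give the skeleton of every page. The following is a minimal sketch; the title string and the body text are placeholders chosen only for illustration.

```html
<HTML>
<HEAD>
  <!-- This string appears in the title bar of the browser window -->
  <TITLE>My First Page</TITLE>
</HEAD>
<BODY>
  <!-- The visible content of the page is given here -->
  Hello, World Wide Web!
</BODY>
</HTML>
```

Saving this text in a file with the suffix .html and opening it with a browser should display the single line of body text.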
The <META> Tag
This tag is used inside the domain of the <HEAD> tag. It has no conjugate. The <META> tag accepts some parameters, among which we can mention NAME, CONTENT, and HTTP-EQUIV. The use of these parameters can be shown through the following examples:
The tag <META HTTP-EQUIV="Content-Type" CONTENT="text/html;charset=iso-8859-9"> selects the Turkish character set as soon as the page is accessed. Here it is possible to use windows-1254 instead of iso-8859-9.
The tag <META NAME="Keywords" CONTENT="html,htm,personal web page, web"> is used to describe the content of the page through keywords.
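Placed in context, the two tags above might sit in the head of a page as in the sketch below. The title and the body text are placeholders, and the Turkish sample sentence is only an assumption made to show characters which need the iso-8859-9 character set.

```html
<HTML>
<HEAD>
  <!-- Tells the browser which character set the page is written in -->
  <META HTTP-EQUIV="Content-Type" CONTENT="text/html;charset=iso-8859-9">
  <!-- Keywords describing the content of the page -->
  <META NAME="Keywords" CONTENT="html,htm,personal web page, web">
  <TITLE>Personal Web Page</TITLE>
</HEAD>
<BODY>
  İstanbul Teknik Üniversitesi
</BODY>
</HTML>
```

The file itself must of course be saved in the same character set that the <META> tag announces.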
The <BODY> Tag
This tag also accepts parameters. They are listed below:
BGCOLOR: This parameter specifies the background color. You can give either the name of the color as a string or its hexadecimal code preceded by a hash mark (#).
BACKGROUND: This parameter places a picture in the background. The picture is specified by an image filename, and the image file must be in either GIF or JPEG format. If necessary, the path of the image file can be included. If an image filename is specified but the file cannot be accessed, the value of BGCOLOR is used to paint the background instead.
TEXT: This parameter specifies the general color of the text material.
LINK: This parameter specifies the color of the text of a link.
VLINK: This parameter is also related to links. Its value is the color used for the text of a link which has been visited before.
ALINK: This parameter is also related to links. Its value is the color used for the text of a link at the moment it is clicked with the mouse.
The use of these parameters is optional and independent of order, that is, you can give them in any order. We recommend that you give a BGCOLOR if you do not cover the background with a picture; otherwise you may get a quite ugly appearance on your page, since the default BGCOLOR of the browsers is generally not very attractive.
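The parameters of the <BODY> tag can be tried out with a page like the one below. This is a sketch only: the file names paper.gif and next.html and the chosen colors are assumptions made for the example.

```html
<HTML>
<HEAD><TITLE>Body parameters</TITLE></HEAD>
<!-- BGCOLOR is the fallback used when paper.gif cannot be loaded -->
<BODY BGCOLOR="#FFFFCC" BACKGROUND="paper.gif" TEXT="black"
      LINK="blue" VLINK="purple" ALINK="red">
  Plain text is black.
  <!-- Blue before the first visit, red while clicked, purple afterwards -->
  <A HREF="next.html">This link</A> changes its color with its state.
</BODY>
</HTML>
```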
Writing Through HTML
One of the basic functions of HTML is of course text creation, that is, writing. There are HTML tags which are directly related to writing operations. Some of them are given below.
The <FONT> Tag
This tag has a conjugate, </FONT>, and is used to control the use of fonts. <FONT> accepts the parameters listed below:
FACE: This parameter specifies the type of the font to be used.
SIZE: This parameter specifies the size of the font to be used.
COLOR: This parameter specifies the color of the font to be used. The color is given either by name or by its hexadecimal representation preceded by a hash mark (#).
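A short sketch of the three parameters follows. The font name Arial and the sample text are assumptions; in the browsers of this period SIZE takes values from 1 to 7, and a value with a sign is interpreted relative to the default size.

```html
<!-- FACE, SIZE and COLOR can be combined in any order -->
<FONT FACE="Arial" SIZE="5" COLOR="#CC0000">Large red text in Arial</FONT>
<BR>
<!-- A signed SIZE changes the size relative to the default -->
<FONT SIZE="-1" COLOR="green">Slightly smaller green text</FONT>
```

If the font named in FACE is not installed on the client's computer, the browser falls back to its default font.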
Some Font Controlling Tags
If a boldface field is desired in the text, the field must be surrounded by the <B> and </B> tags. The <STRONG> and </STRONG> tags can also be used for the same purpose.
If an italic field is desired, the field should be surrounded by the <I> and </I> tags.
A field surrounded by the <U> and </U> tags is underlined.
A field which should blink continuously is enclosed between the <BLINK> and </BLINK> tags.
Mathematical typesetting is very limited under HTML. However, it is possible to use subscripts and superscripts. You can use the <SUB> and </SUB> tags for subscripts and the <SUP> and </SUP> tags for superscripts.
HTML enables us to increase or decrease the size of the characters in a field. There is a default size scale, and each size is defined in points, a typographical unit (72 points = 1 inch). The <BIG> and </BIG> tags increase the size of the characters in the field they enclose by one level in the scale, whereas the <SMALL> and </SMALL> tags decrease it by one level.
You may want to display some text just as it stands. Then you need to use the <PRE> and </PRE> tags. By default, these tags use a typewriter font, whose characters all have the same width regardless of their shape. This is very useful for preserving the layout of the text in the HTML output. If you force HTML to use some other font between these tags, the appearance of the text given between them changes.
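The font controlling tags can be seen side by side in the fragment below, which is a sketch with arbitrary sample text. The <BLINK> tag is left out because a number of browsers do not support it.

```html
<B>Bold</B>, <STRONG>also bold</STRONG>,
<I>italic</I> and <U>underlined</U> text.
<BR>
<!-- Subscript and superscript -->
H<SUB>2</SUB>O and E = mc<SUP>2</SUP>
<BR>
<!-- One level up and one level down in the size scale -->
<BIG>bigger</BIG> normal <SMALL>smaller</SMALL>
<!-- PRE keeps the spaces and line breaks exactly as typed -->
<PRE>
Name      Age
Ali        25
Ayse       31
</PRE>
```

Because the characters between <PRE> and </PRE> have equal width, the two columns stay aligned in the display.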
Lists
You can design lists via HTML. An itemized list is formed by placing the content of the list between the <UL> and </UL> tags, which play the role of an envelope. Each item of the list is given on a line which starts with the <LI> tag (it can be used as a single tag). This type of list does not number its items. For a numbered list the <UL> and </UL> tags must be replaced by the <OL> and </OL> tags. The items are written as before, but the output contains a number at the beginning of each item. The numbering is automatic and starts from 1 by default.
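Both kinds of list are sketched below; the items are made up for the example. The first list is displayed with bullets, the second one with the numbers 1, 2, 3.

```html
<!-- Itemized list: items are marked, not numbered -->
<UL>
  <LI>Text editors
  <LI>Code generating editors
  <LI>WYSIWYG editors
</UL>
<!-- Numbered list: the numbers are produced automatically -->
<OL>
  <LI>Write the page
  <LI>Send it to the server
  <LI>View it with a browser
</OL>
```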
Paragraph Tags
In HTML paragraphs are created by placing the content of the paragraph between the <P> and </P> tags. The ALIGN parameter is used to control the alignment of the paragraph. For example, it is possible to center the whole paragraph.
The <BR> tag, which is used alone, breaks the current line.
The <HR> tag, which does not have a conjugate, is used to create a horizontal rule. The features of the rule, like its color and length, can be controlled via appropriate parameters.
The <CENTER> and </CENTER> tags can be used to center the field given between them.
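The paragraph tags can be sketched with the standard tags <P>, <BR>, <HR> and <CENTER>. The sample text and the parameter values are assumptions; WIDTH and SIZE are the usual parameters for the length and thickness of a rule.

```html
<P ALIGN="center">This whole paragraph is centered.</P>
<P>The first line of a second paragraph,<BR>
and its second line after a forced break.</P>
<!-- A rule covering half of the window width, 3 pixels thick -->
<HR WIDTH="50%" SIZE="3">
<CENTER>This line is centered by the CENTER tags.</CENTER>
```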
Imagefiles
Pictures or drawings, that is, graphical objects, can be stored in files of various formats. Basically, these files contain binary information, the bitmap of the object. For this purpose the picture, which is assumed to be two dimensional, is divided into pixels by laying a grid over it. The pixel is the smallest unit of a display. It is generally assumed to be square, since almost all monitors operate by scanning the screen line by line horizontally with an electron beam. If the display is black and white then each pixel is assigned either 0 or 1 according to its color. The size of a pixel of the display medium depends on the resolution of the medium. It is therefore possible to create a pattern of 0s and 1s for displaying the graphics. This pattern is called a bitmap. The quality of the bitmap increases as the size of the pixel decreases and therefore the number of pixels increases.
In the case of a colored graphical object, the color is resolved in terms of some primary colors, and the choice of primary colors determines the coloring method. HTML uses the RGB method, where the primary colors are red, green, and blue. If a graphical object is colored, the whole plane of the graphic is resolved into three separate patterns, each of which defines a bitmap for one primary color. By overlapping these patterns the entire display is created.
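The RGB method is also what lies behind the hexadecimal color codes used in HTML parameters such as BGCOLOR and COLOR. In the usual #RRGGBB form each pair of hexadecimal digits gives the amount of one primary color, from 00 (none) to FF (full). The short sketch below uses arbitrary sample text.

```html
<FONT COLOR="#FF0000">Full red, no green, no blue</FONT><BR>
<FONT COLOR="#00FF00">Full green only</FONT><BR>
<FONT COLOR="#0000FF">Full blue only</FONT><BR>
<!-- Red and green at full strength overlap to give yellow -->
<FONT COLOR="#FFFF00">Red plus green gives yellow</FONT><BR>
<!-- All three primary colors absent: black -->
<FONT COLOR="#000000">Black</FONT>
```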
Therefore, the display information, that is, the color patterns of a display, can be stored in a file which is usually called an imagefile. Imagefiles do not contain only the color patterns of the image. The bitmaps of the patterns can be compressed by various algorithms to reduce the size of the imagefile, and so the information needed for compressing and decompressing the binary data must also be given in the same file. This creates many possible ways to save an image into a file. Each of these possibilities is called an image format. Although there are many image formats, the most widely used ones are the GIF and JPEG formats. GIF files may contain animated objects. Animation requires displaying many frames rapidly and consecutively to give the impression of moving objects. It is based on the inability of the human eye to distinguish pictures which follow each other rapidly. For example, when 25 pictures are displayed per second the human eye fails to distinguish each individual picture. Hence, if the series of pictures contains the consecutive instants of a motion then, when they are displayed at 25 pictures a second, the viewer gets the impression of watching the original motion continuously. Animated objects are thus bundles of snapshots of a motion. GIF files can contain these and, although some other formats can do the same, most browsers can display only animated GIF files. Animated or not, GIF filenames carry the four character suffix .gif.
Files in the other format, JPEG, can also be displayed by the most common browsers. JPEG filenames carry the four character suffix .jpg.
It is possible to make one color of a picture transparent. For example, you can make the background color, which is usually white, transparent. This transparency becomes very important when we want images to overlap. If the upper image is transparent in a color then the pixels of the lower image show through wherever the upper image has that color. This feature is available in the GIF format. To make an image transparent you can use several utilities. One of the well-known packages is ImageMagick. It is publicly available without any payment for many operating systems, including Linux of course. SuSE Linux has this package, and all you need to do to make a file transparent is to use the command convert.
The JPEG format is capable of saving images in higher quality than GIF files. The sizes of imagefiles vary depending on the structure of the images. The size of an imagefile becomes unavoidably large if the image is large and/or its content is rich in detail. Since file sizes determine the speed of access to a web site, it is desirable to keep imagefiles as small as possible. There are some optimization utilities which can decrease imagefile sizes without any noticeable loss of quality. We do not go into details about these.
The <IMG> Tag
This is a single tag and is used to insert an image into the display of an HTML file. It accepts several parameters, which are listed below.
SRC: This parameter gives the location of the image file, that is, its filename (with a path if necessary) or its URL.
WIDTH: This parameter defines the width of the displayed image. Several units can be used, but the pixel is the default.
HEIGHT: This parameter defines the height of the displayed image.
BORDER: It is possible to surround the picture with a frame. This parameter defines the thickness of the border of this frame.
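ALIGN: This parameter defines the position of the image with respect to the surrounding text, either horizontally (LEFT, RIGHT) or vertically (TOP, MIDDLE, BOTTOM).
VALIGN: This parameter defines the vertical position of the image. In standard HTML it is accepted by table cells rather than by the image tag itself, so the vertical position of an image is usually given through ALIGN.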
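A sketch of the image tag with its parameters is given below. The standard HTML name of the tag is <IMG>; the file names logo.gif and images/photo.jpg and the sizes are assumptions made for the example.

```html
<!-- logo.gif is assumed to be in the same directory as the page -->
<IMG SRC="logo.gif" WIDTH="120" HEIGHT="80" BORDER="2" ALIGN="left">
This text flows along the right side of the image because of ALIGN="left".
<!-- CLEAR="all" continues below the image instead of beside it -->
<BR CLEAR="all">
<!-- The file can also be given with a path; BORDER="0" removes the frame -->
<IMG SRC="images/photo.jpg" WIDTH="300" HEIGHT="200" BORDER="0">
```

Giving WIDTH and HEIGHT lets the browser reserve the place of the image before the file itself arrives.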
Links
It is possible to make links between HTML documents. This property gives HTML its hypertext character. To make a link the <A> and </A> tags are used. Text or an image can be given between these tags. The <A> tag accepts parameters, which are listed below:
HREF: This is the location of the node to be linked. It can be either a filename or a URL. The filename can include the path to the file if necessary. The URL may define the type of the link. For example, to link to the web site of the Informatics Institute of Istanbul Technical University one can use http://www.be.itu.edu.tr, while the URL ftp://ftp.be.itu.edu.tr can be used to link to the ftp site of the same institute. If the value of this parameter has the form mailto: followed by your e-mail address, then the client can send an e-mail to you with its mail program by clicking on the link in the display created by its browser.
TARGET: This parameter describes where the linked node is displayed. There are four special values for this parameter:
_blank: A new browser window is opened and the linked page is displayed in this window.
_self: The linked page is opened in the same frame or window as the link.
_parent: The linked page is opened in the same browser window, in the parent of the current frame.
_top: The linked page is opened alone in the same browser window. It is also possible to give a frame name to this parameter directly.
STYLE: This parameter can be used to change the style of the link. For example, it is possible to remove the underline of the field given between the <A> and </A> tags. For this purpose one can use STYLE="text-decoration:none". This may help you with the visual design of your page.
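The different kinds of link can be sketched with the anchor tags <A> and </A> as follows. The file names page2.html and arrow.gif and the mail address someone@example.com are placeholders; the two institute URLs are the ones quoted in the text.

```html
<!-- Link to another file on the same server -->
<A HREF="page2.html">Next page</A><BR>
<!-- Links to a web site and to an ftp site by URL -->
<A HREF="http://www.be.itu.edu.tr">Informatics Institute</A><BR>
<A HREF="ftp://ftp.be.itu.edu.tr">Ftp site of the institute</A><BR>
<!-- Clicking starts the mail program of the client -->
<A HREF="mailto:someone@example.com">Send me an e-mail</A><BR>
<!-- Opens in a new browser window and is shown without underline -->
<A HREF="page2.html" TARGET="_blank" STYLE="text-decoration:none">Next page in a new window</A><BR>
<!-- An image can be the clickable field as well -->
<A HREF="page2.html"><IMG SRC="arrow.gif" BORDER="0"></A>
```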
The <MAP> Tag
It is possible to create links from regions of an image to different nodes. In this way different regions of an image can be linked to different documents or URLs. The client usually does not notice this feature. The following example explains how to use this tag.
IMAGE FILE
where the <MAP> and </MAP> tags are used to specify the regions on the image. These tags accept the parameter NAME, which is used for the identification of the map. The inner tag <AREA> is used to define a region in detail. It accepts parameters like SHAPE, which defines the shape of the region, and COORDS, which gives the coordinates needed for a unique specification of the region.
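A possible image map is sketched below with the standard <MAP> and <AREA> tags. The image file menu.gif, its size, the map name menu and the linked files are all assumptions. The USEMAP parameter of the image tag, which is not discussed above, is the standard way to connect an image to a named map.

```html
<!-- USEMAP connects the image to the map named "menu" -->
<IMG SRC="menu.gif" WIDTH="200" HEIGHT="100" BORDER="0" USEMAP="#menu">
<MAP NAME="menu">
  <!-- Left half of the image: rectangle given as x1,y1,x2,y2 in pixels -->
  <AREA SHAPE="rect" COORDS="0,0,100,100" HREF="left.html">
  <!-- A circle in the right half: center x, center y, radius -->
  <AREA SHAPE="circle" COORDS="150,50,40" HREF="right.html">
</MAP>
```

The coordinates are measured in pixels from the upper left corner of the image.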
Tables
Tables are cellular structures. They increase the visual effectiveness of pages and make the layout of a page easy to handle. These structures enable us to arrange things not only in the horizontal but also in the vertical direction.
It is also possible to give a background color or a background design to the whole table or to its cells individually. The BORDER parameter is used to set the border of the frame of the table. The space inside a cell is controlled by the CELLPADDING parameter, while CELLSPACING controls the space between neighboring cells.
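The parameters named above belong to the <TABLE> tag. The sketch below also uses the standard row, header cell and data cell tags <TR>, <TH> and <TD>, which are not introduced in the text; the content of the cells is only sample data.

```html
<TABLE BORDER="1" CELLPADDING="5" CELLSPACING="2" BGCOLOR="#EEEEEE">
  <!-- Each TR pair is one row of the table -->
  <TR>
    <TH>Format</TH>
    <TH>Suffix</TH>
  </TR>
  <TR>
    <TD>GIF</TD>
    <!-- A single cell can have its own background color -->
    <TD BGCOLOR="yellow">.gif</TD>
  </TR>
  <TR>
    <TD>JPEG</TD>
    <TD>.jpg</TD>
  </TR>
</TABLE>
```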
Frames
Let us consider the case where a page contains many pictures or similar structures, that is, images. Assume that we want to switch between this page and some subsequent ones several times. Each return from a subsequent page to the main page will take a long time if the imagefiles are many and large. Instead, we can divide the main page into frames and keep the frame which contains most of the images constant. Then another frame, not the main one, can be used for switching between the pages. To create a framed structure we need a main page and as many subsequent pages as there are frames.
The construction of a frame structure can be exemplified as follows
where the <FRAMESET> tag and its conjugate </FRAMESET> play the role of an envelope, and all structures related to the framing are given between these tags. The first line of the above code separates the window into two regions, the left of which covers 30 percent of the entire window.
The second and third tag pairs, which load the left.html and right.html files, display their outputs in the left and right frames respectively. The NAME parameters give names to the individual frames. These frame names can be used to specify link targets. The SCROLLING parameter can be used to create scroll bars for each individual frame. The COLS and ROWS attributes (parameters) can be used to partition each frame into columns and/or rows.
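A main page matching this description might look like the sketch below. The frame names left and right, the SCROLLING values and the title are assumptions; only the 30 percent split and the file names left.html and right.html come from the text. The <NOFRAMES> part is a standard addition for browsers which cannot display frames.

```html
<HTML>
<HEAD><TITLE>A page with two frames</TITLE></HEAD>
<!-- Two columns: the left one covers 30 percent of the window -->
<FRAMESET COLS="30%,70%">
  <FRAME SRC="left.html" NAME="left" SCROLLING="auto"></FRAME>
  <FRAME SRC="right.html" NAME="right" SCROLLING="yes"></FRAME>
  <!-- Displayed only by browsers which cannot show frames -->
  <NOFRAMES>
    <BODY>This page uses frames.</BODY>
  </NOFRAMES>
</FRAMESET>
</HTML>
```

Note that the main page has no <BODY> of its own; the <FRAMESET> takes its place. Browsers generally accept the <FRAME> tag without its conjugate as well.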
A left click event (in graphical programming almost every action is called an event) may cause a page to open in the frame at the right. For this purpose we can use the anchor tag with its TARGET parameter, giving the name of the right frame as TARGET's value. For example, for a frame named right we can write TARGET="right" inside the anchor tag.
It is also possible to switch from a multi-frame structure to a single-frame one. For this purpose, the value of the TARGET parameter is specified as _top. This reloads the page from scratch and displays it in a single piece.
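Both uses of TARGET can be sketched with a menu page shown in the left frame. The frame name right and the file names chapter1.html, chapter2.html and single.html are assumptions made for the example.

```html
<!-- Content of the page displayed in the left frame -->
<HTML>
<HEAD><TITLE>Menu</TITLE></HEAD>
<BODY>
  <!-- These links are opened in the frame named "right" -->
  <A HREF="chapter1.html" TARGET="right">Chapter 1</A><BR>
  <A HREF="chapter2.html" TARGET="right">Chapter 2</A><BR>
  <!-- _top drops the frames and shows the page in the whole window -->
  <A HREF="single.html" TARGET="_top">Leave the frames</A>
</BODY>
</HTML>
```

In this arrangement the menu stays in place while the pages in the right frame change.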
Some Clues
The <META> tag can be used for some important operations. Some of these are given below:
The parameter assignment HTTP-EQUIV="REFRESH" inside the <META> tag makes the client reload the document.
The parameter assignment CONTENT="n; URL=url" specifies the period of reloading, n, in seconds. If a URL is given in the value of the CONTENT parameter, then the page at that URL is loaded after the specified time interval. We can give the following example:
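A sketch of such a page is given below. The waiting time of 5 seconds, the title and the body text are assumptions; the URL is the institute address quoted earlier in this chapter.

```html
<HTML>
<HEAD>
  <TITLE>This page has moved</TITLE>
  <!-- After 5 seconds the browser loads the page at the given URL -->
  <META HTTP-EQUIV="REFRESH" CONTENT="5; URL=http://www.be.itu.edu.tr">
</HEAD>
<BODY>
  You will be taken to the new address in 5 seconds.
</BODY>
</HTML>
```

If the URL part is left out, the same page is simply reloaded every n seconds.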
You can put a given amount of space between two strings. This needs a special action, since consecutive blanks typed with the spacebar in an HTML file are treated as a single space. The action is performed with the character group &nbsp;, which is an abbreviation of Non-Breaking SPace. It can be repeated to create the desired amount of space between objects.
Some characters which have special purposes in HTML can be displayed by using certain character groups which begin with an ampersand. Some of these entities are given below.
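A few of the commonly used standard entities are sketched below with arbitrary sample text. Each entity begins with an ampersand and ends with a semicolon.

```html
<!-- Three non-breaking spaces keep the two words apart -->
left&nbsp;&nbsp;&nbsp;right<BR>
<!-- &lt; and &gt; display the symbols < and > without starting a tag -->
&lt;B&gt; is displayed as text and is not acted upon<BR>
<!-- &amp; displays the ampersand itself -->
Tom &amp; Jerry<BR>
<!-- &quot; displays the double quotation mark -->
&quot;quoted text&quot;<BR>
<!-- &copy; displays the copyright sign -->
&copy; the copyright sign
```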
Some Illustrative Example HTML Codes and Their Displays
In the following portion of this chapter four different example HTML codes are given. They are selected from the URL http://www.be.itu.edu.tr/beders. The corresponding snapshots of Netscape windows for these codes are also given. The Turkish characters are converted to English ones for typographical reasons.