Web Solution Tools: World Wide Web

The World Wide Web (commonly shortened to the Web) is a system of interlinked

hypertext documents accessed via the Internet. With a Web browser, one can view

Web pages that may contain text, images, videos, and other multimedia and navigate

between them using hyperlinks. Using concepts from earlier hypertext systems, the

World Wide Web was begun in 1989 by English scientist Tim Berners-Lee, working at

the European Organization for Nuclear Research (CERN) in Geneva, Switzerland. In

1990, he proposed building a "web of nodes" storing "hypertext pages" viewed by

"browsers" on a network,[1] and released that web in 1992. Connected by the existing

Internet, other websites were created, around the world, adding international

standards for domain names and HTML. Since then, Berners-Lee has played

an active role in guiding the development of Web standards (such as the markup

languages in which Web pages are composed), and in recent years has advocated his

vision of a Semantic Web.

The World Wide Web enabled the spread of information over the Internet through an

easy-to-use and flexible format. It thus played an important role in popularising use of

the Internet,[2] to the extent that the World Wide Web has become a synonym for

Internet, with the two being conflated in popular use.[3]

How it works

Viewing a Web page on the World Wide Web normally begins either by typing the URL

of the page into a Web browser, or by following a hyperlink to that page or resource.

The Web browser then initiates a series of communication messages, behind the

scenes, in order to fetch and display it.

First, the server-name portion of the URL is resolved into an IP address using the

global, distributed Internet database known as the domain name system, or DNS. This

IP address is necessary to contact and send data packets to the Web server.
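
The lookup step can be sketched in a few lines of Python. This is a minimal illustration, not how any particular browser does it: the function names are invented for the example, and `socket.getaddrinfo` simply asks the operating system's resolver, which in turn consults DNS.

```python
import socket
from urllib.parse import urlsplit


def server_name(url: str) -> str:
    """Return the server-name (host) portion of a URL, lower-cased."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no server name in URL: {url!r}")
    return host


def resolve(url: str) -> list[str]:
    """Ask the system resolver for the IP addresses of the URL's server."""
    infos = socket.getaddrinfo(server_name(url), None, proto=socket.IPPROTO_TCP)
    # Each entry ends with a socket address whose first item is the IP address.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve("http://www.example.com/index.html")` needs network access; the addresses it returns depend on the resolver and can change over time.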

The browser then requests the resource by sending an HTTP request to the Web server

at that particular address. In the case of a typical Web page, the HTML text of the

page is requested first and parsed immediately by the Web browser, which will then

make additional requests for images and any other files that form a part of the page.
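
As a rough sketch of the "additional requests" step, the following Python collects the URLs a browser would need to fetch after parsing a page's HTML. It only looks at images, scripts, and stylesheet links, which is an assumption made to keep the example short; real browsers recognise many more resource types.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class SubresourceCollector(HTMLParser):
    """Collect absolute URLs of images, scripts, and stylesheets in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.urls.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.urls.append(urljoin(self.base_url, attrs["href"]))


def subresources(base_url, html_text):
    """Return the URLs to request after the HTML itself, in document order."""
    collector = SubresourceCollector(base_url)
    collector.feed(html_text)
    collector.close()
    return collector.urls
```

Relative references are resolved against the URL of the page itself, which is why the base URL is passed in alongside the HTML text.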

Statistics measuring a website's popularity are usually based on the number of 'page

views' or associated server 'hits', or file requests, which take place.
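
The difference between the two measures can be illustrated with a short Python sketch. The rule used here to decide what counts as a page (a path ending in ".html", ".htm", or "/") is an assumption for the example; real log analysers use their own, more elaborate rules.

```python
from urllib.parse import urlsplit

# Assumed rule: these path endings mark a request for a page rather than
# for a file embedded in a page.
PAGE_SUFFIXES = (".html", ".htm", "/")


def count_hits_and_page_views(requested_paths):
    """Return (hits, page_views) for a list of requested paths."""
    hits = len(requested_paths)  # every file request is a hit
    page_views = 0
    for raw in requested_paths:
        path = urlsplit(raw).path
        if path == "" or path.endswith(PAGE_SUFFIXES):
            page_views += 1
    return hits, page_views
```

One visit to a page with a stylesheet and two images would thus count as four hits but a single page view.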

Having received the required files from the Web server, the browser then renders the

page onto the screen as specified by its HTML, CSS, and other Web languages. Any

images and other resources are incorporated to produce the on-screen Web page that

the user sees.

Most Web pages will themselves contain hyperlinks to other related pages and perhaps

to downloads, source documents, definitions and other Web resources. Such a

collection of useful, related resources, interconnected via hypertext links, is what was

dubbed a "web" of information. Making it available on the Internet created what Tim

Berners-Lee first called the WorldWideWeb (a term written in CamelCase, subsequently

discarded) in November 1990.[1]

Berners-Lee has said that the most important feature of the World Wide Web is "Error

404", which tells the user that a file does not exist. Without this feature, he said, the

web would have ground to a halt long ago.

Berners-Lee has also expressed regret over the format of the URL. Currently it is

divided into two parts - the route to the server which is divided by dots, and the file

path separated by slashes. The server route starts with the least significant element

and ends with the most significant, then the file path reverses this, moving from high

to low. Berners-Lee would have liked to see this rationalised. So an address which is

currently (e.g.) "http://www.examplesite.co.uk/documents/pictures/illustration.jpg" would

become http:/uk/co/examplesite/documents/pictures/illustration.jpg. In this format the

server no longer has any special place in the address, which is simply one coherent

hierarchical path.
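
The reordering described above can be sketched in Python as follows. The function name is made up for the illustration, the sample address in the usage note is hypothetical, and dropping a leading "www" label is an assumption chosen so that the output has the same shape as the rationalised address in the text.

```python
from urllib.parse import urlsplit


def hierarchical_form(url: str) -> str:
    """Rewrite a URL so the server labels run from most to least significant."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"no server name in URL: {url!r}")
    labels = parts.hostname.split(".")
    if labels[0] == "www":  # assumption: the "www" label is dropped
        labels = labels[1:]
    # Reversing the dotted labels puts the top-level domain first, so the
    # whole address reads high-to-low, like the file path that follows it.
    return f"{parts.scheme}:/" + "/".join(reversed(labels)) + parts.path
```

For example, `hierarchical_form("http://www.examplesite.co.uk/documents/pictures/illustration.jpg")` yields `http:/uk/co/examplesite/documents/pictures/illustration.jpg`.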

History

The underlying ideas of the Web can be traced as far back as 1980, when, at CERN in

Switzerland, Sir Tim Berners-Lee built ENQUIRE (a reference to Enquire Within Upon

Everything, a book he recalled from his youth). While it was rather different from the

system in use today, it contained many of the same core ideas (and even some of the

ideas of Berners-Lee's next project after the World Wide Web, the Semantic Web).

In March 1989, Berners-Lee wrote a proposal[4] which referenced ENQUIRE and

described a more elaborate information management system. With help from Robert

Cailliau, he published a more formal proposal (on November 12, 1990) to build a

"Hypertext project" called "WorldWideWeb" (one word, also "W3")[1] as a "web of

nodes" with "hypertext documents" to store data. That data would be viewed in

"hypertext pages" (webpages) by various "browsers" (line-mode or full-screen) on the

computer network, using an "access protocol" connecting the "Internet and DECnet

protocol worlds".[1]

The proposal had been modeled after EBT's (Electronic Book Technology, a spin-off

from the Institute for Research in Information and Scholarship at Brown University)

Dynatext SGML reader that CERN had licensed. The Dynatext system, although

technically advanced (a key player in the extension of SGML ISO 8879:1986 to

Hypermedia within HyTime), was considered too expensive and with an inappropriate

licensing policy for general HEP (High Energy Physics) community use: a fee for each

document and each time a document was changed.

A NeXT Computer was used by Berners-Lee as the world's first Web server and also to

write the first Web browser, WorldWideWeb, in 1990. By Christmas 1990, Berners-Lee

had built all the tools necessary for a working Web:[5] the first Web browser (which

was a Web editor as well), the first Web server, and the first Web pages[6] which

described the project itself.

On August 6, 1991, he posted a short summary of the World Wide Web project on the

alt.hypertext newsgroup.[7] This date also marked the debut of the Web as a publicly

available service on the Internet.

The first server outside Europe was set up at SLAC in December 1991.[8]

The crucial underlying concept of hypertext originated with older projects from the

1960s, such as the Hypertext Editing System (HES) at Brown University (developed by

Ted Nelson and Andries van Dam, among others), Ted Nelson's Project Xanadu, and Douglas

Engelbart's oN-Line System (NLS). Both Nelson and Engelbart were in turn inspired by

Vannevar Bush's microfilm-based "memex," which was described in the 1945 essay "As

We May Think".

Berners-Lee's breakthrough was to marry hypertext to the Internet. In his book

Weaving The Web, he explains that he had repeatedly suggested that a marriage

between the two technologies was possible to members of both technical communities,

but when no one took up his invitation, he finally tackled the project himself. In the

process, he developed a system of globally unique identifiers for resources on the Web

and elsewhere: the Uniform Resource Identifier.

The World Wide Web had a number of differences from other hypertext systems that

were then available. The Web required only unidirectional links rather than bidirectional

ones. This made it possible for someone to link to another resource without action by

the owner of that resource. It also significantly reduced the difficulty of implementing

Web servers and browsers (in comparison to earlier systems), but in turn presented the

chronic problem of link rot. Unlike predecessors such as HyperCard, the World Wide

Web was non-proprietary, making it possible to develop servers and clients

independently and to add extensions without licensing restrictions.

On April 30, 1993, CERN announced[9] that the World Wide Web would be free to

anyone, with no fees due. Coming two months after the announcement that the Gopher

protocol was no longer free to use, this produced a rapid shift away from Gopher and

towards the Web. An early popular Web browser was ViolaWWW, which was patterned

after HyperCard.

Scholars generally agree, however, that the turning point for the World Wide Web

began with the introduction[10] of the Mosaic Web browser[11] in 1993, a graphical

browser developed by a team at the National Center for Supercomputing Applications at

the University of Illinois at Urbana-Champaign (NCSA-UIUC), led by Marc Andreessen.

Funding for Mosaic came from the High-Performance Computing and Communications

Initiative, a funding program initiated by the High Performance Computing and

Communication Act of 1991, one of several computing developments initiated by

Senator Al Gore.[12] Prior to the release of Mosaic, graphics were not commonly mixed

with text in Web pages, and the Web was less popular than older protocols in use over the

Internet, such as Gopher and Wide Area Information Servers (WAIS). Mosaic's graphical

user interface allowed the Web to become, by far, the most popular Internet protocol.

The World Wide Web Consortium (W3C) was founded by Tim Berners-Lee after he left

the European Organization for Nuclear Research (CERN) in October 1994. It was

founded at the Massachusetts Institute of Technology Laboratory for Computer Science

(MIT/LCS) with support from the Defense Advanced Research Projects Agency

(DARPA)—which had pioneered the Internet—and the European Commission.

Standards

Many formal standards and other technical specifications define the operation of

different aspects of the World Wide Web, the Internet, and computer information

exchange. Many of the documents are the work of the World Wide Web Consortium

(W3C), headed by Berners-Lee, but some are produced by the Internet Engineering

Task Force (IETF) and other organizations.

Usually, when Web standards are discussed, the following publications are seen as

foundational:

* Recommendations for markup languages, especially HTML and XHTML, from the

W3C. These define the structure and interpretation of hypertext documents.
* Recommendations for stylesheets, especially CSS, from the W3C.
* Standards for ECMAScript (usually in the form of JavaScript), from Ecma

International.
* Recommendations for the Document Object Model, from W3C.

Additional publications provide definitions of other essential technologies for the World

Wide Web, including, but not limited to, the following:

* Uniform Resource Identifier (URI), which is a universal system for referencing

resources on the Internet, such as hypertext documents and images. URIs, often called

URLs, are defined by the IETF's RFC 3986 / STD 66: Uniform Resource Identifier (URI):

Generic Syntax, as well as its predecessors and numerous URI scheme-defining RFCs;
* HyperText Transfer Protocol (HTTP), especially as defined by RFC 2616: HTTP/1.1

and RFC 2617: HTTP Authentication, which specify how the browser and server

authenticate each other.
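
To make the two items above more concrete, here is a small Python sketch. The first function splits a URI into the generic components named in RFC 3986; the second builds the header value for the "Basic" scheme from RFC 2617. The function names and the sample URI in the usage note are assumptions made for the illustration.

```python
import base64
from urllib.parse import urlsplit


def uri_components(uri):
    """Split a URI into the five generic components of RFC 3986."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


def basic_authorization(user_id, password):
    """Build an Authorization header value for the RFC 2617 Basic scheme."""
    # Basic credentials are "user:password", base64-encoded. This is an
    # encoding, not encryption, so it offers no secrecy by itself.
    token = base64.b64encode(f"{user_id}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"
```

For instance, `uri_components("http://www.example.com/docs/page.html?lang=en#intro")` separates the scheme `http`, the authority `www.example.com`, the path `/docs/page.html`, the query `lang=en`, and the fragment `intro`.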

Privacy

Computer users, who save time and money, and who gain conveniences and

entertainment, may or may not have surrendered the right to privacy in exchange for

using a number of technologies including the Web.[13] Worldwide, more than a half

billion people have used a social network service,[14] and of Americans who grew up

with the Web, half created an online profile[15] and are part of a generational shift

that could be changing norms.[16][17] Among services paid for by advertising, Yahoo!

could collect the most data about users of commercial websites, about 2,500 bits of

information per month about each typical user of its site and its affiliated advertising

network sites. Yahoo! was followed by MySpace with about half that potential and then

by AOL-TimeWarner, Google, Facebook, Microsoft, and eBay.[18]

Privacy representatives from 60 countries have resolved to ask for laws to complement

industry self-regulation, for education for children and other minors who use the Web,

and for default protections for users of social networks.[19] They also believe data

protection for personally identifiable information benefits business more than the sale

of that information.[19] Users can opt in to features in browsers from companies such

as Apple, Google, Microsoft (beta) and Mozilla (beta) to clear their personal histories

locally and block some cookies and advertising networks,[20] but they are still tracked

in websites' server logs. Berners-Lee and colleagues see hope in

accountability and appropriate use achieved by extending the Web's architecture to

policy awareness, perhaps with audit logging, reasoners and appliances.[21]

Security

The Web has become criminals' preferred pathway for spreading malware. Cybercrime

carried out on the Web can include identity theft, fraud, espionage and intelligence

gathering.[22] Web-based vulnerabilities now outnumber traditional computer security

concerns,[23] and as measured by Google, about one in ten Web pages may contain

malicious code.[24] Most Web-based attacks take place on legitimate websites, and

most, as measured by Sophos, are hosted in the United States, China and Russia.[25]

The most common of all malware threats is SQL injection attacks against websites.[26]

Through HTML and URIs the Web was vulnerable to attacks like cross-site scripting

(XSS) that came with the introduction of JavaScript[27] and were exacerbated to some

degree by Web 2.0 and Ajax web design that favors the use of scripts.[28] Today by

one estimate, 70% of all websites are open to XSS attacks on their users.[29]
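
Two of the attacks named above have well-known basic countermeasures, sketched here in Python using only the standard library. The table layout and function names are hypothetical, and these measures are a starting point rather than a complete defence.

```python
import html
import sqlite3


def find_user(conn, name):
    """Look up a user by name without building SQL from user input."""
    # The "?" placeholder passes the name as data, so quotes inside it
    # cannot change the structure of the query (guards against SQL injection).
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


def render_comment(comment):
    """Embed user-supplied text in HTML with markup characters escaped."""
    # Escaping turns "<script>" into inert text (guards against simple XSS).
    return "<p>" + html.escape(comment) + "</p>"
```

With these in place, a name such as `' OR '1'='1` is treated as an unusual name that matches no row, and a comment containing a script tag is displayed as text instead of being executed.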

Proposed solutions vary to extremes. Large security vendors like McAfee already design

governance and compliance suites to meet post-9/11 regulations,[30] and some, like

Finjan, have recommended active real-time inspection of code and all content regardless

of its source.[22] Some have argued that for enterprise to see security as a business

opportunity rather than a cost center,[31] "ubiquitous, always-on digital rights

management" enforced in the infrastructure by a handful of organizations must replace

the hundreds of companies that today secure data and networks.[32] Jonathan Zittrain

has said users sharing responsibility for computing safety is far preferable to locking

down the Internet.[33]

Web accessibility

Many countries regulate web accessibility as a requirement for web sites.

Java

A significant advance in Web technology was Sun Microsystems' Java platform. It

enables Web pages to embed small programs (called applets) directly into the view.

These applets run on the end-user's computer, providing a richer user interface than

simple Web pages. Java client-side applets never gained the popularity that Sun had

hoped for, for a variety of reasons, including lack of integration with other content (applets

were confined to small boxes within the rendered page) and the fact that many

computers at the time were supplied to end users without a suitably installed Java

Virtual Machine, and so required a download by the user before applets would appear.

Adobe Flash now performs many of the functions that were originally envisioned for

Java applets, including the playing of video content, animation, and some rich GUI

features. Java itself has become more widely used as a platform and language for

server-side and other programming.

JavaScript

JavaScript, on the other hand, is a scripting language that was initially developed for

use within Web pages. The standardized version is ECMAScript. While its name is

similar to Java, JavaScript was developed by Netscape and has very little to do with

Java, although the syntax of both languages is derived from the C programming

language. In conjunction with a Web page's Document Object Model (DOM), JavaScript

has become a much more powerful technology than its creators originally

envisioned. The manipulation of a page's DOM after the page is

delivered to the client has been called Dynamic HTML (DHTML), to emphasize a shift

away from static HTML displays.

In simple cases, all the optional information and actions available on a

JavaScript-enhanced Web page will have been downloaded when the page was first

delivered. Ajax ("Asynchronous JavaScript and XML") is a group of interrelated web

development techniques used for creating interactive web applications that provide a

method whereby parts within a Web page may be updated, using new information

obtained over the network at a later time in response to user actions. This allows the

page to be more responsive, interactive and interesting, without the user having to

wait for whole-page reloads. Ajax is seen as an important aspect of what is being

called Web 2.0. Examples of Ajax techniques currently in use can be seen in Gmail,

Google Maps, and other dynamic Web applications.

Publishing Web pages

Web page production is available to individuals outside the mass media. In order to

publish a Web page, one does not have to go through a publisher or other media

institution, and potential readers can be found in all corners of the globe.

Many different kinds of information are available on the Web, which has made it easier

to learn about other societies, cultures, and peoples.

The increased opportunity to publish materials is observable in the countless personal

and social networking pages, as well as sites by families, small shops, etc., facilitated

by the emergence of free Web hosting services.

Statistics

According to a 2001 study, there were far more than 550 billion documents on

the Web, mostly in the invisible Web, or deep Web.[34] A 2002 survey of 2,024 million

Web pages[35] determined that by far the most Web content was in English: 56.4%;

next were pages in German (7.7%), French (5.6%), and Japanese (4.9%). A more

recent study, which used Web searches in 75 different languages to sample the Web,

determined that there were over 11.5 billion Web pages in the publicly indexable Web

as of the end of January 2005.[36] As of June 2008, the indexable web contained at

least 63 billion pages.[37] On July 25, 2008, Google software engineers Jesse Alpert

and Nissan Hajaj announced that Google Search had discovered one trillion unique

URLs.[38]

Over 100.1 million websites operated as of March 2008.[39] Of these 74% were

commercial or other sites operating in the .com generic top-level domain.[39]

Speed issues

Frustration over congestion issues in the Internet infrastructure and the high latency

that results in slow browsing has led to an alternative, pejorative name for the World

Wide Web: the World Wide Wait. How to speed up the Internet is the subject of an

ongoing discussion about the use of peering and QoS technologies. Other approaches to

reducing the World Wide Wait are described by the W3C.

Standard guidelines for ideal Web response times are:[40]

* 0.1 second (one tenth of a second). Ideal response time. The user doesn't sense

any interruption.
* 1 second. Highest acceptable response time. Download times above 1 second

interrupt the user experience.
* 10 seconds. Unacceptable response time. The user experience is interrupted and

the user is likely to leave the site or system.

These numbers are useful for planning server capacity.
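
The three thresholds can be turned into a simple classifier, sketched here in Python. The label names, and the decision to give the range between 1 and 10 seconds its own label, are assumptions made for this example rather than part of the guidelines.

```python
def classify_response_time(seconds):
    """Map a measured response time onto the guideline thresholds."""
    if seconds < 0:
        raise ValueError("response time cannot be negative")
    if seconds <= 0.1:
        return "ideal"          # the user senses no interruption
    if seconds <= 1.0:
        return "acceptable"     # at most the highest acceptable time
    if seconds < 10.0:
        return "interrupting"   # above 1 second the experience is interrupted
    return "unacceptable"       # the user is likely to leave
```

A capacity plan might, for example, require that nearly all measured requests fall into the first two classes under the expected load.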

Caching

If a user revisits a Web page after only a short interval, the page data may not need to

be re-obtained from the source Web server. Almost all Web browsers cache

recently-obtained data, usually on the local hard drive. HTTP requests sent by a

browser will usually only ask for data that has changed since the last download. If the

locally-cached data are still current, they will be reused.
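
One mechanism behind this is the conditional request: the browser sends the date of its cached copy in an If-Modified-Since header, and the server answers 304 (Not Modified) if nothing has changed. The Python below is a simplified sketch of the server's side of that exchange; the function name is invented for the example, and real servers also handle other validators such as entity tags.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond_to_get(request_headers, last_modified: datetime):
    """Return (status, headers) for a resource last changed at last_modified (UTC)."""
    # HTTP dates have one-second resolution, so drop the microseconds.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    since = request_headers.get("If-Modified-Since")
    if since:
        try:
            cached_at = parsedate_to_datetime(since)
        except (TypeError, ValueError):
            cached_at = None  # unparseable date: answer as for a plain GET
        if cached_at is not None:
            if cached_at.tzinfo is None:
                cached_at = cached_at.replace(tzinfo=timezone.utc)
            if last_modified <= cached_at:
                return 304, headers  # cached copy is current; send no body
    return 200, headers  # send the full resource
```

A 304 response carries no body, which is where the saving in traffic comes from.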

Caching helps reduce the amount of Web traffic on the Internet. The decision about

expiration is made independently for each downloaded file, whether image, stylesheet,

JavaScript, HTML, or whatever other content the site may provide. Thus even on sites

with highly dynamic content, many of the basic resources only need to be refreshed

occasionally. Web site designers find it worthwhile to collate resources such as CSS

data and JavaScript into a few site-wide files so that they can be cached efficiently.

This helps reduce page download times and lowers demands on the Web server.

There are other components of the Internet that can cache Web content. Corporate and

academic firewalls often cache Web resources requested by one user for the benefit of

all. (See also Caching proxy server.) Some search engines, such as Google or Yahoo!,

also store cached content from websites.

Apart from the facilities built into Web servers that can determine when files have

been updated and so need to be re-sent, designers of dynamically-generated Web

pages can control the HTTP headers sent back to requesting users, so that transient or

sensitive pages are not cached. Internet banking and news sites frequently use this

facility.

Data requested with an HTTP 'GET' is likely to be cached if other conditions are met;

data obtained in response to a 'POST' is assumed to depend on the data that was

POSTed and so is not cached.
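
The rules in the last two paragraphs can be approximated by a small decision function. This Python sketch considers only the request method and two directives of the HTTP/1.1 Cache-Control header (`no-store` and `private`); the function name is an assumption, and real caches apply many more conditions.

```python
def may_store(method, response_headers, shared=False):
    """Decide whether a cache may keep a response.

    shared=True models a proxy serving many users; shared=False a browser.
    """
    if method.upper() != "GET":
        return False  # responses to POST depend on the posted data
    raw = response_headers.get("Cache-Control", "")
    directives = {d.strip().lower() for d in raw.split(",") if d.strip()}
    if "no-store" in directives:
        return False  # transient or sensitive page: never keep a copy
    if shared and "private" in directives:
        return False  # meant for one user only, so proxies must not keep it
    return True
```

A banking site might send `Cache-Control: no-store` on account pages, while leaving its stylesheets and images cacheable.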

Link rot and Web archival

Over time, many Web resources pointed to by hyperlinks disappear, relocate, or are

replaced with different content. This phenomenon is referred to in some circles as "link

rot" and the hyperlinks affected by it are often called "dead links".
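
A basic dead-link check can be sketched in Python as below. It treats a link as dead when the server answers with an error status or does not answer at all; it cannot detect the third case mentioned above, a page replaced with different content. The function names are assumptions, and some servers do not support the HEAD method used here.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None if no response arrived."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 for a page that no longer exists
    except (urllib.error.URLError, OSError):
        return None  # unknown host, refused connection, timeout, ...


def dead_links(urls, status_of=fetch_status):
    """Return the URLs whose status is missing or an error (400 and above)."""
    return [u for u in urls if (s := status_of(u)) is None or s >= 400]
```

The `status_of` parameter lets the check be run against recorded results instead of the live network.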

The ephemeral nature of the Web has prompted many efforts to archive Web sites. The

Internet Archive is one of the most well-known efforts; it has been active since 1996.

Academic conferences

The major academic event covering the Web is the World Wide Web Conference,

promoted by IW3C2.

WWW prefix in Web addresses

The letters "www" are commonly found at the beginning of Web addresses because of

the long-standing practice of naming Internet hosts (servers) according to the services

they provide. So for example, the host name for a Web server is often "www"; for an

FTP server, "ftp"; and for a USENET news server, "news" or "nntp" (after the news

protocol NNTP). These host names appear as DNS subdomain names, as in

"www.mrfweb.we.bs".

This use of such prefixes is not required by any technical standard; indeed, the first

Web server was at "nxoc01.cern.ch",[41] and even today many Web sites exist without

a "www" prefix. The "www" prefix has no meaning in the way the main Web site is

shown. The "www" prefix is simply one choice for a Web site's host name.

However, some website addresses require the www. prefix, and if typed without one,

won't work; there are also some which must be typed without the prefix. Sites that do

not have Host headers properly set up are the cause of this. Some hosting companies

do not set up a www or @ A record in the web server configuration and/or at the DNS

server level.
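
The web-server half of this problem can be illustrated with a short Python sketch of name-based site selection. The host names and directory paths are hypothetical, and the handling of the Host header is simplified (for instance, it does not cope with IPv6 address literals).

```python
def select_site(host_header, sites):
    """Pick the configured site for the Host header of a request, if any."""
    # Drop an optional ":port", ignore case, and ignore a trailing dot.
    host = host_header.split(":", 1)[0].strip().lower().rstrip(".")
    return sites.get(host)


# Hypothetical configuration in which only the "www" name was set up.
SITES = {"www.example.com": "/srv/example"}
```

With this configuration, `select_site("www.example.com", SITES)` finds the site, while `select_site("example.com", SITES)` returns `None` until the bare name is added as a second key.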

Some Web browsers will automatically try adding "www." to the beginning, and

possibly ".com" to the end, of typed URLs if no host is found without them. All major

web browsers will also prefix "http://www." and append ".com" to the

address bar contents if the Control and Enter keys are pressed simultaneously. For

example, entering "example" in the address bar and then pressing either Enter or

Control+Enter will usually resolve to "http://www.example.com", depending on the

exact browser version and its settings.
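
The Control+Enter behaviour can be mimicked with a few lines of Python. This is only an illustration of the rule as described above; the function name is invented, and actual browsers differ in the details and in their fallback behaviour.

```python
def complete_address(typed, ctrl_enter=False):
    """Expand address-bar text the way the Control+Enter shortcut is described."""
    text = typed.strip()
    if ctrl_enter and "." not in text and "/" not in text:
        # A bare word gets the "http://www." prefix and the ".com" suffix.
        return f"http://www.{text}.com"
    if "://" not in text:
        return f"http://{text}"  # assume plain HTTP when no scheme is typed
    return text
```

So `complete_address("example", ctrl_enter=True)` gives `http://www.example.com`, while text that already looks like a host name or a full URL is left largely alone.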

Saturday, January 10, 2009
