On the internet, a search engine is a tool for finding information and websites by typing keywords or key phrases. Search results are presented as a list, each entry with a brief summary of its content. The pages that list these results are referred to as SERPs, or “search engine results pages”. The information an engine searches may include web pages, pictures and other files.
Engines look for information on the World Wide Web [WWW] and on FTP [File Transfer Protocol] servers. They also mine data held in databases or open directories. Unlike web directories, which are maintained by human editors, search engines also maintain real-time information by running an algorithm on a web crawler.
The first internet search tool was Archie, developed by Alan Emtage, Bill Heelan and J. Peter Deutsch in 1990. Because the amount of data was limited in those days and could easily be searched manually, Archie did not index the contents of the sites it listed. In 1991, Mark McCahill developed Gopher, which could be termed an advanced version of Archie. In June 1993, Matthew Gray created the World Wide Web Wanderer, perhaps the first web search engine. The second engine, “Aliweb”, was developed in November 1993. WebCrawler, the first crawler-based engine, appeared in 1994. Many more engines were developed later and rose to popularity.
A typical engine performs three tasks: web crawling, indexing and searching.
Search engines store information about millions of web pages. Some engines, such as Google, store all or part of each page [called caching], while others, such as AltaVista, store every word of each page. The pages are retrieved by a web crawler [or spider]. The contents of each page are then analyzed to determine how it should be indexed. Indexing allows information to be found quickly.
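As a rough illustration of the indexing step, the sketch below builds a small inverted index, a structure that maps each word to the pages containing it. The page addresses, the sample text and the function names `tokenize` and `build_index` are assumptions made for this example; it is not how any particular engine is implemented.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word to the set of page addresses that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index

# Hypothetical pages, standing in for what a crawler would have retrieved.
pages = {
    "example.com/a": "Search engines index web pages",
    "example.com/b": "A crawler retrieves web pages",
}

index = build_index(pages)
print(sorted(index["pages"]))    # pages that contain the word "pages"
print(sorted(index["crawler"]))  # pages that contain the word "crawler"
```

Because the index is keyed by word, answering a one-word query is a single dictionary lookup rather than a scan of every stored page.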
When a user enters a query by typing keywords, the engine examines its index and lists the best-matching web pages. At the time of writing, no public engine can search documents by date. To refine a search, engines support the Boolean operators AND, OR and NOT. There are also natural language queries, which allow the user to ask a question as one would ask a person.
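The Boolean operators map naturally onto set operations over an index. The following is a minimal sketch under assumed data: the index contents, the page numbers and the helper names are all hypothetical, and real engines parse and evaluate queries in far more elaborate ways.

```python
# Hypothetical index: each word maps to the set of page IDs containing it.
index = {
    "search": {1, 2, 3},
    "engine": {1, 3},
    "crawler": {2, 3},
}
all_pages = {1, 2, 3, 4}  # every page known to this toy engine

def lookup(word):
    """Return the pages containing the word, or an empty set if unknown."""
    return index.get(word.lower(), set())

def boolean_and(a, b):
    return a & b          # pages present in both result sets

def boolean_or(a, b):
    return a | b          # pages present in either result set

def boolean_not(a):
    return all_pages - a  # pages absent from the result set

# search AND engine
print(boolean_and(lookup("search"), lookup("engine")))
# search OR crawler
print(boolean_or(lookup("search"), lookup("crawler")))
# search AND NOT crawler
print(boolean_and(lookup("search"), boolean_not(lookup("crawler"))))
```

In this sketch AND narrows a result set, OR widens it, and NOT excludes pages, which is why these operators are useful for refining a search.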
Relevance of results is a major factor in determining an engine’s performance. Millions of pages may contain a particular word, and some of them are more relevant and credible than others. Most engines therefore rank the results so that the most appropriate ones are listed first. The ranking techniques vary from engine to engine, and they change over time as new technology arrives.
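To make the idea of ranking concrete, here is a deliberately simple sketch that scores each page by how often the query words appear in it and lists the highest-scoring pages first. Counting word occurrences is an assumption chosen for illustration, not the method of any engine named here, and the page names and text are invented.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def rank(query, pages):
    """Return the names of matching pages, best match first."""
    terms = tokenize(query)
    scores = {}
    for name, text in pages.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in terms)
        if score > 0:  # leave out pages that mention no query word
            scores[name] = score
    # Sort by descending score, breaking ties alphabetically by name.
    return sorted(scores, key=lambda name: (-scores[name], name))

# Hypothetical pages used only for this example.
pages = {
    "a": "web search engines rank web pages",
    "b": "a page about cooking",
    "c": "search the web",
}

print(rank("web search", pages))  # prints ['a', 'c']
```

Page "a" mentions the query words three times and page "c" twice, so "a" is listed first, while page "b" matches nothing and is omitted.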
As of May 2011, Google held the lion’s share of the search engine market, at approximately 83%. Yahoo followed with a 6.5% share on the same date. Baidu, Bing, Ask and AOL came next, with shares of 5%, 4%, 0.52% and 0.36% respectively. Google’s worldwide market share peaked in April 2010, at 86.3%.