Search Engines

How Do They Work?
By Sydney B.


Search engines are databases. They're collections of data stored so that when you, the user, want to search, the engine looks through the data and brings you the websites relevant to your search. It all happens in the blink of an eye!

How does that data get organized? It's actually very simple:
Every search engine has a "webcrawler" or "bot," sometimes called a "spider." This crawler will browse or "crawl" through a list of websites and pick out the words and phrases that it deems relevant to the topic of each site. Every word gets a rank of importance, and every crawler ranks those words differently.

For example, the webcrawler for Search Engine X may give a higher rank to the first 10 words on a website, whereas the crawler for Search Engine Z gives a higher rank to words that appear frequently in the text. This means that the same search keywords can bring up different top results for each engine. It's therefore a good idea to perform searches on multiple engines, to be sure you're getting all the results you could get.
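To make the difference concrete, here is a minimal Python sketch of the two ranking styles described above. The function names, the weights, and the sample page are all made up for illustration; real engines do not publish their ranking rules, so treat this as a toy model rather than how any actual crawler works.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and pull out runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())

def rank_first_words(text, n=10):
    """Hypothetical 'Engine X': words among the first n score 2, all others score 1."""
    scores = {}
    for position, word in enumerate(tokenize(text)):
        weight = 2 if position < n else 1
        # Keep the best score a word earns anywhere on the page.
        scores[word] = max(scores.get(word, 0), weight)
    return scores

def rank_by_frequency(text):
    """Hypothetical 'Engine Z': a word's score is how often it appears."""
    return dict(Counter(tokenize(text)))

# Made-up page text, used only to compare the two schemes.
page = ("Hiking trails near the lake. Dogs are welcome. "
        "Dogs love the trails, and dogs love the lake.")

x_scores = rank_first_words(page, n=4)
z_scores = rank_by_frequency(page)

print("Engine X:", x_scores["hiking"], "for hiking,", x_scores["dogs"], "for dogs")
print("Engine Z:", z_scores["hiking"], "for hiking,", z_scores["dogs"], "for dogs")
```

On this sample page, the position-based scheme favors "hiking" because it appears at the very start, while the frequency-based scheme favors "dogs" because it appears most often. The same page would therefore surface for different searches depending on the engine.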

Another way search engines get their keywords is through code that web designers put into the HTML of the website itself. This is called a meta tag. The meta tag has two different uses: it tells a search engine what keywords to use for the webpage, and it gives a description for the engine to display when that website is brought up as a hit.
Meta tags are usually given more importance in the keyword rank; however, not many search engines support the use of meta tags anymore. It's too easy for a site coder to put words such as "sex" or "MP3" into the meta tags to bring up more hits for the site, even though the site has nothing to do with those words.
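As a rough illustration of how a crawler might read those two meta tags, the sketch below uses Python's standard html.parser module. The class name MetaTagReader and the sample HTML are invented for this example; it only shows the idea of pulling out the keywords and description, not what any particular engine does with them.

```python
from html.parser import HTMLParser

class MetaTagReader(HTMLParser):
    """Hypothetical helper that collects the keywords and description meta tags."""

    def __init__(self):
        super().__init__()
        self.keywords = []
        self.description = ""

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        content = attributes.get("content") or ""
        if name == "keywords":
            # Keywords are conventionally written as a comma-separated list.
            self.keywords = [k.strip().lower() for k in content.split(",") if k.strip()]
        elif name == "description":
            self.description = content.strip()

# Made-up page header, used only for illustration.
sample_html = """<html><head>
<title>Lakeside Hiking Club</title>
<meta name="keywords" content="hiking, trails, lake">
<meta name="description" content="Weekend hikes around the lake.">
</head><body></body></html>"""

reader = MetaTagReader()
reader.feed(sample_html)

print(reader.keywords)
print(reader.description)
```

Nothing in this code checks whether the keywords actually match the page, which is exactly why the tags are so easy to abuse.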

One thing to keep in mind when searching is that engines can't think! A search engine is a word-based machine. If you search for "dog," the engine won't return websites that use only the word "canine." Currently, AI is simply too limited to support an idea-based web search.
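The "dog" versus "canine" problem is easy to see in a tiny word-based index. The sketch below is an assumption about the general approach (a simple word-to-pages lookup, often called an inverted index), with invented page addresses and text; it is not taken from any real engine.

```python
import re
from collections import defaultdict

def build_index(pages):
    """Map each word to the set of page addresses whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index

def search(index, word):
    """Return pages containing exactly this word; no synonyms are considered."""
    return sorted(index.get(word.lower(), set()))

# Made-up pages, used only for illustration.
pages = {
    "example.com/dogs": "Caring for your dog at home",
    "example.com/canines": "Canine health and nutrition tips",
}

index = build_index(pages)
dog_hits = search(index, "dog")
canine_hits = search(index, "canine")

print(dog_hits)
print(canine_hits)
```

Each search finds only the page that contains the exact word typed in. The index has no notion that the two pages cover the same subject.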

Excite's search engine is an example of an idea-based search. Because of the enormous amount of data needed to support such an engine, Excite's updates were slower, and it didn't return as many results as word-based search engines. An idea-based search isn't necessarily impossible; it's just not practical. In the future, idea-based searches may outweigh word-based ones.

The webcrawler on a search engine doesn't constantly check every site on the web. It has a regularly scheduled cycle, programmed by the owners of the engine, to look for new sites and recheck known sites for updates. Some search engines have webcrawlers specifically for fast-updating sites, to keep searches new and reliable. News and sports sites fall into this category.
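A scheduled crawl cycle like this could be sketched as follows. The revisit intervals, category names, site list, and dates are all hypothetical, chosen only to show fast-updating sites being rechecked more often than everything else.

```python
from datetime import datetime, timedelta

# Hypothetical revisit intervals; real engines choose their own schedules.
REVISIT_INTERVALS = {
    "news": timedelta(hours=1),
    "sports": timedelta(hours=1),
    "default": timedelta(days=30),
}

def is_due(last_crawled, category, now):
    """A site is due once its category's interval has passed since the last visit."""
    interval = REVISIT_INTERVALS.get(category, REVISIT_INTERVALS["default"])
    return now - last_crawled >= interval

def sites_to_crawl(sites, now):
    """Return the addresses of the sites the crawler should visit on this cycle."""
    return [s["url"] for s in sites if is_due(s["last_crawled"], s["category"], now)]

# Made-up sites and timestamps, used only for illustration.
now = datetime(2024, 1, 10, 12, 0)
sites = [
    {"url": "news.example.com", "category": "news",
     "last_crawled": datetime(2024, 1, 10, 9, 0)},
    {"url": "hobby.example.com", "category": "personal",
     "last_crawled": datetime(2024, 1, 1, 12, 0)},
]

due = sites_to_crawl(sites, now)
print(due)
```

Here the news site is due again after only three hours, while the personal site, last visited nine days earlier, has to wait for the slower default cycle.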

It's important to note that an amateur website may take weeks, even months, to show up in a search engine. Some more popular engines, like Google.com, have places on their site where you can submit a URL to be put on a special list, so the crawler will visit your page on its next trip around the web. Refer to the Links page for a link to Google's URL add site.