Search engine indexation and the healthcare industry
Google search fundamentals: web indexing
Every day you use Google or another search engine without really knowing exactly how it works. Yet the results of your searches can vary significantly depending on many parameters. In this article you will find information on the essential tools to use the web effectively. This is also the minimum level of understanding needed before embarking on a project to improve website visibility, thus increasing the level of qualified traffic.
It all starts with web indexation
Few people know that when Google presents a page of results for a query, known as a SERP (Search Engine Results Page), those results come from Google's index (or the equivalent index of another engine), not from the web itself.
Search engine crawlers and indexation
Google explores the web with robots, or spiders, that constantly crawl it, analyzing and then indexing each page. How each spider explores the web, and the indexation rules it applies, are specific to each search engine.
Some estimate that almost 50% of web traffic results from the activity of these robots or spiders. It is possible to check this for yourself with tools such as Google Analytics or even a specific plug-in in your CMS that will indicate the source of the visitors to your site.
The visitor from Mountain View, California is without doubt the famous Googlebot; the one from Redmond, Washington is Microsoft’s Bing engine with its msnbot.
Although this is not officially confirmed, it is likely that certain rules and algorithms are applied during indexing to optimize how the index is used in response to a query. One example is the user-friendliness of a web page on a mobile phone, which allows Google to propose that page when the user is searching the web from this type of device.
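As a rough illustration of how crawler visits might be recognized in a visitor log, the sketch below matches user-agent strings against a few substrings. The function names and the token table are assumptions made for this example and do not come from any analytics tool; user-agent strings can also be spoofed, so the result is an approximation at best.

```python
from typing import List, Optional

# Hypothetical token table for illustration only. Real crawler user-agent
# strings vary, and a visitor can claim any user agent it likes.
KNOWN_BOTS = {
    "googlebot": "Google",
    "bingbot": "Microsoft Bing",
    "msnbot": "Microsoft Bing",
}


def identify_crawler(user_agent: str) -> Optional[str]:
    """Return the engine name if the user agent contains a known token."""
    lowered = user_agent.lower()
    for token, owner in KNOWN_BOTS.items():
        if token in lowered:
            return owner
    return None


def bot_share(user_agents: List[str]) -> float:
    """Fraction of visits whose user agent matches a known crawler token."""
    if not user_agents:
        return 0.0
    hits = sum(1 for ua in user_agents if identify_crawler(ua) is not None)
    return hits / len(user_agents)


visits = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "msnbot/2.0b (+http://search.msn.com/msnbot.htm)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
]
print(bot_share(visits))
```

A table of substrings keeps the example short; a real audit would rely on the reports of your analytics tool rather than on hand-written matching.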
Lesson one: the results presented by the search engines vary depending on certain parameters, in this case, the type of device. We shall see that there are dozens of factors that can potentially influence search results.
Robots, or spiders, crawl the web to generate an index of it; it is this index that is queried when you search on Google.
How do robots choose their targets and adjust the frequency of their visits?
Originally, Google's crawler started on a page and then followed the links, either internal or external, between pages to explore the web.
This is why some websites, such as Wikipedia, which have very rich content and are widely referenced by other sites, have their pages visited and crawled by robots more often than others.
This concept of “reference” though still valid – as we shall see in future chapters – probably has little influence on the frequency of indexing by search engines.
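The link-following exploration described above can be pictured as a breadth-first traversal of a link graph. The sketch below runs on a small in-memory "web"; the page names and the `crawl` function are hypothetical, and this is a simplified model, not a description of how Googlebot is actually implemented.

```python
from collections import deque
from typing import Dict, List


def crawl(start: str, links: Dict[str, List[str]]) -> List[str]:
    """Return pages in the order they are first discovered from `start`."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        page = queue.popleft()
        order.append(page)
        # Follow every outgoing link that has not been seen yet.
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


# Toy link graph: each key is a page, each value lists the pages it links to.
toy_web = {
    "/home": ["/about", "/products"],
    "/about": ["/home"],
    "/products": ["/products/a", "/home"],
    "/products/a": [],
    "/orphan": ["/home"],  # links out, but no page links to it
}

print(crawl("/home", toy_web))
# → ['/home', '/about', '/products', '/products/a']
```

Note that "/orphan" is never discovered, because no other page links to it. A crawler that relies only on links cannot find such a page.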
Today, it is possible to “submit” a map of your site or “sitemap” to Google to indicate which pages can and should be crawled. It is also possible to suggest how often the content of a page will change and therefore how often it will be necessary to revisit this page.
Google and other search engines do not appreciate having their time and valuable resources wasted exploring static pages; it is therefore pointless to specify a daily frequency if your article never changes.
In case of abuse, the risks for the site are significant: penalties and/or a lower indexing frequency, and ultimately a lower position in SERPs.
If you need to occasionally change some passages, there are other techniques to promptly update the index.
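To make the sitemap idea more concrete, here is a minimal sketch that builds a sitemap in the XML format of the sitemaps.org protocol, with a change-frequency hint per page. The `build_sitemap` function and the example.com URLs are assumptions for illustration, and search engines are free to treat the `changefreq` value as a hint or to ignore it.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Values allowed for <changefreq> by the sitemaps.org protocol.
ALLOWED_FREQS = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}


def build_sitemap(pages: List[Tuple[str, str]]) -> str:
    """Build sitemap XML from (url, changefreq) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, changefreq in pages:
        if changefreq not in ALLOWED_FREQS:
            raise ValueError(f"unsupported changefreq: {changefreq!r}")
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "changefreq").text = changefreq
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages: a news section that changes weekly, an article that never does.
print(build_sitemap([
    ("https://www.example.com/news/", "weekly"),
    ("https://www.example.com/articles/web-indexing", "never"),
]))
```

Declaring a static article as "never" or "yearly" rather than "daily" is in line with the advice above not to waste the crawlers' resources.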
Ultimately, what you see in your SERP is a selection from the index. Take, for example, the SERP for the query “web consultant health industry”, with B6B Consulting in 2nd and 3rd positions, a good example of successful organic Search Engine Optimization (SEO); this company holds 1st position for the query in French.
You can access the Google index by selecting “cached / en cache” after clicking on the green arrow next to the website address.
You will directly access the “archived” page in Google’s index, which could be different from the page currently online.
Best practice: dos and don’ts
Today, there are very few ways to influence the indexing of your site; conversely there are many errors that can penalize your site, even though the term “penalties” does not officially exist.
Here are some key points to develop and maintain a technically perfect website:
- Opt for a simple architecture and avoid deep pages (more than two clicks to get there),
- Optimize intra-site navigation by multiplying internal links,
- Search and fix broken links, internal and external; spiders do not like dead-end roads,
- Regularly monitor your site: compare the number of pages indexed by Google or other engines with the number of pages submitted for indexing in your sitemap (which assumes, of course, that the sitemap is accessible to spiders),
- Check that your “robots.txt” file does not block access to any sections of your site (do not smile, this happens quite frequently),
- Find and correct “404” errors for “page not found”,
- Use redirection, such as code 301 “Moved permanently”, with discernment and for limited transition periods.
Robots and spiders abhor dead-end roads!!!
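The robots.txt check from the list above can be automated with Python's standard `urllib.robotparser` module. This is a minimal sketch: the `blocked_urls` helper, the sample rules and the example.com URLs are assumptions for illustration, and in practice you would load the robots.txt actually served by your site.

```python
from typing import List
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt: str, urls: List[str], user_agent: str = "Googlebot") -> List[str]:
    """Return the URLs that the given robots.txt rules forbid for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt with one line that hides a whole section.
robots = """User-agent: *
Disallow: /products/
"""

pages = [
    "https://www.example.com/",
    "https://www.example.com/products/a",
]

print(blocked_urls(robots, pages))
# → ['https://www.example.com/products/a']
```

Running such a check against the URLs listed in your sitemap shows at once whether a page you want indexed is being kept away from the spiders.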
For a content site, without the need for frequent updating, such monitoring can be done on a monthly basis and may take just a few minutes once the process has been put in place. Failing to monitor “the health status” of your site will increase the risk of a slow and insidious – virtually undetectable – degradation of its position in search engines.
Damage is commonly visible after several months; the root causes are then rarely identified.
The outcome is a slow death by asphyxia or, if the budget allows, a full and expensive redesign of the site – usually incorporating the same mistakes from inception.
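The periodic monitoring described above can be reduced to a few set comparisons once the data has been collected. The sketch below assumes you already have the list of URLs in your sitemap, the list of URLs reported as indexed, and the HTTP status code returned by each URL; the `health_report` function and its input format are hypothetical, and collecting the data itself is left to whatever tool you use.

```python
from typing import Dict, Iterable, List


def health_report(
    sitemap_urls: Iterable[str],
    indexed_urls: Iterable[str],
    statuses: Dict[str, int],
) -> Dict[str, List[str]]:
    """Summarize indexing gaps, 404 errors and 301 redirects."""
    sitemap = set(sitemap_urls)
    indexed = set(indexed_urls)
    return {
        # Submitted for indexing but absent from the engine's index.
        "not_indexed": sorted(sitemap - indexed),
        # Pages answering "404 Not Found".
        "not_found": sorted(u for u, code in statuses.items() if code == 404),
        # Pages answering "301 Moved Permanently", to be reviewed over time.
        "redirected": sorted(u for u, code in statuses.items() if code == 301),
    }


# Hypothetical data for a three-page site.
report = health_report(
    sitemap_urls=["/home", "/about", "/products"],
    indexed_urls=["/home", "/about"],
    statuses={"/home": 200, "/about": 301, "/products": 200, "/old-page": 404},
)
print(report)
```

Keeping each month's report makes a slow degradation visible long before it shows up in search positions.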
Do you have concerns about the health status of your site? Ask for a diagnosis from your web agency or call an independent specialist like B6B Consulting.
What are the implications for the healthcare industries?
The first stage on the long road to good web visibility is the indexation of your pages by robots or spiders. If your site contains multiple technical errors, whether by design or through poor maintenance, the indexing process, and therefore the site's visibility, will be compromised.
For healthcare industries, there is a lot to lose by neglecting these technical aspects.
For a pharmaceutical company for example, developing and publishing original and quality content is quite easy. The question of the target audiences will be addressed in future articles on Search Engine Optimization for health industries’ websites.
However, if the indexing process is deficient, which is often very difficult to diagnose, visibility on search engines could be severely degraded and the effort invested in that content wasted, simply because it is commonly easier to focus on site layout, animations, videos, services, etc. than on the technical points.
All of this is then compromised by inadequate site indexation, caused by non-compliance with a few simple and stable rules that search engines apply when indexing a website.
Ultimately, it is a waste of time and money to develop quality content, well-structured and optimized for a selection of key words and expressions, if your site cannot be indexed because a line in your robots.txt file is blocking access to spiders and crawler robots. This example is intentionally caricatural; however, it stresses that technique must precede content and editorial optimization.