What is web crawling and what is a robots.txt file
Free Analysis of your website: http://www.contentlook.co
https://www.youtube.com/watch?v=Ylg4sBl1vPc
What is web crawling
You're probably wondering what this robots file is all about.
I'm going to tell you why it's important to have a robots file on your website,
and I'm going to tell you what you can do to fix it.
So, you have the robots file. And what is a robots file?
Well, the robots file is like having a very large "No Trespassing" sign
saying "do not cross".
There is a huge "do not cross" sign, but who is it for?
It's for search engine crawlers.
What is a crawler?
Imagine crawlers as very tiny robots [unclear].
All these small things just go from link to link to link,
gathering as much information as possible about each link that they stumble upon.
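The link-to-link behaviour described above can be sketched in a few lines of Python. This is only an illustration, not how any real search engine crawler is built: the three-page site in `SITE`, the `crawl` function, and the `fetch` callable are all made up for the example, and the pages are kept in memory so nothing is downloaded.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Visit pages breadth-first by following links; return URLs in visit order.

    `fetch` is a hypothetical callable that returns the HTML for a URL,
    or None if the page does not exist.
    """
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links like "/about"
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# Made-up site, stored in memory (assumption for the example).
SITE = {
    "https://example.com/": '<a href="/about">About</a> <a href="/blog">Blog</a>',
    "https://example.com/about": '<a href="/">Home</a>',
    "https://example.com/blog": '<a href="/about">About</a> <a href="/private">Private</a>',
    "https://example.com/private": "Internal notes",
}

pages = crawl("https://example.com/", SITE.get)
print(pages)
```

Note that this sketch reaches `/private` simply because another page links to it, which is the situation the robots file is meant to address.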
These crawlers can find all the information that you have on your website, and sometimes this is harmful,
because your website may have pages that you don't want anybody to know about,
and pages that you don't want to be found via search engines.
Having a robots file is a "No Trespassing" sign for the crawlers,
which will try to go and find every page that you have on your website.
If you want these pages to be found, that's fine, no problem.
But if you have, let's say, this page, for example,
you can just place it in your robots file.
That will tell the search engine crawlers [unclear],
it will tell them: do not cross, you cannot go there.
You can go here if you want.
You can take information from this page.
You cannot take information from this page, or from this one.
It's very important to have a robust robots.txt file on your website.
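As a rough illustration of the "you can go here, you cannot go there" idea, the sketch below feeds a small robots.txt to Python's standard `urllib.robotparser` module and asks which URLs a crawler may fetch. The paths `/private/` and `/drafts/`, the `example.com` URLs, and the crawler name `ExampleBot` are assumptions made up for the example.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt: every crawler ("*") is asked to stay out of
# two directories; everything else remains open.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url, user_agent="ExampleBot"):
    """Return True if the rules above let `user_agent` fetch `url`."""
    return parser.can_fetch(user_agent, url)


print(allowed("https://example.com/blog/post.html"))       # True
print(allowed("https://example.com/private/report.html"))  # False
print(allowed("https://example.com/drafts/idea.html"))     # False
```

The file only states the rules; it is up to each crawler to check them before fetching a page, as a well-behaved one would do here with `can_fetch`.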
How can you fix this?
How can you make sure that you have a robots.txt file on your website?
If you're using WordPress, you can simply search for
a WordPress SEO plugin
that already comes with a robots.txt file.
For example, the Squirrly SEO Plugin
already has it; it will just add a robots.txt file to your website.
Otherwise, if you want to fix it and you don't use WordPress,
you can use the button that we have in this ContentLook report
that says "send this issue to your team members",
and then you can get your technical person or the SEO person
on your team
to start solving this problem.
We provide exactly what they need to know about the robots file: press that button,
and somebody from your team will know how to solve this problem for you.
That's all about the robots.txt file.
If you’ve recently made changes to a URL on your site, you can update your web page in Google Search with the Submit to Index function of the Fetch as Google tool. This function allows you to ask Google to crawl and index your URL.
Googlebot is Google’s web crawling robot, which finds and retrieves pages on the web and hands them off to the Google indexer.
To keep the index current, Google continuously recrawls popular, frequently changing web pages at a rate roughly proportional to how often the pages change.
A robots.txt file is a text file that tells web crawler software, such as Googlebot, not to crawl certain pages of your site.
Although respectable web crawlers follow the directives in a robots.txt file, some crawlers might interpret those directives differently.
While Google won't crawl or index the content blocked by robots.txt, it might still find and index information about disallowed URLs from other places on the web. As a result, the URL address and, potentially, other publicly available information such as anchor text in links to the site can still appear in Google search results.
You can prevent a page from appearing in Google Search by including a noindex meta tag in the page's HTML code.
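To make the noindex tag concrete, here is a small sketch that checks whether a page's HTML carries a `<meta name="robots">` tag whose content includes `noindex`. The two HTML snippets and the helper names `RobotsMetaParser` and `has_noindex` are invented for the example; it uses only Python's standard `html.parser` module and is not a description of how Google reads the tag.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives listed in <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html):
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Made-up pages (assumptions for the example).
HIDDEN_PAGE = '<html><head><meta name="robots" content="noindex, nofollow"></head><body>Hidden</body></html>'
PUBLIC_PAGE = "<html><head><title>Hello</title></head><body>Public</body></html>"

print(has_noindex(HIDDEN_PAGE))  # True
print(has_noindex(PUBLIC_PAGE))  # False
```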