Tag Archive | "live search"

Tags: getting started, live search, mentor, online income, virtual assistance

The Internet Garden: Tending to Your Virtual Assistance Business Part 1



Think of all the times in life that you wished someone were right there to help you. And think of the lengths we sometimes go to in order to make that a reality.

Fact is, we could fill a book with the measures people have gone to in order to have help at the ready. Well, thanks to the Internet, we can add another. It is called a virtual assistant (or VA).

Practically speaking, there are a number of kinds of virtual assistant work, just as there are red peppers, green peppers, and yellow peppers. For this article we will consider one prime example: transcription. It can apply to legal matters, corporate work, or, as in our example, a medical setting.

The beauty of VA work and other Internet businesses is that you can usually do them right from your own home. In fact, a good number of the VAs out there are stay-at-home moms.

Naturally, you have already determined that you are qualified to take on these types of assignments, and you have equipped yourself with a capable computer and quality Internet access. Now you need to get some work.

You have a few options here. One option applies if you already work in an office handling administrative duties like the ones mentioned above. Why not sit down with your boss and discuss the possibility of doing this work from home? Setting that up is relatively easy, and everything can be sent across the web right back to the office. And thanks to the web, you are always “just a ring of the bell” away.

A second option is finding an independent company that supplies assignments to its staff. That staff is not cooped up in an office; they are spread far and wide and easily reached via the Internet or phone.

The third option is starting your own independent VA service and contracting your own assignments. For example, I know a talented gal who set up a transcription service from her home with a local doctor. He and his staff dictate notes for her, and she has the training and skills to transcribe them onto electronic media. All she needs, aside from the training, is a reliable computer, the necessary programs, and a dictation machine to play back the dictation tapes that are mailed to her. When she is finished, she simply mails back the dictation tapes along with a printed copy of the transcriptions, and e-mails an electronic copy as well.

This started out as a side job to supplement her day job. However, it quickly grew as other doctors contacted her, a couple of them hundreds of miles away! Why would doctors want to do this?

For one, as we mentioned at the outset, we all love it when dependable help is right there whenever we need it. Secondly, and very importantly, this is often more cost effective. The job does not require the VA to deal with customer or patient needs, so the VA does not need to be “there”. It also means the doctor is not paying for a salary, benefits, workers’ compensation, vacation, sick days, or a space to work. And what the doctor pays for the VA’s services is a tax write-off (at least in the USA). It’s a win/win situation for the doctor and the VA.

Have you decided what type of peppers you are going to plant? Because once you have, the steps that follow are much the same whatever you planted. And so it is with your VA business.

You have narrowed down what area of the VA world you want to conquer; now you need to do some gardening, or cultivating. To solicit and acquire jobs, you will need to prepare a professional resume to present to potential clients. I believe transcription courses offer their students tips on how to nail down some work. Some students, I have been told, have gotten leads or a supportive reference from their course provider. Obviously, any hands-on experience you have will go a long way in helping you land a project.

Remember, in these times of economic difficulty the idea of using VAs is rapidly growing. Go back a couple of paragraphs and re-read why that is the case. If a doctor, lawyer, or corporate manager can get the same quality of work without taking on the operating costs of housing staff, that is very inviting. Again, it is truly a win/win situation. And don’t kid yourself; this is going to be the pattern for years to come.

What are you waiting for then? Do you possess these skills already? Then get started by taking on some side projects and grow your VA “plant” from there. Need some education to do this? Check out the courses available in the local schools in your area. You would be surprised how accommodating the schedule can be. Take your classes around your schedule and go from there. Now go!

Subscribe to BevyHost via RSS Feed
Follow BevyHost on Twitter

 

 

Copyright © 2009 BevyHost.com.

Posted in Online Marketing | Comments (0)

Tags: keyword tool box, keyword tools, keywords, live search, online business, Search Engine Optimization, seo, traffic

Breaking Out The Toolbox



Got a loose screw or a broken household item? Maybe your car or a kitchen appliance needs some attention? Perhaps you just got some furniture or another item that needs to be assembled? So “who you gonna call?”

Most likely you are going to break out the tools and have a go at it. Even if you don’t consider yourself Mr. Fix It, chances are you have handled a screwdriver a time or two. Then again, perhaps you are a handyman, which makes it more likely that you will have success with your tools.

When it comes to building, managing, and promoting your web site, web tools are just as necessary and just as beneficial. Why is that, people may wonder? And what kind of web tools are available? For that matter, where do you find the toolbox?

For the purposes of this discussion we are going to focus on the proper use of keywords. Equally relevant, we are going to review the value of keywords and the need for them. What kind of keywords are we talking about?

As many of you know, a large percentage of Internet surfers use search engines to locate web sites that contain the information they wish to view and digest. Obviously the most famous search engines are Google, Yahoo, and now Bing. Most surfers do not recall or have the exact name of the web site they are looking for, so they use these search engines to assist them.

When they simply type in a relevant word, or “keyword,” that describes or relates to the site they want to find, the search engine presents a list of search results. Usually that will jog the memory or offer satisfactory options for the Internet traveler. This is where the use of keywords is valuable.

Practically speaking, if someone is seeking information, products, or services that you offer, your aim is to have your web page feature high on the search result list that is produced in response to the words the Net surfer types in. For this to happen, your site needs to contain a nice array of popular keywords that are related to your site and its content.

The more relevant keywords you use, and the more likely searchers are to type them, the more promotional mileage you will gain from them. At the same time, and seemingly illogically, overuse of individual keywords can have little effect, or even a negative effect, on your page ranking. “How do I know what is enough and not too much?” one would naturally ask.

You can use a simple percentage formula that some call keyword density: the number of times a keyword appears divided by the total number of words in your content. According to some who analyze the effectiveness of keywords, you want to keep the keyword density in the 3% to 8% range, which plainly speaking means that no single keyword should make up more than 8% of the total words in your content. So if you have a post of 300 words, then no keyword should be used more than 24 times for optimum effect. At the other end of the range is 3%, meaning you will employ some keywords at least 9 times each in a 300 word post. Now keep in mind this is not an exact science, so don’t be calling down evil upon me if this formula does not work 100% of the time.
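For those who like to see the arithmetic spelled out, here is a minimal Python sketch of that rule of thumb. It assumes single-word keywords and a very simple way of splitting text into words; the function names `keyword_density` and `allowed_uses` are made up for this illustration and do not come from any search engine's tools.

```python
import re
from collections import Counter


def keyword_density(text, keyword):
    """Return the share of words in `text` equal to `keyword`, as a percentage.

    Assumption: a "word" is any run of letters, digits, or apostrophes,
    and matching is case-insensitive. Multi-word phrases are not handled.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    count = Counter(words)[keyword.lower()]
    return 100.0 * count / len(words)


def allowed_uses(total_words, low_pct=3.0, high_pct=8.0):
    """Turn the 3%-8% rule of thumb into (minimum, maximum) uses of one keyword."""
    low = round(total_words * low_pct / 100)
    high = round(total_words * high_pct / 100)
    return low, high


if __name__ == "__main__":
    # The 300-word post from the paragraph above: between 9 and 24 uses.
    print(allowed_uses(300))
```

Calling `allowed_uses(300)` gives the same 9 and 24 worked out above, and `keyword_density` lets you check a draft post against the range. Treat the result as a rough guide only, for the same reasons given above.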

Clearly the use of keywords is not the only variable in successful web marketing and search engine rankings. For example, the location of keywords is important. If, for instance, your post is about classic Corvettes, you would not save words like Chevrolet and car until the last paragraph, nor would you use those words 9 times in the first paragraph of your 300 word post.

Though it is important to use your premium keywords at the beginning of your post and in its description, you also want to weave your keywords cleverly throughout your content. “For crying out loud, I am still confused, frustrated, and totally clueless,” some of you are objecting. Don’t worry, you are not alone.

That is why help is available. Hey, many of us can’t fix our cars or our appliances, or maybe, like me, you hate to put bookshelves together. So you let the “professionals” use their toolbox; nothing wrong with that. Search engine services, like Google, have online tools that will analyze your site and the content you feature and give you valuable suggestions for keywords to use.

Many of them have keyword tracking tools that provide a chart for you to see how many people have used the keywords they are recommending. So you can see for yourself which ones are better when compared to others. Of course you will use an appropriate number of them.

So as we wrap up this powwow, it is vital that we remind you of a couple of tidbits. As is always the case, success breeds success. By all means use words that are working productively for you. As a matter of fact, you can also look at the success of others and the keywords used by other webmasters with sites similar to yours. You can even track their keyword responses to know just how effective each of their tag words is.

In conclusion, please, and I mean please, keep in mind this is not a word “game”. You need to use keywords in conjunction with quality content. Many web promoters try to cheat the system and spam a post with a saturation of keywords or tags. Or they write a post with the right formula of keyword use, but the content is senseless and completely pointless. If you want keywords to work for your Internet promotion, then you must use them appropriately.

Now you have the secret or at least the toolbox.


Posted in Online Marketing | Comments (0)

Tags: internet business, internet research service, keywords, live search

Top 10 Internet Businesses: Internet Research Services




“Seek and ye shall find,” to quote a famous teacher. (And you should be ashamed if you don’t know who that is.) In a broader sense, we all have heard of the people that are in the information business, have we not?

And frankly speaking, there are a few of us that are actually very, very good at gathering information. Let’s try and ramp up that number. Maybe that will include you?

Which raises the question: what is an Internet research service, and why is it important? And better yet, can you make money doing it? Let me assure you that if this gig is for you, you can make some dough in it. So keep reading, please.

The Internet research profession or trade primarily revolves around gathering needed information for corporations, legal firms, research centers, and many small to medium sized businesses. What kind of research? Let’s take a contemplative look into that.

  • In the corporate world and in smaller businesses: Information is crucial in the corporate arena. It involves a multi-pronged strategy covering information and data on products, sales, manufacturing, shipping/receiving, supplies, market trends, market status, and financial analysis.
  • In the legal department: This takes a few directions also. Legal research is relied upon in day to day business and industry to help avert potential legal liability. This is why corporations and major businesses have a legal department and/or a law firm on retainer. In a standard law firm, extensive research and compilation of relevant data go a long way toward winning cases. Nothing beats facts and good old preparation. Have you ever seen the volumes of data that law firms used to have to comb through by hand? Imagine the value and efficiency of using the Internet and internal computer storage!
  • Research centers: These may range from medical and science related facilities and charities to insurance companies and other analogous enterprises and organizations.

Therefore it is not hard to picture the necessity of having thorough and relentless information seekers on staff if you want to continue being successful and avoid obstacles or surprises. The Internet has become an enormous and substantial tool for accomplishing the task. And the salary potential is quite rewarding for those who score a position in an Internet research job. The earning potential is also there if you operate an Internet research service.

The clincher is that the skills required are very straightforward. You will need to be proficient in Internet use and research skills. You need to know how to plod methodically through an abundance of stats, testimony, legal precedents, reports, summaries, case histories, records, facts, and trends, and how to separate out the nonessentials and the bull. The cool thing is that people who are good at this usually like doing it. It appears to provide some kind of euphoric high, an endorphin rush, as they digest and sift through information like a conquering Jedi Knight. And did I tell you that you get paid for it?

Aside from gladiator-librarian-like research skills and Internet command, you will need strong customer service skills. And if a company is looking beyond the research duties, there may be qualifications that involve law studies, business management, business operations, or other studies relevant to your job description.

So, what do you think? Are you a person with fleet fingers and a hunger for information gathering? Because if you are, then there are opportunities at your front door.

The beauty of all this is that it may offer the option of working at home. Seriously, Internet research can truly be a satisfying, challenging, gratifying, and lucrative Internet business. You really need to check this out.


Posted in Online Marketing | Comments (0)

Tags: how to make money, internet business, live search, online business, Online Marketing

Morphing Internet Skills Into a Cool Job



Who doesn’t like to get paid to do something really cool or to work at a place that is connected to a passion of theirs? Perhaps getting paid to watch sporting events or being in the entertainment business?

Hey, maybe you can do that. Let’s home in on the entertainment world first. Yes, the Internet has opened up a whole new world of interactive programs. This naturally has created unique job positions to be filled: jobs related to and requiring Internet skills.

Now, some of these opportunities would customarily be linked to the broadcasting field. Nevertheless, there is a growing number of positions that involve both the Internet and the entertainment field. Like what?

Take the recent integration of social media with news broadcasting and talk shows. For example, Oprah often has guests who use Skype to appear on her show. Clearly there needs to be someone setting up these video streams.

Likewise, the use of Twitter or Facebook has become common on CNN, Fox News, etc. Someone has to set this up, filter through these instant messages, and then make a few hand-selected premium comments available for the program.

Or how about radio programs that use interactive tools like social site commenting along with email and messaging? Someone has to help the on-air staff get the most engaging comments noticed. With the sheer volume of people participating, it would be hard for the host to read them all personally.

Or how about a live talk show, including one on the radio? As subjects are dissected and digested, research and verification need to be done quickly and thoroughly. The host doesn’t want to be caught with their “pants down” on the air, stammering for statistics and relevant studies. Behind the scenes the Internet is fired up and being used relentlessly to find the facts and pertinent information. As a matter of fact, some of this work will also be done in preparation for the program.

And don’t forget, this applies to your local broadcasting outlets too. And never forget the countless Internet radio options that also do various versions of similar programming and thus also need Internet taskmasters.

Make no mistake about it, the Internet is vital in today’s entertainment, talk, and news business. But speaking of stats: professional and college sports, and any covered sport for that matter, are a breeding ground for fact finders. Stats are an enormous part of sports. As the game is being broadcast and the action is being described, someone is working the Internet for related information like a shark looking for lunch.

For example, let’s say you are watching an NFL football game. The Pittsburgh Steelers have the ball at their own 35 yard line and they need at least 3 points to win the game. They are playing the Cleveland Browns and there are only 2 minutes left in the game. Within minutes you will be told how many times the Steelers have been successful in this scenario. You will learn how close they need to get for their field goal kicker to have a high percentage chance of winning the game with a field goal. They will tell you what that percentage is. You will hear about the last time he kicked a game winner. You will hear how good a percentage the Browns have at stopping a team in this scenario. You will already have been kept abreast of what each team has done thus far in the game: how many times they threw the ball, how many times they ran the ball, and so on. And this is just a small sample of what you would hear.

The fact is, I could go on and on with the stats and interesting factoids that will be thrown at you with the rapid-fire rate of an Uzi. The point is, someone is feeding this information to the broadcaster.

And this is the case in any sport. Watch an MLB baseball broadcast and you will see shots of the broadcasters sitting behind or next to a computer as they do their job. So as the game progresses, someone behind the scenes is feeding those stats and information and posting them to the broadcasters’ monitors.

Then we have the after-the-game reporting: box scores for the web page, articles reviewing the game that include facts and data, and on and on.

Some web sites even have a play by play live game cast. So someone is watching the game while getting paid to feed the events of the game into the web page as they happen, so that people can check online and follow the live action.

These are just 4 examples of how the Internet is interwoven in the sports broadcasting business. And again, this applies to sports at all levels, even at your local high schools and colleges, and even with your local minor league professional teams. Wouldn’t it be awesome to work for a sports franchise and be their Internet researcher, fact finder, and stat finder? Or to work for their webcast or web site?

You would go to all the games, work with the on-air personalities, and get to use your cyber skills to boot. You’ve got to love it.

So are you a sports lunatic who is also quite handy with the Internet? Are you a gun slinging outlaw when it comes to finding facts and stats? Then you need to look into the stat man angle or web applications, find yourself a job on the broadcasting team, and unleash your powers on the sports broadcasting world.

Or are you into the talk show or news scene? Face it, the vast spectrum of talk shows and news programs has Internet teams helping to put out a whiz-bang of a show.

Hey somebody has to do the job, so why can’t it be you?


Posted in Online Marketing | Comments (0)

Tags: how to make money, keywords, live search, seo

Topical Power



Years ago, sad to say, the Internet was driven by the sex industry. Well, perhaps that was not sad for everyone, but for us as a society it was a little alarming, in my humble opinion.

If you look at the most popular web sites, you may argue we have not progressed very far from those days. Among the most searched and visited sites are pages about Paris Hilton (I’m not kidding), Britney Spears (still not kidding), Eminem (I wish I was kidding), Girls (as in pretty girls), and American Idol.

Of course there are other magnetic and enormously popular sites that are a far cry from the fan club and celebrity obsession sites. And of course I beg to differ with calling some of those people celebrities. But that’s just me. Anyhow, I digress.

Also among the most visited sites are Facebook.com, Google.com, Hi5.com, Live.com, MSN.com, MySpace.com, Orkut.com, Wikipedia.org, Yahoo.com, and YouTube.com. You also have the notorious and fashionable sites like ebay.com, craigslist.org, amazon.com, adultfriendfinder.com, and so on.

When it comes to you and me, however, we need to take a look at a different aspect of the popular web searches and site preferences. After all, none of the sites above are operated by any of us. Well, I suppose some of us could have a fan site devoted to one of the more famous and sought after celebs, which actually brings me full circle to the point we are going to discuss.

Inevitably if we have a web page featuring content or anything related to a repeatedly searched topic then we are most likely going to do well. Why do we say most likely? That is because there is more to a successful web page than getting people to visit it. However that is markedly important. So how should that affect us?

If we are deciding what direction to take with an Internet business then we should understand what people want. After all we all know that this is a primary part of the supply and demand business model. So, what are people searching, in general terms, these days?

The numbers and the studies indicate that there are a number of topics that people actively want to read about. Among them are health topics, food, sports, dating, business, arts, games, computers, recreation, shopping, society and cultural subjects, real estate, and reference material. This can break down like this:

  • Health = weight loss, medical conditions, beauty care and products, exercise, energy boost, and mental health concerns
  • Food = Diet, recipes, restaurants, etc.
  • Sports = Teams, leagues, players, tickets, etc.
  • Dating = personal ads, dating advice, chat rooms, etc
  • Business = finance, management, Internet business, job search, training, supplies, technology, etc.
  • Arts = Music, concerts, literature, paintings, sculpture, photography, fan pages, etc.
  • Games = Online games, Xbox, PlayStation, etc
  • Computers = Education, Internet business, Internet use, technology, etc.
  • Recreation = Travel, movies, events, etc
  • Shopping = etc
  • Society and Culture = politics, news, current events, world events, heritage, etc
  • Real Estate = home searches, real estate agents, etc
  • Reference material = dictionary, encyclopedia, Wikipedia, online schooling, research, etc

In light of this analysis, it would be beneficial to approach the presentation of your site with a topic mentality. Yes, use the power of topics, or what some would call niche markets, to target those who have already demonstrated a pattern of searching these topics of interest.

This should also come in handy when you are choosing keywords and meta tags for your site. Overall, the lesson to be learned here is that we live in a “table of contents” type society, especially when it comes to the Internet. So give the people what they want and make it easier and more likely for them to find your web page!


Posted in Online Marketing, Search Engine Optimization | Comments (0)

Tags: live search, online income, online marketting

Delivering The Data



The old cliché is that men are the hunters and gatherers and women are the caretakers and the homemakers. And while there may be a hint of truth to that cliché, it is just as often true that the average person loathes doing data searches.

Oh sure, many of us will hop online and do a search engine inquiry with some amount of eagerness. And we feel special when we find exactly what we are looking for. But on the other hand, we are very frustrated when we can’t get our hands on the information we need or want. So much so that if we find someone offering the information we need for a reasonable fee, we will seriously consider purchasing it. And many of us will take them up on their offer. And why not?

Selling information is not new by any stretch. Take Consumer Reports magazine, for example. It is a publication that puts products to a rigorous test and then reports the results to the public. The magazine is not free, and yet it is enormously successful.

Or take the wacky, quirky guy who screams at you in his commercials. He runs around Washington, DC, wearing a suit with question marks all over it. He sells government grant information, information that is free if you look hard enough. But some people would rather pay someone who has apparently done the homework for them.

Some people will appropriately interject here that this guy is unethical and misleading, but my point is that there is a market for data nevertheless.

And yes there are people who are unscrupulous and slippery in this niche market. Sadly that is the case in any market. Our 2 examples demonstrate the potential for delivering desired data to the online public.

Yes indeed, people are more than willing to subscribe to a trusted data site just as they have already proven they would purchase a book or a magazine.

Just in the last couple of weeks I have used data services for a real estate/foreclosure venture, for pricing trading cards, and for getting the market “blue book” value of vintage stereo equipment, and I was willing to pay a monthly fee for this data. Now, in the name of full disclosure, I have seen many data sources that I thought were overpriced, and therefore I declined; but again, can’t we say that about any product or service?

How about you though? No doubt many of you are like me. However there is no question that there are some hunters and gatherers among us. Maybe you have been looking for a door to open to you? Or possibly, more specifically, you are looking for an opportunity to start an Internet business. Perhaps this is your door.

Is it possible that you have an affinity or an ability for compiling usable information? A data service is relatively easy to set up. It is also fairly inexpensive to do. And it is not a matter of creating original content, either. The concept is very direct. First, you need to determine what type of data has a practical, or at least a perceived, value to a data seeking public. Then put in the due diligence, amass accurate and applicable facts, and provide them to your potential subscribers.

Obviously, the more meticulous you are, combined with genuine veracity, the more successful you will be. As you review the variety of sites that are already delivering the data, you will find you have many options and strategies to choose from. For example the trading card value site I use charges by the month. You can stop and start it at your convenience. This works for them because it is information that changes as time goes on. Thus they know you will likely want to return if you’re satisfied. Others sell data on a “one time fee” basis. Whatever works, right?

It may seem abstract or too much of a niche market for some, but the fact is there is a large percentage of Internet users out there looking for data. Consequently if you are a skilled data accumulator and are adept at dispensing useful data then you need to deliver it. And why not make a few bucks while you’re at it?


Posted in Online Marketing | Comments (0)

Tags: live search, webmaster news

Live Search Webmaster Center Feed


Webmaster Center blog comments Q&A, Round 3

The Bing Webmaster Center team has been very busy lately, working on very cool stuff that we can't wait to share with you (patience, Grasshopper – all will be revealed in time). But the blog waits for no one (well, that's the intent, anyway). From time to time, we gather up enough interesting tidbits of Q&A that we want to share with all of our blog readers. Now it's that time again. So let's get to it.

Q: I'm not able to gain access to Webmaster Center with the authentication code used in a <meta> tag. Can you help?

A: The Webmaster Center online Help topic Authenticate your website recommends using a <meta> tag formed as follows:

However, some users attempt to combine the authentication codes for multiple sites in a single <meta> tag. If you must use the <meta> tag method of authentication (as opposed to the XML file authentication method as described in the Help topic), we recommend placing your Bing Webmaster Center authentication code last so that it is not followed by a space. In addition, Webmaster Center does look for the proper XHTML-based closing of the tag – the " />", so be sure to use this closing in your code.

This issue is discussed further in the Webmaster Center forum topic Site Verification Error for Bing Webmasters Tools.

Q: Why do I have to register as a user in the Webmaster Center blog just to post a comment?

A: We were getting a few non-registered visitors who were posting way too much spam in the blog comments. We needed to block that junk from being posted, so we implemented a new rule that requires folks to register before they can leave comments. Since we can control spam from registered user accounts, we felt this was the best course for minimizing the disruption of irrelevant comments. We hope this is not a hardship on anyone!

Q: I've posted two random blog comments requesting inclusion of my site into Bing News Service. Why haven't you added my site?

A: Let's redirect those requests to the right place. To request that Bing add your news site to our list of news sources, we ask that you send the request via email to the Bing News Service team. Please be sure to identify yourself, your URL, what types of news you provide, your audience, and any other determining factors such as awards won, etc.

Q: I have a very complicated or specific question to ask about my site and the Bing index. Can you answer it here?

A: Blog comments are best used for furthering the conversation about the associated blog article. Specialized service requests or specific questions about Bing products and services requiring detailed, individualized answers are always better left in the Bing Webmaster Center forums as a starting place. We have a forums administrator on staff who, along with the regular VIP contributors there, can offer helpful advice and insight to your questions. There are some amazing folks participating over there!

Q: How do I get my company listed in the Bing local listings?

A: Use the Bing Local Listing Center form. You may need to sign in to your Webmaster Center account or create a new sign-in account to access this form.

Q: How can I ensure that my local business contact information (address and phone number) from my website gets into the Bing index?

A: One common problem we see with this is that some sites rely solely upon an image containing text to convey this information. This is not good practice for SEO. If you want to be sure MSNBot (or any other search engine bot) can read such information, please add it to your website as text (the image is OK as long as the text version also exists)!
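
To check your own pages for this problem, a short script can confirm that the phone number exists as real text rather than only inside an image. This is a minimal sketch using Python's standard `html.parser`; the sample markup, phone number, and function name are made up for illustration, and comparing digits only is a simplification.

```python
import re
from html.parser import HTMLParser


class TextNodes(HTMLParser):
    """Collects text nodes, which is roughly what a crawler can read."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def contact_info_in_text(html, phone):
    """Return True if the phone number appears in the page's text nodes."""
    parser = TextNodes()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    # Compare digits only so that formatting differences do not matter.
    page_digits = re.sub(r"\D", "", text)
    phone_digits = re.sub(r"\D", "", phone)
    return phone_digits in page_digits


# Hypothetical page fragments.
image_only = '<div><img src="contact.png" alt=""></div>'
with_text = '<p>Call us at (206) 555-0100</p><img src="contact.png" alt="Contact details">'

print(contact_info_in_text(image_only, "206-555-0100"))
print(contact_info_in_text(with_text, "206-555-0100"))
```

The same idea extends to street addresses, although matching those reliably takes more than a digit comparison.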

Q: Your recent posts on web spam have brought up a question: how do I report web spam that I find in search engine results pages to Bing?

A: To report web spam sites, we recommend that you go to the Bing Support web form to file the complaint. In the Problem list, select Content Removal Request. In the resulting list box, select Other. In the comments text box, include specific and detailed information in your report. Complete the rest of the form and then click Submit.

A member of the Bing web spam team will review the report and investigate the matter. If the report is accurate, appropriate action will be taken. Note that if the report is malicious and false, no action will be taken against the accused site.

Q: My website offers tax-related services. As a result, I use the word "tax" numerous times in my content. Could Bing consider my site to be web spam due to the appearance of keyword stuffing? When do I cross the line from acceptable to web spam?

A: The key here always comes back to how the content appears to the human reader. Is it logical? Is it readable? Does it make sense? In this particular case, the repeated use of the word "tax" in content regarding tax services offered is reasonably expected and thus is fine. In fact, including a solid set of explanatory content that defines these keyword phrases only strengthens the case for reasonably repeating this word. If the use of this repeated word makes contextual sense to the reader and is not a clumsy attempt to stuff the word in where it's not necessary or helpful, and you have a good amount of supporting content to accompany it, you'll be fine. Our crawler sees this usage and understands it is legitimate. Just write your content for the reader's comprehension and the crawler will not penalize you for keyword stuffing.

The important thing to remember is that true web spam often involves multiple issue violations. As such, it typically takes more than one violation to trigger web spam consequences – having a slightly above average number of keywords won't automatically torpedo your site. Just as you need to do several things well to improve your ranking (build good content, build valuable inbound links, target several keywords, etc.), you need to do several things wrong to really hurt your ranking. That said, if it's obvious that you are trying to abuse the system, even with just one egregious issue, then penalties will ensue.

Lastly, we don't define a precise borderline between acceptable content and web spam. If you think what you've done might be considered web spam because you know you're trying to game the system, then take a different approach to optimizing your pages. I'll repeat my mantra: write content for the human reader, not the crawler. Develop good, unique content that is readable, understandable, and valuable. If you do this without involving any black-hat, SEO-style trickery in an effort to artificially boost your ranking, then you'll never have to worry about this being an issue.

Q: Regarding backlinks in forum comments and link-level web spam, is it only a problem when the page linked to is not relevant to the conversation in the forum, or is this a problem for all backlinks?

A: It always comes down to whether the effort is intended to legitimately benefit the human reader or benefit the owner of the link. If the link in a blog comment is relevant to the content in both the blog article and the blog comment and, as an extension to that content, is of value and interest to the reader, then it is not a problem. In fact, this is a fine idea (whether or not the rel="nofollow" attribute is automatically applied by the blog to user-generated links). However, if the link in the blog comment is not relevant to either the blog article or the blog comment's content, is not of relevant, legitimate interest to the reader, and instead is only beneficial to the link owner, then that is web spam. It's pretty straightforward.

Also consider how the blog comment link is formed, as in whether it is a single link inline with the comment's content or a bazooka blast consisting of multiple, irrelevant links following a short, generic message that could be applicable to anything (or nothing). If your goal is to tell the reader about some information relevant to the post and that info is found within good content on your site, that's great. Add those links! Even if rel="nofollow" is employed by the blog in all UGC-based links, the potential for driving live traffic to your site is good, and if the content there is worthwhile, that will improve public awareness of that content and ultimately be a good link building strategy. But if the comment is merely an excuse for blatant advertising links, it is web spam. Note the difference in intent. If you do right by the reader, you'll be fine.

If you have any questions, comments, or suggestions, feel free to post them in our SEM forum. Later…

– Rick DeJarnette, Bing Webmaster Center


Illuminating the path to SEO for Silverlight

Microsoft Silverlight is a transformative technology. It enables otherwise basic websites to act as full-blown applications, provides access to state-of-the-art animation and video rich media presentations, and takes full advantage of your development team's existing experience in standard programming languages, such as C#.

However, there is one little problem. Like competing rich Internet application (RIA) technologies that present images, animations, and videos, all of that non-text-based content is extremely hard for search engines to parse and index. As a result, many website owners who are initially thrilled at the cutting edge presentation shown on their websites are later confounded when their beautiful sites suddenly fail to show up in the search engine results pages (SERPs). The problem is that the very technology used with the intention of blowing away their customers, when not thoughtfully implemented, can literally blow away their page rank as a result. In those cases, the site developer/webmaster failed to account for search engine optimization (SEO) when they implemented Silverlight.

Silverlight applications are packaged up for deployment into files with the extension .xap. These files represent the instructions to start the application and, in most cases, contain the content of the application. Unfortunately, search engines can't easily read Silverlight XAP files. Yes, the technical parsing and content extraction capabilities used in the world of search are improving every day. But as of today, you'd be wise to cater to what the search engines already do very well: read text. This means you need to add text-based information – metadata and more – to the Silverlight objects you employ on your website.

You might think it's not worth the effort to do such work. If you don't care about being found in search, you might have a point! But consider this: if some of your intended clients use operating systems or web browsers that are unsupported by Silverlight, what will they see when they go to your Silverlight-enabled site? Will the site still be usable? Will its presentation still make sense? Or will the page be blank? Do you even know?

If you care about these audiences, then the same backward compatibility work you'd do to help them will also serve you in your SEO efforts. As an overall investment, if you want to use advanced content technologies to improve the user experience for your customers, you might well want to invest in making your site more accessible to all potential users, including search engine crawlers (aka bots).

Basic SEO

Right off the bat, there are several things you can do to help the search engine bot learn more about your Silverlight-infused pages. Because bots cannot "read" Silverlight content the way we can, the wise addition of metadata is all the more important in these pages. This information helps the bots interpret the theme of the page and how the content relates to other pages and sites, and it provides keywords that help the search engine understand the page well enough to rank it accurately among other relevant sites for search users.

Much of this advice is actually basic SEO, but as Bing so commonly sees RIA-laden pages that possess none of this information, it bears repeating here. All of your pages, not just those using Silverlight, should have these elements in the source code:

  • Descriptive <title> tag. Every page should include a descriptive and unique <title> tag. That information is part of what a bot reads to assess what sort of content is contained on the page. Using a title such as Silverlight application is about as useless as no <title> tag at all. Get specific!
  • Descriptive <meta name="description"> tag. Another important page element that bots use to determine the contents of a page is the text within the "description" <meta> tag. This information is often used to help create the website description snippet used on a SERP. As before, don't go generic here – be specific and unique. There is often so little text-based information on a Silverlight page that every little bit of unique content will be that much more meaningful to the search engine indexer.
  • Descriptive <h1> tags. The first level heading is second only to the <title> tag for being the place to define the thematic contents of a page. As such, stick to only one iteration per page, but make it meaningful and unique.
  • Discoverable navigation. No man is an island, but a web page with no discernible navigation links to other pages might be. And any page built without any discoverable navigation to other pages must not be very important – at least, that's the way bots will see it. Be sure every page on your site is linked to at least one other page, and link out to other pages from every page so the bot doesn't get stuck in a blind alley and abandon crawling your site any further.
  • Descriptive alt text. When you add an <img> tag to your page, be sure to provide that additional meta content. Bots can't read the contents of that image, even if it is merely an image of text, so the alt text you add is critical for helping the bot better understand what it cannot see.
  • Meaningful application name. Just as there is some SEO value to creating human-friendly URLs, where the directory and file names spell out logical words rather than globally unique identifier (GUID)-based gibberish, there is value to naming your Silverlight application in a manner that helps identify its purpose or role in the page. An object in the page code identifying "SilverlightApp1" is meaningless to everyone but the originating developer (and even then, it's questionable!).

Every one of these elements is an opportunity to develop keywords for your pages. Be sure to use keyword-rich text in every opportunity. But as always, do so wisely. Keep it readable and oriented for human readers, not stuffed for bots. Keyword stuffing will only get your site in trouble.
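
The checklist above lends itself to a quick automated audit. The following is a minimal sketch using Python's standard `html.parser`; the class name, function name, and issue messages are invented for this example, and it only checks for the presence of each element, not whether the text in it is any good.

```python
from html.parser import HTMLParser


class BasicSeoAudit(HTMLParser):
    """Records the basic on-page elements described in the checklist."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self.h1_count = 0
        self.images_missing_alt = 0
        self.link_count = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "img" and not (attrs.get("alt") or "").strip():
            self.images_missing_alt += 1
        elif tag == "a" and attrs.get("href"):
            self.link_count += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit(html):
    """Return a list of human-readable issues for one page."""
    parser = BasicSeoAudit()
    parser.feed(html)
    parser.close()
    issues = []
    if not parser.title.strip():
        issues.append("missing <title>")
    if not parser.description:
        issues.append("missing meta description")
    if parser.h1_count != 1:
        issues.append(f"expected one <h1>, found {parser.h1_count}")
    if parser.images_missing_alt:
        issues.append(f"{parser.images_missing_alt} image(s) without alt text")
    if parser.link_count == 0:
        issues.append("no navigation links found")
    return issues


# Hypothetical pages for illustration.
complete_page = """
<html><head>
<title>King County Traffic Map</title>
<meta name="description" content="Live traffic conditions for King County, Washington." />
</head><body>
<h1>Traffic Map for King County, Washington</h1>
<img src="map.png" alt="Rush-hour traffic map" />
<a href="/about.html">About this map</a>
</body></html>
"""
bare_page = '<html><body><img src="map.png"></body></html>'

print(audit(complete_page))
print(audit(bare_page))
```

A page that passes this audit can still be poorly optimized; the point is only to catch the pages that are missing the basics entirely.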

Graceful degradation

OK, so the basic SEO stuff has been knocked out, but what more can you do for a Silverlight page? As it turns out, we've only just begun to optimize.

The key to success in ensuring that down-level users will not be abandoned when you use an advanced technology like Silverlight is to implement a graceful degradation strategy. That means if a client, for whatever reason, cannot access the advanced primary technology offered (in this case, Silverlight), they can still get something out of the page by way of a lesser, secondary technology, be it metadata, substitute text on the page, a static image, or whatever else you can provide, content-wise, to assist those users.

To provide that graceful degradation experience to your users, modify your Silverlight pages to include one or more of the following solutions.

1. Present alternate, static page content

Instead of relying on another embedding tag, use the <object> tag to instantiate your Silverlight content in your page. The <object> tag allows the page to provide secondary, down-level content to be presented in case the initial, primary content (such as a Silverlight application) cannot be presented. By using the <object> tag, you can include text descriptions and other relevant content following the instantiation of the application in the code. Write these text descriptions toward the non-Silverlight user, describing the Silverlight application's role on the page, its function, or any other pertinent information that would help down-level users understand what would have been shown if they were able to access Silverlight. Be sure to use your page's targeted keywords as you describe the Silverlight content.

Below is an example of how you can include contextual, alternative information within your page's Silverlight <object> tag code:

Traffic Map for King County, Washington


Typical King County metro weekday rush-hour traffic at 5:00pm

Silverlight enabled computers can use this page to see up-to-date traffic conditions on the major roads and highways in King County, Washington.

It’s easy to install Silverlight on your computer. See what you've been missing!

As you can see, the alternative content included the important <h1> tag and some informative content identifying the role of the Silverlight application. And by providing a link to installing Silverlight, you might enable another user to step up and see your page in its primary view.
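
To get a feel for what a non-Silverlight client or a bot might pick up from such a page, the sketch below pulls the fallback text out of an <object> element using Python's standard `html.parser`. The markup in `SAMPLE` is a hypothetical reconstruction for illustration only: the .xap path, image file, and install link URL are placeholders, and the class and function names are invented.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; file names and URLs are placeholders.
SAMPLE = """
<object data="data:application/x-silverlight-2," type="application/x-silverlight-2" width="640" height="480">
  <param name="source" value="ClientBin/TrafficMap.xap" />
  <h1>Traffic Map for King County, Washington</h1>
  <img src="traffic-map.png" alt="Typical King County metro weekday rush-hour traffic at 5:00pm" />
  <p>Silverlight enabled computers can use this page to see up-to-date traffic conditions.</p>
  <p>It's easy to <a href="https://example.com/get-silverlight">install Silverlight</a> on your computer.</p>
</object>
"""


class ObjectFallback(HTMLParser):
    """Collects text and image alt text nested inside <object> elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag == "object":
            self.depth += 1
        elif self.depth and tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.found.append(alt)

    def handle_endtag(self, tag):
        if tag == "object" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.found.append(data.strip())


def fallback_text(html):
    """Return the text fragments available as down-level content."""
    parser = ObjectFallback()
    parser.feed(html)
    parser.close()
    return parser.found


for fragment in fallback_text(SAMPLE):
    print(fragment)
```

If this returns an empty list for one of your Silverlight pages, down-level visitors are probably looking at a blank area where the application should be.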

2. Use multiple <div> sections

Another strategy for creating a graceful degradation of Silverlight includes using multiple <div> sections on the page: one for the actual Silverlight content and another to be shown on computers that do not have Silverlight installed. Similar to the previous example, this sample demonstrates the presentation of static page content:


Traffic Map for King County, Washington


Typical King County metro weekday rush-hour traffic at 5:00pm

Silverlight enabled computers can use this page to see up-to-date traffic conditions on the major roads and highways in King County, Washington.

It’s easy to install Silverlight on your computer. See what you've been missing!

Note that the alternative <div> is created by default as hidden content. Contrary to the generic advice given in the recent page-level web spam article, The pernicious perfidy of page-level web spam, the use of hidden content in this case is recognized by the search engine as contextually related to the graceful degradation strategy for Silverlight. As such, its use in this case will not raise any red flags to the search engine concerning potential web spam. As usual for these types of things, interpreting user intent is key to search engine bots identifying whether or not an ambiguous page element might be malicious.

3. Expose alternate, dynamic content

What if you are using Silverlight for more than just a single webpage application? What if you have a site-wide, Silverlight application used in an e-commerce scenario? In that case, you'll want to expose your inventory catalog of deep link content to search rather than have it left invisible in Silverlight. For this, you'll need to take a different approach. The alternate content here must describe any and all end point(s) that you want to make available to the search engine bot.

Instead of doing a deep dive here on this technique (this article is already getting long!), I'll instead refer you to a few useful resources of information on how to expose these end points to the non-Silverlight user and the bot. Both include good code examples and a clear explanation of how the technique is employed:

4. Use the createObject function in JavaScript

This is a more developer-oriented SEO strategy that you can employ with Silverlight. This technique uses JavaScript to automatically generate the markup code needed to create the <object> tag and its required parameters.

Again, as no one wants to read a white paper posing as a blog column, I will simply point you to helpful resources for more information:

Test the new down-level experiences

Once you've implemented your Silverlight graceful degradation strategy, test it in non-Silverlight-enabled environments. Popular choices among SEOs include text-based, web browser environments such as Lynx browser or SEO-browser. You can also use operating systems currently incompatible with Silverlight, such as Windows 98, Linux, FreeBSD, or SolarisOS, or unsupported web browsers, such as Opera. For details on Silverlight compatibility, see the list of Compatible Operating Systems and Browsers.

Planning graceful degradation of Silverlight for SEO is identical to planning for those clients that are not Silverlight-enabled. Once your pages present useful, alternative content to non-Silverlight clients using the suggestions above, you can rest assured that search engine bots will also be able to see the results of your effort. And until bots can read RIA-based multimedia content like humans can, that is how you do SEO with Silverlight.

For additional information on performing SEO on Silverlight-enabled webpages, see the following:

If you have any questions, comments, or suggestions, feel free to post them in our SEM forum. I'll be back soon!

– Rick DeJarnette, Bing Webmaster Center


Chasing the long tail with keyword research (SEM 101)

When a key opens a lock, it typically provides the key's holder with a clear path to where he or she wants to go. Keywords and key phrases do the same for a website. They help direct searchers to content they wish to see on the Internet. But there is a key difference: whereas a lock key will typically match up with only one lock, keywords can lead a searcher down multiple paths to many matching, relevant websites. It is a filtering process that leads the holder to the destination to which they want to go. (At least that's how it's supposed to work – see my recent article on keyword web spam for times when this is not the case.)

Search engines are still heavily oriented toward text-based content. Even when other media types are indexed, it is typically done so using text-based descriptions. Search engine users separate the wheat from the chaff on the Internet by searching for words that are relevant to the information they seek. That is their key. Smart webmasters, anticipating those users who will employ search to find content similar to what they've published, can boost their chances of bringing searchers to their websites by using the same words in their content that searchers will use in their searches. It's simply a matter of matching the right key to the lock to reveal the content searchers want.

Sometimes the keywords and key phrases searchers will use for a given field of interest are obvious, but that's not always good news for webmasters. If these keywords are obvious to you, it's likely that they are obvious to everyone, and if your site falls into one of those fields, all of your competitors' websites will be using those same keywords.

The long tail of search

If this is the case for you, there's no need to despair – there is hope. There's an often-overlooked truism in our industry: search has a long tail. Most webmasters only work to identify their sites with the head, so there's typically a lot of untapped value to be had in working on that long tail.

What do I mean by head and tail? Consider the form of a tadpole. Much of its mass is in the big head, but then its form flows into a long, tapering tail. Graphs of keyword search trends often look like a tadpole with a very long tail. A few primary keywords typically dominate a sizable percentage of the search traffic, but then there are secondary and even tertiary keywords. By themselves, they are clearly not as effective as the primary keywords, as fewer users search on them. But there are people who either search directly on them or use them as a part of longer queries, and those users are just as valuable as conversion opportunities as users of primary keywords. The key distinction here is that most webmasters do not bother to actively compete for those potential customers in the long tail.
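
To make the tadpole shape concrete, here is a small Python sketch that splits a keyword list into head and tail and reports how much of the search volume each holds. The query counts are made-up numbers, and treating the top three keywords as the "head" is an arbitrary cutoff chosen for illustration.

```python
def head_and_tail(query_counts, head_size=3):
    """Split keywords into head and tail and compute the head's share of searches."""
    ranked = sorted(query_counts.items(), key=lambda item: item[1], reverse=True)
    total = sum(count for _, count in ranked)
    if total == 0:
        raise ValueError("no search volume to analyze")
    head, tail = ranked[:head_size], ranked[head_size:]
    head_share = sum(count for _, count in head) / total
    return head, tail, head_share


# Made-up monthly search counts for a tax services site.
searches = {
    "tax": 5000,
    "tax services": 3000,
    "tax preparation": 2000,
    "tax help for freelancers": 900,
    "quarterly estimated tax deadline": 700,
    "home office tax deduction rules": 600,
    "amend prior year tax return": 500,
    "tax services for nonprofits": 300,
}

head, tail, head_share = head_and_tail(searches)
print(f"head keywords: {[keyword for keyword, _ in head]}")
print(f"head share of searches: {head_share:.1%}")
print(f"tail share of searches: {1 - head_share:.1%} across {len(tail)} keywords")
```

In this invented data set the tail holds a bit under a quarter of all searches, spread across keywords that far fewer sites are likely to be competing for.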

If you are in an industry that has a few heavy-hitter, powerhouse websites as competitors, whose webmasters have worked hard to develop great content and earn authoritative backlinks, it can be as frustrating as chasing your own tail for a smaller upstart to compete with those sites using the same primary keywords. Competing in the long tail can be a great way to mop up some otherwise untapped business and begin to develop a name and reputation for your website. It's always better to compete for a high rank for a few keywords in the tail than to merely settle for a middling or worse rank for the most popular keywords in the head (settling for mediocrity is what most webmasters do, which is why there are so often good opportunities for the taking).

And with the time you spend successfully targeting the long tail keyword opportunities, if you make the effort to simultaneously develop quality content and work to earn authoritative inbound links for that content, your site will only increase in stature. At that point, you can start thinking about getting more competitive for those primary keywords in the head as well.

Make it so

So all of that sounds fine in concept. But how do you execute on such a plan? You have to know what keywords are being used in your field. You need to know what keywords you need to use on your website. You need to make your website a legitimate target for searchers who use those keywords. To get such keyword intelligence, you need a great keyword tool. One that is easy to use, draws from strong industry data sources, and offers a variety of views of that data. Frankly, I suggest you take a look at Microsoft Advertising Intelligence.

Microsoft Advertising Intelligence is the successor to the 2009 beta tool called adCenter Excel Add-in Keyword Research Tool. As you might have inferred by its previous moniker, it installs as an add-in to Microsoft Office Excel 2007 (it won't work with any previous versions of Excel, however). You'll need an account with adCenter to gain access to the keyword data, but that's easily enough done, and there's no cost for setting up the account. Note that the tool was designed for users of search marketing (aka Pay Per Click [PPC] ads). However, the research needed to develop strong-performing keywords for PPC ads parallels that of keywords for search engine optimization (SEO), and thus the tool is easily repurposed for those efforts.

Once installed, Microsoft Advertising Intelligence is presented as a tab on the Excel ribbon named Ad Intelligence. Click that tab, and from there, you have access to a series of helpful tools that can help you perform the following tasks:

  • Extract current keywords from an existing site
  • Create new keywords by starting with an existing list, a webpage, or by selecting a vertical
  • Expand current list of keywords by examining advertiser bidding selections and analysis of search query data
  • Analyze keyword performance by query, time, demographics, geo-location, and more
  • Identify the categories using that keyword and drill down to common queries
  • Identify key performance indicators (KPIs) for keywords and compare yours against industry averages
  • Look up typical PPC keyword pricing for particular keywords
  • Learn the click-through-rate (CTR) and the cost per click (CPC) around your chosen match-type position
  • Learn about industry KPIs and learn more about your own particular vertical, including the average CTR and CPC, and then compare your performance against your vertical's average

I recommend that, immediately after installation, you first configure the tool to work with your adCenter account. In the Options & Help section of the ribbon, click Options, and then fill in the User name and Password fields with your adCenter credentials. Click Test Connection to confirm everything is ready to go. Once you get a message box confirming the connection was good, click OK to close the open dialog boxes.

There are nine tool buttons on the ribbon, some containing multiple, related tools. Instead of me trying to explain all of the cool stuff that Microsoft Advertising Intelligence can do, I'll simply refer you to the tool's website for technical documentation, its active community forum, and the numerous video tutorials.

Identify the long tail

Once you've installed the tool, you can use it to pull a list of the current keywords used on your website today. Here's how:

  1. In Excel's Ad Intelligence tab, click the Keyword Wizards tool, select the option Extract from website, and then click Next.
  2. Type the URL you want to use, and then click Next.
  3. You can first review the keywords extracted by clicking Review, and then click Next to continue.
  4. Select the option Queries That Contain Your Keyword to see other keywords based on those extracted from your site, and then click Next.
  5. You can either change the setting Maximum suggested keywords or use the default. Click Next to continue.
  6. Click Review to see the updated list, and then click Next.
  7. To see historical data on the usage of the keywords in your list, click Monthly traffic, and then click Next.
  8. You can then modify the range of dates for historical usage performance data retrieved as well as for forward prediction usage or keep the defaults.
  9. Click Finish to get your report.

In the resulting report, you can change the sort order of any of the columns of data to see which keywords and key phrases had the highest CTR on any particular month or in aggregate.
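
If you export the report data, the same sort is easy to reproduce outside Excel. The sketch below assumes each row carries a keyword, an impression count, and a click count; those field names and the sample figures are assumptions for illustration, not the tool's actual report columns. CTR is computed the usual way, as clicks divided by impressions.

```python
def rank_by_ctr(rows):
    """Return (keyword, ctr) pairs sorted from highest to lowest CTR."""
    ranked = []
    for row in rows:
        impressions = row["impressions"]
        # Guard against division by zero for keywords with no impressions.
        ctr = row["clicks"] / impressions if impressions else 0.0
        ranked.append((row["keyword"], ctr))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Made-up report rows.
report = [
    {"keyword": "tax", "impressions": 50000, "clicks": 500},
    {"keyword": "tax services", "impressions": 10000, "clicks": 200},
    {"keyword": "quarterly estimated tax deadline", "impressions": 800, "clicks": 48},
]

for keyword, ctr in rank_by_ctr(report):
    print(f"{keyword}: {ctr:.1%}")
```

Sorting by CTR rather than raw volume is one way to surface long-tail phrases that draw few searches but a high proportion of clicks.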

If you want to be very specific in conducting your research and customizing your reports, you can skip the keyword wizard and instead use the other tools in Microsoft Advertising Intelligence to narrow down keywords for specific verticals, demographics (including age, gender, and location), and more. You'll see which words are the highest performers, and how those words have performed recently.

This is powerful information: you'll learn which words are being used in your field and how frequently. Check your site's keywords against those of the movers and shakers in your field, and you may discover some under-utilized keywords in the long tail of search that may be a golden opportunity for your site.

Once you do, implement them wisely on your site, and then monitor your site's progress over the coming weeks and months. For advice on implementing keywords wisely, check out our earlier blog articles on using keywords, including Put your keywords where the emphasis is (SEM 101) and The key to picking the right keywords (SEM 101). Whatever you do, don't follow the examples of keyword abuse documented in the blog article The pernicious perfidy of page-level web spam (SEM 101). Remember that SEO is not an overnight quick fix. Time is needed for crawling and reindexing changed content from the search engine side and then for searchers to find you. Patience, along with hard, smart work, will pay off. (And don't ignore other aspects of a thoughtful SEO plan that can improve ranking as well, such as creating great, unique content and earning authoritative, high-quality inbound links!)

So stop chasing your own tail. Instead, invest in chasing the long tail of search by using a keyword intelligence tool like Microsoft Advertising Intelligence. That is the key for unlocking success in search.

If you have any questions, comments, or suggestions, feel free to post them in our SEM forum. See you again soon…

– Rick DeJarnette, Bing Webmaster Center


The liability of loathsome, link-level web spam (SEM 101)

When I was a kid in high school, I used to go to the public library and do initial research in the Encyclopedia Britannica (yes, the bound book editions. I also remember black & white television with vacuum tubes and rotary telephones! Sheesh, I'm getting old!). I would pick up the index volume that contained the keyword I wanted to look up to identify which of the main volumes had the content I sought.

But imagine this: when I opened up the referenced main volume to the page specified, I always found the content I wanted. I never once went to the content page referenced in the index and found a page full of advertisements, come-ons for dubious physical enhancement pharmaceuticals, or any irrelevant, unwanted garbage like that. That's how Internet search is supposed to work, too.

Search engine as master index

However, unlike the Encyclopedia Britannica, which maintained sole control over the information it published (thus making its index a really good bet for finding the content you want), the fast and loose world of the Internet is open to all comers, for better or for worse. The upside of that openness is that information of all types, from highly important to trivial (and in all ranges of value, from well-researched reports to skewed opinions to deceptive trash), can be found, but you must know where to look. This is where a search engine's role as master indexer comes into play. Services like Bing use their own resources to scan the Web for content and organize their findings into a useful index of the available content for users.

But since no one entity has control over the content placed on the Web, the useful and informative website is joined by the unscrupulous huckster, who spends a huge amount of effort to deceive the search engine index in order to bring unsuspecting web searchers to their irrelevant website. This deception is the core of web spam.

Bing and other search engine service providers work diligently to detect web spam-tainted results and keep them out of our search engine results pages (SERPs). It's a tough battle, and it requires a great deal of work to keep our SERPs useful and legitimate for search customers.

We’ve already discussed the basic definition of web spam and one of the two major implementations, page-level web spam, in previous blog articles. We'll wrap up this web spam series with a discussion of the other major type, link-level web spam. And finally, we'll discuss what a webmaster can do to restore their website's listing (removing penalties) with Bing once the detected web spam has been removed.

Definition of link-level web spam

Link-level web spam uses web link deceptions in an attempt to artificially inflate the page rank of a specific page or site. Savvy webmasters know that earning high quality, relevant inbound links from authoritative sites can have a very positive influence on the search engine's rank of the linked site – we recently published a blog post on this subject titled, Link building for smart webmasters (no dummies here). This is good search engine optimization (SEO). Some less savvy and/or more unscrupulous folks believe they can simply substitute high quantity for the "high quality, relevant" part of the equation, swap in junk or irrelevant sites for "authoritative sites," and achieve the same goal. Sadly for them, this is not the case.

The intent of the link-level web spammer is to create huge numbers of inbound links (typically from unrelated, low quality sites) to attain illegitimate page rank for a site to fool web searchers into visiting their sites. Luckily, Bing and the other search engines can assess the quality and authority of a particular website.

Sites employing link-level techniques also often employ page-level web spam techniques to make their sites appear to be relevant to a commonly searched keyword when they are not. The use of link-level web spam techniques will cause a search engine to examine your site more deeply, and if it's determined to be using web spam techniques, your site could be penalized.

As we stated earlier with page-level web spam, some of these techniques can have valid uses at their core, but the intention behind their use is the distinguishing factor. When we detect deceptive intent as we crawl the Web, we identify those pages as web spam and penalize them as appropriate, ranging from neutralization (which levels the playing field for other sites offering content on the same subject) to expulsion from the index. As you can imagine, for an online-based business, these are serious consequences, so it pays to know what NOT to do when you optimize your site for search (or hire a consultant to do the same).

Post web spam

Definition: Post web spam consists of outbound links embedded in user-generated content (UGC) posted on other websites, such as in guest book pages, forums, blog comments, message boards, and referrer logs.

Problem: The destination links in post web spam are usually unrelated, topic-wise, to the page containing the UGC outbound link. Often these posts include multiple links. In sites that rely on post web spam for inbound links, it is not unusual for a sizable percentage of all of their inbound links to be from post web spam.

What we look for: Several techniques for implementing post web spam are commonly used, including:

  • Add backlinks to all UGC content. When users go onto websites that allow UGC to be created, those who use post web spam include backlink URLs to their sites, even if they don't have anything to do with the comment or, more significantly, the theme of the UGC-sponsoring site.
  • Automation. Spammers often use automated techniques to repeatedly submit the same UGC post containing short, generic text and a clickable URL to their sites in every UGC-sponsoring page possible.
  • Keyword stuffing. Post web spam text is often keyword stuffed. Check out our page-level web spam article titled The pernicious perfidy of page-level web spam for more information on this.
  • Massive repetition. Lots of non-relevant, poor quality, inbound links come from such pages as online guest books, forums, and blog comments.

What post web spammers don't often realize is that many UGC-oriented pages automatically append the attribute rel="nofollow" to any links created in UGC content. As such, no inbound link credit is derived when search engines crawl and index these pages.
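To make the rel="nofollow" behavior concrete, here is a minimal Python sketch of how a UGC-hosting site might rewrite the links in a submitted comment. It uses only the standard library's html.parser module. The names NofollowRewriter and add_nofollow are hypothetical, comments and doctype declarations are silently dropped, and a real site would normally lean on a vetted HTML sanitizer rather than something this small.

```python
from html import escape
from html.parser import HTMLParser


class NofollowRewriter(HTMLParser):
    """Re-emit an HTML fragment, forcing rel="nofollow" on every <a> tag."""

    def __init__(self):
        # Keep entities as written so they can be passed through untouched.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def _render(self, tag, attrs, closing=">"):
        if tag == "a":
            # Drop any rel value the commenter supplied, then add our own.
            attrs = [(k, v) for k, v in attrs if k != "rel"]
            attrs.append(("rel", "nofollow"))
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in attrs
        )
        self.parts.append(f"<{tag}{rendered}{closing}")

    def handle_starttag(self, tag, attrs):
        self._render(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._render(tag, attrs, closing=" />")

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")


def add_nofollow(fragment):
    """Return the fragment with rel="nofollow" applied to all of its links."""
    parser = NofollowRewriter()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.parts)
```

Calling add_nofollow on a comment before storing or rendering it means that any link a visitor posts carries the attribute, whatever rel value the visitor tried to supply.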

From a webmaster point of view, however, we encourage active, regular cleaning up (or better yet, preventing) of UGC-based web spam content. If there is too much junk or web spam content on a page, it could reflect poorly on the overall quality of your page, even if you are applying rel="nofollow" to links. For that matter, it is very important for any site that allows UGC content to actively monitor their site's security. Hosting malware can also get a site penalized, and you don't want that! For more information on malware, see our blog article series called The merciless malignancy of malware, Part 1, Part 2, Part 3, and Part 4.

Link farming

Definition: A link farm is a large collection of websites that exist for the sole purpose of providing massive numbers of links to targeted websites, ostensibly to improve the appearance of their organic, online popularity.

Problem: Link farming is often employed to promote one website using many other websites or it can be a commercial enterprise in which the link farm sells its unscrupulous (and worthless) outbound linking services to less-SEO-savvy webmasters.

What we look for: Link farming is often implemented using the following techniques:

  • Large, sudden surge of new inbound links. When dozens or hundreds of inbound links suddenly appear for a new or a previously small website, such a big change can indicate link farm web spam activity. The relevance of the outbound linking sites will be a key factor in whether or not such a sudden change warrants further investigation.
  • Consistent similarities between outbound linking sites. If a large number of the inbound links for a site come from sites that are very similar in design, structure, and other key characteristics, this can lead to deeper scrutiny of a website for web spam.
  • Poor linking standards. A link farm will often have a large number of unrelated links on the page, or will have related links to many sites that employ other spam methods. The pages themselves are designed to maximize the number of links on them, favoring outbound links rather than original content on the page.
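As a rough illustration of the "sudden surge" signal in the list above, a webmaster could watch their own inbound link counts with something like the following Python sketch. It is not a description of how Bing evaluates links: the function names, the weekly granularity, and the threshold of 10 are all assumptions made up for the example.

```python
from statistics import median


def link_surge_ratio(weekly_new_links):
    """Compare the latest week's new inbound links with the earlier median."""
    if len(weekly_new_links) < 2:
        raise ValueError("need at least one earlier week plus the latest week")
    *history, latest = weekly_new_links
    # Floor the baseline at 1 so brand-new sites do not divide by zero.
    baseline = max(median(history), 1)
    return latest / baseline


def looks_like_surge(weekly_new_links, threshold=10.0):
    """True when the latest week dwarfs the site's usual pace of new links."""
    # threshold is an arbitrary illustrative value, not a published cutoff.
    return link_surge_ratio(weekly_new_links) >= threshold
```

A jump flagged this way is only a prompt to look closer. As noted above, the relevance of the linking sites is what decides whether a surge is a problem.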

When link farms are identified, those sites are penalized, which negates the value of the links they contain. In addition, the pages they link to are more likely to be heavily scrutinized for other forms of web spam.

See our earlier blog articles for more information on what makes a good link versus a bad link.

Link exchanges

Definition: Unlike link farms that target a few selected sites, link exchanges are organized groups of websites who participate in providing reciprocal inbound and outbound links ostensibly to benefit all websites in the exchange.

Problem: Web spam-oriented link exchanges typically involve unrelated web sites reciprocally exchanging links en masse for the purposes of rank inflation. As such, they offer no value to human visitors, and thus they are candidates for being considered web spam.

While earning inbound links is a part of legitimate SEO activities, as we've stated before, Bing values quality links over quantity of links. Inbound links from sites unrelated to the theme of your site, typical with most link exchanges, will be of little to no value to you for improving your page rank.

What we look for: Link exchanges usually include the following activities:

  • Starts out as email spam. Link exchanges often start out as spam emails sent from webmasters of unrelated sites asking other webmasters if they would like to improve their ranking by exchanging links.
  • Excessive links. Link exchanges (reciprocal links) between unrelated sites, especially when done to excess, can be indicators of web spam, and a participating website might be more heavily scrutinized for other web spam problems.
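For readers who want to audit their own reciprocal links, here is a small Python sketch that finds pairs of sites linking to each other. The input format (a dictionary mapping each site to the set of sites it links out to) and the function name reciprocal_pairs are assumptions for illustration. The sketch only finds the pairs; it makes no judgment about relevance, which is the part that actually matters.

```python
def reciprocal_pairs(outbound):
    """Return sorted (a, b) pairs where a links to b and b links back to a."""
    pairs = set()
    for site, targets in outbound.items():
        for target in targets:
            # Ignore self-links; require a link in both directions.
            if target != site and site in outbound.get(target, ()):
                pairs.add(tuple(sorted((site, target))))
    return sorted(pairs)
```

For example, if bnb.example links to winery.example and winery.example links back, the function reports that one pair, while a one-way link to a third site is left out.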

Note that reciprocal linking is not an automatic red flag. Some websites within a particular niche will link to others when it provides a relevant value to their customers. For example, think of a bed and breakfast that links out to local wineries and a winery that links out to local bed and breakfasts – these are interrelated activities within a region that are naturally relevant for site visitors.

But as usual, too much of a good thing can be bad. And when there is no relevance between linked sites, the value of link exchanges can quickly degrade down to the level of web spam (especially when the number of unrelated links is deemed excessive).

Penalties and restitution

Mistakes happen. An entrepreneurial do-it-yourselfer optimizes a website based on bad (spammy) advice from the Web. A Mom-and-Pop-shop website owner naively hires an unscrupulous website consultant. Heck, it's even possible that a search engine might mistakenly label an innocent site as web spam. So what do you do?

If you made a mistake on your site and your rank has been neutralized, the solution is easy. Web spam neutralization is handled automatically with Bing. If you are using web spam techniques on your website and you want to remove the site's web spam neutralization penalty, eliminate the web spam violations and then republish your website. Once the Bing crawler, MSNBot, recrawls your site, if the web spam violations have been removed, the neutralization will be automatically resolved in the index.

But what if your site has been purged from the Bing index? That requires some manual intervention.

Request reconsideration for your site

If you search for your site in the Bing index using the advanced search keyword phrase site:www.myURL.com (using your URL, of course!) and nothing turns up, your site is not in the index. If this is a sudden change and you know you've used some unscrupulous web spam techniques, you'll need help to get back into the index.
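If you check this regularly, the query can be turned into a link you bookmark or open from a script. The Python sketch below builds one; the function name bing_site_query_url is hypothetical, and the https://www.bing.com/search?q= address is the ordinary public search URL rather than any special webmaster interface.

```python
from urllib.parse import urlencode


def bing_site_query_url(hostname):
    """Build a Bing search URL that applies the site: operator to hostname."""
    return "https://www.bing.com/search?" + urlencode({"q": f"site:{hostname}"})
```

Opening the returned URL in a browser shows the same results as typing the site: query by hand.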

First of all, fix all of the web spam violations on your site. Not just one or two, but all of them. Then, once you've republished a corrected version of your website, contact Bing support to request reconsideration of your website's penalty. Here's how:

  1. Go to Bing E-mail Support and fill out the form completely
  2. Select Content Inclusion Request from the drop-down list. A new drop-down will appear underneath.
  3. From the new drop-down list, select Reinclusion request.
  4. Write a clear and detailed explanation of what you have done to resolve the problem in the next text box. (You can prepare this in advance, and then copy and paste the text into the form.)
  5. Type the security code from the presented image into the text box below.
  6. Once the form is completed, click submit.

A member of the Bing support team will quickly review your request and schedule your site to be recrawled. If the crawler determines that all of the violations have indeed been resolved, then your site is eligible to be added back into the index. But be patient – this process doesn't happen overnight (which is why it's a wise idea to avoid such web spam penalties in the first place).

For more information on Bing penalties and restitution, see the blog article Getting out of the penalty box.

If you have any questions, comments, or suggestions, feel free to post them in our Ranking Feedback and Discussion forum. Later…

– Rick DeJarnette, Bing Webmaster Center

P.S. It was suggested to me that I list the other articles in this web spam series for those who might be interested in reading the entire set, so here goes (in order of publication):

Enjoy!

Rick

The pernicious perfidy of page-level web spam (SEM 101)

In the exciting world of today's Internet, where the world's information is literally at your fingertips, where you can endlessly communicate, shop, research, and be entertained, spam is a big downer. The unwanted email spam that fills our inboxes also consumes huge portions of the available bandwidth of our routers and trunk lines. But email is not the only spam game in town.

Web spam is the bane (well, one of the banes) of the search engine and web searcher communities. Search engines want to provide search users with a great experience, helping them find what they want as quickly and as easily as possible. Search users want to use search engines to get the right information they seek as quickly as possible. And webmasters want search users to find their websites, but also to get those search user visitors to become conversions instead of bounces.

Web spam, those unwanted garbage pages that use overtly deceptive search engine optimization (SEO) techniques and contain no valuable content, is a frustration to search engines and search users alike, and ultimately works against the best interests of conversion-seeking webmasters (severely annoying a potential customer is rarely a great sales technique!).

In the previous article that defined web spam and discussed how it is different from junk content, we mentioned that there are two types of web spam. In this article, we're going to delve into the details of the first type: page-level web spam.

Definition of page-level web spam

Page-level web spam uses on-page SEO trickery (not to be confused with link-level web spam, which we'll discuss in an upcoming article). Webmasters and optimizers for these sites do this because they believe they can fool the search engines into giving their webpages a higher-than-deserved ranking based on their content relevancy, often for subject areas that are completely unrelated to the site's actual content. This is done in an effort to deceive searchers into visiting their spammy sites for a multitude of reasons, none of which usually benefits the end user.

The use of the following questionable SEO techniques will cause Bing to examine your site more deeply for page-level web spam. If your site is determined to be using web spam techniques, your site could be penalized as a result.

Note that Bing recognizes that the core concepts behind many of these techniques can have valid uses. No one is saying that their use always and automatically denotes web spam. The issue of intent behind their use is the distinguishing factor for determining whether or not web spam is present and any site penalties are needed. Please understand that, from a search engine perspective, the web spam effort consistently provides very little to no value whatsoever to end users. The entire effort is directed to fraudulently affect search engine rankings. As Martha Stewart might say, that's not a good thing.

Keyword URL and link stuffing

Definition: This is the use of heavily repeated keywords and phrases with the goal of attaining a more favorable ranking for those words in a search engine index.

Problem: Keywords can be repeated to excess, so much so that they render any text in which they appear unintelligible from a natural language point of view. Those excessive repetitions can also be added in places that are not seen by the end user (meaning outside of displayed page text). Some web spam pages even use repeated keywords that are unrelated to the theme of the page. If any of these conditions are detected, these techniques will draw the attention of Bing as likely web spam.

What we look for: The purveyors of web spam use a variety of methods for keyword stuffing, including:

  • Excessive repetitions of keywords. The number of repetitions relative to the amount of content on the page is a key indicator of web spam. For example, a very long page of text dedicated to a single topic may naturally repeat its primary theme keyword several times, but a page with less content using the same number of repetitions of the same word may be indicative of keyword stuffing.
  • Stuffing words unrelated to the page or site theme. Stuffing the page with words that are known to be heavily searched on the Web when they are irrelevant to the theme of a site can be an indicator of web spam. Relevance is an important factor for evaluating whether keywords are indicators of web spam.
  • Stuffing on-page text. Littering the text of a page with repeated keywords that render the text meaningless and unreadable to humans is a clear problem. When such content on the page is not useful to people, the content is often suspect as web spam.
  • Stuffing in less visible areas of the page. Placing repeated keywords in less visible areas of a page, such as at the bottom of the page, in links, in Alt text, and in the title tag, can be indicative of web spam.
  • Hiding stuffed keywords in the code of a page. Putting keywords in the code of a page where the search engine crawler (aka a bot) will see them, but configuring the page so that a web browser will not show them to a human reader, is highly suspicious. Such methods as formatting text fonts the same color as the background, using extremely small fonts, and hiding stuffed keywords using tag attributes such as style="display: none" and class="hide" (both of which prevent the tagged contents from being shown to the user) will draw the attention of a search engine for closer scrutiny.
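The first point in the list above, repetitions relative to the amount of content, can be pictured as a simple ratio. The Python sketch below computes it for a block of page text. The function name keyword_density and the simple word-splitting rule are assumptions for illustration; no cutoff value is implied, and this is not how Bing scores pages.

```python
import re


def keyword_density(text, keyword):
    """Fraction of the words in text that equal keyword (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)
```

The same five repetitions of a word give a far higher density on a 20-word page than on a 2,000-word page, which is the contrast the first bullet describes.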

Note that stuffing the keywords tag alone is not a reason to be judged as web spam. But tag stuffing could be an indicator that other web spam techniques may be employed and could draw a search engine to take a closer look at such a site.

It is important that webmasters not overreact to this information. A small amount of relevant keyword repetition is considered common and is not considered web spam as long as it is used naturally within the page content language and the page provides useful, relevant content. The key message is always the same: develop your pages for human readers, not for search engine bots, for the best results. For more information on creating and using keywords wisely, see the blog articles The key to picking the right keywords and Put your keywords where the emphasis is.

Misspelling and computer generated words

Definition: Pages populated with many various spellings of targeted keywords, especially those unrelated to the theme of the page or the site, can indicate that the keyword lists are computer generated.

Problem: Aggressive inclusion of large numbers of misspelled or rare word lists and phrases can be considered web spam when used to excess. The relevance of those words to the theme of the page or the site is the key distinguishing factor here.

What we look for: The Bing team commonly sees the following techniques on web spam sites:

  • Excessive use of misspelled keywords. Huge lists containing all possible iterations of a misspelled word can be so excessive that the page will be worthy of closer inspection for web spam.
  • Large numbers of misspelled words unrelated to the theme of the site. Long lists of word spelling variations whose core definitions are unrelated to the theme of the page or the site can indicate the site is web spam.
  • Common misspellings of popular site URLs in domain names. Sites that use common misspellings of popular URLs as domain names, or that consist of other computer-generated content, are usually considered web spam.
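One common way to reason about "misspellings of popular site URLs" is edit distance, the number of single-character changes needed to turn one string into another. The Python sketch below implements the standard Levenshtein distance and uses it to list popular domains that a candidate closely resembles. The function names and the max_distance default of 2 are assumptions for illustration, not a description of Bing's detection.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def near_misses(candidate, popular_domains, max_distance=2):
    """Popular domains that candidate resembles without matching exactly."""
    return [d for d in popular_domains
            if 0 < edit_distance(candidate, d) <= max_distance]
```

A domain that sits one or two edits away from a well-known name is not proof of anything by itself, but it is the kind of similarity the bullet above has in mind.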

Redirecting and cloaking

Definition: When a web client visits a website, certain traits can be used to identify the user and redirect them to a different page. These include, but are not limited to, redirects based on the referral code, the user agent (bot or human), and IP address.

Problem: Redirecting can be a legitimate technique in some cases such as if a web client is limited in what it can display on a mobile device web browser, or when a web server uses the client's IP address to determine the language in which to present the content (aka geo-targeting). However, problems arise when sites filter their content based on whether the user agent belongs to an end user web browser versus a search engine bot. This type of filtering can run the gamut between showing the bot a keyword-stuffed page to an entirely different set of content, all of which is an attempt to deceive. When used with this intent, this is web spam.

What the webmasters who implement these techniques don't understand is that search engines can detect this attempted deception. We do see when the content presented is user-agent based, and when the differences between the content variations are not made in the same spirit as those between mobile and desktop browsers.

What we look for: Some webmasters design their websites to use the following deceptive techniques when the detected user agent is a search engine bot:

  • Script-based redirects. The use of JavaScript or meta tag refreshes to automatically change which page is displayed is often suspicious in nature and will get more scrutiny from Bing. This is because some sites use JavaScript to redirect all visiting user agents to a new page, and that page may contain web spam. However, since search engine bots don't execute JavaScript natively, they won't execute the redirect and thus are supposed to index the contents of the original page (although search engine bots can still detect this behavior).
  • Referral redirects. Some websites consider the referrer when they show a page. When the referrer is a SERP and the target website shows a different page than the one shown when the user directly navigates to the URL, this behavior is considered web spam.
  • Redirect search engine bot to a target page. Some sites detect the user agent specified and send search engine bots to alternate, text-based pages modified with other web spam techniques such as keyword stuffing (but the site provides its normal web content pages to end user web browser user agents). When redirects are filtered on search engine user agents for the purpose of deceiving them, this is a web spam version of cloaking. Bots can detect when they are redirected to special pages. So when this is encountered, it is usually indicative of web spam and will be investigated further.
  • Redirect end users to a target page. Sometimes webmasters use cloaking to work the opposite way from that described immediately above. They may serve highly optimized content pages on Topic A to search engine bot user agents, but when a web browser visits the site, the page displayed covers a completely different subject (typically an illicit one, such as a page promoting porn, casino or online gambling, illicit pharmaceuticals, and the like). The effort here is to rank well for a commonly searched topic of interest in a search engine results page (SERP). Then, supposedly, when searchers find that link in their SERPs, they click the blue link in the SERP and are unwittingly redirected to the web spam page.
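If you have inherited a site and want to check whether a consultant left user-agent cloaking behind, one simple approach is to request the same URL as a browser and as a bot and compare what comes back. The Python sketch below does the comparison with the standard library's difflib. The fetch function is supplied by the caller (it is an assumption of this example, as are the two user agent strings and the function name content_similarity), so the sketch makes no network calls itself and says nothing about how search engines actually detect cloaking.

```python
from difflib import SequenceMatcher

# Illustrative user agent strings; substitute real ones for an actual check.
BROWSER_UA = "Mozilla/5.0 (illustrative desktop browser)"
BOT_UA = "ExampleBot/1.0 (illustrative crawler)"


def content_similarity(fetch, url):
    """Similarity (0.0 to 1.0) of the bodies served to a browser and a bot.

    fetch is a caller-supplied function: fetch(url, user_agent) -> str.
    """
    browser_body = fetch(url, BROWSER_UA)
    bot_body = fetch(url, BOT_UA)
    return SequenceMatcher(None, browser_body, bot_body).ratio()
```

A score near 1.0 means both visitors saw essentially the same page. A very low score is worth investigating, keeping in mind that legitimate differences (such as mobile versus desktop layouts) also lower it.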

The problem for webmasters practicing these techniques is that their technical deceptions are not very effective. Search engines use a number of techniques to uncover such fraudulent practices as redirect and cloaking web spam. When they are revealed, the websites of the perpetrators are penalized, sometimes severely. Well-meaning webmasters or online business owners who hire unscrupulous consultants or carelessly take black hat SEO advice from indiscriminate sources on the Web are setting themselves up for trouble. Reviewing the issues identified in this article, as well as the official webmaster guidelines for Bing, Yahoo, and Google, will go a long way toward keeping a website on the right track for search.

In the next article on web spam, we'll discuss link-level web spam in detail. We'll also include some information on what to do if your site was pegged as web spam and after the problems have been resolved, how to request reinstatement into the Bing index as a normal website. Stay tuned!

If you have any questions, comments, or suggestions, feel free to post them in our Ranking Feedback and Discussion forum. Until next time…

– Rick DeJarnette, Bing Webmaster Center

Webmaster Center FAQ updated

Just a quick note for today.

While we've been busy working on the new series of web spam blog articles, we were also busy working on updating the recently compiled Webmaster Center FAQ (as described in the blog announcement New Webmaster Center FAQ available) with even more content. While some might call it new and improved, I just claim that it's bigger with more information (hmmm — I guess it is new and improved!).

The Webmaster Center FAQ document now contains 82 detailed questions and answers, organized into 12 categories, all navigable through a linked table of contents. As before, it's available as a downloadable PDF through the Microsoft Download Center. Check it out!

And be sure to keep coming back to the Webmaster Center blog on a regular basis. We have a lot of new and exciting content coming your way soon. You won't want to miss it. Heck, you might even want to subscribe to our blog's RSS feed.

Thanks for tuning in. Now back to our regularly scheduled programming.

If you have any questions, comments, or suggestions, feel free to post them in our SEM forum. Later…

– Rick DeJarnette, Bing Webmaster Center

Eggs, bacon, spam, spam, and spam (SEM 101)

What is spam? One could argue that spam is a multi-faceted thing. The word itself has many definitions. For example, it can be defined as a processed food product of spiced ham and pork, slathered with a gelatinous glaze and sold in a tin (it's apparently very popular in Hawai'i, don't you know?). However, spam is also often used to reference a very popular comedy sketch written and performed on Monty Python's Flying Circus (it was punctuated with a waitress reciting a menu in which the canned meat was used multiple times as a featured ingredient in almost every dish, which drives a woman customer to exclaim, "I don't like spam!"). It can also refer to all of the unwanted advertising email that congests the Internet on a daily basis (per one study from 2008, more than 78% of all Internet email traffic on a given day, over 200 billion messages, is unwanted).

It seems that we've established a lexicological pattern that most folks don't like the various incarnations of spam (Hawai'i notwithstanding). While any of those preceding versions of spam may occasionally affect the lives of the webmaster community (especially those who love gelatinous-coated, processed meat products, absurd humor, or inappropriate and unsolicited advertising), the version of spam that concerns webmasters most is web spam.

Web spam

So what is web spam? Web spam is unwanted web content that uses overtly manipulative techniques in an effort to fraudulently attain undeservingly high ranking in search engines. As a webmaster, your work goals include building your brand, instilling customer trust and loyalty, and in the end, getting conversions. But by using web spam techniques, you are ultimately doing yourself more harm than good.

To illustrate why this is a problem, let us stand for a moment in the shoes of a regular Joe (or Jo), who wants to use a search engine to find information on a topic of interest. He or she browses to their search engine of choice (might I suggest Bing? :-) , types a keyword or two to direct the search, and then gets the results in a nicely formatted page with lots of relevant choices (I did suggest Bing, after all!). After reviewing the list of top ranked choices in the search results page, he or she clicks on a link, expecting to find a page related to their topic of interest, containing a lot of useful and interesting content. But what if he or she instead ends up on a page that is, at best, tangentially related to their topic or, more likely, on a page filled with unrelated and unwanted content, such as links for casino gambling, illicit pharmaceuticals, quack physical performance enhancers, counterfeit products, fake education degrees, or other dicey, inappropriate material? No one would be happy to be tricked into seeing this garbage.

Averting the potential for such a bad search experience is why Bing works so hard to eliminate web spam from our index. We want customers to find the best results, provide them with a great experience, and thereby instill confidence in our service for future searches. Web spammers, on the other hand, only want to hoodwink the search engines to get artificially high rankings for undeserved — and often non-relevant — content (and many times attempt to scam the customers they snare in their trap).

Web spam vs. junk

In fact, this gaming of the system is what separates web spam from junk pages. Like junk, web spam content typically provides little to no organic value to searchers. But junk pages are just that – useless content. So what kind of pages are considered junk? The following are just some of the types of pages that you can think of as junk:

  • Custom 404 error message pages that erroneously return HTTP status code 200 OK
  • "The page you are looking for has been moved" message pages that do not use redirects
  • Pages with little or no content
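The first two items in the list above come down to sending the right HTTP status code with the page. As a hedged illustration, here is a tiny Python WSGI application that serves a custom error page with a 404 status and answers a moved page with a 301 redirect. The paths, page bodies, and the names PAGES, MOVED, and app are all made up for the example.

```python
PAGES = {"/": b"<h1>Home</h1>"}
MOVED = {"/old-page": "/"}  # old path -> new path
NOT_FOUND_BODY = b"<h1>Sorry, we could not find that page.</h1>"


def app(environ, start_response):
    """Tiny WSGI app that pairs each kind of page with an honest status code."""
    path = environ.get("PATH_INFO", "/")
    headers = [("Content-Type", "text/html; charset=utf-8")]
    if path in PAGES:
        status, body = "200 OK", PAGES[path]
    elif path in MOVED:
        # A permanent redirect instead of a "this page has moved" message.
        status, body = "301 Moved Permanently", b""
        headers.append(("Location", MOVED[path]))
    else:
        # The custom error page still reports 404, not 200.
        status, body = "404 Not Found", NOT_FOUND_BODY
    headers.append(("Content-Length", str(len(body))))
    start_response(status, headers)
    return [body]
```

For local experimentation, the app can be served with make_server from the standard library's wsgiref.simple_server module. The visitor still sees the friendly error page, while crawlers get a status code that tells them the truth about it.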

As long as those junk pages clearly and legitimately represent what they are to both the public and to search engines, that's fine. Junk is not a problem for search engines! The definition of web spam disregards the quality or type of content a page provides (or doesn't provide, as the case may be). Therefore, the designation of a web page by Bing as web spam hinges on whether there is an effort made to manipulate search engine rankings, and if so, to what degree.

Web spam types & repercussions

We currently classify spam based on two types of signals: page-level and link-level. Page-level web spam is comprised of on-page, deceitful search engine optimization (SEO) techniques employed in an attempt to artificially inflate page rank. Link-level web spam uses fraudulent linking strategies for the same purposes. For a page to be labeled as web spam by Bing, at least one of these techniques must be in use.

Webmasters and SEOs considering the use of such techniques need to understand that search engines, which are busy crawling and indexing the content of the Web every day, are exposed to every sort of exploit imaginable. We see it all. We see the pages that want to be associated with one topic when they really are about another. And since we know how much these deceptive sites frustrate our customers (hey, we use search as much as anyone, so we can sympathize!), we actively work to detect web spam. And once it is detected, we penalize those sites with actions commensurate with the egregiousness of their offenses, ranging from rank neutralization (intentionally lowering their organic page rank) to permanent expulsion from the index.

The story continues…

We will continue this discussion with additional, in-depth posts about the definitions and details of both page-level and link-level web spam. We'll also cover how to request re-evaluation of your site's web spam designation by Bing support staff (in cases of mistaken identification or web page revisions that resolve the problem). Stay tuned!

If you have any questions, comments, or suggestions, feel free to post them in our Ranking Feedback and Discussion forum. See you back here soon…

– Rick DeJarnette, Bing Webmaster Center

New Webmaster Center FAQ available

Everybody loves a list of frequently asked questions (FAQ). They often include tidbits of knowledge that are not easily found elsewhere, or present information in ways that often make it more understandable.

The Webmaster Center team has been working hard to present more useful content through the team blog in recent months. We already have a FAQ of sorts through the Webmaster Center Help (see the Help system's left hand navigation for the Help-based FAQ). We also have some FAQ content in the Webmaster Center forums. We've also published a series of Webmaster Center blog comment Q&A posts (parts one and two so far) to address common questions posted to the blog (rather than the forum!).

Today we have posted an edited compilation of Q&As from all of those sources and more into a downloadable PDF as our official Webmaster Center FAQ for your reading pleasure. It is a long document (would you expect anything else from me?), but it is organized into categories and has a navigable table of contents to help you quickly get to the information that interests you most. We hope you find it useful. We'll be periodically updating, adding to, and revising the FAQ content as new content is developed and new questions are asked. Please be sure to check back from time to time to see what new content has been added. Enjoy!

If you have any questions, comments, or suggestions, feel free to post them in our SEM forum. See you again soon…

– Rick DeJarnette, Bing Webmaster Center

IIS SEO Toolkit 1.0 hits the streets! (SEM 101)

In August, we alerted webmasters to the news from the Microsoft Internet Information Services (IIS) team of the beta 1 release of their excellent new tool, the IIS SEO Toolkit. The public response we received was very enthusiastic and supportive. In response to many requests, we followed that up with a more detailed article explaining how the new tool is installed as an extension of IIS 7.

Since that time, the tool was refreshed in a beta 2 release and, as of November 13, a full version 1.0 release. Given the deserved popularity of the tool among web developers and search engine optimizers, I wanted to take a moment to explore what has changed with the IIS SEO Toolkit since the beta 1 article was published.

Technical requirements unchanged

The technical requirements remain unchanged from the first beta release. The tool was built as an extension of IIS 7, so you need an operating system that can run IIS 7 in order to run the tool. That means you can use any of the following operating systems, all of which are compatible with IIS 7 or higher:

  • Windows Vista Service Pack 1 (and higher — you may need to first upgrade the default version of IIS to 7.0)
  • Windows 7
  • Windows Server 2008
  • Windows Server 2008 R2

Because of the toolkit's core dependency of running as an extension to IIS 7, Windows XP, which cannot run IIS 7, can't be used as a platform for the IIS SEO Toolkit.

However, make no mistake about this important point. The IIS SEO Toolkit is a client-side tool. Once it is installed on your computer workstation, you can use it to perform a detailed analysis on your website regardless of the web server platform it uses, be it IIS, Apache, or even if the "site" is simply a collection of HTML and associated files and folders stored locally on your computer.

Install the toolkit

Admittedly, the beta 1 product was a bit complicated for some to install. We got some complaints about it in Bing. But that's beta software for you! The IIS team developers took your feedback and streamlined the process tremendously for the final release. Now the installation of version 1.0 of the IIS SEO Toolkit is an automated process (click the preceding link to start). You don't even need to uninstall any beta versions! How cool is that?

New and improved features since beta 1

In our first blog article about the IIS SEO Toolkit, we did a decent rundown of the basic features found in beta 1. So what's changed? Let's take a look:

Usability improvements

There are many improvements here. First and foremost, after installing the new version of the IIS SEO Toolkit, you can start the tool through an icon on your Start menu (I found it at Start > IIS 7.0 Extensions > Search Engine Optimization (SEO) Toolkit 1.0). This provides users with much more intuitive, direct access to the tool.

Other usability improvements include a streamlined query builder interface, a new Violations tab on the Details dialog boxes for direct access to affected pages, new context menus for copying content from the tool, new Help shortcuts, improved keyboard navigation, and much more.

Extensibility enhancements

The toolkit has been opened up with a new set of APIs for developers who need to extend the tool's potential. Now you can create custom modules that can extend the crawling process through augmenting existing metadata reports with your own, parsing new content types, and defining new violations based on user needs. The extensibility even enables developers to add new user interface (UI) elements for both the Site Analyzer and the Sitemaps and Sitemap Indexes tools. For more information on how this feature works, check out the IIS developer team's blog article IIS SEO Toolkit – Crawler Module Extensibility.

Metadata stored

By using the aforementioned extensibility of the tool, the HTML parser can now store the contents of the <meta> tags from all pages within the analyzed site and use that data for new queries.

Comparison reports

You can now compare two reports side-by-side to track various changed metrics over time from your website. For more information on this, check out the blog article IIS SEO Toolkit – Report Comparison.

Authentication support

The tool's custom crawler, IISBot, can now crawl secured pages that use either basic or Windows authentication. This will be a significant boon to users who want to use the toolkit on secured intranet sites and on protected staging servers prior to publication.

Canonicalization support

The toolkit supports canonicalization efforts by accepting the use of the rel="canonical" attribute of the <link> tag. It can look for and identify several new canonical error violations in the site's code. Support for sub-domain canonicalization is provided. The toolkit also provides link position information for canonical URLs.
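For readers who want to see what a canonical declaration looks like from a crawler's point of view, here is a minimal sketch using Python's standard html.parser module. It is not how the IIS SEO Toolkit works internally; the class name, function name, and sample page are assumptions made up for illustration.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> element (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated values, so split before testing.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html_text):
    """Return a list of canonical URLs declared in the given HTML text."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonicals


# Hypothetical page used only to exercise the sketch.
SAMPLE_PAGE = """<html><head>
<link rel="canonical" href="http://www.example.com/products/">
</head><body>...</body></html>"""

print(find_canonicals(SAMPLE_PAGE))  # ['http://www.example.com/products/']
```

A page that returns more than one entry from this function declares conflicting canonical URLs, which is the kind of problem worth reviewing by hand.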

Data export options

You can now choose to export a comma-separated-value (CSV) format list of all detected violations, all URLs, or all links from the toolkit.

New reports

New reports include a redirects summary and links depth, which identifies the deepest-linked pages in your site. For more on this, take a look at the blog article IIS SEO Toolkit – New Reports (Redirects and Link Depth).

New Routes query

The routes query is an entirely new type of query, which identifies deeply-linked pages (pages that are buried on your site and require several clicks to finally access from the home page) and more.
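To make the idea of click depth concrete, here is a small sketch that computes the minimum number of clicks from the home page to every reachable page with a breadth-first search. This illustrates the concept only and is not the toolkit's own implementation; the function name and the sample link map are assumptions.

```python
from collections import deque


def click_depths(links, home):
    """Return the minimum number of clicks from home to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest route
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site structure: each page maps to the pages it links to.
SITE_LINKS = {
    "/": ["/products/", "/about/"],
    "/products/": ["/products/trains/"],
    "/products/trains/": ["/products/trains/n-scale/"],
}

depths = click_depths(SITE_LINKS, "/")
deepest = max(depths, key=depths.get)
print(deepest, depths[deepest])  # /products/trains/n-scale/ 3
```

Pages with a high depth value are the ones that visitors and crawlers have to work hardest to reach.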

Local files cache options

Users now have the option to stop the toolkit from keeping analyzed site files in a local cache. Disabling the cache allows reports to run faster and consumes far less disk space. Note, however, that disabling the cache also prevents the Site Analysis tool from presenting the Content tab in the Details dialog box, disables the contextual position of links, and disables the Word Analysis feature.

Better robots.txt file handling and management

The Robots Exclusion tool can directly open your robots.txt file for processing.

Better Sitemap.xml handling and management

The Sitemaps and Sitemap Indexes tool has been improved with better filtering and managing of canonical URLs.

More information

The IIS SEO Toolkit development team blog lists more details of the features from each successive release of the toolkit, from beta 1, through beta 2, up to the new version 1.0. Check out these articles and their team blog for ongoing, detailed posts on how to take full advantage of the tool for your website.

If you have any questions, comments, or suggestions, feel free to post them in our SEM forum. Until next time…

– Rick DeJarnette, Bing Webmaster Center

Webmaster Center blog comments Q&A, Round 2

We still get many questions in our blog comments, even though we try to encourage our readers to post their questions to our Webmaster Center forums (which are actually staffed to answer your questions!). I do look through the blog comments every day and delete those that are junk (those that are empty, duplicated, offensive, and overtly spammy – see our Q&A reply on why blog comments are deleted in the 1st Webmaster Center blog Q&A post). But I don't always get a chance to address the individual questions posted there.

Since so many folks still ask questions in the blog comments, it's just easier to aggregate them and address them en masse in a post. And if reading this Q&A inspires you to ask additional questions, please do so in the forums, where you're more likely to get an intelligent and expedient response!

That all said, let's get to it:

Q: When will the IIS SEO Toolkit come out of beta?

A: That's already done! Version 1.0 of the IIS SEO Toolkit went live on Friday, November 13th. Note that you can install the official 1.0 release without needing to uninstall the beta version first. We'll be covering more about the latest release of the IIS SEO Toolkit in upcoming articles in this blog. Stay tuned!

Q: Is there any way I can run the IIS SEO Toolkit in Windows XP?

A: The IIS SEO Toolkit has a core dependency of running as an extension to Internet Information Services (IIS) version 7 and higher. Unfortunately, IIS 7 itself won't run directly on Windows XP, so as a result, neither will the IIS SEO Toolkit. However, you can run the toolkit in Windows Vista Service Pack 1 (and higher; you may need to first upgrade the default version of IIS to 7.0), Windows 7 (which comes with IIS 7.5 by default), and for those who like to run Windows Server as their desktop OS, Windows Server 2008 or Windows Server 2008 R2.

Q: I use Apache on Linux as my web server platform – can I still use the IIS SEO Toolkit to probe and analyze my site?

A: Absolutely, as long as your computer workstation is running any of the compatible Windows platforms listed in the previous question's answer. You see, the IIS SEO Toolkit is designed to be used as a client-side tool. Once the toolkit is installed on an IIS 7-compatible computer, you can analyze any website, regardless of which web server platform the probed website itself is running.

Q: How do you install IIS SEO Toolkit version 1.0? The beta 1 installation was so tiresome and complicated.

A: The version 1.0 release has improved all that. Give the v1.0 installation a try! And remember: you can install version 1.0 over an earlier beta installation if need be, so it's really easy to do now (as long as your computer can run IIS 7).

Q: Now that MSNBot 1.1 has been replaced with MSNBot 2.0b, will there be any visible differences to the untrained eye?

A: Not at all, unless you use that "untrained" eye to read through your web server's access logs. If you do, you'll note that when you are crawled by Bing, you'll now only see references to the following user agent:

msnbot/2.0b (+http://search.msn.com/msnbot.htm)

The old bot, version 1.1, has left the room.
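If you want to check your own logs for the new bot, a short script can do the looking for you. The sketch below pulls the MSNBot version out of a log line with a regular expression. The log line format, IP address, and timestamp are made-up sample values, and the function name is an assumption. Keep in mind that a user agent string can be spoofed, so a match alone does not prove the visitor was really Bing.

```python
import re

# Matches "msnbot/" followed by a version such as 1.1 or 2.0b.
MSNBOT_PATTERN = re.compile(r"msnbot/(\d+\.\d+[a-z]?)", re.IGNORECASE)


def msnbot_version(log_line):
    """Return the MSNBot version found in a log line, or None if absent."""
    match = MSNBOT_PATTERN.search(log_line)
    return match.group(1) if match else None


# Hypothetical log entry in a common combined-log style.
SAMPLE_LINE = (
    '203.0.113.7 - - [20/Nov/2009:10:15:32 +0000] "GET /robots.txt HTTP/1.1" '
    '200 68 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"'
)

print(msnbot_version(SAMPLE_LINE))  # 2.0b
```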

Q: When will MSNBot 2.0 retire for Googlebot?

A: Ah, a funny! That was a good one. :-)

Q: Bing's Webmaster Center tools say my site has malware on it, but Google's does not. Whom do I trust?

A: No news is not necessarily a validation of cleanliness when it comes to malware detection. The absence of a positive report from a third-party service could actually be a false negative because you don't control the scanner's rules, what content is scanned, and how often the scanning is done.

You always need to be diligent about regularly checking for malware on your website yourself, regardless of whom else might be scanning your site. However, if you do receive warnings about detected malware on your website from a trusted and reliable source, it behooves you to look into the matter a bit deeper, just to be sure. Check out our four-part series of posts in this blog called "The merciless malignancy of malware" for more information on what to do and how to do it. Good luck!

Q: I clicked a link in a Bing search engine results page (SERP) that popped up a message box that said something about the site containing malware. Can I just ignore that?

A: That's not a good idea. Bing will disable the normal "blue link" in its SERPs to a page that was detected to contain malware, substituting instead a malware warning pop-up message when the link is clicked.

The warning message was put there for your protection. Yes, you can opt to click through the warning message and go to the site, but that course of action puts your computer at serious risk of infection.

To understand more about what Bing does with its malware warnings and what to do when you see one (or worse yet, when your users report that they get it for your links in the Bing SERPs!), check out the blog post, The merciless malignancy of malware Part 1.

Q: What is the name of the publisher/sponsor of Bing?

A: Bing is one of the many products and services offered by Microsoft Corporation of Redmond, Washington. Perhaps you've heard of us.

Q: I liked your articles on keywords usage in Bing. How are keywords best used in Google and Yahoo ads?

A: Hmmm. You might want to ask Google and Yahoo about that.

Q: Will paid links or link exchanges really get my site to the top ranking?

A: Nope. You can buy search ads from Microsoft, but neither the presence nor the absence of those pay-per-click advertisements has any influence on where your site ranks in the Bing SERPs. (Although if you bid well in Microsoft adCenter, you will likely earn more business, as your site's link will show up under the Sponsored sites list on the Bing SERPs.)

But let's be clear on this. Paying for or participating in link exchange schemes will not improve your page rank with Bing, and in fact, it could very well hurt it. What will ultimately improve your page rank is creating great content, earning inbound links from relevant, authoritative websites, and performing legitimate SEO on your site. If you build your site with a focus on helping people find great information, you'll be on the right track for earning the highest rank your site deserves.

Q: What secret trick can I do to quickly increase the rank of my site's pages on Bing?

A: Well, it's actually the same trick used for other search engines. Get a rubber chicken, swing it around your head in a circle three times at a 14 degree angle off parallel from the ground (the high point of the circle pointing to the northwest). Follow that up by performing legitimate search engine optimization (SEO) techniques as noted in the answer to the preceding question. Note that the rubber chicken step may not be applicable in all locations. ;-)

Q: Is the information from the Webmaster Center Crawl Issues tool available through the Bing API?

A: Unfortunately, it is not. You'll have to engage the tool directly to access this data.

Q: I found a broken link in your post. Will you fix it?

A: I will if you tell me about it!

Q: What does it mean to "404 a page"?

A: HTTP error 404 is returned when a URL does not lead to a file at the referenced path and file name. This may happen because the page file was moved, renamed, or deleted, or because the URL itself contains an error. To 404 a page means simply to take it down, to make it inaccessible. For more information on 404 File Not Found errors (and how to customize them for customers visiting your site), check out our recent blog article titled, Fixing 404 File Not Found frustrations (SEM 101).

Q: How do I request that Bing remove a bad link to my website?

A: Leave a message on the Webmaster Center’s Crawling/Indexing Discussion forum. As long as you own the site hosting the page with the bad link, we can help by deleting the content from the index, the cache, or both. If you don't own the page with the link, then that becomes a bit more tricky…

If the erroneous inbound link is on a site you don't own or control and the URL at least references the correct domain name, consider creating a custom 404 message that will help the folks find the content they want once they arrive on your site. If you have a page that would be a perfect match for the broken link, you might even consider implementing a 301 redirect if the domain name is correct but the file name referenced in the broken inbound link does not exist.
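The decision described above (serve the page, redirect a known broken inbound link, or fall back to a custom 404) can be sketched in a few lines. This is a simplified illustration, not a web server configuration; the paths, the redirect map, and the function name are all hypothetical.

```python
from http import HTTPStatus

# Hypothetical pages that exist on the site.
EXISTING_PAGES = {"/", "/catalog/"}

# Hypothetical map from broken inbound paths to the pages that best match them.
REDIRECTS = {"/old-catalog.html": "/catalog/"}


def respond(path):
    """Return (status, redirect target or None) for a requested path."""
    if path in EXISTING_PAGES:
        return HTTPStatus.OK, None
    if path in REDIRECTS:
        # A permanent redirect sends visitors and bots to the matching page.
        return HTTPStatus.MOVED_PERMANENTLY, REDIRECTS[path]
    # Everything else gets the 404 status (ideally with a helpful custom page).
    return HTTPStatus.NOT_FOUND, None


print(respond("/old-catalog.html"))
print(respond("/no-such-page.html"))
```

In practice you would express the same rules in your web server's own redirect and custom error page settings rather than in application code.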

Q: I submitted a link to my new website earlier today and it's still not showing up in Bing! Why not?

A: It's impossible to comment with any certainty on a particular case without knowing the specific issues involved, but the solution might be as simple as waiting just a little bit longer! But after a few days have passed, if you still have specific questions about this process, then please feel free to leave a message on the Webmaster Center’s Crawling/Indexing Discussion forum and we'll look into it.

Q: I can't find how to navigate to Webmaster Center anymore from the Bing home page. What happened to the More link?

A: The November 11 update changed the user interface of the Bing home page just a tad, so now you need to click the EXPLORE link on the home page to see the Bing at a glance page. On all other Bing pages, the More link is still there (in the list of menu links across the top right of the page). See our blog post Changes to Bing, UI navigation to Webmaster Center for more details.

That’s it for now! If you have any questions, comments, or suggestions, feel free to post them in our SEM forum. I'll be back soon with another SEM 101 article. Until then…

– Rick DeJarnette, Bing Webmaster Center

Link building for smart webmasters (no dummies here) (SEM 101)

In past SEM 101 articles, we've talked about the importance of inbound links to successful ranking (see "Links: the Good, the Bad, and the Ugly" – Part 1 and Part 2). We've already discussed many issues surrounding them, but we haven't done a dedicated post on how to be successful at link building from a search engine's perspective. Let's finally address that omission here and now.

What is the point of building links?

Your website is your self-representation on the Web. It's a major asset to your business, often simultaneously serving as your online business card, an introductory company brochure, detailed sales literature, supporting documentation, and a point-of-sale distribution channel for your products and/or services. It's also your place to demonstrate your expertise in your specialized field of interest. If your website offers something of worth, valuable to web users interested in that topic, then it behooves you to let the world know about it. Consider the effort your contribution to the betterment of humanity (or at least a chance to make a few conversions!).

Link building is a very important form of self-promotion on the Web. You contact webmasters of other, related websites and let them know your site exists. If the value that you have worked so hard to instill in your site is evident to them, they will assist their own customers by linking back to your site. That, my friend, is the essence of link building.

Think of link building as your chance to build your reputation on the Web. As your website is likely one of your business' most valuable assets, consider link building to be a primary business-building exercise. Just don't make the mistake of believing it will result in instant gratification. Successful link building efforts require a long-term commitment, not an overnight or turnkey solution. You need to continually invest in link building efforts with creativity and time. Good things come to those who wait (and work smartly!).

Bing’s policy on link building

Bing’s position on link building is straightforward – we are less concerned about the link building techniques used than we are about the intentions behind the effort. That said, techniques used are often quite revealing of intent. Allow me to explain.

Bing (as well as other search engines) places an extremely high priority on helping searchers find relevant and useful content through search. This is why we regularly say that search engine optimization (SEO) techniques oriented toward helping users are ultimately more effective than doing SEO specifically for search engine crawlers (aka bots).

The webmasters who create end user value within their websites, based on the needs of people, are the ones who will see their page rank improve. So where does that value come from? Content. Good, original, text-based content.

How do I get valuable inbound links?

Make no mistake: getting legitimate and highly valuable, inbound links is not a couch-potato task. It's hard work. If it were easy to do, everyone would do it and everyone would have the same results – mediocrity. But this is not to say that it is impossibly hard or that successful results are unattainable. Persistence and diligence are extremely important, but so is having something of value, content-wise, to earn those inbound links to your site.

We’ve said it before, and you'll hear it said again: content is king. Providing high-quality content on your pages is the single most important thing you can do to attract inbound links. If your content is unique and useful to people, your site will naturally attract visitors and, as a result, automatically get good links to your site. By focusing on great content, over time, your site will naturally acquire those coveted inbound links.

But are all inbound links created equal? Not at all. Your goal should be to focus on getting inbound links from relevant, high-quality sites that are authorities in your field.

Site relevance

Relevance is important to end users. If you run a site dedicated to model trains, getting an inbound link from an illicit pharmaceutical goods site is orthogonal to the interests of your customers. Unless the outbound linking page from such a site makes a relevant case for linking to you, this type of unrelated link is of minimal value (and if the intention is determined to be manipulative, may even lead to penalties against your site). Why? Because so many sites today are set up solely to serve as link exchanges, where they have no specific theme to their site (other than seemingly random – and usually paid for – outbound links). As these sites do nothing to advance the cause of the web user looking to find useful information, search engines regard them as junk for end users, and thus as junk links to their linked-to sites.

You see, search engines know everything about the sites linking to your site. We crawl them just as we crawl your site. We see the content they possess and the content you possess. If there is a clear disconnect, the value of that inbound link is significantly diminished, if not completely disregarded.

Authority sites

So what links are valuable? That's pretty easy, isn't it? If relevance is important, the most highly regarded, relevant sites are best of all. Sites that possess great content, that have a history in their space, that have earned tons of relevant, inbound links (basically, the sites that are authorities in their field) are considered authoritative sites. And as authorities, the outbound links they choose to make carry that much more value (you don't get to be an authority in your field by randomly linking out to irrelevant, junk sites). Good SEO practices, a steady history, great content, and other, authoritative inbound links beget authority status. The more relevant, authoritative inbound links you earn for your website, the more of an authority your site becomes in the eyes of search engines. These are the natural results of solid content and smart link building.

Going unnatural

So what does it mean to go unnatural? It means you're trying to fake out the search engines by manipulating their ranking algorithms to earn a higher ranking than the quality of your site's content would naturally merit. This chicanery can range from relatively benign but useless efforts to overly aggressive promotion to outright fraud. And because the major search engine bots are continually crawling the entire Web, we see what is being done: the relationships between linked sites, the changes to links over time, which sites link to one another, and so much more. We account for these cunning behaviors in the indexing values applied to those pages.

Examples of potentially conspiratorial hocus-pocus that might be perceived as unnatural and warrant a closer review by search engine staff include but are not limited to:

  • The number of inbound links suddenly increases by orders of magnitude in a short period of time
  • Many inbound links coming from irrelevant blog comments and/or from unrelated sites
  • Using hidden links in your pages
  • Receiving inbound links from paid link farms, link exchanges, or known "bad neighborhoods" on the Web
  • Linking out to known web spam sites

When probable manipulation is detected, a spam rank factor is applied to a site, depending upon the type and severity of the infraction. If the spam rating is high, a site can be penalized with a lowered rank. If the violations are egregious, a site can be temporarily or even permanently purged from the index.

Using the Webmaster Center Backlinks tool

Are you curious to see who is linking to your site and how authoritative Bing considers each site to be? Check out the Bing Webmaster Center tools, specifically the Backlinks tool. (If you haven't yet registered your websites with Webmaster Center, go to About the Bing Webmaster Center tools to learn more.) Once logged in, click the site you wish to review from the Site List page (a webmaster can register multiple sites on one account), then click the Backlinks tool tab. The Page score field associated with each linked page indicates a relative value for that page.

So what can I do to get good, legitimate inbound links?

OK, so you have great content. You built it, and now they will come, right? Well, if you have the patience of Vladimir and Estragon, sure. But sometimes you want to nudge the world a little bit. You want to speed up that process, all the while remaining legitimate in your efforts. You want to actively participate in link building!

I described link building earlier as hard work. But perhaps smart work is a better description. Check out a few of these smart ideas and determine how they apply to your site, your customers, and your industry's niche. Note that all of these ideas are predicated on the assumption that you've already created useful, original, expert content that users will want to read and webmasters of relevant sites will want to link to. That done, let's spread the news! Here's how:

  • Develop your site as a business brand and be consistent about that branding in your content
  • Identify relevant industry experts, product reviewers, bloggers, and media people and let them know about your site and its content
  • Write and publish concise, informative press releases online as developments warrant
  • Publish expert articles to online article directories
  • Participate in relevant blogs and forums and refer back to your site's content when applicable (Note that some blogs and forums add the rel="nofollow" attribute to links created in user-generated content (UGC). While creating links to your content in these locations won't automatically create backlinks for search engines, readers who click through and like what they find may create outbound links to your site, and those are good.)
  • Use social media sites such as Twitter, Facebook, and LinkedIn to connect to industry influencers to establish contacts, some of whom may connect back to you (be sure you have your profiles set up with links back to your website first)
  • Create an online newsletter on your site with e-mail subscription notifications
  • Launch a blog or interactive user forum on your site *
  • Join and participate in relevant industry associations and especially in their online forums
  • Ultimately, strive to become a trusted expert voice for your industry and let people know that your website contains your published wit and wisdom
  • For more link building ideas, check out these additional link building tips.

* Note that if you do go this route, please keep site security issues in mind. You'll want to keep a close eye on all UGC, being watchful for possible code injection and malformed links (filter for it if you can). Consider disallowing UGC from unregistered users. Be sure to keep up with web server and application software updates, use applicable security software, require strong passwords, etc. For more information on securing your web server, check out our recent series of security blog articles, "The Merciless Malignancy of Malware," especially Part 3 and Part 4.

When you do develop these new ideas, be consistent in your work. Nothing can kill momentum like over-commitment early on and under-delivery later on. Assess what you can realistically do on an on-going basis, commit to it as part of building your business, and do it consistently and with high quality. You want to develop both an online library of worthwhile content and a reputation for the regular delivery of new material. That is what draws initial visitors and keeps them coming back.

If you have any questions, comments, or suggestions, feel free to post them in our SEM forum. See you again soon…

– Rick DeJarnette, Bing Webmaster Center

Changes to Bing, UI navigation to Webmaster Center

Did you hear the news? Bing was recently upgraded with another set of exciting end-user features. Instead of reciting the long list of changes here, I'll just point you to the blog post in the Bing Search blog titled, "Bing’s Next Chapter Begins Today." The Bing engineering team has been focused on improving the already cool features that have made using Bing so much fun.

One edit to the Bing home page has changed the way users navigate to the Toolbox page, the Webmaster Center tools, the blog, and the forums pages. On the Bing home page, click the EXPLORE link in the left list to browse to the Bing at a glance page. Scroll down that page to find all of the aforementioned links for the Webmaster Center resources.

Note that once past the Bing home page, the More link, which also links to the Bing at a glance page, is still listed in the top left of all other Bing pages.

See you again soon…

– Rick DeJarnette, Bing Webmaster Center

Robots speaking many languages

We’ve already covered in past blog articles some of the basics about how webmasters can use a file called robots.txt to control how search engine crawlers (aka bots) crawl their websites. But there is so much more to talk about with bots. So let's take a bit of a deeper dive into the subject.

Topic 1: Using the proper text file encoding

The robots.txt file is used by webmasters to define specifically which files and directories compliant search engine bots may or may not crawl. Robots.txt files are basically text files. However, even something as seemingly straightforward as a text file is not as simple as it might seem. Which type of file encoding scheme is used to save the file makes a big difference. For example, when you use the quintessential text file editor, the Notepad utility in Windows, you can save your text files in your choice of the following encoding types: ANSI, Unicode, Unicode big endian, or UTF-8.

If you choose to save your robots.txt file as either Unicode or Unicode big endian, the resulting file will not be compatible with most search engine bots.

Robots.txt file requirements

To ensure that the search engine bots (not just Bing's, but all of them) can read the directives for blocking or allowing content access in your robots.txt file, save the file using one of the following compatible encoding formats:

  • American Standard Code for Information Interchange (ASCII) (a 7-bit, 128 character set)
  • ISO-8859-1 (an 8-bit, 256 character set backward compatible with US ASCII)
  • UTF-8 (a variable-length character encoding version of Unicode that is backwards compatible with US ASCII)
  • Windows-1252 (aka ANSI, as used in Microsoft Windows, it is an 8-bit, 256 character set backward compatible with US ASCII)

Sticking with one of these compatible encoding formats will ensure that the bots you wish to control can read, and thus act upon, your robots.txt file. For more information, check out this article covering the history of character sets from the Microsoft Typography team.
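If you are not sure how an existing robots.txt file was saved, you can inspect its raw bytes. The sketch below flags the two incompatible Notepad choices, which are both 16-bit encodings that start with a byte order mark. It is a rough check under the assumption that the file was saved by a typical text editor, and the function name is hypothetical.

```python
import codecs


def robots_encoding_problem(raw_bytes):
    """Return a description of a likely encoding problem, or None if none is found."""
    # Notepad's "Unicode" and "Unicode big endian" options write a UTF-16 byte order mark.
    if raw_bytes.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "file starts with a UTF-16 byte order mark"
    # ASCII, ISO-8859-1, UTF-8, and Windows-1252 text should not contain NUL bytes.
    if b"\x00" in raw_bytes:
        return "file contains NUL bytes, which suggests a 16-bit encoding"
    return None


SAMPLE_TEXT = "User-agent: *\nDisallow: /private/\n"

print(robots_encoding_problem(SAMPLE_TEXT.encode("utf-8")))   # None
print(robots_encoding_problem(SAMPLE_TEXT.encode("utf-16")))  # file starts with a UTF-16 byte order mark
```

To check a real file, read it in binary mode (for example, open("robots.txt", "rb").read()) and pass the bytes to the function.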

Topic 2: Writing non-ASCII alphabetic characters in robots.txt

The limited number of compatible file encoding formats for robots.txt exposes a potential problem for some users.

The Internet Engineering Task Force (IETF) specifies that Uniform Resource Identifiers (URIs), comprising both Uniform Resource Locators (URLs) and Uniform Resource Names (URNs), must be written using the US-ASCII character set. However, ASCII's 128 characters cover only the English alphabet, numbers, and punctuation marks. Some of the alphabetic characters from other Latin-based languages, such as ñ in Spanish and ç in French, are left out of ASCII. More significantly, most characters in non-Latin-based alphabets, such as pi (π) in Greek, ya (я) in Cyrillic, and entire alphabets from many other world languages, can't be accurately written in the limited, English-oriented ASCII.

This limitation with regard to robots.txt can come into play for webmasters when bots visit web servers using languages whose characters fall outside of the ASCII character set. If a robots.txt file is present on that server and it includes directives to block bot access to files and directories whose names include non-ASCII characters, the bot may not interpret the directive as the webmaster intended.

Percent encoding to the rescue

There is a way to make sure the bots can properly read the file and directory path names, regardless of whether they adhere to ASCII standards. When writing directives that include characters unavailable in ASCII, you can "escape" (aka percent-encode) them, which enables the bot to read them.

Percent-encoded characters, discussed in the IETF's RFC 3986, are used as character substitutes. A percent-encoded character is a sequence of one or more three-character codes, each standing for one octet and consisting of the "%" sign followed by two hexadecimal digits. Percent encoding converts the octets of the character's UTF-8 encoding into a sequence of ASCII-based codes that a URI-compliant bot can read.

To demonstrate what percent-encoded text looks like, type www.%62%69%6e%67.com in your browser's address bar. It will be automatically decoded into www.bing.com. The octet codes %62, %69, %6e, and %67 are decoded by the browser into the letters b, i, n, and g, respectively. Note, though, that the recommended use for percent encoding is really for non-ASCII characters in a URL path, to minimize the potential for decoding translation errors.
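The same decoding can be reproduced outside the browser. Here is a small sketch using Python's standard urllib.parse.unquote; the variable names are just for illustration.

```python
from urllib.parse import unquote

encoded_host = "www.%62%69%6e%67.com"
# Each %XX code is replaced by the character whose byte value is hex XX.
decoded_host = unquote(encoded_host)
print(decoded_host)  # prints: www.bing.com
```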

Real world example

Let’s look at a real-world example. Suppose you were the webmaster for a website that contained the URL http://www.domain.com/папка/ (the folder name in the sample URL is written in Cyrillic and literally means "folder"). To block a bot from accessing that folder on your website using percent encoding in your robots.txt file, you would need to write the directive as follows:

Disallow: /%D0%BF%D0%B0%D0%BF%D0%BA%D0%B0/

If instead you simply wrote

Disallow: /папка/

the bot may not be able to read the directive and thus fail to perform as desired.
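If you would rather not work the codes out by hand, a standard library can do it. This is a minimal sketch using Python's urllib.parse.quote, which encodes text as UTF-8 by default; the variable names are only for illustration, and it is one way of producing the directive rather than the only one.

```python
from urllib.parse import quote

folder_path = "/папка/"
# quote() encodes the text as UTF-8 and escapes every non-ASCII octet,
# leaving "/" untouched by default.
directive = "Disallow: " + quote(folder_path)
print(directive)  # prints: Disallow: /%D0%BF%D0%B0%D0%BF%D0%BA%D0%B0/
```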

Performing percent encoding

So how do you translate your non-ASCII characters into escape-encoded octets? Well, it's a bit of a chore, frankly. If you search for them, there are a few websites and/or tools that offer to perform percent encoding for you, but rather than endorse a site I know nothing about, I'll instead tell you how to manually calculate the conversion. If you want to use an automated tool, go for it. But knowing how the process works will allow you to verify that a tool encoded your characters correctly.

Warning! I'm going to get pretty tech geeky here. If working with hexadecimal and binary numbers is not your thing, I apologize up front!

OK, thus warned, let's get to it. You first need to know the hexadecimal Unicode value (the code point) for each character you want to encode. Code points are usually presented as U+HHHH. The four "H" hex digits are what you need.

As described in IETF RFC 3987 (which builds on the UTF-8 definition in RFC 3629), the escape-encoded characters can be between one and four octets in length. The first octet of the sequence defines how many octets you need to represent the specific UTF-8 character. The higher the hex number, the more octets you need to express it. Remember these rules:

  • Characters with hex values between 0000 and 007F require only one octet. The high-order (left most) bit of the binary octet will always be 0 and the remaining seven bits are used to define the character.
  • Characters with hex values between 0080 and 07FF require two octets. The right most octet (last of the sequence) will always have the first two highest order bits set to 10. The remaining six bit positions of that octet are the first six low-order bits of the hex number's converted binary value (I set the Calculator utility in Windows to Scientific view to do that conversion). The next octet (the first in the sequence, positioned to the left of the last octet) always starts with the first three highest order bits set to 110 (the number of leading 1 bits indicates the number of octets needed to represent the character – in this case, two). The remaining higher bits of the binary-converted hex number will fill in the last five lower order bit positions (add one or more 0 at the high end if there aren't enough remaining bits to complete the 8-bit octet).
  • Characters with hex values between 0800 and FFFF require three octets. Use the same right-to-left octet encoding process as the two-octet character, but start the first (highest) octet with 1110.
  • Characters with hex values higher than FFFF require four octets. Use the same right-to-left octet encoding process as the two-octet character, but start the first (highest) octet with 11110.

Below is a table to help illustrate these concepts. The letter n in the table represents the open bit positions in each octet for encoding the character's binary number.

Hexadecimal value          Octet sequence (in binary)
0000 0000-0000 007F        0nnnnnnn
0000 0080-0000 07FF        110nnnnn 10nnnnnn
0000 0800-0000 FFFF        1110nnnn 10nnnnnn 10nnnnnn
0001 0000-0010 FFFF        11110nnn 10nnnnnn 10nnnnnn 10nnnnnn

Let’s demo this using the first letter of the Cyrillic example given above, п. To manually percent encode this UTF-8 character, do the following:

  1. Look up the character’s hex value. The hex value for the lower case version of this character is 043F.
  2. Use the table above to determine the number of octets needed. 043F requires two.
  3. Convert the hex value to binary. Windows Calculator converted it to 10000111111.
  4. Build the lowest order octet based on the rules stated earlier. We get 10111111.
  5. Build the next, higher order octet. We get 11010000.
  6. This results in a binary octet sequence of 11010000 10111111.
  7. Reconvert each octet in the sequence into hex. We get a converted sequence of D0 BF.
  8. Write each octet with a preceding percent symbol (and no spaces in-between, please!) to finish the encoding: %D0%BF
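The steps above can be sketched in a few lines of code. This is an illustrative implementation of the same bit-shuffling, not an official tool; the function names utf8_octets and percent_encode_char are made up for this example.

```python
def utf8_octets(code_point: int) -> list[int]:
    """Split a Unicode code point into its UTF-8 octets, per the table above."""
    if code_point <= 0x7F:
        return [code_point]                  # one octet: 0nnnnnnn
    if code_point <= 0x7FF:
        lead, trailing = 0b11000000, 1       # 110nnnnn + 1 trailing octet
    elif code_point <= 0xFFFF:
        lead, trailing = 0b11100000, 2       # 1110nnnn + 2 trailing octets
    elif code_point <= 0x10FFFF:
        lead, trailing = 0b11110000, 3       # 11110nnn + 3 trailing octets
    else:
        raise ValueError("not a valid Unicode code point")
    tail = []
    for _ in range(trailing):
        # Lowest six bits go into a 10nnnnnn octet, built right to left.
        tail.append(0b10000000 | (code_point & 0b111111))
        code_point >>= 6
    # Whatever bits remain fill the open positions of the first octet.
    return [lead | code_point] + tail[::-1]


def percent_encode_char(character: str) -> str:
    """Write each octet as % followed by two uppercase hex digits."""
    return "".join(f"%{octet:02X}" for octet in utf8_octets(ord(character)))


print(percent_encode_char("п"))                            # prints: %D0%BF
print("".join(percent_encode_char(c) for c in "папка"))    # prints: %D0%BF%D0%B0%D0%BF%D0%BA%D0%B0
```

Comparing bytes(utf8_octets(ord(ch))) with Python's built-in ch.encode("utf-8") is an easy way to double-check a hand calculation.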

You can confirm your percent encoded path works as expected by typing it into your browser as part of a URL. If it resolves correctly, you're golden.

There is always more to talk about with robots (and so many other webmaster-related topics). If you have any questions, comments, or suggestions, feel free to post them in our SEM forum. Until next time…

– Rick DeJarnette, Bing Webmaster Center

MSNBot 1.1 is retired

The Bing team has been talking about its new crawler (aka bot), MSNBot 2.0b, in this blog for quite some time now. We have made numerous improvements in its performance, addressed some webmaster concerns, and published detailed information on how to control the bot with a robots.txt file. Today we are announcing that the new bot is fully operational. This development will enable Bing to do a better job at gathering the information we need from the myriad of websites we index worldwide.

As MSNBot 2.0b enters full-scale production, the time has come to retire our previous generation bot, MSNBot 1.1. By the end of the first week in November, you will no longer see the following user agent in your server logs:

msnbot/1.1 (+http://search.msn.com/msnbot.htm)

The only Bing user agent you will see in your logs from this point forward will be this, our new bot:

msnbot/2.0b (+http://search.msn.com/msnbot.htm)

This event is a major milestone for the Bing engineering team, and we look forward to the positive developments that this bot will bring to our search engine index and thus to Bing customers. We want to specifically thank all those webmasters who provided us with valuable feedback as we ramped up the production of the new bot. Your assistance and cooperation were essential to making this milestone happen.

Stay tuned for more information about this and other developments from the Bing engineering team. We'll have a lot more to talk about in the coming weeks and months.

If you have any questions, comments, or suggestions, feel free to post them in our Crawling/Indexing Discussion forum. Be back at you soon…

– Rick DeJarnette, Bing Webmaster Center

Fixing 404 File Not Found frustrations (SEM 101)

You’ve seen it. So have I. Nearly every person who has actively browsed the Web for more than 15 minutes has seen it. I'm talking about the dreaded 404 File Not Found error. When it occurs, users simply abandon their search on that site and go elsewhere. That's a potential lost sale, subscription, or download opportunity (aka conversion) for the affected site! It has been estimated that up to 10% of traffic to large websites on the Web is looking for pages that don't exist, so this is a big problem.

Back in the early days of the Web, when you entered an incorrect URL, you got a nearly blank, stark white screen containing nothing but the simple words, "404 File not found." Yeah, thanks. That's really helpful. But back in the day, when people actually wrote webpages in Notepad, that behavior was de rigueur. That was the way things were. Back then. But that was then. This is now. We can do better.

Unusable URLs

There are a great many ways for URLs to be rendered unusable. They can be mistyped or misspelled by the user, the page can be moved, renamed, or deleted by the webmaster, and the URL can be incorrectly written by external webmasters who create the outbound link from their sites to yours, which is the most frustrating situation for the webmaster of the intended destination.

So when a potential customer of yours, interested in learning more about what you do or what it is your site has to offer, uses an erroneous URL for a page within your website today, what do they get? Do they get the Web's equivalent of the blue screen of death, a useless page that stops them dead in their tracks and forces them to move on to a competitor's site? Or do they get a page with helpful guidance that keeps them in your site, offering to assist them with finding the information they seek?

As webmaster, you control how you rename and remove pages from your site (see information on using redirects to point to moved and renamed pages in an earlier SEM 101 article). But you can't control the content of your inbound links, nor can you nudge that user who can't spell or remember your page-naming scheme (yet another good reminder to make the file names of your pages logical and easy to remember). So instead of cursing the darkness of silly users over whom you have no control, light a metaphorical candle by creating a custom 404 error page.

Custom 404 error messages

Creating a custom 404 error page is not that hard to do, but so many webmasters overlook this small but crucial step. Your custom 404 error page will appear in place of a generic 404 error message when a URL to your site is broken. By anticipating what users will likely want to know when they come to your site, you can proactively give them enough information in the custom error message to keep them in your site, and then provide them with easy means to find the information they want.

OK, so this is a great idea, but how do you do it? Well, I've got you covered there. What you do depends on which web server platform you are using to host your site. Let's take a look at what you need to do to establish this safety net for your users. After all, adding this one little feature to your site might make the difference between a bounce and a conversion!

Dynamic or static?

Before we discuss anything else, you first must decide how you want to approach this situation. You can create a single, static custom 404 error page for your entire site. That will be pretty simple to implement and is likely a good fit for smaller sites, but larger sites might feel constrained by a single, static page. There are technologies available that allow you to create dynamic error pages based on script, and that might suit a larger site's needs better, but the caveat exists that if the host scripting engine suffers a failure, then you'll have no custom 404 message at all. Given that this is not a developer blog, I'll just point all webmasters interested in creating dynamic 404 error pages out to Bing searches for information on both Internet Information Services (IIS) and Apache platforms.

If you're interested in learning more about basic, static custom 404 pages, follow on. After all, this is SEM 101, right?

Create the custom page content

You need to decide what information and features you want to include in your custom 404 page. I suggest the following:

  1. Use the page template for your website to maintain consistency with your site's look and feel on this new page.
  2. Include your site's navigation scheme in this page. If you have created an HTML sitemap page or a dedicated site search box, include access to those features as well.
  3. In the page's text, first acknowledge that the URL for the page the user was intending to see does not exist. Then offer a quick description of your site's subject-matter theme and list the products/services/opportunities it offers. Follow that with a suggestion that the reader use the on-page site navigation (menus, sitemap, search box, etc.) to look for the information they are interested in.

Don’t add huge, hard-to-read fonts, auto-play music, videos, or animations, or include anything else that may be off-putting to the first-time reader. Leave off the advertisements as well: they will distract the user from the critical mission at hand, which is getting back onto a real page within your site. Remember, the whole point of a custom 404 error page is to prevent a bounce (aka a single page visit session in which the visitor abandons the site without visiting any other pages). Keep the page clean and easy to read.

Save this custom page file to the root directory of your site (for this discussion, I'll call the new file 404.htm). Now let's cover how to employ it on the various web server platforms.

Apache

Apache users can configure a special text file found in the root directory of their site to implement a custom 404 error page. The file, named .htaccess (a dot precedes the file name, and there is no typical file name extension), can be edited in Notepad to include the following line (using our sample error page file):

ErrorDocument 404 /404.htm

Of course, you must name the custom 404 error page file and store it in the location exactly as specified, or the error page itself will return a 404 File Not Found error! Use Bing search results to find more information on what to do in Apache for custom 404 messages.

IIS

If you are using IIS, the implementation of a custom 404 page is simple. Here's how:

  1. First, open IIS and select the website you want to customize.

In IIS versions 5 and 6:

  1. Right-click the website and select Properties.
  2. Click the Custom Errors tab.
  3. Double-click the listing for status code 404 to edit that setting.
  4. In the Edit Custom Error Properties dialog box, set the Message type drop down list to URL.
  5. In the URL text box, per our example earlier, type /404.htm.
  6. Click OK to save your work.

In IIS 7 and higher:

  1. In the right pane, double-click Error Pages.
  2. Double-click the listing for status code 404 to edit that setting.
  3. In the Response Action group, select the option Insert content from static file into the error response.
  4. Click Set.
  5. In the Root directory path text box, either type the physical path of the root directory (up to the portion of the path that branches into a specific locale directory) of the custom error page file or click Browse and navigate to the error file root directory and then click OK.
  6. In the Relative file path text box, type the relative path of the localized error file, which if our example earlier was for US English only, could be written as EN-US\404.htm.
  7. If you are implementing localized versions of a custom 404 error page, ensure the Try to return the error file in the client language check box is selected, and then click OK twice to save your work.

IIS 7.0 and higher users can alternatively edit their web.config file to include a snippet of code to accomplish the same task. Using the sample file 404.htm referenced earlier, here is an example code snippet:
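The snippet itself did not survive in this copy of the post. As a stand-in, here is a minimal sketch of what such a web.config entry typically looks like, assuming the 404.htm file sits in the site root and is served through its site-relative URL. Treat the attribute values as assumptions to adapt, not as the author's original code.

```xml
<configuration>
  <system.webServer>
    <httpErrors>
      <!-- Drop the inherited 404 entry before adding the custom one -->
      <remove statusCode="404" subStatusCode="-1" />
      <!-- Assumption: 404.htm lives in the site root, as in the example above -->
      <error statusCode="404" path="/404.htm" responseMode="ExecuteURL" />
    </httpErrors>
  </system.webServer>
</configuration>
```

Keep in mind that, with default settings, IIS shows its detailed error page to requests made from the server itself, so it is worth checking the result from another machine.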

Check out Bing search results for more information on creating custom 404 error pages in IIS.

Bing Web Page Error Toolkit

Aside from creating your own home-grown, dynamic custom 404 page solution for IIS, there is another option for IIS users that kicks it up a notch. The Bing Developer Center team has augmented the IIS experience in creating even more useful customized 404 error pages. Straight from the Bing Developer Center blog, check out the post from this past summer, Customize your 404 error pages with the Bing API Web Page Error Toolkit. The post reveals that the toolkit, built on the Bing API, replaces the default IIS 404 error page with a dynamically created Bing search page containing customized search results derived from keywords extracted from either the source Uniform Resource Identifier (URI) or the HTTP request. This creates a search list of relevant, alternative pages on your site, helping the user more easily and quickly find the information they originally wanted without abandoning their visit to your domain.

The Bing Web Page Error Toolkit is available as a free download. If you're using IIS, give it a try. Your users will show their gratitude with more conversions and fewer bounces! Note that you will need Microsoft Visual Studio to use this tool.

Test, test, test

Once you have implemented your custom 404 page, test it out. In a browser, type your domain name followed by a short, random string of characters you know does not match any existing file or directory name. If everything was implemented correctly, you should get the custom 404 page you created in response to that bad URL. From that point forward, your potential customers will also see that page, and if the new error page's content is well done, they will more likely stay on your site. And that's the goal after all, right?
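The same test can be scripted so it is easy to repeat after every site change. The following is a rough sketch using Python's standard urllib; the helper names, the example domain, and the idea of searching the page for a marker phrase are all assumptions made for illustration.

```python
import urllib.error
import urllib.request
import uuid


def random_probe_url(base_url: str) -> str:
    """Build a URL that should not exist by appending a random hex string."""
    return base_url.rstrip("/") + "/" + uuid.uuid4().hex


def is_custom_404(status: int, body: str, marker: str) -> bool:
    """True when the response is a real 404 whose body contains the marker text."""
    return status == 404 and marker in body


def fetch_status_and_body(url: str) -> tuple[int, str]:
    """Return (HTTP status, body text); urllib reports error statuses as HTTPError."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as error:
        return error.code, error.read().decode("utf-8", errors="replace")
```

A typical run would call fetch_status_and_body(random_probe_url("http://www.example.com")) and pass the result to is_custom_404 along with a phrase that appears only on your custom page. Checking the status code as well as the text also catches the case where the custom page is displayed but the server answers with 200 instead of 404.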

If you have any questions, comments, or suggestions, feel free to post them in our SEM forum. Later…

– Rick DeJarnette, Bing Webmaster Center

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

UPDATE: Cool news! The Bing Developer Center team has announced today (11/18/2009) the release of the Web Page Error Toolkit for PHP users. For more information, check out the Dev Center blog post titled, Finding the page not found is now even easier.

Rick

Translator widget: Delivering your site to the world

About eight months back, the Microsoft Research Translator team introduced a new way of delivering your website's pages to visitors who speak a different language with no development effort on your part. Unlike any other translation widget/gadget available at that time, the Translator widget kept your audience on your site, rather than redirecting them to a proxy translation service. Since then, thousands of sites have adopted the Translator widget and have been able to attract a much broader audience from around the world.

Powered by the same machine translation technology that is used by Bing, Internet Explorer, and Office, the Translator widget provides a free option to deliver a "gisting" experience to a non-native audience. While machine translation cannot replace a professional or human localization, it aims to provide a rough understanding (the gist) of the content on the page to those that cannot read the original language. The translation engine is worked on continuously to deliver better quality and more languages. You can learn more about the pioneering work being done by our researchers in this space over at the Translator group's site at Microsoft Research. With the widget, given the on-demand nature of the translations, there is no load on your site and the freshest translations are delivered to the visitor.

Adopting the widget is as simple as copying and pasting a small snippet of JavaScript code into your site. You can customize and generate a snippet for your site at the widget adoption portal. Once this code snippet is pasted into an appropriate area of your page, the Translator widget appears on your site to your users in the language their browser is set to. This localization of the widget user interface ensures that your site's audience always sees "Translate this page" in their language and is thereby able to kick off the translation (as shown in the images below). The Translator team is also planning to add an "automatic" translation functionality, where you can set the widget to auto-translate the page into the visitor's browser language upon arrival.

Once translation has been kicked off, the page is translated using "progressive rendering" – a technique that ensures that the visitor can immediately get the benefit of translation without waiting for the whole page to be translated. As they navigate from page to page on your site, the pages get automatically translated, resulting in a seamless experience for your visitors. A progress bar and several other controls are also displayed to the visitor, floating at the top of the screen. Upon translation, hovering over the translated sentences displays tool tips that show the original source sentence, as shown in the image below. This can be useful in situations where the visitor has some familiarity with the source language.

Another interesting feature of the widget is the ability to share a link to the translated page. A visitor to a site who has translated the page to a particular language can share a link to the translated version of the page. When the recipient clicks on the link, they are taken to the page and translation to their language is kicked off automatically. For example, this page (http://viks.org/2009/06/11/instant-translations-in-bing/) can be auto-translated to Spanish by appending the code #mstto=es to the end of the URL (http://viks.org/2009/06/11/instant-translations-in-bing/#mstto=es).
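If you want to generate such share links yourself, for example for a list of pages, the fragment can be appended programmatically. This is a small sketch with Python's standard urllib.parse; the function name translated_link is made up, and it assumes the page URL carries no other fragment you need to keep.

```python
from urllib.parse import urlsplit, urlunsplit


def translated_link(page_url: str, language_code: str) -> str:
    """Return page_url with the #mstto=<language> fragment described above."""
    parts = urlsplit(page_url)
    # Any existing fragment is replaced (assumption: none needs preserving).
    return urlunsplit(parts._replace(fragment=f"mstto={language_code}"))


print(translated_link("http://viks.org/2009/06/11/instant-translations-in-bing/", "es"))
# prints: http://viks.org/2009/06/11/instant-translations-in-bing/#mstto=es
```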

So, what are you waiting for? Go get the widget and start making the Web more "worldly"!

Stay tuned to the Webmaster Center blog and the Translator team's blog for more information on additions to the widget functionality. You can also participate in the Translator user community.

– Vikram Dendi, Senior Product Manager, Microsoft Research

Webmaster Center blog Q&A

We’ve been really busy here at the Bing Webmaster Center blog team, pumping out new content on a regular basis to create a nice library of content on issues that matter to webmasters and online publishers. I thought I'd take a moment to catch my breath, pause on creating a new thematic article (or yet another multi-part series!) for SEM 101, and address some commonly asked questions in the blog comments.

Q: Why wasn't my question in the blog comments answered?

A: Well, the Webmaster Center blog is really not the right place for such back and forth exchanges. In fact, at the end of each blog post, there is a reminder to post your questions and comments in the SEM forum. The collection of Webmaster Center forums is specifically designed and staffed to address your questions, so if you have a question for the Bing Webmaster Center team, please post it in the forums where you will get a reply. Now if you have a question for other webmasters, you can certainly post that to the blog comments, but even then, you may get better results from posting it in the forums.

Q: Why did my blog comment disappear?

A: As we truly value all the input we receive from the webmaster community, we are reluctant to delete any comments, but on occasion we have to. If a blog post comment is simply blank, includes profanity or obviously objectionable, business-inappropriate content, or is merely an off-topic advertisement for an external website (the basic definition of web spam), we delete those comments. In cases where the same comment is repeated multiple times in the same post by the same sender, we delete the redundancies but leave the original.

We love to get your feedback, and whether you like or hate our content, have a suggestion for clarifying a point, or care to share your own related story, we're really happy when you contribute to the community. Please continue to do so! But if your comment was deleted, there was a compelling reason for doing so. On rare occasions, we get someone who decides to post the same web spam comment across dozens of our posts simultaneously, sometimes even spanning beyond Webmaster and going into other Bing community blogs, such as Maps, Developer, Travel, or the main Search blog. I've seen a couple of instances where someone comment-bombed our blogs in a huge, redundant web spam blast. Those comments are all quickly deleted and those spammers are banned from posting again to the blogs. Seriously, who wants to read that junk?

Q: I added in my site's URL in my blog comment. That's good link building for my site (it's coming from an authoritative site, after all), right?

A: Well, the basic intent of the idea is good. You do want to get as many high-quality, authoritative, inbound links as you can. That is one of the keys to improving the page rank of your site. But in this case, as so many blog commenters do this on a regular basis, the links entered in Bing blog comments are automatically created with the rel="nofollow" attribute included in the anchor tag. This means that when search engines hit the blog page, the link using that attribute will not earn any inbound link credit for the referenced page. So sorry, folks, this one won't count.

Make no mistake, earning high-quality inbound links is hard work. You need to get webmasters from authoritative sites to link to you (this is why link exchanges don't help build page rank value). You usually do that by providing high quality content on your site that those webmasters value. But simply adding a URL to a blog comment is far from hard. And many websites make it a policy to add the rel="nofollow" attribute to all visitor-generated content links because that content can be so hard to police. Who wants to allow a visitor to link out to web spam or malware? And who has time to police the quality of every user-generated link?

You had a good idea with good intentions. But in this case, it's a waste of time to include your site's URL if your goal is to get an authoritative inbound link.
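To see for yourself which links on a page carry the attribute, you can scan the page's HTML. Below is a minimal sketch built on Python's standard html.parser; the class name LinkAudit and the sample markup are invented for illustration, and it only reports what is in the markup, not how any search engine will treat it.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Sort the href of every anchor tag by whether its rel includes nofollow."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # rel may hold several space-separated tokens, in any letter case.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        target = self.nofollow if "nofollow" in rel_tokens else self.followed
        target.append(href)


sample_html = (
    '<a href="http://www.example.com/">editorial link</a> '
    '<a rel="nofollow" href="http://www.example.org/">comment link</a>'
)
audit = LinkAudit()
audit.feed(sample_html)
print(audit.followed)  # prints: ['http://www.example.com/']
print(audit.nofollow)  # prints: ['http://www.example.org/']
```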

Q: Why isn't my site indexed yet?

A: This is exactly the sort of question you should post in the Webmaster Center's Crawling/Indexing Discussion forum. The number of variables here that can affect the answer specific to your question is enormous, including:

  • the quality of your site's content
  • the quantity and authoritative quality of your site's inbound links
  • the ability of the search engine bot to discover and crawl your site's content
  • the validity of the HTML code used
  • the age of your site
  • the freshness of the site's content over time
  • whether or not malware was detected on your site
  • whether or not the content is judged to be web spam or duplicate content copied from other sites
  • whether your site violates the Bing search guidelines
  • and so much more

If you post this question in the Crawling/Indexing Discussion forum, our staff can look up your website's index information and help determine what can be done to improve your situation. Take advantage of their expertise and resources for this and other similar questions!

Q: Do the search engine optimization (SEO) recommendations you give for Bing affect my SEO performance with other search engines?

A: Yes. They improve it! And of course, if you are actively performing legitimate, white-hat SEO activities for other search engines, it'll help you with Bing as well. The basic takeaway here is that SEO is still SEO, and Bing doesn't change that. If you perform solid, reputable SEO on your website, which entails a good deal of hard work, creating unique and valuable content, earning authoritative inbound links, and the like (see our library of SEM 101 content in this blog for details), you'll see benefits in all top-tier search engines.

But remember: SEO efforts ultimately only optimize the rank position that your site's design, linking, and content deserve. They remove the technical obstacles that can keep your site from getting the best rank it should. However, they won't get you anything more than that. The most important part of SEO is doing the hard work of building the value necessary to make your site stand out from the competing crowd of other websites for searchers.

One other thing to remember: there is a long tail in search. After the few obvious keywords used in a particular field, there are many, many more keywords used to a lesser degree that still drive a lot of traffic to various niches of that field. Instead of always trying to be number 1 in a highly competitive field for the obvious keywords and faltering, consider finding a less competitive keyword niche in that same field and then doing the hard work necessary to earn solid ranking there.

For more information on SEO and Bing, see our recent blog post, Search Engine Optimization for Bing.

Q: How do I submit a Sitemap to Bing?

A: There are a couple of ways to do this. If you have registered your site with Bing Webmaster Center tools, log into the site, select the site to use from the Site List page (webmasters can register multiple sites for one account), and then click the Sitemaps tab. From there, you can perform a direct Sitemap submission by typing the web address of your Sitemap file (such as www.example.com/sitemap.xml — be sure to omit the "HTTP://" protocol designation as it's not needed here). If you have not yet registered your site with Webmaster Center (why not?) and you just want to submit your Sitemap file through your web browser using our Sitemap ping service, use the following URL:

http://www.bing.com/webmaster/ping.aspx?sitemap= add your Sitemap web address here

Again, leave off the "HTTP://" as it isn't needed. For more information on using Sitemaps with Bing, see our Webmaster Center blog article, Uncovering web-based treasure with Sitemaps (SEM 101).
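For anyone who wants to script the ping, building the URL is straightforward. The sketch below only constructs the address and does not send anything; the function name sitemap_ping_url is hypothetical, the endpoint is simply the one quoted above (it may no longer be live), and percent-encoding the Sitemap address is our own precaution, not something the post asks for.

```python
from urllib.parse import quote

# Endpoint as quoted in the post; it may no longer be live.
PING_ENDPOINT = "http://www.bing.com/webmaster/ping.aspx?sitemap="


def sitemap_ping_url(sitemap_address: str) -> str:
    """Build the ping URL, dropping any leading http:// or https://."""
    address = sitemap_address.strip()
    for scheme in ("http://", "https://"):
        if address.lower().startswith(scheme):
            address = address[len(scheme):]
            break
    # Assumption: percent-encode anything unusual in the address; plain
    # addresses such as www.example.com/sitemap.xml pass through unchanged.
    return PING_ENDPOINT + quote(address, safe="/")


print(sitemap_ping_url("HTTP://www.example.com/sitemap.xml"))
# prints: http://www.bing.com/webmaster/ping.aspx?sitemap=www.example.com/sitemap.xml
```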

Q: Does Bing support sitemap index files?

A: Definitely. Bing supports Sitemaps with up to 50,000 entries, be they site URLs or, in the case of Sitemap index files, references to child Sitemap files. With a Sitemap index file containing 50,000 references to child Sitemaps, each of which contains 50,000 site URLs, your Sitemap strategy can reference up to 2.5 billion URLs. Let us know if you need more. :-)

For more information on Sitemap index file support, see the Webmaster Center blog post, Bing enhances support for large Sitemaps.
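The arithmetic behind that figure, and a quick way to work out how many child Sitemap files a given site would need, can be sketched as follows. The helper name child_sitemaps_needed is made up, and the 50,000 limit is simply the one quoted in the answer above.

```python
import math

MAX_ENTRIES_PER_FILE = 50_000  # limit quoted in the answer above


def child_sitemaps_needed(url_count: int) -> int:
    """Number of child Sitemap files needed to list url_count URLs."""
    files = math.ceil(url_count / MAX_ENTRIES_PER_FILE)
    if files > MAX_ENTRIES_PER_FILE:
        raise ValueError("more URLs than a single Sitemap index file can reference")
    return files


# One index file referencing 50,000 children of 50,000 URLs each:
max_urls = MAX_ENTRIES_PER_FILE * MAX_ENTRIES_PER_FILE
print(max_urls)                        # prints: 2500000000
print(child_sitemaps_needed(120_000))  # prints: 3
```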

Q: I can't find you anymore. Where did your blog recently move to? How do I get to it now?

A: The blog didn't actually move (well, not since June, when Bing was introduced). We're still at http://www.bing.com/community/blogs/webmaster/. But the extras menu in the Bing user interface was recently removed and all references to Webmaster Center, the Bing Community, and the Bing Webmaster Center blogs and forums, were migrated under the More link, found on both the Bing home page and the top left menu on other Bing pages. Some folks might have missed that. Please be sure to save the Webmaster Center blog to your browser favorites or, better yet, subscribe to our blog's RSS feed.

Q: Why are your blog columns so long?

A: No reason.

Q: Thank you!

A: You're welcome! We get this comment most of all, and I wanted to make sure we acknowledged our appreciation for your kind words and support.

As always, if you have any questions, comments, or suggestions, feel free to post them in our SEM forum (or any of the other Webmaster Center forums as appropriate). Until next time…

– Rick DeJarnette, Bing Webmaster Center

The merciless malignancy of malware Part 4 (SEM 101)

OK, so I totally geeked out with my recommendations on how to better secure your webmaster computing environment. As a result, I had too much material for one post and thus had to split it up into two pieces. Let’s wrap up this long series of posts on malware by finishing up with the last of the security recommendations.

In Part 1 of this series on malware, we discussed how to detect a malware infection on your website using tools like Bing’s Webmaster Center. The Part 2 post covered the resources and strategies for identifying the types and locations of malware code that typically affect websites with advice on how to remove it. The Part 3 post began the run-down through 10 recommendations (well, the first 5, anyway!) on how to better secure your workstation and web server computers to prevent the malware from coming back. Today’s post, Part 4, finishes the list, and then includes information on what steps you can take to get that pesky malware warning message removed from your recently cleaned site in the Bing index.

Recommendations continued

Getting rid of malware is only part of the battle. Hardening your security practices to keep it away is just as important. Let’s continue the list of recommended security strategies started in the previous post.

6. Run Microsoft Update

I am presuming with this recommendation that you are running a modern Microsoft Windows operating system. Regularly run Microsoft Update on every Windows-based computer you use to touch your website. When you do so, I recommend that you click Custom to see the total list of available updates for your computer rather than seeing only the High Priority updates. Always keep current with the latest High Priority updates and strongly consider applying other updates as well.

Note that the second Tuesday of every month is commonly referred to as “Patch Tuesday” for Microsoft Update, and time should be set aside on those dates to make sure all Windows-based systems in your web server infrastructure get the necessary security updates. Occasionally Microsoft, when necessary, also provides high-priority security updates ahead of this schedule, so it pays to stay on top of these releases as they occur. Signing up to receive Microsoft Technical Security Notifications can help!

7. Update non-Microsoft applications, too

Applications that touch the Internet are at least as vulnerable to security holes as are web browsers and operating systems. Some major software manufacturers are beginning to build into their applications an online update system analogous to Microsoft Update. But not all have this feature yet, and not all that do perform the update automatically. It’s a really good idea to scan for and plug the often nasty security holes in the applications on your workstation through a software updating tool. I like the Secunia Software Inspector tool (check the licensing requirements for commercial use, but it’s free for many users), but there are many other choices out there. Be sure that the web applications you use are checked in that process. The bottom line is you need to regularly check for and install any software updates on all of the computers associated with your website.

Keep in mind that software manufacturers regularly release updates for their products when they discover faulty features and security holes. The hacker community makes a point of studying those patches to learn what exploits the updates fix. If you don’t stay current with software updates, your computer may become vulnerable to reverse-engineered exploits.

8. Improve your wireless security

Many computers these days, especially laptops, are connected to the Internet only by wireless connections. If you work in a big organization with a security-conscious IT shop, you’re probably fine (while you’re at work, anyway). But many small shops and even more home users install their new Wi-Fi routers using default settings across the board. Hackers have developed such efficient wireless security cracking tools over the past decade that paranoia is no longer considered irrational or delusional behavior among IT security folks. (But if tin foil hats come out, all bets are off.)

There are several things you can do to improve the security of your wireless network router. Dig up the user’s manual for that old router and learn how to do all of the following:

  • Update the device firmware. Go to the router manufacturer’s website and browse to the Downloads page for your model (typically within the site’s Support section) to see if you have the latest firmware release. If not, download it and install it. You may get new functional features and/or have known security holes resolved. Either way, look into it. The router’s manufacturer put it out there for a reason!
  • Change the administrator password from the manufacturer’s default (using the tips in Create strong passwords). Hackers typically know the default administrator password for various routers. Leaving yours with the default is honestly no better than disabling the admin password altogether.
  • Change the network’s Service Set Identifier (SSID) friendly name from its default to a name of your own choosing. Once done, disable the SSID broadcast so that the wireless network is hidden. It’s harder to crack a wireless network if you don’t see it, especially when you don’t know its name!
  • Enable media access control (MAC) address filtering so that only computers and devices whose MAC addresses you specify can access the network. All others are denied access.
  • Exclusively use Wi-Fi Protected Access version 2 (WPA2) security with Advanced Encryption Standard (AES) encryption for the most secure connections. Forget relying upon Wired Equivalent Privacy (WEP) or WPA using Temporal Key Integrity Protocol (TKIP) encryption for security. Modern wireless security cracking tools can break these encryption schemes in minutes, even with the longest keys.
  • Enable advanced routing features such as SPI (as discussed in tip #3 in the previous post). If your wireless router doesn’t support SPI, it’s probably using old technology and it may be time to shop for a new, more secure Wi-Fi router.

Note that none of these changes by themselves will sufficiently upgrade your wireless security, but the aggregate value of implementing them all will make your wireless network much more difficult to crack. And unless you are dealing with extremely determined hackers with an abundance of both technical resources and time to focus on cracking your specific, secured network, they will almost always move on to another of the ubiquitous, softer targets in the wifisphere.

9. Protect your website’s configuration files

Ensure that the sensitive configuration files of your web server and your web applications aren't accessible to unauthorized, external users. Place them in directories that are not served to the public and then disable directory browsing on your web server. Refer to your web server documentation for specific instructions on how to do this. I also recommend researching additional methods of securing your web server, such as IIS or Apache, from attack.
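As a rough self-check, the sketch below walks a web root and lists files that look like they don't belong in a publicly served directory. The file names and suffixes are my own assumptions, not an authoritative list, and `find_exposed_files` is a hypothetical helper; tune the patterns to the applications you actually run.

```python
import os
import tempfile

# Assumed patterns for illustration only; adjust to your own applications.
SENSITIVE_NAMES = {".htpasswd", ".env", "settings.ini"}
SENSITIVE_SUFFIXES = (".bak", ".sql", ".old")


def find_exposed_files(web_root):
    """Return sorted relative paths under web_root that look sensitive."""
    hits = []
    for folder, _dirs, files in os.walk(web_root):
        for name in files:
            lowered = name.lower()
            if lowered in SENSITIVE_NAMES or lowered.endswith(SENSITIVE_SUFFIXES):
                hits.append(os.path.relpath(os.path.join(folder, name), web_root))
    return sorted(hits)


# Demonstration against a throwaway directory standing in for a web root.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "admin"))
    for rel in ("index.html", ".env", os.path.join("admin", "site.sql")):
        with open(os.path.join(root, rel), "w") as handle:
            handle.write("placeholder")
    exposed = find_exposed_files(root)
```

Anything the scan reports is a candidate for moving outside the served directory tree. A clean report doesn't prove the server is configured safely, so still check your web server documentation.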

10. Perform data validation on user input

If your website accepts user input, ensure it is validated before processing or displaying it back to the user. For example, if you have a login form that accepts user names and passwords that are checked against a database, ensure that the input is scrubbed of any unexpected or invalid characters that might allow malicious manipulation of the database. Also, if user input is accepted and displayed (such as on forums), ensure users aren't able to modify the source code of the webpage, such as by injecting script or HTML code.

Also be sure that input from backend systems is validated. This protects the users of your website, even if attackers "only" managed to break into a backend system, like your database. For more information on similar, related website attacks, look into the topic of cross-site scripting (XSS).
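Here is a minimal sketch of both ideas using only the Python standard library: an allow-list check plus a parameterized query for the database lookup, and HTML escaping before user text is displayed. The `users` table, the user name policy, and the function names are assumptions made up for the example; your own stack will have its own equivalents.

```python
import html
import re
import sqlite3

# Assumed policy: letters, digits, underscore, dot, hyphen; 1 to 32 characters.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_.-]{1,32}")


def is_valid_username(name):
    """Accept only names made entirely of expected characters."""
    return USERNAME_PATTERN.fullmatch(name) is not None


def find_user(conn, name):
    """Look up a user, rejecting unexpected input before it reaches the database."""
    if not is_valid_username(name):
        return None
    # The ? placeholder keeps the input out of the SQL text itself.
    row = conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchone()
    return row[0] if row else None


def render_comment(text):
    """Escape user text so it is displayed as text, not run as markup."""
    return "<p>" + html.escape(text) + "</p>"


# Hypothetical in-memory database standing in for a real user store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")
```

With this in place, a lookup for `alice` succeeds, an injection attempt such as `alice' OR '1'='1` is rejected before any query runs, and a comment containing a script tag is rendered as harmless text.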

Bonus tip: Back up your clean web content

Once you’ve ensured your site’s content and source code is clean, back it up! Disaster recovery is not just about fires, floods, and earthquakes. A sudden, major malware infection ranks right up there in terms of potential business outages, so protect your work, your site, your business, and your customers who depend on you with proper, functional backups of clean code.
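If you don't already have a backup routine, a sketch like the following may be enough to get started. It archives a directory of clean site content under a timestamped name and records a SHA-256 checksum so you can later confirm the archive hasn't been altered. The `backup_site` helper and the archive naming are my own assumptions, and the backup directory should live outside the site directory (ideally on another machine).

```python
import hashlib
import os
import shutil
import tempfile
import time


def backup_site(site_dir, backup_dir):
    """Archive site_dir into backup_dir; return (archive_path, sha256_hex)."""
    os.makedirs(backup_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    base = os.path.join(backup_dir, f"site-clean-{stamp}")
    archive_path = shutil.make_archive(base, "gztar", root_dir=site_dir)
    digest = hashlib.sha256()
    with open(archive_path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return archive_path, digest.hexdigest()


# Demonstration with throwaway directories standing in for the site and the backup store.
with tempfile.TemporaryDirectory() as site, tempfile.TemporaryDirectory() as vault:
    with open(os.path.join(site, "index.html"), "w") as handle:
        handle.write("<html><body>clean</body></html>")
    archive, checksum = backup_site(site, vault)
    archive_exists = os.path.exists(archive)
```

Store the checksum separately from the archive, and test a restore now and then; a backup you have never restored is only a hope.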

Even more information on securing servers

To be ultra secure, you might simply consider flattening and rebuilding the server from scratch. But don’t simply rebuild it to the way it was – remember, it was hacked in that state! Put in place all of the hardening steps mentioned earlier, as well as triple-checking all of your permissions settings, before putting the server back in service online. For more information on dealing with hacked servers, check out What to Do If Your Website Has Been Hacked by Phishers.

Request removal of the Bing malware warning

Once you’ve resolved your malware infection, closed the security vulnerabilities that allowed your computer to be successfully attacked, and uploaded your cleaned-up source code to your web server, you’ve got one more job to do. It’s time to request that Bing re-evaluate your website for malware. Here’s how:

  1. Open the Bing support form.
  2. In the resulting Windows Bing Support web form, type your full name and email address in the text boxes provided.
  3. In the Service: Bing drop-down list, select My Site has a malware warning.
  4. In the new drop-down list that appears below, select the option that best matches your specific situation (in this case, that’ll be The malware has been removed).
  5. Complete the remainder of the form, adding as much detail as possible in the comments text box to help the support team resolve your request. Once completed, type the characters shown in the security image, and then click Submit.

Once you submit this request, Bing will rescan your website to check that the malware has been removed. If confirmed, your content can then be reincluded in normal search results. Once done, keep monitoring your site’s malware status in the Crawl Issues tool of Bing’s Webmaster Center, just to be sure you stay on top of any new issues.

If you have any questions or comments about malware, please feel free to post them in our General Questions forum. For regular SEM and SEO questions and suggestions, please go to our SEM forum. See you again soon…

– Rick DeJarnette, Bing Webmaster Center

[Link]

The merciless malignancy of malware Part 3 (SEM 101)

We’re going to diverge a bit from our regularly scheduled programming. Normally this column discusses search engine optimization (SEO) and related elements of search engine marketing (SEM), but we’re knee deep into our multi-part series on malware and we’re going to begin the wrap-up with a talk about improving computer security. However, I geeked out a bit here, and the column went a bit long (yeah, even longer than usual!), so I decided to break this last section up into two pieces. Who wants to read a white paper as a blog post? I mean, besides me? :-)

While beefing up your computer security practices won’t necessarily have a direct effect on your site’s SEO performance, consider the repercussions of not doing so. Presenting a malware-infected website to your customers is a great way to ruin the integrity and conversion potential of your online business. Top tier search engines like Bing will either block a malware-infected page from showing up in their search engine results pages (SERPs) or will redirect the affected page’s link to a malware warning message. Bing presents a malware warning message when searchers click its SERP link for a malware-infected page.

Since the vast majority of searchers will never opt to click through to override a malware warning from a SERP, assuming the link to the affected page is even shown in the first place, failure to quickly address detected malware infections is a great way to kill off pretty much all of your search referral traffic. And those customers who navigate directly to your site will not likely come back once they’ve determined your site was the source of their newly acquired malware infection.

In Part 1 of this series on malware, we discussed how to detect a malware infection on your website using tools like Bing’s Webmaster Center. The Part 2 post was a long discussion on the resources and strategies for identifying the types and locations of malware code that typically affect websites, and included high-level information on removing it from your site. Today’s post, Part 3, and the next one, Part 4, present altogether 10 solid recommendations on how to better secure your workstation and web server computers so that the infections don’t come back. After all, what good is it to invest time in shooing away a kitchen full of house flies when you haven’t bothered to close the screen door?

Recommended security strategies

Once malware is removed, steps need to be taken to secure your website to prevent malware from reappearing in the future. Securing all of the computers involved with creating, managing, and serving your website is the key to success. If you were infected with malware, that means your computer infrastructure has one or more security vulnerabilities that need to be addressed. The following preventive measures are key tasks that either you or your hosting provider (likely a combination of both) need to take.

1. Install and use an antivirus tool

If you have not done so yet, install and run a fully capable antivirus software tool on the computer workstation you use to develop and upload your website content. If your web server is not otherwise protected, install an appropriate antivirus solution on it as well. A high-quality antivirus product will support scanning embedded scripts and other locally saved webpage controls used in your website’s source code for any known malware, so don’t skimp on quality and features here.

Once you have an antivirus solution installed, be sure to regularly update both the tool’s program code and its malware signature files used for detection. Most modern antivirus tools have update features built-in, but make sure the update feature is working as expected before setting it and forgetting it. If you need some convincing as to why keeping your antivirus solution updated is important, I can only refer you to the Microsoft Security Intelligence Report (to which Bing is a key contributor). And lastly, remember to use your antivirus tool! You need to regularly scan your Internet-connected computers for malware to ensure they remain clean.

Microsoft offers a free, web-based, anti-malware scanner called Windows Live OneCare safety scanner. It works on computers running Windows XP, Windows Vista, and Windows 7. It checks for and removes viruses, spyware, and other likely unwanted software, as well as detects vulnerabilities in your Internet connection. Heck, it can even be used to clean up your hard drive and tune up your computer’s performance!

Microsoft has also just released its Microsoft Security Essentials program, a new, no-cost, anti-malware solution that runs in the background of your computer and protects it in real-time against viruses, spyware, and other malicious software. Check it out.

2. Install and use an anti-spyware tool

If your antivirus solution doesn’t specifically include it (and many do these days), you should also install a good anti-spyware scanning and protection tool on your workstation (since you likely don’t surf the Web directly from your web server, this protection is likely not needed there). As with the antivirus tool, keep this tool updated and use it regularly to scan your computer for problems. The last thing you want to do is introduce malware into your web server environment from a compromised workstation!

Microsoft also offers a free antispyware tool called Windows Defender. It actively protects your computer in real-time against pop-ups, performance problems, and security threats by detecting and removing spyware and other unwanted software.

3. Use a firewall

At a minimum, you should use a software firewall utility to protect your workstation and server from external hackers. A software firewall blocks unauthorized and inappropriate network traffic to your computer. Hackers use such traffic to take control of, and thus install malware on, your system. Many software firewall options exist, both for Windows users and users of other platforms. On your server, use the firewall to block all inbound traffic except for normal web server request traffic and a secure access method for your webmaster site uploads from predefined computers.

To improve security further, consider installing a separate hardware firewall device between your computers and the Internet that offers, at a minimum, stateful packet inspection (SPI). Firewall devices use SPI to track the state of the network connections passing through them. Rogue or malformed TCP/IP network packets, sometimes implemented by hackers to get through weaker firewall solutions, are rejected by SPI-enabled firewalls. Application-level filter firewalls are better yet, as they work at the application layer of the network protocol stack, where they can more safely examine which network protocol is used on which port and determine whether its use is appropriate.

4. Use a secure protocol to access your web server

Standard FTP doesn’t encrypt the data as it’s transmitted, so if your computer or its network has been compromised by a hacker using network sniffer technologies, your web server’s logon credentials are at risk of being stolen. As alluded to in the section on firewalls, using Secure FTP or Secure Shell (SSH) eliminates this potential vulnerability. Make sure you do this end-to-end, from the site developer to the webmaster and from the webmaster to the server.

5. Change and strengthen your passwords

Your computer security is usually only as good as the freshness and strength of the passwords you use to access your computer. If your passwords haven’t been changed since the days ‘N Sync was still hot, it’s time to say "Bye Bye Bye" to that. You need to implement a regimen of regularly changing your passwords. And when you do, please make them harder to guess than “password” or something else hyper-obvious. Check out the article, Create strong passwords, for helpful tips on doing this.

Yeah, you don’t need to tell me that this is inconvenient. But if you choose to skip doing this, while you might be happier temporarily, hackers will be thrilled. Static, simple passwords are easy to crack, and once hackers figure out your logon credentials, they can do anything they want to your site, including locking you out! Imagine having a hacked site and you can’t even log in to fix the problem!
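If inventing a fresh password on every rotation is the inconvenient part, you can let the computer do it. This is a minimal sketch using Python's `secrets` module, which is designed for security-sensitive randomness. The 12-character minimum and the "one of each character class" rule are my assumptions for the example, not rules taken from the Create strong passwords article.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length=16):
    """Return a random password containing all four character classes."""
    if length < 12:  # assumed minimum for this sketch
        raise ValueError("use at least 12 characters")
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if (any(c.islower() for c in candidate)
                and any(c.isupper() for c in candidate)
                and any(c.isdigit() for c in candidate)
                and any(c in string.punctuation for c in candidate)):
            return candidate
```

Calling `generate_password(20)` returns a different 20-character string each time. Pair it with a password manager so that "hard to guess" doesn't turn into "written on a sticky note".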

More recommendations to come

We’ll continue with another five recommendations for securing your webmaster computing environment in our next post. If you have any questions or comments about malware, please feel free to post them in our General Questions forum. For regular SEM and SEO questions and suggestions, please go to our SEM forum. I’ll be back…

– Rick DeJarnette, Bing Webmaster Center

[Link]

The merciless malignancy of malware Part 2 (SEM 101)

Malware infections are no laughing matter. When they afflict your website, they can infect your customers, who won’t appreciate your sharing, intentional or not (and I’m guessing it’s not)! And if Bing discovers malware on your site, your listing in the Bing search engine results pages (SERPs) will either be completely omitted or the link to your site will be disabled, so when the searcher clicks on it, only a malware warning appears. All told, this is bad news for conversions, don’t you think?

This article is Part 2 of a three-part series on malware. Part 1 covered how to detect the presence of malware on your site by using the Bing Webmaster Center tools to get access to the information the bot sees when it crawls your site’s pages and the external links they contain. In this post, we’ll cover the available resources and strategies to do a malware clean-up job. It’s usually a big job, so let’s get right to it.

Cleaning up the mess

Bing’s detection of malware on your site usually indicates that your site was hacked. Comprehensive information on how to clean up each specific malware infection could fill an entire book (and this post is quite long as is). Instead of deep dives into specifics, let’s talk about strategies and resources for combating this problem.

Sources of malware code

There are three primary ways your website might be serving malware:

  • External source. If hackers exploit an existing security vulnerability to gain access to your source code, they often edit your HTML or script files to make calls to externally based, malicious content on servers they control. Worse yet, hackers don’t even need access to your web server or source files to inflict this attack. If you include externally based content on your pages, and if the hackers can successfully attack the source at its external site, your pages then will unintentionally serve their malware to your customers.
  • Local source. Sometimes hackers, once they’ve gained full access to vulnerable web servers, put malware code directly in your webpage files and/or place malicious content in the directory structure of your website. Your page’s HTML source code may still appear to be clean, but in this case, the poisoned images, documents, or other binary files they call locally can be the source of the malware attack.
  • Man-in-the-middle attack. Although this is a less common form of attack due to its technical sophistication, hackers can, when server and network security is severely compromised, inject malware into your webpage content over the network as it travels from your web server to the end user.

Your webpage will be considered malicious if you serve malware from any source, be it from an external server, directly from your web server, or by man-in-the-middle attacks. A user browsing to your webpage from the Bing SERPs will not be able to distinguish your clean content from the malicious content inserted there by hackers. It’s all presented as content in your webpages, so you are ultimately responsible for protecting your customers.

Attack indicators

The malicious code changes that hackers will likely employ come in one or more of these five forms. If any of these elements in your code appear to be suspicious, unexpectedly modified, or unfamiliar to you as webmaster, investigate them further.

  • Script code. Webmasters should check for new JavaScript code inserted into their pages. This code will be contained within <script> tags. It is common for inserted, malicious script to be written in an encrypted hash function, accompanied by the decryption key hash value, which allows the script to execute but prevents the webmaster from being able to read and interpret the function of the code. This is known as obfuscated JavaScript code. Such code will look like many wrapped lines of continuous, random alphanumeric characters within a <script> tag. If your pages do not normally use encrypted scripts, this form of script code will definitely stand out as different. This malicious code usually runs when the page is loaded, and it typically appends an exploit or a poisoned, hidden control to the page as it is loaded.
  • <iframe> code. An <iframe> is simply an HTML tag that enables an unrelated HTML document to be loaded within another HTML document. The <iframe> tag enables hackers to inject poisoned HTML and script code into another webmaster’s webpage. Injected script code often employs <iframe> tags as the means of creating hidden windows that enable malware exploits to execute without the user’s knowledge.
  • Page redirect. If a hacker gets access to edit your home page, they can add code that will automatically and immediately redirect a web browser to another web page (usually to a similarly named, identical-looking page on an external server, but possibly to one created on your server) that runs malware as the page loads. This can be done by means of <meta> refresh, JavaScript, or even 301/302 redirects. Unless you find the code for the redirect when you examine your content, you typically won’t see any malware on your page because it’s not executed there. Be sure to also visually inspect your web server configuration for unauthorized redirects.
  • Externally sourced content. Note that while the use of small, externally-based controls, like hit counters, can be legitimately secure when you first install them, if the webmaster of that control’s host server is not security conscious, those once-benign controls can themselves become malicious vectors later on. Also, small advertising hosts can outsource their contracts to other advertising hosts, who might sub-contract that work out again several times down the line, all done in order to sell more advertising. But the farther you get away from the original trusted external host, the more vulnerable your link becomes to that original, external ad host. Only use external (third party) content from highly trusted sources whose security practices are widely known to be good.
  • Obfuscation efforts. Attackers often try to hide their exploitation work from quick inspections by using external, referring domain names that are spelled very similarly to known, trusted entities of the Web. Check the spelling of domain names in all external resources to be sure the URLs were not changed to addresses that are similarly named but not the actual, intended target. This includes external references to advertisers, hit counters and other such controls, external images, analytics trackers, and the like. Also, look for URLs that substitute IP addresses for domain names, another common method of obfuscation.
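Visual inspection gets tedious on a large site, so a first-pass scan can help you decide where to look. The sketch below flags several of the indicators listed above in an HTML string. Every pattern and threshold here is my own rough assumption (for example, treating any run of 200 or more encoded-looking characters as suspicious), so expect false positives on pages that legitimately use iframes or minified script, and treat hits as prompts to investigate, not as verdicts.

```python
import re

# Heuristic patterns, assumed for illustration; tune them for your own pages.
INDICATORS = {
    "iframe": re.compile(r"<iframe\b", re.IGNORECASE),
    "meta_refresh": re.compile(r"""<meta[^>]+http-equiv\s*=\s*["']?refresh""", re.IGNORECASE),
    "ip_address_url": re.compile(r"https?://\d{1,3}(?:\.\d{1,3}){3}\b"),
    "long_encoded_run": re.compile(r"[A-Za-z0-9+/%\\]{200,}"),
    "script_eval": re.compile(r"\b(?:eval|unescape)\s*\(", re.IGNORECASE),
}


def scan_html(source):
    """Return the sorted names of indicators found in an HTML string."""
    return sorted(name for name, pattern in INDICATORS.items() if pattern.search(source))


# Made-up infected page; 203.0.113.7 is a reserved documentation address.
sample = (
    '<html><head><meta http-equiv="refresh" content="0;url=http://203.0.113.7/x">'
    '</head><body><iframe src="http://203.0.113.7/a" width="0" height="0"></iframe>'
    "<script>eval(unescape('%61%6c'));</script></body></html>"
)
findings = scan_html(sample)
```

For the made-up sample, the scan reports the iframe, the IP-address URL, the meta refresh, and the `eval` call. It won't catch lookalike domain names; those still need a careful human read.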

What can you do?

You gotta look at the code. If someone cracked your web server’s security and modified your source code, you need to find what’s changed as the first step in identifying and cleaning up the malware. You can do this by visually inspecting the HTML and script code on your pages for unauthorized changes.

When you examine your source code, carefully inspect your code on your web server. Look for newly added scripts in your HTML pages that execute when the page loads, especially obfuscated script. Consider any references to third-party domains in your source code as a potential source of malware. Suspects should include any inserted external code that runs on your site when the page is loaded, including hit counters, images, media content, and other externally sourced controls. External scripts should never be implicitly trusted without a careful consideration of that host’s security practices, as this is a major security vulnerability.
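To get a quick inventory of the third-party domains a page references, something like this sketch can help. It uses Python's standard `html.parser` to collect the hosts named in `src` and `href` attributes that differ from your own host. The `ExternalReferenceFinder` class and the example hosts are hypothetical; the addresses use reserved documentation ranges.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ExternalReferenceFinder(HTMLParser):
    """Collect host names referenced by src/href attributes outside own_host."""

    def __init__(self, own_host):
        super().__init__()
        self.own_host = own_host.lower()
        self.external = {}

    def handle_starttag(self, tag, attrs):
        for attr, value in attrs:
            if attr in ("src", "href") and value:
                host = (urlparse(value).hostname or "").lower()
                if host and host != self.own_host:
                    self.external.setdefault(host, set()).add(tag)


def external_hosts(source, own_host):
    """Map each external host to the sorted list of tags that reference it."""
    finder = ExternalReferenceFinder(own_host)
    finder.feed(source)
    return {host: sorted(tags) for host, tags in finder.external.items()}


# Made-up page on a placeholder site.
page = (
    '<a href="/about.html">About</a>'
    '<img src="http://www.example.com/logo.png">'
    '<script src="http://counter.example.net/hit.js"></script>'
    '<iframe src="http://198.51.100.9/frame.html"></iframe>'
)
refs = external_hosts(page, "www.example.com")
```

Here the result lists `counter.example.net` (referenced by a script tag) and `198.51.100.9` (referenced by an iframe). Any host on the list that you don't recognize, or that is a bare IP address, deserves a closer look. Note that this only sees references present in the static HTML, not ones that script adds at load time.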

As much as possible, remove unnecessary, externally sourced content to reduce your exposure to exploits beyond your control. Only embed content from trusted third parties into your webpages. If you discover some code that was added to or modified on your page without authorization or realize a once-trusted external page element now appears to be malicious, simply remove that portion of the code from your file to clean it up.

Malware might also have been embedded in your existing images, document files, animations and media content, or other binary files that are presented on your pages. All of these should be scanned again with an antivirus tool for malware.

If you are using a version control system for maintaining your site’s source code, you can easily redeploy the last known good version before the infection occurred. Just be sure that the versioned source code from your workstation is not the source of the malware.
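Before redeploying, it can be useful to see exactly which deployed files differ from the last known good version. This sketch compares two directory trees by SHA-256 hash. The `hash_tree` and `compare_trees` helpers are my own naming, and the sketch assumes you have exported the known good version to a plain directory (without version control metadata) and downloaded a copy of the deployed site to another.

```python
import hashlib
import os
import tempfile


def hash_tree(root):
    """Map each relative file path under root to its SHA-256 digest."""
    digests = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            with open(path, "rb") as handle:
                content = handle.read()
            digests[os.path.relpath(path, root)] = hashlib.sha256(content).hexdigest()
    return digests


def compare_trees(known_good, deployed):
    """Report files added to, removed from, or modified in the deployed copy."""
    good, live = hash_tree(known_good), hash_tree(deployed)
    return {
        "added": sorted(set(live) - set(good)),
        "removed": sorted(set(good) - set(live)),
        "modified": sorted(p for p in set(good) & set(live) if good[p] != live[p]),
    }


# Demonstration with throwaway directories standing in for the two copies.
with tempfile.TemporaryDirectory() as good, tempfile.TemporaryDirectory() as live:
    for root, body in ((good, "<p>Hello</p>"), (live, "<p>Hello</p><iframe></iframe>")):
        with open(os.path.join(root, "index.html"), "w") as handle:
            handle.write(body)
    with open(os.path.join(live, "x.js"), "w") as handle:
        handle.write("// unexpected file")
    report = compare_trees(good, live)
```

Files reported as added or modified are where to start your inspection; the hashes only tell you that something changed, not whether the change is malicious.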

Diagnostic tools to use

To help in your source code examination, use these tools for additional insight on cleaning up a malware mess:

  • Run an antivirus utility on your source code. Install a fully capable antivirus software tool (and regularly update it to be sure its program code and malware signatures are current) to run a thorough scan of the folders containing your website’s source code. It may be able to detect some forms of malware if they are locally installed on your web server or your webpage files were modified with unauthorized, malicious scripts. Also run a thorough antivirus scan of your personal workstation (the one you use to edit your site’s source code and connect to the web server for uploads). You may unknowingly infect an otherwise clean web server with a compromised workstation infected with malware. And if you get a key logger infection on your workstation, the hacker controlling that malware might steal your web server’s FTP logon credentials, providing them with full access to attack your site with malicious content.
  • Run Fiddler HTTP proxy on your website. Fiddler is a no-cost web debugging proxy tool used to see what HTTP calls are being made when your page is loaded. By examining the multi-threaded network traffic generated by your webpages, you can see if your pages are making unexpected calls to unknown resources, and if so, identify where they are going. Watch the Fiddler video tutorials and read its documentation to learn how this valuable tool is used and how it works.

Checking for man-in-the-middle attacks

You might also inspect your source code as received by your browser using the browser’s View Source command to check for “man-in-the-middle” attacks. In that case, a direct inspection of the original webpage source code files on your web server would likely reveal no malware infection. However, by revealing and examining the source code for the infected webpages from your web browser and comparing the results to the original, clean file from the web server, you might find the malicious changes. If so, inform your web-hosting provider that they might be the victims of a "man-in-the-middle" attack. If your provider takes no action as a result, consider moving your website to a more trusted provider. Luckily, as this is a much more sophisticated attack, it is less common than overt modification of the code on your webpages.
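Comparing two versions of a page by eye is error-prone, so here is a minimal sketch that does the comparison with Python's `difflib`. It assumes you have pasted the original file from the web server and the browser's View Source output into two strings; the function name and the sample page are made up for the example.

```python
import difflib


def injected_or_removed_lines(original, served):
    """Return lines that appear in only one of the two page sources."""
    changes = difflib.ndiff(original.splitlines(), served.splitlines())
    # ndiff prefixes added lines with "+ " and removed lines with "- ".
    return [line for line in changes if line.startswith(("+ ", "- "))]


# Made-up page; the injected script points at a reserved documentation address.
original_page = "<html>\n<body>\n<p>Welcome</p>\n</body>\n</html>"
served_page = (
    "<html>\n<body>\n<p>Welcome</p>\n"
    '<script src="http://203.0.113.7/x.js"></script>\n'
    "</body>\n</html>"
)
differences = injected_or_removed_lines(original_page, served_page)
```

In this example the only difference reported is the injected script line. Keep in mind that pages generated dynamically on the server will legitimately differ from their source files, so this comparison is most telling for static pages.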

Warning! Make sure both your browser and your operating system are running the latest security updates, along with running up-to-date antivirus, anti-spyware, and software firewall products, to minimize the vulnerabilities to your computer when loading pages likely to be infected with malware.

Also, most web browsers allow you to configure specific security settings for individual sites. Add your infected site to the list. (If your browser doesn’t allow you to specify security settings for individual sites, you can temporarily implement these settings for all sites during your testing, but you may want to revert those changes later to restore full functionality.) You’ll want to disable JavaScript for your tests. If you’re using Internet Explorer, you’ll also want to disable ActiveX controls. These changes will protect your computer from the infection methods used by malware.

Verifying your fixes

Once you have cleaned up the problem, you should verify your work to be sure the revised code is clean.

  • Visually inspect the page on the web server to be sure the edits are in place.
  • Visually inspect the changed page and its source code in your web browser.
  • Use Fiddler to ensure that the malware’s unexpected external network calls have been eliminated.

A stumper

Sometimes you’ll scan your site’s code and find no clear source of malware, yet malware is clearly affecting the users of your website. If this is the case, look at portions of your code where you take user input without input validation, write cookies to the user’s computer, or perform other such personalized activity beyond simply displaying information to a generic user. Your site may be the victim of cross-site scripting (XSS). Resolving this specific issue is beyond the scope of this article, but it is very commonly used by hackers for exploiting computer security vulnerabilities, and you should learn how to protect your site against such attacks.

Additional information resources

Microsoft offers a number of useful anti-malware resources to help you understand what you are up against and what you need to do. Check these out for starters:

The topic of malware cleanup is admittedly not an introductory-level subject, despite this being the SEM 101 column. But the negative implications of detected malware infections on a website are huge. Referrals from Bing will likely dry up after the bots detect malware, because end-user protection mechanisms on the SERPs discourage searchers from clicking through to an infected page. On top of that, the few customers who choose to circumvent those protections on the SERP, or who browse directly to an infected site, may suffer the frustrating consequences of a malware infection. Either way, the folks whom you are trying to convert, whether with a purchase, a subscription, or a download, will be forced to deal with the unpleasant mess left by the malware picked up from your site. They won’t remain your customers for long. And that’s why this topic needs to be addressed in SEM 101, even though it’s not really a 101-level topic.

If you have any questions or comments about malware, please feel free to post them in our General Questions forum. For regular SEM and SEO questions and suggestions, please go to our SEM forum. Next up: how to better secure your computers against hacker attacks. Until then…

– Rick DeJarnette, Bing Webmaster Center


Posted in Webmaster Feeds
