MODX & SEO, the basic techniques

MODX provides such a flexible templating system that it is what every web designer dreams of: "designing with no restrictions, in the shortest time." Period.

But sometimes this goodness dazzles the eyes and makes the developer forget another important thing: "how to make this great website searchable in the shortest time." :)

In other words, MODX developers need to be aware of how search engines crawl their websites.

There are many SEO tutorials out there. Here, I'll try to wrap up how MODX developers can implement the basic SEO techniques on MODX sites, starting with a brief overview of the search engines and SEO.

The Search Engines

In this article, I will only cover the big three search engines: Google, Bing, and Yahoo!.

Google

http://google.com

At the time of writing, Google uses the Google Panda algorithm (plus manual action when a site is reported) to measure the relevancy between keywords and websites. Basically, it separates high-quality from low-quality page content, so:

  1. don't duplicate content from other sources,
  2. relate the content of the page to the Title and Description tags,
  3. don't put in too many advertisements,
  4. and most importantly, make the page crawlable.

Searchers might have noticed that Google first tries to narrow the search results based on the geolocation of the user's IP address. We will take advantage of this later on.

Google has several more search-related sections on its site that cannot be ignored as additional SEO factors: Images, Maps, and Photos (for Picasa).

Ref:

  1. TED 2011: The 'Panda' That Hates Farms: A Q&A With Google's Top Search Engineers
  2. How Google makes improvements to its search algorithm
  3. How Google works

Yahoo!

http://search.yahoo.com/

Did you know that Yahoo! now uses Bing's search engine? It's a 10-year deal. Search for anything on Yahoo! and you will see "Powered by Bing" in the footer. Their own website even notes that

"The organic search listings on Yahoo! Search is now powered by Bing in the US and Canada. Webmasters should use the Bing Webmaster Center tool for issues related to Yahoo! Search and/or Bing."

[source]

So, for web search results, web developers should use Bing's technical references.

There are slight differences, as Yahoo! provides more features on its results pages. One surprising thing is that, for some results, Yahoo! also returns links from Twitter. You can try it with any movie artist's name. Its live search is also awesome: it has a sub-panel that directly lists the top 3 pages for the hovered keyword.

For reference, these are Yahoo!'s other sections: Images, Video, Shopping, Blog, News, Local, Recipes, Sports, Finance, Movie, OMG!, Apps, and Directory.

Bing

http://www.bing.com/

Microsoft has just introduced its new algorithm, Search Quality Insights. It is trying to catch up with Google by introducing some new features, like:

  1. Reducing Junk
  2. Answer Ranking

To make more advanced use of Bing, a web developer who owns a Windows Live account can log in to Bing Webmaster Center Tools and optimize the SEO features there, including registering the site to be indexed by Bing.

The developer is then asked to verify ownership of the website by uploading Bing's XML file, adding a special meta tag, or adding a CNAME record to DNS. Furthermore, there are more options such as Crawl Summary, Crawl Settings, Crawl Details, Sitemaps (XML, Atom, RSS), and Markup Validator.

After verification, the webmaster center shows more tab panels for the registered site. Newly added sites will be crawled within the next 3 days.

To explain in more detail what a properly structured website is, Bing provides its Guidelines for successful indexing for webmasters. For a short explanation, read Microsoft's blog post about how to put keywords on the page.

A fact that a web developer cannot ignore is that Bing's search engine is connected to Facebook's internal web search. That means that if a user searches for something on Facebook, Bing records the keywords.

For reference, these are Bing's other sections: Images, Videos, News.

Schema.org

The most interesting thing about these big three is that they have united to make search listings richer through structured data.

The idea is to feed search engines microdata tags that describe the detailed properties of specific items in the page content.

But don't worry too much about this. It's still being tested and experimented with. For the moment, a web developer should at least be aware of it, and might be interested in doing more research.
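To give a rough idea of what microdata looks like, here is a minimal sketch of a blog post marked up with schema.org's Article type. The itemscope, itemtype, and itemprop attributes are standard microdata; the class name, title, date, and author below are made-up values for illustration only. In a MODX chunk, they would come from the resource placeholders instead.

```html
<!-- Illustrative values only; "Jane Doe" and the date are assumptions -->
<div class="blogPost" itemscope itemtype="http://schema.org/Article">
  <h2 itemprop="headline">MODX &amp; SEO, the basic techniques</h2>
  <time itemprop="datePublished" datetime="2012-03-01">Mar 01 2012</time>
  <span itemprop="author">Jane Doe</span>
  <p itemprop="description">How MODX developers can implement basic SEO techniques.</p>
</div>
```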

Social Network's Impact

Social networking brings new rules to the SEO game. Just through tweets, likes, or shared links, any website can gain more popularity and visitors. This does not add relevancy value to the keywords, but it creates a new mindset. A website will start to get a higher ranking when a visitor blogs about it and/or links to it.

Search engines have started to connect their systems to this new hype. As I've already mentioned, Bing is tied to Facebook, and Google has built Google+. Yahoo! had a connection with Twitter some time ago, and Microsoft has extended its collaboration with Twitter. On the other hand, Microsoft seems to be on its way to building a new social network.

By the way, do you know about Topsy?

If you're interested in this topic, here is another good read about how social networks impact SEO.

The SEO

First of all, we need to narrow down what the search engines need from our website, since we are only talking about the basic implementation. Our target is to get the site we develop onto the first three pages of the Search Engine Results Pages (SERPs). Minimum.

If you want a quick understanding of SEO, please read the explanations of Search Engine Optimization (SEO) from Google or from Search Engine Land.

To keep this discussion short but clear, let's use the periodic table from Search Engine Land's website. It is practically the simplest way.

Search Engine Land Periodic Table of SEO Ranking Factors

As you can read in this chart, we only need to focus on maximizing several things and avoiding several negative ones.

From here on, I will try to implement Search Engine Land's SEO guidelines.

The MODX

Now for the MODX stuff. Let's start with the "On the Page" factors.

Content

This is the most important SEO factor. Each web page must contain enough words for search engines to index. Then, at search time, prospective customers can be referred to the pages whose content is most related to their keywords.

Some pages, like a photo gallery, are not meant to have many words, but a blog is. So blogging has important value for positioning a website higher in search engine rankings. If the site has a blog, the content editor should use the same keywords on the page several times, perhaps by re-editing the article after finishing it.

These are the key factors for this topic:

  • Cq: Content Quality
  • Cr: Content Research / Keyword Research
  • Cw: Content Words / Use Of Keywords
  • Ce: Content Engagement
  • Cf: Content Freshness

HTML Code & Templating

So, why does templating become important?

  1. People, page layout matters! Web designers should try not to put ads at the top of the page. At least not excessive ones.
  2. Create a good page structure.

Ht: HTML Title Tag & Hd: The Meta Description Tag

Why not use more descriptive words for the Title and Description tags?

search result example

 

These tags will help users to figure out quickly what the page is about.

So, this is good practice when creating the basic template for MODX:

<html>
<head>
<title>[[*longtitle:notempty=`[[*longtitle]]`:default=`[[*pagetitle]]`]]</title>
<meta name="description" content="[[*description:notempty=`[[*description]]`:default=`Put a general description of the site here`]]" />

If you're new to MODX, I'm using MODX's Output Modifiers here. Basically it says, "if longtitle is not empty, use it; otherwise use the default, which is the pagetitle". The pagetitle is a mandatory field for a MODX resource, so it will always be there.

The description has to be unique for each page. It adds a wider explanation of the content that cannot be conveyed in the title.

Hh: Header Tags

Use the header tags appropriately. H1 is the most important tag, and it usually carries the main keywords for that particular page. The web designer can use multiple H1 tags on a page, but it is probably better to use it only once.

That said, since resources usually use the H1 tag for the page title, the web developer should use H2 for the titles in the template chunk when listing resources with getResources or renderResources.

HTML errors

HTML errors don't directly affect search engine rankings, but they do matter when an error breaks several parts of the page, if not all of it.

This is an example of a common error.

MODX developers will often have used getResources to retrieve several resources (web pages), like news or the latest articles.

In many cases, this is the code:

[[!getPage?
&elementClass=`modSnippet`
&element=`getResources`

&parents=`17`
&depth=`2`
&limit=`10`
&pageVarKey=`page`

&includeTVs=`1`
&includeContent=`1`

&tpl=`blogListPost`
]]
<div class="paging">
<ul class="pageList">
[[!+page.nav]]
</ul>
</div>

and the chunk blogListPost:

<div class="blogPost">
<div class="date">[[+publishedon:strtotime:date=`%b %d %Y`]]</div>
<h2><a href="[[~[[+id]]]]" title="[[+pagetitle]]">[[+pagetitle]]</a></h2>

<p class="author"><strong>Author:</strong> <span class="author">[[+createdby:userinfo=`username`]]</span></p>
<p class="summary">[[+content:ellipsis=`200`]]</p>
<p class="readmore"><a href="[[~[[+id]]]]"><span>Read more</span></a></p>

<div class="clear"></div>
</div>
<hr/>

There is a hiccup in this code: [[+content:ellipsis=`200`]] (this differs from MODX's example, which uses [[+introtext]]).

This output modifier cuts the content off after 200 characters, which can leave the retrieved HTML body without its proper closing tags. What happens if the unclosed tag is an a (link)? The rest of the text on the page becomes a link.

I ran into this situation before, and my solution was to create a custom modifier that uses PHP's Tidy extension:

[[+content:ellipsis=`200`:strip_tags=`<p><br>`:phptidyModifier]]

I borrowed the PHP Tidy plugin's properties, and created the phptidyModifier snippet as below:

<?php
// phptidyModifier: a custom output modifier that repairs HTML with PHP Tidy.
// MODX passes the text being filtered to the snippet as $input.

// Normalize the plugin's property definitions.
// The function_exists() guard matters: the modifier can run several times in
// one request, and declaring the same function twice is a fatal error in PHP.
if (!function_exists('fixJson')) {
    function fixJson(array $array) {
        $fixed = array();
        foreach ($array as $k => $v) {
            $fixed[] = array(
                'name' => $v['name'],
                'desc' => $v['desc'],
                'type' => $v['xtype'],
                'options' => empty($v['options']) ? '' : $v['options'],
                'value' => $v['value'],
                'lexicon' => $v['lexicon'],
            );
        }
        return $fixed;
    }
}

// Read the default properties that ship with the PHP Tidy plugin
ob_start();
include $modx->getOption('core_path') . 'components/phptidy/elements/plugins/phptidy.plugin.default.properties.js';
$json = ob_get_clean();

$properties = $modx->fromJSON($json);
$properties = fixJson($properties);
$defaultProperties = array();
foreach ($properties as $k => $v) {
    $defaultProperties[$v['name']] = $v['value'];
}

// Build the Tidy config, skipping the options that are not needed here
$tidyConfig = array();
foreach ($defaultProperties as $k => $v) {
    if ($k === 'css-prefix' || $k === 'language' || $k === 'slide-style') {
        continue;
    }
    // Convert boolean strings (from the generated drop-down options) to 1 or 0
    $lower = strtolower((string) $v);
    if ($lower === 'true' || $lower === 'yes' || $lower === '1') {
        $v = 1;
    } elseif ($lower === 'false' || $lower === 'no' || $lower === '0') {
        $v = 0;
    }
    $tidyConfig[$k] = $v;
}
// Non-default settings: return only the body fragment and keep entities as they are
$tidyConfig['show-body-only'] = 1;
$tidyConfig['preserve-entities'] = 1;

// Tidy
$tidy = new tidy;
$tidy->parseString($input, $tidyConfig);
$tidy->cleanRepair();

// Log whatever Tidy reported while repairing the markup
if ($tidy->errorBuffer) {
    $modx->log(modX::LOG_LEVEL_ERROR, "There are some errors!\n" . $tidy->errorBuffer . "\n");
}

// Return the repaired markup as a string, not the tidy object itself
return tidy_get_output($tidy);

I used PHP Tidy because it automatically fixes HTML syntax errors. If anyone wants to use it too, I recommend further reading about its complete configuration options.
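If the Tidy extension is not available on the server, a lighter alternative is to strip the tags before cutting, so nothing can be left unclosed. This is only a minimal sketch under my own assumptions: plainSummary is a hypothetical helper name, it assumes the mbstring extension is installed, and it gives up all formatting in the summary.

```php
<?php
// Hypothetical helper, not part of MODX: plain-text summary of an HTML string.
if (!function_exists('plainSummary')) {
    function plainSummary($html, $limit = 200)
    {
        // 1. Drop every tag first, so the cut can never leave one open.
        $text = strip_tags($html);
        // 2. Decode entities so the cut cannot split something like "&amp;".
        $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
        // 3. Collapse the whitespace runs left behind by the removed tags.
        $text = trim(preg_replace('/[ \t\r\n]+/', ' ', $text));

        if (mb_strlen($text, 'UTF-8') > $limit) {
            $text = mb_substr($text, 0, $limit, 'UTF-8');
            // Back up to the last space so the summary does not end mid-word.
            $lastSpace = mb_strrpos($text, ' ', 0, 'UTF-8');
            if ($lastSpace !== false) {
                $text = mb_substr($text, 0, $lastSpace, 'UTF-8');
            }
            $text .= '...';
        }
        // 4. Re-encode so the result is safe to print inside the chunk.
        return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    }
}

echo plainSummary('<p>Read <a href="/blog/">the whole <em>blog</em></a> today</p>', 14), "\n";
```

Saved as a snippet that ends with a line returning plainSummary($input, (int) $options), it could be used as [[+content:plainSummary=`200`]], since MODX hands the filtered value to a custom modifier as $input and the modifier's argument as $options.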

The bottom line is that web developers must be sure the HTML code is correct from top to bottom, left to right, on both the individual page and the parent page.

Site Architecture

Ac: Site Crawlability

First of all, do not use JavaScript or Flash for navigation links. Or at least use unobtrusive JavaScript.

Web developers should also provide two kinds of sitemap: one for humans (in HTML) and one for search engines (in XML). Most search engines follow the sitemap.xml format described at http://www.sitemaps.org/.

To create sitemap.xml in MODX, we have 2 options:

  1. using GoogleSiteMap snippet, or
  2. using getResources with its Google XML template.

I prefer the second option, because web editors can then set the changefreq and the priority through Template Variables.

Again, please remember to use an output modifier to escape HTML entities, such as changing '&' to '&amp;'.
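To show what that escaping has to achieve, here is a small PHP sketch that renders one sitemap entry. The function name, the example URL, and the default changefreq and priority values are my assumptions for illustration; in a real MODX site the values would come from the resource and its Template Variables.

```php
<?php
// Hypothetical helper: render one <url> entry of a sitemap.xml file.
function sitemapUrlEntry($loc, $lastmod, $changefreq = 'weekly', $priority = '0.5')
{
    // ENT_XML1 escapes &, <, >, " and ' as XML entities.
    $esc = function ($value) {
        return htmlspecialchars($value, ENT_QUOTES | ENT_XML1, 'UTF-8');
    };

    return "<url>\n"
        . '  <loc>' . $esc($loc) . "</loc>\n"
        . '  <lastmod>' . $esc($lastmod) . "</lastmod>\n"
        . '  <changefreq>' . $esc($changefreq) . "</changefreq>\n"
        . '  <priority>' . $esc($priority) . "</priority>\n"
        . "</url>\n";
}

// The raw "&" in the query string must come out as "&amp;".
echo sitemapUrlEntry('http://www.example.com/blog/?tag=modx&page=2', '2012-03-01');
```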

Another important thing is robots.txt. Web developers should be aware that search engines read it, so they have to be sure that robots.txt does not block the web spiders from crawling the website.
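As a starting point, a robots.txt for a MODX Revolution site might look like the sketch below. It assumes the default directory names (manager, core, connectors) and a sitemap at the site root; www.example.com is a placeholder. Adjust it if the directories were renamed, and make sure the public pages and the assets are not disallowed by accident.

```text
# Let every crawler in, but keep it out of the back-end directories
User-agent: *
Disallow: /manager/
Disallow: /core/
Disallow: /connectors/

# Tell crawlers where the XML sitemap lives (placeholder domain)
Sitemap: http://www.example.com/sitemap.xml
```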

To speed up the indexing of a new site, web developers can register it with Google Webmaster Tools or Bing Webmaster Center Tools.

As: Site Speed

Oh, well, with MODX Revolution, how fast can you go?

  • New in 2.2.1: web developers can use session-less contexts. Bear in mind, this is intended for single-user websites,
  • Cache everything,
  • Use minifier and gzip for CSS and JS files.

Au: Are Your URLs Descriptive?

Websites should use Friendly URLs. It is important to have human-readable URL addresses. They add another relevancy value when search engines measure the user's keywords against the web page, including the site's URL. It will be hard to gain any SEO advantage if the site uses index.php?id=1 for its URLs. And if the site uses extra URL parameters on purpose, add a canonical link element to the MODX template.
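A canonical link element in the template's head could be sketched like this. It assumes Friendly URLs are switched on in the system settings, and it uses the scheme property of the MODX link tag to output the absolute URL of the current resource without any extra parameters.

```html
<!-- Inside the template's <head>; the resource ID is taken from the current page -->
<base href="[[++site_url]]" />
<link rel="canonical" href="[[~[[*id]]? &scheme=`full`]]" />
```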

Link Building

Now we start talking about the "Off the Page" ranking factors.

Google interprets a link from page A to page B as a vote, by page A, for page B. While it is naturally better if the link comes from elsewhere, internal linking is also a valid SEO practice.

On this website, I apply it by adding "Related Articles" or "Latest Articles" to each blog page using getResources.

Lq: Link Quality

It's about getting links from respectable websites to our website. A link in comments carries only a small weight in SEO ranking. Getting linked from the V1a-gr4 websites is a different story.

Lt: Link Text / Anchor Text

It's about the text inside the link. The anchor text adds relevancy to the linked website.

So "click here" style anchor text will not help.

Ln: Number Of Links

Having a lot of backlinks is good. Having the spammy ones is not.

Social Media

As I mentioned above in the "Social Network's Impact" section, social media brings new rules to the SEO game. It boosts the number of incoming visitors because the link is shared, tweeted, or liked. The engineers have commented in different ways on how their machines react to these social signals. But it cannot be denied that all of them are racing to build a real-time index.

The key factors in this chapter are:

  • Sr: Social Reputation
  • Ss: Social Shares

It's a good idea for the web developer to add social bookmarking to the web page. While tweets fade away in minutes, a social bookmark stays to remind new readers after a period of time. Because it is basically static HTML, MODX developers can add the social bookmark script as a MODX chunk or in the template.

Trust, Authority

I'll skip this. It is not technical.

Personalization

Each search engine gives different results. That is because of personalization of the user experience and geo-location factors. You might be interested in trying the Rank Checker tool from seobook.com.

Pc: What Country?

Search engines detect the IP address of the user's computer. So if the website targets a specific market outside its country of origin, webmasters can try hosting it on a server in that particular country.

Pl: What City Or Locality?

Some websites need to target smaller regions more precisely. Take real estate websites: they contain addresses and phone numbers for contacts.

For this job, it's better to use Template Variables on the house detail Template to implement one or all of the geo-location targeting options, such as:

Ph: Personal History

To get an idea of the user experience factor, log in to any search engine's service, then refresh the search results. You'll notice that the results are adjusted to the records of your previous behavior across all of their services.

Ps: Personal Social Connections

This is again connected to "Social Media".

Violations

Now we come to the negative signals. These are penalties, so they can drag the website's rankings down.

Vt: "Thin" or "Shallow" Content

Honestly, I'm not quite sure about this particular topic.

  • How does Google determine the quality of the content?
  • How does it deal with non-English websites?
  • How does it cope with websites full of scientific formulas?

This is the Google Panda update. Basically, the machine tries to judge the quality of the page content for humans and throws out the bad ones.

Hm... weird, and frightening.

Vs: Keyword Stuffing

Don't think too much about meta keywords. Some search engines might measure them, but Google does not. For a long time, it was a bad practice of webmasters to spam the keywords tag with words unrelated to the web page, just to get higher rankings. For legitimate reasons, some SEO experts still use the tag for search engines other than Google.

If the web developer keeps using it, he/she had better use it wisely. Don't repeat words too many times, and don't stuff it with misspellings either.

There are keyword tools for analyzing keyword statistics:

Vh: Hidden Text

Web developers should not try to cheat search engines by putting keywords inside a hidden area. Search engines will sink the website's rankings.

Vc: Cloaking

Cloaking is an attempt to serve different content to humans and to search engines, using redirects.

Some people detect the spider's IP address or the User-Agent HTTP header, then send it to a different page.

Matt Cutts says, "don't do it".

Vp: Paid Links

Webmasters should not buy services that offer backlinks. When search engines catch it, not only will the website's rankings sink, the site can be banned!

Vl: Link Spam

Well, anyone can put a link to a specific page in comments as a reference, but spamming them with links is a bad idea.

Blocking

Bp: Personal Blocking

A user can stop a site from appearing again by blocking its link in the results.

Bt: Trust Blocking

If many other users block it too, the systems will start to treat the site the same way for everyone.

Multilingual Websites

You might already be using the Babel add-on for this. At the time of writing, Babel is the best option for building a multilingual site in MODX Revolution. It makes it easy to switch between languages while editing the content. For the best way to combine it with Friendly URLs, here is the tutorial on SEO Friendly Multilingual Websites with MODx and Babel.

Conclusion

It is easy to implement SEO on a MODX website. Most of it is not technical, but there are still some parts for web developers when it comes to the dirty work of coding. Thanks to the MODX developer team for making front-end engineering so easy to handle in this CMS, which comes down to:

  • chunk
  • template
  • template variables
  • sitemap.xml
  • robots.txt
  • friendly URL

I hope this helps. Thanks for reading.

Other References

Some blogs you might want to read further related to this article:

  1. 85 Reasons Why Website Designers/Developers Keep SEOs in Business
  2. Basic SEO in MODX Revolution
  3. MODX Revolution 2.2 and schema.org
