Regular expression to get html meta description

  Peter      2012-07-03 10:09:20

When processing the source code of an HTML page, we often need to retrieve the page's meta description in addition to the links in the page. This description is usually located in a <meta> tag of the HTML page, and search engines often use it when indexing the page. How can we retrieve the meta description? For simple, well-formed markup, a regular expression can do the job.

In JavaScript, the regular expression looks like this:

var pattern = /<meta.*?name="description".*?content="(.*?)".*?>|<meta.*?content="(.*?)".*?name="description".*?>/i;

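The description is the value of the content attribute in the <meta> tag whose name attribute has the value description. So we need to find this tag and then use parentheses to capture the description for later retrieval. The | character separates two sub-patterns because the attributes can appear in either order: name before content, or content before name. The i flag makes the match case-insensitive. Note that the pattern assumes the attribute values are wrapped in double quotes.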

Suppose we have a sample code snippet that contains:

var data='<meta name="description" content="This is a sample code snippet">';

Then we run:

var arr=pattern.exec(data);

If the pattern matches, the returned arr is an array with 3 elements; if it does not match, exec returns null. The first element, arr[0], is the matched text from the data variable, arr[1] is the text captured by the first pair of parentheses in the pattern, and arr[2] is the text captured by the second pair. If the first sub-pattern matched, arr[1] contains the description and arr[2] is undefined. Otherwise, arr[1] is undefined and arr[2] contains the description.

In the above case, arr[0] is <meta name="description" content="This is a sample code snippet">, arr[1] is This is a sample code snippet, and arr[2] is undefined.
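
To see the second sub-pattern at work, here is a minimal sketch that runs the same pattern against a tag whose content attribute comes before name. The variable name reversed and its sample string are made up for illustration; they are not part of the original example.

```javascript
// Same pattern as above, repeated so this snippet runs on its own.
var pattern = /<meta.*?name="description".*?content="(.*?)".*?>|<meta.*?content="(.*?)".*?name="description".*?>/i;

// Hypothetical sample: content appears before name.
var reversed = '<meta content="This is a sample code snippet" name="description">';

var arr = pattern.exec(reversed);

console.log(arr.length); // 3
console.log(arr[1]);     // undefined (the first sub-pattern did not match)
console.log(arr[2]);     // This is a sample code snippet
```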

In conclusion, to get the meta description, first make sure arr is not null, then check whether arr[1] is undefined. If it is, the description is arr[2]; otherwise it is arr[1].
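
The check described above could be wrapped in a small helper along these lines. This is only a sketch: the function name getMetaDescription is an assumption made for illustration, and it inherits the limits of the pattern (double-quoted attribute values, first matching tag only).

```javascript
// Hypothetical helper; the name getMetaDescription is not from the article.
function getMetaDescription(html) {
  var pattern = /<meta.*?name="description".*?content="(.*?)".*?>|<meta.*?content="(.*?)".*?name="description".*?>/i;
  var arr = pattern.exec(html);

  // exec returns null when neither sub-pattern matches.
  if (arr === null) {
    return null;
  }

  // The group belonging to the sub-pattern that did not match is undefined,
  // so fall back to the other group.
  return arr[1] !== undefined ? arr[1] : arr[2];
}

var data = '<meta name="description" content="This is a sample code snippet">';
console.log(getMetaDescription(data));                      // This is a sample code snippet
console.log(getMetaDescription('<p>No meta tag here</p>')); // null
```

Comparing against undefined rather than testing for truthiness keeps an empty description (content="") from being mistaken for a missing one.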
