Sitewide Search On A Shoe String

One of the questions I got a lot when I was building web sites for smaller businesses was whether I could create a search engine for their site. Visitors should be able to search only this site and find things without the maintainer having to put “related articles” or “featured content” links on every page by hand.

Back when this was all fields, this wasn’t easy: you either had to write your own scraping tool, use ht://dig, or pay for a service from providers like Yahoo, Altavista or, later on, Google. If you ran it yourself you had to swallow the bitter pill of computing and indexing all your content and storing it in a database for quick access; if you paid for a service, it hurt your wallet.

Times have moved on and nowadays you can have the same functionality for free using Yahoo’s “Build your own search service”, or BOSS for short. The cool thing about BOSS is that it allows for a massive number of hits a day and you can mash up the returned data in any format you want. Another good feature is that it comes with JSON-P as an output format, which makes it possible to use it without any server-side component!

Starting with a working HTML form

In order to add a search to your site, you start with a simple HTML form which you can use without JavaScript. Most search engines will allow you to filter results by domain. In this case we will search “bbc.co.uk”. If you use Yahoo as your standard search, this could be:

<form id="customsearch" action="http://search.yahoo.com/search">
	<div>
		<label for="term">Search this site:</label>
		<input type="text" name="p" id="term">
		<input type="hidden" name="vs" id="site" value="bbc.co.uk">
		<input type="submit" value="go">
	</div>
</form>

The Google equivalent is:

<form id="customsearch" action="http://www.google.co.uk/search">
	<div>
		<label for="term">Search this site:</label>
		<input type="text" name="as_q" id="term">
		<input type="hidden" name="as_sitesearch" id="site" value="bbc.co.uk">
		<input type="submit" value="go">
	</div>
</form>

In any case make sure to use the ID term for the search term and site for the site, as this is what we are going to use for the script. To make things easier, also have an ID called customsearch on the form.

To use BOSS, you should get your own developer application ID and use it in place of the one in the demo code. There is click tracking on the search results to see how successful your app is, so you should make it your own.

Adding the BOSS magic

BOSS is a REST API, meaning you can use it in any HTTP request or in a browser by simply adding the right parameters to a URL. Say, for example, you want to search “bbc.co.uk” for “christmas”: all you need to do is open the following URL:

http://boss.yahooapis.com/ysearch/web/v1/christmas?sites=bbc.co.uk&format=xml&appid=YOUR-APPLICATION-ID

Try it out and click it to see the results in XML. We don’t want XML though, so we get rid of the format=xml parameter. Without it, the API gives us the same information in JSON:

http://boss.yahooapis.com/ysearch/web/v1/christmas?sites=bbc.co.uk&appid=YOUR-APPLICATION-ID

JSON makes most sense when you can send the output to a function and immediately use it. For this to happen, all you need to do is add a callback parameter and the JSON will be wrapped in a function call. Say, for example, we want to call SITESEARCH.found() once the data has been retrieved. We can do it this way:

http://boss.yahooapis.com/ysearch/web/v1/christmas?sites=bbc.co.uk&callback=SITESEARCH.found&appid=YOUR-APPLICATION-ID
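If you'd rather not glue these URLs together by hand, a small helper can do it for you. This is only a sketch: buildBossUrl is a made-up name and not part of BOSS, and it assumes the endpoint and the parameter names (sites, format, callback, appid) work as in the URLs above.

```javascript
// Hypothetical helper (not part of BOSS): assembles a web search URL
// from a search term and the parameters used in the examples above.
function buildBossUrl(term, options) {
  var base = 'http://boss.yahooapis.com/ysearch/web/v1/';
  var names = ['sites', 'format', 'callback', 'appid'];
  var params = [];
  for (var i = 0; i < names.length; i++) {
    var value = options[names[i]];
    if (typeof value !== 'undefined') {
      params.push(names[i] + '=' + encodeURIComponent(value));
    }
  }
  // Encode the term too, so spaces and ampersands don't break the URL
  return base + encodeURIComponent(term) + '?' + params.join('&');
}

var url = buildBossUrl('christmas', {
  sites: 'bbc.co.uk',
  callback: 'SITESEARCH.found',
  appid: 'YOUR-APPLICATION-ID'
});
```

With these values, url is the same address as the callback example above. Leave out callback and you get the plain JSON version.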

You can use this immediately in a script node if you want to. The following code would display the total number of search results for the term christmas on bbc.co.uk as an alert:

<script type="text/javascript">
	var SITESEARCH = {};
	SITESEARCH.found = function(o){
		alert(o.ysearchresponse.totalhits);
	}
</script>
<script type="text/javascript" src="http://boss.yahooapis.com/ysearch/web/v1/christmas?sites=bbc.co.uk&callback=SITESEARCH.found&appid=Kzv_lcHV34HIybw0GjVkQNnw4AEXeyJ9Rb1gCZSGxSRNrcif_HdMT9qTE1y9LdI-">
</script>
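To see why this works, it helps to look at what the script node actually receives: one function call with the data as its argument. The following is a rough simulation you can run outside a browser. The payload is invented sample data (only ysearchresponse, totalhits and resultset_web are names taken from this article), and new Function stands in for the browser executing the body of the script node.

```javascript
var SITESEARCH = {};
var received = null;

// The callback stores the value instead of alerting it
SITESEARCH.found = function (o) {
  received = o.ysearchresponse.totalhits;
};

// Made-up sample of what a JSON-P response body looks like:
// the JSON data wrapped in a call to the callback we asked for
var samplePayload =
  'SITESEARCH.found({"ysearchresponse":{"totalhits":"42","resultset_web":[]}})';

// A browser runs the body of a script node as code;
// new Function is used here as a stand-in for that step
new Function('SITESEARCH', samplePayload)(SITESEARCH);
```

After the last line has run, received holds the totalhits value from the sample payload.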

However, for our example, we need to be a bit more clever with this.

Enhancing the search form

Here’s the script that enhances a search form to show results below it.

var SITESEARCH = function(){
	var config = {
		IDs:{
			searchForm:'customsearch',
			term:'term',
			site:'site'
		},
		loading:'Loading results...',
		noresults:'No results found.',
		appID:'YOUR-APP-ID',
		results:20
	};
	var form;
	var out;
	function init(){
		if(config.appID === 'YOUR-APP-ID'){
			alert('Please get a real application ID!');
		} else {
			form = document.getElementById(config.IDs.searchForm);
			if(form){
				form.onsubmit = function(){
					var site = document.getElementById(config.IDs.site).value;
					var term = document.getElementById(config.IDs.term).value;
					if(typeof site === 'string' && typeof term === 'string'){
						if(typeof out !== 'undefined'){
							out.parentNode.removeChild(out);
						}
						out = document.createElement('p');
						out.appendChild(document.createTextNode(config.loading));
						form.appendChild(out);
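						// encode the user input so spaces and special characters survive in the URL
						var APIurl = 'http://boss.yahooapis.com/ysearch/web/v1/' +
							encodeURIComponent(term) + '?callback=SITESEARCH.found&sites=' +
							encodeURIComponent(site) + '&count=' + config.results +
							'&appid=' + config.appID;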
						var s = document.createElement('script');
						s.setAttribute('src',APIurl);
						s.setAttribute('type','text/javascript');
						document.getElementsByTagName('head')[0].appendChild(s);
						return false;
					}
				};
			}
		}
	};
	function found(o){
		var list = document.createElement('ul');
		var results = o.ysearchresponse.resultset_web;
		if(results){
			var item,link,description;
			for(var i=0,j=results.length;i<j;i++){
				item = document.createElement('li');
				link = document.createElement('a');
				link.setAttribute('href',results[i].clickurl);
				link.innerHTML = results[i].title;
				item.appendChild(link);
				description = document.createElement('p');
				description.innerHTML = results[i]['abstract'];
				item.appendChild(description);
				list.appendChild(item);
			}
		} else {
			list = document.createElement('p');
			list.appendChild(document.createTextNode(config.noresults));
		}
		form.replaceChild(list,out);
		out = list;
	};
	return{
		config:config,
		init:init,
		found:found
	};
}();

Oooohhhh scary code! Let’s go through this one bit at a time:

We start by creating a module called SITESEARCH and give it a configuration object:

var SITESEARCH = function(){
	var config = {
		IDs:{
			searchForm:'customsearch',
			term:'term',
			site:'site'
		},
		loading:'Loading results...',
		noresults:'No results found.',
		appID:'YOUR-APP-ID',
		results:20
	};

Configuration objects are a great way to make your code easy to change and also to override. In this case you can define different IDs than the ones agreed upon earlier, a message to show while the results are loading, a message for when there aren’t any results, the application ID and the number of results that should be displayed.

Note: you need to replace “YOUR-APP-ID” with the real ID you retrieved from BOSS, otherwise the script will complain!
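Overriding works because the module hands out a reference to its own config object rather than a copy. Here is a cut-down sketch of the idea. DEMO and describe() are made-up names for illustration; only the shape of the config object comes from the script.

```javascript
// Stand-in module with the same shape as SITESEARCH
var DEMO = function () {
  var config = {
    IDs: { searchForm: 'customsearch', term: 'term', site: 'site' },
    loading: 'Loading results...',
    results: 20
  };
  // Reads the private config, just like init() and found() do
  function describe() {
    return config.IDs.searchForm + ':' + config.results + ':' + config.loading;
  }
  return { config: config, describe: describe };
}();

// Changing properties works, as DEMO.config is the same object
DEMO.config.results = 5;
DEMO.config.IDs.searchForm = 'sitesearch';
DEMO.config.loading = 'Searching...';
var summary = DEMO.describe(); // 'sitesearch:5:Searching...'

// Replacing the whole object does not: the module keeps using its own one
DEMO.config = { results: 1 };
var afterReplace = DEMO.describe(); // still 'sitesearch:5:Searching...'
```

So override single properties, like SITESEARCH.config.appID, instead of assigning a brand new object to SITESEARCH.config.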

var form;
var out;
function init(){
	if(config.appID === 'YOUR-APP-ID'){
		alert('Please get a real application ID!');
	} else {

We define form and out as variables to make sure that all the methods in the module have access to them. We then check if there was a real application ID defined. If there wasn’t, the script complains and that’s that.

form = document.getElementById(config.IDs.searchForm);
if(form){
	form.onsubmit = function(){
		var site = document.getElementById(config.IDs.site).value;
		var term = document.getElementById(config.IDs.term).value;
		if(typeof site === 'string' && typeof term === 'string'){      

If the application ID was a winner, we check if the form with the provided ID exists and apply an onsubmit event handler. The first thing we do in the handler is get the values of the site we want to search in and the term that was entered, and check that those are strings.
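One thing to be aware of: the value of a form field is always a string in browsers, so this check also passes when the visitor submits an empty field. If you'd rather not send empty queries, a stricter check could look like the sketch below. isUsableQuery is a made-up helper and not part of the script above.

```javascript
// Hypothetical helper: true only if both values are non-empty strings
// once leading and trailing whitespace has been removed
function isUsableQuery(site, term) {
  var trim = /^\s+|\s+$/g;
  return typeof site === 'string' &&
         typeof term === 'string' &&
         site.replace(trim, '') !== '' &&
         term.replace(trim, '') !== '';
}

var ok = isUsableQuery('bbc.co.uk', 'christmas');
var blank = isUsableQuery('bbc.co.uk', '   ');
```

You could use it in place of the two typeof checks. Keep in mind that when the handler skips the lookup without returning false, the form gets submitted to the search engine in its action attribute as usual.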

if(typeof out !== 'undefined'){
	out.parentNode.removeChild(out);
}
out = document.createElement('p');
out.appendChild(document.createTextNode(config.loading));
form.appendChild(out);  

If both are strings, we check whether out is already defined. We will create a loading message and subsequently the list of search results later on and store them in this variable. So if out is defined, it’ll be an old version of a search (as users will re-submit the form over and over again) and we need to remove that old version.

We then create a paragraph with the loading message and append it to the form.

					// encode the user input so spaces and special characters survive in the URL
					var APIurl = 'http://boss.yahooapis.com/ysearch/web/v1/' +
						encodeURIComponent(term) + '?callback=SITESEARCH.found&sites=' +
						encodeURIComponent(site) + '&count=' + config.results +
						'&appid=' + config.appID;
					var s = document.createElement('script');
					s.setAttribute('src',APIurl);
					s.setAttribute('type','text/javascript');
					document.getElementsByTagName('head')[0].appendChild(s);
					return false;
				}
			};
		}
	}
};

Now it is time to call the BOSS API by assembling a correct REST URL, creating a script node and appending it to the head of the document. We return false to ensure the form does not get submitted, as we want to stay on the page.

Notice that we are using SITESEARCH.found as the callback method, which means that we need to define this one to deal with the data returned by the API.
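The script node part is something you will need again for other JSON-P APIs, so it can make sense to move it into a function of its own. The following is a sketch, not part of the script above: loadScript is a made-up name, and fakeDoc is a bare-bones stand-in for the document so the example also runs outside a browser.

```javascript
// Hypothetical helper: appends a script node pointing at url
// to the head of the given document and returns the node
function loadScript(url, doc) {
  var s = doc.createElement('script');
  s.setAttribute('src', url);
  s.setAttribute('type', 'text/javascript');
  doc.getElementsByTagName('head')[0].appendChild(s);
  return s;
}

// Minimal fake document, just enough for loadScript to work
var fakeHead = {
  children: [],
  appendChild: function (node) { this.children.push(node); return node; }
};
var fakeDoc = {
  createElement: function (name) {
    return {
      nodeName: name,
      attributes: {},
      setAttribute: function (key, value) { this.attributes[key] = value; }
    };
  },
  getElementsByTagName: function () { return [fakeHead]; }
};

var node = loadScript('http://example.com/data.js?callback=SITESEARCH.found', fakeDoc);
```

In a real page you would call loadScript(APIurl, document) instead of passing in the fake.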

function found(o){
	var list = document.createElement('ul');
	var results = o.ysearchresponse.resultset_web;
	if(results){
		var item,link,description;

We create a new list and then get the resultset_web array from the data returned from the API. If there aren’t any results returned, this array will not exist, which is why we need to check for it. Once we have done that, we define three variables that get reused in the loop: the list item, the link pointing to the result and the description of the link.

for(var i=0,j=results.length;i<j;i++){
	item = document.createElement('li');
	link = document.createElement('a');
	link.setAttribute('href',results[i].clickurl);
	link.innerHTML = results[i].title;
	item.appendChild(link);
	description = document.createElement('p');
	description.innerHTML = results[i]['abstract'];
	item.appendChild(description);
	list.appendChild(item);
}

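We then loop over the results array and assemble a list of results, with the titles in links and paragraphs holding the abstract of each result. Notice the bracket notation for abstract: abstract is a reserved word in older versions of JavaScript (ECMAScript 3 lists it as a future reserved word), so results[i].abstract could trip up a strict parser :).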

} else {
		list = document.createElement('p');
		list.appendChild(document.createTextNode(config.noresults));
	}
	form.replaceChild(list,out);
	out = list;
};      

If there aren’t any results, we define a paragraph with the no results message as list. In any case we replace the old out (the loading message) with the list and re-define out as the list.
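If you want to play with this logic without a browser, you can write the same thing as a function that returns a string of HTML. This is a sketch under a few assumptions: renderResults is a made-up name, the sample data is invented, and only the property names (ysearchresponse, resultset_web, clickurl, title, abstract) are taken from the script. Like the innerHTML assignments above, it inserts title and abstract as markup, so it assumes you trust what the API returns.

```javascript
// Hypothetical DOM-free version of found(): returns markup as a string
function renderResults(o, noResultsMessage) {
  var results = o.ysearchresponse.resultset_web;
  if (!results) {
    // no resultset_web array means there were no results
    return '<p>' + noResultsMessage + '</p>';
  }
  var html = '<ul>';
  for (var i = 0, j = results.length; i < j; i++) {
    html += '<li><a href="' + results[i].clickurl + '">' +
            results[i].title + '</a>' +
            '<p>' + results[i]['abstract'] + '</p></li>';
  }
  return html + '</ul>';
}

// Invented sample data in the shape the script expects
var sample = {
  ysearchresponse: {
    resultset_web: [
      { clickurl: 'http://example.com/a', title: 'A', 'abstract': 'About A' }
    ]
  }
};
var listMarkup = renderResults(sample, 'No results found.');
var emptyMarkup = renderResults({ ysearchresponse: {} }, 'No results found.');
```

The first call gives you a list with one linked result, the second one the paragraph with the no results message.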

return{
		config:config,
		init:init,
		found:found
	};
}();

All that is left to do is return the properties and methods we want to make public. In this case found needs to be public, as it is called by the script the API returns. We return init to make it accessible and config to allow implementers to override any of the properties.

Using the script

In order to use this script, all you need to do is add it after the form in the document, override the application ID with your own and call init():

<form id="customsearch" action="http://search.yahoo.com/search">
	<div>
		<label for="term">Search this site:</label>
		<input type="text" name="p" id="term">
		<input type="hidden" name="vs" id="site" value="bbc.co.uk">
		<input type="submit" value="go">
	</div>
</form>
<script type="text/javascript" src="boss-site-search.js"></script>
<script type="text/javascript">
	SITESEARCH.config.appID = 'copy-the-id-you-know-to-get-where';
	SITESEARCH.init();
</script>

Where to go from here

This is just a very simple example of what you can do with BOSS. You can define languages and regions, retrieve and display images and news, and mix the results with other data sources before displaying them. One very cool feature is that by adding a view=keyterms parameter to the URL you can get the keywords of each of the results to drill deeper into the search. An example of this written in PHP is available on the YDN blog. For JavaScript solutions there is a handy wrapper called yboss available to help you go nuts.
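Adding parameters like view=keyterms is a matter of extending the URL. A minimal sketch, assuming the URL format used throughout this article (addParameter is a made-up helper, and what the extra data in the response looks like is not covered here):

```javascript
// Hypothetical helper: appends one name=value pair to a URL,
// using ? or & depending on whether there is a query string already
function addParameter(url, name, value) {
  var separator = url.indexOf('?') === -1 ? '?' : '&';
  return url + separator + encodeURIComponent(name) + '=' + encodeURIComponent(value);
}

var plain = 'http://boss.yahooapis.com/ysearch/web/v1/christmas' +
            '?sites=bbc.co.uk&appid=YOUR-APPLICATION-ID';
var withKeyterms = addParameter(plain, 'view', 'keyterms');
```

withKeyterms is the plain URL with &view=keyterms added at the end.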

This article is also available in German at webkrauts.de

About the author

Christian Heilmann grew up in Germany and, after a year working for the Red Cross, spent a year as a radio producer. From 1997 onwards he worked for several agencies in Munich as a web developer. In 2000 he moved to the States to work for Etoys and, after the .com crash, he moved to the UK where he led the web development department at Agilisys. In April 2006 he joined Yahoo! UK as a web developer and moved on to be the Lead Developer Evangelist for the Yahoo Developer Network. In December 2010 he moved on to Mozilla as Principal Developer Evangelist for HTML5 and the Open Web. He publishes an almost daily blog at http://wait-till-i.com and runs an article repository at http://icant.co.uk. He also authored Beginning JavaScript with DOM Scripting and Ajax: From Novice to Professional.
