Monday, November 15, 2004

Search4Code - Simple Source-Code Search Tool

I wrote this simple source-code search tool. It searches the internet for source-code files in a specified language that contain the specified text. You may find it very useful for learning new APIs.

How does it work?
Search4Code uses Google Search behind the scenes. It uses Google's filetype search parameter to restrict a search to files with the well-known file extension of the language being searched for. The tool currently supports C, C++, C#, VB etc., and support for more languages can easily be added in the future.
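The idea can be sketched roughly as follows. This is only a minimal illustration, not the tool's actual code: the function name buildSearchUrl, the language-to-extension table, and the exact Google URL and query parameter names (q, as_filetype) are assumptions made for the example.

```javascript
// Hypothetical table mapping a language name to its well-known file extension.
// The extensions chosen here are assumptions, not taken from the tool itself.
const EXTENSIONS = new Map([
  ["C", "c"],
  ["C++", "cpp"],
  ["C#", "cs"],
  ["VB", "vb"],
]);

// Build a Google search URL restricted to files of the given language.
// Assumes Google accepts the query text in "q" and the extension in "as_filetype".
function buildSearchUrl(text, language) {
  const ext = EXTENSIONS.get(language);
  if (ext === undefined) {
    throw new Error("Unsupported language: " + language);
  }
  const params = new URLSearchParams({ q: text, as_filetype: ext });
  return "https://www.google.com/search?" + params.toString();
}

console.log(buildSearchUrl("CreateFile", "C#"));
```

Adding a language in this sketch is just one more entry in the table, which is why extending the list is cheap.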

Koders (currently in beta) offers a similar service but they appear to have a dedicated search-engine of their own. Search4Code on the other hand is a meta-search engine (a search-engine that uses another search-engine behind the scenes). My ultimate aim with Search4Code is to also allow for the searching of package/class/method/variable declarations and references.

[Update on 16-Nov-2004] Thanks to Bheema for his suggestions in the comments. He suggested including header files (.h) when searching C program files (.c) and C++ program files (.cpp or .cc). Google does not understand the logical OR operator in its as_filetype parameter, but it offers a filetype keyword in the search query. Thus the solution is simply to do some processing on the search query text before submitting the form if the search language happens to be C or C++. I chose to write my first hand-written JavaScript for this and hook it onto the onSubmit event of the form.
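A rough sketch of that query processing is shown here. It is not the script I actually wrote: the function name rewriteQuery and the extension lists are assumptions for illustration, and it assumes Google accepts several filetype: terms joined with OR inside the query text.

```javascript
// Hypothetical extension lists for the languages that need header files.
// Only .c, .h, .cpp and .cc are named in the post; the grouping is an assumption.
const HEADER_AWARE_EXTENSIONS = new Map([
  ["C", ["c", "h"]],
  ["C++", ["cpp", "cc", "h"]],
]);

// Append "filetype:x OR filetype:y ..." to the query for C and C++.
// For any other language the query text is returned unchanged.
function rewriteQuery(text, language) {
  const exts = HEADER_AWARE_EXTENSIONS.get(language);
  if (!exts) {
    return text;
  }
  const filter = exts.map((ext) => "filetype:" + ext).join(" OR ");
  return text + " " + filter;
}

console.log(rewriteQuery("fopen", "C"));
console.log(rewriteQuery("fopen", "C#"));
```

On a real page, a function like this would be called from the form's onSubmit handler to replace the value of the query field, with the as_filetype field left empty for C and C++.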

Unfortunately, while updating this post, I found that Blogger.com does not allow <script> tags in the posts :-(

So I guess it's time to have a dedicated webpage for Search4Code! I shall try to get one up ASAP and update this post with the details.

Check back for more updates!

5 comments:

Abhimanyu said...

Good one Sid!!...when it comes to advanced searches I think making it a meta search is intelligent, because you already have a large index!!

Alpha0 said...

Great work. Keep it up.
And your idea of restricting search to various fields like: functions/sub-routines, class names and comments will make it a beautiful product.

Are you sure that the .pl extension doesn't get confused with website names ending with "pl"?

Thanks,
Sandeep Giri

Sid said...

You have a valid point, Abhimanyu, but as of now, Google only appears to index the file names of .gz files and not their contents. As most open-source projects provide their content as .gz files, I am unable to use Google to search inside them.

Koders have an indexing engine of their own, and hence are able to index the contents of .gz files as well and also maintain the association between a source-code file and its other project files.

Search4Code is simple, and I feel it is a great way to start when you need to learn new APIs.

Some websites use Perl to dynamically generate HTML. Some of their webpages like http://use.perl.org/comments.pl have their URLs ending in .pl. Google considers these webpages also as Perl language source files and searches them for the specified text.

bheema v said...

1) .h in addition to .c; .hxx, .hpp in addition to .cpp, .cxx etc.

Sid said...

Bheema,

While implementing your suggestions I found that Blogger.com doesn't support <script> tags in posts. Gives me a reason to have a dedicated webpage for Search4Code. Details in post.

Thanks,
Sid