Thursday 12 September 2013

Understanding Data Mining

Well begun is half done. We can say that the invention of Internet is the greatest invention of the century which allows for quick information retrieval. It also has negative aspects, as it is an open forum therefore differentiating facts from fiction seems tough. It is the objective of every researcher to know how to perform mining of data on the Internet for accuracy of data. There are a number of search engines that provide powerful search results.

Knowing File Extensions in Data Mining

For mining data the first thing is important to know file extensions. Sites ending with dot-com are either commercial or sales sites. Since sales is involved there is a possibility that the collected information is inaccurate. Sites ending with dot-gov are of government departments, and these sites are reviewed by professionals. Sites ending with dot-org are generally for non-profit organizations. There is a possibility that the information is not accurate. Sites ending with dot-edu are of educational institutions, where the information is sourced by professionals. If you do not have an understanding you may take help of professional data mining services.

Knowing Search Engine Limitations for Data Mining

Second step is to understand when performing data mining is that majority search engines have filtering, file extension, or parameter. These are restrictions to be typed after your search term, for example: if you key in "marketing" and click "search," every site will be listed from dot-com sites having the term "marketing" on its website. If you key in "marketing site.gov," (without the quotation marks) only government department sites will be listed. If you key in "marketing site:.org" only non-profit organizations in marketing will be listed. However, if you key in "marketing site:.edu" only educational sites in marketing will be displayed. Depending on the kind of data that you want to mine after your search term you will have to enter "site.xxx", where xxx will being replaced by.com,.gov,.org or.edu.

Advanced Parameters in Data Mining

When performing data mining it is crucial to understand far beyond file extension that it is even possible to search particular terms, for example: if you are data mining for structural engineer's association of California and you key in "association of California" without quotation marks the search engine will display hundreds of sites having "association" and "California" in their search keywords. If you key in "association of California" with quotation marks, the search engine will display only sites having exactly the phrase "association of California" within the text. If you type in "association of California" site:.com, the search engine will display only sites having "association of California" in the text, from only business organizations.

If you find it difficult it is better to outsource data mining to companies like Online Web Research Services



Source: http://ezinearticles.com/?Understanding-Data-Mining&id=5608012

No comments:

Post a Comment