Automatic Data Acquisition System from Internet Resources and Social Networks for Further Linguistic Analysis of Text Content in Real Time
Project Description
The client-server system provides a real-time natural language processing and computational platform for linguistic analysis of digital content and communications, with sentiment analysis as its goal. This digital content includes static data such as forum, blog, Twitter, and chat postings.
Technical features:
  • Target server platform: Windows Server 2003, Windows Server 2003 cluster, IIS 6.0 and more.
  • Target client browsers: IE 7.0 and Firefox 2.0.
  • GUI framework: ASP.NET 2.0.
  • Data sources: SQL Server, XML files, binary storage, full-text search index.
  • Data access: ADO.NET, XML parsers.
  • Web services access: SOAP web proxies.
  • Windows services access: WCF.
  • GUI features: DHTML, AJAX and Flex.
  • Localization resources: resx files.
Business-logic features: 
  • Data gathering services include a pluggable data crawler.
  • Indexing and search services provide full-text search.
  • The smart search service includes linguistic analysis and statistics modules.
  • The web GUI includes Flex charts for visualizing statistical data.
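
The business-logic features above can be pictured with a minimal sketch. It is written in Python purely for illustration (the project itself targets ASP.NET), and every name in it is hypothetical: the `crawler` registry, the canned posts, the `Index` class, and the tiny sentiment lexicon are assumptions, not the project's actual design. It shows crawlers plugged in by name, an inverted index answering full-text queries, and a naive word-count sentiment score standing in for the linguistic analysis module.

```python
import re
from collections import defaultdict

# Hypothetical plugin registry: source name -> function returning raw posts.
CRAWLERS = {}

def crawler(name):
    """Decorator that registers a crawler plugin under a source name."""
    def register(fn):
        CRAWLERS[name] = fn
        return fn
    return register

@crawler("forum")
def crawl_forum():
    # Canned data stands in for a real network fetch (assumption).
    return ["The new release is great", "Search is slow and bad"]

@crawler("blog")
def crawl_blog():
    return ["Great charts, slow install"]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

class Index:
    """Toy inverted index: token -> set of document ids."""

    def __init__(self):
        self.docs = []
        self.postings = defaultdict(set)

    def add(self, text):
        doc_id = len(self.docs)
        self.docs.append(text)
        for token in tokenize(text):
            self.postings[token].add(doc_id)
        return doc_id

    def search(self, query):
        """Return sorted ids of documents containing every query term."""
        terms = tokenize(query)
        if not terms:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return sorted(hits)

# Illustrative lexicon only; a real linguistic module would be far richer.
POSITIVE = {"great", "good"}
NEGATIVE = {"slow", "bad"}

def sentiment(text):
    """Naive score: positive word count minus negative word count."""
    tokens = tokenize(text)
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)

def build_index():
    """Run every registered crawler (in name order) and index its posts."""
    index = Index()
    for name in sorted(CRAWLERS):
        for post in CRAWLERS[name]():
            index.add(post)
    return index

if __name__ == "__main__":
    idx = build_index()
    for doc_id in idx.search("slow"):
        print(doc_id, sentiment(idx.docs[doc_id]), idx.docs[doc_id])
```

In this sketch a new data source is added by writing one more decorated function, without touching the indexing or scoring code, which is the property a pluggable crawler is meant to provide.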
Management and resources:
  • Timeline (versions 1.0 - 1.4): October 2008 - July 2009
  • Resources: managers, team of developers and testers 
  • Development methodology: adopted RUP

