Software Development for Automatic Data Collection from Web Resources and Social Networks for Linguistic Analysis of the Text Content in Real Time

EM Project Overview

A client-server system that provides a real-time natural language processing and computational platform for linguistic analysis of digital content and communications, with sentiment analysis as the goal. The digital content includes static data such as forum, blog, Twitter and chat postings.

Technical features: 
Target server platform: Windows Server 2003, Windows Server 2003 cluster, IIS 6.0 and more.
Target client browsers: Internet Explorer 7.0 and Firefox 2.0.
GUI framework: ASP.NET 2.0.
Data sources: SQL Server, XML files, binary storage, full-text search index.
Data access: ADO.NET, XML parsers.
Web services access: SOAP web proxies.
Windows services access: WCF.
GUI features: DHTML, AJAX and Flex.
Localization resources: resx files.
Business-logic features: 
  • Data gathering services include a pluggable data crawler.
  • Indexing and search services with full-text search.
  • Smart search service includes linguistic analysis and statistical modules.
  • Web GUI includes Flex charts for visualization of statistical data.
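
The pluggable crawler and full-text index listed above can be pictured with a minimal Python sketch. This is an illustration only, not the project's code (the project itself was built on .NET): the registry CRAWLERS, the decorator register_crawler, the sample crawlers and the functions gather_all, build_index and search are all hypothetical names, and the sample postings are stand-in data.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Set

# Hypothetical plugin registry: maps a source name to a crawler function.
Crawler = Callable[[], Iterable[str]]
CRAWLERS: Dict[str, Crawler] = {}


def register_crawler(name: str) -> Callable[[Crawler], Crawler]:
    """Register a crawler under a source name so it can be plugged in."""
    def decorator(func: Crawler) -> Crawler:
        CRAWLERS[name] = func
        return func
    return decorator


@register_crawler("forum")
def crawl_forum() -> List[str]:
    # Stand-in data; a real plugin would fetch postings from the source.
    return ["Great product, works well", "Terrible support experience"]


@register_crawler("blog")
def crawl_blog() -> List[str]:
    # Stand-in data (assumption), same as above.
    return ["The product update works great"]


def gather_all() -> List[str]:
    """Run every registered crawler in name order and collect the texts."""
    documents: List[str] = []
    for name in sorted(CRAWLERS):
        documents.extend(CRAWLERS[name]())
    return documents


def build_index(documents: List[str]) -> Dict[str, Set[int]]:
    """Build a simple inverted index: token -> ids of documents containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in enumerate(documents):
        for token in text.lower().replace(",", " ").split():
            index[token].add(doc_id)
    return dict(index)


def search(index: Dict[str, Set[int]], query: str) -> Set[int]:
    """Return ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


if __name__ == "__main__":
    docs = gather_all()
    idx = build_index(docs)
    for doc_id in sorted(search(idx, "product works")):
        print(doc_id, docs[doc_id])
```

In this sketch a new source is added by writing one more decorated function; the gathering and indexing code does not change, which is the property the word "pluggable" suggests.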
Management and resources: 
  • Timeline (versions 1.0 - 1.4): October 2008 - July 2009 
  • Resources: 2 managers, 10 developers and 2 testers 
  • Development methodology: adopted RUP