A best-effort approach to an infrastructure for Chinese Web related research

Research output: Contribution to journal › Article › peer-review

Abstract

The design of the infrastructure for Chinese Web (CWI), a prototype system aimed at forum data analysis, is introduced. CWI takes a best-effort approach: 1) it tries its best to extract or annotate semantics over web data; 2) it provides flexible schemes for users to transform web data into eXtensible Markup Language (XML) forms with richer semantic annotations, which are friendlier to further analytical tasks; 3) it uses a distributed graph repository, called DISGR, as the backend for managing web data. The paper introduces the design issues, reports the progress of the implementation, and discusses the research issues that are under study.

Original language: English
Pages (from-to): 388-396
Number of pages: 9
Journal: Frontiers of Electrical and Electronic Engineering in China
Volume: 6
Issue number: 2
State: Published - Jun 2011

Keywords

  • Chinese Web infrastructure
  • distributed storage
  • graph data model
  • semantic entity
