A Distributed Architecture for the Monitoring and Analysis of Time Series Data

Author(s): Ruairí O'Reilly
Publisher: Ruairí O'Reilly
Pages: 151
Language: en
Categories: Computers / Computer Science, Computers / Artificial Intelligence / Expert Systems, Computers / Software Development & Engineering / General

Description:...

Abstract

It is estimated that the quantity of digital data being transferred, processed or stored at any one time currently stands at 4.4 zettabytes (4.4 × 2^70 bytes), and this figure is expected to have grown by a factor of 10, to 44 zettabytes, by 2020 [1]. Exploiting this data is, and will remain, a significant challenge. At present there is the capacity to store 33% of the digital data in existence at any one time; by 2020 this capacity is expected to fall to 15%. These statistics suggest that, in the era of Big Data, the identification of important, exploitable data will need to be done in a timely manner.

Systems for the monitoring and analysis of data, e.g. stock markets, smart grids and sensor networks, can be made up of massive numbers of individual components. These components can be geographically distributed yet may interact with one another via continuous data streams, which in turn may affect the state of the sender or receiver. This introduces a dynamic causality, which further complicates the overall system by introducing a temporal constraint that is difficult to accommodate.

Practical approaches to realising the system described above have led to a multiplicity of analysis techniques, each of which concentrates on specific characteristics of the system being analysed and treats these characteristics as the dominant component affecting the results being sought. This multiplicity of analysis techniques introduces another layer of heterogeneity, namely heterogeneity of approach, partitioning the field to the extent that results from one domain are difficult to exploit in another.

This raises the question: can a generic solution for the monitoring and analysis of data be identified that accommodates temporal constraints, bridges the gap between expert knowledge and raw data, and enables data to be effectively interpreted and exploited in a transparent manner?

The approach proposed in this dissertation acquires, analyses and processes data in a manner that is free of the constraints of any particular analysis technique, while at the same time facilitating these techniques where appropriate. Constraints are applied by defining a workflow based on the production, interpretation and consumption of data. This supports the application of different analysis techniques on the same raw data without the danger of incorporating hidden bias that may exist.

To illustrate and realise this approach, a software platform has been created that allows for the transparent analysis of data, combining analysis techniques with a maintainable record of provenance so that independent third-party analysis can be applied to verify any derived conclusions.

In order to demonstrate these concepts, a complex real-world example involving the near real-time capturing and analysis of neurophysiological data from a neonatal intensive care unit (NICU) was chosen. A system was engineered to gather raw data, analyse that data using different analysis techniques, uncover information, incorporate that information into the system and curate the evolution of the discovered knowledge.

The application domain was chosen for three reasons: firstly, it is complex and no comprehensive solution exists; secondly, it requires tight interaction with domain experts, thus requiring the handling of subjective knowledge and inference; and thirdly, given the dearth of neurophysiologists, there is a real-world need to provide a solution for this domain.
