The Web is perhaps the most complex system that we know. Its massive scale, complex dynamism, open richness, and social character mean that it may be more profitable to study it using tools and concepts appropriate for understanding nervous systems, organisms, ecosystems and society, rather than approaches more traditionally employed to engineer technology. Simultaneously, the scientists trying to understand this wide array of complex natural systems may have much to gain by considering the emerging study of the Web.

This workshop brings together researchers from a wide array of disciplines (physics, computing, philosophy, biology, social science) to explore the way in which concepts and tools from the emerging study of massive data flow (MDF) can be used to shed light on both the quantitative and qualitative dynamics of the Web. It will particularly focus on exploring how MDF ideas that are developing in physics and biology can be combined productively with those from the humanities (e.g., mobile sociology) and technology (e.g., the rise of web observatories).

MDF is a generic term used to identify a new kind of system dynamics: self-organization in complex open environments. Composed of many interacting heterogeneous elements, MDF systems exhibit self-referential, self-modifying, and self-sustaining dynamics that can enable door-opening innovation. While the Web may be the best example of an MDF system, the concept is generic to natural and artificial systems such as brains, cells, markets and ecosystems.

Unlike systems studied in isolation or at equilibrium, MDF systems are open, driven systems existing within a rich context, constantly changing, growing, evolving, and thereby autonomously changing the way in which they interact with the environment around them. The patterns that they exhibit are neither imposed from outside nor generated purely internally, but are a consequence of the interface between endogenous data flows within the system and exogenous data flows that perturb it. If "Big data" systems exhibit volume, velocity and variety, MDF systems exhibit vitality.

Example areas of interest include:

The workshop asks:

10:00 a.m. - 11:00 a.m.
Opening Remarks
Takashi Ikegami, University of Tokyo
"Massive Data Flows: Its Structure and Power"
After a short introduction to the background of MDF, I will discuss the power and structure of Massive Data Flows (MDF) with some concrete examples taken from Twitter time series and beehive analysis.

A position paper on MDF can be found in: Takashi Ikegami and Mizuki Oka, "Massive Data Flows: Self-organization of energy, material, and information flows," 6th International Conference on Agents and Artificial Intelligence (ICAART 2014), pp. 237-242, Angers, France, March 2014.
11:00 a.m. - 11:30 a.m.
11:30 a.m. - 12:30 p.m.
Olaf Sporns, Indiana University
"Brain dynamics and communication in the human connectome"
Recent advances in brain mapping techniques now allow the construction of whole-brain connection maps (connectomes). These maps are beginning to reveal some of the organizational principles underlying brain networks, and they also allow the design of models that capture patterns of brain dynamics. In this talk, I will survey the state of the field and discuss recent network models that attempt to forge a link between brain dynamics and communication processes.
12:30 p.m. - 1:00 p.m.
Edgar Vallejo, Charles Taylor, UCLA
"Birdsong structure and the analysis of real tweets"
Complex birdsong has proved amenable, via linguistic tools and network and path models, to analyses of lexicon and syntax, and will hopefully lead eventually to an illumination of semantics. Revealing patterns at these successive organizational levels increasingly depends on ever larger data sets. While the data are rather easily collected (a recorder can accumulate strings of thousands of phrases from an active singer in just a few minutes), the analysis of such large data sets is not so easily accomplished. The analogies between birdsong organization and social networks are obvious, since different phrases comprise a linked network, created by their internal transitions, similar to that among social correspondents. I illustrate the complexity and the variety of the organizational structures in birdsong with examples from North America and Australia, set the biological context for their evolution, and describe goals to facilitate our understanding of the phenomenon.
1:00 p.m. - 2:30 p.m.
2:30 p.m. - 4:00 p.m.
Kazuhiro Sasahara, Nagoya University
"Emergence of collective attention on Twitter"
Online social media are increasingly facilitating our social interactions, thereby making available a massive "digital fossil" of human behavior. We examined the emergence of "collective attention" on Twitter by focusing on the fact that tweet activity exhibits a burst-like increase and an irregular oscillation when a particular real-world event occurs; otherwise, it follows regular circadian rhythms. The difference between regular and irregular states in the tweet stream was measured using the Jensen-Shannon divergence, which corresponds to the intensity of collective attention. Applying this method to 490 million Japanese tweets by over 400,000 users revealed 60 cases of collective attention, including one related to the 2011 Japan earthquake. Retweet networks were also investigated to understand collective attention in terms of social interactions. Our findings provide new insights into human communication in the digital era.
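For readers unfamiliar with the measure, the following is a minimal illustrative sketch of the Jensen-Shannon divergence between two activity profiles. It is not the speaker's implementation: the abstract does not say which distributions are compared, so the choice of binned tweet counts, the function names, and the example numbers are all assumptions made for illustration.

```python
import math


def normalize(counts):
    """Turn non-negative counts into a probability distribution."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p_counts, q_counts):
    """Jensen-Shannon divergence in bits: 0 for identical profiles, 1 for disjoint ones."""
    p = normalize(p_counts)
    q = normalize(q_counts)
    # Mixture distribution; m_i > 0 wherever p_i > 0 or q_i > 0, so the logs are defined.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical tweet counts in six time bins (invented numbers, not data from the talk).
regular_day = [10, 30, 50, 60, 40, 20]
event_day = [10, 30, 50, 400, 90, 20]

print(js_divergence(regular_day, regular_day))  # identical profiles
print(js_divergence(regular_day, event_day))    # bursty profile vs. regular profile
```

With base-2 logarithms the value lies between 0 and 1, so a larger value indicates a stronger departure of the observed activity from the regular rhythm.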
Ciro Cattuto, ISI Foundation
"Mining Concurrent Topical Activity in Microblog Streams"
Streams of user-generated content in social media exhibit patterns of collective attention across diverse topics, with activity profiles determined both by exogenous drivers and endogenous dynamics. Teasing apart different topics and resolving their concurrent activity timelines is a challenging problem. We discuss an approach based on tensor factorization techniques that are commonly used to extract latent signals in domains such as signal processing, psychometrics and brain science. We demonstrate our approach on a large collection of messages posted to Twitter during the London 2012 Olympics, for which a detailed schedule of events is independently available and can be used for validation. We show that, given appropriate techniques for latent signal detection, Twitter can be used as a social sensor to extract concurrent topical-temporal signals about real-world events with high temporal resolution.
Mizuki Oka, University of Tsukuba
"Self-organization on social media: endo-exo bursts and baseline fluctuations"
A salient dynamic property of social media is bursting behavior. In this paper, we study bursting behavior in relation to the structure of fluctuations, known as the fluctuation-response relation, to reveal the origin of bursts. More specifically, we study the temporal relation between a preceding baseline fluctuation and the successive burst response using a frequency time series of 3,000 keywords on Twitter. We found that all the nouns have bursts of both endogenous and exogenous origin, and that there is a critical threshold that distinguishes between the two. The critical threshold is defined in terms of the fluctuation in the number of tweets. Bursts below this threshold are endogenously caused, and above this threshold exogenous bursts emerge. We discuss the unique features of these self-organizing properties on Twitter. These findings are useful for characterizing how excitable a keyword is on Twitter and could be used, for example, to predict the response to a particular piece of information on social media.
4:00 p.m. - 4:30 p.m.
4:30 p.m. - 5:00 p.m.
Summary of the workshop.
Seth Bullock, University of Southampton
5:00 p.m. - 6:00 p.m.
All speakers.
6:00 p.m.
Closing Remarks