This article, ‘Patterns of Cascading Behavior in Large Blog Graphs’ by Jure Leskovec, Mary McGlohon, and Christos Faloutsos, comes from the Society for Industrial and Applied Mathematics (SIAM). I found their discussion very interesting because it let me observe patterns of cascades in real data. When we talked about information cascades in lecture, we focused, in terms of applications, on the horizontal propagation of information. I wanted to examine how cascades work in a different setting, and anyone with a similar interest should find this article helpful. As the title indicates, the patterns of cascades were observed among blogs and their posts. Regarding the dataset, the article says there may be a bias because it focuses mainly on active blogs. In my opinion, however, this should not cause a considerable discrepancy, since inactive blogs are unlikely to be involved or influential in information cascades. The experiment was conducted on a dataset of 2,422,704 posts from 44,362 blogs over a two-month period. The total number of links among the posts in the dataset was 245,404.
Much as we analyzed information cascades in lecture, the article provides a network model that describes the patterns of cascades among blogs. Figure 1(c) on page 3 shows this model. According to the authors' explanation, every node represents a blog, and there is a weighted directed edge between blogs u and v, where the weight of the edge corresponds to the number of posts from blog u linking to posts at blog v. This model shows somewhat more complex behavior than the one discussed in lecture, because it can describe simultaneous interactions among many blogs. In terms of connectivity, about half of the blogs took part in cascades, while the other half appeared to be isolated. The number of blog-to-blog links followed a power law. This is the part that I found most intriguing. As the article suggested in the beginning, I had guessed that the ranking of blogs by number of links would follow an exponential distribution, since I expected the number of blog-to-blog links to drop drastically after the top few. However, both the distribution of cascade sizes and the number of blog-to-blog links followed a power law: if drawn on x-y axes, the few blogs that dominate are on the left and a long tail stretches to the right. This is related to what is often called the 80-20 rule (Wikipedia). As the power law indicates, there are a few popular blogs that are linked to from many other blogs in the network, but the unpopular ones still get some links from others, which supports the authors' observation that no single cascade was so dominant that it prevented other flows of information. According to their observation, cascades tend to be wide and not too deep: stars and shallow, bursty cascades are the most common types. When I first found this article, I was curious about the result, so I jumped straight to the conclusion.
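To make the blog network described above concrete, here is a minimal sketch of how post-to-post links could be collapsed into weighted blog-to-blog edges. This is not the authors' code: the function name build_blog_network, the toy posts p1 to p4, and the blogs A, B, and C are all made up for illustration, and keeping links between two posts of the same blog is my own assumption.

```python
from collections import Counter


def build_blog_network(post_links, post_to_blog):
    """Collapse post-to-post links into weighted blog-to-blog edges.

    The weight of edge (u, v) is the number of posts from blog u that
    link to posts at blog v. Links between two posts of the same blog
    are kept as self-loops (an assumption of this sketch).
    """
    weights = Counter()
    for src_post, dst_post in post_links:
        u = post_to_blog[src_post]
        v = post_to_blog[dst_post]
        weights[(u, v)] += 1
    return weights


# Hypothetical toy data: four posts spread over three blogs.
post_to_blog = {"p1": "A", "p2": "A", "p3": "B", "p4": "C"}
post_links = [("p1", "p3"), ("p2", "p3"), ("p4", "p3"), ("p3", "p4")]

network = build_blog_network(post_links, post_to_blog)

# Total incoming link weight per blog, the quantity whose distribution
# the article reports as following a power law.
in_links = Counter()
for (u, v), w in network.items():
    in_links[v] += w

for (u, v), w in sorted(network.items()):
    print(f"{u} -> {v}: weight {w}")
```

In this toy example, blog B collects most of the incoming links while C still receives one, which is a tiny version of the "few popular blogs plus a long tail" picture.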
According to the article, this finding means that posts still attract attention (get linked) even if they come somewhat late in a cascade and appear towards the bottom of it. When I first read this part, I was confused, since it was the opposite of my expectation and of the behavioral patterns I had observed online. However, as I read through their experiment step by step and tried to make sense of their findings, I realized that their results are, in fact, quite reasonable. In reality, there are a great many blogs covering many different topics and interests, and the information available online is constantly changing, so it is nearly impossible for a post to be completely uninformative. My own observations, by contrast, were narrowly focused on a fast-paced web environment.
Additionally, one of the interesting findings was a seven-day periodicity, much like other common schedules such as work or class schedules. Interestingly, they also observed a weekend effect: posting frequency drops drastically on weekends. This was the opposite of my expectation, since I expected people to participate more in blogs on weekends, in their free time. I figured that this effect might be explained by the fact that many tasks are now done on a computer or online, so people spend more time at a computer on weekdays. Overall, it was a nice opportunity to explore patterns of cascading behavior in depth.
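As a rough illustration of how a weekly pattern like the one above could be checked, the sketch below counts posts per day of the week and computes the share that falls on a weekend. The function name posts_per_weekday and the dates are hypothetical; they are not taken from the article's dataset, and this is not necessarily how the authors measured periodicity.

```python
from collections import Counter
from datetime import date


def posts_per_weekday(post_dates):
    """Return a list of 7 counts, Monday (index 0) through Sunday (index 6)."""
    counts = Counter(d.weekday() for d in post_dates)
    return [counts.get(i, 0) for i in range(7)]


# Hypothetical posting dates (2024-01-01 was a Monday).
post_dates = [
    date(2024, 1, 1),  # Monday
    date(2024, 1, 2),  # Tuesday
    date(2024, 1, 2),  # Tuesday
    date(2024, 1, 3),  # Wednesday
    date(2024, 1, 6),  # Saturday
]

counts = posts_per_weekday(post_dates)

# Fraction of all posts made on Saturday or Sunday.
weekend_share = (counts[5] + counts[6]) / sum(counts)

print(counts)
print(weekend_share)
```

If posting were spread evenly over the week, the weekend share would be about 2/7; a clearly smaller value on real data would be consistent with the weekend effect the article describes.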
http://www.siam.org/proceedings/datamining/2007/dm07_060Leskovec.pdf










