Department of Computer Science, University of Illinois at Urbana-Champaign

Seminars

CS Department Colloquia

Each semester, there are departmental colloquia of interest to the DAIS community. Refer to the department seminar web pages and the Distinguished Lecturer/Entrepreneur Series web page for a complete listing of these seminars, which will usually also be announced on the DAIS mailing list described below.

The Yahoo!-DAIS Seminar (CS591MSW)

The Yahoo!-DAIS Seminar will be held on Tuesdays at 4 PM in 3403 SC. As in other semesters, we will have a few visiting speakers who must be scheduled on a different day or at a different time because of their travel schedules. Students who take the Yahoo!-DAIS Seminar for credit can miss up to two seminars. Speakers are announced on the DAIS mailing list (as are other items of interest to the DAIS community). It is quick and easy to subscribe to the DAIS mailing list.

Seminar schedules for past semesters: Spring 2009 | Fall 2008 | Spring 2008 | Fall 2007 | Spring 2007 | Fall 2006 | Spring 2006 | Fall 2005 | Spring 2005 | Fall 2004

Summer 2009 Schedule
Coordinator: Lu-An Tang, tang18 AT illinois.edu; Tao Cheng, tcheng3 AT illinois.edu

Wednesday July 8
SC 3403
4-5 PM

Title: ArnetMiner: Extraction and Mining of Academic Networks
Speaker: Dr. Jie Tang, Tsinghua University


Abstract: Social network services (SNSs) have attracted much attention on the Web recently. In this talk, I will introduce our academic search system ArnetMiner, which is available at http://www.arnetminer.org. Specifically, the system focuses on: 1) extracting researcher profiles automatically from the Web; 2) integrating publication data into the network from existing digital libraries; 3) modeling the entire academic network; and 4) providing search services for the academic network. So far, 448,470 researcher profiles have been extracted using a unified tagging approach. We integrate publications from online Web databases and propose a probabilistic framework to deal with the name ambiguity problem. Furthermore, we propose a unified modeling approach to simultaneously model topical aspects of papers, authors, and publication venues. Search services such as expertise search and people association search are provided based on the modeling results. The system has been in operation on the Internet for more than two years. System logs show that its users come from more than 180 countries. On average, the system receives about 2,000 visits per day from distinct IP addresses, and the number of visits increases by 20% per month.

Bio: Jie Tang is an assistant professor at the Department of Computer Science and Technology, Tsinghua University. His main research interests include social network mining, text mining, statistical learning, and the semantic web. He has published over 60 research papers in major international journals and conferences, including KDD, IJCAI, SIGMOD, ACL, ISWC, TKDE, JWS and JoDS. He is the principal investigator of projects funded by the National High-tech R&D Program (863 Program), NSFC, Chinese Young Faculty Research Funding, and National 985 funding, as well as several international collaborative projects such as the Minnesota/China Collaborative project, IBM innovative joint-research projects, the Tsinghua-Google Joint project, and Sougo.Inc. He serves as co-chair of the workshops SWSM'08-09, LDM-TA'09, and FDM'09, and as a PC member of more than 40 international conferences. He serves as an editor of the Journal of Software and the Journal of Advances in Information Technology, as guest editor of the TKDD special issue on large-scale data mining, and as a reviewer for dozens of journals such as TKDE, TKDD, Machine Learning Journal, PRL, and TALIP. Homepage: http://keg.cs.tsinghua.edu.cn/persons/tj/
 

Online Video: Link
 

Friday July 31
SC 3403
4-5 PM

Title: Mining Patterns and Building Classifiers from Software Data: Addressing Software Maintenance and Reliability Issues
Speaker: Dr. David Lo, Singapore Management University


Abstract: Software is a ubiquitous component of our daily life. There are many issues related to software development, including reducing the cost of maintaining software systems and ensuring their reliability. Can data mining help?

Studies have shown that program comprehension takes up to 45% of software costs. One contributing factor is the lack of documented specifications. In the first part of the talk, a technique to efficiently mine common software temporal patterns that serve as candidate specifications will be described. This work extends recent studies in sequential pattern mining and episode mining by mining a compact representative set of patterns that repeat frequently within a sequence and across many sequences. These patterns can in turn be post-processed to form rules and UML sequence diagrams and fed to downstream program analysis tools.

We often depend on the correct working of software systems. Due to the difficulty and complexity of software systems, bugs and anomalies are prevalent. In the second part of the talk, we discuss a technique to classify software behaviors based on past history or runs. With the technique, it is possible to generalize past known erroneous behaviors to capture other failures. This work proposes a new pattern-based classification approach that works on a set of sequences and applies it to software failure detection.

Bio: David Lo is an Assistant Professor in the School of Information Systems, Singapore Management University. He received his Ph.D. in Computer Science from the National University of Singapore in 2008. Before that, he received his B.Eng. in Computer Engineering from Nanyang Technological University in 2004. His research interests are frequent pattern mining, classification, reverse engineering, software maintenance, and software reliability. His work has been published in various venues in both the data mining and software engineering areas, including ICDE, KDD, SDM, FSE, ASE, and ISSTA.
 

Online Video: Link

Friday August 7
SC 3403
1:30-2:30 PM

Title: Modeling and Algorithmic Challenges in Online Social Networks
Speaker: Ravi Kumar, Yahoo! Research



Abstract: Online social networks have become major and driving phenomena on the web. In this talk we will address key modeling and algorithmic questions related to large online social networks. From the modeling perspective, we raise the question of whether there is a generative model for network evolution. The availability of time-stamped data makes it possible to study this question at an extremely fine granularity. We exhibit a simple, natural model that leads to synthetic networks with properties similar to the online ones. From an algorithmic viewpoint, we focus on challenges posed by the magnitude of data in these networks. In particular, we examine topics related to influence and correlation in user activities and compressibility of such networks.

Bio: Ravi Kumar joined Yahoo! Research in July 2005. Prior to this, he was a research staff member at the IBM Almaden Research Center in the Computer Science Principles and Methodologies group. His primary interests are web algorithms, algorithms for large data sets, and theory of computation. He obtained his PhD in Computer Science from Cornell University in December 1997.
 

Online Video: Link