Baptism of fire for new SubSift at SDM10

The new JISCRI SubSift has been used to support the paper reviewing process for the 2010 SIAM International Conference on Data Mining (SDM10), which will take place 29 April – 1 May, 2010 in Columbus, Ohio.

The initiative came from SDM10 programme co-chair Bart Goethals, University of Antwerp, who heard about the original SubSift software at KDD09 earlier this year. Bart pasted the names of the 145 programme committee (PC) members into the new SubSift software and uploaded a file containing the 327 submitted abstracts (as exported directly from SDM10’s conference management system). SubSift then compared each PC member’s online bibliography, harvested from the DBLP website, with each of the submitted abstracts and produced a personalised web page for each PC member, listing the submitted papers that most closely matched that member’s publications. These pages were made available to the PC members to guide their bidding for papers to review. In addition, to help Bart assign papers that no one bid on, SubSift produced a list of the closest matching reviewers for each paper.

This really is a baptism of fire for the new JISCRI SubSift software because it has only just been completely rewritten in a single programming language, Perl, replacing the original, eclectic code base of Prolog, Java, C++ and Matlab M-code. The original SubSift had no user interface other than the command line; the new version adds a web-based step-by-step “wizard” interface, which spares the end user from installing any software and guides them through the process of matching paper submissions to reviewers. The project still has much work to do in packaging up the functionality of the new SubSift in an easy-to-use and well-documented way. Fortunately for this trial, given the technical nature of the audience at SDM10, there was no need to explain terminology like “tfidf” and “cosine similarity”, although such jargon will need translating into everyday language when SubSift is eventually released to the wider academic community.
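For readers less familiar with that jargon, here is a minimal sketch of the general idea of matching reviewers to papers with tf-idf weights and cosine similarity. It is written in Python for illustration only and is not SubSift's actual Perl code. The function names, the reviewer names and the tiny texts are all made up, and choices such as the tokeniser and the exact tf-idf weighting are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lower-case and keep alphabetic words only (an assumed, very simple tokeniser).
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    # docs maps a document key to its text; returns key -> {term: tf-idf weight}.
    tokens = {key: tokenize(text) for key, text in docs.items()}
    doc_freq = Counter()
    for toks in tokens.values():
        doc_freq.update(set(toks))
    n_docs = len(docs)
    vectors = {}
    for key, toks in tokens.items():
        counts = Counter(toks)
        vectors[key] = {
            term: (count / len(toks)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_table(reviewers, abstracts):
    # Weight terms over the combined collection, then score every reviewer/paper pair.
    docs = {("reviewer", name): text for name, text in reviewers.items()}
    docs.update({("paper", pid): text for pid, text in abstracts.items()})
    vecs = tfidf_vectors(docs)
    return {
        (r, p): cosine(vecs[("reviewer", r)], vecs[("paper", p)])
        for r in reviewers
        for p in abstracts
    }


def papers_for_reviewer(scores, reviewer):
    # Ranked list used for a reviewer's personalised bidding page.
    pairs = [(p, s) for (r, p), s in scores.items() if r == reviewer]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def reviewers_for_paper(scores, paper):
    # Ranked list used to assign papers that received no bids.
    pairs = [(r, s) for (r, p), s in scores.items() if p == paper]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Hypothetical data: bibliography text per reviewer and one abstract per paper.
reviewers = {
    "alice": "frequent itemset mining association rules pattern mining",
    "bob": "graph clustering spectral methods community detection",
}
abstracts = {
    "p1": "efficient mining of frequent itemset patterns in streams",
    "p2": "spectral clustering of large graph data",
}

scores = similarity_table(reviewers, abstracts)
print(papers_for_reviewer(scores, "alice"))
print(reviewers_for_paper(scores, "p2"))
```

The same table of scores serves both purposes described above: sorted per reviewer it gives the list of papers to bid on, and sorted per paper it gives the list of candidate reviewers.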

We are currently gathering qualitative feedback from Bart and the PC members and, once the SDM10 reviewing process closes, we intend to analyse quantitatively how closely the actual bids match the recommendations generated by SubSift.