SubSift's demonstrator user interface is implemented as a wizard-like series of web forms that takes the user through the above process form by form. At the end of the sequence, SubSift produces downloadable web pages with ranked lists of papers per reviewer and ranked lists of reviewers per paper. At a more abstract level, this workflow for comparing Programme Committee (PC) members (i.e. reviewers) with submitted abstracts consists of the following three parts.

  A. Profile the PC Members
  B. Profile the Abstracts
  C. Match PC Member Profiles against Abstract Profiles

Behind the functionality of each of these parts is a series of SubSift REST API invocations, which together constitute the workflow illustrated in the following diagram.

Figure: Submission Sifting workflow

The SubSift REST API calls that make up this workflow are described below. For readability, the HTTP request methods and their parameters are denoted using the format below.

<http_method> [<uri>][<user_id>]<path>
<parameter_name_1> = <parameter_value_1>
<parameter_name_2> = <parameter_value_2>
...
<parameter_name_N> = <parameter_value_N>

For brevity, we omit <uri>, which has the value https://subsift.ilrt.bris.ac.uk for the publicly hosted version of SubSift, and <user_id>, which is always the account name (e.g. kdd09 for the SIGKDD'09 conference).

Note also that we omit details of the security token needed in the HTTP request header of all DELETE, POST and PUT requests. The token is also required to access folders and data marked as private, irrespective of request method.
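As a rough illustration of how the notation above might map onto an actual HTTP request, the following Python sketch builds (but does not send) a request object. Several details are assumptions rather than documented behaviour: that the URL is formed as <uri>/<user_id><path>, that GET parameters go in the query string while other methods send a form-encoded body, and that the security token travels in a header named "Token". The function name build_request and the header name are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# <uri> for the publicly hosted SubSift, as given in the text.
BASE_URI = "https://subsift.ilrt.bris.ac.uk"


def build_request(method, path, params=None, user_id="kdd09",
                  token=None, token_header="Token"):
    """Translate one '<http_method> <path>' entry into an unsent Request.

    Assumptions (not confirmed by the text): the URL is
    <uri>/<user_id><path>, GET parameters go in the query string,
    other methods send a form-encoded body, and the token is sent in
    a header called `token_header` (a placeholder name).
    """
    url = f"{BASE_URI}/{user_id}{path}"
    encoded = urlencode(params or {})
    data = None
    if method == "GET":
        if encoded:
            url = f"{url}?{encoded}"
    else:
        data = encoded.encode("utf-8")
    request = Request(url, data=data, method=method)
    if token is not None:
        request.add_header(token_header, token)
    return request


if __name__ == "__main__":
    req = build_request("POST", "/bookmarks/pc", token="my-secret-token")
    print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would actually send it; the sketch stops short of that so it can be inspected without network access.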

A. Profile the PC Members

Step 1. Obtain a list of PC member names and their DBLP author page URIs. SubSift's DBLP Author Finder demo accepts a list of author names and then looks up these names on the DBLP Computer Science Bibliography and suggests author pages which, after disambiguation, are returned as a list with each line as: <pc member name>, <uri>.
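The list returned in Step 1 could be split into names and URIs along the following lines. This is a minimal sketch: the function name parse_author_lines is hypothetical, and splitting on the last comma assumes that a name may contain commas while a URI does not.

```python
def parse_author_lines(text):
    """Split '<pc member name>, <uri>' lines into (name, uri) pairs.

    Splits on the last comma of each line, on the assumption that a
    name may contain commas but the URI does not. Blank lines are
    skipped; a line without any comma raises ValueError.
    """
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, sep, uri = line.rpartition(",")
        if not sep:
            raise ValueError(f"expected '<name>, <uri>', got: {line!r}")
        pairs.append((name.strip(), uri.strip()))
    return pairs


if __name__ == "__main__":
    # The URIs below are placeholders, not real DBLP author pages.
    sample = "A. Reviewer, http://example.org/author/a\nB. Reviewer, http://example.org/author/b"
    for name, uri in parse_author_lines(sample):
        print(name, "->", uri)
```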

Step 2. Create a bookmarks folder to hold the list of PC member URIs found in Step 1.

POST /bookmarks/pc

Step 3. Create bookmarks in this folder - one per PC member URI.

POST /bookmarks/pc/items
items_list=<list of URIs from step 1>

Step 4. Create a documents folder to hold the web page content (text) of the DBLP author pages.

POST /documents/pc

Step 5. Import the bookmarks folder into the documents folder. This adds the URIs to the SubSift Harvester Robot's crawl queue. In time, all the URIs will be fetched and a document will be created in the documents folder for each web page fetched.

POST /documents/pc/import/pc

We name the documents folder the same as the bookmarks folder. This is a convention, not a requirement, but makes the ancestry of the folder obvious.

Step 6. Create a profiles folder from the documents folder.

POST /profiles/pc/from/pc
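Steps 2 to 6 can be summarised as an ordered list of calls. The sketch below only assembles that list as data and sends nothing. The function name pc_profiling_calls is hypothetical, and joining the URIs with newlines for items_list is an assumption about the expected format.

```python
def pc_profiling_calls(uris, folder="pc"):
    """Return Steps 2-6 as (method, path, params) tuples, in order.

    Assumption: items_list takes the URIs as newline-separated text.
    The same folder name is reused for bookmarks, documents and
    profiles, following the naming convention described in the text.
    """
    return [
        # Step 2: create the bookmarks folder.
        ("POST", f"/bookmarks/{folder}", {}),
        # Step 3: add one bookmark per PC member URI.
        ("POST", f"/bookmarks/{folder}/items", {"items_list": "\n".join(uris)}),
        # Step 4: create the documents folder.
        ("POST", f"/documents/{folder}", {}),
        # Step 5: import the bookmarks folder, which queues the URIs for crawling.
        ("POST", f"/documents/{folder}/import/{folder}", {}),
        # Step 6: create the profiles folder.
        ("POST", f"/profiles/{folder}/from/{folder}", {}),
    ]


if __name__ == "__main__":
    # Placeholder URIs for illustration only.
    for method, path, params in pc_profiling_calls(["http://example.org/a"]):
        print(method, path, sorted(params))
```

Because Step 5 only queues the URIs for fetching, a real client would presumably need to wait until the documents exist before issuing the Step 6 call; how to detect that is not covered here.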

B. Profile the Abstracts

Step 7. For bulk upload, pre-process the abstracts into CSV format so that each line is: <paper id>, <abstract>. Include the paper title in the abstract text.
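One possible way to do this pre-processing is sketched below. The function name abstracts_to_lines and the input shape (tuples of id, title and abstract) are hypothetical. Collapsing all whitespace so that each paper fits on one line is an assumption about what the bulk upload expects; the text does not say how commas or line breaks inside an abstract are handled.

```python
def abstracts_to_lines(papers):
    """Build '<paper id>, <abstract>' lines with the title folded in.

    `papers` is an iterable of (paper_id, title, abstract) tuples.
    Runs of whitespace, including newlines, are collapsed to single
    spaces so that each paper occupies exactly one line.
    """
    lines = []
    for paper_id, title, abstract in papers:
        text = " ".join(f"{title} {abstract}".split())
        lines.append(f"{paper_id}, {text}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up papers for illustration only.
    papers = [
        (1, "Fast Mining", "We mine\nfast."),
        (2, "Slow Mining", "We mine slowly."),
    ]
    print(abstracts_to_lines(papers))
```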

Step 8. Create a documents folder to hold the abstracts.

POST /documents/abstracts

Step 9. Use the abstracts CSV text to create a document item for each abstract.

POST /documents/abstracts/items
items_list=<csv from Step 7>

Step 10. Create a profiles folder from the documents folder.

POST /profiles/abstracts/from/abstracts

C. Match PC Member Profiles against Abstract Profiles

Step 11. Match the PC member profiles folder against the abstract profiles folder.

POST /matches/pc_abstracts/profiles/pc/with/abstracts

Step 12. Fetch the ranked list of papers per PC member. Optionally, specify an XSLT stylesheet to transform the XML into a custom web page for each PC member.

GET /matches/pc_abstracts/items
profiles_id=pc
full=1

Note that you can get the data in smaller chunks by using other API calls.

Step 13. Fetch the ranked list of reviewers per paper.

GET /matches/pc_abstracts/items
profiles_id=abstracts
full=1

Step 14. If required, fetch the similarity matrix to use for bidding, optionally specifying manually chosen thresholds to discretize the scores into the range 3..1 as bid values.

GET /matches/pc_abstracts/matrix

Alternatively, omit the profiles_id parameter from the items call to get XML instead of a matrix:

GET /matches/pc_abstracts/items
full=1
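The discretisation mentioned in Step 14 can be illustrated with a small client-side sketch. This is not SubSift's own implementation: the function names, the two-threshold scheme, the choice of 3 as the strongest bid, and the example scores are all assumptions made for illustration.

```python
def score_to_bid(score, high, low):
    """Map a similarity score to a bid value in 3..1.

    Returns 3 if score >= high, 2 if score >= low, otherwise 1.
    Assumes 3 denotes the strongest match; the thresholds are chosen
    manually by the caller.
    """
    if low > high:
        raise ValueError("low threshold must not exceed high threshold")
    if score >= high:
        return 3
    if score >= low:
        return 2
    return 1


def matrix_to_bids(matrix, high, low):
    """Apply score_to_bid to every cell of a {reviewer: {paper: score}} dict."""
    return {
        reviewer: {paper: score_to_bid(score, high, low)
                   for paper, score in row.items()}
        for reviewer, row in matrix.items()
    }


if __name__ == "__main__":
    # Made-up similarity scores for two reviewers and two papers.
    matrix = {"r1": {"p1": 0.42, "p2": 0.10}, "r2": {"p1": 0.05, "p2": 0.20}}
    print(matrix_to_bids(matrix, high=0.3, low=0.1))
```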