MSR-Bing Image Retrieval Challenge

MSR-Bing Image Retrieval Challenge @ ICME 2014

Challenge evaluation guidelines and the submission entrance (including a notice of an important change to the evaluation process) are available on the Challenge website.


Following the success of the 1st MSR-Bing Image Retrieval Challenge (MSR-Bing IRC) at ACM Multimedia 2013, Microsoft Research, in partnership with Bing, is happy to launch the 2nd MSR-Bing IRC at ICME 2014.

Do you have what it takes to build the best image retrieval system? Enter this Challenge to develop an image scoring system for a search query. 

In doing so, you can:
  • Try out your image retrieval system using real world data; 
  • See how it compares to the rest of the community’s entries; 
  • Get to be a contender for ICME 2014 Grand Challenge; 
  • Be eligible to win a cash prize.

Task
The topic of the Challenge is web image retrieval. Contestants are asked to develop systems that assess how effectively query terms describe images crawled from the web for image search purposes. A contesting system must produce a floating-point score for each image-query pair that reflects how well the query describes the given image, with higher scores indicating higher relevance. The dynamic range of the scores does not matter, so long as, for any query, sorting all of its associated images by their scores yields the best retrieval ranking for those images.
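For illustration only, the short Python sketch below (with a hypothetical score_pair placeholder standing in for a real model) shows the expected contract: one floating-point score per image-query pair, with each query's ranking obtained purely by sorting on those scores.

    from collections import defaultdict

    def score_pair(query: str, jpeg_bytes: bytes) -> float:
        """Hypothetical placeholder: return a relevance score (higher = more relevant)."""
        return 0.0

    def rank_images(pairs):
        """Group (query, image_id, jpeg_bytes) triples by query and sort by score."""
        by_query = defaultdict(list)
        for query, image_id, jpeg_bytes in pairs:
            by_query[query].append((image_id, score_pair(query, jpeg_bytes)))
        # Only the relative order within each query matters for evaluation,
        # not the absolute score range.
        return {q: sorted(items, key=lambda x: x[1], reverse=True)
                for q, items in by_query.items()}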

Dataset
The data is based on queries received at Bing Image Search in the EN-US market and comprises two parts: (1) the Training Dataset, a sample of the Bing user click log, and (2) the Dev Dataset, which, although it may differ in size, is created to have a query distribution, judgment guidelines, and quality consistent with the Test Dataset. The two datasets are intended for contestants' local debugging and evaluation. The table below shows the dataset statistics.
For more details about the datasets, please see the dataset document; the datasets can be downloaded from the MSR-Bing Image Retrieval Challenge 2013 website.
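For local debugging, a loader along the following lines may help. It assumes the training click log is distributed as a tab-separated file of (image key, query, click count) triples, with thumbnails in a separate tab-separated key/base64 file; this layout is an assumption on our part, so consult the dataset document for the authoritative format.

    import base64

    def read_click_triples(path):
        """Yield (image_key, query, click_count) from a tab-separated click log.
        NOTE: the column layout is an assumption; see the official dataset document."""
        with open(path, encoding="utf-8") as f:
            for line in f:
                image_key, query, clicks = line.rstrip("\n").split("\t")
                yield image_key, query, int(clicks)

    def read_thumbnails(path):
        """Yield (image_key, jpeg_bytes) from a tab-separated key/base64 thumbnail file."""
        with open(path, encoding="utf-8") as f:
            for line in f:
                image_key, b64 = line.rstrip("\n").split("\t")
                yield image_key, base64.b64decode(b64)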

Measurements
Each entry to the Challenge is ranked by its Discounted Cumulated Gain (DCG) measured against the test set. To compute DCG, we first sort, for each query, the images by the floating-point scores returned by the contesting entry. DCG for the query is then computed over the top 25 ranked images as

    DCG_25 = 0.01757 × Σ_{i=1}^{25} (2^{rel_i} − 1) / log2(i + 1),

where rel_i ∈ {Excellent = 3, Good = 2, Bad = 0} is the manually judged relevance of the image at rank i with respect to the query, and 0.01757 is a normalizer that makes the score of 25 Excellent results equal to 1. The final metric is the average of DCG_25 over all queries in the test set.
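For concreteness, the small Python sketch below computes the per-query DCG_25 defined above; its input is the list of judged relevance grades in the order induced by an entry's scores.

    import math

    REL = {"Excellent": 3, "Good": 2, "Bad": 0}

    def dcg25(ranked_grades, k=25, normalizer=0.01757):
        """DCG over the top-k images of one query, as defined above."""
        gain = sum((2 ** REL[g] - 1) / math.log2(i + 1)
                   for i, g in enumerate(ranked_grades[:k], start=1))
        return normalizer * gain

    # Sanity check: 25 Excellent results score (approximately) 1.
    print(round(dcg25(["Excellent"] * 25), 3))  # ~1.0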

In addition to DCG, the average latency in processing each image-query pair will be used as a tie-breaker. For this Challenge, each entry is given at most 12 seconds to assess each image-query pair. Tied or empty (timed-out) results are assigned the least favorable scores, producing the lowest possible DCG.

Process
As mentioned above, a dataset based on the Bing Image Search index is available for offline training. Detailed descriptions of the dataset can be found in the “Datasets” section of this page. In addition, the organizer will make available a web service, accessible from the “Team” section of this site, for online test runs, starting three months before the final submission deadline. Each contestant can enter the URI of the web service implementing a contending entry on the website. Upon receiving an entry, the Challenge website will schedule a job to call the web service, evaluate its responses, and post the results in the “Team” and “Leaderboard” sections of the website if the entry is designated to show its results publicly.

Web Service Development Phase
Initially, the website will evaluate each entry by computing the DCG on a trial data set of 10 queries with 50 images each, so contestants can submit as many test runs as necessary during this phase.

Final Challenge
When the Final Challenge starts (time TBD), the website will switch to the Challenge test set. All contestants must ensure that their entries are properly registered with the website prior to this time and that their web services remain up and running for at least one week. No further revisions to an entry are allowed at this point.

Once the winners of the Challenge are determined, the website will resume accepting submissions and evaluating results from the general public. The web site with the Challenge test set will be maintained indefinitely after the Challenge for future researchers to include in their studies as a baseline.

Web Service Interface
Each entry is a URI of a REST-based web service hosted by the team. The web service must be publicly accessible over the internet via HTTP POST with the following parameters:

Name   | Type           | Description
-------|----------------|--------------------------------------------------------------
runID  | UTF-8 string   | A unique identifier naming a particular run when a system is submitted for evaluation. White space is not allowed in the string.
query  | UTF-8 string   | A text query in the raw form of user input (all capitalization, punctuation, etc. retained).
image  | Base64 string  | A base64-encoded JPEG image thumbnail, processed so that the larger of its width and height is at most 300 pixels.

A contesting system should process the image and respond as soon as possible with HTTP 200 OK. The response body should be UTF-8 encoded with the MIME type ‘text/plain’ and contain a single floating-point score, with higher scores indicating a more relevant result for the query. The organizer may call each contesting web service multiple times during the Final Challenge week to obtain statistically significant results for determining the winners.
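As a rough starting point, the sketch below implements this interface with Flask (an assumption on our part; any HTTP stack will do) and assumes the three parameters arrive as ordinary POST form fields. The scoring function is a hypothetical placeholder for a contestant's model.

    import base64

    from flask import Flask, request

    app = Flask(__name__)

    def score(query: str, jpeg_bytes: bytes) -> float:
        """Hypothetical placeholder: higher means the query better describes the image."""
        return 0.0

    @app.route("/", methods=["POST"])
    def evaluate():
        run_id = request.form["runID"]                        # unique run identifier
        query = request.form["query"]                         # raw user query (UTF-8)
        jpeg_bytes = base64.b64decode(request.form["image"])  # JPEG thumbnail
        # Respond with a single floating-point score as text/plain.
        return str(score(query, jpeg_bytes)), 200, {"Content-Type": "text/plain; charset=utf-8"}

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=8080)

A quick local check can POST the three form fields with any HTTP client (for example curl or Python requests) and verify that the response body comes back as a plain-text float.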

Participation and Prizes
The Challenge is a team-based contest. Each team can have one or more members, and an individual can be a member of multiple teams. No two teams, however, may share more than half of their members. Team membership must be finalized and submitted to the organizer prior to the Final Challenge start date.

At the end of the Final Challenge, all entries will be ranked based on the metrics described above. The top three teams will receive award certificates and/or cash prizes (prize amounts TBD).

Paper Submission
Please follow the ICME 2014 Grand Challenge guidelines for the corresponding paper submission.

Detailed Timeline
  • October 30, 2013: Dataset available for download
  • April 1, 2014: Trial set available for download
  • April 2, 2014: Deadline for the participants to register on the CMT system for submitting evaluation results
  • April 7, 2014: Encrypted evaluation set is available for downloading
  • April 10, 2014: Challenge starts at 0:00 AM PDT; the password will be sent to the email addresses registered in the Challenge CMT system
  • April 11, 2014: Challenge ends at 0:00 AM PDT; the submission system closes
  • April 16, 2014: Challenge paper submission deadline (4-6 pages following the same guideline of the main conference)
  • April 23, 2014: Notification
  • April 30, 2014: Camera ready
More information
Please note that, unlike MSR-Bing IRC 2013, this Challenge is not separated into two tracks; there is only one track this time. The evaluation will be based mainly on the Final Challenge results, while the paper submissions will also be taken into account. Also note that although the training data is the same as in MSR-Bing IRC 2013, the test data in the Final Challenge will be different.

Questions related to this Challenge should be directed to Xian-Sheng Hua (xshua@microsoft.com) or the ICME 2014 Grand Challenge Chairs.

Last Update: Oct 30, 2013