Microsoft announces Turing Bletchley v3 vision-language model for Bing image searches

Microsoft announces Turing Bletchley v3 vision-language model for Bing image searches

October 16, 2023

In a significant technological stride, Microsoft, a prominent IT company, has unveiled the third iteration of the Turing Bletchley Multilingual Vision Language Foundation Model. This groundbreaking advancement is now integrated into various Microsoft products, including Bing, ushering in notable improvements for image searches.

Microsoft introduced the inaugural Turing Bletchley model in November 2021, marking the inception of a transformative journey. In an official blog post on Bing, the custom software development company disclosed that testing for the third version of the model commenced in the autumn of 2022, leading to its subsequent integration into Bing and other Microsoft products.

The essence of the Turing Bletchley v3 model lies in its ability to seamlessly connect text and images, enriching the user experience on Microsoft's Bing search engine, recently integrated with AI. The primary objective is to narrow the gap between textual descriptions and corresponding images in search results, ensuring a more precise and intuitive search process.

One of the key techniques employed by Turing Bletchley v3 to forge these connections is through its ingenious use of extensive masking. Microsoft's approach involves masking certain words in an image caption, prompting a neural network to predict these concealed words based on both the image and the accompanying text. This method can also be reversed to mask pixels instead of words. This innovative mask training, coupled with the formidable capabilities of a large transformer-based model, culminates in a robust pre-trained model adaptable to a diverse array of downstream tasks.

Beyond enhancing Bing image searches, the Turing Bletchley v3 model finds practical applications in content moderation for Xbox, Microsoft's gaming platform. It plays a pivotal role in identifying inappropriate images and videos uploaded by Xbox players to their profiles, ensuring alignment with the company's community standards and guidelines on the Xbox platform.

Microsoft's commitment to innovation and refinement, as exemplified by the Turing Bletchley v3 model, signifies a relentless pursuit of excellence in bridging the gap between text and images. This advancement promises to elevate the search experience for Bing users and uphold the integrity of content on Xbox, marking yet another milestone in Microsoft's technological journey.