
Microsoft's new safety system can detect hallucinations in its customers' AI apps.

According to Sarah Bird, Microsoft's chief product officer for responsible AI, the company has designed several new safety features that are easy to use for customers who aren't hiring red teamers to test the AI services they build. Microsoft says the LLM-powered tools can detect potential vulnerabilities, monitor for hallucinations, and block malicious prompts in real time for customers working with any model hosted on the platform. The evaluation system generates the test questions a customer's deployment needs to answer, then produces a score and shows the outcomes. Prompt Shields, Groundedness Detection, and safety evaluations are available in preview on the platform; two other features for directing models toward safe outputs are coming soon.
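For readers who want a concrete picture of what such an evaluation loop might look like, here is a minimal sketch in Python. It is not Microsoft's actual API: the `generate_test_prompts` and `is_unsafe` callables are hypothetical stand-ins for the prompt generation and scoring that the Azure service performs itself.

```python
# Minimal sketch of a safety-evaluation harness (hypothetical, not Microsoft's API).
# generate_test_prompts() and is_unsafe() stand in for the adversarial prompt
# generation and response scoring that the hosted service handles on its own.

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalResult:
    prompt: str
    response: str
    passed: bool  # True if the response stayed within the safety policy


def run_safety_evaluation(
    model: Callable[[str], str],                      # the customer's deployed model endpoint
    generate_test_prompts: Callable[[], List[str]],   # hypothetical adversarial prompt generator
    is_unsafe: Callable[[str], bool],                 # hypothetical classifier for unsafe output
) -> float:
    """Run generated test prompts against the model and return a pass rate (0.0-1.0)."""
    results = []
    for prompt in generate_test_prompts():
        response = model(prompt)
        results.append(EvalResult(prompt, response, passed=not is_unsafe(response)))
    return sum(r.passed for r in results) / max(len(results), 1)
```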

Whether the user is typing in a prompt or the model is processing third-party data, the monitoring system evaluates the input to see whether it triggers any banned terms or contains a hidden prompt before the model acts on it. After the model responds, the system checks whether the information in the response is actually grounded in the document or the prompt, flagging anything that appears to be hallucinated. Filters built to reduce bias in another company's image generator were recently found to have unintended effects, so Bird and her team added a way for Azure customers to adjust the filters for hate speech and violence that the model sees and blocks. Bird says this will allow system administrators to figure out which users are the company's own team of red teamers and which might have malicious intent. The safety features work with GPT-4 and other popular models hosted on the platform.
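As a rough illustration of the flow described above, and not Azure's real implementation, the screening-then-grounding pipeline could be sketched like this; `screen_prompt`, `check_groundedness`, and `severity_threshold` are all hypothetical stand-ins.

```python
# Rough sketch of the monitoring flow described above (hypothetical; not Azure's real API).
# screen_prompt() stands in for a Prompt Shields-style input check, and
# check_groundedness() stands in for comparing the response against the source material.

from typing import Callable, Optional


def answer_with_safeguards(
    model: Callable[[str], str],
    prompt: str,
    source_document: str,
    screen_prompt: Callable[[str], bool],             # hypothetical: True if the input should be blocked
    check_groundedness: Callable[[str, str], float],  # hypothetical: returns a 0.0-1.0 grounding score
    severity_threshold: float = 0.8,                  # adjustable, like the configurable content filters
) -> Optional[str]:
    # 1. Screen the incoming prompt (and any third-party data folded into it) before the model sees it.
    if screen_prompt(prompt):
        return None  # blocked: banned terms or a suspected hidden prompt

    # 2. Let the model answer with the source document as context.
    response = model(prompt + "\n\nContext:\n" + source_document)

    # 3. Check that the response is grounded in the document/prompt rather than hallucinated.
    if check_groundedness(response, source_document) < severity_threshold:
        return None  # flagged as likely ungrounded

    return response
```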
