The white clam pizza at Frank Pepe Pizzeria Napoletana in New Haven, Connecticut, is a revelation. The crust, kissed by the intense heat of the coal-fired oven, strikes a perfect balance between crunchy and chewy. Topped with freshly shucked clams, garlic, oregano and a little grated cheese, it’s a testament to the magic that simple, high-quality ingredients can conjure.

Sound like me? It’s not. The entire paragraph, except for the name of the pizzeria and the city, was generated by GPT-4 in response to a simple prompt asking for a Pete Wells-style restaurant review.

I have some objections. I would never call any food a revelation or describe heat as a kiss. I don’t believe in magic and rarely call anything perfect without “almost” or some other hedge. But these lazy descriptors are so common in food writing that I imagine many readers barely notice them. I’m unusually attuned to them because every time a cliché creeps into my copy, my editor boxes my ears.

He wouldn’t be fooled by fake Pete. Neither would I. But as much as it pains me to admit it, I suspect a lot of people would call it a four-star fake.

The person responsible for Phony Me is Balazs Kovacs, a professor of organizational behavior at the Yale School of Management. In a recent study, he fed a large batch of Yelp reviews to GPT-4, the technology behind ChatGPT, and asked it to imitate them. His human test subjects could not distinguish between genuine reviews and those generated by artificial intelligence. In fact, they were more likely to think the AI reviews were real. (The phenomenon of computer-generated fakes being more convincing than the real thing is so well known that it has a name: AI hyperrealism.)

Dr. Kovacs’s study belongs to a growing body of research suggesting that the latest versions of generative AI can pass the Turing test, a scientifically fuzzy but culturally resonant standard. When a computer can trick us into believing that the language it spits out was written by a human, we say it has passed the Turing test.

It has long been assumed that AI would eventually pass the test first proposed by mathematician Alan Turing in 1950. But even some experts are surprised by how quickly the technology is improving. “It’s happening faster than people expected,” Dr. Kovacs said.

The first time Dr. Kovacs asked GPT-4 to imitate Yelp reviews, few people were fooled. The prose was too polished. That changed when he instructed the program to use colloquial spellings, emphasize a few words in all capitals and slip one or two typos into each review. This time, GPT-4 passed the Turing test.
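For the curious, here is roughly what such an instruction might look like as code. This is a minimal sketch of my own, not Dr. Kovacs’s actual setup; the model name, the prompt wording and the example review are all assumptions.

```python
# A minimal sketch of the kind of prompt described above -- my illustration,
# not Dr. Kovacs's actual code. Model name, wording and examples are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# In the study, a stack of real Yelp reviews served as the style examples.
example_reviews = [
    "Best clam pie in town!! crust was SO good, def coming back...",
]

prompt = (
    "Here are some real Yelp restaurant reviews:\n\n"
    + "\n---\n".join(example_reviews)
    + "\n\nWrite a new review of a neighborhood pizzeria in the same style. "
    "Use colloquial spellings, put a few words in ALL CAPS for emphasis, "
    "and slip in one or two small typos."
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```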

Besides marking a threshold in machine learning, AI’s ability to sound like us has the potential to undermine whatever trust we still have in written communications, especially shorter ones. Text messages, emails, comment sections, news articles, social media posts and user reviews will seem even more suspect than they already do. Who’s going to believe a Yelp rave about a croissant pizza, or a glowing OpenTable note about a $400 omakase sushi tasting, knowing that its author could be a machine that can’t chew or swallow?

“With consumer-generated reviews, it’s always been a big question who’s behind the screen,” said Phoebe Ng, a restaurant communications strategist in New York City. “Now it’s a question of what’s behind the screen.”

Online reviews are the grease in the gears of modern commerce. In a 2018 Pew Research Center survey, 57 percent of Americans said they always or almost always read online reviews and ratings before buying a product or service for the first time. Another 36 percent said they sometimes did.

For a business, a few points in a star rating on Google or Yelp can be the difference between making money and going under. “We live off reviews,” the manager of an Enterprise Rent-a-Car branch in Brooklyn told me last week as I picked up a car.

A business traveler who needs a ride that won’t break down on the New Jersey Turnpike may be more influenced by a negative report than, say, someone simply looking for brunch. Still, for restaurateurs and chefs, Yelp, Google, TripAdvisor and other sites that allow customers to voice their opinions are a source of endless worry and occasional fury.

A particular cause of frustration is the large number of people who review restaurants they have never eaten at. Until an article in Eater called attention to it last week, the first New York location of the Taiwan-based dim sum chain Din Tai Fung was being pelted with one-star reviews on Google, dragging its average rating down to 3.9 out of a possible 5. The restaurant hasn’t opened yet.

Some ghost critics are more sinister. Restaurants have received one-star reviews, followed by an email offer to remove them in exchange for gift cards.

To combat bad-faith attacks, some owners recruit their nearest and dearest to flood the zone with positive notices. “The question is, how many aliases do all of us in the restaurant industry have?” said Steven Hall, the owner of a New York public relations firm.

A step up from an organized ballot-stuffing campaign, or perhaps a step down, is the practice of trading comped meals or cash for positive write-ups. Beyond that lies the vast, shadowy realm of reviewers who don’t exist.

To promote their own businesses or undermine rivals, companies can hire brokers who have manufactured small armies of fictitious reviewers. According to Kay Dean, a consumer advocate who investigates online review fraud, these accounts are typically given a lengthy history of past reviews that acts as camouflage for their pay-for-play posts.

In two recent videos, she pointed to a chain of mental health clinics that had received glowing reviews on Yelp, apparently submitted by satisfied patients, from accounts stuffed with restaurant reviews copied word for word from TripAdvisor.

“It’s an ocean of fakery, and it’s much worse than people realize,” Ms. Dean said. “Consumers are being misled, honest businesses are being harmed and trust is being eroded.”

All of this is being done by mere humans. But as Dr. Kovacs writes in his study, “the situation now changes substantially because humans will no longer be required to write reviews that appear authentic.”

Dean said that if AI-generated content infiltrates Yelp, Google and other sites, it will be “even harder for consumers to make informed decisions.”

The major sites say they have ways to uncover Potemkin accounts and other forms of falsehood. Yelp encourages users to flag questionable reviews and, after an investigation, will remove those that violate its policies. It also hides reviews that its algorithm considers less trustworthy. Last year, according to its most recent Trust and Safety Report, the company stepped up its use of AI “to even better detect and de-recommend less helpful and less trustworthy reviews.”

Dr. Kovacs believes that sites will now have to work harder to show that they aren’t routinely publishing the thoughts of robots. They could, for example, adopt something like the “Verified Purchase” label that Amazon attaches to reviews of products that were bought through its site. And if readers grow even more suspicious of crowdsourced restaurant reviews than they already are, it could be an opportunity for OpenTable and Resy, which accept comments only from diners who have shown up for their reservations.

One thing that probably won’t work is asking computers to analyze the language on their own. Dr. Kovacs ran his real and doctored Yelp reviews through programs that are supposed to identify AI writing. Like his test subjects, he said, the software “thought the fakes were real.”
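Many such detectors rest on a simple statistical idea: text that a language model finds highly predictable (low perplexity) is more likely to be machine-written. Below is a toy sketch of that idea using the open-source GPT-2 model; the cutoff value is an arbitrary assumption for illustration, not how any particular commercial detector works.

```python
# A toy illustration of perplexity-based AI-text detection -- not any
# commercial detector. The model choice and the cutoff are assumptions.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Average predictability of `text` under GPT-2 (lower = more predictable)."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean negative log-likelihood
    return torch.exp(loss).item()

review = "The white clam pizza here is a revelation! PERFECT crust, amazing clams."
score = perplexity(review)
print(f"perplexity = {score:.1f}")
# 40.0 is an arbitrary cutoff for illustration; as the study found,
# detectors built on ideas like this are easy to fool.
print("flagged as likely AI" if score < 40.0 else "plausibly human")
```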

This did not surprise me. I took Dr. Kovacs’s survey myself, trusting that I would be able to spot the small, concrete details that a real diner would mention. After clicking a box to certify that I wasn’t a robot, I quickly found myself lost in a desert of exclamation points and frowny faces. By the time I got to the end of the test, I was just guessing. I correctly identified seven out of 20 reviews, a result somewhere between flipping a coin and asking a monkey.
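A quick back-of-the-envelope check (mine, not part of the study) confirms that 7 out of 20 is statistically indistinguishable from random guessing:

```python
# Is 7 correct out of 20 yes/no judgments different from chance (p = 0.5)?
from scipy.stats import binomtest

result = binomtest(k=7, n=20, p=0.5)
print(result.pvalue)  # about 0.26 -- consistent with pure guessing
```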

What tripped me up was that GPT-4 didn’t make up its reviews out of thin air. It pieced them together from snippets of Yelpers’ descriptions of their Sunday lunches and snacks.

“It’s not fully composed in terms of the things that people value and what they care about,” Dr. Kovacs said. “The scary thing is that you can create an experience that looks and smells like a real experience, but it’s not.”

By the way, Dr. Kovacs told me that he ran the first draft of his paper through an AI editing program and incorporated many of its suggestions into the final copy.

It probably won’t be long before the idea of a purely human review seems quaint. Robots will be invited to read over our shoulders, alert us when we’ve used the same adjective too many times, and nudge us toward a more active verb. Machines will be our teachers, our editors, our collaborators. They will even help us look human.
