Late last year, to significant fanfare, City Hall rolled out an AI-powered chatbot geared toward helping people start and operate NYC-based businesses. The bot was meant to dispense advice, and that it did, though not the kind the city expected.
As reported by the tech investigations site The Markup, the MyCity Chatbot has been advising people to break the law in myriad ways, from telling employers they could take tips from their workers to informing landlords they could discriminate against tenants who receive rental assistance, the law be damned. This debacle is especially infuriating not because it was unpredictable, but because it was entirely predictable.
The failure comes down to the basic problem with every single large language model (LLM) system in AI, a reality that will always exist below the hype: these systems don’t know anything, can’t look anything up and wouldn’t know how to synthesize the information if they could. That’s because, while they are able to very convincingly simulate thought, it is only a simulation.
An LLM has been trained on huge troves of data to become, essentially, an incredibly potent version of an e-mail auto-reply generator; it predicts the shape of what a response to a query could look like, but it is not actually capable of thinking about the question the way a human does. This isn’t a software issue that can simply be patched or easily remedied, because it isn’t an error. It is output that flows directly from the system’s very structure.
Perhaps the embarrassing liability of having an official government program dispense plainly illegal advice will finally convince policymakers that they’ve been duped by the buzz around a technology that has never been proven effective for the purposes they’re trying to use it for. Recently, a Canadian tribunal ordered Air Canada to refund a customer who had been given incorrect information by the airline’s chatbot, rejecting the stupid argument that the bot was meaningfully a separate entity from the company itself.
It’s not hard to imagine the same thing happening in New York, except that the remedy will be for a worker fired unlawfully or whose wages were stolen, or for a would-be tenant discriminatorily denied housing. That’s if they bother to bring a case at all; many people whose rights were violated by someone acting on the MyCity Chatbot’s erroneous advice probably won’t even seek redress.
Does this mean that all AI applications are useless or should be shunned by government? Not necessarily; for certain narrow, targeted purposes, a specifically tailored AI can cut down on wasteful tedium and help people.
Yet it simply isn’t currently possible to craft an LLM that will routinely return accurate responses across the full range and complexity of the city’s legal, regulatory and operational functions. LLMs cannot look up laws or court decisions, for the simple reason that they do not understand the question you are asking them; they can only produce what might look like a plausible enough answer, based on reams of similar text they’ve ingested from millions of online sources.
We’re asking an adding machine to do sudoku; it doesn’t matter how good and fast it is at adding, subtracting and multiplying, because an adding machine cannot solve puzzles. It’s time to step back, before real people get hurt by MyCity Chatbot.