Bar exam score shows AI can keep up with ‘human lawyers,’ researchers say

According to the company, “GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less capable than humans in many real-world scenarios, exhibits human-level performance on various professional and academic benchmarks.”

Artificial intelligence can now outperform most law school graduates on the bar exam, the grueling two-day test aspiring attorneys must pass to practice law in the United States, according to a new study released Wednesday.

GPT-4, the upgraded AI model released this week by Microsoft-backed OpenAI, scored 297 on the bar exam in an experiment conducted by two law professors and two employees of legal technology company Casetext.

That places GPT-4 in the 90th percentile of actual test takers and is enough to be admitted to practice law in most states, the researchers found.

The bar exam assesses knowledge and reasoning and includes essays and performance tests meant to simulate legal work, as well as multiple-choice questions.

“Large language models can meet the standard applied to human lawyers in nearly all jurisdictions in the United States by tackling complex tasks requiring deep legal knowledge, reading comprehension, and writing ability,” the authors wrote.

Less than four months ago, two of the same researchers concluded that OpenAI’s earlier large language model, ChatGPT, fell short of a passing score on the bar exam, highlighting how rapidly the technology is improving.

The newer GPT-4 got nearly 76% of the bar exam’s multiple-choice questions right, up from about 50% for ChatGPT, outperforming the average human test-taker by more than 7%.

The National Conference of Bar Examiners, which designs the multiple-choice section, said in a statement Wednesday that attorneys have unique skills gained through education and experience that “AI cannot currently match.”

Study co-author Daniel Martin Katz, a professor at Chicago-Kent College of Law, said in an interview that he was most surprised by GPT-4’s ability to produce largely relevant and coherent essay and performance test answers.

“I heard so many people say, ‘Well, it might get the multiple choice but it will never get the essays,’” Katz said.

AI has also performed well on other standardized tests, including the SAT and the GRE, but the bar exam has garnered more attention. OpenAI touted its passing score when it announced the latest model on Tuesday.

Bar exam tutor Sean Silverman attributed the focus on the bar exam to its widely recognized difficulty. This year’s first-time pass rate on the attorney licensing exam was 78% among test takers who spent three years in law school.

Silverman said people may be less impressed to learn that AI can pass a test designed for high-schoolers, like the SAT, “rather than the test to become a lawyer.”
