RSS Aggregator
Omarchy 3.7.0
Singapore boffins get diverse SIEMs singing in harmony with agentic rule translation
Academics from Singapore and China have found a way to make AI useful for cyber-defenders by creating a technique that translates rules across diverse Security Information and Event Management (SIEM) systems so they’re easier to consume across multiple platforms.…
CyberChef 11
An AI Just Beat Doctors at Diagnosing ER Patients
AI has aced medical exams, but there’s a wide gap between tests and the real world. A new study suggests the divide is closing.
Emergency doctors make high-stakes decisions in fast-paced, often chaotic situations. They have to figure out which patient most urgently needs care, what’s wrong, and what to do next.
AI could lend a hand. In a series of challenging scenarios, OpenAI’s o1-preview model matched or exceeded doctors in clinical reasoning. Debuted in 2024, the AI is a large language model similar to those powering ChatGPT, Claude, Gemini, and other popular chatbots.
But when it was first developed, o1-preview differed from those models in its ability to “think” through problems before answering. Such reasoning models explore multiple strategies, check themselves, and revise answers before offering a conclusion. This is a little closer to how humans solve problems.
Given case reports from an established database, o1-preview diagnosed the problem nearly 89 percent of the time. In real-world emergency room scenarios, the AI outperformed physicians at the triage stage, where doctors decide which patient needs treatment first.
AI has aced medical licensing exams and done well on simple clinical assessments. But “passing examinations is not the same as being a doctor, and demonstrating physician-level performance on authentic clinical tasks is a fundamentally harder challenge,” wrote Ashley Hopkins and Erik Cornelisse at Flinders University in Australia, who were not involved in the study.
This doesn’t mean that o1-preview is ready for the clinic or is about to replace physicians. Instead of a human-versus-machine spectacle, the study was more focused on setting a higher bar for systems designed to work alongside people. Like everyone else, doctors are incorporating AI into their work. Whether that improves or hinders care is an open question.
“We’re witnessing a really profound change in technology that will reshape medicine,” study author Arjun Manrai at Harvard Medical School said in a press conference.
AI, MD
The dream of AI in healthcare spans decades. Over 65 years ago, physicians proposed a benchmark for machine “doctors.” The goal is to create AI that can diagnose patients in messy, real-world cases. But use in clinics, where decisions have real consequences, is a high bar.
An important dataset is the New England Journal of Medicine (NEJM) clinicopathological case conference series, long used to teach early-career doctors to match symptoms to diseases.
It’s a tough job. Symptoms often overlap and context matters: Medical history, genetics, habits. Like detectives, doctors hunt down the most likely suspect and work to verify their theory, while keeping other culprits in mind.
As a test of diagnostic ability, the NEJM dataset has thwarted generations of computer systems. Some learned from misdiagnosis; others relied on pre-programmed rules. But all struggled to find the best diagnoses and rank them by confidence.
Then along came large language models. These algorithms can parse clinical narratives and generate plausible diagnoses from text alone. OpenAI’s GPT-4 model, for example, could handle some cases from NEJM. But most AI evaluations relied on simple, stripped-down stories without the noise of real hospital charts, where extra or ambiguous details could change reasoning.
A meaningful human baseline was missing. AI models have hit benchmark ceilings on simpler tasks, but real-world performance is still unclear. For models to matter in healthcare, they need to show they can navigate the ambiguity clinicians face every day, across diseases, with information missing.
Ace Student
The team pitted o1-preview against physicians and GPT-4 across five experiments.
The first used the NEJM dataset. The researchers gave AI models tightly controlled prompts. “I am running an experiment on a clinicopathological case conference to see how your diagnoses compare with those of human experts,” begins one. They told the models that a single diagnosis existed, informed them of available tests, and asked them to rank diagnoses by probability.
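As an illustration only (not the study’s actual code), a controlled prompt of the kind described might be assembled along these lines; the function name, field structure, and example inputs here are all hypothetical:

```python
# Hypothetical sketch of assembling a controlled diagnostic prompt of the
# shape the researchers describe: state the experimental setup, assert that
# a single diagnosis exists, list available tests, and request a ranked
# differential. Not the study's actual code.

def build_prompt(case_text: str, available_tests: list[str]) -> str:
    """Assemble an evaluation prompt for a clinicopathological case."""
    lines = [
        "I am running an experiment on a clinicopathological case conference "
        "to see how your diagnoses compare with those of human experts.",
        "A single correct diagnosis exists for this case.",
        "Available diagnostic tests: " + ", ".join(available_tests) + ".",
        "",
        "Case presentation:",
        case_text,
        "",
        "List your differential diagnoses, ranked from most to least probable.",
    ]
    return "\n".join(lines)


# Example with invented case details:
prompt = build_prompt(
    "A 54-year-old presents with fever and joint pain...",
    ["CBC", "blood cultures", "chest radiograph"],
)
print(prompt)
```

The point of such tight scaffolding is comparability: every model sees the same constraints and the same output format, so rankings can be scored against the known diagnosis.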
On 143 cases, o1-preview pulled ahead, producing a perfect or near-perfect diagnosis nearly 89 percent of the time; GPT-4 scored 73 percent. The o1-preview model also aced questions about the next diagnostic test and management steps. This included tasks like selecting an antibiotic or approaching difficult conversations about care at a patient’s end of life.
The gap widened on harder cases. Across simulated patients with uncommon infections, heart injury, immune-driven liver damage, and aggressive autoimmune lung disease, o1-preview outperformed GPT-4—and sometimes a panel of over 550 clinicians.
Next came the biggest challenge: Cases involving actual patients.
“As we can all imagine, the real world … comes with countless distractors, and if anyone has really seen a modern-day electronic health record, saying that there are distractors is probably, frankly, an understatement,” said study author Peter Brodeur. “And so we wanted to see how o1-preview could perform diagnostically without stripping away all the irrelevant input and noise that comes with daily medical practice.”
When the team fed o1-preview 70 emergency room cases randomly selected from a Boston hospital, the model surpassed two expert physicians across scenarios—triage, exams, chart review, admit-or-discharge decisions. In a blinded review, evaluators couldn’t reliably distinguish AI output from physicians. Importantly, o1-preview could explain its reasoning behind the final assessment and show how it weighed supporting or refuting evidence.
More information helped everyone. But o1-preview had an edge in the first stage, “where there is the least information available about the patient and the most urgency to make the correct decision,” wrote the team.
What Comes Next?
Doctors don’t diagnose from charts alone. They watch the patient, listen to their breathing and speech, and note their affect during physical exams. But o1-preview relied solely on text documented by others. Newer models—like GPT-5.3 and Gemini 3.1 Pro—can take in images, audio, even video. In principle, that brings them closer to how clinicians actually work.
But to be clear, o1-preview isn’t ready for the real world. Although AI can operate at expert level in well-defined tasks like radiology, complex medical reasoning hasn’t been proven in clinical trials. “We need to evaluate this technology now” in rigorous trials, said Manrai.
Also, diagnostic reasoning is only one part of medicine. Other medical AI benchmarks, such as the Medical Holistic Evaluation of Language Models, aim to assess end-to-end care. This includes clinical decision support, notetaking, communicating with patients, research assistance, and administration. The next step is to test AI in supervised clinical settings to see how it performs under guidance, like a medical intern.
OpenAI jumped the gun here. Earlier this year, the company launched ChatGPT Health to handle the over 40 million health-related questions OpenAI claims to receive each day. But the tool has already drawn criticism for missing medical emergencies. Other AI titans are joining the race.
Accuracy isn’t the only bar for clinical deployment. Medical AI has also shown racial bias that resulted in worse outcomes. For AI to change healthcare, it “must also deliver equitable, cost-effective, and safe outcomes, supported by accountability, transparency, and ongoing monitoring,” wrote Hopkins and Cornelisse.
The post An AI Just Beat Doctors at Diagnosing ER Patients appeared first on SingularityHub.
Weaver E-cology critical bug exploited in attacks since March
Child support claims can now be traded. This could mainly help children whose parent fails to pay alimony
Zorin OS 18 and a million downloads: the interest is certain, migration from Windows not yet
Static type-checking tools in the Python language ecosystem
Gorgon Halo, alias Ryzen AI Max+ PRO 495, will bring up to 11% more performance
Where do increasingly frequent negative electricity prices lead?
Kids say they can beat age checks by drawing on a fake mustache
It’s been months since the UK government began requiring stronger age checks under the Online Safety Act, and recent research suggests those measures are falling short of keeping kids away from harmful content. In some cases, even drawing on a mustache has been reported as enough to fool age detection software.…
Researchers report Amazon SES abused in phishing to evade detection
Apache HTTP Server (httpd) 2.4.67 fixes 11 vulnerabilities
Phishing Campaign Hits 80+ Orgs Using SimpleHelp and ScreenConnect RMM Tools
GameStop offers $56 billion for eBay, struggles to explain how it'll pay for it
GameStop yesterday made an unsolicited offer to buy eBay for $55.5 billion. GameStop claims that eBay has underperformed and spends too much on sales and marketing, and argues that eBay would become a stronger company if it cut costs and combined with GameStop's physical retail locations.
"GameStop’s ~1,600 US locations give eBay a national network for authentication, intake, fulfillment, and live commerce," GameStop Chairman and CEO Ryan Cohen wrote in a letter to eBay Chairman Paul Pressler.
eBay's market capitalization is over four times larger than GameStop's. GameStop faces skepticism about the viability of its offer but says it will obtain debt financing and pay with a mix of cash and stock.
This is what all state websites and applications will look like within three years