ThursdAI - LAION down, OpenChat beats GPT3.5, Apple is showing where it's going, Midjourney v6 is here & Suno can make music!

From ThursdAI - The top AI news from the past week


Length: 82 minutes
Released: Dec 22, 2023
Format: Podcast episode

Description

Hey everyone, happy ThursdAI! As always, here's a list of things we covered this week, including show notes and links, to prepare you for the holidays.

TL;DR of all topics covered:
* Open Source AI
  * OpenChat-3.5-1210 - a top performing open source 7B model from the OpenChat team, beating GPT3.5 and Grok (link, HF, Demo)
  * LAION 5B dataset taken down due to CSAM allegations from Stanford (link, full report pdf)
  * FLASK - New evaluation framework from KAIST, based on skillset (link)
    * Shows a larger difference between open/closed source
  * Open leaderboard reliability issues, vibes benchmarks and more
  * HF releases a bunch of MLX ready models (Llama, Phi, Mistral, Mixtral) (link)
  * New transformer alternative architectures - Hyena & Mamba are heating up (link)
* Big CO LLMs + APIs
  * Apple - LLM in a flash paper is making rounds (AK, Takeaways thread)
  * Anthropic adheres to the messages API format (X)
  * Microsoft Copilot finally has plugins (X)
* Voice & Audio
  * AI Music generation Suno is now part of Microsoft Copilot plugins and creates long beautiful songs (link)
* AI Art & Diffusion
  * Midjourney v6 is out - better text, great at following instructions (link)

Open Source AI
We start today with a topic I didn't expect to be covering: the LAION 5B dataset was taken down after a report from the Stanford Internet Observatory found instances of CSAM (child sexual abuse material) in the vast dataset. The report identified hundreds to thousands of such images, using Microsoft's PhotoDNA to match images by hash within a sample of NSFW-marked images. LAION 5B was used to train Stable Diffusion: versions 1.4 and 1.5 were trained on a lot of images from that dataset, whereas SD2, for example, was trained only on images not marked as NSFW. The report is very thorough, walking through the methodology used to find and check those types of images.
Worth noting that LAION 5B itself is not an image dataset, as it only contains links to images and their descriptions from alt tags. Obviously this is a very touchy topic, given the way this dataset was scraped from the web and how many image models were trained on it. The report doesn't allege anything close to an influence on the models trained on the dataset, and it outlines a few methods of preventing issues like this in the future. One unfortunate outcome of such a discovery is that this type of work can only be done on open datasets like LAION 5B, while closed source datasets don't get anywhere near this level of scrutiny. This can slow down the advancement of open source multi-modal models, while closed source models will continue having these issues and still prevail. The report says they found and validated between hundreds and a few thousand CSAM images, which, considering the size of the dataset, is a tiny fraction. However, it still shouldn't exist at all, and better techniques for cleaning scraped datasets should exist. The dataset has been taken down for now from HuggingFace and other places.

New version of a 7B model that beats ChatGPT from the OpenChat collective (link, HF, Demo)
Friend of the pod Alpay Aryak and team released an update to one of the best 7B models: OpenChat 7B (1210), a new version of one of the top models in the 7B world, with strong scores compared to ChatGPT 3.5 and Grok and very high benchmark results (63.4% on HumanEval, compared to 64% for GPT3.5).

Scrutiny of open source benchmarks and leaderboards being gamed
We've covered state-of-the-art models on ThursdAI, and every time we did, we covered the benchmarks and evaluation scores, whether that's the popular MMLU (Massive Multitask Language Understanding) or HumanEval (Python coding questions). Almost always, we've referred to the HuggingFace Open LLM leaderboard for the latest and greatest models.
This week, a long thread on the Hugging Face forums, which HF eventually had to shut down, alleges that a new contender for the top, without

Titles in the series (58)

Every ThursdAI, Alex Volkov hosts a panel of experts, AI engineers, data scientists and prompt spellcasters on Twitter Spaces, as we discuss everything major and important that happened in the world of AI over the past week. Topics include LLMs, open source, new capabilities, OpenAI, competitors in the AI space, new LLM models, AI art and diffusion, and much more. sub.thursdai.news