What is your opinion on the current state of AI/LLMs

Chippys_mittens@lemmy.world · edit-2 6 hours ago

What is your opinion on the current state of AI/LLMs

AdamBomb@lemmy.world · 3 minutes ago

They’re useful and getting better, but they’re improving by burning more tokens behind the scenes, and the prices they charge only cover a fraction of the cost. Right now there is no foreseeable path to profitability.

jtrek@startrek.website · 49 minutes ago

It enables unskilled people to punch above their weight class, similar to giving a chainsaw to a toddler.

I’ve used them a little for coding, but it’s not always correct. It’s often incorrect in subtle ways. Or inefficient in non obvious ways. It gets worse as you build more.

Often it’s better overall to do it yourself if you know what you’re doing. If you stick to letting the LLM do it, you won’t learn much.

venusaur@lemmy.world · 3 hours ago

You’re gonna get a very anti-AI bias on here

Chippys_mittens@lemmy.world · 50 minutes ago

That’s fine I figured I would but might learn something regardless

TootSweet@lemmy.world · 2 hours ago

MagicShel@lemmy.zip · 4 hours ago

They are useful. My teams are seeing modest productivity gains by self reporting, but I’m going to give it another six months to see if it shows up in actual metrics.

I’m enthusiastic about AI but I remain skeptical. I don’t mean to always be contrarian but I’m dead in the middle and everyone who says they are great or terrible I tend to offer my experiences in the other direction.

They are not to be trusted to handle customers directly, but they can assist experts when they have to step out of their expertise. For example I can’t write Python, but I’ve been coding for 30 years. I can certainly write some good directions on what needs to be done and I can review code and correct it. So AI has let me write a bunch of complex Python scripts to automate minor parts of my job to let me focus on the hard parts.

For example I can execute GDPR delete requests in a few moments where doing it by hand with Hoppscotch or Postman probably takes me 5-10 minutes. We have a multiple systems and sometimes I have to delete multiple profiles for a given request.

It’s great at rubber ducking as long as you think critically about its proposed solutions. It’s fine at code review before sending it to an actual person for review. It flags non-issues but it also flags a few actionable fixes.

The important thing though is to never trust it when it comes to anything you don’t know about. It’s right a fair amount of the time, depending on what you ask, but it’s wrong enough that you should never, ever rely on it being right about something. The moment you put your life in its hands, it’ll kill you with nothing to say to the survivors but, “Your right about that. Sorry, that was my mistake.” And it isn’t even sincere. Because it can’t be. Because it doesn’t think or feel anything.

statelesz@slrpnk.net · 3 hours ago

Great answer.

zxqwas@lemmy.world · 3 hours ago

It’s useful to churn through a lot of data and do tedious repetitive tasks.

You have to check the results. I’ve had it give me correct results and wrong results compared to known data points.

naught101@lemmy.world · 2 hours ago

Are you talking about churning through quant data with an LLM? Because that sounds like a kind of silly thing to do…

TootSweet@lemmy.world · 5 hours ago

They’re a straight up scam.

Chippys_mittens@lemmy.world · 4 hours ago

How so?

TootSweet@lemmy.world · edit-2 4 hours ago

They just don’t do anything useful, and the hype-ers are acting like they’re AGI. Hallucinations make them too unreliable to be trusted with “real work”, which makes them useless for anything beyond a passing gimmick. Vibe coded software is invariably shit. Doing any serious task with “AI assistance” ends up either taking more work than doing it without LLMs or sacrificing quality or correctness in huge ways. Any time you point this out to hype-ers, they start talking about “as AI advances” as if it’s a foregone conclusion that they will. People talked the same way about blockchain, and the only “advancements” that have been made in that sphere are more grifts, and meanwhile it still takes anywhere between 10 minutes and an hour to buy a hamburger with Bitcoin, and it gets worse with greater adoption. Just like you can’t make a distributed blockchain cryptocurrency that resolves discrepancies automatically without relying on humans fast at scale (and even if you could make it fast, it’d introduce at least as many problems as it purports to “solve”), you can’t make LLMs not hallucinate. The only way to solve hallucinations is by abandoning LLMs in favor of a whole different algorithm.

If anything LLMs have blocked us from making progress toward AGI by distracting us with gimmicky bullshit and taking resources from other efforts which may otherwise have pushed us in the right direction.

Mind you, “AI” is a very old term that can mean a lot of different things. I took a class in college called “Introduction to Artificial Intelligence” in… maybe 2006 or 2007. And in that class, I learned about the A* algorithm. Every time you played an escort mission in Skyrim and had an NPC following you, it was the A* algorithm or some slight variation on it that was used to make sure that NPC could traverse terrain to keep roughly in toe with you despite obstacles of various sorts. It’s absolutely nothing like LLMs. It doesn’t need to be trained. The algorithm fully works the moment it’s implemented. If you want to know why it made a particular decision, you can trace the logic and determine exactly why it did what it did, unlike LLMs. It’s for a few very niche purposes rather than trying to be general purpose like an LLM. It requires no massive data centers and doesn’t consume massive amounts of memory. And it doesn’t hallucinate. The AI hype-ers (and the media who have mostly fallen for their grift hook, line, and sinker) love to conflate completely unrelated technologies to give the impression that LLMs are getting better because such-and-such article mentions an “AI” that discovered a groundbreaking new drug. But the kind of AI they use to find drugs is very special purpose and has nothing to do with how LLMs work.

LLMs can’t do your job, but the grifters are doing a damned good job of convincing your boss that LLMs can in fact do your job. As Cory Doctorow says, the current AI craze “is the asbestos that we’re shoveling into our walls”. We’re causing huge problems with it and if/when the bubble properly pops, we’re going to spend a long time painstakingly extracting it from our systems, replacing it with… you know… stuff that actually works, and repairing the damage it’s done in the meantime.

Meanwhile, it’s Nvidia and OpenAI and so on who are boosting the LLM bubble. And they’ve made a shit ton of money off of their grift at the expense of everyone else. How anyone can look at all this and not think “scam” is beyond me.

LumpyPancakes@piefed.social · 2 hours ago

I have a vague memory that Bitcoin used to be instant in the first versions - or at least with near certainty that the advertised transaction was real, but that the protocol was later modified in such a way that this mechanism was no longer reliable. It might have been enshittified.

AI is still largely affected by garbage in garbage out.

Norin@lemmy.world · 4 hours ago

They’re digital yes men, mostly, and really lack in the nuance when you prompt them to answer on anything you have a deep knowledge of.

Chippys_mittens@lemmy.world · 4 hours ago

Makes sense

CodenameDarlen@lemmy.world · edit-2 5 hours ago

They’re annoying to be honest.

I used Qwen 3.5 for some research a few weeks ago, at first the good thing was every sentence was referenced by a link from the internet. So I naturally thought “well, it’s actually researching for me, so no hallucination, good”. Then I decided to look into the linked URLs and it was hallucinating text AND linking random URL to those texts (???), nothing that the AI outputs was really in the web page that was linked. The subject was the same, output and URLs, but it was not extracting actual text from the pages, it was linking a random URL and hallucinating the text.

Related to code (that’s my area, I’m a programmer), I tried to use Qwen Code 3.5 to vibe code a personal project that was already initialized and basically working. But it just struggles to keep consistency, it took me a lot of hours just prompting the LLM and in the end it made a messy code base hard to be maintained, I asked to write tests as well and after I checked manually the tests they were just bizarre, they were passing but it didn’t cover the use cases properly, a lot of hallucination just to make the test pass. A programmer doing it manually could write better code and keep it maintainable at least, writing tests that covers actual use cases and edge cases.

Related to images, I can spot from very far most of the AI generated art, there’s something on it that I can’t put my finger on but I somehow know it’s AI made.

In conclusion, they’re not sustainable, they make half-working things, it generates more costs than income, besides the natural resources it uses.

This is very concerning in my opinion, given the humanity history, if we rely on half-done things it might lead us to very problematic situations. I’m just saying, the next Chernobyl disaster might have some AI work behind it.

Buckshot@programming.dev · 4 hours ago

Had the same research issue from multiple models. The website it linked existed and was relevant but often the specific page was hallucinated or just didn’t say what it said it did.

In the end it probably created more work than it saved.

Also a programmer and i find it OK for small stuff but anything beyond 1 function and it’s just unmaintainable slop. I tried vibe coding a project just to see what i was missing. Its fine, it did the job, but only if I dont look at the code. Its insecure, inefficient, and unmaintainable.

CodenameDarlen@lemmy.world · 4 hours ago

I agree, I assumed this error was LLM related not Qwen itself. I think LLMs aren’t able to fit the referenced URL within the text extracted from it. They probably do some extensive research (I remember it searched like 20-40 sites), but it’s up to the LLM if it’ll use an exact mention of a given web page or not. So that’s the problem…

Also it’s a complete mess to build frontend, if you ask a single landing page or pretty common interface it may be able to build something reasonable good, but for more complex layouts it’ll struggle a lot.

I think this happens because it’s hard to test interfaces. I never got deep into frontend testing but I know there are ways to write actual visual tests for it, but the LLM can’t assimilate the code and an image easily, we’d need to take constant screenshots of the result, feed it back to the LLM and ask it to fix until the interface matches what you want. We’d need a vision capable mode more a coding one.

I mean you may get good results for average and common layouts, but if you try anything different you’ll see a huge struggle from LLMs.

leoj@piefed.social · 4 hours ago

For context and to your knowledge of the field, is Qwen 3.5 supposed to be cutting edge?

CodenameDarlen@lemmy.world · edit-2 4 hours ago

It’s the best open source model, pretty next to Claude on benchmarks.

TootSweet@lemmy.world · 3 hours ago

Is Qwen really Open Source, or do they just let you download weights? (Like LLaMa.)

CodenameDarlen@lemmy.world · 3 hours ago

Not sure now, but it says Apache 2.0 in their GitHub repo.

WolfLink@sh.itjust.works · 4 hours ago

Qwen 3.5 is one of the best of the open-weight (self-host able) models right now. It’s not as good as some of the extra massive proprietary models like the bigger Claude models.

leoj@piefed.social · 4 hours ago

ah ok, I have some experience hosting Ollama and of course stable diffusion, but haven’t really messed with too many others, thanks for the insight!

WolfLink@sh.itjust.works · 4 hours ago

Qwen 3.5 can be run via ollama

leoj@piefed.social · 4 hours ago

well now I have something to do this weekend if the weather is poor, thank you!

snoons@lemmy.ca · edit-2 5 hours ago

They might be good given time, probably a lot of time, but right now all they’re doing is allowing that well meaning roommate that puts your cast iron in the dishwasher to also ruin Wikipedia articles and fuck up open source projects.

statelesz@slrpnk.net · 3 hours ago

Today I tasked Gemini Pro to assist me code a quite simple web GUI in Python using NiceGUI and besides somewhat doing what I asked it to do it also added a bunch of childish emojis to buttons and removed my name from the project and replaced it with ‘admin’. This is a real tool that I develop for a hand full of my very real coworkers and my boss is paying Google for this shit. Next time I much rather give the task to one of our apprentices and point them to the docu then having a supposedly ‘Pro’ model do random shit I haven’t asked it to do.

alternategait@lemmy.world · 2 hours ago

One of my hobbies seems like it should be be easy for AI to move into. Crochet patterns are often fairly mathematically based. However every generated pattern I’ve looked at has been nonsensical garbage. Images that look cute and plausible at first pass are basically imagination. It used to be really easy to spot but it is getting harder/more subtle.

I used to work in healthcare and when I first got access to chat GPT I asked it some simple differential diagnosis questions. It gave a few very common possibilities for the symptoms I gave but completely ignore an important red flag that even a first year med student should (probably would) keep in mind.

alternategait@lemmy.world · 2 hours ago

So basically if I ignored all impacts of using it I would be ok with it as a first pass at some sort of information gathering if I was willing to check deeper.

When I used it, I asked things like “change the tone of this writing” or “make these bullet points include key words to match this job post”.

That said, it does have impacts and I’m not willing to incur them for my passing curiosities or to speed up job applications.

stoy@lemmy.zip · 4 hours ago

I hate it.

I am an IT guy, and AI has just about killed my enthusiasm for tech, I made a post about it a month or two ago, and it is still valid.

kbal@fedia.io · 5 hours ago

Given enough time and research it won’t be too many more years before they’re ready for production use. Of course that use will probably be mass surveillance and suppression of dissent.

Chippys_mittens@lemmy.world · 4 hours ago

Yeah mass surveillance seems likely

nocturne@slrpnk.net · 4 hours ago

From what I can tell, they are best used to plan mass shootings, or your suicide.

theherk@lemmy.world · 5 hours ago

More capable than the crowd here lets on. My take is like this, unchecked capitalism is a danger to mankind. The pervasiveness of LLM’s right now is just a symptom of that. The rich are the problem, not the AI.

It is a tool; a very good one along many axes. I think people that think it isn’t good for writing code are misinformed or intentionally disingenuous. It is extremely good at that, but it is just a tool not a replacement.

But it is the applications in pure maths, virology, protein folding, etc. where it gets really interesting.

Water consumption, power consumption, and profit motives aside, they are fascinating tools.

That said, If Anyone Builds It, Everyone Dies is a fascinating take on how this could all go wrong.

In any case, I can’t understand the people that say stuff like, “It is just autocomplete on steroids,” or “it is just a probabilistic prediction tool.” Okay, but like… that’s all we are too.

Summary, interesting tools being used for profit at the expense of economies, the environment, and creative fields.

Elting@piefed.social · 4 hours ago

They literally can’t do pure math. Like everyone knows how bad they are at even simple math. We have had tools that do pure math for thousands of years, and we call them calculators. A hotbox for an imaginative mathematician? Sure, but any conclusions drawn get drawn elsewhere with more traditional tools.

queerlilhayseed@piefed.blahaj.zone · 1 hour ago

I hear this criticism of LLMs all the time and I just don’t get it. They’re language models, they take language inputs and produce language outputs. They aren’t designed to do math. It’s like complaining that a reciprocating saw can’t do math.

theherk@lemmy.world · 4 hours ago

There is active research right now for their use in pure maths. I don’t think it is primarily about direct solutions, but in program synthesis for formal logic. Keep in mind this isn’t just LLM’s, but also graph networks and other non-transformer networks.

Catoblepas@piefed.blahaj.zone · 4 hours ago

Okay, but like… that’s all we are too.

Whoever told you that was lying to you or misinformed. Neuroscientists do not look at the brain as a probabilistic prediction tool. You are not a database with weights, you’re a human being with experiences, emotions, and thoughts.

theherk@lemmy.world · 4 hours ago

We are nearly precisely that. The brain functions as a massive, self-organizing neural network where cognitive architecture is determined by the strength of connections (the biological equivalent of adjustable computational weights) that modulate signal transmission via the flow of ions.

Every decision made or breath taken is the outcome of how ions flow through this network.

Catoblepas@piefed.blahaj.zone · 4 hours ago

Let me know when you find a neurologist that says brains are just like LLMs.

theherk@lemmy.world · 4 hours ago

That isn’t likely to happen. Fortunately, neither have I said that. But a pithy comeback won’t change the accuracy of the brain being a self-assembling probabilistic network. All your memories, experiences, and emotions are part of that.

Catoblepas@piefed.blahaj.zone · 4 hours ago

Rewording a description of what an LLM is and saying brains are just like that is still saying that brains work like LLMs, even if you didn’t use those exact words. The acknowledgment that neurologists do not find evidence to support that is pretty much all that is necessary to tear that down, no matter how many times you repeat it.

theherk@lemmy.world · 4 hours ago

If I say “A screwdriver is a tool,” and “The brain is a tool,” am I then saying “The brain is just like a screwdriver”? Or is it possible that applying seconding order logic to an admittedly and clearly reductive statement I made isn’t productive?

And which part of the brain description is inaccurate, specifically?

Upgrayedd1776@sh.itjust.works · 4 hours ago

pithy hottakes is 90% of ai criticism