Very inconsistent machine learning model training

Charlie Fish@eventfrontier.com · 4 days ago

This is not a “mistake”. This clearly proves they have Apple TV app integration implemented (just turned off). And someone accidentally turned it on.

But they have clearly put effort and work into adding this functionality.

New functionality doesn’t just happen by mistake.

Charlie Fish@eventfrontier.com · 19 days ago

Got it. Thanks so much for your help!! Still a lot to learn here.

Coming from a world of building software where things are very binary (it works or it doesn’t), it’s also really tough to judge how good is “good enough”. There is a point of diminishing returns, and I’m not sure at what point to say it’s good enough vs. continuing to learn and improve it.

Really appreciate your help here tho.
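One common way to turn “good enough” into a rule is to stop once the validation metric has stopped improving by a meaningful amount. The sketch below is plain Python and only illustrates the idea; the `patience` and `min_delta` values are assumptions chosen for the example, and the score lists are made up.

```python
def should_stop(val_scores, patience=3, min_delta=0.001):
    """Return True once `patience` epochs in a row fail to beat the best score.

    val_scores: validation metric per epoch, where higher is better.
    min_delta:  smallest gain that still counts as an improvement.
    """
    best = float("-inf")
    epochs_since_best = 0
    for score in val_scores:
        if score > best + min_delta:
            best = score
            epochs_since_best = 0
        else:
            epochs_since_best += 1
        if epochs_since_best >= patience:
            return True
    return False


# Made-up validation accuracies for illustration only.
improving = [0.60, 0.65, 0.70, 0.74]
plateaued = [0.60, 0.70, 0.7004, 0.7001, 0.7003]

print(should_stop(improving))
print(should_stop(plateaued))
```

Keras ships the same idea as `keras.callbacks.EarlyStopping`, which takes `monitor`, `patience`, and `min_delta` arguments, so in a real training script the callback is probably the better choice than a hand-rolled check.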

Charlie Fish@eventfrontier.com · 21 days ago

So someone else suggested reducing the learning rate. I tried that, and at least to me it looks a lot more stable between runs. All the code is my original code (none of the suggestions you made), but I reduced the learning rate from 0.0001 to 0.00001.

Not quite sure what that means exactly tho. Or if more adjustments are needed.

As for the confusion matrix, I think the issue is the difference between the smoothed values in TensorBoard and the actual values. I just ran it again with the previous values to verify, and it does match up if you look at the actual value instead of the smoothed value.
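To see why a smoothed curve can disagree with the raw numbers, here is a minimal sketch of a debiased exponential moving average, which is roughly what a chart smoothing slider does. This is an approximation for illustration, not TensorBoard's actual code; the `weight` of 0.6 and the accuracy values are assumptions.

```python
def smooth(values, weight=0.6):
    """Debiased exponential moving average over a list of scalars."""
    smoothed = []
    last = 0.0
    for i, value in enumerate(values, start=1):
        last = last * weight + (1 - weight) * value
        # Divide by (1 - weight**i) so early points are not pulled toward 0.
        smoothed.append(last / (1 - weight ** i))
    return smoothed


# Made-up per-epoch accuracies for illustration only.
raw_accuracy = [0.50, 0.90, 0.60, 0.95]
smoothed_accuracy = smooth(raw_accuracy)

for raw, smoothed_value in zip(raw_accuracy, smoothed_accuracy):
    print(f"raw={raw:.2f} smoothed={smoothed_value:.3f}")
```

The last raw value here is 0.95 while the smoothed one is noticeably lower, because the smoothed point still carries weight from the earlier epochs. A confusion matrix is computed from the actual predictions, so it should be compared against the raw value.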

Charlie Fish@eventfrontier.com · 27 days ago (edited)

Sorry for the delayed reply. I really appreciate your help so far.

Here is the raw link to the confusion matrix: https://eventfrontier.com/pictrs/image/1a2bc13e-378b-4920-b7f6-e5b337cd8c6f.webm

I changed it to keras.layers.Conv2D(16, 10, strides=(5, 5), activation='relu'). Dense units still at 64.

And in case the confusion matrix still doesn’t work, here is a still image from the last run.

EDIT: The wrong image was uploaded originally.
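For a sense of what `Conv2D(16, 10, strides=(5, 5))` does to the image, the arithmetic can be sketched in plain Python. The input size and channel count are not stated in the thread, so `IMAGE_SIZE = 256` and `IN_CHANNELS = 3` below are assumptions; the formula applies to the default `'valid'` padding.

```python
def conv2d_output_size(input_size, kernel_size, stride):
    """Output length along one axis for a 'valid' (no padding) convolution."""
    return (input_size - kernel_size) // stride + 1


def conv2d_param_count(kernel_size, in_channels, filters):
    """Trainable parameters: kernel weights plus one bias per filter."""
    return (kernel_size * kernel_size * in_channels + 1) * filters


IMAGE_SIZE = 256  # assumption: square input, size not given in the thread
IN_CHANNELS = 3   # assumption: RGB input

side = conv2d_output_size(IMAGE_SIZE, kernel_size=10, stride=5)
params = conv2d_param_count(kernel_size=10, in_channels=IN_CHANNELS, filters=16)

print(f"output feature map: {side}x{side}x16, parameters: {params}")
```

Under those assumptions a stride of 5 shrinks each side by roughly a factor of five, which also shrinks whatever gets flattened into the 64-unit dense layer.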

Charlie Fish@eventfrontier.com · 30 days ago

Ok I changed the Conv2D layer to be 10x10. I also changed the dense units to 64. Here is just a single run of that with a Confusion Matrix.

I don’t really see a bias towards non-blurred images.
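One way to check for a bias toward one class is to read per-class recall off the confusion matrix. A minimal sketch in plain Python follows; the labels are made up for illustration, and the encoding (0 = not blurred, 1 = blurred) is an assumption.

```python
def confusion_matrix(y_true, y_pred, num_classes=2):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


def per_class_recall(matrix):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


# Made-up labels for illustration: 0 = not blurred, 1 = blurred.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0, 1]

matrix = confusion_matrix(y_true, y_pred)
recall = per_class_recall(matrix)

print(matrix)
print(recall)
```

If the model leaned toward non-blurred images, the recall for class 0 would be clearly higher than for class 1. Similar recalls, as in this toy data, suggest no strong bias either way.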

Charlie Fish@eventfrontier.com · 30 days ago

So does the fact that they aren’t converging near the same point indicate there is a problem with my architecture and model design?

Charlie Fish@eventfrontier.com · 1 month ago

Got it. I’ll try with some more values and see what that leads to.

So does that mean my learning rate might be too high and it’s overshooting the optimal solution sometimes based on those random weights?
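The overshooting idea can be shown on a toy problem. This sketch runs plain gradient descent on f(x) = x², whose minimum is at 0; it is not the model from the thread, and both learning rates are chosen only to make the effect visible.

```python
def gradient_descent(learning_rate, start=1.0, steps=20):
    """Minimise f(x) = x**2 (gradient 2x) and return the final x."""
    x = start
    for _ in range(steps):
        x -= learning_rate * 2 * x
    return x


small_step = gradient_descent(0.1)  # each step multiplies x by 0.8
large_step = gradient_descent(1.1)  # each step multiplies x by -1.2

print(f"lr=0.1 -> x={small_step:.4f}")
print(f"lr=1.1 -> x={large_step:.4f}")
```

With the small rate, x shrinks steadily toward the minimum. With the large rate, every step jumps past the minimum and lands further away on the other side. In a real network the starting point comes from the random weights, so a rate that is borderline too high can behave well on one run and badly on the next.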

Charlie Fish@eventfrontier.com · 1 month ago

I think what you’re referring to with iterating through algorithms and such is called hyperparameter tuning. I think there is a tool called Keras Tuner you can use for this.

However, I’m incredibly skeptical that will work in this situation because of how variable the results are between runs. I run it with the same input, same code, everything, and get wildly different results. So I think for tuning to be effective, the results need to be fairly consistent between runs.

I could be totally off base here tho. (I haven’t worked with this stuff a ton yet).
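One way to make comparisons meaningful despite run-to-run variance is to repeat each configuration over several fixed seeds and compare the mean and spread instead of single runs. The sketch below is plain Python; `fake_training_run` is a hypothetical stand-in that returns a made-up accuracy, not a real training loop.

```python
import random
import statistics


def fake_training_run(seed):
    """Hypothetical stand-in for a real training run.

    Returns a made-up validation accuracy that depends only on the seed,
    mimicking how random weight initialisation changes the outcome.
    """
    rng = random.Random(seed)
    return 0.75 + rng.uniform(-0.10, 0.10)


def summarize_runs(train_fn, seeds):
    """Run train_fn once per seed and return (mean, standard deviation)."""
    scores = [train_fn(seed) for seed in seeds]
    return statistics.mean(scores), statistics.stdev(scores)


mean_acc, std_acc = summarize_runs(fake_training_run, seeds=range(5))
print(f"mean={mean_acc:.3f} std={std_acc:.3f}")
```

In a real Keras script, calling `keras.utils.set_random_seed(seed)` before building the model should make the weight initialisation and shuffling repeatable for a given seed, though GPU operations can still introduce small differences.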

Charlie Fish@eventfrontier.com · 1 month ago

Thanks so much for the reply!

“The convolution size seems a little small”

I changed this to 5 instead of 3, and it’s hard to tell if that made much of an improvement. It is still pretty inconsistent between training runs.

“If it doesn’t I’d look into reducing the number of filters or the dense layer. Reducing the available space can force an overfitting network to figure out more general solutions”

I’ll try reducing the dense layer from 128 to 64 next.

“Lastly, I bet someone else has either solved the same problem as an exercise or something similar and you could check out their network architecture to see if your solution is in the ballpark of something that works”

This is a great idea. I did a quick Google search and nothing stood out to start. But I’ll dig deeper.

It’s still super weird to me how variable it can be with zero changes. I don’t change anything, and on one run it consistently improves for a few epochs, while on the next run it’s a lot less accurate to start and declines after the first epoch.

Charlie Fish@eventfrontier.com · 1 month ago

Very inconsistent machine learning model training

Charlie Fish@eventfrontier.com · 2 months ago

That’s attached to the instance? Do you have a screenshot maybe?

Charlie Fish@eventfrontier.com · 2 months ago

What is the error that you get?

Charlie Fish@eventfrontier.com · 2 months ago

📸 Image Post Support - Echo v1.5

Charlie Fish@eventfrontier.com · 3 months ago

Yes. It just will fill your feed with a bunch of things you might not care about. But admin vs non admin doesn’t matter in the context of what I said.

Charlie Fish@eventfrontier.com · 3 months ago

Your instance is the one that federates. However it starts with a user subscribing to that content. Your instance won’t federate normally without user interaction.

Normally the solution for the second part is relays. But that isn’t something Lemmy supports currently. This issue is very common with smaller instances. It isn’t as big of a deal with bigger instances since users are more likely to have subscribed to more communities that will automatically be federated to your instance. You could experiment with creating a user and subscribing to a bunch of communities so they get federated to your instance.

Charlie Fish@eventfrontier.com · 3 months ago

It’s not really any different than hosting any other service.

Charlie Fish@eventfrontier.com · 3 months ago

I was lucky to get in during the early days, when posting Mastodon handles on Twitter was common, so I was able to migrate easily. But I feel like this is a problem with ActivityPub right now. Discovery algorithms can be awful in the timeline, but so useful for finding people/communities to follow.

Charlie Fish@eventfrontier.com · 3 months ago

Yep just saw that too after I researched it a bit more. What is strange is I don’t remember Eve Energy having a firmware update since then. Makes me wonder if they had it ready to go in previous firmware versions based on internal specs they saw? Or maybe I just forgot about a firmware update I did.

Charlie Fish@eventfrontier.com · 3 months ago

Hurricane-stricken Tampa Bay Rays to play 2025 season at Yankees' spring training field in Tampa

Charlie Fish@eventfrontier.com · 3 months ago (edited)

“but as the Matter standard doesn’t yet support energy monitoring, users are limited to basic features like on and off and scheduling”

- from this link

Granted the article is almost a year old. But I just didn’t realize that Matter now supports energy monitoring. Somehow I just missed that news.

Charlie Fish@eventfrontier.com · 3 months ago

Eve Energy smart plugs transmit Energy information via Matter

Charlie Fish@eventfrontier.com · 3 months ago

Multiple Account Support is here! - Echo 1.4

Charlie Fish@eventfrontier.com · 3 months ago

Apple teams up with airlines for new ‘Share Item Location’ AirTags feature in iOS 18.2 - 9to5Mac

Charlie Fish@eventfrontier.com · 3 months ago

I know I’m not necessarily the target audience for this. But it feels too expensive. 6x the price of Cloudflare R2, almost 13x the price of Wasabi. Even iCloud storage is $0.99 for 50 GB with a 5 GB free tier. But again, I know I’m not necessarily the target audience as I have a lot of technical skills that maybe average users don’t have.

If you ever get around to building an API, and are interested in partnerships, let me know. Maybe there is a possibility for integration into [email protected] 😉.

Charlie Fish@eventfrontier.com · 4 months ago

This worked!!! However, it now looks like I have to pass in 32 (batch size) comments in order to run a prediction in Core ML? Kinda strange when I could pass a single string to TensorFlow to run a prediction.

Also it seems to be much slower than my Create ML model I was playing with. Went from 0.05 ms on average for the Create ML model to 0.47 ms on average for this TensorFlow model. Looks like this TensorFlow model also is running 100% on the CPU (not taking advantage of GPU or Neural Engine).

Obviously there are some major advantages to using TensorFlow (i.e., I can run in a server environment, I can better control stopping training early based on that val_accuracy metric, etc.). But Create ML seems to really win in other areas, like being able to pass in a simple string (and not having to worry about tokenization), not having to pass in 32 strings in a single prediction, and the performance.

Maybe I should lower my batch_size? I’ve heard there are pros and cons to lowering & increasing batch_size. Haven’t played around with it too much yet.

Am I just missing something in this analysis?

I really appreciate your help and advice!
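If the exported model really does insist on a fixed batch of 32, one possible workaround is to pad a single input up to the full batch and keep only the first result. The sketch below is plain Python and only shows the shape of that workaround; `fake_predict_batch` is a hypothetical stand-in for the real model call, and nothing here is Core ML or TensorFlow API.

```python
BATCH_SIZE = 32  # fixed batch size the exported model expects, per the thread


def predict_one(predict_batch, item, pad_item, batch_size=BATCH_SIZE):
    """Predict for a single item using a model that only accepts full batches."""
    batch = [item] + [pad_item] * (batch_size - 1)
    # Only the first result belongs to the real input; the rest is padding.
    return predict_batch(batch)[0]


def fake_predict_batch(batch):
    """Hypothetical stand-in for the real model: rejects partial batches."""
    if len(batch) != BATCH_SIZE:
        raise ValueError(f"expected {BATCH_SIZE} items, got {len(batch)}")
    return [len(text) for text in batch]


result = predict_one(fake_predict_batch, "a single comment", pad_item="")
print(result)
```

Padding wastes 31 predictions per call, so it would not help the latency numbers above. The training `batch_size` and the batch shape baked into the exported model are separate things, so re-exporting with a batch dimension of 1 (or a flexible one) is probably the cleaner fix if the converter allows it.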