Frankly, your statements come across as if you may not have much of an idea about how any of this works. First, it looks like one of the recalcs has gone through, and here's where that team is showing now:
I think that over the next day or so, as it goes through recalcs, Defense will likely start to look more normal as well.
The way SR works is to apply a rating to a team entity based on how well it does against another team entity. If one entity does better than expected, its rating goes up slightly; conversely, if it does worse than expected, its rating goes down slightly. Once a team has enough rated games in its recent history, its rating becomes public and it shows up on the ranked list.

The point of all of this isn't to create a list and make sure the list is "right". It is to predict game performance/outcomes: if Team A plays Team B and Team A has a higher rating than Team B, Team A should win. This can be (and is) checked retrospectively, as often as every week or two: of the last few thousand games, how many predictions came true (the higher-rated team won) and how many didn't? That measure of predictivity is the main objective of the entire system. The higher it gets, the more reliably the individual ratings predict game results, and a downstream result is that putting the ratings in an ordered list shows teams in order from best performing to least performing.

A number to keep in mind from last season: SR was running at over 82% predictivity for all games, which means it was picking about 5 of 6 winners correctly (where there was a winner). If you want more recent #'s, shoot the support email a question; I'm sure they'd be happy to share. The predictivity number of one of the other well-known ranking systems is so far off from this that they should be embarrassed. IMO, the predictivity data should be posted publicly within the app as the number updates over time - but any time it's posted on boards like this or on any of the FB groups, there's so much chatter from people who simply don't understand what it really means that it becomes more frustrating to try to explain than to just leave it unsaid. Ultimately, though, that is the reason people should trust or not trust the predictions, and therefore the ratings. It is measured quite often at the macro level, and I'm sure there are all kinds of tweaks over time as the data shifts to optimize that number (how many weeks of games are relevant, how many games are necessary to be relevant, how much to discount a game 2 months ago vs. 1 month ago, how much to discount a 5-0 win vs. a 3-2 win, and any number of additional factors that someone can adjust if they have access to all of the data).

Another check any user can do is to follow the results for a team they are familiar with. Over time, is it predicting outcomes at the expected rate or not? The good news is that it generally is; that's what the macro numbers show. Seeing that it predicted X number of wins for that team correctly over the last 2 or 3 seasons is also a good indicator and a way to build trust (or not). All of this can also be seen just by looking down the game history and counting how many games were significant overperforms or underperforms against their rating (Green/Red); most teams have fewer of these than one might expect.
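To give a rough sense of the mechanics: SR's actual formula and parameters aren't public, so everything below (the K factor, the rating scale, the recency and margin weights) is an assumption on my part. It's just a minimal sketch of an Elo-style update plus the kind of predictivity check I'm describing, not their implementation.

```python
# Minimal sketch of an Elo-style update and a macro predictivity check.
# All constants here are made up for illustration; SR's real parameters
# and formula are not public.
import math

def expected_score(rating_a: float, rating_b: float, scale: float = 5.0) -> float:
    """Probability that team A beats team B, given their current ratings."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))

def update_ratings(rating_a, rating_b, result_a, goals_a, goals_b,
                   game_date, today, k=0.5):
    """Nudge both ratings after a game.

    result_a: 1.0 if A won, 0.5 for a draw, 0.0 if A lost.
    Older games and narrow wins move the rating less (hypothetical weights).
    """
    exp_a = expected_score(rating_a, rating_b)
    # Hypothetical recency discount: a game loses weight as it ages,
    # dropping to zero after roughly six months.
    age_weeks = (today - game_date).days / 7
    recency = max(0.0, 1.0 - age_weeks / 26)
    # Hypothetical margin weight: a 5-0 win counts more than a 3-2 win.
    margin = 1.0 + math.log1p(abs(goals_a - goals_b))
    delta = k * recency * margin * (result_a - exp_a)
    return rating_a + delta, rating_b - delta

def predictivity(games):
    """Share of decided games in which the higher-rated team won.
    games: iterable of (rating_a, rating_b, result_a) tuples."""
    decided = [(ra, rb, res) for ra, rb, res in games
               if res != 0.5 and ra != rb]
    correct = sum(1 for ra, rb, res in decided
                  if (res == 1.0) == (ra > rb))
    return correct / len(decided) if decided else float("nan")
```

Run something like that `predictivity` function over the last few thousand games every week or two and you get exactly the kind of macro number I'm talking about.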
Now in this particular case, you noticed an outlier - and from time to time these do pop up - but the good news is they seem to be temporary. After a short period of time, when the recalcs run again, it gets normalized. If you look at the first screenshot, this team was a 43.x, and now it is a 41.x. If you look at just the handful of games in their history, and how they performed against other teams and their ratings, back-of-the-napkin math shows that 41.x is probably about right and 43.x makes no sense. I've seen this before: for example, merging a team ranked 15th in state with an entity (holding more of their games) ranked 22nd in state, and the merged team suddenly shows up #1 in state or even #1 nationally. There is some not-yet-understood reason why this happens occasionally and the initial rating comes out inflated. The good news is that it has always turned out to be temporary, and it all gets sorted out with a little patience as it recalcs.
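For what it's worth, the back-of-the-napkin check I mean is nothing fancier than this. The opponent ratings, results, and the size of the win-percentage bump are all invented just to show the shape of the estimate:

```python
# Hypothetical back-of-the-napkin check: does a rating look consistent with
# a team's recent results? Every number below is invented for illustration.
games = [
    # (opponent_rating, result for this team: 1 = win, 0.5 = draw, 0 = loss)
    (40.1, 1.0),
    (42.3, 0.0),
    (39.5, 1.0),
    (41.8, 0.5),
    (40.9, 0.5),
]

avg_opp = sum(rating for rating, _ in games) / len(games)
win_pct = sum(result for _, result in games) / len(games)

# Crude estimate: average opponent rating plus a small bump for playing
# above .500 (the size of the bump is made up).
estimate = avg_opp + (win_pct - 0.5) * 4

print(f"avg opponent {avg_opp:.1f}, win pct {win_pct:.2f}, estimate {estimate:.1f}")
# avg opponent 40.9, win pct 0.60, estimate 41.3 - consistent with 41.x, not 43.x
```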
The math isn't particularly complicated here, but it involves a lot of sometimes pretty iffy data from hundreds of different sources, and it is quite a production they have put together to make it as automated as it is. Whenever I've noticed math or other display errors, I've worked with them to understand the how/what/why - whether it was something I wasn't getting or a bug to be squashed. There have been quite a few tweaks over time to make the displayed data more accurate. As one example, for quite a long time the club ratings didn't quite match what we could compute through the app if we did the same math manually. It turns out there are/were a number of different flags for whether a team rating is trusted enough to show publicly, including one stored against the team that flips once enough games have been played. The team list in a club would show ratings for some teams, but those ratings weren't yet trusted enough to be counted in the club average. Tracking that bug down took weeks, until enough examples appeared to show clearly what was happening, and the club ratings calc was then adjusted to line up with what you'd expect from looking at the ratings of the individual teams.
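To make that mismatch concrete, here's roughly the shape of it. The field names (`rating`, `is_public`), the flags, and the numbers are my own invention, not SR's actual schema or logic:

```python
# Hypothetical illustration of the club-rating mismatch described above.
# Field names, flags, and numbers are invented; SR's real schema isn't public.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Team:
    name: str
    rating: Optional[float]  # shown in the club's team list once calculated
    is_public: bool          # flips only after enough rated games

def naive_club_rating(teams):
    """What a user doing the math by hand would do: average every rating
    that appears in the team list."""
    rated = [t.rating for t in teams if t.rating is not None]
    return sum(rated) / len(rated) if rated else None

def trusted_club_rating(teams):
    """What the calc was actually doing: average only the ratings already
    flagged as trusted/public."""
    trusted = [t.rating for t in teams if t.rating is not None and t.is_public]
    return sum(trusted) / len(trusted) if trusted else None

club = [
    Team("14U Blue", 43.2, True),
    Team("15U Gold", 41.7, True),
    Team("13U White", 38.9, False),  # visible in the list, not yet trusted
]
print(naive_club_rating(club), trusted_club_rating(club))
# The two numbers differ until every visible rating is also trusted,
# which is why the manual math didn't match the displayed club rating.
```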
This one is perhaps even more insidious, but it appears less impactful. For some unknown reason, on rare occasions when data is added manually, the initial calcs show unexpected results. Since this is such a small amount of data compared to the vast quantity that is ingested and matched automatically, it's not a big deal in the scheme of things - rankings and ratings over time do not seem to be affected much at all, and any outliers appear to fix themselves in short order.
IMO, calling this a "major red flag" and wondering who is behind it and how we can trust it, as if it puts much or all of the other data at risk, greatly exaggerates both the actual problem and its impact.