The first post on Above the Grid. Notes from a 7,000-foot off-grid build in New Mexico.

A few weeks ago I had what felt like a great solar day. My system (a Victron MPPT 450/100 driving a 32 kWh LiFePO4 battery bank, inverted through a Schneider Conext XW+) pulled down 20.7 kWh. I used maybe 10 kWh that day. That’s the kind of ratio you want to see if you’re trying to prove out an off-grid lifestyle.

My first instinct was to anchor the “how much solar do I eventually need?” question to that day. If I could consistently hit 150% of daily usage, I figured, I’d have enough buffer for cloudy stretches and could eventually scale up to electrified heating and hot water. It felt reasonable. It was almost completely wrong.

The problem with averages (and ratios)

The 150% rule has a nice round feel to it, but it falls apart under any real scrutiny. For one, that 20.7 kWh “great day” was in April, not December, when production up here can drop to 50-60% of peak. My 10 kWh usage that day was also propane-assisted. Once I electrify heating and hot water, winter daily loads could realistically hit 25-40 kWh at exactly the time solar is at its weakest.
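
Put numbers on that with the ranges above: 20.7 kWh × 0.5–0.6 is roughly 10–12 kWh of December production, against 25–40 kWh of electrified winter load. The array that covered about 200% of usage on that April day covers maybe 25–50% of usage in the worst month. The 150% rule wasn’t just optimistic. It was anchored to the wrong month.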

Any sizing rule anchored to best-days production ignores the bottom of the curve. And the bottom of the curve is what determines whether you stay off-grid in February.

But even that critique doesn’t go deep enough, because it turned out my daily yield numbers themselves were lying to me.

The clipping problem

Here’s what I hadn’t internalized: in an off-grid system, your solar charge controller can only push as much energy into the batteries as the batteries can accept. When the battery hits 100%, production just… stops. The panels are capable of producing more. The weather is fine. But with nowhere to send the electrons, the controller curtails.

This is “clipping,” and it’s invisible in raw yield data.
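
A toy model makes the constraint concrete. This isn’t my controller’s actual firmware logic, just the energy balance it’s stuck with, and every number below is made up:

    # An off-grid charge controller can only deliver what the loads
    # draw plus what the battery can still absorb, no matter what
    # the array is capable of at that moment.
    def delivered_wh(pv_potential_wh, load_wh, battery_headroom_wh):
        return min(pv_potential_wh, load_wh + battery_headroom_wh)

    # Mid-morning, plenty of headroom: the sun is the limit.
    print(delivered_wh(3000, 400, 5000))  # -> 3000
    # Early afternoon, bank full: yield collapses to the house load.
    print(delivered_wh(3000, 400, 0))     # -> 400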

The day after my 20.7 kWh high, I logged 9.9 kWh. My first read was “must have been cloudy.” It wasn’t. I had started the day with my batteries at 73% instead of 60%. That 13-point head start meant I hit 100% earlier in the afternoon, which meant the controller spent hours curtailing, which meant the logged yield was dramatically lower than what the array could actually have produced. The weather hadn’t changed. My storage capacity had. And the effect compounds: 13 points on a 32 kWh bank is only about 4.2 kWh of lost headroom, but once the bank tops out, every watt-hour of potential production beyond the house load is thrown away for the rest of the day. That’s how a 4 kWh head start turned into a 10.8 kWh gap in logged yield.

So now the “target 150% of usage” rule has two problems stacked on top of each other:

  1. It ignores seasonal worst-case scenarios.
  2. On good days, measured production isn’t a ceiling at all. It’s a floor on what the array could have done, set by storage capacity.

The real question isn’t “how much solar can my array produce in a day?” It’s “how much solar can my array produce on a day where storage isn’t the bottleneck, during the worst month of the year?” That’s a fundamentally different number, and I didn’t have the data to calculate it.

Building the data foundation

The fix isn’t a better rule of thumb. It’s infrastructure for capturing the right data so that future sizing decisions can be anchored in reality. I spent a few evenings setting this up in Home Assistant, which is where my Victron gear already reports via Venus OS and MQTT.
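
Home Assistant’s MQTT integration handles the actual ingestion, but the raw feed is worth eyeballing directly at least once. A minimal sketch with paho-mqtt, assuming Venus OS’s dbus-mqtt topic layout of N/<portal_id>/<service>/<instance>/<path> with JSON payloads; the broker host and portal ID here are placeholders:

    import json
    import paho.mqtt.client as mqtt

    PORTAL_ID = "0123456789ab"  # placeholder; use your VRM portal ID
    TOPIC = f"N/{PORTAL_ID}/solarcharger/+/Yield/Power"  # PV watts per charger

    def on_message(client, userdata, msg):
        value = json.loads(msg.payload).get("value")
        print(msg.topic, value, "W")

    # paho-mqtt 2.x API. Note that Venus OS expects periodic keepalive
    # publishes or it stops updating topics; see Victron's dbus-mqtt docs.
    client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
    client.on_message = on_message
    client.connect("venus.local", 1883)  # the GX device's built-in broker
    client.subscribe(TOPIC)
    client.loop_forever()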

The core metrics I’m now capturing daily:

  • Solar yield, per tracker, so I can compare strings against each other.
  • Consumption, derived from solar minus net battery change (sketch after this list). I don’t have a direct AC meter yet, but the math is close enough on daily totals.
  • Battery SoC curve: not just end-of-day, but peak SoC reached and total minutes spent at full.
  • Charge stage timing (bulk, absorption, float), so I can see how the controller actually behaved.
  • Peak instantaneous power, to flag days where the array cleared a new ceiling.
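
The consumption figure is plain energy balance. A sketch with made-up numbers, assuming my current 32 kWh bank; it ignores charge and inverter losses, which is part of why it’s only trustworthy on daily totals:

    BANK_KWH = 32.0  # current bank capacity

    def daily_consumption_kwh(solar_kwh, soc_start_pct, soc_end_pct,
                              capacity_kwh=BANK_KWH):
        # Whatever the array produced that didn't end up changing the
        # battery's state of charge must have gone to loads.
        net_into_battery = (soc_end_pct - soc_start_pct) / 100 * capacity_kwh
        return solar_kwh - net_into_battery

    # Hypothetical day: 18 kWh of yield, battery climbed from 62% to 88%.
    print(daily_consumption_kwh(18.0, 62, 88))  # -> 9.68 kWh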

The two most valuable derivations from all of this:

Clipping detection. Any day where peak SoC hit 100% and the battery stayed at ≥99.5% for more than 30 minutes gets flagged. These days don’t represent array capacity. They represent storage capacity. When I do worst-month analysis later, I have to filter them out. Otherwise I’d be optimizing against a constraint I’m about to fix (my planned 64 kWh battery expansion).
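
In code, the flag is a single pass over the day’s SoC samples. A sketch, assuming chronological (timestamp, soc) pairs pulled from HA history:

    from datetime import timedelta

    def is_clipped_day(soc_samples, threshold=99.5,
                       min_duration=timedelta(minutes=30)):
        # soc_samples: chronological (datetime, soc_pct) pairs for one day.
        if not any(soc >= 100 for _, soc in soc_samples):
            return False  # never ceilinged; the day is sizing-relevant
        time_at_full = timedelta()
        for (t0, soc0), (t1, _) in zip(soc_samples, soc_samples[1:]):
            if soc0 >= threshold:
                time_at_full += t1 - t0
        return time_at_full > min_duration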

Unclipped-day yield. Days where the battery never ceilinged are the days that actually show what the array can do. Those are the sizing-relevant data points. Everything else is noise until I solve the storage bottleneck.
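
With the flag in place, the sizing dataset is one comprehension; daily_yield and soc_by_day here are hypothetical per-day mappings built from the logged metrics:

    # Only unclipped days say anything about the array itself.
    unclipped_yield = {day: kwh for day, kwh in daily_yield.items()
                       if not is_clipped_day(soc_by_day[day])}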

Persisting the data

I log in three places, each serving a different purpose:

  • Home Assistant long-term statistics: continuous, automatic, indefinite retention. This is the source of truth. Every sensor with the right state_class gets its hourly aggregates stored forever. Detailed state history is pruned after ten days by default, but the aggregates survive.
  • HA logbook entries: a nightly automation at 23:55 writes a human-readable summary attached to the yield sensor (sketch after this list). Scrollable daily audit trail inside HA.
  • Notion database: a queryable mirror, synced on demand. Good for sitting down with a coffee and looking at trends, adding notes, tagging weird days.
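
The nightly summary lives in an HA automation, but the same entry can be written from anywhere through the REST API’s logbook.log service. A sketch; the host, token, and entity name are placeholders from my setup, not defaults:

    import requests

    HA_URL = "http://homeassistant.local:8123"
    TOKEN = "..."  # a long-lived access token

    def log_daily_summary(yield_kwh, used_kwh, clipped):
        requests.post(
            f"{HA_URL}/api/services/logbook/log",
            headers={"Authorization": f"Bearer {TOKEN}"},
            json={
                "name": "Daily solar summary",
                "message": f"{yield_kwh:.1f} kWh produced, "
                           f"{used_kwh:.1f} kWh used"
                           + (" [clipped]" if clipped else ""),
                "entity_id": "sensor.solar_yield_today",  # placeholder
            },
            timeout=10,
        )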

The Notion side requires occasional manual syncing, but since HA preserves everything indefinitely, I can backfill any day retroactively without losing data. Missing a sync doesn’t cost me anything. It just delays when I look at the numbers.
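
The sync itself is small. A sketch with the official notion-client package, against a database whose property names (Name, Yield, Consumption, Clipped) are just the ones I happen to use:

    from notion_client import Client

    notion = Client(auth="secret_...")  # integration token placeholder
    DATABASE_ID = "..."                 # the mirror database's ID

    def push_day(date_iso, yield_kwh, used_kwh, clipped):
        notion.pages.create(
            parent={"database_id": DATABASE_ID},
            properties={
                "Name": {"title": [{"text": {"content": date_iso}}]},
                "Yield (kWh)": {"number": round(yield_kwh, 1)},
                "Consumption (kWh)": {"number": round(used_kwh, 1)},
                "Clipped": {"checkbox": clipped},
            },
        )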

A small caveat worth knowing

While setting this up, I discovered that Victron’s own “time in bulk/absorption/float” counters, as surfaced through the MQTT integration, are broken. The values they report are off by roughly a factor of sixty, which smells like a seconds-versus-minutes mix-up somewhere in the chain. I wasted a couple hours assuming my system was misbehaving before I realized the data was just wrong.

The fix was to compute stage timing myself using HA’s history_stats platform on the verified-accurate solarcharger_state entity. That sidestep is the kind of thing I expect to surface more often as I go: integrations promise clean data, the data turns out to be partial or buggy, and you end up deriving what you need from more fundamental signals.
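
Conceptually, history_stats is just summing time-in-state across the day, which also makes it easy to cross-check against raw state history by hand:

    def seconds_in_state(history, target):
        # history: chronological (datetime, state) pairs for one day,
        # e.g. the solarcharger_state entity's values.
        total = 0.0
        for (t0, s0), (t1, _) in zip(history, history[1:]):
            if s0 == target:
                total += (t1 - t0).total_seconds()
        return total

    # Compare seconds_in_state(day_history, "bulk") against Victron's
    # own counter: if the ratio is ~60, the counter is the liar.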

Good for a blog post. Less good when you’re trying to make a decision.

What this enables

The short version: I now have a path to answering the sizing question with actual data instead of rules of thumb.

By next December I’ll have a full year of production data, clipping-filtered, with seasonal context. I’ll know the real worst-month ceiling. I’ll know how often I clip in summer, which tells me exactly how much additional storage will pay off. I’ll know what load profile the house actually runs at.

That lets me answer the question that matters for my eventual expansion: not “what’s a reasonable solar target” but “given my demonstrated load curve and my demonstrated worst-month production, what array size hits 120% of worst-month load with enough margin to ride out three to five consecutive bad-weather days?”
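
Written out, that target is two small formulas. The four-day default and the 0.9 usable-fraction are my placeholder assumptions, not derived values:

    def target_array_kw(worst_month_load_kwh_day,
                        worst_month_yield_kwh_day_per_kw):
        # Array sized so demonstrated worst-month production covers
        # 120% of demonstrated worst-month load. The per-kW yield
        # figure comes from the unclipped-day data.
        return 1.2 * worst_month_load_kwh_day / worst_month_yield_kwh_day_per_kw

    def min_bank_kwh(daily_load_kwh, bad_day_yield_kwh,
                     bad_days=4, usable_fraction=0.9):
        # Storage to bridge consecutive bad-weather days on which the
        # array still produces a little but can't cover the load.
        deficit = max(daily_load_kwh - bad_day_yield_kwh, 0.0)
        return bad_days * deficit / usable_fraction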

That’s a number I can spend money against.

The lesson, if there is one: when you’re planning an off-grid system, the first investment isn’t panels or batteries. It’s measurement.


Next post: how I pulled my EV charger into the same data pipeline, why that took longer than it should have, and why separating household load from EV load changed what my data was actually telling me.