Forum Discussion

Candyman999
Novice III
6 months ago

Snapshots and growth

I have a question about snapshot growth and retention.

Last week we had 14 days worth of snapshots and due to some storage growth, we changed this to 7 days worth of snapshots. 

Before the change was made, snapshots were taking up about 21 TB of space; after the change that number is around 10 TB. This reduction was larger than expected. We expected around a 5 TB reduction, which we got by simply adding up the sizes of the day 8-14 snapshots. The other 6 TB of the reduction came from the most recent snapshot, which at the time was 11 TB in size and is now down to around 5 TB.

Does anybody know why the most current snapshot also had a large reduction after making this change? We are trying to figure out future growth including snapshot growth.


5 Replies

  • Garry
    Day Hiker III

    So you are telling me that there was 6 TB of shared and deduped data in the most recent snapshot that was referenced in the snapshots from days 8 - 13?

    I believe the answer is YES

    Your original statement has a clue in it:

    "Last week we had 14 days worth of snapshots and due to some storage growth, we changed this to 7 days worth of snapshots." I believe you had some storage growth due to an unexpected change rate, which got captured in your "older" snapshots.

    Those ~6 TB were blocks whose last references lived in the older snapshots. When you deleted days 8–14:

    • Those blocks either became unreferenced (freed) or coalesced/deduped with active/newer data.
    • The array’s “space that would be freed if you delete the most-recent snapshot” metric shrank, from ~11 TB to ~4.6 TB.

    Scenario

    • Daily snapshot at 00:00.
    • A 6 TB temp dataset exists Day 1–7, then is deleted on Day 8.
    • Other normal churn adds ~4–5 TB across the week.

    Before: 14-day retention (S1–S14 kept)

    • The array must retain those 6 TB blocks because S1–S7 still reference them.
    • Due to how “would-free” is attributed, a large chunk of that 6 TB can be charged to the newest snapshot’s metric (e.g., S14 shows ~11 TB would-free).

    After: cut to 7-day retention (keep S8–S14; delete S1–S7)

    • All references to the 6 TB dataset are gone (they only lived in S1–S7).
    • Those blocks are fully freed.
    • The “most-recent” snapshot’s would-free drops (e.g., from ~11 TB → ~4.6 TB), even though S8–S14 didn’t “contain” that data. The accounting changed because the old snapshots that forced the blocks to exist no longer do.

    Takeaway: Snapshot sizes are interdependent. Deleting older snapshots can shrink the reported size of the newest snapshot when those older snapshots held the last references to big, now-deleted data. For forecasting, model daily unique churn (including short-lived datasets) rather than summing per-snapshot numbers.
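    The scenario above can be sketched as a toy accounting model (my own simplification for illustration, not any vendor's actual algorithm): unique blocks are charged to their sole owner, blocks shared by several retained snapshots are charged to the newest snapshot in the family, and blocks with no remaining referencers are freed.

```python
# Toy "family shared space" accounting (an assumed model for illustration,
# not the array's real implementation). Snapshot ids are integers, newest
# has the highest id; sizes are in TB.

def reported_sizes(snapshots, blocks):
    """blocks: list of (size_tb, set_of_referencing_snapshot_ids)."""
    newest = max(snapshots)
    sizes = {s: 0.0 for s in snapshots}
    for size, owners in blocks:
        live = [o for o in owners if o in snapshots]
        if len(live) == 1:
            sizes[live[0]] += size   # unique: charged to its one owner
        elif len(live) > 1:
            sizes[newest] += size    # shared: charged to the newest snapshot
        # no live owners: the block is simply freed
    return sizes

blocks = [
    (6.0, {1, 2, 3, 4, 5, 6, 7}),           # 6 TB temp dataset, only in S1-S7
    (4.6, {14}),                            # churn unique to the newest snapshot
    (1.0, {13}), (0.9, {12}), (0.8, {11}),  # older daily churn
]

before = reported_sizes(set(range(1, 15)), blocks)  # 14-day retention
after  = reported_sizes(set(range(8, 15)), blocks)  # 7-day retention
print(round(before[14], 1), round(after[14], 1))    # 10.6 4.6
```

    With 14-day retention the newest snapshot gets charged the shared 6 TB on top of its own 4.6 TB of churn (~11 TB reported); once S1–S7 are gone those blocks have no owners at all, so both the pool usage and the newest snapshot's reported size drop, which matches the numbers you saw.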

    Hope it helps, Garry Ohanian

  • Hello!

    So at a high level, behind the scenes, we're only really retaining deltas. So it's entirely plausible for 14 days = 21 TB of snapshot space to drop to 7 days = 10 TB.

    Can you please help me understand how you got to the 5TB reduction that you said is what you originally expected?  Did you mean that you simply added the snapshot size values together?

    Keep in mind that the array is also globally deduplicating.  So a subset of that data could also have been shared and/or no longer necessary once a set of snapshots were removed.  
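    As a rough illustration of that global dedup point (a toy model, not how the array actually tracks its metadata), identical blocks are stored once, and a block is only truly freed when its last referencer goes away:

```python
# Toy global-dedup bookkeeping (illustrative assumption, not the array's
# real implementation): each distinct block hash is stored once, with a
# set of owners (volumes/snapshots) that reference it.

def physical_blocks(store):
    """Count blocks that still have at least one referencer."""
    return sum(1 for owners in store.values() if owners)

store = {
    "h1": {"vol", "snapA"},    # shared by the live volume and snapA
    "h2": {"snapA", "snapB"},  # shared by two snapshots
    "h3": {"snapB"},           # unique to snapB
}
print(physical_blocks(store))  # 3 distinct blocks stored

for owners in store.values():  # delete snapB everywhere
    owners.discard("snapB")
print(physical_blocks(store))  # 2 -- only h3 was uniquely snapB's
```

    So deleting a snapshot can free anywhere from nothing to far more than its reported size, depending on what else still references its blocks.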

    I would suggest getting in touch with your SE.  They can take the time to conduct a personalized storage consumption prediction analysis with you based on your array and your workloads.

    Cheers!

    • Candyman999
      Novice III

      For the 5 TB that we expected, we just added the snapshot sizes from days 8 through 14.

      Before the change, snapshot sizes were like this:

      Most recent: 11 TB
      Recent -1: 1 TB
      Recent -2: 900 GB
      Recent -3: 800 GB
      and so on for a total of 21 TB

      Now the data looks like this:

      Most recent: 4.6 TB
      Recent -1: 1 TB
      Recent -2: 900 GB
      Recent -3: 800 GB
      and so on for a total of 10 TB

      So you are telling me that there was 6 TB of shared and deduped data in the most recent snapshot that was referenced in the snapshots from days 8 - 13?

      I have reached out to our SE to see if he could give some insight into this.

      • Candyman999
        Novice III

        Here is the response I got back from our SE:

        "The GUI shows the newest snapshot as larger than the others primarily due to how storage space is reported concerning snapshots. When a new snapshot is created, it initially captures the entire state of the volume at that moment. This means that the size displayed for the newest snapshot may appear larger because it incorporates shared data that has not yet been de-duplicated.

        Here are the key points to understand:

        1. Baseline Snapshot: The largest snapshot in a group may contain baseline data not present in earlier snapshots. This is typical behavior, as newer snapshots represent the cumulative data differences since the last baseline was established.
        2. Family Shared Space: The newer snapshot is often reported to include "Family Shared Space," which is the amount of data shared across multiple snapshots. This shared data, which might not be reflected in the logical or unique space of earlier snapshots, can inflate the size of the newest snapshot displayed in the GUI.
        3. Snapshot Space Accounting: When a snapshot is taken, it initially has a size of zero (as it is an exact copy of the volume). As data changes between snapshots, space used increases based on the amount of unique data since the last snapshot.
        4. Data Reduction and Deduplication: The apparent discrepancy may also arise from data reduction and deduplication processes, which can lead to unusual space reporting in snapshots post-processing.

        Thus, the relatively large size of the newest snapshot is a normal occurrence and reflects the complexities of space management in snapshot technology, particularly with regard to baseline data and space sharing among snapshots."
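        Point 3 in his list can be sketched in a few lines (my own toy example of copy-on-write-style accounting, not the array's implementation): a snapshot costs nothing at creation and only accrues unique space as the live volume diverges from it.

```python
# Toy copy-on-write accounting: a snapshot is a logical copy of the
# volume's block map; it only "owns" a block once the live volume has
# overwritten that block. Block numbers and contents are made up.

volume = {0: "a", 1: "b", 2: "c"}  # block number -> contents
snap = dict(volume)                # new snapshot: identical map, zero unique

def unique_blocks(volume, snap):
    """Blocks the snapshot must preserve because the volume changed them."""
    return {blk for blk, data in snap.items() if volume.get(blk) != data}

print(len(unique_blocks(volume, snap)))  # 0 -- freshly taken snapshot
volume[1] = "B'"                         # the volume overwrites block 1
print(len(unique_blocks(volume, snap)))  # 1 -- snapshot now keeps the old copy
```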

  • bmcdougall
    Community Manager

    Hi Ryan! Thanks for the question. We (the community team) have reached out to the product experts across the company to get the right eyeballs on this. Hope to have a reply for you today!