Microsoft Azure: Get All-flash Storage for Azure Local
May 21 | Register Now!
Azure Local now supports on-premises storage, and Everpure is a certified solution. It offers a new alternative to legacy virtualization platforms, using familiar Windows and Hyper-V technology.
Why add local storage to Azure?
- Delivers data sovereignty, since all data remains on premises
- Provides ultra-low-latency storage for your most demanding workloads, such as databases and AI
- Separates compute from storage, allowing granular scaling and avoiding over-spending
Join our webinar to learn how easy it is to deploy Everpure FlashArray™ with Azure Local and how it can completely transform your virtualization journey. Register Now!
Planning SQL Server Storage Layout for Snapshot Recovery
Two of the most important things you need to consider when thinking about snapshots are your snapshot recovery goals and your database instance deployment model. To take full advantage of volume snapshots, your SQL Server environment should be planned with snapshot usage in mind. Your instance deployment model (physical vs. virtual), storage presentation (vVols, VMFS, iSCSI, etc.), and snapshot recovery scope all have a direct impact on storage and database layout. Making these decisions up front helps you ensure that snapshot operations align with recovery objectives, avoid unintended side effects, and remain manageable over time. In this post we'll walk through the different recovery goals folks might have, how those goals are impacted by technology choices, and what changes you might need to make to reach your goals. If you are introducing snapshots into an existing environment, things can be a little less flexible, but this post can help you better understand the challenges you might run into.
Snapshot recovery scope
Below is a summary of some possible recovery goals, along with the impact your technology choices can have and the impact on how you plan database storage layout. Instance-level recovery is easiest to implement but most coarse-grained; single-database recovery is the most flexible but requires the most careful volume design and operational discipline.
Note: tempdb should NOT be included in volume snapshots. It is recreated automatically on startup and is not meant for recovery.
The following figure summarizes the possible recovery scopes and how database volumes can be organized for each.
Instance-level recovery
With instance-level recovery you are looking to recover an entire SQL Server instance at a point in time (all user databases that live on the snapped volumes). This could be part of a data protection plan for an instance that hosts a single application, or part of a workflow to snapshot a production system for use in dev/test workflows. This could even be a temporary workflow for migrating an existing server or adding an HA/DR replica.
Special considerations
System databases (master, msdb, model) can either be included with the snapped volumes for true instance-level rollback, or kept separate and protected via regular backups, depending on your recovery strategy. If you plan to use instance-level snapshots to create an HA/DR replica, it would be best to leave system databases out of your snapshots.
Potential layouts
- Physical / in-guest / vVols: Shared volumes for all user database data and logs. Best practices and performance will likely mean separate log and data volumes, and often multiple data volumes. These will need to be part of the same Everpure volume protection group.
- Shared-datastore virtualized (VMFS, CSV, AHV): One datastore per SQL Server instance, with many VMDKs/VHDXs.
Impact on volume layout/recovery
Very simple to implement and operate. All databases on those volumes/datastores share the same snapshot schedule and rollback behavior. A single database cannot be safely recovered without affecting the others on the same volumes.
Application / DB-group recovery
In some scenarios you might have groups of related databases that need to be recovered together, but the whole instance should not be recovered as a unit. Maybe the instance hosts different applications with different SLAs, or you just need the flexibility to recover applications separately.
Whatever the reason, in this situation you need to keep groups of databases in sync and recoverable to the same point in time.
Potential layouts
- Physical / in-guest / vVols: Application data and log volumes will need to be part of the same application-specific protection group. For a given instance you will end up with multiple protection groups, one per unit of recovery needed.
- Shared-datastore virtualized (VMFS, CSV, AHV): VMDKs/VHDXs for each app grouped together on a datastore; other workloads use different datastores. Everpure volumes are created at the datastore level, so applications must also be separated at the datastore level.
Impact on volume layout/recovery
Databases related to a specific application need to be kept together on the same set of volumes; unrelated databases need to be kept separate. Protection groups should be defined so that all volumes that contain files for that app's databases are snapped together.
Single-database recovery
In cases where you are managing single instances with many databases, you often need to recover at the database level. Common situations where this type of recovery is desirable are multi-tenant systems where each customer or user has a dedicated database, and highly consolidated environments where large numbers of unrelated databases are housed on the same instance. These cases could both come up in production, but are also very common in dev/test environments.
Special considerations
For single-database recovery it is possible to run into limits along your storage stack, depending on how many databases you have and how they are distributed around your environment. When going down this route it's important to understand limitations around:
- Volume count (and drive letters) supported by Windows
- Volume and protection group counts supported by Purity
- Volume limits per host supported by Purity
Potential layouts
- Physical / in-guest / vVols: Per-database volumes for data and log (or at least for each high-value database), grouped into per-database protection groups.
- Shared-datastore virtualized (VMFS, CSV, AHV): Shared datastores are not ideal for this recovery goal, as the whole datastore has to be recovered at once. Per-database datastores could work, but would add management complexity. Grouping databases by aggregate size or throughput characteristics could reduce this complexity, but will still present challenges for recovery.
Impact on volume layout/recovery
All files for the given database must live on volumes that are included in the same protection group. This gives the most recovery flexibility, but increases the number of volumes, mount points, and protection groups (and potentially datastores) to manage. In shared-datastore models, single-DB recovery typically requires cloning the datastore volume and extracting only the virtual disks for that database using vendor-specific tooling; this is significantly more complex than in-guest or vVol layouts.
Overall
Each recovery goal has its own challenges and trade-offs, but they all share a few core requirements.
Keep the recovery unit together: All data and log volumes for a given recovery scope (instance, application, or single database) should reside in the same protection group. This ensures that snapshots capture a consistent point in time and that you can safely roll forward or roll back without orphaning files.
Be intentional about what you exclude: Since tempdb holds transient data, is recreated on startup, and cannot be used in application-consistent snapshots, it is typically placed on its own volume(s) outside of snapshot protection groups. Also, because of its high change rate, a snapshot including tempdb can quickly consume capacity. System databases (master, msdb, model) are usually protected with traditional SQL backups and kept separate from user database protection groups, unless you have very specific reasons to include them.
Plan for growth and change: Whatever layout you choose, it has to survive new databases, additional volumes, and changing workloads. Making sure new volumes are consistently added to the correct protection group (manually, via automation, or through Everpure Fusion presets) is key to continuing to meet your recovery goals over time. Read more about Fusion presets on the Everpure support portal.
With proper planning, volume snapshots can be a powerful new tool in your toolbox. They can simplify day-to-day operations, make complex recovery scenarios more predictable, and unlock new possibilities for dev/test, reporting, and migration workflows without consuming a lot of time or additional storage.
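As a practical footnote to the layouts above: before defining protection groups for any recovery scope, it helps to verify where every database file actually lives. Here is a minimal sketch — the server name and connection options are placeholders — that uses pyodbc and sys.master_files to list each database's files and the volume they sit on, so you can confirm that a recovery unit's files all land on volumes in the same protection group.

```python
# Map every database file to its volume so protection-group membership can be verified.
# Assumes pyodbc is installed, ODBC Driver 18 for SQL Server is present, and the
# server name below is a placeholder for your instance.
import pyodbc
from collections import defaultdict
from pathlib import PureWindowsPath

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=sqlprod01;Trusted_Connection=yes;Encrypt=no"  # placeholder instance
)

rows = conn.execute(
    """
    SELECT DB_NAME(database_id) AS db, type_desc, physical_name
    FROM sys.master_files
    ORDER BY db
    """
).fetchall()

volumes_per_db = defaultdict(set)
for db, type_desc, physical_name in rows:
    # The drive letter (or mount point root) is a reasonable proxy for the underlying volume.
    volume = PureWindowsPath(physical_name).anchor
    volumes_per_db[db].add(volume)
    print(f"{db:20} {type_desc:10} {volume:8} {physical_name}")

# Flag databases whose files span multiple volumes, since all of those volumes
# must sit in the same protection group for consistent snapshots.
for db, vols in volumes_per_db.items():
    if len(vols) > 1:
        print(f"NOTE: {db} spans {sorted(vols)} - confirm these volumes share a protection group")
```

The same output also makes a handy checklist when new databases are added later, since a database quietly created on the wrong volume is exactly the kind of drift that breaks a recovery plan.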
NVMe-over-Fabrics (NVMe-oF) with Windows Server Initiator and Everpure FlashArray
Are you actively using or considering using NVMe-oF with Linux and FlashArray? Did you know that Microsoft recently announced a preview of a Windows Server NVMe-oF initiator? For those who are interested in this topic, I tested the initial preview with FlashArray and posted the results on the Everpure blog here.
Enabling Agentic AI via Pure1 Manage MCP Server
Everpure now offers a Pure1® Manage MCP Server so you can query information about your fleet using natural language questions. In this post, I'll explain how the Pure1 Manage MCP Server works. The first section will explain MCP in general, and the second section will explain how to use our specific server. Feel free to skip to the Quick Start section if you're already familiar with MCP and just need the parameters to plug into your host.
What is MCP?
MCP stands for "Model Context Protocol," and it's a way for users to connect their AI applications to external systems using tool calls. MCP tools are fundamentally rooted in application programming interfaces (APIs). An API is a set of rules and protocols that allows different software applications to communicate with each other. It acts as an intermediary, enabling one piece of software (the client) to request information or functionality from another piece of software (the server) without needing to know the server's internal workings. For instance, when you check the weather on your phone, the weather app uses an API to send a request to a weather service, which then returns the current weather data.
AI applications have trouble making API calls directly because APIs are designed for completeness and correctness, not for an LLM to use easily. When an AI application wants to use an external system to handle a user's request, it uses the MCP protocol to make a tool call. The AI (client) requests a function (the tool) from an external system (the server), and the system executes the function and returns a result. This makes MCP a system that standardizes and mediates API-like interactions, allowing AI models to leverage external, real-world capabilities. For more information, see this article on the MCP website: "What is the Model Context Protocol (MCP)?"
How can customers benefit from the Pure1 Manage MCP Server?
The Pure1 Manage MCP Server enables customers to securely integrate AI assistants, copilots, and agentic systems with live Pure1 telemetry and operational data—without building custom API integrations. It transforms Pure1 from a dashboard-centric experience into an AI-accessible platform, enabling natural language interaction, contextual automation, and real-time operational intelligence. Customers benefit from faster AI integration, reduced engineering effort, preserved security controls, and improved decision velocity across hybrid environments.
What types of customer workflows are best suited for MCP?
The Pure1 Manage MCP Server is particularly well-suited for agentic and AI-driven workflows, including:
Fleet telemetry integration with customer copilots
Expose Pure1 telemetry—arrays, volumes, workloads, metrics, and alerts—into internal copilots, chatbots, or AI platforms via MCP endpoints.
Value: Unified operational visibility across hybrid and multi-platform environments
Automation with context awareness
Use MCP to validate storage state, health, performance, or capacity before executing provisioning, backup, or disaster recovery workflows.
Value: Safer automation with contextual validation, reduced execution errors, and greater rollback confidence
Hybrid cloud observability
Correlate Everpure array performance and capacity metrics with application, VM, container, or cloud telemetry across environments.
Value: Faster troubleshooting and improved end-to-end performance insights
Conversational operational analytics
Enable operators to ask real-time natural language questions, such as: "Which arrays are nearing capacity risk?" "Show me the top latency spikes in the last 24 hours." "Summarize all critical alerts across regions."
Value: Rapid insight without navigating dashboards or exporting reports
What the Pure1 Manage MCP Server can do
The Pure1 Manage MCP Server exposes the Pure1 Manage REST APIs to AI applications. This means you can ask ad hoc questions about your fleet using natural language, and your AI application will use the server's tools to query the answer. No programming required.
Example conversation
Here's an example conversation demonstrating the use of the Pure1 Manage MCP Server:
User: "What is the name of my oldest FlashArray and what is its serial number?"
Chatbot: Processing Request: The AI recognizes this as a request for fleet information requiring a tool call. Tool Invocation: Get_hardware. Response: The oldest FlashArray in your fleet is named FA-Prod-01 with serial number FA10123456.
User: "Can you show me the average I/O latency for that array over the last 7 days?"
Chatbot: Let me first check the available latency metrics, then query the historical data. Tool Invocation: Get_metrics. Tool Invocation: Get_metrics_history. Response: The average I/O latency for FA-Prod-01 (serial FA10123456) over the last 7 days was 0.385 milliseconds.
User: "How many arrays do I have in total that are still under warranty?"
Chatbot: Processing Request: The AI recognizes this as a request for a count of arrays filtered by warranty status. Tool Invocation: Get_array_support_contracts. Response: You currently have 12 arrays in your fleet that are still under active warranty.
Quick start
Step 1: Register an API key in Pure1 Manage
The Pure1 Manage MCP Server leverages the Pure1 Manage REST APIs. In order to access those APIs, you need to register an API key in Pure1 Manage. To do that, follow the directions in the Pure1® REST API introductory blog post. After going through the instructions, you will have an application ID and a private key file, which will be used to generate an access token to access the MCP server in step 2.
Step 2: Set up the pure1_token_factory.py script
Prerequisites: you need Python 3.12 or greater to run the script.
1. Download pure1_token_factory.zip and unzip the archive.
2. Go to the unzipped folder in your command-line terminal.
3. Optional but recommended: create and activate a Python virtual environment: python3 -m venv .venv && source .venv/bin/activate
4. Install the requirements: pip3 install -r requirements.txt
5. Run python3 pure1_token_factory.py <application_id> <private_key_file>
6. Copy the generated access token from the script output for the next step.
Step 3: Add the remote MCP server to your AI application
Follow the directions for your AI application to add a remote MCP server (see the Pure1 Manage MCP Server User Guide for instructions for specific chatbots). In general, they need the following information:
- Remote MCP Server address: https://api.pure1.purestorage.com/mcp
- Authorization type: header
- Header name: Authorization
- Header value: Bearer <access-token>
Important: <access-token> is just a placeholder for the access token you generated in step 2; the actual header value should look something like "Bearer eyJ0eXAiO…".
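If you want to see roughly what the script is doing, or automate the refresh yourself, here is a minimal sketch of the token-exchange flow. Treat the token endpoint URL, the JWT claim names, and the ten-hour lifetime as assumptions drawn from the public Pure1 REST API flow; the bundled pure1_token_factory.py remains the supported way to generate tokens.

```python
# Rough sketch of what pure1_token_factory.py does under the hood: sign a short-lived
# ID token with your private key, then exchange it for a Pure1 access token.
# The endpoint URL and claim names are assumptions based on the public Pure1 REST API
# documentation -- use the bundled script as the authoritative reference.
import sys
import time

import jwt        # PyJWT
import requests

APP_ID = sys.argv[1]            # application ID registered in Pure1 Manage
PRIVATE_KEY_FILE = sys.argv[2]  # RSA private key paired with that application

with open(PRIVATE_KEY_FILE, "r") as f:
    private_key = f.read()

now = int(time.time())
id_token = jwt.encode(
    {"iss": APP_ID, "iat": now, "exp": now + 10 * 3600},  # ~10-hour lifetime
    private_key,
    algorithm="RS256",
)

# Assumed token-exchange endpoint; adjust if your environment documents a different one.
resp = requests.post(
    "https://api.pure1.purestorage.com/oauth2/1.0/token",
    data={
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": id_token,
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["access_token"])  # prefix with "Bearer " when pasting into your MCP host
```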
Important: you need to generate a new access token every 10 hours and copy it into your AI application. You'll need to run pure1_token_factory.py to generate a new access token every 10 hours, and manually copy the access token into your AI application's config.
Claude Desktop instructions
Claude Desktop is a special case because it doesn't let you set the Authorization header directly. You have to run the mcp-remote local MCP server and configure that to use the Pure1 Manage remote MCP server.
Prerequisites: you need Node.js version 18 or newer installed on your system.
Configuration:
1. In Claude Desktop, go to Settings > Developer, and click Edit Config.
2. Open the claude_desktop_config.json file in a plain-text editor like VS Code.
3. Configure the mcp-remote server, which is necessary to pass the Authorization header to the Pure1 Manage MCP Server.
4. Paste the token into the configuration file, then restart Claude Desktop.
{
  "mcpServers": {
    "Pure1 API": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://api.pure1.purestorage.com/mcp",
        "--header",
        "Authorization:${AUTHORIZATION_HEADER}"
      ],
      "env": {
        "AUTHORIZATION_HEADER": " Bearer <paste access token here>"
      }
    }
  }
}
Note: there might be other configuration options in this file. Be sure to leave them unchanged, and only insert the Pure1 API config in the mcpServers section. The leading space in the AUTHORIZATION_HEADER environment variable is important; it's there to work around a bug in Windows argument parsing.
Please note that the first time Claude uses a tool, it will ask you for permission. You can grant permission to all tools at once by going to Customize > Connectors > Pure1 API, and selecting Always Allow under Other tools.
For more detailed instructions from Anthropic, please refer to: Connect to local MCP servers - Model Context Protocol.
Why Object Storage Still Matters
In Part 2, I wrote a line that, at the time, felt almost like a side comment — something I typed without fully appreciating how much it would change the direction of the story: “BREAKING NEWS: The FlashArray now supports Object??? What in the world? I may need to write an article about that!!” That reaction wasn’t planned, and it definitely wasn’t me being clever. It was me looking at the GUI and thinking, “that can’t be right… can it?” It didn’t line up with how I’ve been modeling storage architectures in my head for years, which usually means one of two things: either something fundamentally changed… or I’ve been confidently wrong about part of this for a while. And if I’m being completely honest, there was also a second reaction happening in parallel — one that I didn’t write down at the time because it sounded slightly ridiculous even in my own head: “Wait… do I actually understand why object storage exists in the first place? And more importantly… what exactly was wrong with files?” That’s the part nobody likes to admit out loud. We’ve all spent years confidently explaining block, file, and object as if we were born with that knowledge, when in reality most of us learned it incrementally, retroactively, and with just enough conviction to sound credible in front of a customer. Object storage, in particular, has always carried this aura of inevitability — like of course it’s better, of course it scales, of course it’s what modern applications need — without always forcing us to question why the previous model stopped being enough. Because for as long as most of us have been designing infrastructure, object storage has not simply been another protocol layered onto an existing system. It has represented a fundamentally different way of organizing and accessing data, one that required its own architectural approach, its own scaling model, and, more often than not, its own dedicated platform. The separation between block, file, and object was not arbitrary; it was a reflection of how deeply different those paradigms were in terms of metadata handling, access patterns, and performance expectations. This is precisely why platforms such as Everpure FlashBlade exist in the first place. They were not created as extensions of traditional storage systems but as purpose-built architectures designed to treat unstructured data — and particularly object data — as a first-class citizen. The use of distributed metadata services, sharded across independent nodes, combined with a key-value store storage model, allows such systems to achieve levels of parallelism and throughput that simply cannot be replicated within a controller-based design. In that context, object storage is not something that is “added” to the system; it is the system. Which is why seeing S3 support appear on FlashArray required a pause. Not excitement. Not skepticism alone. Something closer to intellectual friction. Reconciling Two Architectural Worlds The most important step in understanding what FlashArray has introduced is to resist the temptation to treat it as a direct comparison to FlashBlade. These aren’t two different ways of solving the same problem. They’re two different answers to two different problems—and pretending otherwise is where people get themselves into trouble. FlashBlade is built for object, not adapted to it. S3 talks directly to a distributed engine that thinks in objects, not files pretending to be objects. 
Metadata is spread across blades instead of becoming a centralized choke point, and the whole system scales the way modern workloads actually need it to. There's no file system layer to fight with, no directory structure to navigate, no POSIX semantics getting in the way. It just does what you'd expect when you remove all of that: it goes fast, it scales cleanly, and it keeps up with workloads like HPC, AI, and analytics without breaking a sweat. FlashArray takes a very different path, and in reality, it's not what most people expect. It doesn't try to reinvent itself as an object platform, and it doesn't throw an S3 gateway in front of the array and call it a day. With Purity 6.10.5+, S3 just shows up as another protocol the system understands, right next to block and file. That distinction matters more than it seems. This isn't something duct-taped on the side — it's part of the same control plane, the same data path, the same system you've already been running. But let's not pretend it turned into FlashBlade overnight. This is still a controller-driven architecture. The primary controller does the heavy lifting — handling requests, authenticating them, coordinating operations — before anything actually hits the storage engine. Which means it behaves differently, especially as workloads scale. So it ends up in this interesting middle ground. Not a native object system in the pure sense, but not a hack either. Just a different way of exposing what's already there.
The Translation Layer and Its Consequences
It would be irresponsible to discuss FlashArray S3 without explicitly addressing the implications of this design. Even with its native integration into Purity, S3 operations are still subject to the realities of a controller-bound architecture. Every request must be processed, authenticated, and coordinated before it is executed, introducing a measurable difference in behavior compared to both native block operations and distributed object systems. The most immediate effect is latency. While FlashArray continues to deliver sub-150-microsecond performance for block workloads, S3 operations typically operate at higher latencies (in the 1-millisecond range) due to the additional processing steps involved. This is not a flaw; it is the natural outcome of introducing a protocol that was designed for scale and flexibility into a system optimized for low-latency transactional workloads. Metadata handling further reinforces this distinction. FlashBlade distributes metadata across its architecture, enabling massive parallelism and consistent performance at scale. FlashArray processes metadata through its controller framework, which introduces natural serialization points under high concurrency. As workloads become increasingly metadata-heavy — particularly with small objects — this difference becomes more pronounced. The system also enforces clearly defined operational limits to maintain predictable performance.
FlashArray Object Store Limits
As of Purity 6.10.5+, FlashArray supports up to 250 S3 buckets per array and a maximum of 1,000,000 objects per bucket. Object storage operates at the array scope and does not integrate with multi-tenancy or "realms", which has implications for service provider models and strict tenant isolation requirements. These constraints are not arbitrary limitations; they are guardrails that ensure the system behaves consistently within its architectural boundaries.
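Because the array simply presents S3 as another protocol, existing S3 tooling should work against it largely unchanged. As a minimal illustration — the endpoint address, credentials, and bucket name below are placeholders, and the bucket and access keys are assumed to have been created through Purity first — a boto3 client can be pointed at the array the same way it would be pointed at any S3-compatible endpoint:

```python
# Minimal smoke test against an S3-compatible endpoint on the array.
# Assumptions: the bucket and access keys were already created through Purity,
# and the endpoint/credentials below are placeholders for your environment.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://flasharray-s3.example.local",  # placeholder array S3 endpoint
    aws_access_key_id="PLACEHOLDER_ACCESS_KEY",
    aws_secret_access_key="PLACEHOLDER_SECRET_KEY",
)

bucket = "app-logs"  # assumed to exist already

# Write a small object, read it back, and list what's in the bucket.
s3.put_object(Bucket=bucket, Key="smoke-test.txt", Body=b"hello from FlashArray S3")
body = s3.get_object(Bucket=bucket, Key="smoke-test.txt")["Body"].read()
print(body.decode())

for obj in s3.list_objects_v2(Bucket=bucket).get("Contents", []):
    print(obj["Key"], obj["Size"])
```

The point is not the code itself; it is that nothing about the client side needs to know whether the endpoint is a dedicated object platform or a controller-based array, which is exactly what makes the operational boundaries below worth understanding.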
Where the Architecture Becomes Secondary
Having established those boundaries, the conversation naturally shifts from "how it works" to "why it matters". In many enterprise environments, particularly within SLED organizations, the challenge is not achieving exabyte-scale throughput or supporting billions of objects. The challenge is delivering capabilities in a way that is operationally sustainable, economically efficient, and aligned with existing infrastructure. This is where FlashArray's approach becomes compelling. By exposing object storage within the same platform that already supports block and file workloads, it eliminates the need to introduce a separate system, a separate operational model, and a separate set of dependencies. The same management interface, the same automation framework, and the same data services extend across all protocols. More importantly, object data inherits the full set of Purity capabilities. Global inline deduplication and compression apply to S3 workloads, significantly improving storage efficiency compared to many object-native platforms. SafeMode snapshots extend immutability to object storage, providing a critical layer of protection against ransomware. ActiveCluster, combined with ActiveDR, enables a three-site resilience model that ensures data availability across multiple locations with zero RPO between primary sites. These are not incremental improvements. They represent a shift in how object storage can be consumed within an enterprise.
Practical Use Cases in a Unified Model
When viewed through this lens, the use cases for FlashArray S3 become both clear and grounded in reality.
Development and Staging Environments
Some applications rely on S3 APIs but do not require massive scale. For these, FlashArray provides a consistent and integrated object interface without introducing additional infrastructure. Developers can build and test against a familiar model while remaining within the same operational environment.
Backup and Recovery Workflows
FlashArray S3 enables modern data protection strategies that leverage object storage while benefiting from flash performance, deduplication, and indelible snapshots. This combination improves both recovery times and storage efficiency. Tier-two repositories and application-integrated storage represent another natural fit. Workloads such as document management systems, logs, and archival data often require object semantics but do not justify the higher cost of a dedicated object platform. Consolidating these workloads onto FlashArray simplifies operations while maintaining reliability and performance.
Where the Boundaries Still Matter
None of this diminishes the importance of selecting the appropriate platform for workloads that demand a different architecture. High-performance AI pipelines, large-scale analytics environments, and use cases requiring massive parallelism remain firmly within the domain of FlashBlade. The ability to scale performance linearly, distribute metadata across many nodes, and support billions of objects is not optional in these scenarios — it is essential. What has changed is not the relevance of those systems, but the necessity of deploying them for every object storage use case.
A Subtle but Significant Shift
The introduction of S3 on FlashArray does not represent a replacement of one architecture with another. It represents a convergence of capabilities within a unified operational framework. Object storage, in this model, is no longer a destination that requires its own platform.
It becomes a capability — one of several ways to access and manage data within the same system. That shift is easy to overlook, but its implications are significant. It allows organizations to design around outcomes rather than protocols, to reduce complexity without sacrificing capability, and to align infrastructure more closely with the needs of modern applications.
Closing Reflection
Looking back at that line in Part 2, it is clear that the reaction was not just about a new feature appearing in the interface. It was about the recognition — however incomplete at the time — that something foundational was beginning to change. Object storage did not suddenly become simpler, nor did it lose the architectural complexity that defines it. What changed is where it lives. And once that becomes clear, you start asking a slightly uncomfortable but very honest question: If this works… and it works well enough for most of what I actually need… why was I so convinced it had to live somewhere else in the first place? That is usually where the interesting work begins. Appreciate you reading. Dmitry Gorbatov © 2025 Dmitry Gorbatov | #dmitrywashere
Fusion for the Win: You No Longer Have to Decide Where the Data Lives
Dmitry Gorbatov Apr 10, 2026 In the first post, I walked through enabling file services on a FlashArray. There was nothing particularly complicated about it. The process was clean, predictable, and by the end of it I had a fully functional file platform running on the same system that was already supporting the rest of the environment. It behaved exactly the way you would expect it to behave. And that is precisely what started to bother me. Because if you step back and look at what we actually did, the workflow has not really changed in years. I still made a series of decisions in a very specific order. I chose where the workload should live, I created the file system, I attached protection, and I made sure everything was named and organized in a way that made sense at that moment. It was structured. It was controlled. It was also entirely dependent on me. That model works well enough when the environment is small or when the same person is making the same decisions repeatedly. But as soon as you introduce scale, or simply more people, those decisions start to drift. Not in a dramatic way, but in small inconsistencies that accumulate over time. A slightly different naming convention here, a missed policy there, a workload placed somewhere because it “felt right.” Nothing breaks. It just becomes harder to operate. When the model stops making sense What stood out to me after going through the manual process is that we are still treating storage as something that needs to be individually managed, even though the platform itself has already moved beyond that. We have systems that can deliver consistent performance, global data services, and non-disruptive operations, yet we still rely on human judgment to decide where things go and how they should be configured. That disconnect is where Everpure Fusion begins to make sense. Not as an additional feature, but as a way to remove an entire class of decisions that we have simply accepted as part of the job. From managing infrastructure to defining intent The idea behind the Enterprise Data Cloud is not particularly complicated, but it does require a shift in perspective. Instead of treating each array as a separate system with its own boundaries, the environment becomes a unified pool of resources. Data is no longer something that you place on a specific array. It is something that exists within a global pool, governed by policies that define how it should behave. Once you start thinking this way, the questions change. You are no longer asking where a workload should go. You are asking what that workload needs to look like. Performance expectations, protection requirements, naming, and lifecycle behavior become the inputs, and the system automation takes responsibility for everything else. That is the role of Everpure Fusion. What actually changes in practice The easiest way to understand Fusion is to look at what it removes. In the manual model, every step is explicit. You build storage object by object, and then you attach policies to those objects. You rely on memory, experience, and sometimes documentation to make sure everything is done correctly. With Fusion, that entire process becomes declarative. Instead of building storage step by step, you define a preset. A preset is a reusable definition of what “correct” looks like for a given workload. It captures performance expectations, protection policies, naming conventions, and any constraints that should apply. Once that definition exists, it becomes the standard. 
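To make the idea of a preset concrete, here is a purely illustrative sketch — plain Python, not the Fusion API — of the kind of information a preset captures and how placement against it might be evaluated. The names, fields, and placement heuristic are invented for illustration only.

```python
# Illustrative only: a preset as a declarative definition, plus a naive placement
# function that picks an array satisfying it. This is NOT the Fusion API; names,
# fields, and the heuristic are invented to show the concept.
from dataclasses import dataclass

@dataclass
class Preset:
    name: str
    protocol: str              # e.g. "file" or "block"
    size_tib: float
    protection_policy: str     # e.g. "snap-hourly-keep-48"
    naming_pattern: str        # e.g. "{app}-{env}-fs"

@dataclass
class ArrayState:
    name: str
    free_tib: float
    supports_file: bool

def place(preset: Preset, fleet: list[ArrayState]) -> ArrayState:
    """Pick the array with the most free capacity that meets the preset's needs."""
    candidates = [
        a for a in fleet
        if a.free_tib >= preset.size_tib and (preset.protocol != "file" or a.supports_file)
    ]
    if not candidates:
        raise RuntimeError("no array in the fleet satisfies this preset")
    return max(candidates, key=lambda a: a.free_tib)

fleet = [ArrayState("fa-x20-lab1", 41.2, True), ArrayState("fa-x20-lab2", 18.7, True)]
preset = Preset("file-services", "file", 5.0, "snap-hourly-keep-48", "{app}-{env}-fs")
print(f"Workload from preset '{preset.name}' would land on {place(preset, fleet).name}")
```

The real system obviously weighs far more than free capacity, but the shape of the interaction is the point: you describe the outcome once, and placement becomes a function of the fleet rather than a per-request decision.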
When you create a workload from that preset, Fusion evaluates the environment and places it on the array that best satisfies those requirements. It creates the necessary objects, applies the policies, and ensures that everything is consistent with the definition. The important shift is not that tasks are automated. It is that decisions are no longer made ad hoc.
Trying it in the lab
After building file services manually in the previous post, I wanted to see what this would look like using the same environment, but driven through Fusion. I started by defining a fleet, grouping the array into a logical boundary where resources and policies could be managed collectively. Once the array becomes part of a fleet, you stop thinking of it as an individual system and start treating it as part of a shared pool. From there, identity becomes the next requirement. Fusion relies on centralized authentication, typically through secure LDAP backed by Active Directory. This is what governs access to presets and workloads, and it ensures that everything aligns with existing organizational controls. Up to this point, everything felt exactly like I expected. Then I moved to the part I was actually interested in.
Where things didn't quite line up
The goal was to take the file services I had already built and express them as a preset. I wanted a single definition that would describe the file system, its structure, its policies, and its behavior, and then use that definition to create workloads without going through the manual steps again. Conceptually, that is exactly what Fusion is supposed to do. In practice, I ran into a limit that I had not fully appreciated at the start. I was running Purity OS 6.9.2. Which, to be fair, is where most production environments should be. It is a Long-Life Release, stable, predictable, and already capable of delivering Fusion for fleet management, intelligent placement, and policy-driven storage classes. You can create Presets and Workloads for block workloads. What it does not include is full support for File Presets on FlashArray. That capability, where a file system, its directories, and its access policies are all defined and deployed as a single unit, arrives in the 6.10.X Feature Release line. Which means that the exact outcome I was trying to demonstrate was sitting just one version ahead of me.
This is where I had to laugh at myself
There is always a moment in a lab where you realize that the limitation is not the platform. It is you. In this case, it was me getting ahead of the version I was actually running. My intentions were "ever" so "pure" (IYKYK). The execution was slightly behind the feature set.
So I upgraded
One of the advantages of working with this platform is that upgrading does not carry the same weight it used to. The system is designed for non-disruptive operations, and moving between versions does not require downtime or migration. The upgrade to 6.10.5 was uneventful in the best possible way. Controllers were updated in sequence, workloads continued to run, and the system transitioned to a new set of capabilities without introducing risk. There is something very satisfying about performing an upgrade not because something is broken, but because you want access to what comes next. BREAKING NEWS: The FlashArray now supports Object??? What in the world? I may need to write an article about that!!
When it finally clicks
Once on 6.10.5, the model finally aligns with the intent.
Once I clicked on Create Your First Preset, it gave me these options: I defined a preset that described the file workload I had previously built manually. It included the expected behavior, protection policies, and naming conventions. Instead of creating individual components, I was defining the service as a whole. Now this was really neat - when you select Storage Class, it knows which arrays are available in your environment. In my case, I only have FA //X. At this point a new field opens and allows you to select the Storage Resources. Once I hit "Publish", this was the result:
Think of this entire process like this:
- Define your Recipe (Preset)
- Order from the Menu (Workload)
Let's create a workload from that preset. Once I clicked on + to add a new Workload, the Wizard opened: Give a name to that Workload. Since the Fusion Fleet has both of my lab arrays, I have an option to select an array for the workload placement. Out of curiosity I clicked "Get Recommendations", and this was the result: Once I hit Deploy, within seconds, the workflow executed and I had my File System created. How awesome is this? Come on, give me a cheer!
Think about the magnitude of what just happened. I provided minimal input, and Fusion handled the rest. It selected the appropriate array based on capacity and performance, created the file system, applied the policies, and ensured that everything matched the definition. There was no second pass. There were no additional steps. The outcome matched the intent. By moving to this model, I just shifted from being a "storage admin" to a "data architect." I defined the outcomes and it happened "automagically".
Why this matters more than efficiency
It would be easy to describe this as a way to reduce manual effort, but that misses the point. The real value is consistency. When every workload is created from a defined preset, variability disappears. Policies are enforced by default. Naming is consistent. Placement is based on a complete view of the environment rather than individual judgment. Over time, that consistency reduces operational friction and lowers risk in ways that are difficult to measure but easy to recognize. Environments behave predictably, scaling becomes simpler, and the likelihood of human error decreases.
Where this leads
In the first post, I showed that file services can run natively on the array without additional infrastructure. In this post, the focus shifted to removing the manual decisions involved in building and managing those services. The next step is where things move beyond automation. As capabilities like ActiveCluster for File continue to evolve, the conversation shifts toward mobility and continuous availability. At that point, it is no longer just about simplifying operations, but about removing the constraints that tie workloads to a specific system or location. That is a conversation for Part 4. Appreciate you reading. © 2025 Dmitry Gorbatov | #dmitrywashere
Stop Running File Servers on VMs
Dmitry Gorbatov Apr 06, 2026
One of the superstar Pre-Sales Systems Engineers on my team was in a customer meeting not too long ago, walking through what was, by all accounts, a well-run environment. The team knew what they were doing, the infrastructure was stable, and nothing stood out as particularly problematic. It was one of those conversations where everything feels "fine," which in our world usually means there are inefficiencies hiding in plain sight. Then he started asking questions about enterprise file services. They were running a couple of Windows Server virtual machines on top of VMware vSphere, serving SMB shares to the rest of the organization. Again, nothing unusual there. This is still the default design in a lot of places, and it works well enough that nobody feels compelled to question it. But as the meeting went on, a few details started to surface. One of the VMs was consistently running hot during backup windows. Another one hadn't been patched in a while because nobody wanted to risk disrupting access to shared data. The storage policies applied at the VM layer didn't quite line up with what was actually configured on the array. And there was an unspoken understanding that maintaining these systems was just part of the job — something you deal with, not something you optimize. What made it more interesting was that the same environment had an Everpure FlashArray running their critical workloads. It was handling databases, transactional systems, and anything else that required consistent performance and reliable data services. It was protected, replicated, and trusted. File services, however, were living on top of virtual machines, with their own lifecycle (please, please… don't say VMware snapshots), their own dependencies, and their own set of operational overhead. That disconnect is what stuck with me. So instead of continuing the theoretical discussion about architecture and "best practices," I went back to my lab and decided to try something very simple. I wanted to see what would actually happen if I enabled file services directly on the array and treated it as a first-class file platform instead of assuming that role belonged to something else. There was no redesign exercise, no migration plan, and no phased rollout. I wasn't trying to prove a point on a whiteboard. I just wanted to turn it on and see if the experience matched what we tend to claim in conversations. Nothing broke. Nothing felt forced. And more importantly, nothing about it felt like a compromise. This post walks through exactly what I did to enable and run file services on a FlashArray //X20R4 running Purity 6.9.2. The goal is not to explain the architecture in abstract terms, but to show how straightforward it is to take something that already exists in your environment and use it in a way that removes unnecessary complexity.
What I realized (and why this matters)
Once everything was up and running, the first realization was that this is not a workaround or a secondary feature designed to fill a gap. FlashArray File is integrated into the platform in a way that makes it behave like a natural extension of what the system already does well. It uses the same controllers, the same global storage pool, and the same data services that are already in place for block workloads. There is no separate management layer, no additional appliance (remember Data Movers and NAS Personas?), and no need to think about it as something different from the rest of the system.
That by itself is useful, but it is not the most important part. What stood out more was the amount of operational overhead that simply disappeared. When file services run on virtual machines, you inherit everything that comes with them. You are responsible for the guest operating system, including patching cycles, security updates, and the occasional issue that appears at the worst possible time. You are also consuming hypervisor resources and, in many cases, paying for licensing that exists solely to support a function that could be handled elsewhere. On top of that, you end up managing data protection, performance, and capacity in two different places (remember RDMs, or in-guest iSCSI?), which introduces opportunities for inconsistency. By moving file services onto the array, that entire layer is removed. You are not just changing where the workload runs; you are simplifying how it is operated, protected, and maintained over time. The second realization was that this approach aligns with where things are clearly heading. Everpure is already extending these capabilities with ActiveCluster for File, which will bring synchronous replication and continuous availability to unstructured data. I do not have that running in my lab yet, but it is not difficult to see the direction. As those capabilities become more widely available, the remaining reasons to maintain separate file platforms will continue to shrink. That will be a conversation for a future post. Let’s tentatively call it Part 3 of the series. Before you start (the part that actually matters) Enabling file services on the array is straightforward. The part that tends to create friction is everything that surrounds the configuration, particularly networking and integration with existing services. The first consideration is the choice of network interfaces. Although the array provides 1GbE management ports, those interfaces are not intended for serving file workloads. Using them for SMB or NFS traffic introduces an artificial bottleneck that will affect performance and, more importantly, perception. File services should be configured on the 10 or 25GbE data ports, which are designed to handle production traffic and provide the throughput expected from the platform. Here is what my array looked like earlier today: The highlighted ports are ETH10 and ETH11 on both controllers. Redundancy should be planned, but it does not need to be over engineered. A simple and reliable starting point is to use at least two ports per controller, ensuring that the configuration remains consistent across both sides. The goal is to achieve predictable failover behavior rather than to build a complex network design that is difficult to troubleshoot. One concept that is worth understanding early is the File Virtual Network Interface, or File VIF. This is the logical identity of the file service—the IP address that clients use to connect. It is designed to move between controllers as needed, maintaining availability during failover events. Once this concept is clear, the rest of the networking configuration becomes much easier to follow. My lab was built within budgetary constraints - that means I don’t have separate ethernet switches and I don’t have the time to build a separate DNS Server for FA File Services. Everpure recommends separating file client traffic from management traffic, but that’s a best practice, not a requirement. 
Since my lab switch is a single flat, untagged network and the environment is really just 192.168.1.0/24, I will just use the most practical approach: put the FA File VIFs on that same 192.168.1.0/24 network with their own IP addresses. Here is what I did: I just kept the file VIFs on 192.168.1.0/24 since that is the only real network available. FlashArray expects unique layer-3 subnets and does not support overlapping networks.
DNS
In my specific configuration, I don't need a new DNS server. My existing management DNS servers can resolve the AD/DC hostnames and the FA File names/computer object. FA File can use the same DNS as management with no extra file-DNS configuration. By default, DNS lookups will go out the management interfaces, so my DNS server just needs to be reachable from the management network. And it is.
Let's turn the lights on, shall we? After assigning the IP addresses and enabling the ports, the lights came on.
Important design note
I will use one client-facing VIF IP for the file service, for example:
- File VIF IP: 192.168.1.135
- Netmask: 255.255.255.0
- Default gateway: 192.168.1.2
Do not try to use 192.168.1.131-134 as four separate FA File IPs unless you intentionally want multiple VIFs. The ct*.eth* ports are the transport underlay, not the SMB/NFS endpoint IPs.
Configuring a File Server and File VIF
1. Open the File Services server page
Go to Storage → Servers.
2. Open the default server (_array_server), or create a new file server if you want a dedicated namespace. Stay on that server's details page.
3. Create the File VIF
Use physical bonding first; it's the simplest.
- In the Virtual Interfaces section, click + Create VIF.
- Choose Physical Bonding.
- Select the underlying port pairs: Pair 1: ct0.eth10 and ct1.eth10; Pair 2: ct0.eth11 and ct1.eth11.
- Name the VIF something simple, e.g. filevip1.
- Enter network settings: IP Address: 192.168.1.135, Netmask: 255.255.255.0, Gateway: 192.168.1.1. Leave VLAN blank since there are no VLANs.
- Save and Enable the VIF. That creates the client-facing IP for SMB/NFS.
4. Configure DNS
Integration with DNS and Active Directory is another area where a bit of preparation goes a long way. File services rely on proper name resolution and domain integration, and it is important to recognize that file-related DNS settings are separate from the array's management DNS configuration. The system effectively becomes a participant in the domain as a file server, which means that DNS records, domain join operations, and permissions should be planned accordingly rather than improvised during setup. Since my DNS is 192.168.1.2 and I want to reuse management DNS: go to the server's DNS Settings. My management DNS is already configured and points to 192.168.1.2. If you want to explicitly add file DNS, click + and enter: DNS Name: file-dns, Domain suffix: your AD/domain suffix, DNS server: 192.168.1.2, Service: file. The source interface can remain default unless you specifically need file VIF sourcing.
5. Create required DNS A records
On my DNS server 192.168.1.2, I created an A record for the file service name pointing to the File VIF IP: Name: fa-file01, IP: 192.168.1.135. If you are joining AD for SMB/Kerberos, make sure DNS also has A records for all relevant domain controllers, and create the A record that matches the AD computer object / FA File service name.
6. Join Active Directory or configure LDAP
If using SMB, use Active Directory.
Go to: Storage → Servers → _array_server, then look for the Remote Directory Service panel:
- Click Edit Configuration.
- Select Active Directory.
- Enter: Name, Domain DNS Name, Computer Name, Use Existing Account (if applicable), AD User, Password, and TLS Mode.
- Save / Join.
This part took me 2 hours. I was getting some crazy error messages that I'm simply embarrassed to share here. It was not the DNS. It was an NTP server misconfiguration that was causing Kerberos to not authenticate properly. There was a 10-minute time skew between the FlashArray and the domain controller.
7. Create a File System
The file system is the top-level container for your unstructured data.
GUI Method: Navigate to Storage > File Systems and click the plus sign (+). Enter a name and click Create.
CLI Method: Use the following command: purefs create <file-system-name>
8. Create a Managed Directory
Managed directories allow you to apply specific policies (like quotas or snapshots) to subfolders within a file system.
GUI Method: Go to Storage > File Systems. Click on the name of the file system you just created. Select the Directories tab and click the plus sign (+). Enter the directory name and the internal path (e.g., /users).
CLI Method: Use the following command: puredir create filesystem1:users --path /users
9. Create an Export
The export makes the managed directory accessible to clients over the network.
GUI Method: Navigate to Storage > Policies > Export Policies. Select an existing policy (e.g., a standard SMB or NFS policy) or create a new one. Within the policy view, click the plus sign (+) to add an export. Select your Managed Directory, choose the appropriate Server (use _array_server for standard configurations), and provide an Export Name (this is the name clients will use to mount the share).
CLI Method: Use the following command: puredir export create --dir <file-system-name>:<directory-name> --policy <policy-name> --server <server-name> --export-name <client-facing-name>
A quick validation step
At this point, it is worth validating access from a client system. Map the SMB share and perform a simple set of operations—create files, read data, and verify permissions. This is less about testing performance and more about confirming that networking, authentication, and access controls are behaving as expected. In most cases, if the earlier steps around DNS and Active Directory were done correctly, this validation step is uneventful, which is exactly what you want. And now let the data migration begin. I am actually doing it from my Mac. And it just works!!!
What becomes apparent after completing these steps is how little effort is required to stand up a fully functional file platform on infrastructure that is already in place. Unless, of course, your NTP server crashed. The system behaves predictably, integrates cleanly with existing services, and avoids many of the operational burdens associated with VM-based file servers. And that is where things start to get interesting. Because everything described so far is still being done manually—selecting where things live, defining configurations, and applying policies one step at a time. It works, and it works well, but it also mirrors the way storage has traditionally been managed. In the next post, I will show what happens when you stop doing these steps manually and let Everpure Fusion handle placement, policy, and provisioning instead. Appreciate you reading. © 2025 Dmitry Gorbatov | #dmitrywashere
The Idea That Was Supposed to Fail
Why DirectFlash and Evergreen//One suddenly look a lot smarter in a world of NAND and DRAM price shocks Dmitry Gorbatov Mar 20, 2026 Important Note for my readers: Writing this piece took me a lot longer than I normally spend on a post. It took a lot of reading and research. Many articles and blogs were written on the subject before NAND and DRAM costs went crazy. The dry-humor version is that the storage industry spent years insisting flash was just disk with better manners, and then acted surprised when the underlying physics eventually asked to speak with management. Now, let’s get to it. I can still picture the room. It wasn’t anything special — just another corporate competitive training session, the kind you’ve sat through many times if you’ve spent enough years in enterprise tech. This was at NetApp, in 2015 or 2016, back when flash was still a question mark. Not if, but how. The industry had not fully committed yet, and everyone was trying to figure out what role it would play. The presenter clicked to the next slide, paused for a second, and said something that stuck with me in a way most of those sessions never do: “Pure Storage is crazy! They’re building their own flash modules. That’s stupid. It’s not sustainable. They won’t survive.” It wasn’t said for effect. There was no dramatic pause afterward, no attempt to persuade. It was delivered as a simple, almost obvious conclusion. And to be fair, it felt obvious. Because the entire storage industry operated on a shared assumption: you didn’t build components, you assembled them. You relied on a mature ecosystem of suppliers who specialized in drives, storage controllers, and memory, and you focused your differentiation on software features and integration. That was the efficient path. That was the scalable path. That was how serious companies behaved. What Pure was proposing at the time — what would later become Everpure — felt like a deviation from that logic. Building your own flash modules didn’t just introduce complexity; it seemed to reject the economic advantages of the broader supply chain. It looked like a risk without a clear payoff. So the conclusion made sense. Until it didn’t. Looking Back, Differently If I think back to that training session now, I do not really see it as a moment where someone was foolish. I see it as a moment where the industry was trapped inside the logic of its own assumptions. If you believe flash should look like disk, then building your own flash modules sounds silly. If you believe storage is just a sequence of refresh cycles, then a model built around non-disruptive evolution sounds unnecessary. If you believe component pricing will keep trending in the right direction forever, then architectural efficiency feels like an academic luxury. But once those assumptions start to crack, the logic changes. And when it changes, the things that once looked eccentric start to look oddly prescient. A Change You Don’t Notice Right Away For years, nothing about that statement felt particularly worth revisiting. The industry moved forward in predictable ways. Flash became mainstream. Performance improved. Density increased. Vendors competed on features, benchmarks, and price points. The conversations most of us had with customers followed familiar patterns. If anything, the abstraction layers built around flash made things easier to consume. SSDs behaved like faster disks — and that was good enough. There is a reason they showed up in familiar HDD form factors. 
The industry was trying to preserve the old world while sneaking in a new medium. Keep the slots. Keep the enclosures. Keep the assumptions. Change as little as possible. That made adoption easier, but it also buried the problem. Because flash is not a disk. It never was. It does not behave like one, and it does not particularly enjoy being treated like one. The only reason the illusion worked is because the industry built a fairly elaborate translation layer to maintain it. That translation layer is where the story really starts.

The Trick That Made Flash Look Simple

When commodity SSDs became the standard way to bring flash into enterprise storage, they depended on a piece of internal firmware called the Flash Translation Layer, or FTL. Its job was deceptively simple: make raw NAND look like a disk. That sounds harmless enough until you think about what that actually requires. NAND cannot just overwrite data in place the way the rest of the stack would like it to. It has to handle erase cycles, wear leveling, garbage collection, bad block management, and the constant translation between logical addresses and physical locations on the media. So every SSD became its own little self-contained world, complete with its own controller, its own metadata tables, and its own DRAM to keep track of everything. In other words, every drive became a tiny independent computer, making local decisions in isolation.

That design solved the adoption problem. It did not solve the architecture problem. For a while, the tradeoff seemed worth it. The drives were fast enough, the packaging was familiar, and the whole system kept pretending that flash was just a much nicer version of disk. But what looked neat and modular at small scale turned out to be awkward and expensive at enterprise scale. And that is where the "stupid" decision begins to look a lot smarter.

What Commodity SSDs Actually Drag Along With Them

The more I researched this topic (and believe me I did), the more I realized how much of the industry got comfortable with an abstraction that was doing a lot of quiet damage. Commodity SSDs carry four structural inefficiencies that matter much more today than they did when pricing was stable.

Trapped DRAM. Every SSD maintains its own mapping tables, so large-scale systems end up carrying a remarkable amount of DRAM inside the drives themselves. That memory is necessary for the SSD to function, but it does not really help the array think globally. It is duplicated overhead, repeated again and again, drive by drive. In a petabyte-scale system, that is not a rounding error. It is cost, power, and complexity hiding in plain sight.

Unpredictable Latency. Garbage collection inside a traditional SSD happens when the drive decides it needs to happen. When that occurs, the drive may become temporarily less responsive, and in an array full of independent drives, those little stalls start to show up as tail-latency spikes. The system is always vulnerable to one drive having a private crisis at exactly the wrong time.

Write Amplification. Because the SSD does not really understand the workload or the data structures above it, it moves data more often than necessary. More movement means more writes. More writes mean more wear. More wear means the media gets consumed faster than it should.

Over-provisioning. Every SSD holds back some raw capacity for its own housekeeping and spare-cell management, but that reserved space is siloed. The array cannot use it intelligently across the system because each drive is managing its own private affairs.

None of this sounded especially dramatic when NAND kept getting cheaper and the economics of flash kept improving. It sounded like engineering trivia. The sort of thing infrastructure people argue about while everyone else waits for the quote. Today it is not trivia. Today it is exposure.

Why AI Made This Suddenly Everyone's Problem

For years, one of the quiet assumptions in enterprise IT was that storage capacity would continue to become cheaper and more abundant over time. Not perfectly, not smoothly, but predictably enough that the inefficiencies of the underlying architecture could be tolerated. That assumption is now not only under pressure, it is getting decimated.

AI did not just create a new category of interesting workloads. It created a global appetite for silicon that is large enough to bend supply curves. The cute part of AI is easy to mock. The cat kicking the T-Rex. The surreal generated videos. The deepfakes that make you look twice and then sigh a little for civilization. But behind every one of those outputs is a less funny reality: extraordinary consumption of DRAM, NAND, GPUs, and supporting infrastructure. The novelty at the edge is powered by very serious resource demand at the core. And that demand is landing directly on the components enterprise storage depends on.

This is the part customers are beginning to feel in ways that are no longer abstract. Expansion quotes do not look as comfortable as they once did. Refresh cycles feel more expensive. Delivery windows stretch. Budgets built on assumptions from even two years ago suddenly need more explaining than anyone wanted. There is a tendency to call this inflation because that is the easiest word available. It is not really inflation. It is supply and demand, with a side of semiconductor reality.

And that matters, because a traditional SSD array is exposed to both sides of the problem at once. It is exposed to NAND because that is the medium you are buying, and it is exposed to DRAM because every SSD drags its own DRAM overhead along for the ride. When those two markets tighten at the same time, the cost of the architecture gets hit twice. That is not just a technical nuance. That is economics.

Revisiting the "Stupid" Decision

This is where the old training-room comment starts to age badly. Because what looked like unnecessary vertical integration was really a decision to stop pretending flash was a disk and start treating it like what it actually is: semiconductor media with very specific physical behaviors that should be managed at the system level, not hidden inside dozens of drives.

That is the DirectFlash idea in plain English. Take the Flash Translation Layer out of the individual drive. Pull media management into the operating environment. Let Purity manage flash globally instead of leaving each device to improvise its own local strategy.

That changes more than performance charts. It means metadata no longer has to be duplicated and trapped inside every SSD. It means wear leveling can happen across the full system instead of inside the borders of a single device. It means bad block handling, garbage collection, and data placement can be coordinated with global context. It means the platform can see the difference between data that should live together and data that should not, which dramatically reduces unnecessary movement and lowers write amplification.
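To make write amplification a little less abstract, here is a toy simulation written for this post. It is not a model of any real SSD, and it is not Purity's actual placement logic; every constant in it is invented for illustration. It compares two placement policies on the same skewed workload: one that mixes hot and cold data in the same erase blocks, roughly what an isolated per-drive FTL ends up doing, and one that groups data by expected lifetime, the kind of decision a controller with global context can make.

```python
# Toy write-amplification model: out-of-place writes plus garbage collection.
# Purely illustrative; sizes, skew, and policies are made up for this sketch.
import random

PAGES_PER_BLOCK = 64
NUM_BLOCKS = 256                                            # physical erase blocks
LOGICAL_PAGES = int(NUM_BLOCKS * PAGES_PER_BLOCK * 0.85)    # ~15% over-provisioning
HOST_WRITES = 100_000
HOT_FRACTION = 0.2                                          # 20% of pages take 80% of writes

class Flash:
    def __init__(self, separate_streams):
        self.separate_streams = separate_streams
        self.valid = [set() for _ in range(NUM_BLOCKS)]     # logical pages valid per block
        self.where = {}                                      # logical page -> current block
        self.free_blocks = list(range(NUM_BLOCKS))
        self.open_block = {"hot": None, "cold": None}
        self.fill = {}                                       # block -> pages programmed
        self.nand_writes = 0

    def _new_open_block(self, stream):
        blk = self.free_blocks.pop()
        self.open_block[stream] = blk
        self.fill[blk] = 0
        return blk

    def _program(self, lpage, stream):
        """Write one logical page out of place, invalidating any older copy."""
        if not self.separate_streams:
            stream = "hot"                                   # single shared write stream
        blk = self.open_block[stream]
        if blk is None or self.fill[blk] == PAGES_PER_BLOCK:
            blk = self._new_open_block(stream)
        old = self.where.get(lpage)
        if old is not None:
            self.valid[old].discard(lpage)                   # the old copy becomes garbage
        self.valid[blk].add(lpage)
        self.where[lpage] = blk
        self.fill[blk] += 1
        self.nand_writes += 1

    def _gc(self):
        """Reclaim space: relocate the block holding the least still-valid data."""
        open_now = set(self.open_block.values())
        candidates = [b for b in range(NUM_BLOCKS)
                      if b not in self.free_blocks and b not in open_now]
        victim = min(candidates, key=lambda b: len(self.valid[b]))
        for lpage in list(self.valid[victim]):
            self._program(lpage, "cold")                     # relocated data is cold by now
        self.valid[victim].clear()
        self.fill.pop(victim, None)
        self.free_blocks.append(victim)

    def host_write(self, lpage, stream):
        while len(self.free_blocks) < 3:                     # keep a small free reserve
            self._gc()
        self._program(lpage, stream)

def run(separate_streams):
    random.seed(42)                                          # same workload for both runs
    flash = Flash(separate_streams)
    hot_limit = int(LOGICAL_PAGES * HOT_FRACTION)
    for _ in range(HOST_WRITES):
        if random.random() < 0.8:
            flash.host_write(random.randrange(hot_limit), "hot")
        else:
            flash.host_write(random.randrange(hot_limit, LOGICAL_PAGES), "cold")
    return flash.nand_writes / HOST_WRITES                   # write amplification factor

print(f"WA, hot and cold mixed in the same blocks: {run(separate_streams=False):.2f}")
print(f"WA, hot and cold kept in separate blocks:  {run(separate_streams=True):.2f}")
```

On this kind of workload the separated run should finish with the lower write amplification factor, because garbage collection keeps picking nearly empty hot blocks and rarely has to relocate still-valid cold data. That is the mechanism behind the point that follows: fewer background writes for every host write.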
And when write amplification drops, the economics change. The NAND lasts longer. The useful life of the media extends. Lower-endurance flash, like QLC, becomes viable for serious enterprise use because the software is smart enough not to abuse it. The system extracts more useful work from the same raw silicon. That is not just clever engineering. That is insulation from volatility.

The reason this matters now is that DirectFlash changes the ratio between the silicon you buy and the value you get from it. If the rest of the market is paying more for NAND and more for DRAM, an architecture that reduces trapped DRAM, minimizes wasted writes, extends media life, and packs far more capacity into far denser modules is not just elegant. It is economically defensive.

This is where the old "they build their own flash" criticism misses the point. Building your own flash modules was never the point by itself. The point was controlling the relationship between software and media well enough to eliminate the inefficiencies the commodity model had normalized.

Why Purity Is the Real Story

DirectFlash makes for a good visual. It is a module. You can point to it. You can talk about density and reliability and the fact that a 150TB module can do work that would have required a small army of traditional devices not all that long ago. But the real story is the Purity Operating Environment, i.e., the software. Purity is where the architectural bet pays off. It is what turns raw NAND into a coordinated system instead of a pile of politely disagreeing SSDs.

Because Purity sees the entire media pool, it can write more intelligently. It can group data with similar expected lifespans together, so that when a snapshot or a temporary workload disappears, whole regions of storage can be retired cleanly instead of forcing background reshuffling of still-live data. That reduces unnecessary churn. Less churn means fewer writes. Fewer writes mean longer media life.

Because Purity sees when a NAND die is busy with an erase or program cycle, it can avoid letting that become a host-visible performance problem. RAID-3D and system-level awareness allow the platform to reconstruct data from parity rather than simply waiting for a busy drive to get its act together. The end result is deterministic performance rather than a roulette wheel of occasional latency spikes.

Because Purity owns media management globally, the over-provisioning and spare resources are no longer trapped in per-drive silos. The system can use them strategically.

I know that all of this sounds a bit scientific, and to be fair, it is. I did spend over 7 years working for Everpure and a few weeks researching for this post. I wanted to sit with that science for a bit.

Where the Economics Start to Matter

The moment component pricing becomes unpredictable, architecture stops being an engineering preference and starts becoming a financial strategy. That is the part that matters most to customers right now. A traditional buying model assumes that at some point you will hit a refresh cycle, a capacity wall, or a migration event that forces a purchase whether the market timing is good or terrible. You buy when you have to buy. If NAND is expensive, that is unfortunate. If DRAM is expensive too, even better, because apparently the universe enjoys symmetry.

That is what makes the combination of DirectFlash and Evergreen so important. DirectFlash reduces the amount of waste, duplication, and premature wear in the system.
Evergreen removes the old habit of tying innovation to forklift replacement. Controllers evolve. Capacity can be consolidated into denser modules over time. Data stays in place. The customer is not forced into rebuying the whole environment every few years just to remain current.

That already changes the economics. But it still leaves one more question: who is carrying the price risk? And this is where Evergreen//One matters more than ever.

The Part I Actually Wanted to Get To

Evergreen//One is not just a consumption model. It is not just a nicer way to finance storage. It is a mechanism for moving volatility away from the customer. That is the conclusion I wanted to earn, not just declare.

When NAND and DRAM prices start climbing, most traditional models push that turbulence straight into the customer's planning cycle. The customer eats the increase, absorbs the uncertainty, and tries to explain to the business why the infrastructure line now behaves like it has a gambling problem.

Evergreen//One changes that relationship. The customer consumes capacity as a service. Everpure owns the burden of the underlying hardware lifecycle, the media strategy, and the ongoing optimization. DirectFlash makes that model stronger because the platform is structurally more efficient with the silicon it uses. It needs less trapped DRAM, wastes fewer writes, extends media life, and supports denser modules that deliver more usable capacity per unit of power, space, and raw media. Purity compounds that advantage with data reduction, ongoing software improvements, and smarter system-wide media management.

Put differently, Everpure is in a much better position to absorb and manage component volatility than a customer buying boxes on a refresh schedule. That is the real price protection story. Not some magical promise that economics no longer apply. They do. NAND still costs what NAND costs. DRAM still costs what DRAM costs. Physics remains annoyingly undefeated. The difference is who is exposed to that volatility, how much inefficiency is built into the system before the customer ever sees it, and whether the operating model gives the customer a stable runway instead of a quarterly surprise.

DirectFlash reduces the waste. Evergreen removes the forced disruption. Evergreen//One shifts the risk. That combination is a lot more interesting than it sounded in that room 11 years ago.

The Part I Didn't Appreciate Then

What I did not understand sitting in that room 11 years ago was that some decisions are made for futures that have not arrived yet. The market eventually caught up to the architecture. That does not happen often enough in enterprise tech to ignore when it does.

DirectFlash was never interesting just because it was different. It was interesting because it removed layers of inherited inefficiency that the rest of the market had accepted as normal. And in a period where NAND and DRAM prices are under pressure, removing inefficiency is no longer just a performance story. It is a protection story.

That is why this matters now. Not because it makes for a clever slide. Because it gives customers a more predictable way forward when the underlying component markets are anything but predictable. And in the current environment, that might be the most practical definition of innovation there is.

Appreciate you reading.

Dmitry Gorbatov
© 2025 Dmitry Gorbatov | #dmitrywashere

When Data Becomes the Mission
Why state and local government, cities, and research universities are reorganizing infrastructure around data itself

If you remember one thing from this article: infrastructure used to organize around applications. Increasingly, it now organizes around data.

If you spend enough time around enterprise infrastructure, you start to notice something about how conversations begin. Someone asks about storage. Not in a philosophical way. In a practical way. How much capacity do we have left? What's the refresh cycle? Is this staying on premises or moving to cloud? What's the backup strategy? For years, that framing made perfect sense. Infrastructure was the foundation, and the job of infrastructure teams was to keep the lights on and the foundation solid. But lately, in conversations with customers across state and local government, municipalities, cities, and universities, something feels different. Because eventually someone says something like this:

"We have this data… but we can't actually use it."

And that is when the real conversation begins.

Why the public sector reveals the truth about data

There's a perspective I heard recently that stuck with me. The public sector isn't a niche market. It's a microcosm of the entire enterprise technology world. At first that sounds counterintuitive. The stereotype is that government IT has been quietly living under a rock since the previous century, next to a beige server and a stack of COBOL manuals. But if you look closely, the opposite is true. State agencies, cities, and research institutions operate in environments that combine nearly every architectural challenge the private sector faces — all at once.

• Massive datasets
• Highly distributed users
• Strict security requirements
• Long retention policies
• Global collaboration
• And an absolute requirement that systems remain available when people need them most.

In other words, the public sector experiences the full spectrum of data challenges simultaneously. If you want to stress-test a data architecture, put it inside government.

Think about it. A state government may run thousands of systems across dozens of agencies, each serving different missions but increasingly sharing the same underlying data. A city manages infrastructure at the physical edge of society — traffic, water, SCADA, emergency services — where real-time decisions depend on accurate information. Universities generate some of the largest research datasets on earth while collaborating across institutions and countries.

Each of these environments demands something slightly different from infrastructure. But they all demand the same thing from data: Security. Integrity. Mobility. Context. Availability. And when those requirements collide in one environment, something interesting happens. The solutions that work there tend to work everywhere.

A laboratory for the modern data enterprise

This is why many technology leaders quietly view the public sector as something more than a vertical market. It's a laboratory for enterprise-scale data architecture. If a platform can operate in a world where:

• sensitive personal data must remain protected
• systems span thousands of locations
• regulatory oversight is constant
• and uptime has real public consequences

…then that architecture will almost certainly succeed in commercial environments. Banks, manufacturers, healthcare providers, and global enterprises face the same challenges. Just rarely all at once. Government simply compresses those problems into a single environment.
Solve the data problem for government, and you solve it for the enterprise. That's one reason the shift toward data-centric platforms is becoming so important. When organizations treat infrastructure as a place to store files, they solve only a small part of the problem. But when they treat data as the central operational asset — something that must be understood, governed, protected, and made usable across environments — the architecture begins to look very different. And the public sector, with all its complexity, becomes the place where those architectures are tested first.

Which brings us back to the shift we're seeing across the industry. Because once you start looking at infrastructure through the lens of data itself, something else becomes obvious. The center of gravity has moved. When multiple systems depend on the same dataset, the data becomes part of the operating foundation. And once that happens, moving it — or even restructuring it — becomes dramatically harder. Which brings us to the concept that explains a lot of what is happening right now.

The quiet physics of data gravity

The first time I heard the term "data gravity" wasn't in a conference keynote or a vendor presentation. It was in 2015, when a recruiter from a startup called DataGravity (now Anomalo) reached out and asked if I would be interested in interviewing. At the time, the idea sounded fascinating — and slightly theoretical. The company was built around the premise that data itself was becoming the most valuable asset in the data center, and that infrastructure needed to understand the content, context, and behavior of data, not just store it. The name alone hinted at something deeper: the idea that as datasets grow, they start exerting a kind of gravitational pull on the systems around them. Back then, it felt like an interesting concept. Today it feels like a description of reality.

The term "data gravity" itself was introduced by Dave McCrory back in 2010 (see Dave McCrory's blog), and it turns out to be a remarkably accurate way to describe modern infrastructure. The idea is simple. As datasets grow, they become harder to move. More applications depend on them. More workflows connect to them. More policies govern them. Eventually, the architecture starts organizing around the data itself. Not because someone designed it that way. Because the physics of large systems leave you very little choice.

Imagine trying to relocate a state Medicaid dataset that has been integrated with multiple benefit programs, identity verification systems, and fraud detection tools. Technically possible? Sure. Operationally trivial? Not even close. The larger and more interconnected the dataset becomes, the stronger its gravitational pull. Compute moves closer to the data. Applications move closer to the data. Infrastructure reorganizes around the data. This is why organizations that once talked primarily about storage capacity are now talking about data platforms. The center of gravity moved.

When data stops being passive

The moment data becomes operational, everything changes. For years, most organizations treated data as something that accumulated quietly inside systems. Applications produced it. Storage kept it safe. Backups made sure it could be restored. But that model starts to break down when the data itself becomes part of real-time decision making. You can see this most clearly in environments that generate enormous volumes of information.
Cities now run infrastructure that continuously streams telemetry — traffic sensors, utility meters, environmental monitors, emergency response platforms. A water meter that once reported usage once a month might now generate thousands of readings per year. A traffic system that once relied on static timing can adapt dynamically to real-time conditions. Each improvement creates more data. More importantly, it creates operational dependence on that data.

Universities experience the same phenomenon in a different form. Research environments produce extraordinary datasets across genomics, climate science, and artificial intelligence. Sequencing a single human genome generates roughly 100 gigabytes of raw data, and large research programs may create terabytes or petabytes of new information every week. In those environments the challenge isn't just storing data. It's feeding it fast enough to the systems that depend on it. Modern research clusters and GPU environments can process enormous volumes of information, but only if the underlying data pipeline keeps up. When storage cannot deliver data fast enough, expensive compute resources sit idle and discovery slows down.

And that reveals an important truth about modern infrastructure. When systems depend on data in real time, the question stops being where the infrastructure lives. The question becomes whether the data is available, trustworthy, and recoverable. That distinction also explains why ransomware has become so disruptive to public institutions. Attackers understand that the real leverage is not the servers or the network. It's the data. When access to data disappears, the services built on top of it disappear as well.

Which brings us back to the deeper shift happening across the industry. If data has become this central to operations, services, and discovery, then managing it as a passive byproduct of infrastructure is no longer enough. Infrastructure alone is no longer the strategic layer. The strategic layer is the data itself. Organizations still need performance, availability, and resilience. Those fundamentals have not changed. What has changed is the expectation that infrastructure should also help organizations understand, govern, protect, and use their data more effectively. That is a very different problem than simply storing it. And it is the reason the conversation is evolving from storage management to data management platforms.

The real punch line

Public sector organizations didn't set out to become data enterprises. Over time the data accumulated. Then the dependencies formed. And eventually everything started orbiting the datasets that mattered most. Data has gravity. Data has risk. Data has power. Infrastructure still matters. But increasingly, the real mission is something else entirely. The mission is the data.

Appreciate you reading.

Dmitry Gorbatov
© 2025 Dmitry Gorbatov | #dmitrywashere