The Art of Sizing: Breaking the Myths of Oracle Compression
If you work in storage or databases, you have probably heard the pitch: turn on Oracle compression, send fewer bytes, save space, reduce I/O, and lower cost. That sounds great on a slide. In the real world, it is often the opposite.

This installment of The Art of Sizing breaks down one of the most persistent myths in enterprise infrastructure: that Oracle host-based compression is automatically a win. We are going to walk through the major compression types, where they help, where they hurt, and why the wrong compression decision can actually create more cost, more network traffic, more I/O, and more work for both storage admins and DBAs.

The goal is not to say compression is bad. The goal is to size it correctly, understand where it belongs, and avoid paying premium dollars to make your systems do extra work.

Why this myth survives

Compression has a good reputation for a reason. Historically, it solved real problems. Storage was expensive, bandwidth was limited, and shrinking data was often the simplest path to efficiency. That logic still holds in some places.

But in Oracle environments, especially transactional ones, the story gets more complicated. Oracle is not just writing datafiles. It is writing redo, managing undo, reorganizing blocks, updating symbol tables, and sometimes re-processing data later for deeper compression. That means a “smaller data footprint” does not always equal a smaller infrastructure burden. Sometimes it just shifts the burden somewhere else.

First, let’s separate the compression families

Not all compression is the same, and not all of it behaves the same way in Oracle.

1. General lossless compression

This is the classic world of ZIP, GZIP, LZ77, LZ78, DEFLATE, ZSTD, and similar algorithms. The point is simple: reduce size without losing information. These methods are excellent for files, backups, archives, and many data services.
Modern storage platforms use fast versions of these ideas in ways that are largely invisible to the application.

2. Lossy compression

Think JPEG, MP3, MPEG, and H.264. These formats intentionally throw away some data in exchange for dramatic size reduction. They are incredibly effective for media, but they are not relevant for Oracle datafiles because databases generally require exact fidelity.

3. Oracle database compression

This is where the confusion starts. Oracle has several different compression approaches, each with different behavior, licensing implications, and performance trade-offs:

- Basic Table Compression
- Advanced Row Compression
- Advanced Index Compression
- SecureFiles LOB Compression
- Hybrid Columnar Compression
- RMAN Backup Compression
- Automatic Data Optimization and ILM-driven background re-compression

Lumping all of those together under “Oracle compression” is one of the fastest ways to make bad architecture decisions.

The big myth: compressed writes mean less work

Here is the myth in plain English:

If Oracle compresses the data before it sends it to storage, the network carries fewer bytes, the array writes less data, and the whole system gets more efficient.

What that myth ignores is the full lifecycle of an Oracle write. In active transactional systems, Oracle prioritizes commit latency. That means the redo stream is written first, and it is written uncompressed. Later, as blocks fill and thresholds are crossed, Oracle may compress or re-compress them in memory. That structural change can generate additional redo. If data is later pushed into deeper formats through background optimization or archive-style compression, the system may read, process, and write the same data again.

So yes, one part of the path may get smaller. But the total system effort often gets bigger.

Figure 1: The life of a compressed I/O inside Oracle (OLTP write path).
Notice that data is NOT compressed when it first travels to storage as redo, that write amplification happens at the threshold-hit step, and that this is where the footprint can start to grow.

How to actually see it happening

This is the part both DBAs and storage admins care about: how do you know Oracle is revisiting blocks, delaying compression, and generating extra work after the commit already succeeded?

The threshold delay, in plain English

With Advanced Row Compression, Oracle does not usually compress every row the moment it is inserted. Instead, rows are typically written into the block uncompressed first so Oracle can keep transactional latency low. Oracle keeps watching the remaining free space in that 8KB block. Once the block crosses an internal fullness threshold, Oracle goes back, builds or updates the symbol table, batch-compresses the block in memory, and then has to account for that structural change. That delayed work is what I mean by the threshold delay.

So the timeline looks more like this:

1. user writes data
2. Oracle writes redo for durability
3. commit returns quickly
4. the block stays in buffer cache
5. the block fills further
6. threshold is crossed
7. Oracle compresses or re-compresses the block in memory
8. additional redo and block maintenance activity can follow

That is why the system can look quiet at commit time and then busy again later.

How you can tell Oracle is "redoing" things

You are usually not looking for one giant smoking gun. You are looking for a pattern where post-commit activity does not line up with the simple story of "we wrote it once and moved on."
Common signs include:

- redo generation that seems higher than expected for the amount of business data changed
- continued redo and log write activity after the original insert burst is over
- CPU spikes around block maintenance rather than just around user SQL
- periodic write bursts that do not line up cleanly with front-end transaction volume
- maintenance-window bandwidth spikes when colder data is being reworked into deeper compression formats
- storage-side churn where the array still sees a lot of activity even though the data was supposedly "compressed already"

At the database layer, the giveaway is often the mismatch between application change volume and the total work observed in redo, background activity, and later data movement. At the storage layer, the giveaway is seeing traffic patterns that look like read-process-write loops rather than a single smooth write path.

Why the database can get larger after a few days

This confuses a lot of people because they expect compression to make the footprint immediately smaller and keep it smaller.

Figure 2: The background ADO/HCC re-compression path. Days or weeks after the initial write, background jobs read cold data off the array, re-process it on the host, and write it back, so segments can grow from extra redo, undo, rewritten copies, and unreclaimed space.
But Oracle compression can create delayed growth behaviors for several reasons:

- new rows may land uncompressed first and only later be reorganized
- recompression work can generate extra redo and undo
- background optimization jobs may read old blocks, reorganize them, and write new versions back out
- old extents may not be reclaimed immediately even after data is moved or rewritten
- free space inside segments may become fragmented in ways that do not instantly shrink the physical files
- if data is updated repeatedly, blocks can split, migrate, or be rewritten in ways that increase segment size before any long-term savings appear

So what looks like "compression should have made this smaller" can become "the system created more structures, more history, and more rewritten copies before it settled down."

For storage admins, this often shows up as a database that writes out one size on day one and then consumes more logical or physical space over the next several days as Oracle continues block maintenance, redo generation, archive activity, and background reorganization.

The practical operator lesson

If you want to understand whether compression is helping or hurting, do not just compare the first write size to the final stored size. Instead, look at the full lifecycle:

- initial redo volume
- later redo spikes
- buffer-cache and CPU behavior around block fullness
- archive log growth
- background maintenance windows
- segment growth over time instead of only at load completion
- storage bandwidth and write churn several days after the original ingest

That is where the threshold delay becomes visible. That is where the myth breaks.

Why host-based compression can cost more money

This is where The Art of Sizing matters most. Compression is often sold as a capacity story, but in Oracle it can quickly become a licensing and CPU story. Advanced Row Compression and SecureFiles LOB Compression are paid features. That means you are not just paying in cycles, you may be paying in Oracle licensing.
And if compression overhead pushes CPU consumption higher, you may end up needing to license more cores just to preserve the performance you had before. That is a brutal trade:

- You pay for the compression feature.
- You spend host CPU running the compression feature.
- You may need more licensed cores because of the compression feature.
- You still do not eliminate redo overhead.

At that point, “saving space” can become one of the most expensive optimizations in the stack.

Why it can create more network traffic

This is the part that surprises people. On transactional writes, the redo stream still moves as uncompressed change data so Oracle can preserve low-latency commit behavior. That means the initial transactional path does not magically shrink just because the table eventually lands in a compressed state. Then the hidden traffic starts:

- secondary redo generated when blocks are compressed or re-compressed
- additional log activity to track structural changes
- background movement when colder data is reorganized into deeper compression formats
- read-process-write cycles for jobs like ADO or HCC-related maintenance

For the SAN, that can mean less of a neat “compressed payload” story and more of a churn story.

Why it can create more I/O

Storage admins know this instinctively: once a system starts revisiting the same data repeatedly, the theoretical savings usually get eaten by operational noise. That is what can happen here. A write is not always a single write anymore. It can become:

- the original transactional activity
- redo logging for durability
- later in-memory compression work
- secondary redo for compression state changes
- future background read-and-rewrite operations for deeper compression

That is not reduced work. That is redistributed work, with extra steps.
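The write path listed above can be turned into a rough back-of-the-envelope number. The sketch below is illustrative only: the stage names mirror the list, `WriteLifecycle` and `write_amplification` are hypothetical helpers (not anything Oracle exposes), and every byte count is a made-up placeholder rather than a measurement. In-memory compression shows up as CPU rather than bytes, so it has no field here. Substitute figures from your own redo, archive-log, and array statistics.

```python
from dataclasses import dataclass


@dataclass
class WriteLifecycle:
    """Bytes written at each stage for one batch of business data.

    Every field is an input you would measure yourself; nothing here
    is read from Oracle or from the array.
    """
    business_bytes: int            # logical size of the rows the application changed
    initial_redo_bytes: int        # redo written at commit time (uncompressed)
    datafile_bytes: int            # block writes when dirty buffers are flushed
    secondary_redo_bytes: int      # redo caused by later (re)compression of blocks
    background_rewrite_bytes: int  # later read-and-rewrite into deeper formats

    def total_written(self) -> int:
        """Sum of all bytes that reached storage across the lifecycle."""
        return (
            self.initial_redo_bytes
            + self.datafile_bytes
            + self.secondary_redo_bytes
            + self.background_rewrite_bytes
        )


def write_amplification(lc: WriteLifecycle) -> float:
    """Total bytes written to storage per byte of business data."""
    if lc.business_bytes <= 0:
        raise ValueError("business_bytes must be positive")
    return lc.total_written() / lc.business_bytes


# Placeholder numbers chosen only to show the arithmetic.
example = WriteLifecycle(
    business_bytes=100,
    initial_redo_bytes=120,
    datafile_bytes=50,
    secondary_redo_bytes=30,
    background_rewrite_bytes=40,
)
print(f"write amplification: {write_amplification(example):.2f}x")
```

Even though the datafile write in this made-up example is half the size of the business data, the total written across the lifecycle is more than double it. That is the gap between "the table got smaller" and "the system did less work".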
For busy OLTP systems, that redistribution can show up as more write amplification, more jitter, and more performance variance than people expected when they first heard the word “compression.”

Why it creates more operational work

Compression decisions do not just affect hardware. They create administrative drag.

DBAs have to understand which compression mode is active, what is licensed, what is free, what silently triggered usage, and how it affects redo, CPU, and maintenance windows.

Storage admins have to explain why the array still sees redo churn, why bandwidth spikes appear during data reorganization, and why dedupe or downstream efficiency may not look the way a simplified Oracle story suggested.

And everyone gets more work when performance troubleshooting starts with a bad assumption.

A quick breakdown of the Oracle compression types

Basic Table Compression

Good for bulk loads and relatively static datasets. It is not a magic answer for active transactional workloads because it compresses data during direct-path (bulk) loads, so rows changed by ongoing conventional DML do not get the same benefit.

Advanced Row Compression

This is the big one in OLTP discussions. It supports active transactional operations, but it is also where deferred compression, block threshold behavior, secondary redo, and paid licensing can combine into a very expensive surprise.

Advanced Index Compression

Useful in the right indexing scenarios, especially with repetitive keys. This is more targeted and usually not the villain in the story, but it still needs to be understood separately from table compression.

SecureFiles LOB Compression

Can reduce footprint for large objects like documents, JSON, XML, and similar content, but it pushes work onto host CPU and can throttle ingestion performance when volumes are high.

Hybrid Columnar Compression

Very powerful for analytics, archival, and cold data patterns. It is not designed like OLTP row compression, and it often belongs in a very different conversation.
When used through background movement or deep reorganization, it can generate substantial read-process-rewrite churn.

RMAN Backup Compression

A separate discussion from live transactional compression. Useful when applied deliberately, but some algorithms can also introduce licensing implications.

The sizing lesson

This is the heart of the series. Do not size from the brochure claim. Size from the full path of work. When evaluating compression, ask these questions:

- What happens on the initial write path?
- What happens to redo?
- What CPU tax lands on the host?
- Does this feature introduce licensing cost?
- Will background maintenance create bursts of read-write churn later?
- Is the data really a good fit for host-side database compression, or would array-level reduction be cleaner?

If you do not answer those questions, you are not sizing compression. You are just hoping it behaves the way marketing described it.

Where compression often belongs instead

For many environments, especially where modern storage platforms provide inline reduction, the cleaner design is to let the database do database work and let the array do storage work.

Figure 3: Myth vs reality, where should compression live? Host-side compression adds CPU, redo, and license cost, while array-level compression keeps host writes normal and delivers predictable I/O with global dedupe.

That changes the equation:

- less host CPU consumed by compression logic
- fewer surprises tied to paid database options
- fewer extra redo side effects from re-compression behavior
- more predictable storage-side efficiency
- a simpler operational model for both DBAs and storage teams

That does not mean every Oracle compression feature is wrong. It means compression should be placed where it creates the least total system friction. And in many real-world environments, that is not at the host.

Final thought: compression is not free just because it saves space

Compression can absolutely be part of a smart architecture.
But if you only measure saved capacity and ignore processor cost, network churn, redo behavior, maintenance overhead, and operational complexity, you can easily end up paying more to store less. That is the myth this post is here to break.

In The Art of Sizing, the best design is not the one with the smallest number on a capacity chart. It is the one that delivers the best total outcome across cost, performance, simplicity, and operational sanity.

And when it comes to Oracle compression, that usually starts with asking a harder question:

Is this actually reducing work, or just moving it somewhere more expensive?

Coming next

In the next installment, we will look at the relationship between compression and encryption in Oracle, and why that combination can further change what the storage team sees and what the database team pays for.
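As a companion to the sizing questions in this post, here is a minimal sketch of the "paying more to store less" arithmetic. It is deliberately crude: `net_annual_benefit` is a hypothetical helper, all prices and ratios are invented placeholders (not Oracle list prices or measured compression ratios), and it covers only capacity savings versus licensing. Redo churn, CPU cycles, and operational effort are left out and would only push the result further down.

```python
def net_annual_benefit(
    raw_tb: float,
    compression_ratio: float,
    storage_cost_per_tb: float,
    feature_license_cost: float,
    extra_cores: int,
    license_cost_per_core: float,
) -> float:
    """Capacity savings minus licensing cost, in one currency unit per year.

    A positive result means compression pays for itself on these inputs;
    a negative result means you are paying more to store less.
    """
    if compression_ratio < 1:
        raise ValueError("compression_ratio must be >= 1 (for example 2.0 for 2:1)")
    stored_tb = raw_tb / compression_ratio
    capacity_savings = (raw_tb - stored_tb) * storage_cost_per_tb
    licensing = feature_license_cost + extra_cores * license_cost_per_core
    return capacity_savings - licensing


# Placeholder inputs: 100 TB at 2:1, with invented storage and license prices.
result = net_annual_benefit(
    raw_tb=100,
    compression_ratio=2.0,
    storage_cost_per_tb=200,
    feature_license_cost=8000,
    extra_cores=2,
    license_cost_per_core=3000,
)
print(f"net annual benefit: {result:.0f}")
```

With these invented numbers the capacity saved is worth less than the feature license plus the two extra cores, so the result is negative. The point is the shape of the calculation, not the figures.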
Toms
6 days ago Place User Blogs
Accelerate 2026 - Part 2 - The Light Switch Test
Earlier, in Part 1, I wrote that the Everpure Accelerate 2026 opening keynote did not really feel like a storage keynote. My takeaway from day one was simple: Everyone wants your data. The bigger question is who owns the context. Day two answered a different question. If day one was about why the Enterprise Data Cloud matters, day two was about how customers are supposed to get there without turning it into another giant transformation project that sounds great on stage or in a boardroom and then dies somewhere between budget approval, staffing constraints, internal politics, and the next urgent outage. That is why the second keynote mattered. It was not trying to restart the vision. The vision had already been established. It was about turning that vision into something customers could actually use: a methodology, a blueprint, and a way to connect data architecture to risk reduction, efficiency, agility, modernization, and business outcomes. And then John Colgrove, Coz, did what Coz does. He simplified the whole thing. Not by making it smaller. By making it clearer. The phrase that stayed with me from his session was not a technical phrase. It was not Enterprise Data Cloud, Data Primacy, Fusion, data intelligence, or workload mobility, even though all of those ideas were underneath what he was saying. It was the light switch. Coz talked about walking into a room at home and turning on the light. You know exactly what is going to happen. It is simple. It is obvious. It works the way you expect it to work. Then he compared that to walking into a conference room at the office, where five people spend the first few minutes trying to figure out how to turn on the right lights, dim the screen area, wake up the display, connect the laptop, and make the audio work. Everyone has lived that moment. It is also a perfect way to explain what Everpure has been trying to do since the beginning. Make the complicated thing feel like the light switch. 
That may sound too simple for enterprise infrastructure, but I think it is exactly the point. The best infrastructure does not feel simple because the problem is simple. It feels simple because somebody did the hard engineering work to hide complexity without hiding control. That has always been part of the Everpure story. When Pure Storage first became known in the market, the message was not only flash performance. Performance mattered, of course. But the thing customers really felt was that the experience was different. The arrays were simpler. The upgrades were non-disruptive. The support model was different. Evergreen architecture was different. The idea that you could keep modernizing without the usual forklift pain was different. Over time, that simplicity moved from one array to more of the environment. Fusion extended the idea from a single system to a fleet. Policy, placement, automation, workload mobility, service levels, compliance, and lifecycle management started to move from device-by-device thinking toward something broader. Now, with the Enterprise Data Cloud, Everpure is trying to move that simplicity again. From array to fleet. From fleet to data. From data storage to data management. That was the thread both Nirav Sheth and Coz pulled through the keynote, and I think it connected day two back to day one in a very useful way. They made it clear that the move from Pure Storage to Everpure is not an abandonment of what got the company here. It is a continuation of the same journey. That matters because customers are rightfully skeptical when technology companies rebrand or expand their message. They wonder whether the company is moving away from the thing they trusted. They wonder whether the new story is strategy or just vocabulary. Coz addressed that directly. We are not abandoning storage infrastructure. We are going to keep building the best storage infrastructure we can. 
But we are also going higher, because to build better infrastructure, you have to understand more about the data above it. That is a founder’s version of the message. Less theater. More first principles. If you store data, you want to know what it is. You want to know how it will be accessed. You want to know how often. You want to know what it relates to. You want to know whether there are copies. You want to know whether those copies create risk. You want to know whether the rules are being followed. The problem, as Coz pointed out, is that nobody really knows the future. The infrastructure has to be built for agility. That word gets overused, but in this context it matters. Agility is the ability to change without breaking everything. It is the ability to move workloads non-disruptively. It is the ability to rebalance a fleet. It is the ability to modernize hardware without turning it into a migration event. It is the ability to adjust policies as risk changes. It is the ability to bring intelligence to data that already exists instead of forcing the business to start over. That is where the Enterprise Data Cloud story becomes more practical. And I personally think the Enterprise Data Cloud Success Blueprint was the clearest example of that. I liked this part because it moved the conversation away from “look at all these capabilities” and toward “here is why it matters to you” and “what outcomes are you trying to drive?” That is where a lot of technology conversations go wrong. We get excited about the architecture and forget that customers are not buying architecture for the sake of architecture. They are trying to solve business problems with limited people, limited time, limited budget, and increasing pressure from every direction. They are dealing with supply chain constraints. They are being asked to do more with the same team. They are trying to create VMware optionality without making a reckless move. 
They are modernizing applications while still running legacy workloads that cannot just disappear. They are dealing with cyber risk, ransomware, and minimum viable business recovery. They are being asked to support AI before the data foundation is ready. The blueprint framework organized those pressures into three simple categories: risk reduction, efficiency, and agility. That may seem obvious, but obvious is underrated. Risk reduction is not just a security feature. It is knowing whether your data is protected, whether your snapshot policies are aligned, whether you can recover the minimum viable business, whether sensitive data is duplicated everywhere, and whether compliance follows the data instead of living in someone’s spreadsheet. Efficiency is not just a density number. It is energy efficiency, automation, operational scale, fewer manual tasks, fewer migrations, and fewer people spending nights and weekends babysitting infrastructure that should be managing itself. Agility is not just modernization language. It is VMware optionality, container readiness, AI readiness, cloud flexibility, application mobility, and the freedom to make the next decision without being trapped by the last one. I think that is a much better way to have the conversation with customers. Not “Do you want this product?” But “Which business outcome are you trying to improve, and what is standing in the way?” The Red Hat and CSX discussion made that practical. When Eric Grabill from CSX talked about Positive Train Control, sensors along the tracks, safety requirements, and systems where a loss of data can affect train operations, the conversation moved from platform strategy into the real world. That is where infrastructure earns its keep. CSX has already moved a large portion of its applications to Kubernetes on OpenShift, but still has legacy VMs remaining. That is the real enterprise pattern. It is not containers or VMs. It is containers and VMs. It is cloud and on-premises. 
It is modern and legacy. It is AI coming next while everything else still has to run today. The Red Hat and Portworx conversation made the point that modernization cannot mean creating another disconnected stack. Customers need one operating model across VMs, containers, and eventually AI workloads. They need a practical transition path, not a big bang migration. They need data services that protect the applications, not just compute platforms that can host them. The St. Elizabeth Healthcare conversation made the same point in a more personal way. Charles Shepherd talked about joining St. Elizabeth in 1997, starting at the help desk, moving through Novell, GroupWise, backups, storage, and eventually becoming part of the team responsible for systems that support a healthcare environment that never really stops. What stayed with me was not only the technical story. It was the laptop on vacation. Anyone who has worked in infrastructure understands that detail. The laptop that comes with you just in case. The phone you keep checking because maybe something happened. The family event where part of your brain is still in the data center. The trip where you are physically present but operationally on standby. That is not a feature comparison. That is a life comparison. Charles said he recently was able to go to his niece’s graduation and not get called. That sounds small only if you have never been the person who always gets called. He also talked about more than one hundred hardware upgrades and more than one hundred fifty Purity upgrades without downtime. He talked about moving from older systems to modern ones without the traditional forklift migration pain. He talked about change boards becoming comfortable with upgrades during the day because the process had earned trust. That is the kind of customer proof that matters. It shows what the solution that was delivered gives back. It gives back time, trust and confidence. That connects directly to the light switch idea. 
Simplicity is not cosmetic. It is not just a better UI. It is not just fewer clicks. Simplicity changes what people can spend their time on. It changes what teams believe they can safely do. And it changes whether the infrastructure team is trapped maintaining the past or free to prepare for what comes next. Coz also said something important about time. This Enterprise Data Cloud journey is not a one-year story. It is not one product cycle. It is not done because it showed up in an Accelerate keynote. Coz described it as a journey that will take five to ten years, and even then, it will not really be done because the solution will keep improving. I appreciate that kind of honesty. So when a founder says this is a long journey, I believe that more than I believe a slide that says “seamless transformation” in large font. But I also think now is the right time for the journey to become possible. And Coz reminded us that the best version of this is not complexity with better branding. The best version is the light switch. Coz, in the most Coz way possible, reminded everyone that the goal is not to make enterprise infrastructure sound impressive. The goal is to make the hard things feel obvious. Like turning on the lights. I appreciate you reading. Dmitry Gorbatov © 2025 Dmitry Gorbatov | #dmitrywashere
dmitrywashere
1 month ago Place User Blogs
Keeping Your Fleet Up-to-Date Just Got a Lot Easier
Did you know: 95% of Purity upgrades now finish in under 90 minutes. You can run them in parallel and your whole fleet finishes in the same time it takes to do one.

Every Purity release delivers more: better performance, new capabilities, the latest security updates. Staying current is how you keep pulling value out of hardware you already own. At Everpure, upgrades shouldn't be something you plan your week around, or something that delays the benefits every Purity release brings.

Self-Service Upgrades in Pure1 (SSU) let you upgrade Purity on your own schedule, directly from Pure1, without opening a support ticket. It has quietly become the most popular way customers keep their fleets current.

What's new: Automated SSU

SSU has always given you full control over the upgrade flow, with mandatory pauses after each major step (health check, download, installation) so that you decide when to continue. For teams who want to validate at every checkpoint, that is exactly how it should work, and that manual flow isn't going anywhere.

For everyone else, it meant mandatory delays and too much hands-on involvement: arrays sitting idle between phases, waiting for someone to click through. More time was spent on an upgrade than necessary, enough that some teams never tried SSU at all and kept pushing upgrades to later.

Automated SSU is the option for those who want to go fast without giving anything up. Pick any number of appliances, select the target Purity version, authenticate, and go. The workflow runs to completion on its own, non-disruptive by design, so your workloads keep running throughout.

If anything goes wrong, the upgrade pauses on that appliance and a proactive case opens with Everpure Support. Over 100 automatic health checks run before and during the upgrade, and the workflow won't move past a critical failure. First response from Support is typically 30 minutes for install issues, 60 minutes for others.

Built for fleets

Need to cover your whole fleet?
Select your appliances in bulk, hit go, and they upgrade in parallel, finishing in the same time it takes to do one. The Software Lifecycle dashboard shows you exactly what's running, what's done, and what (if anything) needs your attention. If your target version is several releases ahead, SSU computes the upgrade path and runs the intermediate hops on its own.

Get started in 15 minutes

Not on SSU yet? The one-time setup takes about 15 minutes: enable cloud connection on each appliance from the CLI, then bulk-install the Purity Upgrade Agent from Pure1. After that, it is ready when you need it.

Give Automated SSU a try. It really is easier than you think.

Full SSU prerequisites and setup guide
PrachiJain24
2 months ago Place User Blogs
Security Is Not a Feature — It's the Foundation
Let's get something out of the way upfront: this is not a ransomware horror story. This is not a "cyber resilience framework" deep-dive full of three-letter acronyms that could potentially make your eyes glaze over if it's not your cup of tea. And this is definitely not a pitch deck disguised as a blog post. This is the real story of how Everpure thinks about security — at the architecture level — and why that distinction matters more than most people realize when they're evaluating storage platforms. Because here's the thing: security isn't a bolt-on. It's not a checkbox. And it's certainly not a conversation you should have to schedule separately from the one about performance or reliability. At Everpure, security is baked in from the ground up — and once you understand how, you'll never look at a storage spec sheet the same way again. Start With the Five S's At Everpure, we talk a lot about what we call the Five S's of data: Simplicity, Speed, Scale, Sustainability, and Security. They're not independent pillars — they're interlocking principles that define every design decision we make. Simplicity because complexity is the enemy of agility. If you can't iterate quickly, you can't grow. Speed because we've been all-flash since day one — full stop. Every generation of our platform has been optimized around flash, not retrofitted for it. Scale because data doesn't stop growing, and your storage shouldn't hit a wall when your business doesn't. Sustainability because power, cooling, and physical footprint are real constraints — especially now, as those pressures trickle down from hyperscalers to everyone else. Security because none of the other four matter if your data isn't protected. Security is the one that tends to get either oversimplified ("we encrypt everything") or overcomplicated ("here's our 47-page compliance matrix"). Neither is helpful. 
What's helpful is understanding how it works, why it's different, and what it means in a real conversation with a real customer. The Compliance Landscape: What Customers Are Actually Asking About Before we get into the architecture, let's talk about the validations — because customers are increasingly asking about them, and the answers matter. FIPS 140-3 is the latest standard from the Cryptographic Module Validation Program (CMVP), managed by NIST. It validates that a cryptographic module — the thing actually doing the encryption — meets a defined security standard. Everpure's FlashArray is FIPS 140-3 validated. That's the current gold standard, and it matters especially as post-quantum cryptography conversations start entering the room. (More on that in a moment.) Common Criteria is an international standard for evaluating the security of IT products — not just storage, but networking, applications, hardware modules, and more. Everpure's FlashArray is certified under the Network Device collaborative Protection Profile (NDcPP) via NIAP, while FlashBlade holds an EAL2 certification. Independent testing and verification confirm that each platform meets its defined security target. You can actually enable Common Criteria mode directly on a FlashArray — it's a CLI command, not a professional services engagement. PCI DSS compatibility is table stakes in financial services, but it increasingly shows up in other industries too. It means end-to-end data masking, encryption in-flight and at rest, and a well-documented audit trail. Everpure's platforms are designed to support PCI DSS requirements natively — though it's worth noting that PCI DSS certification belongs to the merchant environment as a whole, not to any individual storage component. TLS 1.2 and 1.3 are the current standards for securing data in-flight at the management layer. 
Everpure standardizes these across all management communications, and yes, you can turn off older cipher suites if your security posture requires it. TAA Compliance means that Everpure's hardware is manufactured in the United States. For customers in regulated industries or government, this isn't a nice-to-have; it's a requirement. And for anyone who cares about supply chain transparency, Everpure can show its work. None of this is marketing fluff. These are independently validated, publicly verifiable certifications. You can find all of them (current CVE database, FIPS status, NIST 800-53 alignment, media sanitization documentation) at our Customer Trust portal. Bookmark it. It's fully public-facing and constantly updated. The Hardware Story: Why No Keys on the Drive Is the Point Here's where things get interesting. Take a Direct Flash Module, Everpure's approach to flash, and look at what's not on it. No CPU. No memory. No encryption keys. It is not a self-contained storage array. It is purpose-built flash media, and everything else (the intelligence, the encryption, the key management) lives in software. Why does that matter? Because self-encrypting drives (SEDs) are a pain. Anyone who's managed them in a regulated environment knows this intimately. When the encryption is in the hardware, you inherit all the complexity that comes with it: drive-level key management, FTL overhead, KMIP integration headaches, and the ever-present risk that a single drive failure or misconfiguration creates a data accessibility nightmare. Everpure's approach flips this entirely. Because the Direct Flash Module has no CPU, no memory, and no keys, all encryption is handled at the software layer, in Purity, running across the entire system. This means no hardware dependency, no FTL management overhead, and no encryption key tied to a specific piece of media. The portability this creates is remarkable. And as you'll see in a moment, it's the foundation of everything else.
How Everpure's Encryption Actually Works Let's peel back the layers here, because this is genuinely cool — and it's the kind of thing that separates a confident storage conversation from a "let me get back to you" one. Everpure's encryption architecture is built around three components: The Data Encryption Key (DEK) is the actual key used to encrypt customer data. There's one per array, and it doesn't change. You might think: why would you never rotate the key that's protecting your data? The answer is that the DEK never needs to rotate because of what wraps it. The Key Encrypting Key (KEK) is a key that encrypts other keys — specifically, it wraps the DEK. This is standard cryptographic practice, and it's the mechanism that makes key rotation safe, fast, and completely transparent to the workload. The Armored DEK is the DEK after it's been wrapped by the KEK. This is the piece that gets distributed. At no point is the raw Data Encryption Key exposed in clear text. It's always wrapped, always protected. Here's where the architecture gets elegant: when a FlashArray or FlashBlade initializes, it generates a KEK. That KEK wraps the DEK to create the Armored DEK. The Armored DEK is stored as a complete copy in every Direct Flash Module header — but it cannot be decrypted without the KEK. The KEK itself is derived from a scrambled key, which is split into individual shares and distributed one per DFM header using a sharding algorithm that requires a quorum to reconstruct. What does quorum mean in practice? The system can tolerate drive losses and still unlock all data, as long as enough DFMs remain present and healthy to reconstruct the scrambled key. No single drive is a single point of failure for your encryption keys. 
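To make the quorum idea concrete, here is a minimal sketch of threshold secret sharing. The article does not name the sharding algorithm Purity uses, so this uses Shamir's secret sharing as a stand-in, and the numbers (10 shares, quorum of 6) and the names `split_secret`, `recover_secret`, and `dfm_shares` are assumptions chosen purely for illustration.

```python
import secrets

# A Mersenne prime larger than any 256-bit secret; all arithmetic is modulo PRIME.
PRIME = 2**521 - 1


def split_secret(secret, shares, quorum):
    """Split `secret` into `shares` points; any `quorum` of them rebuild it."""
    if not 1 <= quorum <= shares:
        raise ValueError("quorum must be between 1 and the number of shares")
    # Random polynomial of degree quorum-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(quorum - 1)]

    def evaluate(x):
        result = 0
        for c in reversed(coeffs):  # Horner's rule
            result = (result * x + c) % PRIME
        return result

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover_secret(points):
    """Rebuild the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# Hypothetical layout: one share per drive header, 10 drives, quorum of 6.
scrambled_key = secrets.randbits(256)
dfm_shares = split_secret(scrambled_key, shares=10, quorum=6)

# Lose four drives: the six remaining shares still reconstruct the key.
surviving = dfm_shares[4:]
print(recover_secret(surviving) == scrambled_key)
```

With fewer than a quorum of shares, interpolation yields an unrelated value, which is the property that lets a system tolerate drive loss without making any single drive sufficient to unlock the data.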
When a read request comes in, here's what happens: the system reconstructs the scrambled key from a quorum of DFM shares, derives the KEK, and uses it to unwrap the Armored DEK — exposing the DEK temporarily in memory, never persisted in clear text — and uses it to decrypt the data. The process is reversed for writes. At no point is customer data stored or persisted in clear text. Everything written to NVRAM is encrypted before it ever reaches upper-level system processes. This isn't "we encrypt everything." This is a specifically designed cryptographic architecture that is portable, resilient, and opaque to any unauthorized party — including someone who physically removes a drive. Key Rotation: The Part Most Vendors Skip By default, Everpure rotates the Key Encrypting Key every 24 hours. Automatically. No KMIP server required. No scheduled maintenance window. It just happens. When a KEK rotates, the system generates a new one, re-encrypts the Armored DEK, and redistributes the updated scrambled key shares across all DFM headers. The DEK itself doesn't change — the workload never sees it — but the wrapping layer that protects it is refreshed daily. When drives are added or removed, the system treats this as a high availability event: it generates a new KEK immediately, re-encrypts everything, and rebalances the shards across the new drive configuration. The key material always matches the current system state. And when a DFM is removed from the system? The scrambled key shares on that drive correspond to a KEK that no longer exists — or will be rotated away within 24 hours. A removed drive becomes cryptographically useless. This is how Everpure delivers what some would call "instant media sanitization" — not by wiping the drive, but by invalidating the key that makes its contents meaningful. 
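The rotation behavior described above can be sketched in a few lines. This is a toy model under stated assumptions, not Purity's implementation: the `wrap` and `unwrap` functions use a SHA-256 keystream plus an HMAC tag as a stand-in for a real key-wrap algorithm such as AES Key Wrap, and every name here is hypothetical. The point is only to show that the DEK stays constant while the KEK and the armored DEK change.

```python
import hashlib
import hmac
import secrets


def _subkey(kek, label):
    """Derive separate encryption and MAC keys from the KEK (toy derivation)."""
    return hashlib.sha256(label + kek).digest()


def _keystream(key, nonce, length):
    """Toy keystream: SHA-256 in counter mode. A stand-in for a real cipher."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def wrap(kek, dek):
    """Return nonce || ciphertext || tag, the 'armored' form of the DEK."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(_subkey(kek, b"enc"), nonce, len(dek))
    ct = bytes(a ^ b for a, b in zip(dek, stream))
    tag = hmac.new(_subkey(kek, b"mac"), nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag


def unwrap(kek, armored):
    """Verify the tag, then recover the DEK; fails if the KEK is wrong."""
    nonce, ct, tag = armored[:16], armored[16:-32], armored[-32:]
    expected = hmac.new(_subkey(kek, b"mac"), nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong KEK or corrupted armored DEK")
    stream = _keystream(_subkey(kek, b"enc"), nonce, len(ct))
    return bytes(a ^ b for a, b in zip(ct, stream))


def rotate(old_kek, armored):
    """Generate a fresh KEK and re-wrap the same DEK under it."""
    dek = unwrap(old_kek, armored)  # the DEK is in the clear only in memory
    new_kek = secrets.token_bytes(32)
    return new_kek, wrap(new_kek, dek)


dek = secrets.token_bytes(32)
kek_v1 = secrets.token_bytes(32)
armored_v1 = wrap(kek_v1, dek)
kek_v2, armored_v2 = rotate(kek_v1, armored_v1)
print(unwrap(kek_v2, armored_v2) == dek)  # the DEK is unchanged after rotation
```

After rotation, the old KEK can no longer unwrap the current armored DEK, which mirrors why shares left on a removed drive stop being useful once the KEK they correspond to is gone.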
Rapid Data Locking: When You Need the Nuclear Option For environments where security isn't just a compliance requirement but a physical reality — air-gapped facilities, defense deployments, high-security data centers — Everpure has a capability called Rapid Data Locking (RDL). The concept: the Key Encrypting Key can be placed on a pair of hardware security tokens (one YubiKey per controller, two total) and inserted into the array. As long as the tokens are present, the array operates normally. If they are removed and the array is subsequently rebooted or power-cycled, the array cannot complete startup without the tokens present — the data remains physically intact, but it is cryptographically inaccessible. The array becomes, in the most literal sense, an expensive brick. Reinsert the tokens and power the array back on, and it boots up normally. This is the kind of capability that used to require expensive, bespoke security architecture. For Everpure customers, it's a feature of the platform. Dark Sites Are Getting Less Dark One more topic worth addressing: dark site deployments. Air-gapped environments have always involved painful tradeoffs — disconnected from cloud management, manual support processes, limited visibility into system health. That's changing. Dark site customers can now see their assets within Pure1 — subscriptions, health status, the ability to open and manage support cases — without compromising their air-gap requirements. Log obfuscation tooling is available today and will be integrated directly into the platform going forward, giving customers granular control over what telemetry leaves their environment and when. For partners and customers managing dark site deployments, this is a meaningful quality-of-life improvement. And it's consistent with how Everpure builds everything: the security architecture makes the operational flexibility possible, not the other way around. 
The Takeaway Security conversations in the storage industry tend to go one of two ways: a recitation of certifications that nobody fully understands, or a vague reassurance that "everything is encrypted." Neither builds confidence. Neither answers the real question, which is: how does this actually work, and why should I trust it? Everpure's answer starts with architecture. Software-managed encryption, no hardware key dependency, automatic key rotation, cryptographic portability, quorum-based scrambled key distribution, and capabilities like Rapid Data Locking that scale to the most demanding security requirements in the world. The certifications — FIPS 140-3, Common Criteria, TLS 1.3, TAA — aren't the story. They're the evidence. The story is that security was designed in from the beginning, not layered on afterward. That's a meaningful difference. And now you know why.
greGPT
2 months ago Place User Blogs
Part 2: MCP Is Interesting. Everpure Fusion Makes It Useful.
In Part 1, I tried to give MCP a proper “…splanation,” mostly because the first several times I heard people talking about Model Context Protocol, I had the same look Joey had in Friends when the salesman asked him if his friends ever had a conversation and he just nodded along without really knowing what they were talking about. That was me. MCP this. MCP server that. Agentic AI. Tool calling. Context windows. Protocols. Hosts. Clients. Servers. At some point, I realized I was nodding with the confidence of a man who had understood approximately 41% of the conversation and was hoping nobody asked a follow-up question. The simple version is this: MCP is a standard way for AI applications to connect to tools and data. It is not the AI model itself. It is not the magic brain. It is the plumbing that lets the AI reach into approved systems, ask better questions, retrieve useful context, and potentially take action through well-defined tools. That is important in the abstract. But for Everpure customers and prospects, it becomes much more interesting when we stop talking about MCP as a general AI concept and start talking about what it could mean for storage operations, data infrastructure, and Everpure Fusion. Because this is where the conversation moves from “AI is coming someday” to “your infrastructure may already need to be ready for how AI will interact with it.” Everpure recently published a blog with a sneak peek of the Everpure Fusion MCP Server, describing it as an open-source service that connects AI assistants to Everpure Fusion storage fleets through the Model Context Protocol. The important part is not simply that an AI assistant can talk to storage. That would be interesting, but it would also be easy to misunderstand. The important part is that the assistant can interact with the storage environment through the Fusion control plane, which already understands fleet-wide context across FlashArray and FlashBlade. That distinction matters. 
Without Fusion, many environments are still managed in a way that looks very familiar to anyone who has spent time supporting infrastructure. One array over here. Another array over there. Scripts in one folder. Notes in another. Naming standards that started strong and then apparently met reality. Screenshots in tickets. Tribal knowledge in the heads of a few people who somehow remember which workload lives where, which array is doing what, and why nobody should touch that one volume because “there was a reason,” even if nobody is entirely sure what the reason was anymore. That model may work, but it does not scale gracefully. More importantly, it is not especially friendly to automation, and it is definitely not ideal for AI-assisted operations. Most troubleshooting in mature environments is not hard because people lack tools. It is hard because the context is not immediately obvious. The storage admin has one view. The DBA has another view. The virtualization team has another view. The application owner has a completely different view, usually delivered through a ticket that says something deeply scientific like “the app feels slow.” Everyone may be looking at a valid piece of the puzzle, but the real work is in the correlation. Which volume maps to which workload? Which array is hosting it? What did latency look like during the reported window? Were IOPS elevated? Was bandwidth constrained? Did anything change recently? Are we looking at a storage issue, a database issue, an application issue, a noisy neighbor, a misconfigured VM, a bad query, or just another case of “the network is innocent until proven guilty, but still somehow looks suspicious standing there”? That is where Fusion and MCP together become compelling. The Everpure Fusion MCP example makes the idea real. 
Instead of forcing an administrator to manually build low-level REST API calls or jump between tools, an MCP-aware AI assistant can query Fusion through higher-level tools exposed by the MCP server. In the example the Everpure blog described, a storage admin can ask about workloads and volumes supporting a production SQL environment, including arrays, IOPS, latency, and bandwidth over a recent time window. The assistant can then correlate that storage perspective with information from another MCP server, such as SQL Server context around database files, wait types, and query behavior. That does not mean the AI replaces the storage admin. It does not mean the AI replaces the DBA. It does not mean everyone goes to lunch while the robot fixes production. And this is where I need to bring in The Big Bang Theory again, because apparently this is who I am now. There is a scene in the show where Raj is very open to the idea of aliens and extraterrestrial life. At the planetarium, Raj can look at flashes of light in the sky and talk about how scientists cannot fully rule out the possibility of alien civilizations. It is funny because Raj is a scientist, but he is also Raj, so the line between rigorous possibility and “maybe the aliens are waving at us” gets wonderfully blurry. That is how some people talk about AI operations right now. A light flashes in the sky, and suddenly someone is ready to announce that the robots are here to run the data center. Let’s not do that. The point is not that the AI is an alien civilization arriving to take over infrastructure operations. The point is that the interface is changing. The way humans interact with infrastructure is starting to move from manual lookup, command execution, and tribal knowledge toward assisted reasoning, guided action, and cross-system correlation. That is much more practical than aliens. It is also much more useful. Fusion already gives customers a fleet-wide control plane.
It gives you the ability to think above individual arrays, above one-off configuration, and above the old habit of managing infrastructure like every system is its own little island with its own weather pattern. MCP gives that control plane another interface, one designed for the way AI agents work. This is why Fusion adoption matters. If your environment is still managed mostly array by array, script by script, ticket by ticket, and screenshot by screenshot, then AI can only help so much. It may summarize the pain beautifully, but it is still summarizing pain. When you use Fusion to create a more consistent, policy-driven, fleet-aware operating model, you are not just modernizing storage management. You are making the environment more understandable to automation, to operations teams, and now to AI agents that need structured context in order to be useful. That is a very different conversation from “look, the AI can query storage.” The better conversation is this: if AI is going to become part of operational workflows, then your infrastructure needs to be ready to participate in those workflows. Fusion is one of the ways you prepare for that. Not someday. Now. And Fusion is not the only example of this direction. Another Everpure technical article shows how an MCP server can be built to integrate with FlashBlade, allowing an AI assistant to query system data and even take direct actions through a natural-language interface. That example is useful because it shows the bridge between the old world and the new one. In the old world, storage management often meant CLI commands, scripts, API calls, screenshots, and specialized knowledge living in the heads of a few very tired people. In the new world, those capabilities can be surfaced through an AI-assisted experience that understands the available tools and can help operators ask better questions in plain English. Again, that does not mean the AI should blindly run your infrastructure while everyone disappears. 
Please do not read this article and tell your change advisory board that “the blog guy said the robot can handle it.” That is not the point, and I would like to remain welcome in polite infrastructure society. The point is that the operational model is changing. For years, we have talked about automation in infrastructure, but a lot of what we called automation still required a human to know exactly what to automate, where to look, which command to run, which script was safe, which API endpoint mattered, and which piece of documentation had not quietly aged into fiction. AI-assisted operations changes the interaction pattern. Instead of always beginning with the operator knowing the exact command or API call, the operator can begin with the question. Why did this workload slow down? Which volumes support this application? What changed in the last four hours? Which arrays are carrying the highest latency? Which workloads are consuming the most bandwidth? Which policies are inconsistent across the fleet? Where do we have capacity pressure? Which storage objects are tied to this SQL environment? Those are the kinds of questions humans actually ask when something is happening. MCP gives AI assistants a standard way to ask approved systems for the data behind those questions. Fusion gives the storage estate a more consistent, policy-aware, fleet-level way to answer. That combination is where the opportunity lives. Now, because this is enterprise technology and not a children’s book, we also need to talk about the dangerous part. One of the readers posted this comment on LinkedIn yesterday: The moment an AI system can access tools and data, the conversation changes. A chatbot that gives a bad answer is annoying. An agent that takes the wrong action in a business system can become a real incident. If a model can read sensitive files, query databases, send messages, modify records, trigger workflows, or touch infrastructure, then security is not a feature.
Security is the premise. This is where some of the MCP enthusiasm needs adult supervision. We have spent years telling users not to click strange links, not to approve unknown applications, not to reuse passwords, and not to download random files. Now we are building systems where an AI assistant might read strange content, call external tools, and act on behalf of the user. That can be incredibly powerful, but only if we are honest about the risk. In some ways, MCP may expose organizational problems faster. If your data is scattered, stale, contradictory, or politically curated, an AI agent connected to it will not magically produce truth. It may simply produce a more polished version of the confusion. If your workflows are unclear, connecting AI to them may help automate the ambiguity, which is not quite the same thing as progress. The model can gather information, call tools, and complete steps, but people still need to define what should happen, what should not happen, what requires approval, and what good looks like. For Everpure customers and prospects, the more important question is not whether MCP is interesting. It is whether your environment is ready for this kind of interaction. That is where I would encourage customers to take a serious look at Fusion. Not because Fusion is another checkbox on a feature list, and not because every new technology conversation needs to end with someone saying “platform” three times into a mirror. Fusion matters because it changes the operational model. It gives you a way to manage data infrastructure as a fleet, with policy, consistency, automation, and context. Those are exactly the things AI agents need if they are going to do more than produce nicely formatted guesses. If you already meet all the prerequisites (Purity 6.8.+, LDAP enabled), use it. Explore it. Get comfortable with it.
Stop thinking about Fusion as something reserved for a future automation project after everyone finally gets through the current list of fires, renewals, upgrades, and meetings that should have been emails. MCP may be the plumbing that helps AI connect to the enterprise. Fusion helps make the storage environment worth connecting to. And that is the real call to action. Fusion is how Everpure customers make sure their data infrastructure is ready for it. Appreciate you reading. Dmitry Gorbatov © 2025 Dmitry Gorbatov | #dmitrywashere
dmitrywashere
2 months ago Place User Blogs
MCP, Joey Tribbiani, and the Moment AI Needed Plumbing - Part 1
People close to me know that I have a very annoying habit of memorizing, remembering, and using movie and TV show lines in normal conversation. I wish I could tell you this is a carefully curated personality trait, but it is probably closer to a long-running defect in the #dmitrywashere operating system. Some people remember birthdays. Some people remember where they parked. I remember a line from a sitcom episode that aired before half the people reading this had a LinkedIn profile. My two favorite sources are Friends and The Big Bang Theory, which probably says something about me that I am not emotionally prepared to unpack in public. There is a scene from Friends that has lived rent-free in my head for years, mostly because it captures something deeply human and mildly embarrassing. A salesman is talking to Joey and asks him a question that is both funny and a little too accurate: “Let me ask you one question. Do your friends ever have a conversation and you just nod along even though you’re not really sure what they’re talking about?” Joey, of course, immediately zones out. Not metaphorically. Not politely. He disappears into that wonderful Joey place where the mouth stays closed, the face stays agreeable, and the brain has clearly left the building. That was me the first few times I started hearing people talk about MCP. Not once. Not twice. Everywhere. MCP this. MCP server that. MCP is the future of agents. MCP is the USB-C of AI. MCP is how models connect to tools. MCP is the protocol that will make agentic AI real. MCP is the standard. MCP is the integration layer. MCP is the thing everyone apparently understood already, except somehow nobody had bothered to send me the memo. So I did what any responsible technology professional does in that situation. I nodded thoughtfully. The next thing I did was call my son, who is a Data Scientist, and ask him what MCP actually was. 
After listening to his explanation, I had the uncomfortable realization that he knew more about it than I did, which, naturally, did not feel great. That was just my ego talking, of course. He is way smarter than me. Then I went away and tried to figure out whether MCP was actually important or whether it was just another acronym that had wandered into the AI conversation wearing a conference badge. And that brings me to the other sitcom line that kept popping into my head while I was trying to explain this to myself. In The Big Bang Theory, there is a scene where a very drunk Penny says, “I think I owe you …splanation,” clearly attempting to say ‘explanation’ while her brain and mouth are no longer managed by a ‘unified control plane.’ That is exactly how MCP felt to me at first. I did not need another acronym. I needed a …splanation. A real one. Preferably in English. Preferably without requiring a PhD in distributed systems, three browser tabs of developer documentation, and someone on YouTube drawing boxes and arrows while saying “obviously” before explaining the least obvious thing I had heard all week. So this article is my attempt at that …splanation. After spending time researching MCP, I think it is important. More importantly, I think it is important in a very practical way. It is not the kind of important that requires everyone to become an AI researcher, read white papers at midnight, or pretend that “agentic workflow orchestration” is something normal people say at dinner. MCP matters because AI is moving from something that talks to something that can actually do work, and doing real work requires access to real systems. That is the part worth slowing down for. Most people first experienced modern AI as an LLM chat bot window. You typed something in, and the model responded. Sometimes the answer was impressive. Sometimes it was useful. Sometimes it was wrong with the confidence of a man giving directions in a city he has never visited. 
But the basic pattern was easy to understand. You asked a question. The LLM answered. That was the product experience. The problem is that most real work does not happen inside a blank chat box. Real work lives in messy places. It lives in documents, calendars, databases, code repositories, CRM systems, ticketing tools, emails, Slack messages, service logs, storage platforms, cloud consoles, spreadsheets, procurement systems, and all the other places where business reality hides after the meeting ends. That is why the first wave of AI, as magical as it felt, was also strangely trapped. A model could write a beautiful summary of a business problem, but unless you gave it the actual business context, it was still guessing. An LLM is not programmed to say “Sorry, I don’t know.” So it makes stuff up with proper grammar and punctuation. It could explain how to troubleshoot an issue, but unless it could inspect the logs, check the configuration, or look at the environment, it was still operating from theory. It could tell you how to prepare for a customer meeting, but unless it could see the account history, the open opportunities, the support cases, the renewal status, and the meeting notes from last quarter, it was basically giving you a very articulate horoscope. MCP is one of the attempts to fix that. MCP stands for Model Context Protocol. The name sounds like it was assembled by people who are very good at distributed systems and very bad at naming things for humans, but the words are actually useful. “Model” refers to the AI model. “Context” refers to the information and tools the model needs in order to be useful. “Protocol” means a standard way for systems to communicate. In plain English, MCP is a standard way for AI applications to connect to external tools and data sources. That may sound boring, but boring is often where the real technology changes happen. Nobody gets a standing ovation for plumbing until the plumbing stops working. 
Nobody thinks about electrical standards when they plug in a night light. Nobody wants to understand every detail of networking just to open a website. Standards become invisible when they succeed, and that invisibility is exactly why they matter. The analogy people use is that MCP is like USB-C for AI. I know that analogy is already dangerously close to becoming a bumper sticker, but it works well enough if we do not abuse it. USB-C did not make your laptop smarter. It did not make your monitor more creative. It did not make your phone more emotionally available, although at this point I would appreciate it if mine at least tried. What USB-C did was standardize connection. Instead of every device requiring its own special cable, adapter, dongle, ritual, and small sacrifice to the drawer of dead electronics, USB-C created a common interface. MCP is trying to do something similar for AI. It gives AI applications a common way to connect to the tools and data they need. The model does not need to know the internal details of every application. The application does not need to build a completely different integration for every model. MCP creates a shared language in the middle. That middle layer is what matters. Without something like MCP, the AI world runs into what technical people call the N-by-M problem. Katie Baker wrote about the NxM problem last year. If you have ten AI applications and ten systems they need to connect to, you do not want one hundred custom integrations. If you have fifty AI applications and two hundred systems, you definitely do not want ten thousand custom integrations, unless your business model is selling painkillers to integration teams. The better model is not N times M. It is closer to N plus M. Each AI application learns how to speak the protocol. Each tool or data source exposes itself through the protocol. Once both sides understand the same standard, the number of custom connections drops dramatically.
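The arithmetic behind N times M versus N plus M is easy to check. Here is a tiny sketch using the two scenarios from the paragraph above; the function names are made up for illustration, and "N plus M" counts one protocol adapter per application and one per system.

```python
def point_to_point(apps, systems):
    """Every application gets its own custom integration with every system."""
    return apps * systems


def via_protocol(apps, systems):
    """Each application and each system implements the shared protocol once."""
    return apps + systems


for apps, systems in [(10, 10), (50, 200)]:
    custom = point_to_point(apps, systems)
    shared = via_protocol(apps, systems)
    print(f"{apps} apps x {systems} systems: {custom} custom vs {shared} via protocol")
```

For fifty applications and two hundred systems, that is 10,000 custom integrations against 250 protocol implementations, and the gap keeps widening as either side grows.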
This is the point where MCP starts to become more than an AI developer convenience. It starts to look like infrastructure. To understand how it works, you do not need to become a protocol engineer. You just need to understand three roles: the host, the client, and the server. The host is the AI application the user interacts with. That could be Claude Desktop, ChatGPT, Cursor, Visual Studio Code, or an internal enterprise assistant with a name like Atlas, Navigator, Compass, or whatever else the branding team selected after eliminating “Dave.” The host is where the experience lives. It is where the user types the request, where the model reasons, and where the answer or action comes back. The client lives inside the host and manages the connection to an MCP server. You can think of it as the part of the application that knows how to speak MCP on behalf of the model. It handles the conversation between the AI application and the external capability. The server is the wrapper around a tool or data source. There might be an MCP server for GitHub, another for Slack, another for a database, another for a filesystem, another for a CRM, another for a cloud service, and eventually one for every system that vendors decide must now be described as “AI-ready” in a press release. The server’s job is to expose what it can provide in a way the AI application can understand. It might say, in effect, “Here are the documents I can make available. Here are the actions I support. Here is the format you need to use if you want to call one of those actions. Here are the permissions required. Here is the result you can expect back.” That is where the value appears. The AI application does not need to understand every internal detail of GitHub, Slack, Salesforce, Postgres, Kubernetes, or your company’s deeply loved but spiritually exhausted internal ServiceNow ticketing system. It needs a standard way to discover and use the capabilities exposed by those systems. MCP gives it that standard way.
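Here is roughly what that "here are the actions I support" exchange looks like on the wire. MCP messages are JSON-RPC 2.0, and `tools/list` and `tools/call` are method names from the MCP specification. Everything else in this sketch is an assumption for illustration: the `get_ticket_status` tool, its data, and the in-process `handle` function standing in for a real server. A real deployment would also include the initialization handshake and a transport, both omitted here.

```python
import json

# Hypothetical tool registry; the tool name and its behavior are made up.
TOOLS = {
    "get_ticket_status": {
        "description": "Return the status of a support ticket.",
        "inputSchema": {
            "type": "object",
            "properties": {"ticket_id": {"type": "string"}},
            "required": ["ticket_id"],
        },
        "handler": lambda args: f"Ticket {args['ticket_id']} is open",
    }
}


def handle(request):
    """Toy server: answer tools/list and tools/call JSON-RPC 2.0 requests."""
    method = request["method"]
    if method == "tools/list":
        # Advertise each tool's name, description, and expected input format.
        result = {"tools": [
            {"name": name,
             "description": tool["description"],
             "inputSchema": tool["inputSchema"]}
            for name, tool in TOOLS.items()
        ]}
    elif method == "tools/call":
        params = request["params"]
        tool = TOOLS.get(params["name"])
        if tool is None:
            return {"jsonrpc": "2.0", "id": request["id"],
                    "error": {"code": -32602, "message": "Unknown tool"}}
        text = tool["handler"](params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        return {"jsonrpc": "2.0", "id": request["id"],
                "error": {"code": -32601, "message": "Method not found"}}
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}


# The client side: discover what the server offers, then call one tool.
listing = handle({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
reply = handle({
    "jsonrpc": "2.0", "id": 2, "method": "tools/call",
    "params": {"name": "get_ticket_status", "arguments": {"ticket_id": "T-42"}},
})
print(json.dumps(listing["result"]["tools"][0]["name"]))
print(json.dumps(reply["result"], indent=2))
```

The host never needs to know how the ticketing system works internally. It only needs the advertised name, the input schema, and the result format, which is the whole appeal of putting a standard in the middle.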
The protocol itself is built around a few core ideas that are easier to understand than the terminology makes them sound. MCP servers can expose tools, resources, and prompts. Tools are actions the model can ask to perform. A tool might search a database, send a Slack message, create a support ticket, run a test, update a CRM record, query an API, or retrieve the status of a system. Tools are where the AI starts moving from “I can answer your question” to “I can help complete the task.” Resources are information the model can read. These could be files, documents, schemas, database records, logs, API responses, or other pieces of context. Resources matter because AI without context is mostly a very confident intern on the first day of work. It may be talented, it may be fast, and it may be enthusiastic, but it does not know where anything is. Prompts are reusable instructions or workflows. That sounds small, but it is not. In business, consistency matters. You may not want every user inventing their own version of “analyze this account,” “review this code,” “summarize this incident,” or “prepare this forecast update.” A prompt can define how a model should approach a task, what standards it should follow, what inputs it should consider, and what kind of output is expected. Tools let the model act. Resources give the model context. Prompts help shape the model’s behavior. That combination is what makes MCP useful. Let’s make this practical. Suppose you ask an AI assistant to help prepare you for a customer meeting. Without access to your systems, the assistant can give you a generic meeting prep template. It can tell you to understand the customer’s goals, review previous discussions, identify risks, prepare discovery questions, and align to business outcomes. None of that is wrong. It is also not especially magical. It is the kind of advice that sounds helpful until you realize it could apply to almost any meeting with almost any customer in almost any industry. 
Now imagine that same assistant has controlled access to the right systems through MCP servers. It can read the meeting notes from prior briefings, pull the current opportunity data, review support tickets, check the renewal timeline, inspect open technical issues, summarize the customer’s stated initiatives, and identify where the account team may be telling itself a story that is more optimistic than the facts support. It can then generate a briefing that is not generic at all. It is specific, grounded, and useful. That is the difference between AI as a writing assistant and AI as a work assistant. This is why MCP keeps showing up in conversations about agents. An agent is not just a chatbot with a better title. An agent is expected to reason through a goal, choose tools, gather information, take steps, observe results, and continue until the task is complete or until it needs human help. That requires a standard way to connect reasoning to action. MCP is one of the strongest candidates for that standard layer. This is also where the MCP conversation stops being abstract for anyone running Everpure Fusion. It is one thing to say that MCP allows AI agents to connect to enterprise systems. That sounds interesting, but it can still feel like one of those technology ideas that lives safely inside a product roadmap, an architecture diagram, or a conference session where the coffee is somehow both expensive and terrible. It becomes much more practical when you look at what Everpure is doing with the Everpure Fusion MCP Server. I can almost guarantee that you will not click the link below, so I read it for you. But that will be in Part 2. I already drafted it, but I want to be respectful of your time. Not all of my readers are Everpure customers (yet). So that is my MCP “…splanation,” at least the Part 1 version. MCP is not the robot, and it is not the magical brain that suddenly makes every workflow intelligent. 
It is the standard connection layer that helps AI move from “I can answer your question” to “I can interact with the systems where your work actually happens.” That may not sound glamorous, but neither does plumbing, electricity, networking, or storage until something important depends on it. And that is why MCP matters. Because the next phase of AI will not be defined only by which model sounds the smartest in a chat window. It will be defined by how safely, consistently, and usefully those models can connect to real tools, real data, and real workflows. In Part 2, I will bring this closer to home and look at what this means for Everpure Fusion, because once AI starts needing context from infrastructure, the way we manage that infrastructure starts to matter a lot more. Appreciate you reading. Dmitry Gorbatov © 2025 Dmitry Gorbatov | #dmitrywashere
dmitrywashere
2 months ago Place User Blogs
Ask Us Everything: Everpure & Databases - From Firefighting to Forward Thinking
Databases aren’t going anywhere—in fact, they’re becoming more important than ever. In this Ask Us Everything session, Don Poorman sat down with Everpure database experts Anthony Nocentino and Ryan Arsenault to talk all things structured data. And while AI continues to dominate headlines, one theme came through clearly: AI doesn’t replace databases—it depends on them. If you’re running Oracle, SQL Server, SAP, or anything mission-critical, here’s what stood out.
anocentino
3 months ago Place User Blogs
Ask Us Everything: Pure Storage + Nutanix — What the Community Really Wanted to Know
The January Ask Us Everything (AUE) session tackled one of the hottest topics in infrastructure right now: what Pure Storage and Nutanix are doing together, and what that means for our customers. Judging by the volume and depth of questions, it’s clear that many of you are actively evaluating next-generation virtualization options and want real answers, not marketing slides.
With Cody Hosterman (Sr Director Product Management, Pure Storage), Thomas Brown (Field CTO, Nutanix), me (Joe Houghes, Field Solutions Architect, Pure Storage), and our host Don Poorman (Technical Evangelist, Pure Storage), the conversation went deep into architecture, migration realities, and the practical problems this joint solution is designed to solve. Here are the biggest takeaways from what attendees asked, and what they learned.
This is joint engineering, not just “interoperability”
One of the most important clarifications came early: this isn’t a case of “here’s a LUN, good luck.” Nutanix has natively integrated Pure Storage FlashArray APIs directly into the Nutanix stack. That means:
- No plugins to install
- No bolt-on frameworks to manage
- No separate operational silos
In Prism, the Nutanix management plane, Pure Storage behaves like a first-class storage backend. Snapshots, protection, provisioning, and automation are driven from Nutanix, while Pure Storage delivers its strengths (performance, data reduction, SafeMode, and simplicity) under the covers.
NVMe/TCP support is a deliberate, forward-looking choice
Several attendees asked why Fibre Channel or legacy protocols weren’t the focus. The answer: this solution is built for where infrastructure is going, not where it’s been. By standardizing on NVMe/TCP over Ethernet, Pure and Nutanix:
- Avoid decades of SCSI and FC tech debt
- Enable massive bandwidth scalability (100G, 400G, and beyond)
- Lay the groundwork for modern security features like TLS and in-band authentication
This is a design meant to still make sense 10 years from now.
Object-style vDisks eliminate old datastore limits
A recurring “aha” moment came when attendees learned how vDisks are implemented. Instead of traditional filesystem-based datastores (with all their historical limits), each virtual disk maps directly to a Pure Storage volume. What that unlocks:
- Petabyte-scale virtual disks (no more 64TB ceilings)
- No datastore gymnastics to scale performance
- No artificial limits inherited from legacy file systems
This felt especially relevant for customers running large databases, analytics platforms, or fast-growing enterprise apps.
HCI isn’t going away; this complements it
A key question from the audience: Does this replace Nutanix HCI? The answer was a clear no. Nutanix HCI still makes perfect sense for many workloads. But when customers:
- Need to scale storage independently of compute
- Have performance-heavy or capacity-dense workloads
- Want an “apples-to-apples” replacement for traditional VMware + external storage
…Pure Storage + Nutanix provides a clean alternative without forcing architectural compromises.
Migration is real, and the hard parts were addressed honestly
Migration questions dominated the session, and the tone was refreshingly pragmatic. Attendees learned:
- Nutanix Move is fully supported and preserves Purity’s data reduction, which makes this a zero-cost migration in terms of storage capacity
- VMware NSX rules can be translated into Nutanix Flow during migration
- Backup tools (Veeam, Rubrik, Commvault, Cohesity, etc.) continue to work without re-engineering or changes in backup operations
- Most migration risk doesn’t lie in the hypervisor; it lies in overlooked third-party dependencies
The guidance was consistent: plan carefully, take stock of any dependencies, and don’t rush a wholesale cutover just to meet an artificial deadline. No user ever wants to be forced to do that.
Operational simplicity is a major design goal
A subtle but powerful theme emerged: you don’t need to tune this solution.
VMware users often ask about “nerd knobs” and the need to tweak things to get them working right. In this solution, they’re mostly gone, and intentionally so. Best practices for queue depths, multipathing, performance tuning, and more are already baked into the platform by the joint engineering teams. Improvements are managed through upgrades, eliminating the need for manual scripting or hand-applied performance tweaks for a “snowflake” deployment. The result of this best-of-breed, jointly engineered solution is consistency, predictability, and easier support, especially during migrations, so that you can focus on the work that makes your business run.
The roadmap is active, and community feedback matters
This solution was not positioned as “done and dusted.” The GA release is the foundation, not the finish line. Capabilities like Kubernetes support, deeper snapshot orchestration, VDI validation, and migration optimizations are all on the roadmap. And importantly: your use cases drive priorities. The Pure Storage Community is a great place to drop your feedback for the teams!
Keep the conversation going
This partnership sparked a lot of interest for a reason: it’s not just about changing hypervisors; it’s about modernizing how infrastructure works. If you missed the live session, or want to dive deeper, join the ongoing discussion in the Pure Storage Community: 👉 https://purecommunity.purestorage.com/discussions/virtualization/ask-us-everything-about-pure-storage--nutanix/3634
You’ll find Pure Storage and Nutanix experts answering follow-ups, clarifying edge cases, and sharing lessons learned from real deployments. While you’re there, be sure to check out past Ask Us Everything events; they’re packed with practical, practitioner-level insights.
jhoughes
5 months ago Place User Blogs
OT: The Architecture of Interoperability
In the previous post, we explored the fundamental divide between Information Technology (IT) and Operational Technology (OT). We established that while IT manages data and applications, OT controls the physical heartbeat of our world, from factory floors to water treatment plants. In this post, we are diving deeper into the bridge that connects them: interoperability. As Industry 4.0 and the Internet of Things (IoT) accelerate, the “air gap” that once separated these domains is evolving. For modern enterprises, the goal isn’t just to have IT and OT coexist, but to have them communicate seamlessly. Whether the use case is security, real-time quality control, or predictive maintenance, to name a few, interoperability is the critical engine for operational excellence.
The Interoperability Architecture
Interoperability is more than just connecting cables; it’s about creating a unified architecture where data flows securely between the shop floor and the “top floor.” In legacy environments, OT systems (like SCADA and PLCs) often run on isolated, proprietary networks that don’t speak the same language as IT’s cloud-based analytics platforms. To bridge this, a robust interoperability architecture is required. This architecture must support:
Industrial Data Lake: A single storage platform that can handle block, file, and object data is essential for bridging the gap between IT and OT. This unified approach prevents data silos by allowing proprietary OT sensor data to coexist on the same high-performance storage as IT applications (such as ERP and CRM). The benefit is a high-performance Industrial Data Lake, where OT and IT data from various sources can be streamed directly, minimizing the need for data movement, which is a critical efficiency gain.
Real-Time Analytics: OT sensors continuously monitor machine conditions, including vibration, temperature, and other critical parameters, generating real-time telemetry data.
An interoperable architecture built on high performance flash storage enables instant processing of this data stream. By integrating IT analytics platforms with predictive algorithms, the system identifies anomalies before they escalate, accelerating maintenance response, optimizing operations, and streamlining exception handling. This approach reduces downtime, lowers maintenance costs, and extends overall asset life. Standards Based Design: As outlined in recent cybersecurity research, modern OT environments require datasets that correlate physical process data with network traffic logs to detect anomalies effectively. An interoperable architecture facilitates this by centralizing data for analysis without compromising the security posture. Also, IT/OT convergence requires a platform capable of securely managing OT data, often through IT standards. An API-First Design allows the entire platform to be built on robust APIs, enabling IT to easily integrate storage provisioning, monitoring, and data protection into standard, policy-driven IT automation tools (e.g., Kubernetes, orchestration software). Pure Storage addresses these interoperability requirements with the Purity operating environment, which abstracts the complexity of underlying hardware and provides a seamless, multiprotocol experience (NFS, SMB, S3, FC, iSCSI). This ensures that whether data originates from a robotic arm or a CRM application, it is stored, protected, and accessible through a single, unified data plane. Real-World Application: A Large Regional Water District Consider a large regional water district, a major provider serving millions of residents. In an environment like this, maintaining water quality and service reliability is a 24/7 mission-critical OT function. Its infrastructure relies on complex SCADA systems to monitor variables like flow rates, tank levels, and chemical compositions across hundreds of miles of pipelines and treatment facilities. 
By adopting an interoperable architecture, an organization like this can break down the silos between its operational data and its IT capabilities. Instead of SCADA data remaining locked in a control room, it can be securely replicated to IT environments for long-term trending and capacity planning. For instance, historical flow data combined with predictive analytics can help forecast demand spikes or identify aging infrastructure before a leak occurs. This convergence transforms raw operational data into actionable business intelligence, ensuring reliability for the communities it serves.
Why We Champion Compliance and Governance
Opening up OT systems to IT networks can introduce new risks. In the world of OT, “move fast and break things” is not an option; reliability and safety are paramount. This is why Pure Storage wraps interoperability in a framework of compliance and governance, including but not limited to:
FIPS 140-2 Certification & Common Criteria: We utilize FIPS 140-2 certified encryption modules and have achieved Common Criteria certification.
Data Sovereignty: Our architecture includes built-in governance features like Always-On Encryption and rapid data locking to ensure compliance with domestic and international regulations, protecting sensitive data regardless of where it resides.
Compliance: Pure Fusion delivers policy-defined storage provisioning, automating deployment with specified requirements for tags, protection, and replication.
By embedding these standards directly into the storage array, Pure Storage allows organizations to innovate with interoperability while maintaining the security posture that critical OT infrastructure demands.
Next in the series: We will explore IT/OT interoperability further, along with processing data at the edge. Stay tuned!
ebiser
7 months ago Place User Blogs
Understanding Deduplication Ratios
It’s super important to understand where deduplication ratios, in relation to backup applications and data storage, come from. Deduplication prevents the same data from being stored again, lowering the data storage footprint. On platforms that host virtual environments, like FlashArray//X™ and FlashArray//C™, you can see tremendous amounts of native deduplication due to the repetitive nature of these environments. Backup applications and targets have a different makeup. Even so, deduplication ratios have long been a talking point in the data storage industry and continue to be a decision point and factor in buying cycles. Data Domain pioneered the tactic of quoting these ratios to overstate its effectiveness, leaving customers thinking the vendor’s appliance must have a magic wand to reduce data by 40:1. I wanted to take the time to explain how deduplication ratios are derived in this industry and the variables to look for in figuring out exactly what to expect in terms of deduplication and data footprint. Let’s look at a simple example of a data protection scenario.
Example: A company has 100TB of assorted data it wants to protect with its backup application. The configured agents go about doing the intelligent data collection and send the data to the target. Initially, and typically, the application will leverage both software compression and deduplication. Compression by itself will almost always yield a decent amount of data reduction. In this example, we’ll assume 2:1, which would mean the first data set goes from 100TB to 50TB. Deduplication doesn’t usually do much data reduction on the first baseline backup. Sometimes there are some efficiencies, like the repetitive data in virtual machines, but for the sake of this generic example scenario, we’ll leave it at 50TB total.
So, full backup 1 (baseline): 50TB
Now, there are scheduled incremental backups that occur daily from Monday to Friday. Let’s say these daily changes are 1% of the aforementioned data set.
Each day, then, there would be 1TB of additional data to store. 5 days at 1TB = 5TB. Let’s add the compression in to reduce that 2:1, and you have an additional 2.5TB added. 50TB baseline plus 2.5TB of unique blocks means a total of 52.5TB of data stored. Let’s check the deduplication rate now. The logical data protected is the 100TB baseline plus 5TB of changes, so:
105TB/52.5TB = 2x
You may ask: “Wait, that 2:1 is really just the compression? Where is the deduplication?” Great question, and the reason I’m writing this blog. Deduplication prevents the same data from being stored again. With a single full backup and incremental backups, you wouldn’t see much more than just the compression. Deduplication shows its impact only when you assume you would be sending duplicate data to your target. This is usually discussed as data under management. Data under management is the logical data footprint of your backup data, as if you were regularly backing up the entire data set, not just changes, without deduplication or compression. For example, let’s say we didn’t schedule incremental backups but scheduled full backups every day instead. Without compression/deduplication, the data load would be 100TB for the initial baseline and then the same 100TB plus the daily growth.
- Day 0 (baseline): 100TB
- Day 1 (baseline+changes): 101TB
- Day 2 (baseline+changes): 102TB
- Day 3 (baseline+changes): 103TB
- Day 4 (baseline+changes): 104TB
- Day 5 (baseline+changes): 105TB
- Total, if no compression/deduplication: 615TB
This 615TB total is data under management. Now, if we looked at our actual, post-compression/post-dedupe number from before (52.5TB), we can figure out the deduplication impact:
615/52.5 = 11.714x
Looking at this over a 30-day period, you can see how the dedupe ratios can get really aggressive.
For example:
(100TB x 30 days) + (1TB x 30 days) = 3,000TB + 30TB = 3,030TB
3,030TB/65TB (actual data stored: the 50TB baseline plus 15TB of compressed changes) = 46.62x dedupe ratio
In summary, for 100TB with a 1% change rate over 1 week:
- Full backup + daily incremental backups = 52.5TB stored, and a 2x DRR
- Full daily backups = 52.5TB stored, and an 11.7x DRR
That is how deduplication ratios really work: they are a fictional function of “what if dedupe didn’t exist, but you stored everything on the disk anyway” scenarios. They’re a math exercise, not a reality exercise. Front-end data size, daily change rate, and retention are the biggest variables to look at when sizing or understanding the expected data footprint and the related data reduction/deduplication impact. In our scenario, we’re looking at one particular data set. Most companies will have multiple data types, and there can be even greater redundancy when accounting for full backups across those as well. So while it matters, consider that a bonus.
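If you want to plug in your own numbers, the arithmetic above can be sketched in a few lines of Python. This is a rough model under the same assumptions as the example: flat 2:1 compression, a fixed daily change rate against the original data set, and a target that dedupes repeated fulls perfectly. The function names are mine, not from any backup product.

```python
def stored_tb(front_end_tb, change_rate, days, compression=2.0):
    """Physical footprint: compressed baseline plus compressed daily changes."""
    changed_tb = front_end_tb * change_rate * days
    return (front_end_tb + changed_tb) / compression


def logical_tb_incrementals(front_end_tb, change_rate, days):
    """Logical data protected by one full backup plus daily incrementals."""
    return front_end_tb + front_end_tb * change_rate * days


def logical_tb_daily_fulls(front_end_tb, change_rate, days):
    """Data under management: a full on day 0, then a full every day
    that includes all changes accumulated so far."""
    return sum(front_end_tb * (1 + change_rate * day) for day in range(days + 1))


# One week: 100TB front end, 1% daily change, 5 daily backups.
week_stored = stored_tb(100, 0.01, 5)
week_incremental_ratio = logical_tb_incrementals(100, 0.01, 5) / week_stored
week_full_ratio = logical_tb_daily_fulls(100, 0.01, 5) / week_stored

# The 30-day figure in the text uses a simpler count:
# 30 fulls of 100TB plus 30TB of changes, divided by what is stored.
month_stored = stored_tb(100, 0.01, 30)
month_simple_ratio = (100 * 30 + 1 * 30) / month_stored

print(f"1 week stored:             {week_stored:.1f}TB")
print(f"1 week, full+incrementals: {week_incremental_ratio:.1f}x")
print(f"1 week, daily fulls:       {week_full_ratio:.1f}x")
print(f"30 days stored:            {month_stored:.1f}TB")
print(f"30 days, daily fulls:      {month_simple_ratio:.2f}x")
```

Notice that the stored footprint is identical in both one-week cases; only the numerator changes. Counting the 30-day case day by day with `logical_tb_daily_fulls` would give an even larger logical total than the simplified count, which only reinforces the point that the ratio depends on how you count.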
jasonwalker
8 months ago Place User Blogs